How To Overcome The 3 Biggest Blogging Challenges


Content creation and blogging have become extremely saturated.

But you knew that already.

As a blogger, you’ve come face to face with countless challenges.

Thankfully, you aren’t alone. And you don’t have to sit back and watch your content struggle.

Here are the three biggest blogging challenges and how you can overcome them.

Challenge 1: Topic Ideation / Keyword Research

There are millions upon millions of bloggers, companies, and brands putting out content on a daily basis.

That’s millions and millions of keywords being sought after.

Topic ideation and finding worthwhile keywords that don’t require hundreds of DR70+ inbound links is like searching for a needle in a haystack.

Take this keyword for example:

It’s relatively long-tail, it’s top of the funnel, a great way to bring in some organic traffic and build brand awareness.

But a 58 KD rating? Yikes. For 70 searches a month?

That’s 118 inbound links just to crack the top 10 for this keyword.

Worth it? Probably not.

The moral of the story is: most keywords are dominated already, and it’s only getting worse.

So, how do you bring in precious traffic and find topics that aren’t taken? Here is how.

Solution: Stop writing keyword content. Keyword content can be useful if you have the brand awareness already.

But are people really going to trust your brand over HubSpot when it comes to sales? Or Search Engine Journal (wink wink) when it comes to SEO?


They’ll see Search Engine Journal or HubSpot and think: yeah, those folks know what they’re talking about.

If you are struggling to generate ideas for topics that aren’t already beaten to a pulp, stop focusing so much on the keywords.

Instead, focus on these elements:

Take a new angle on an existing topic.

Create your own keyword.

Develop a research study.

Taking a new angle on an existing topic can help you stand out in SERPs that are far too crowded with “XX tips” and “why XX is the best strategy” posts.

For example, talking about how amazing PPC is and the benefits is old news.

PPC is great for X and Y and Z reasons. Snooze.

Instead, capture attention by flipping the script:

This hits the proverbial curiosity nail right on the head.

Creating your own keyword is similar, and is a proven concept. Take Brian Dean of Backlinko for example. His Skyscraper technique wasn’t a keyword, and now it gets tons of searches every month years after it went viral:

Did you find a technique for increasing conversions? ROI? CTR? Anything?

Name that specific, actionable, repeatable technique and blog about how you did it.

Conduct outreach for links, share it and discuss on social and you can build your own keyword search volumes from the ground up.

Lastly, this directly ties into the third point: creating your own research study.

Original research is on fire right now, and for good reason: bloggers need data to back up their claims, and marketers need fresh ideas to test against their own audiences.

Both create a winning formula for links, rankings, and traffic.

Challenge 2: Writing High-Quality 2,000+ Word Posts

While content length is generally arbitrary, studies have repeatedly found that articles close to or above 2,000 words generate better rankings, more links, and more traffic.

At least, they should.

Just because studies show that your content should be longer doesn’t mean filler is the answer.

It means more of everything users are looking for: subtopics, tutorials, guides, exact steps, and images to showcase them.

Solution: Structure your content to match SERP flow and user intent.

There is no magic number for word count. Just having a page with 2,000 words doesn’t send signals to Google that say “gimme the rankings.”

Each section of content in your piece should solve user queries, and most of all satisfy user intent completely.

In a tool like Ahrefs or SEMrush, look up your target phrase and start compiling an outline based on related topics, searches, and intent:
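One lightweight way to start that outline is to bucket related searches by their leading question word. A minimal sketch, with an invented query list standing in for real tool exports:

```python
# Toy sketch: group related searches into first-draft outline sections by
# their leading question word. The query list is invented for illustration;
# a real workflow would export these from Ahrefs or SEMrush.
from collections import defaultdict

RELATED_SEARCHES = [
    "what is outreach marketing",
    "how to do outreach marketing",
    "why outreach marketing works",
    "how to write an outreach email",
]

def outline(queries):
    sections = defaultdict(list)
    for query in queries:
        sections[query.split()[0]].append(query)  # bucket by "what"/"how"/"why"
    return dict(sections)
```

Each bucket then becomes a candidate section: the “what” queries suggest a definition-first intro, while the “how” queries suggest step-by-step sections.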

Next, head to Google and inspect the SERP. Here is where you can start to do a deep dive on searcher intent to help refine your outline based on keywords and related searches.

First, start with the top-ranking post. This will help you align your content with the flow of the SERP.

What’s the headline? What intent are they aiming at compared to the other content on the front page?

Notice how the first ranking piece is tailored directly at beginners. And it makes sense. Someone searching for “outreach marketing” is more likely going to know nothing about it.

Meaning tactics come secondary for this intent.

Meaning your post should follow that same trend: explain it in great detail before you even touch on a tactic.

Now, look at the SERP features: what specific questions are listed? What other searches are commonly done after this generic search?

Using these factors will help you provide value to users and search engines alike.

And with it, you’ll be cracking 3,000 words without breaking a sweat.

Challenge 3: Getting Traffic from Content

Creating content is the easy part. Getting real people to visit your content, enjoy it, and come back is brutal.

With links being a critical SEO factor (and traffic factor), how do you get them?

How do you build that authority to help your amazing content get seen?

Here are some solutions.

Solution: Outreach & promotion, custom images, audio, and video.

First off, outreach and promotion need to be a part of your workflow, ASAP.

If you think content can just pick up steam, you’re mostly wrong. The odds of that happening are slim to none.

But, typical sharing of your content on social doesn’t work nearly as well as it did a few years ago.

Instead, your promotion should be focused mainly on any email lists you’ve built or direct outreach for links.

Direct outreach can be improved by creating custom images for your content and allowing other publications to cite them.

Turn stats into a custom graph.

Summarize the main points with a nice graphic.

Lastly, start repurposing your blog content into other mediums that can drive traffic:

Only 2 million podcasts are indexed by Google currently. Compare that to the millions of blog posts published DAILY.

Don’t have time for a podcast? Utilize video tutorials in your content to showcase step-by-step sections and publish them on YouTube.

Expand your written content beyond just your blog.


Blogging today isn’t as easy as write, publish, share.

Everything from your topic and keyword research to the length and quality of content matter more than ever.

Put these tips to use to conquer your biggest blogging challenges, improve traffic, and build your brand.


Image Credits

Screenshots by author, August 2023


Voice & Conversational Search: Top Challenges & How To Overcome Them

For a while, it seemed we’d be saying “next year will be the year of voice” annually, as was the case with mobile. Then, according to Google’s annual Global Barometer study, 2017 became “The Year of the Mobile Majority.”

There’s certainly no shortage of predictions on future voice usage.

In 2016, 20 percent of queries on mobile were voice, as announced at Google I/O 2016.

By 2020, 50 percent of search is expected to be driven by voice, according to comScore.

If Google Trends’ interest over time in the search terms “Google Home” and “Alexa” is any indication, eyes-free devices just crashed into our lives with a festive bang.

Over 50 million smart speakers are now expected to ship in 2018, according to a Canalys report published on New Year’s Day.

No doubt aggressive holiday-period sales strategies from both Amazon and Alphabet (Google’s parent company) to move smart speakers en masse contributed strongly.

After Amazon reduced prices on Amazon Echo and Echo Dot, Google followed suit, slashing prices on Black Friday on Home and Home Mini smart speakers.

While analysts predict both companies broke even or made a loss, Google Trends interest demonstrated a hockey-stick curve.

‘Seasons Tweetings’

If the objective was to ignite mass device-engagement during seasonal family gatherings, this appears to have worked.

Social media screamed: “Step aside Cluedo, charades, and Monopoly… there’s a new parlor game in town, and it’s called Google Home and Alexa.”

A YouTube video of an 85-year-old Italian grandmother ‘scared’ of “Goo Goo”, as she called it, broke the internet, with over 2 million views so far.

People on Twitter “freaked out” at the “magic” of smart speakers, with one anecdotal tweet going viral at Home and Alexa seemingly communicating spontaneously with each other from across opposite sides of a room.

Bustle claimed an Amazon rep explained the technical reasons behind “the magic” following the tweet sensation. There is no magic. Merely explainable programming, and the automatic triggering of action words (or ‘hot words’) and sequential responses by both devices.

Challenges of Conversational Search & Humans

Machines are predictable; humans less so.

In conversational search, users ask questions in obscure and unpredictable ways. They ask them without context, in many different ways, and ask impossibly unanswerable ones too.

Amit Singhal gave a humorous example in an interview with Guy Kawasaki back in 2013, explaining how users ask questions like “Does my hair make me look bad?”

Unanswerable, unfortunately.

With Assistant and Home, humans may not say the “action” words needed to trigger smart speakers, such as “play” and “reminder,” and may instead receive recited lists of tracks in response.

Likewise, the understanding and extraction of the right data to meet the query may be carried out unsuccessfully by the search engine.

Voice Recognition Technology Improvements

Google is definitely getting better at voice recognition, with error rates almost on par with those of humans, as Google’s Sundar Pichai claims.

My Search Engine Journal colleague Aleh Barysevich discussed this recently. We also know Mary Meeker’s annual Internet Trends report confirmed voice recognition is improving rapidly.

Voice ‘Recognition’ Is Not ‘Understanding’

However, just because search engines have found a way to recognize voices and words does not mean they understand meaning and context well enough to return gold-standard spoken answers; that still holds challenges.

And Which Answers Are Being Provided?

It’s clear we need to be able to gain sight of some voice query data soon.

Cross-platform, multi-device attribution and assisted conversions need to be measured commercially if we’re triggering answers and providing useful information. We also need sight of how far we are from being considered a good result, so we can improve.

There is little to no visibility currently available, other than knowing it’s going on.

Below, Glenn Gabe illustrated some voice queries appearing in Google Search Console (maybe just for beta testers), not yet separated from written desktop and mobile search.

Lots of Questions

One thing is for sure. Search engine users ask a LOT of questions.

According to Google’s recently released annual Global ‘Year in Search’, in 2017 we asked “how?” more than anything else:

No surprise, then, that a huge amount of industry and academic research into mining and analyzing community question-and-answer text is underway; the papers at the Web Search and Data Mining Conference (WSDM), one of the main information retrieval conferences, are a small illustration:

Google’s Voice Search & Assistant Raters Guidelines

Something which is useful in helping us understand what is considered a good spoken result is the Voice Search and Assistant Quality Raters Guidelines, published in December 2017. The guide is for human raters to mark the quality of voice query and Assistant action word results as an important part of the search quality feedback loop.

Here’s an example of what failure looks like for voice search as per the Google Raters Guidelines:

Proposition: [will it rain this evening?]

Response: “I’m not sure how to help with that.”

Suggested Rating: Fails to Meet

Rater Guidelines Further Commentary: The device failed to answer the query. No users would be satisfied with this response.

I haven’t been able to find any figures on this but it would be interesting to know how often Google Home or Assistant on mobile says “I’m sorry I can’t help you with that” or “I don’t understand” as a percentage of total voice queries (particularly on smart speakers outside of action words and responses).

I did reach out to Google’s John Mueller on Twitter to ask if there were any figures available, but he didn’t answer.

Unsurprisingly so.

This is what the raters guide says on each of these attributes:

Information Satisfaction: The content of the answer should meet the information needs of the user.

Length: When a displayed answer is too long, users can quickly scan it visually and locate the relevant information. For voice answers, that is not possible. It is much more important to ensure that we provide a helpful amount of information, hopefully not too much or too little. Some of our previous work is currently in use for identifying the most relevant fragments of answers.

Formulation: It is much easier to understand a badly formulated written answer than an ungrammatical spoken answer, so more care has to be placed in ensuring grammatical correctness.

Elocution: Spoken answers must have proper pronunciation and prosody. Improvements in text-to-speech generation, such as WaveNet and Tacotron 2, are quickly reducing the gap with human performance.

From the examples provided in the guide, SEO pros can also get an idea of the type of response considered a high-quality one.

Spoiler: It’s one which meets informational needs, in short answers, grammatically correct (syntactically well-formed), and with accurate pronunciation.

Seems straightforward, but there is more insight we can gain to help us cater for voice search.

Note ‘Some of Our Previous Work’

You’ll notice “Some of our previous work” is briefly referred to on the subject of “length” and how Google is handling that for voice search and assistant.

The work is “Sentence Compression by Deletion with LSTMs”.

It is an important piece of work, which Wired explains as “they’ve learned to take a long sentence or paragraph from a relevant page on the web and extract the upshot — the information you’re looking for.” Only the most relevant bits are used from the content or the Knowledge Graph in voice search results.

One of the key researchers behind it is Enrique Alfonseca, part of the Google Research Team in Zurich. Alfonseca is well-placed as an authority on the subject matter of conversational search and natural language processing, with several published papers.

European Summer School on Information Retrieval 2017

Last summer, I attended a lecture by Alfonseca. He was one of a mix of industry and academic researchers from the likes of Facebook, Yahoo, and Bloomberg during the biennial European International Summer School in Information Retrieval (ESSIR). 

Alfonseca’s lecture gave insight into some of the current challenges faced by Google in providing gold standard (the best in information retrieval terms), high-quality results for conversational search users.

There is some cross-over between the raters guidelines and what we know already about voice search. However, the main focus and key points in Enrique’s lecture overall may give further insight to reinforce and supplement.

Alfonseca in his closing words made the point that better ranking for conversational search was needed because the user tends to focus on a single response.

This was also discussed in a Voicebot podcast interview with Brad Abrams, a Google Assistant Product Manager, who said at best only 2-3 answers will be returned. So, there can be only one, or two, or three.

One thing is for sure. We need all the information we can get to compete.

Some Key Takeaways from the Lecture

Better ranking needed because the user tends to focus on a single answer.

A rambled answer at the end is the worst possible scenario.

There is not yet a good way to read an answer from a table.

Knowledge Graph entities (schema) first, web text next.

Better query understanding is needed, in context.

There is no re-ordering in voice search – no paraphrasing – just extraction and compression.

Multi-turn conversations are still challenging.

Linguists are manually building lexicons versus automation.

Clearly there are some differences between voice search and keyboard-based written search.

Further Exploration of the Key Lecture Points

We can look at each of the lecture points in a little more detail and draw some thoughts:

Rambled Answer at the End Is the Worst Possible Scenario

This looks at the length attribute, along with formulation and presentation, and largely ties in with the raters guidelines. It emphasizes the need to answer the question early in a document, paragraph, or sentence.

The raters guide has a focus on short answers being key, too.

Presumably, this is aside from not returning an answer at all, which is a complete failure.

This indicates the need for a second separate strategy for voice search, in addition to desktop and keyboard search.

There Is Not a Good Way to Read Tables in Voice Search

“There is not currently a good way to read tables in voice search,” Alfonseca shared.

This is important because we know that in featured snippets tables provide strong structure and presentation via tabular data and may perform well, whereas, because of the difficulty in translating these to well-formulated sentences, they may perform far less well in voice search.

Pete Meyers from Moz did a voice search study of 1,000 questions recently and found only 30 percent of the answers were returned from tables in featured snippets. Meyers theorized the reason may be tabular data is not easy to read from, and Alfonseca confirms this here.

Knowledge Graph Entities First, Web Text Second and Better Query Understanding Is Needed, in Context

I’m going to look at these two points together because one strikes me as being very related and important to the other.

Knowledge Graph Entities First, Web Text Second

Google’s Inside Search voice search webpage tells us:

“Voice search on desktop and the Google app use the power of the Knowledge Graph to deliver exactly the information you need, exactly when you need it.”

More recently, Google shared in their December Webmaster blog post, Evaluation of Speech for Google, that the contents of a voice response are sometimes sourced from the web. One presumes this means beyond the “power of the Knowledge Graph” spoken of in the voice search section of Inside Search.

Coupled with Alfonseca’s lecture it would not be amiss to consider that quite a lot of remaining answers come from normal webpages aside from the Knowledge Graph.

Alfonseca shared with us the Knowledge Graph (schema) is checked first for entities when providing answers in conversational search, but when there is no entity in the Knowledge Graph, conversation search seeks answers from the web.

Presumably much of this ties in with answers appearing in featured snippets; however, Meyers flagged that some answers did not map to featured snippets at all. He found only 71 percent of featured snippets mapped to answers in his study of 1,000 questions with Google Home.

We know there are several types of data which could be extracted for conversational search from the web:

Structured data (tables and data stored in databases)

Semi-structured data (XML, JSON, meta headings [h1-h6])

Semantically-enriched data (marked up schema, entities)

Unstructured data (normal web text copy)

If voice search answers are extracted from unstructured data in normal webpages in addition to the better formed featured snippets and entities, this could be where things get messy and lacking in context.

There are a number of problems with unstructured data in webpages. Such as:

Unstructured data is loose and fuzzy. It is difficult to understand what it is about for a machine, although humans may well be able to understand it well.

It’s almost devoid of hierarchical or topical structure or form. This is made worse when there is no well-structured website section or topically related pages from which to inherit relatedness.

Volume is an issue. There’s a huge amount of it.

It’s noisy and sparsely categorized into topics and content types.

Here’s Where Relatedness & Disambiguation Matter a Lot

Disambiguation is still an issue and more contextual understanding is vital. In his closing words, Alfonseca highlighted one of the challenges is “better query understanding is needed, in context.”

While we know the context of the user (contextual search such as location, surrounding objects, past search history, etc.) is a part of this, there is also the important issue of disambiguation in both query interpretation and in word disambiguation in text when identifying the most relevant answers and sources to return.

It isn’t just user context which matters in search, but the ontological context of text, site sections and co-occurrence of words together which adds semantic value for search engines to understand and disambiguate meaning.

This also applies to all aspects of search (aside from voice), but this may be even more important (and difficult) for voice search than written keyboard based search.

Words can have multiple meanings, and people phrase the same thing in many ways. The problem may also be compounded because only extracted snippets (fragments) of information are taken from a page, with irrelevant function words deleted, rather than the page being analyzed as a whole.

There is an argument that the surrounding contextual words and relatedness to a topic for voice search will be more important than ever to add weight relevance prior to extraction and deletion.

It’s important to note here that Alfonseca is also a researcher behind a number of published papers on similarity and relatedness in natural language processing.

An important work he co-authored is “A Study on Similarity and Relatedness Using Distributional and Wordnet-Based Approaches” (Agirre, E., Alfonseca, E., Hall, K., Kravalova, J., Paşca, M. and Soroa, A., 2009).

What Is Relatedness?

Words without semantic context mean nothing.

“Relatedness” gives search engines further hints on context to content to increase relevance to a topic, reinforced further via co-occurrence vectors and common linked words which appear in documents or document collections together.

It’s important to note relatedness in this sense is not referring to relations in entities (predicates) but as a way of disambiguating meaning in a large body of text-based information (a collection of webpages, site section, subdomain, domain, or even group of sites).

Relatedness is much more loose in nature than the clearly linked and connected entity relations and can be fuzzy (weak) or strong.

Relatedness is derived from Firthian linguistics, named after John Firth, who championed the notion of semantic context-awareness in linguistics and followed the age-old context principle of Frege: “never … ask for the meaning of a word in isolation, but only in the context of a proposition” (Frege [1884/1980]).

Firth is widely associated with disambiguation in linguistics, relatedness, and the phrase:

“You shall know a word by the company it keeps.”

We could read this as: when a word has more than one meaning, you can understand which meaning is intended by looking at the other words that live near it, or the words it shares with other documents in the same text collections, i.e., its co-occurrence vectors.

For example, an ambiguous word might be jaguar.

Understanding whether a body of text is referring to a jaguar (cat) or a jaguar (car) comes via co-occurrence vectors (words that are likely to share the same company).
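As a minimal sketch of that idea (the sense vectors below are invented for illustration; real systems use learned co-occurrence statistics rather than hand-picked word sets):

```python
# Toy sketch of disambiguation via co-occurrence: score each sense of an
# ambiguous word by how many of its typical "company" words appear in the
# surrounding text. The sense word-sets are invented for illustration.
SENSES = {
    "jaguar (cat)": {"prey", "jungle", "spots", "predator", "habitat"},
    "jaguar (car)": {"engine", "dealership", "leather", "horsepower", "sedan"},
}

def disambiguate(text):
    words = set(text.lower().split())
    # Pick the sense whose co-occurring words overlap most with the text.
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("the jaguar stalked its prey through the jungle"))
# → jaguar (cat)
```

The same text run through a car-flavored context (“the jaguar dealership had a sedan with leather seats”) flips the answer, which is exactly the “company it keeps” intuition.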

To refer back to Firth’s notion; “What company does that word keep?”

For example, here we can see that “car” has five different meanings:

As humans, we would likely know which car was being referred to immediately.

The challenge is for machines to also understand the context of text to understand whether ‘car’ means a cable car, rail car, railway car, gondola, and so forth when understanding queries or returning results from loose and messy unstructured data such as a large body of text, with very few topical hints to guide.

This understanding is still challenging for voice search (and often in normal search too), but appears particularly problematic for voice. It is early days after all.

Paraphrasing: There Is None with Voice Search

With written words in featured snippets and knowledge panels, paraphrasing occurs.

Alfonseca gave the example below, which showed paraphrasing used in a written format in featured snippets.

But with voice search, Alfonseca told us, “There is no reordering in voice search; just extraction and compression. No paraphrasing.”

This is important because in order to paraphrase one must know the full meaning of the question and the answer to return it in different words (often fewer), but with the same meaning.

You cannot do this accurately unless a contextual understanding is there. This further emphasizes the lack of contextual understanding behind voice search.

This may also contribute to why there are still questions or propositions that are not yet answered in voice search.

It isn’t because the answer isn’t there, it’s because it was asked in the wrong way, or was not understood.

This is catered for in written desktop or mobile search because there are a number of query-modifying techniques in place to expand or relax the query and rewrite it in order to provide some answers, or at least a collection of competing answers.

It is unclear whether this is a limitation of voice search or is intended because no answer would be better than the wrong answer when there can be so few results returned, versus the 10 blues links which can be refined by users further in desktop search.

This means you need to be pretty much on the money with providing the specific answer in the right way because words will be deleted but not added (query expansion) or reordered (query rewriting).

As an aside, in the Twitter chat which followed my request about unanswerable queries to John Mueller, Glenn Gabe mentioned he’d been doing some testing of questions on Google Home, which illustrated these types of differences between voice and normal web search.

The normal query interpretation system in information retrieval might look something like this, and there are several transformations which take place. (This was not supplied by Alfonseca but was a slide from one of the other lecturers at ESSIR. However, it is widely known in information retrieval.)

You will see query rewriting is a key part of the written-format query manipulation process in information retrieval. Not to be confused with query refining, which refers to users refining their initial queries as they re-submit more specific terms en route to completing their informational task.

And here is a typical query rewriting example from IR:

If some or all of these transformations are not present in voice search currently, this could be limiting.
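To make the written-search pipeline described above concrete, here is a toy sketch of normalization, stopword relaxation, and synonym expansion. The stopword and synonym tables are invented; real engines learn expansions from query logs:

```python
# Toy model of query rewriting in written search: lowercase the query,
# relax it by dropping stopwords, then expand terms with synonyms into
# Boolean-retrieval-style OR clauses. The tables are invented examples.
STOPWORDS = {"the", "a", "an", "of", "in"}
SYNONYMS = {"cheap": ["cheap", "inexpensive", "budget"]}

def rewrite(query):
    tokens = [t for t in query.lower().split() if t not in STOPWORDS]
    expanded = [SYNONYMS.get(t, [t]) for t in tokens]
    # Render each expansion group as an OR-clause.
    return " ".join(
        group[0] if len(group) == 1 else "(" + " OR ".join(group) + ")"
        for group in expanded
    )

print(rewrite("Cheap hotels in London"))
# → (cheap OR inexpensive OR budget) hotels london
```

The point Alfonseca makes is that voice search appears to run only extraction and compression, without this kind of expansion or reordering step.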

One example of this is grammar and spelling.

The “Sentence Compression by Deletion with LSTMs” work, referred to as “our other work” in the guidelines, appears to sacrifice the syntactic function words that other compression techniques retain to avoid grammatical errors or spelling mistakes.

The Raters Guidelines say:

Formulation: it is much easier to understand a badly formulated written answer than an ungrammatical spoken answer, so more care has to be placed in ensuring grammatical correctness.

Grammar matters with spoken conversational voice search more so than written form.

In written form, Google has confirmed grammar does not impact SEO and rankings. However, this may not apply to featured snippets. It certainly matters to voice search.

Phonetic algorithms are likely used in written search to identify words that sound similar even if their spelling differs. Examples include the Soundex algorithm and more modern variants such as the double metaphone algorithm (which in part drives the Aspell spell helper), along with various phonetic algorithms for internationalization.
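For a concrete sense of how a phonetic algorithm groups similar-sounding words, here is a compact sketch of classic American Soundex (the textbook algorithm, not Google’s actual implementation):

```python
# Classic American Soundex: keep the first letter, encode the rest as
# digits by place of articulation, drop vowels, collapse adjacent
# duplicate codes (h/w do not separate duplicates), pad to 4 characters.
def soundex(word):
    codes = {**{c: "1" for c in "bfpv"}, **{c: "2" for c in "cgjkqsxz"},
             **{c: "3" for c in "dt"}, "l": "4",
             **{c: "5" for c in "mn"}, "r": "6"}
    word = word.lower()
    first = word[0].upper()
    encoded = []
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not break duplicate runs
        code = codes.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code
    return (first + "".join(encoded) + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))
# → R163 R163  (different spellings, same phonetic code)
```

Because “Robert” and “Rupert” collapse to the same code, a spell helper can surface one as a correction for a misheard or mistyped version of the other.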

Here is an example from the Aspell Spell Helper:

Multi-Turn Conversations

Alfonseca explained “multi-turn” conversations are still challenging. Single-turn is when one question is asked and one (or maybe two) answers are returned to that single question or proposition. Multi-turn relates to more than one sequential question.

One problematic area is where multi-turn questions rely on pronouns instead of named entities in follow-up questioning.

An example might be:

“What time is it in London?”

“What’s the weather like there?”

In this instance, “there” relates to London. This relies on the device remembering the previous question and mapping it across to the pronoun “there” in the second question.
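A minimal sketch of that mapping, where a hand-coded location list stands in for real named-entity recognition:

```python
# Toy multi-turn pronoun resolution: remember the last location entity
# mentioned, then substitute it for "there" in follow-up questions.
# KNOWN_LOCATIONS is an invented stand-in for real entity recognition.
KNOWN_LOCATIONS = {"london", "paris", "zurich"}

class Dialogue:
    def __init__(self):
        self.last_location = None

    def resolve(self, utterance):
        tokens = utterance.lower().rstrip("?").split()
        for tok in tokens:
            if tok in KNOWN_LOCATIONS:
                self.last_location = tok.capitalize()  # remember the entity
        if "there" in tokens and self.last_location:
            return utterance.replace("there", "in " + self.last_location)
        return utterance

d = Dialogue()
d.resolve("What time is it in London?")
print(d.resolve("What's the weather like there?"))
# → What's the weather like in London?
```

Real assistants face the much harder version of this: many pronoun types, many entity types, and ambiguity about which previous mention a pronoun refers to.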

Anaphoric & Cataphoric Resolution

A major part of the challenges here may relate to issues with something called anaphoric and cataphoric resolution (a known challenge in linguistics), and we can even see examples in the raters guide which seem to refer to these issues when named entities are taken out of context.

Some of the examples provided give instances similar to anaphora and cataphora, where a person is referred to out of context, or with pronouns such as her, him, they, or she before or after their name has been declared in a sentence or paragraph. When we add multiple people into these answers, this becomes even more problematic in multi-turn questions.

For clarity, I have added a little bit more supporting information to explain anaphora and cataphora.

Where we can, we should try to avoid pronouns in the short answers we target at voice search.

Building of Language Lexicons

Alfonseca confirmed the language lexicon building is not massively automated yet.

Currently, linguists manually build the language lexicons, tagging up the data for conversational search (likely using part-of-speech (POS) tagging or named-entity (NE) tagging, which identifies words in a body of text as nouns, adjectives, plural nouns, verbs, pronouns, and so forth).
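As a toy illustration of lexicon-based POS tagging (the lexicon entries below are invented examples; real lexicons are vastly larger and linguist-curated):

```python
# Toy lexicon-based part-of-speech tagger using Penn Treebank-style tags.
# Each word is looked up in a hand-built lexicon, defaulting to noun (NN)
# when unknown. The lexicon is an invented miniature example.
LEXICON = {
    "who": "WP", "is": "VBZ", "the": "DT", "when": "WRB",
    "manchester": "NNP", "city": "NNP", "playing": "VBG", "game": "NN",
}

def tag(sentence):
    """Assign a part-of-speech tag to each token, defaulting to NN."""
    return [(tok, LEXICON.get(tok, "NN")) for tok in sentence.lower().split()]

print(tag("Who is Manchester City playing"))
```

Tags like these are what let a system separate content words (nouns, verbs) from function words when deciding what to extract or delete.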

In an interview with Wired on the subject, Dave Orr, Google Product Manager on conversational search and Google Assistant, also confirms this manual process and the training of neural nets by human Ph.D. linguists using handcrafted data. Wired reports Google refers to this massive team as ‘Pygmalion’.

Google also, again, refers to the work in this interview from their ‘Evaluation of Speech for Google’ post as “explicit linguistic knowledge and deep learning solutions.”

As an aside, Orr answers some interesting questions on Quora on the classification of data and neural networks. You should follow him on there.

Layers of Understanding and Generation

In addition to these main lecture points, Enrique shared with us examples of the different layers of understanding and generation involved in conversational search, and actions when integrated with Google assistant.

Here is one example he shared which seeks to understand two sequential conversational queries, and then set a reminder for when the Manchester City game is.

Notice that the question "Who is Manchester City playing and when?" was never asked, but the answer was created anyway. We can see this is a combination of entities and text extraction.

When we take this, and combine it with the information from the raters guide and the research paper on Sentence Compression by Deletion with LSTMs, we can possibly draw a picture:

Entities from the Knowledge Graph are searched, and (when Knowledge Graph entities do not exist, or when additional information is needed) extractions of fragments of relevant web text (nouns, verbs, adjectives, pronouns) are sought.

Irrelevant words are deleted from the query and the text extractions in webpages for voice search in the index, to aid with sentence compression, and only extract important parts.

By deletion, this means words which add no semantic value or are not entities. These may be 'function words', for example pronouns, rather than 'content words', which are nouns, verbs, adjectives, and plural nouns. 'Function words' are often only present to make pages syntactically correct in written form, and are less needed for voice search. 'Content words' add semantic meaning when coupled with other 'content words' or entities. Semantic meaning adds value, aiding with word disambiguation and a greater understanding of the topic.

This process is the "Sentence Compression by Deletion with LSTMs" which turns words (tokens) into a series of ones and zeros (binary true or false) in the "our other work" referred to in the raters guide. It is a simple binary decision, yes or no, true or false, whether the word will be kept, so accuracy is important. The difference appears to be that with this deletion and compression algorithm there is not the same dependency upon POS (part of speech) tagging or NE (named entity) tagging to differentiate between relevant and irrelevant words.
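A minimal sketch of the keep/delete idea, assuming a hand-written function-word list in place of the learned LSTM (which predicts the binary mask from data rather than from a fixed list):

```python
# Sketch of compression-by-deletion: each token gets a binary keep/drop
# decision. A hand-written function-word list stands in for the learned
# model here; the real system predicts the 1/0 mask token by token.
FUNCTION_WORDS = {"in", "and", "to", "the", "therefore", "whereby", "a", "on"}

def keep_mask(tokens):
    """1 = keep (content word / entity), 0 = delete (function word)."""
    return [0 if t.lower() in FUNCTION_WORDS else 1 for t in tokens]

def compress(sentence: str) -> str:
    tokens = sentence.split()
    return " ".join(t for t, keep in zip(tokens, keep_mask(tokens)) if keep)

print(compress("She married the Duke of Edinburgh in 1947 and became queen"))
```

Note that entity-internal words like "of" are kept in this sketch so that "Duke of Edinburgh" survives intact; a real model learns such distinctions rather than relying on a list.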

A Few More Random Thoughts for Discussion

Does Page Length Normalization Apply to Voice Search?

Page length normalization is a type of penalty (but not in the penalty sense of manual actions or algorithmic suppressions like Penguin and Panda).

As Amit Singhal summarized in his paper on pivoted page length:

“Automatic information retrieval systems have to deal with documents of varying lengths in a text collection. Document length normalization is used to fairly retrieve documents of all lengths.”

In written text, ranking a full written page competes with another full written page, therefore, necessitating the “level playing field” dampener between long and shorter pages (bodies of text), whereas in voice search it is merely a single answer fragment that is extracted.

Page length normalization is arguably less relevant for voice search, because only the most important snippets are extracted and compressed, with the unimportant parts deleted.
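The pivoted normalization Singhal describes can be sketched as follows; the slope value of 0.25 is a commonly cited default, not something taken from this article:

```python
# Sketch of pivoted document length normalization.
# Raw scores are divided by a normalizer that pivots around the average
# document length, so long documents are not unfairly favored.
def pivoted_norm(doc_len: int, avg_len: float, slope: float = 0.25) -> float:
    return (1.0 - slope) + slope * (doc_len / avg_len)

def normalized_score(raw_score: float, doc_len: int, avg_len: float) -> float:
    return raw_score / pivoted_norm(doc_len, avg_len)

# A document at exactly the average length is unaffected:
assert pivoted_norm(500, 500.0) == 1.0
```

A document twice the average length gets a normalizer above 1, pulling its score down; a short document gets a normalizer below 1, pulling its score up.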

Or maybe I am wrong? As I say, these are points for discussion.

How Can SEO Pros Seek to Utilize This Combined Information?

Answer All the Questions, in the Right Way, and Answer with Comprehensive Brevity

It goes without saying that we want to answer all the questions, but it's key to identify not just the questions themselves, but also the many long-tail ways our audience asks them, along with propositions too.

It’s not just answering the questions though, it’s the way we answer them.

Voice Queries May Be Longer but Keep Answers Short and Sweet

Voice queries are longer than desktop queries, but it's important to maintain brevity and target short sentences for these much longer-tailed conversational searches.

We talk much faster than we type – and we talk a lot.

Ensure the sentences are short and concise and the answer is at the beginning of the page, paragraph, or sentence.

Summarize at the top of the page with a TL;DR, table of contents, executive summary, or a short bulleted list of key points. Add longer form content expanding upon the answer if appropriate to target keyboard based search.

Create an On-Site Customer Support Center or at the Least an FAQ Section

Not only will this help to answer the many frequently asked questions your audience has, but with some smart internal linking via site sections you can add relatedness cues and hints to other sections.

Adding a support center also has additional benefits from a CRM (customer relationship management) perspective because you’ll likely reduce costs on customer service and also have fewer disgruntled customers.

The rich corpus of text within the section will again add many semantic cues to the whole thematic body of the site, which should also help with ‘relatedness’ again and direct answers for both spoken word and appearance in answer boxes.

WordPress has a particularly straightforward plugin called DW Question and Answer.

Even Better – Co-Create Answers with Audience Members

As an added benefit, on the customer loyalty ladder, co-creation with audience members as partners in projects is considered one of the highest levels achievable.

Become a Stalker: Know Your Audience, Know Them Well & Simulate Their Conversations

Unless your audience is a technical one, or you're offering a technical product or service, it's unlikely they'll speak in technical terms.

Ensure you write content in the language they’re likely to talk in and watch out for grammatical errors and pronunciation.

Grammatical errors and misspellings in text-based written form on webpages are dealt with by algorithms that correct them.

Soundex, for example, and other phonetic algorithms may be used. In voice search, the text appears to be pronounced exactly as it is written, so spelling and grammar matter much more.

Carry out interviews with your audience. Hold panel discussions.

Add a customer feedback survey on site and collect questions and answers there too. Tools like Data Miner provide a free solution to take to forums where your community gathers.

At a high level, use Google Analytics audience demographics to get a high view of who your visitors are, then drill down by affinity groups and interests.

There is even psychographic audience assessment software such as Crystal Knows, which builds personality maps of prospects.

Use Word Clouds to Visualize Important Key Supporting Textual Themes

Carry out keyword and related-query research; mine customer service, email, and live chat data; and from the collated data build simple word clouds to highlight the most prominent pain points and audience micro-topics.
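The collation step can be sketched as a simple word-frequency count, which is exactly the input a word-cloud tool consumes; the stop-word list and sample questions below are invented for illustration:

```python
# Sketch: turn mined questions (customer service, chat, keyword research)
# into word frequencies, the raw input of any word-cloud tool.
from collections import Counter

STOP_WORDS = {"the", "a", "to", "is", "how", "do", "i", "my", "what"}

def cloud_weights(questions: list[str]) -> Counter:
    words = (w.strip("?.,!").lower()
             for q in questions for w in q.split())
    return Counter(w for w in words if w and w not in STOP_WORDS)

weights = cloud_weights([
    "How do I reset my password?",
    "What is the password reset link?",
])
# weights.most_common() surfaces 'password' and 'reset' as the key themes.
```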

Find out What Questions Come Next: Multi-Turn Questions & Query-Refinement

Anticipate the next question or informational need.

What does your audience ask next, and at what stage in their user search journey, and how do they ask it?

What are the typical sequential questions which follow each other?

Think about user tasks and the steps taken to achieve those tasks when searching. We know that Google talks of this berry-picking, or foraging, search behavior as micro-moments, but we need to get more granular than this, understand all the user tasks around search queries, and anticipate them.

Anaphoric and Cataphoric Resolution

Remember to consider anaphora and cataphora. This is particularly exacerbated when we introduce multiple characters to a body of text.

You may then need to consider a separate short section on the page focused on voice, avoiding anaphora and cataphora. Make a clear connection, in question answering and proposition meeting, to which entity or instance is being referred.


Query-refinement (via ‘People also asked’) in search results provides us with some strong clues as to what comes next from typical users.

There are some interesting papers in information retrieval which discuss how ‘categories’ of query options are provided there to sniff out what people are really looking for next and provide groups of query types to present to users and draw out search intent in their berry-picking search behavior.

In the example below we can see the types of queries can be categorized as tools and further informational content:

Find out What People Use Voice Search For

Back in 2014, Google produced a report which provides insight into what people use voice search for.

Even though the figures will be out of date, you will get some ideas about the tasks people carry out with voice search.

Get Consistently Local, Understand Local Type Queries and Intent

Mobile intent is very different from desktop intent. Be aware of this.

Even as far back as 2014 over 50 percent of search on mobile had local intent, and that was before the 2023 Global Year of the Mobile Majority.

Realize voice searches on mobile are likely to be far more locally driven than on eyes-free home devices. Eyes-free devices will differ again from desktop and on-the-go mobile.

Understand which queries are typical to which device type, in which scenario and typical audience media type consumption preferences.

The way you formulate pages will need to be adapted to, and cater for, these different devices and user behaviors (spoken vs. written words).

People will say different things at different times on different types of devices and in different scenarios. People still want to be able to consume information in different ways, and we know there are seven learning styles (visual, verbal, physical, aural, logical, social, and solitary). We each have our preferences, or partial preferences with a mix of styles, depending upon the scenario or even our mood at the time.

Be consistent in the data you can control online. Ensure you claim and optimize (not over-optimize), all possible opportunities across Google My Business and Google Maps to own local.

Focus on Building Entities, Earning Featured Snippets & Implementing Schema

Given that the Knowledge Graph and schema are the first places Google looks for answers to voice search queries, it certainly strengthens the business case for adding schema wherever you can on your site to mark up entities, predicates, and relationships where possible.

We need to ensure we provide structure around data and avoid the unstructured mess of standard webpage text wherever possible. With voice search, this is more vital than ever.

My good friend and an awesome SEO, Mike, mentioned the speakable schema to me recently, which has some possibilities worth exploring for voice search.

It also goes without saying that we should implement HowTo schema, given that in 2023 users asked "How" more than anything else.
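As an illustration, speakable markup is ordinary JSON-LD and can be generated from a Python dict; the page name and CSS selectors below are hypothetical placeholders for your own summary sections:

```python
# Illustrative JSON-LD for the schema.org speakable property.
# The name and cssSelector values are invented; point the selectors
# at the TL;DR / key-points sections of your own pages.
import json

speakable_markup = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "Example page",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": [".tldr-summary", ".key-points"],
    },
}
print(json.dumps(speakable_markup, indent=2))
```

The resulting JSON can be dropped into a `script type="application/ld+json"` block in the page head.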

Remember Structure and ‘Relatedness’ Matters ‘a Lot’

Add meaning with relatedness to avoid being “fuzzy” in your unstructured content. Add semi-structured data as often as you can to support the largely unstructured text mass in webpages and ‘noise’.

Related content is not just there for humans but to add strong semantic cues. Be as granular as you can with this for stronger disambiguation. It goes without saying categorizing and subcategorizing to build strong “relatedness” has never been more important.

Dealing with the Tabular Data Issue

For now, it seems wise for voice search to have both a table and a solid text answer accompanying it.


While we are still gathering information on how best to handle voice search, what is clear is that the strategy we need to employ will have many differences from those involved in competing in written-form search.

We may even need a whole new strategy to target the types of answers and formulations needed to win. Semantic understanding is still an issue.

We need to be aware of the issues behind this, which the likes of anaphoric and cataphoric resolution can create, and bear in mind there is no paraphrasing currently in voice search, so you need to answer all the questions and answer them in the right way.

Focus on ensuring strong relatedness to ensure a lot of context is passed throughout your site in this environment. For tabular data, we need to target both written and verbal search.

Hopefully, over coming months we will get sight of more voice search data so we can find more ways to improve and maybe be “the one.”

Further Information: Example of Sentence Compression

The sentence compression technology used to pull out the most important fragments within sentences to answer a query is built on top of the linguistic analysis functionality of the machine learning algorithm Parsey McParseface, designed to explain the functional role of each word in a sentence.

The example Alfonseca provided was:

“She married Philip Mountbatten, Duke of Edinburgh, in 1947, became queen on February 6, 1952.”

where some words are kept and the others are discarded (brackets mark the named entities):

“She married [Philip Mountbatten], [Duke of Edinburgh], in 1947, became queen on February 6, 1952.”

This would likely answer a sequential question around when Queen Elizabeth became Queen of England when also connected via other more structured data / entity relation of [capital city of england].

The sentence is compressed with everything between “She” and “became queen on February 6, 1952” omitted.

Sentence Compression by Deletion with LSTMs

Sentence Compression by Deletion with LSTMs appears to be exceptional because it does not totally rely on Part of Speech (POS) tags or Named Entity (NE) recognizer tags to differentiate between words which are relevant and those which are irrelevant in order to extract relevant words (tokens) and delete those which are not.

To clarify, POS tagging is mostly used in bodies of text to identify content words such as nouns, verbs, adjectives, pronouns, and so forth, which provide further semantic understanding as part of word disambiguation. NE recognizer tags, meanwhile, are described by Stanford as:

“Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names.”

Function words help to provide clear structuring of sentences but do not provide more information or added value to help with word-disambiguation. They are merely there to make the sentence read better.

Examples of these are pronouns, determiners, prepositions, and conjunctions, such as "in", "and", "to", "the", "therefore", and "whereby". These make for more enjoyable reading and are essential to text sounding natural, but they are not 'knowledge-providing' types of words.

The sentence is therefore compressed naturally by simply cutting the words down to only the useful ones, with a simple yes-or-no binary decision for each.

The technology uses long short-term memory (LSTM) units (or blocks), which are a building unit for layers of a recurrent neural network (RNN).

Here are some other examples of general sentence compression.

Anaphoric & Cataphoric Resolution

What Is Anaphora?

A search of “anaphora” provides a reasonable explanation:

“In grammar, anaphora is the use of a word referring back to a word used earlier in a text or conversation, to avoid repetition, for example, the pronouns he, she, it and they and the verb do in I like it and do they.”

This problem may be particularly prevalent in sequential multi-turn conversations.

We are not provided with information on whether the human raters ask sequential multi-turn questions, or query propositions, but as we know from Alfonseca's lecture, this is still a problematic area, so we can presume this will require human rater feedback to seek gold-standard answers over time.

Multi-turn conversations may involve multi-chaining of anaphora and cataphora. An example might be:

Who is the president of the United States?

Where was he born?

Where did he live before he was president?

Who is he married to?

What are his children called?

When he married Michelle, where did [they] get married? 'They' here may be where things get very problematic, because we have introduced another person and both are referred to with the pronoun 'they'.

Examples of anaphora

The student studied really hard for her test.

The student saw herself in the mirror.

John studied really hard for his test.

Examples of cataphora

Because she studied really hard, Nancy aced her test.

Here are more anaphora & cataphora examples.
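To see why resolution gets harder as more characters appear, here is a deliberately naive resolver sketch that simply substitutes the most recently mentioned entity for a pronoun; real resolvers weigh gender, number, and syntax, and still struggle with the multi-person cases above:

```python
# Deliberately naive anaphora resolver: replace a pronoun with the most
# recently mentioned named entity. With two people both called "they",
# this heuristic (like many real systems) has no way to disambiguate.
PRONOUNS = {"he", "she", "they", "her", "him", "his"}

def resolve(tokens, entities):
    last_entity = None
    resolved = []
    for tok in tokens:
        if tok in entities:
            last_entity = tok
            resolved.append(tok)
        elif tok.lower() in PRONOUNS and last_entity:
            resolved.append(last_entity)  # substitute the antecedent
        else:
            resolved.append(tok)
    return resolved

out = resolve(["Nancy", "aced", "her", "test"], {"Nancy"})
# -> ['Nancy', 'aced', 'Nancy', 'test']
```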

Image Credits

Screenshots taken by author, January 2023

How Guest Blogging Is Getting Huge

Guest blogging has been around for quite some time: plenty of blogs have been inviting outside contributions for ages. It has always been a great way to expose your brand to a wide audience and win new followers and contacts by providing great content.

The win-win concept behind guest blogging has made the tactic highly popular (this was the main reason I decided to build My Blog Guest by the way).

Recently, the phenomenon has grown by leaps and bounds: the biggest brands and resources are in the game, and here's why:

1. Guest Blogging Means Getting Heard

Top blogs and online magazines feature guest posts regularly because the insight provided by the guest poster is priceless.


TechCrunch features guest posts several times a week: these are usually marketing case studies or success stories or personal stories (e.g. My Life As A CEO (And VC): Chief Psychologist and Guest Post: Could Tiny Somaliland Become the First Cashless Society?)

2. Guest Blogging Means Better Chances to Win Social Media

You can spend months developing your own blog and trying to get more attention to it, but never see any of your carefully crafted, top-notch stories hit the front page of Digg or generate more than 50 tweets.

Or you can spend a day writing a great story in your area of expertise, pitch it to a "social-media friendly" blog, and there you go: thousands of votes, thumbs-up, and tweets.

Guest blogging is the best “indirect” way to reach thousands of social media users:

3. Guest Blogging Means Direct Conversation with Your Customers

Do you want your customers to think you are "cool"? Talk to them on their favorite (and well-respected) platform. People tend to trust and look up to the bloggers they follow. Guest blogging is one of the best ways to "redirect" this trust to your brand by using the blog to publish your guest posts.

Guest blogging provides brands with a great opportunity to start an open dialog with customers.

So what’s your guest blogging strategy?

How To Overcome Position Bias In Recommendation And Search?


In this article, we’re going to discuss the following topics:

Which types of biases exist, and how do we measure them?

Overcoming position bias with Inverse Propensity Weighting, and the downsides of that approach.

Position-aware learning: a way to teach your ML model to account for bias while training.

This article was published as a part of the Data Science Blogathon.

Biases in Ranking

Every time you present a list of things, such as search results or recommendations (or autocomplete suggestions and contact lists), to a human being, they can hardly ever impartially evaluate all the items in the list.

Model bias: When you train an ML model on historical data generated by the same model.

In practice, position bias is the strongest one, and removing it while training may improve your model's reliability.

Experiment: Measuring Position Bias

We conducted a small crowd-sourced study of position bias. Starting from the RankLens dataset, we used the Google Keyword Planner tool to generate a set of queries people use to find a particular movie.

All major crowd-sourcing platforms, like Amazon Mechanical Turk, have out-of-the-box templates for typical search evaluation:

But there’s a nice trick in such templates, preventing you from shooting yourself in the foot with position bias: each item must be examined independently. Even if multiple items are present on screen, their ordering is random!

Inverse Propensity Weighting

Relevance: the importance of the item within the current context (like a BM25 score coming from Elasticsearch, or cosine similarity in recommendations).

In the MSRD dataset mentioned in the previous paragraph, it's hard to distinguish the impact of position independently from BM25 relevance, as you only observe them combined together.

But how can you estimate the propensity in practice? The most common method is introducing a minor shuffling to rankings so that the same items within the same context (e.g., for a search query) will be evaluated on different positions.
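Once propensities have been estimated this way, the reweighting itself is simple; the per-position propensity values below are invented for illustration:

```python
# Minimal IPW sketch: a click at position k is up-weighted by
# 1 / propensity(k), where propensity(k) is the estimated chance that
# position k gets examined at all. Values below are invented.
PROPENSITY = {1: 1.0, 2: 0.6, 3: 0.4}

def ipw_label(clicked: bool, position: int) -> float:
    """Debiased training weight for a single impression."""
    return (1.0 / PROPENSITY[position]) if clicked else 0.0

# A click at position 3 counts 2.5x more than a click at position 1:
assert ipw_label(True, 3) == 2.5
assert ipw_label(True, 1) == 1.0
```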

But adding extra shuffling will definitely degrade your business metrics like CTR and Conversion Rate. Are there any less invasive alternatives not involving shuffling?

Position-Aware Learning

A position-aware approach to ranking suggests asking your ML model to optimize both ranking relevancy and position impact at the same time:

At training time, you use item position as an input feature,

In the prediction stage, you replace it with a constant value.

In other words, you trick your ranking ML model into detecting how position affects relevance during training, but zero out this feature during prediction: all items are presented as if in the same position simultaneously.

But which constant value should you choose? The authors of the PAL paper did a couple of numerical experiments on selecting the optimal value — the rule of thumb is not to pick too high positions, as there’s too much noise.
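A minimal sketch of the PAL feature trick, assuming a generic feature-dict pipeline; the constant position of 1 is an illustrative choice here, not a recommendation from the paper:

```python
# Sketch of the PAL trick as feature engineering: position is a real
# feature at training time but is frozen to a constant at prediction time.
PREDICT_POSITION = 1  # illustrative constant: "as if shown at position 1"

def training_row(relevance: float, position: int) -> dict:
    return {"relevance": relevance, "position": float(position)}

def prediction_row(relevance: float) -> dict:
    # Every candidate is scored as if displayed at the same position.
    return {"relevance": relevance, "position": float(PREDICT_POSITION)}

rows = [training_row(0.8, 1), training_row(0.8, 7)]   # model sees real positions
scored = [prediction_row(0.8), prediction_row(0.5)]   # bias feature zeroed out
```

The model can then attribute differing click rates for the two 0.8-relevance rows to position alone, and that learned effect is neutralized at prediction time.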

Practical PAL

The PAL approach is already a part of multiple open-source tools for building recommendations and searches:

ToRecSys implements PAL as a bias-elimination approach to train recommender systems on biased data.

Metarank can use a PAL-driven feature to train an unbiased LambdaMART Learn-to-Rank model.

As the position-aware approach is just a hack around feature engineering, in Metarank it is only a matter of adding yet another feature.

On an MSRD dataset mentioned above, such a PAL-inspired ranking feature has quite a high importance value compared to other ranking features:


The position-aware learning approach is not only limited to pure ranking tasks and position de-biasing: you can use this trick to overcome any other type of bias:

For the presentation bias due to a grid layout, you can introduce a pair of features for an item's row and column position during training, but swap them to constants during prediction.

The ML model trained with the PAL approach should produce an unbiased prediction. Considering the simplicity of the PAL approach, it can also be applied in other areas of ML where biased training data is a usual thing.

While conducting this research, we made the following main observations:

Position bias can be present even in unbiased datasets.

Shuffling-based approaches like IPW can overcome the problem of bias, but introducing extra jitter in predictions may cost you a lot by lowering business metrics like CTR.

The Position-aware learning approach makes your ML model learn the impact of bias, improving the prediction quality.



How To Play The Sims 3 In Linux

Note: This is NOT a guide on how to pirate The Sims. You will still need a working install DVD and license key.


Of the multiple approaches to running Windows software on Linux, PlayOnLinux is without a doubt the best choice for Sims 3. This is not because it's any more "capable" of running the program, but because PlayOnLinux provides an install script specifically for this game, which makes the process far simpler than it would be on Cedega or Wine alone.

I recommend that you do not use the PlayOnLinux package provided by your distribution. In testing for this article, I found the pre-packaged versions to be much less functional and reliable than those provided by the PlayOnLinux website. They have up-to-date packages for nearly every major distribution available here.

Beginning Installation

For the record, I believe they are incorrect about Shop Mode not working, as I had no trouble building and furnishing a house.

A Note About Prefixes

PlayOnLinux uses prefixes to isolate game installations. When you install a game such as The Sims, you get the equivalent of a new instance of Windows for that game. If you then install a different game such as Spore, PlayOnLinux will create a NEW instance of Windows (a new prefix) for Spore to run. This prevents your files and settings for one game from interfering with another.


This is where PlayOnLinux really shines for Sims 3. On plain Wine or Cedega, you’d have to manually install several packages into your prefix before you can even begin the actual Sims install. Fortunately for us, PlayOnLinux takes care of all that. Before Sims begins, you’ll be prompted to install packages such as Gecko, the Microsoft C++ Runtime Library, and Mono. Allow each of these to complete before moving forward to the next step.

You’ll be given a choice as to whether you’d like to install through the DVD or the downloaded package. Either should work with PlayOnLinux, but the remainder of this guide will be using the DVD edition.

Sims Install

When the dependencies have completed, you’ll be asked to insert your media. CD/DVD drives should be detected automatically, but if you have trouble, you can specify a location by choosing Other. Pick the drive and move to the next step.

STOP. At this point it should have launched the Sims 3 Installer from your DVD. If that didn’t happen, go back and verify that you’ve selected the proper location of your installer.

If you DID get the Sims installer, then proceed exactly as you would in Windows.

If asked about the Download Manager, I’d recommend that you not install it. While it’s possible it may work on your setup, it has caused nothing but trouble on the tests I’ve done. Game patches can be applied manually (discussed in more detail at the end of this guide).

When the Sims 3 Installer window is finished and closed, proceed to the next PlayOnLinux screen. You’ll be asked a little bit of basic information about your video card and if you’d like to create shortcuts.

Important – Before You Play

In the opening paragraph, I hinted that one of the problems with getting Sims 3 to run was because of the copy protection. In order to work around this problem you’ve got to replace the “TS3.exe” file with one that does not contain this copy protection. Unfortunately MakeTechEasier cannot provide such “cracks” or even links to them. You should have no problem obtaining more information from big brother G.

For Sims 3 to run, you'll have to find a modified TS3.exe on your own, and use it to replace the one in your Sims 3 installation. This will likely be found in "~/.PlayOnLinux/wineprefix/TheSims3/drive_c/Program Files/Electronic Arts/The Sims 3/Game/Bin“.

Once that’s done, you’re ready to play!

Extra – Getting Updates

Earlier I recommended that you skip installing the EA Download Manager. This leaves us with no updates to the game, and Sims 3 is certainly not without its glitches. Fortunately PlayOnLinux has a Patches category which includes a script to install Sims 3 updates manually.

That’s it. Enjoy!

Joshua Price

Josh Price is a senior MakeTechEasier writer and owner of Rain Dog Software


The Biggest (And Strangest) Moments Of E3 2023

I think it’s clear that this wouldn’t have been much of an E3 to write home about without Microsoft and Nintendo. I would also count Geoff Keighley’s Summer Game Fest as a major contributor, even though it technically wasn’t part of E3 proper. It was at those shows, however, that we got the biggest announcements.

As for the single biggest announcement of the entire event, that’s a hard thing to call. Summer Game Fest, for instance, gave us not only our first look at Elden Ring gameplay footage but also gave us a release date for the game. Then you had Microsoft revealing things like Forza Horizon 5, Halo Infinite multiplayer, and even the release date for Bethesda’s Starfield. Putting a cap on the show was Nintendo, which shared another teaser for Breath of the Wild 2 and told us that it’s targeting a 2023 release date for the game.

These were the big, show-stopping reveals that people tune in to E3 specifically to see. There were plenty of other big and exciting reveals during all three shows, but if I had to pick the biggest of the event as a whole, I would count all of those I just listed. If my feet were held to the fire and I had to pick the show's single biggest reveal, I would probably say it was Elden Ring.

The trailer we saw for Elden Ring was much more substantial than, say, the trailer for Breath of the Wild 2, and Souls fans have been waiting a very long time to hear more about it. People have been asking for more on Elden Ring for so long that Geoff Keighley even expressed relief that it was finally shown this year, proclaiming, “I hope you guys are happy – I’m free! Out of prison!” after the trailer debuted. We can only imagine how much that poor man has been hounded by rabid Dark Souls fans wanting to know more about Elden Ring over the past two years.

While Elden Ring definitely stole the show during Summer Game Fest’s Kickoff Live, there were other exciting announcements during that show as well. For instance – and this is coming from someone who feels a little cool toward Borderlands these days – I think that Tiny Tina’s Wonderlands sounds like an awesome concept for a Borderlands spin-off and I’m excited to see more.

I also think that Metal Slug Tactics looks great, but I’ve always been a sucker for a good tactics game. As SlashGear’s resident man-child who never stopped thinking that dinosaurs are awesome, I’m tentatively excited for Jurassic World Evolution 2 as well. However, I’m hoping that Frontier goes a little deeper with the simulation mechanics this time around.

The Xbox and Bethesda Showcase was packed from start to finish with game announcements as well. I’m very excited for Forza Horizon 5, especially after confirmation that we’ll be heading to Mexico in this installment. Not only is this American excited to explore Forza’s take on a country like Mexico, but after entries in Australia and Great Britain, he’s also ready to start driving on the right side of the road again.

Microsoft’s conference also announced The Outer Worlds 2 in what was quite possibly the best trailer of E3 2023, and it gave us a big surprise when it revealed Diablo II Resurrected’s release date. Redfall, which is an upcoming game from Arkane Austin, sounds promising, but of course, we need to see the game in action as well. The conference also brought word that Microsoft is making the Xbox Series X mini-fridge a reality, and I think that’s wonderful.

Then we had Nintendo, which covered a lot of ground as well. As I already said, Breath of the Wild 2 was unquestionably the biggest part of the show, but Metroid Dread – which served as the Direct’s opener – definitely gave it a run for its money. The new WarioWare game looks fantastic, and I know I’m not alone in saying that I’m super excited for the return of Advance Wars. It has been far too long since we heard from the Advance Wars series, so hopefully, these remakes that are on the way to Switch signal a larger revival for the franchise.

While the presentations from smaller publishers like Square Enix and Ubisoft were a little lighter than they usually are, there were still some exciting announcements to be found during those events. I’m very interested in hearing more about the so-called “pixel remasters” of the first six Final Fantasy games that were announced during Square Enix’s show, while Mario + Rabbids Sparks of Hope was not only the highlight of Ubisoft’s show for me but a highlight of E3 in general.
