Monday, October 8, 2012

Another step to reward high-quality sites

Google has said before that search engine optimization, or SEO, can be positive and constructive—and we're not the only ones. Effective search engine optimization can make a site more crawlable and make individual pages more accessible and easier to find. Search engine optimization includes things as simple as keyword research to ensure that the right words are on the page, not just industry jargon that normal people will never type.

“White hat” search engine optimizers often improve the usability of a site, help create great content, or make sites faster, which is good for both users and search engines. Good search engine optimization can also mean good marketing: thinking about creative ways to make a site more compelling, which can help with search engines as well as social media. The net result of making a great site is often greater awareness of that site on the web, which can translate into more people linking to or visiting a site.

The opposite of “white hat” SEO is something called “black hat webspam” (we say “webspam” to distinguish it from email spam). In the pursuit of higher rankings or traffic, a few sites use techniques that don’t benefit users, where the intent is to look for shortcuts or loopholes that would rank pages higher than they deserve to be ranked. We see all sorts of webspam techniques every day, from keyword stuffing to link schemes that attempt to propel sites higher in rankings.

The goal of many of our ranking changes is to help searchers find sites that provide a great user experience and fulfill their information needs. We also want the “good guys” making great sites for users, not just algorithms, to see their effort rewarded. To that end we’ve launched Panda changes that successfully returned higher-quality sites in search results. And earlier this year we launched a page layout algorithm that reduces rankings for sites that don’t make much content available “above the fold.”

In the next few days, we’re launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines. We’ve always targeted webspam in our rankings, and this algorithm represents another improvement in our efforts to reduce webspam and promote high-quality content. We can't divulge specific signals, because we don't want to give people a way to game our search results and worsen the experience for users. Our advice for webmasters is to focus on creating high-quality sites that provide a good user experience and to employ white hat SEO methods instead of engaging in aggressive webspam tactics.

Here’s an example of a webspam tactic like keyword stuffing taken from a site that will be affected by this change: 


Of course, most sites affected by this change aren’t so blatant. Here’s an example of a site with unusual linking patterns that is also affected by this change. Notice that if you try to read the text aloud you’ll discover that the outgoing links are completely unrelated to the actual content, and in fact the page text has been “spun” beyond recognition: 


Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in webspam tactics to manipulate search engine rankings.

The change will go live for all languages at the same time. For context, the initial Panda change affected about 12% of queries to a significant degree; this algorithm affects about 3.1% of queries in English to a degree that a regular user might notice. The change affects roughly 3% of queries in languages such as German, Chinese, and Arabic, but the impact is higher in more heavily spammed languages. For example, 5% of Polish queries change to a degree that a regular user might notice.

We want people doing white hat search engine optimization (or even no search engine optimization at all) to be free to focus on creating amazing, compelling web sites. As always, we’ll keep our ears open for feedback on ways to iterate and improve our ranking algorithms toward that goal.

Wednesday, March 14, 2012

Farewell to soft 404s

We see two kinds of 404 ("File not found") responses on the web: "hard 404s" and "soft 404s." We discourage the use of so-called "soft 404s" because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve "soft 404s" return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

How does a soft 404 look to the user? Here's a mockup of a soft 404: This site returns a 200 response code and the site's homepage for URLs that don't exist.



As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.
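One rough way to check whether your own site serves soft 404s is to request a URL that almost certainly doesn't exist and look at the response code. The sketch below is only an illustration, not how Googlebot detects soft 404s: the function names and the `/soft-404-probe-` path prefix are made up for this example.

```python
import urllib.error
import urllib.request
import uuid
from urllib.parse import urljoin


def probe_url(base_url):
    """Build a URL that almost certainly does not exist on the site.

    The "/soft-404-probe-" prefix is a hypothetical name for this sketch.
    """
    return urljoin(base_url, "/soft-404-probe-" + uuid.uuid4().hex)


def classify(status_code):
    """Classify the response code returned for a URL known not to exist."""
    if status_code in (404, 410):
        return "hard 404"   # the server correctly reports the page as missing
    if 200 <= status_code < 300:
        return "soft 404"   # the server claims success for a missing page
    return "other"          # e.g. a 5xx error; needs a closer look


def check_site(base_url):
    """Fetch the probe URL and classify the result (makes a network request)."""
    try:
        # urlopen follows redirects, so a redirect to the homepage ends up as 200.
        with urllib.request.urlopen(probe_url(base_url), timeout=10) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        status = error.code
    return classify(status)
```

Calling `check_site("http://www.example.com/")` against your own domain gives a quick first impression; a "soft 404" result means the missing URL came back with a success code.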

What should you do instead of returning a soft 404?
It's much better to return a 404 response code and clearly explain to users that the file wasn't found. This makes search engines and many users happy.

Return 404 response code



Return clear message to users



Can your webserver return 404, but send a helpful "Not found" message to the user?
Of course! More info as "404 week" continues!
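As a minimal sketch of that idea, here is a handler built on Python's standard `http.server` module that sends a real 404 response code together with a friendly page. The `PAGES` table, `respond` function, and `FriendlyHandler` class are hypothetical names for this example; a production site would normally configure this in its web server or framework instead.

```python
from html import escape
from http.server import BaseHTTPRequestHandler

# Hypothetical site content for this sketch: path -> HTML body.
PAGES = {
    "/": "<h1>Home</h1>",
    "/about": "<h1>About</h1>",
}


def respond(path):
    """Return (status_code, html) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # Unknown path: a helpful message, paired with a 404 code rather than 200.
    body = (
        "<h1>Not found</h1>"
        f"<p>We could not find {escape(path)}.</p>"
        '<p>Try the <a href="/">homepage</a> instead.</p>'
    )
    return 404, body


class FriendlyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string; this sketch ignores that.
        status, html = respond(self.path)
        payload = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To try it locally:
#   from http.server import HTTPServer
#   HTTPServer(("", 8000), FriendlyHandler).serve_forever()
```

The user sees a clear explanation and a link to the homepage, while search engines see the 404 code and know the URL doesn't exist.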



More on 404

Now that we've bid farewell to soft 404s, in this post for 404 week we'll answer your burning 404 questions.

How do you treat the response code 410 "Gone"?
Just like a 404.

Do you index content or follow links from a page with a 404 response code?
We aim to understand as much as possible about your site and its content. So while we wouldn't want to show a hard 404 to users in search results, we may utilize a 404's content or links if it's detected as a signal to help us better understand your site.

Keep in mind that if you want links crawled or content indexed, it's far more beneficial to include them in a non-404 page.

What about 404s with a 10-second meta refresh?
Yahoo! currently utilizes this method on their 404s. They respond with a 404, but the 404 content also shows:



We feel this technique is fine because it reduces confusion by giving users 10 seconds to make a new selection, only offering the homepage after 10 seconds without the user's input.
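To illustrate the general pattern (this is not the markup from Yahoo!'s actual page, which isn't reproduced here), the sketch below builds the body of a 404 page that offers the homepage after a delay. The function name and the example.com URL are placeholders; the page would still need to be served with a 404 response code.

```python
from html import escape


def not_found_page_with_refresh(home_url, delay_seconds=10):
    """Build the HTML for a 404 page that redirects home after a delay.

    This only builds the body; the server must still send it with a
    404 response code, not a 200.
    """
    safe_url = escape(home_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        "<title>Page not found</title>\n"
        # The meta refresh waits delay_seconds, then loads the homepage.
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe_url}">\n'
        "</head>\n"
        "<body>\n"
        "<h1>Sorry, that page was not found.</h1>\n"
        f'<p>You will be taken to the <a href="{safe_url}">homepage</a> '
        f"in {delay_seconds} seconds, or you can pick a link now.</p>\n"
        "</body>\n"
        "</html>\n"
    )


page = not_found_page_with_refresh("http://www.example.com/")
```

Because the delay is a parameter, it's easy to keep it long enough for users to read the message and make their own choice first.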

Should I 301-redirect misspelled 404s to the correct URL?
Redirecting/301-ing 404s is a good idea when it's helpful to users (i.e. not confusing like soft 404s). For instance, if you notice that the Crawl Errors section of Webmaster Tools shows a 404 for a misspelled version of your URL, feel free to 301 the misspelled version of the URL to the correct version.

For example, if we saw this 404 in Crawl Errors:
http://www.google.com/webmsters  <-- typo for "webmasters"

we may first correct the typo if it exists on our own site, then 301 the URL to the correct version (as the broken link may occur elsewhere on the web):
http://www.google.com/webmasters
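In code, the typo fix above could be handled with a small redirect table that is checked before falling back to a 404. This is just a sketch: `REDIRECTS`, `KNOWN_PATHS`, and `resolve` are hypothetical names, and most sites would put such rules in their web server configuration instead.

```python
# Hypothetical redirect table: misspelled path -> correct path.
REDIRECTS = {
    "/webmsters": "/webmasters",
}

# Hypothetical set of paths that really exist on the site.
KNOWN_PATHS = {"/", "/webmasters"}


def resolve(path):
    """Return (status_code, location) for a requested path.

    location is only set for redirects; it is None otherwise.
    """
    if path in REDIRECTS:
        # A known misspelling: permanently redirect to the correct URL.
        return 301, REDIRECTS[path]
    if path in KNOWN_PATHS:
        return 200, None
    # Everything else is a genuine 404, not a redirect to the homepage.
    return 404, None
```

Only known misspellings get a 301 here; redirecting every unknown URL would bring back the same confusion as a soft 404.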

Have you guys seen any good 404s?
Yes, we have! (Confession: no one asked us this question, but few things are as fun to discuss as response codes. :) We've put together a list of some of our favorite 404 pages. If you have more 404-related questions, let us know, and thanks for joining us for 404 week!
http://www.metrokitchen.com/nice-404-page
"If you're looking for an item that's no longer stocked (as I was), this makes it really easy to find an alternative."
-Riona, domestigeek

http://www.comedycentral.com/another-404
"Blame the robot monkeys"
-Reid, tells really bad jokes

http://www.splicemusic.com/and-another
"Boost your 'Time on site' metrics with a 404 page like this."
-Susan, dabbler in music and Analytics

http://www.treachery.net/wow-more-404s
"It's not reassuring, but it's definitive."
-Jonathan, has trained actual spiders to build websites, ants handle the 404s

http://www.apple.com/iPhone4g
"Good with respect to usability."

http://thcnet.net/lost-in-a-forest
"At least there's a mailbox."
-JohnMu, adventurous

http://lookitsme.co.uk/404
"It's pretty cute. :)"
-Jessica, likes cute things

http://www.orangecoat.com/a-404-page.html
"Flow charts rule."
-Sahala, internet traveller

http://icanhascheezburger.com/iz-404-page
"I can has useful links and even e-mail address for questions! But they could have added 'OH NOES! IZ MISSING PAGE! MAYBE TIPO OR BROKN LINKZ?' so folks'd know what's up."
-Adam, lindy hop geek

Source: http://googlewebmastercentral.blogspot.com/2008/08/now-that-weve-bid-farewell-to-soft-404s.html
