Google changes the algorithm; nothing new but what about the bias of coders?


Image001

Here is the thinking, which has wider implications than a small change at Google….

If you took a complex algorithm and asked a 15 year old, a 30 year old and a 65 year old; both male and female, from different countries, using different computing languages and compliers to cut some code: will you get the same output from the same test datasets using the different implementations of the algorithm? – Probably not!

So changing the algorithm is one thing; changing compliers (and who coded that), language and the age, sex, experience (life and skills) of the coders is another……but we depend on them.

Yes there are tools to help ensure maintainability, supportability, scalability, performance and conformity but we do have a massive and increasing reliance on the coders ethics and lack of bias in the way the interrupt an algorithm……just wondering who is thinking about this as well.

Why this is important to digital footprints. Someone you don’t know is taking your data and predicting your future based on their and others bias…..

-------

From the Google blog Ten recent algorithm changes

11/14/11 | 8:30:00 AM

Today we’re continuing our long-standing series of blog posts to share the methodology and process behind our search ranking, evaluation and algorithmic changes. This summer we published a video that gives a glimpse into our overall process, and today we want to give you a flavor of specific algorithm changes by publishing a highlight list of many of the improvements we’ve made over the past couple weeks.

We’ve published hundreds of blog posts about search over the years on this blog, our Official Google Blog, and even on my personal blog. But we’re always looking for ways to give you even deeper insight into the over 500 changes we make to search in a given year. In that spirit, here’s a list of ten improvements from the past couple weeks:

  • Cross-language information retrieval updates: For queries in languages where limited web content is available (Afrikaans, Malay, Slovak, Swahili, Hindi, Norwegian, Serbian, Catalan, Maltese, Macedonian, Albanian, Slovenian, Welsh, Icelandic), we will now translate relevant English web pages and display the translated titles directly below the English titles in the search results. This feature was available previously in Korean, but only at the bottom of the page. Clicking on the translated titles will take you to pages translated from English into the query language.
  • Snippets with more page content and less header/menu content: This change helps us choose more relevant text to use in snippets. As we improve our understanding of web page structure, we are now more likely to pick text from the actual page content, and less likely to use text that is part of a header or menu.
  • Better page titles in search results by de-duplicating boilerplate anchors: We look at a number of signals when generating a page’s title. One signal is the anchor text in links pointing to the page. We found that boilerplate links with duplicated anchor text are not as relevant, so we are putting less emphasis on these. The result is more relevant titles that are specific to the page’s content.
  • Length-based autocomplete predictions in Russian: This improvement reduces the number of long, sometimes arbitrary query predictions in Russian. We will not make predictions that are very long in comparison either to the partial query or to the other predictions for that partial query. This is already our practice in English.
  • Extending application rich snippets: We recently announced rich snippets for applications. This enables people who are searching for software applications to see details, like cost and user reviews, within their search results. This change extends the coverage of application rich snippets, so they will be available more often.
  • Retiring a signal in Image search: As the web evolves, we often revisit signals that we launched in the past that no longer appear to have a significant impact. In this case, we decided to retire a signal in Image Search related to images that had references from multiple documents on the web.
  • Fresher, more recent results: As we announced just over a week ago, we’ve made a significant improvement to how we rank fresh content. This change impacts roughly 35 percent of total searches (around 6-10% of search results to a noticeable degree) and better determines the appropriate level of freshness for a given query.
  • Refining official page detection: We try hard to give our users the most relevant and authoritative results. With this change, we adjusted how we attempt to determine which pages are official. This will tend to rank official websites even higher in our ranking.
  • Improvements to date-restricted queries: We changed how we handle result freshness for queries where a user has chosen a specific date range. This helps ensure that users get the results that are most relevant for the date range that they specify.
  • Prediction fix for IME queries: This change improves how Autocomplete handles IME queries (queries which contain non-Latin characters). Autocomplete was previously storing the intermediate keystrokes needed to type each character, which would sometimes result in gibberish predictions for Hebrew, Russian and Arabic.

If you’re a site owner, before you go wild tuning your anchor text or thinking about your web presence for Icelandic users, please remember that this is only a sampling of the hundreds of changes we make to our search algorithms in a given year, and even these changes may not work precisely as you’d imagine. We’ve decided to publish these descriptions in part because these specific changes are less susceptible to gaming.

For those of us working in search every day, we think this stuff is incredibly exciting -- but then again, we’re big search geeks. Let us know what you think and we’ll consider publishing more posts like this in the future.