Google’s Approach to Automatically Identifying Related Documents

Publishers are always focused on how to make a customer’s search experience more relevant. One way for providers of search solutions to add more value is to help the customer find documents that might not have been retrieved in the original answer set.

A common technique is for expert editors to identify documents that are meaningfully related to other documents. For example, Wolters Kluwer’s expert editors identify explanations, annotations and regulations that are relevant to a section of legislation. While viewing a legislative section, the end user can click on “Related Documents” and browse the resulting list.

Recent patent filings by Google reveal an interesting perspective on providing links to related documents. In US Patent Application 20100153422, published on June 17, 2010, Google discusses techniques for automatically recommending documents related to the document that a customer is viewing. The recommendations can be based on content in the document itself. But other inputs can include information about the actual end user, such as geographical location, previous search queries, links selected by the user and even personal information that the user provided when registering for an account. There is even discussion of creating user profiles based on the prior search queries that an individual ran. I find these ideas interesting, but I also have several questions and concerns.

First, I might search Google one time for personal reasons and another time for business reasons. The business and personal search queries have no connection to each other. Is there any way to construct a meaningful profile based on such a search history? Second, I travel frequently to several countries. The geographical location of my laptop is frequently different from the desired geographical location of the search results. How can this mismatch be resolved? Third, there are privacy concerns. Do I really want Google to store all of that information? Another issue, of course, is editorial expertise. If a customer views a section of legislation retrieved through Google, will Google’s algorithm be able to identify meaningfully related regulations and cases? That seems like quite a challenge.

I applaud Google’s continuing efforts to make search results more relevant to the customer by automatically identifying related documents. I’m a frequent user of Google and am impressed by how it presents related searches and seems to rank search results intelligently. But it does not always answer my question, and I continue to use a lot of professional databases. So I ask myself: should Wolters Kluwer’s research products adjust related documents based on a customer’s search history or user profile? Is there a single profile for each end user? These ideas sound appealing. What do you think?

 
