Web Search Query - Characteristics

Most commercial web search engines do not disclose their search logs, so information about what users are searching for on the Web is difficult to come by. Nevertheless, a 2001 study that analyzed queries from the Excite search engine showed some interesting characteristics of web search:

  • The average length of a search query was 2.4 terms.
  • About half of the users entered a single query while a little less than a third of users entered three or more unique queries.
  • Close to half of the users examined only the first one or two pages of results (10 results per page).
  • Less than 5% of users used advanced search features (e.g., boolean operators like AND, OR, and NOT).
  • The four most frequently used terms were (empty search), "and", "of", and "sex".
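
Statistics of this kind can be computed directly from a query log. The following is a minimal sketch in Python, assuming a hypothetical in-memory log of (user_id, query) pairs. The function name `summarize_log`, the log layout, and the sample data are illustrative assumptions and are not taken from the Excite study.

```python
from collections import defaultdict

# Boolean operators treated as "advanced" features (an assumption for this sketch).
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def summarize_log(log):
    """Compute simple per-query and per-user statistics.

    `log` is an iterable of (user_id, query) pairs; this layout is hypothetical.
    """
    log = list(log)
    if not log:
        raise ValueError("query log is empty")

    # Average number of whitespace-separated terms per query.
    term_counts = [len(query.split()) for _, query in log]
    mean_terms = sum(term_counts) / len(term_counts)

    # Collect the set of unique (case-insensitive) queries issued by each user.
    queries_by_user = defaultdict(set)
    for user_id, query in log:
        queries_by_user[user_id].add(query.lower())
    single_query_users = sum(1 for qs in queries_by_user.values() if len(qs) == 1)

    # Count queries that contain an upper-case boolean operator as a term.
    boolean_queries = sum(
        1 for _, query in log if BOOLEAN_OPERATORS & set(query.split())
    )

    return {
        "mean_terms_per_query": mean_terms,
        "share_single_query_users": single_query_users / len(queries_by_user),
        "share_boolean_queries": boolean_queries / len(log),
    }


# Made-up sample data, for illustration only.
sample_log = [
    ("u1", "cheap flights"),
    ("u2", "weather boston"),
    ("u2", "weather boston tomorrow"),
    ("u3", "jaguar AND car"),
]
stats = summarize_log(sample_log)
```

Splitting on whitespace is a simplification; a real analysis would need to decide how to treat punctuation, quoted phrases, and empty searches.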

A study of the same Excite query logs revealed that 19% of the queries contained a geographic term (e.g., place names, zip codes, geographic features, etc.).

A 2005 study of Yahoo's query logs revealed that 33% of the queries from the same user were repeat queries and that 87% of the time the user would click on the same result. This suggests that many users use repeat queries to revisit or re-find information. This analysis is supported by a Bing search engine blog post stating that about 30% of queries are navigational queries.

In addition, much research has shown that query term frequency distributions conform to a power law, or long-tail, distribution. That is, a small portion of the terms observed in a large query log (e.g., more than 100 million queries) are used most often, while the remaining terms are each used less often. This example of the Pareto principle (or 80–20 rule) allows search engines to employ optimization techniques such as index or database partitioning, caching, and pre-fetching.
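
One way to see how concentrated term usage is in a log is to measure the share of all term occurrences covered by the k most frequent terms. The sketch below assumes a hypothetical list of query strings; the function name `top_k_coverage` and the sample queries are made up for illustration.

```python
from collections import Counter


def top_k_coverage(queries, k):
    """Return the fraction of all term occurrences covered by the k most frequent terms."""
    # Count every whitespace-separated, lower-cased term across all queries.
    counts = Counter(term for query in queries for term in query.lower().split())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no terms found in the query list")
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Made-up sample queries, for illustration only.
sample_queries = [
    "weather boston",
    "weather today",
    "weather radar",
    "news today",
    "boston marathon route",
]
coverage = top_k_coverage(sample_queries, 2)
```

On a real log with a heavy-tailed distribution, a small k would be expected to cover a large share of occurrences, which is the property that makes caching results for frequent terms attractive.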

A 2011 study, however, found that the average length of queries has grown steadily over time, and that the average length of non-English queries has increased more than that of English queries.
