Evaluating Search Engines for Chemistry

Part of The Alchemist's Lair Web Site
Maintained by Harry E. Pence, Professor of Chemistry, SUNY Oneonta, for the use of his students. Any opinions are totally coincidental and have no official endorsement, including from the people who sign my paychecks. Comments and suggestions are welcome (pencehe@oneonta.edu).

Last Revised Nov. 4, 2000


More up-to-date information on this topic is now available at An Interim Update on WWW Search Engines for Chemistry on this site.

Evaluating WWW Search Engines for Chemistry, Harry E. Pence, Richard Bachelder, Michael Branciforti, Susan Donadio, Brian John, Joo M Jung, Matthew Glidden, Melanie Krom, Kelly Modoo, and Todd Morris, SUNY Oneonta, Oneonta, NY

INTRODUCTION

In a relatively short time, the World Wide Web has become a widely used source of chemical information for both college students and faculty. At its best, the Web is a rapid and convenient way to search; at its worst, it can be frustrating and time-wasting. Frequently, the difference between success and failure on the Web depends on the search engine chosen. Thus, search engine selection can be a critical decision for chemists who use the WWW.

For many Web users, including some chemists, the choice of a search engine is based more on Web location or external advertising than on any other criterion. Engines that are well placed on popular Web portals or are widely advertised in the media seem to attract far more users, regardless of how useful they are. This selection process may be adequate for casual surfers, but chemists need to be more selective if they expect to find the specialized information they need.

Many popular computer journals review search engines, but these evaluations are intended for the general user. A good summary of such reviews is available on the Web. One effort to focus specifically on chemistry is Best Search Engines for Finding Scientific Information on the Web, which was developed by Alexander Lebedev of Moscow University. On August 3, 1996, and again on February 10, 1997, Lebedev compared the number of hits recorded by eleven different search engines for eight different keywords important in physics and/or chemistry. He discovered that the number of hits could differ by several orders of magnitude from one search engine to another. Unfortunately, this work has not been updated since May 17, 1997.

The rate of change on the Web is so rapid that Lebedev's results need to be reevaluated. Even in two years there have been important changes, including the elimination or modification of some search engines in the original study. In May of 1999, the senior chemistry seminar class at SUNY Oneonta set out to update Lebedev's results.

CRITERIA FOR SELECTING A SEARCH ENGINE

There are at least three important criteria that should be used to evaluate search engines: comprehensiveness, currency, and efficiency. Comprehensiveness is a measure of what fraction of the total web sites the search engine actually reviews. This is particularly important for chemists. An article by Steve Lawrence and C. Lee Giles, NEC Research Institute (Science, 280, April 3, 1998, pp. 98-100) reported that not all Web sites can be accessed by search engines and that even the best of the search engines misses over a third of these accessible sites. (A summary of this article is available on the WWW.) Since engines are more likely to identify popular sites, that is, those with many links to other pages, this partially explains why chemists cannot find the specialized pages they need.

Currency measures how often the search engine revisits sites to determine whether or not there have been any changes. Not only are new web sites constantly being created, but many sites are also vanishing. Failure to keep up to date can produce useless links to pages that no longer exist. In September, 1998, a further study by Lawrence and Giles concluded that the Web is growing faster than search engine coverage, and that engines are returning a greater percentage of dead links. The situation is getting worse, not better.

The final concern is efficiency. Are the most useful sites not just included but listed early in the search results? This is probably the most difficult criterion to evaluate quantitatively.

Lebedev argues that the number of documents is most important when looking for scientific information and so focused mainly on comprehensiveness. He argues that the number of scientific publications is only 10-20% of the total number of documents found by search engines, and so listing more documents increases the probability that nothing useful will be missed. This approach ignores two other important criteria mentioned above, currency and efficiency, but it does provide a helpful perspective for chemists.

Lebedev chose a short list of scientific terms and recorded the number of hits for each term on each search engine. He found that the number of hits changed by several orders of magnitude from one search engine to another. Based on his results, he recommended AltaVista as the most comprehensive search engine.
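The comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Lebedev's actual procedure: the function names, the engine labels, and the hit counts are all made-up placeholders, and no real data from his study or from Table I is reproduced here.

```python
import math

def orders_of_magnitude_spread(hits_by_engine):
    """Return log10(max/min) over the engines that reported at least one hit."""
    counts = [n for n in hits_by_engine.values() if n > 0]
    if len(counts) < 2:
        return 0.0
    return math.log10(max(counts) / min(counts))

def most_comprehensive(hits_by_engine):
    """Return the name of the engine with the largest hit count."""
    return max(hits_by_engine, key=hits_by_engine.get)

# Hypothetical hit counts for a single search term (placeholders, not survey data).
example_hits = {"EngineA": 120000, "EngineB": 3500, "EngineC": 120}

spread = orders_of_magnitude_spread(example_hits)
print(f"Spread: {spread:.1f} orders of magnitude")
print("Largest count:", most_comprehensive(example_hits))
```

Working on a logarithmic scale is a natural choice here because the counts differ by factors of hundreds or thousands rather than by a few percent.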

SEARCH ENGINE RESULTS

During May 1999, students in the senior seminar at SUNY Oneonta repeated the survey that had previously been done by Lebedev, with several changes. The list of search engines was modified by eliminating those that gave very few hits with scientific search terms, as well as those that had changed format in such a way that they could no longer be compared. Yahoo, which is highly rated for general use, consistently returns very small numbers of hits for these scientific terms, and so was eliminated. Two search engines, Northern Light and Microsoft Network, were added, since these are reputed to give good results. A slightly shorter list of search terms was used. The results are shown in Table I (which will open in another window). The 1996 and 1997 results are from Lebedev's study and the 1999 results are from the current project.

DISCUSSION

Several of the trends reported by Lebedev continue in the most recent data. During the period from 1996 to 1997, two search engines, Inktomi and NlightN, terminated or became inaccessible. Since then, Magellan has changed to focus mainly on forming chat groups, and Lycos no longer returns the number of hits. The number of hits recorded with Excite decreased in each case from 1996 to 1997, and these values were, in turn, even lower in 1999. AltaVista usually returned the greatest number of hits. Although Lebedev reported that the number from AltaVista declined from 1996 to 1997, the latest data generally show that these values have increased, often quite substantially. Northern Light, which was not among the engines in Lebedev's study, competes best with AltaVista, and in some cases even provides more hits.

Several alternative sources of evaluations tend to confirm these results. One of the most useful sources of information about search engines is the Search Engine Watch site edited by Danny Sullivan. This site compares search engines in several ways, including the size of each search engine's index. The most recent results from that site (May 1, 1999) indicate that AltaVista has the largest index, followed by Northern Light, then Inktomi (used by several engines, including HotBot and MSN). A larger index indicates a greater chance of finding unusual information, which would presumably include chemical terms.

The article by Lawrence and Giles (Science, 280, April 3, 1998, pp. 98-100) mentioned previously reports that the most comprehensive engines are HotBot (which is powered by Inktomi), AltaVista, and Northern Light (in that order). In their later report, they note that in comparison to their previous study, Northern Light has significantly increased its coverage relative to the other engines, and the difference between the largest and smallest coverage of the engines is not as great. All of these results are in agreement with the results obtained in this paper. Finally, it should be noted that Lawrence and Giles found that Northern Light, Microsoft Network, and Lycos returned about twice as high a percentage of invalid links as the other engines.

LIMITATIONS OF THE METHOD

Perhaps the most surprising result of the project was the discovery that searching the same term on the same engine at two different times did not always give the same results. This was rather puzzling, especially since the differences were very large, sometimes as much as 30%. Steve Lawrence (NEC Research Institute), who has done so much excellent work on search engines, was kind enough to reply to an inquiry about this behavior. He suggested that,

The most likely answer is that the search engines are doing some kind of partial search. An example might be that a search engine imposes some maximum time limit to a query, and if the time limit is reached then the engine returns the results found so far. Hence the engine might return fewer results during periods of high load. AltaVista seems to be quite good at returning different numbers of results for the same query! Note that if you repeat a query in a short interval you are most likely getting a cached response, so the results will be the same.

Our results do agree with Steve's suggestion that this problem is particularly common with AltaVista. This suggests that, if completeness of response is an important consideration, it would be wise to search during times when the search engine is less likely to be busy. Perhaps more important, it suggests that this method for search engine evaluation must be used with some caution. To minimize this type of problem, most of the results in the current project were repeated several times and the results are averages.
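The averaging step mentioned above can be illustrated with a short Python sketch. It assumes that the variation between runs is expressed as (largest count - smallest count) / average; the source does not say how the 30% figure was computed, so that definition, the function name, and the sample counts are all assumptions made for illustration.

```python
from statistics import mean

def summarize_repeats(counts):
    """Return (average, relative_spread) for repeated hit counts of one query.

    relative_spread is (max - min) / average, so 0.30 would mean the runs
    differed by 30% of their average. This definition is an assumption.
    """
    if not counts:
        raise ValueError("need at least one hit count")
    avg = mean(counts)
    spread = (max(counts) - min(counts)) / avg if avg else 0.0
    return avg, spread

# Hypothetical counts from repeating one query at different times of day
# (placeholders, not values from the project).
repeats = [9000, 10000, 11000]
avg, spread = summarize_repeats(repeats)
print(f"average = {avg:.0f}, relative spread = {spread:.0%}")
```

Reporting the spread alongside the average makes it easier to see which engines give unstable counts and therefore which table entries deserve the most caution.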

Perhaps a more important caveat is the question of whether or not the total number of hits is a sound method for search engine evaluation. A large index may make it more likely that all of the relevant sites are listed, but when the number of citations is as much as one hundred thousand, it is unlikely that even the most dedicated searcher will review all of the material returned. For many web searchers, currency and efficiency may be more significant criteria of engine performance than simply comprehensiveness.

CONCLUSIONS

Perhaps the most important conclusion from the available data (Table I) is to reiterate the finding of Lawrence and Giles that no single search engine covers the entire WWW, and so a really thorough search would require the use of more than one engine. Even though a number of engines now have roughly equivalent indexes, AltaVista still seems to be slightly better for use by scientists, with Northern Light giving results that are almost as good and sometimes a little better. It is possible to search multiple WWW engines by using a metasearch engine, like Dogpile, but these are usually limited in the number of hits that may be viewed. (Instead of crawling the web to build an index, metacrawlers send search terms to several search engines, then combine the results.) Finally, it is clear that the WWW is still in a state of rapid development, and even these conclusions must be considered tentative until the next new development.
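The "combine the results" step of a metasearch engine can be sketched as follows. This is a simplified illustration, not a description of how Dogpile or any real metacrawler works: the function name, the round-robin merging strategy, the engine labels, and the example URLs are all hypothetical, and no actual search engine is queried.

```python
def merge_results(results_by_engine, limit=10):
    """Interleave ranked URL lists from several engines, dropping duplicates.

    Takes the first result from each engine, then the second from each, and
    so on, keeping only the first occurrence of every URL. At most `limit`
    results are returned, mirroring the cap on hits that metasearch engines
    typically impose.
    """
    merged, seen = [], set()
    longest = max((len(urls) for urls in results_by_engine.values()), default=0)
    for rank in range(longest):
        for engine, urls in results_by_engine.items():
            if rank < len(urls) and urls[rank] not in seen:
                seen.add(urls[rank])
                merged.append((urls[rank], engine))
    return merged[:limit]

# Stand-in result lists; a real metacrawler would obtain these by sending
# the search terms to each engine.
stub_results = {
    "EngineA": ["http://example.org/a", "http://example.org/b"],
    "EngineB": ["http://example.org/b", "http://example.org/c"],
}

for url, engine in merge_results(stub_results):
    print(engine, url)
```

Because a page found by two engines is listed only once, the merged list covers the union of the engines' results, which is the point of using more than one engine.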

ACKNOWLEDGEMENT

The authors wish to acknowledge the prompt and helpful comments from Steve Lawrence of the NEC Research Institute, which significantly clarified our results.
