Search Engines and Web Guides

Part of The Alchemist's Lair Web Site
Maintained by Harry E. Pence, Professor of Chemistry, SUNY Oneonta, for the use of his students. Any opinions are totally coincidental and have no official endorsement, including from the people who sign my paychecks. Comments and suggestions are welcome (pencehe@oneonta.edu).

Last Revised May 28, 1999


Search Engines

Search engines offer general search capabilities for information of interest on the Internet. Many of these services use "spiders" or "web robots" that visit web sites regularly to automatically create catalogs of web pages. You should be aware that most of the scientific literature is not on the net, so a web search will give only a small fraction of the articles that could be found with Chemical Abstracts or a similar service. The advantage of web searching is that you may discover a great deal of supplemental information that will not show up with more conventional search methods. Ideally, a combination of the two search methods would seem to be the best approach.

Comparing Search Engines

Not all good search engines are equally useful for science majors. To make this clear, look at Alexander Lebedev's site, Best Search Engines for Finding Scientific Information on the Web. The number of hits can differ by several orders of magnitude when you shift from one search engine to another. Notice that Alex uses specific scientific terms, so his results don't necessarily agree with studies that are based on more general search problems. Unfortunately, his work has not been updated since May 17, 1997. A more recent study of this type is available at Evaluating Search Engines for Science. A good first stop for any search is the web site maintained by Debbie Abilock of the Nueva Library. Debbie attempts to match the type of search with the best web resources and does a fine job.

Bush Library at Hamline University (St. Paul, MN) supports Understanding and Comparing Web Search Tools, which gives links to a number of Search Engine Evaluations. A good summary of search engine reviews in popular computer journals is available on the Web. Neither of these evaluations is specifically based on science topics, however. For more extensive information about how to evaluate web sites, you should look at the bibliographies at Evaluating Web Sites for Educational Uses (notice that a list of questions for evaluation is at the end of the bibliography) or the Bibliography on Evaluating Internet Resources.

Some other useful sites if you want more information are:


Common Search Engines

Digital's Alta Vista is one of the most powerful and rapid search engines on the net, but it is very popular, and so may be hard to access. The site has recently added a plain language search engine that allows you to submit simple questions. The searchable database contains both the results of net spiders and addresses submitted directly to Alta Vista. To search for a specific phrase, use double quotation marks, e.g. "acid rain". All lower case is preferred, since this will return both upper and lower case citations. You may also use a plus sign (+) to force combinations and a minus sign (-) to exclude words. For example, acid+rain-smog would return all sites that mention acid rain, except that any site that also mentions smog would be excluded. The results are returned with the best matches at the top. It is often helpful to add a wild card (*) at the end of search terms to include different endings of the same word. Thus, cataly* would return not only catalyst and catalysis, but also catalytic. A search of Alta Vista's entire web index usually takes less than a second.
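To make the operators described above concrete, here is a minimal Python sketch of how quoted phrases, plus and minus signs, and a trailing wild card might be applied to the text of a page. It is only an illustration, not Alta Vista's actual code: the function names are hypothetical, and it assumes that terms are separated by spaces (for example, +acid +rain -smog).

```python
import re

def parse_query(query):
    """Split a query into required, excluded, and optional patterns."""
    required, excluded, optional = [], [], []
    # A token is an optional +/- sign followed by a quoted phrase or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|([^\s"]+))', query):
        text = (phrase or word).lower()
        if text.endswith("*"):
            # A trailing * matches any ending of the word.
            pattern = r"\b" + re.escape(text[:-1]) + r"\w*"
        else:
            # Otherwise the term must appear as a whole word or phrase.
            pattern = r"\b" + re.escape(text) + r"\b"
        bucket = {"+": required, "-": excluded}.get(sign, optional)
        bucket.append(re.compile(pattern))
    return required, excluded, optional

def matches(query, page_text):
    """Return True if the page text satisfies the query."""
    required, excluded, optional = parse_query(query)
    text = page_text.lower()  # lower case, so matching ignores capitalization
    if any(p.search(text) for p in excluded):
        return False
    if not all(p.search(text) for p in required):
        return False
    # With no required terms, at least one optional term must appear.
    return bool(required) or any(p.search(text) for p in optional)
```

With this sketch, matches('+"acid rain" -smog', page) accepts a page that mentions acid rain and rejects it if smog also appears, while matches('cataly*', page) accepts catalyst, catalysis, or catalytic.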

Northern Light advertises that it has an index of over 150 million web pages. It is highly rated for its advanced search capabilities, and is especially attractive to scientists because it has such a large index of pages. It supports full Boolean capability (AND, OR, NOT), including parenthetical expressions. If Boolean operators appear in quotes, they are interpreted as a search term or part of a search term, rather than as Boolean operators. To use the example that they give, "War AND Peace" will search for the phrase "War and Peace", but War AND Peace will return any documents that contain both of the words War and Peace, and War OR Peace will return any document that contains either one of these words. This is particularly powerful, since there is no limit to the number of parentheses that can be nested.
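The difference between a quoted operator and a real one, and the effect of nested parentheses, can be illustrated with a short Python sketch. This is not Northern Light's implementation; it is a small recursive parser written for this page, with hypothetical function names, that assumes the operators are typed in capitals and that every pair of terms is joined by an explicit operator.

```python
import re

TOKEN = re.compile(r'"([^"]*)"|([()])|([^\s()"]+)')

def tokenize(query):
    """Turn a query into tokens; anything inside quotes is always a phrase."""
    tokens = []
    for phrase, paren, word in TOKEN.findall(query):
        if paren:
            tokens.append((paren, paren))
        elif word in ("AND", "OR", "NOT"):
            tokens.append(("OP", word))
        else:
            tokens.append(("TERM", (phrase or word).lower()))
    return tokens

def evaluate(query, page_text):
    """Return True if the page text satisfies the Boolean query."""
    tokens = tokenize(query)
    text = page_text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_and()
        while peek() == ("OP", "OR"):
            advance()
            right = parse_and()  # parse first so no tokens are skipped
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("OP", "AND"):
            advance()
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        if peek() == ("OP", "NOT"):
            advance()
            return not parse_not()
        kind, val = peek()
        if kind == "(":
            advance()
            value = parse_or()  # parentheses may nest to any depth
            if peek()[0] != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        if kind == "TERM":
            advance()
            return re.search(r"\b" + re.escape(val) + r"\b", text) is not None
        raise ValueError("unexpected token")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing token")
    return result
```

Here evaluate('"War AND Peace"', page) looks for the whole phrase, while evaluate('War AND Peace', page) only requires that both words appear somewhere on the page.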

HotBot is one of the newer search engines, but it has been rated the number one search engine by several industry publications. HotBot has a database of more than 110 million documents that is updated on a daily basis. Notice that it will also allow you to require that pages include an image, audio, or video, but requiring this doesn't mean that your search term will appear in the URL for that particular file. For example, if you are searching for catalysis and specify that images be included, you will get pages that have a company logo as the image. The "look for - all of the words" window allows several powerful search options, similar to Boolean searching. The More Search Options button somewhat expands this capability.

Although Excite is really a subject guide, it also uses "concept-based" search technology, called Excite Spider, to summarize information on 50 million web pages. The standard search allows you to use +, -, or and to create a more effective search, and it ranks the hits in terms of the likelihood that they are relevant. According to the site description, this engine also searches for ideas closely related to your specific query. For scientific topics, it doesn't seem to provide as many hits as the first two engines, but the relevance of those that are returned is usually rather good.

Like many of the traditional search engines, Infoseek has extended its focus beyond just searching for information, in order to provide more services that will attract consumers. It still provides a good search engine, except that it is now called The Go Network. Many of the suggestions made above with respect to the Alta Vista engine will also apply here, but to be sure that you are getting the most out of the engine, you should check the Infoseek help site, which is especially good. The Advanced Search Site is set up much like HotBot, and so allows complicated Boolean-type searching.

Lycos mainly looks for key words that are either in the title of the web page or else actually in the URL; therefore, it will not return as many hits as spiders that search throughout the individual web sites. If you are after pictures and/or sounds, you can search the Lycos Image Gallery, which also provides an option to search the entire web for pictures, sounds, etc.
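The effect of indexing only titles and URLs can be seen in a small Python sketch. The page records and function names here are hypothetical, and the code only illustrates the idea of the comparison; it says nothing about how Lycos is actually built.

```python
def title_url_hits(pages, keyword):
    """Return URLs of pages whose title or URL contains the keyword."""
    kw = keyword.lower()
    return [p["url"] for p in pages
            if kw in p["title"].lower() or kw in p["url"].lower()]

def full_text_hits(pages, keyword):
    """Return URLs of pages that contain the keyword anywhere, body included."""
    kw = keyword.lower()
    return [p["url"] for p in pages
            if any(kw in p[field].lower() for field in ("title", "url", "body"))]
```

A page that discusses catalysis only in its body text is found by the second function but missed by the first, which is why a title and URL search returns fewer hits.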

Direct Hit attempts to make your searches more selective by using the amount of time that people have spent at a site as a measure of how useful it is. I tried several chemistry searches, and it seemed to perform at least as well as more traditional engines, and may have even been somewhat more focused. According to the site listing, "Direct Hit's award-winning technology provides highly relevant results for any Internet search. Our Popularity Engine tracks the amount of time spent at sites that people actually select from the search results list. By analyzing the activity of millions of previous Internet searchers, Direct Hit determines the most popular and relevant sites for your search request."
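The general idea of ranking by time spent can be sketched in a few lines of Python. Direct Hit has not published its method in this form, so the visit log format and the function name below are assumptions made purely for illustration.

```python
from collections import defaultdict

def rank_by_time_spent(visits):
    """Rank URLs by total viewing time.

    visits is an iterable of (url, seconds) pairs, one per recorded visit
    (a hypothetical log format). URLs are returned with the largest total first.
    """
    totals = defaultdict(float)
    for url, seconds in visits:
        totals[url] += seconds
    return sorted(totals, key=lambda url: totals[url], reverse=True)
```

A site that many searchers visit briefly can still be outranked by one that fewer people read for a long time, which is the behavior the description above suggests.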


Web Guides

Web guides are categories or groupings of web sites that usually have been reviewed by the guide's editors. Sites that are submitted are reviewed and then assigned to the appropriate category by the editor.

Yahoo! is an acronym for "Yet Another Hierarchical Officious Oracle" and that pretty well describes the rather hip approach of this pioneering web guide. It is a subject guide that is updated daily, and although it is now integrated with Alta Vista's search engine, it is still not usually the best place to look for technical topics.

The Electric Library allows you to pose a question in plain English and will search newspapers, magazines, international newswires, classic books, maps, photographs, and major works of literature and art.


Combined Search Engines

This category allows you to submit your inquiry to several different search engines. This should increase the number of hits and therefore make it more probable that you will find all of the sites that are relevant to your search. It may also, however, yield so many extra hits that it becomes more difficult for you to sort through them and find the sites that are actually most useful.

Metacrawler will submit your inquiry to several different search engines, and return the combined results of all the engines. It can be useful to check which engine might give the greatest number of hits for a given search. It would seem that this is the best way to go, since it combines several different search engines. Regrettably, the number of hits that result may be fewer than would result from a conventional search on a single engine! This happens because the number of references provided may be limited to the default number returned by each individual engine.
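A short Python sketch shows why combining engines can give fewer hits than one engine searched directly. It is not Metacrawler's code; the function name, the per-engine limit, and the sample result lists are all assumptions chosen to illustrate the point.

```python
def metasearch(results_by_engine, per_engine_limit):
    """Merge ranked result lists from several engines.

    Only the first per_engine_limit results from each engine are used,
    and a URL reported by more than one engine is listed only once.
    """
    merged, seen = [], set()
    for engine, urls in results_by_engine.items():
        for url in urls[:per_engine_limit]:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

If one engine alone would return 50 pages but the combined search takes only the first 10 from each of two engines, the merged list can hold at most 20 pages, fewer than the single engine would have given.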

Dogpile is a meta-search engine that allows you to specify up to 25 search engines. If you go to the custom search listing on the main page, you can specify the order in which the various engines will be reported. For example, the default listing gives Yahoo as first choice, then Thunderstone, etc. Since these are not especially good for chemistry, you might want to reorganize the priority listing. If you do this once, however, the same listing will be used each time that you sign on.


Second Generation Search Engines

Recently, the first of a new breed of search engines has become available. "Ask Jeeves!" is among the first search engines designed to respond to simple-language questions instead of keywords. This site analyzes sentence construction, then sorts through the database of research and discards the sites that don't appear to match the question. It combines the functions of a metasearch engine with its own search process. Based on only a few searches, it does seem that this approach is more likely to provide useful information on scientific searches than some of the other engines. A Press Release about this engine is available.

WebCrawler is another "natural language" search engine. It has an index that is updated daily and includes WebCrawler Select site reviews. Based on a limited number of searches, the results don't appear to be as good as those obtained with Ask Jeeves (the previous engine), but they do offer some additional resources. This site allows you to use Boolean searching; if you wish to do this, go to the page on Advanced Search Methods.

Miningco.com has the slogan, "We mine the Net so you don't have to." It uses key words rather than plain language, but if you hit a topic that has already been researched by one of the folks who work for the Mining Company, you may get a bonanza of useful articles, many written especially to respond to this topic. These "Guides" also run chat rooms and bulletin boards on the topics which are their specialities. Notice that you can either search the Mining Company site only or spread out to the entire web. Even the Web search seems to produce an unusually relevant set of articles. This is a very useful search engine. Special Note: This site has recently been renamed as About.com. If you don't have any luck signing on with the above address, try this link.


Searching for Images

Arriba Vista is a search engine that specializes in images. You may search by key words, and the engine returns thumbnail versions of each image. This is a very efficient method of image searching, if you can find what you want. This is a relatively new search engine, which currently lists approximately 1.5 million images. The site hopes to expand to 2.5 million images in the near future. Don't forget that you can also search the Lycos Image Gallery, which also provides an option to search the entire web for pictures, sounds, etc.

As the name implies, Free Graphics offers links to various sources of graphics that may be used without charge. There doesn't seem to be much chemistry here, but if you want to add some spiffy buttons, bullets, etc. to your presentation, this is a good source. It is especially valuable, since it avoids the problems with copyright. The links to Create Your Own Graphics can be very helpful if you are designing your own web page.

The TechSmith site lists several types of shareware that can be useful, especially a program called SnagIt. According to the description, SnagIt "captures anything on the Windows desktop quickly and easily. From one-step capture of scrolling web pages to video capture and text conversion, SnagIt does it all. Shareware or shelfware, SnagIt is the most powerful Windows screen capture tool available." Since I'm a Mac person, I don't have much experience with this software, but it does sound interesting.


Miscellaneous

If you are really serious about designing web pages and other graphics material, you will probably want to take a look at HighFive, the on-line magazine of web design. Each month it features news articles, interviews, and reviews about the best sites on the web.

