Ten Things to Know About Google
1. The database that Google licenses to Yahoo! [http://google.yahoo.com] is smaller than the Google.com database and does not contain links to cached versions of pages. Yahoo! also uses this database to supply "fall-through" content (material not in Yahoo!'s own database), which is often listed as "Web page" content.
2. Google utilizes the Open Directory Project database as its Web Directory [http://directory.google.com].
3. You can search stop words by placing a + in front of the word (ex. "+To +Be +Or Not +To +Be").
4. At the present time the Google database is refreshed about once every month.
5. You can limit your search to only .pdf files by using the syntax filetype:pdf.
6. Google is the only major search engine to crawl Adobe Acrobat .pdf files.
7. If you are a frequent Google searcher, save time by using the Google Toolbar [http://toolbar.google.com] and Google Buttons [http://www.google.com/options/buttons.html].
8. A Boolean "OR" is available with Google. For it to function, capitalize the OR.
9. Google only crawls and makes searchable the first 110 k of a page. Long documents may have substantial content invisible to Google.
10. Entering a U.S. street address into the query box will return a link to a map of that address location. Typing in a person or business name, city, and state will also run the query against the Google phone directory. Several other combinations will also query the phone directory service, including typing in an area code and number to run a reverse search [http://www.google.com/help/features.html#wp].
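The query syntax in items 3, 5, and 8 can be combined mechanically. The Python sketch below is only an illustration: the function name build_google_query and its parameters are hypothetical and are not part of any Google interface. It simply formats text for the query box using the operators described in the list above.

```python
def build_google_query(terms, stop_words=(), any_of=(), filetype=None):
    """Assemble a query string from the syntax described in the list above.

    Every name here is hypothetical; the function only formats text.
    """
    stop = {w.lower() for w in stop_words}
    parts = []
    for term in terms:
        # Item 3: a leading + forces a stop word to be searched.
        parts.append("+" + term if term.lower() in stop else term)
    if any_of:
        # Item 8: OR must be capitalized to act as a Boolean operator.
        parts.append(" OR ".join(any_of))
    if filetype:
        # Item 5: restrict results to one file type, e.g. filetype:pdf.
        parts.append("filetype:" + filetype)
    return " ".join(parts)


print(build_google_query(["to", "be", "or", "not", "to", "be"],
                         stop_words=["to", "be", "or"]))
print(build_google_query(["annual", "report"], any_of=["2001", "2002"],
                         filetype="pdf"))
```

The first call reproduces the stop-word example from item 3 (in lowercase); the second combines a capitalized OR with the filetype: limit.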
Ten Things to Know About AllTheWeb
1. AllTheWeb licenses its database to Lycos. The identical database is searched and makes up some of the content on a Lycos results page.
2. Unlike Google and AltaVista, this search engine does not have a limit on the amount of content crawled on a Web page.
3. AllTheWeb indexes every word. Words traditionally considered as "stop words" are searchable.
4. AllTheWeb does not permit the use of Boolean operators.
5. If plus and/or minus signs are not used, AllTheWeb implies a plus sign in front of each term or phrase. This results in an implied "anding" of terms.
6. AllTheWeb is now promising a complete refresh of its database every 9-12 days.
7. AllTheWeb permits syntax to be used directly from the "basic" search page to limit a query. See http://www.alltheweb.com/help/basic.html#special.
8. A query to the AllTheWeb text database simultaneously runs the search in the AllTheWeb Image, Video, MP3, and FTP databases. If it finds anything, these results are linked on the right side of the results page.
9. AllTheWeb offers a search engine dedicated to Mobile Web content [http://mobile.alltheweb.com].
10. Fast Search and Transfer (FAST), the company behind AllTheWeb, has deployed its software to power the Scirus science search engine from Elsevier.
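Item 5 above describes an implied plus sign in front of every unsigned term or phrase. As a rough sketch of that rule (the function name and the tokenizing regular expression are assumptions made for illustration, not AllTheWeb's actual parser), the following Python rewrites a query so the implied signs become explicit:

```python
import re

# A token is an optionally signed quoted phrase, or a run of non-space characters.
TOKEN = re.compile(r'[+-]?"[^"]*"|\S+')


def make_implied_plus_explicit(query):
    """Prefix + to every term or phrase that carries no + or - sign."""
    explicit = []
    for token in TOKEN.findall(query):
        if token[0] in "+-":
            explicit.append(token)        # the searcher already chose a sign
        else:
            explicit.append("+" + token)  # implied plus, i.e. an implied AND
    return " ".join(explicit)


print(make_implied_plus_explicit('fast search -lycos "mobile web"'))
```

Because every unsigned term ends up required, the query behaves as if its terms were joined with AND.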
Ten Things to Know About AltaVista
1. AltaVista is the only major search engine that allows a searcher to use a proximity operator: NEAR in simple search, near in advanced search. This operator finds terms within 10 words of each other in either direction.
2. AltaVista indexes only the first 100 k of text on a page.
3. An asterisk (*) can be used in a phrase to represent an entire word. (Ex. "One small step for man, one giant * for mankind")
4. AltaVista News [http://news.altavista.com] is "powered" by Moreover. This continuous feed of material can be searched using AltaVista syntax.
5. The use of the "sort by" box on the AltaVista Advanced interface allows you to give certain words or phrases a higher relevancy weighting.
6. Caveat: If you use Advanced Search, make sure to place some term or terms in the Sort-By box; otherwise, results return in completely random order.
7. AltaVista's directory comes from Looksmart.
8. AltaVista's advanced search does not allow for the use of + and - signs.
9. If you search AltaVista in the "simple" mode entering multiple terms without syntax, it will result in an "implied" OR. In the advanced mode, multiple terms are considered a phrase.
10. AltaVista software powers the Health Resources and Services (U.S. government) search engine. This means that all AltaVista syntax can be utilized there. This site also illustrates AltaVista's capability of indexing full-text .pdf documents on the site-specific and intranet level [http://search.hrsa.gov].
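To make the proximity rule in item 1 concrete, here is a minimal Python sketch of a "within 10 words in either direction" test. It is not AltaVista's implementation: the function name near, the simple word tokenizer, and the reading of "within 10 words" as a position difference of at most 10 are all assumptions made for illustration.

```python
import re


def near(text, first, second, window=10):
    """Return True if the two terms occur within `window` words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # abs() makes the test symmetric, so order in the text does not matter.
    return any(abs(i - j) <= window
               for i in first_positions
               for j in second_positions)


quote = "One small step for man, one giant leap for mankind"
print(near(quote, "step", "mankind"))            # seven words apart
print(near(quote, "mankind", "step", window=3))  # too far for a 3-word window
```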
Ten Things to Know About MSN Search
1. MSN (Microsoft Network) Search is "powered" by an Inktomi database. Remember that Inktomi licenses its database to many search sites. Each site gets a different "flavor" of the total database.
2. The MSN Advanced Search interface offers numerous limiting options via fill-in boxes and pull-down menus [http://search.msn.com/advanced.asp].
3. The Advanced Search interface permits limiting to pages at a certain depth in the site. For example, selecting Depth 3 limits the search to pages no more than three directories deep within a site [e.g., http://www.testsearch.com/Directory1/Directory2/Directory3/].
4. MSN Search allows use of the asterisk (*) as a truncation symbol.
5. According to the most current Search Engine Showdown rankings, MSN Search has the largest database of any Inktomi partner.
6. The directory portion of MSN search is powered by the Looksmart database.
7. On the Advanced Search interface, checking the "Acrobat" box will retrieve pages with links to pages that contain .pdf files. It does not search content "inside" these files.
8. Greg Notess points out that the same syntax available to limit Hotbot will also work with MSN Search [http://hotbot.lycos.com/help/tips/search_features.asp].
9. Danny Sullivan notes that MSN also employs human editors to "hand-pick" key sites in the Web Directory and Featured Link sections of the site. Although most of the time the "Featured Links" represent major MSN advertisers, editors can add other content.
10. Selecting and searching under the MSN "News Search" tab returns results predominantly from MSNBC.
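The depth limit in item 3 can be pictured as counting the directories in a page's URL. The Python sketch below shows one way to do that count; the function names are hypothetical, and how MSN Search actually measures depth is not documented here, so treat the counting rule as an assumption.

```python
from urllib.parse import urlparse


def directory_depth(url):
    """Count the directories in a URL path, ignoring any trailing file name."""
    path = urlparse(url).path
    # Everything before the last "/" is a directory; drop empty segments.
    directories = [segment for segment in path.split("/")[:-1] if segment]
    return len(directories)


def within_depth(url, max_depth):
    """Assumed meaning of a 'Depth N' limit: at most N directories deep."""
    return directory_depth(url) <= max_depth


example = "http://www.testsearch.com/Directory1/Directory2/Directory3/"
print(directory_depth(example))
print(within_depth(example + "Directory4/page.html", 3))
```

With the example URL from item 3, the page sits exactly three directories deep, so it passes a Depth 3 limit, while a page one directory further down does not.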
Ten Things to Know About Northern Light
1. Make sure to study the Northern Light "Power" search page. It provides many limiting options without requiring knowledge of any syntax [http://nlresearch.northernlight.com/power_research.html].
2. Instead of entering http://www.northernlight.com, use http://www.nlresearch.com to go straight to the Northern Light Research site. This site, aimed at the enterprise market but available to any searcher, provides access to several databases not available from the main URL. Most of these resources are fee-based. They include EIU Search and market research content from FIND/SVP and MarkIntel.
3. Northern Light provides FREE full-text access to a database of continuously updating news content from 56 newswires. Material stays in this database, available for free access, for 2 weeks. Then the content moves to the Northern Light Special Collection database.
4. Northern Light's Special Editions are subject-specific portals that combine material from the "open Web" and NL's proprietary databases. Topics of Special Editions include XML, managed care, and electronic commerce.
5. The Northern Light Special Collection currently contains content (fee-based, pay-per-document) from over 7,100 sources. A catalog of these publications is available at http://nlresearch.northernlight.com/docs/specoll_help_catlook.html.
6. Northern Light allows the use of Boolean operators and + and - signs.
7. Multiple truncation symbols can be used in a query. Northern Light has two truncation symbols: the asterisk (*) for multiple letters and the percent symbol (%) for single or absent letters, e.g., medieval/mediaeval.
8. In addition to the limiting capabilities of the "Power" search page, NL has several terms available for field searching. These include text: and pub:. (This last prefix allows searching in a specific Special Collection publication title.) You can find a complete list at http://nlresearch.northernlight.com/docs/search_help_quickref.html.
9. Northern Light's free "Alerts" feature is one resource you must know about. This feature allows you to set up search strategies in ANY/ALL of the NL databases and have those strategies searched up to three times daily. If any new material hits on the strategy, results will be delivered to you via e-mail. I use this tool to bring me a customized feed of news via the NL News Search database. Remember, the full-text content is free to access for 2 weeks.
10. Northern Light's "Geo Search" provides an opportunity to search the Web with keywords and U.S. and Canadian address information. Results also get the benefit of NL's organization with its "custom folders."
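The two truncation symbols in item 7 behave much like simple wildcards. As an illustration only, the Python sketch below translates them into a regular expression: the function name is hypothetical, the pattern medi%eval is an assumed way of writing the medieval/mediaeval example, and treating * as "zero or more letters" is also an assumption rather than documented Northern Light behavior.

```python
import re


def truncation_to_regex(pattern):
    """Translate * (multiple letters) and % (one or no letter) into a regex."""
    pieces = []
    for char in pattern:
        if char == "*":
            pieces.append(r"\w*")   # any run of letters (assumed to allow none)
        elif char == "%":
            pieces.append(r"\w?")   # a single letter, or no letter at all
        else:
            pieces.append(re.escape(char))
    return re.compile("".join(pieces), re.IGNORECASE)


spelling = truncation_to_regex("medi%eval")
for word in ["medieval", "mediaeval", "medical"]:
    print(word, bool(spelling.fullmatch(word)))
```

Both spellings from item 7 match the single pattern, while an unrelated word such as "medical" does not.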