This will be our sixth and final article in the series on what is commonly referred to as “the invisible web.” As I have previously stated and again emphasize, these are certainly not beginner or skip guesser sites. Most of them require certain advanced skills to manipulate, massage and extract the data. For the professional tracer to obtain the information needed to complete their searches, it is imperative that they possess the required skills and a thorough understanding of “exact data extraction techniques.” I have visited and conducted searches on each of these sites to ensure they are working and functional as of July 14, 2017.
To briefly refresh our memory, in the past five installments we have been reviewing the vast amount of data search engines usually will not show you without a specific search. The term “invisible web” mainly refers to the vast repository of information that search engines and directories do not have direct access to, such as databases. Unlike pages on the visible web (that is, the Web you can access from search engines and directories), information in databases is generally inaccessible to the software spiders and crawlers that create search engine indexes. It is estimated that the invisible web is thousands of times larger than the visible web content. The invisible web contains nearly 550 billion individual documents compared to the one billion of the visible web.
The major search engines do not bring back all the “hidden” content in a typical search, simply because they can’t see that content without specialized search parameters and/or search expertise. When a tracer knows how to access this data they are provided with much more information.
Many tracers are not aware of the invisible web and therefore are limited to what can be found with common search engines such as Google or Yahoo. There are times when a tracer looking for something a bit complicated or obscure cannot find that bit of data they are searching for with a common search engine. The professional tracer understands the invisible web is the place they must access to obtain critical data.
I have found that most of the data on the invisible web is maintained by academic institutions and is of higher quality than typical search engine results. There are “academic gateways” that can help you find this information. To find nearly any educational resource on the Web, simply type this search string into your favorite search engine: site:.edu “subject I’m looking for”. In most cases your search will return only .edu-related sites.
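For tracers who run this kind of search often, the search string above can be built by a few lines of Python. This is only a minimal sketch: the function names edu_query and search_url are my own illustrations, and the Google address used as the default is an assumption about one engine's public search URL, not a requirement of the technique.

```python
from urllib.parse import urlencode


def edu_query(subject: str) -> str:
    """Return the site-restricted search string: site:.edu "subject"."""
    # Strip stray double quotes so the phrase stays one quoted unit.
    cleaned = subject.replace('"', "").strip()
    return f'site:.edu "{cleaned}"'


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """Attach the query to a search engine address (the base is an assumption)."""
    # urlencode escapes spaces, colons and quotes so the URL is valid.
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    query = edu_query("skip tracing")
    print(query)
    print(search_url(query))
```

Typing the same string directly into the search box works just as well; the code only saves retyping when you have a long list of subjects to check.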
The invisible web offers a vast array of resources to the professional tracer on just about anything and everything. The links we have covered in the previous five issues barely begin to touch the total resources available on the invisible web. The invisible web continues to grow at an unprecedented pace, and knowing how to use it is a great asset to the tracer.
Finally, it is important to clarify that the invisible web is also an entity on the move. For example, a site that is not accessible via Google today can be included by the search engine tomorrow when Google’s spider visits the site. But it is also possible that if the whole site or certain pages are accessible only after registration (even if it is free), the site will never appear on Google. Like grains of sand in the desert, they are always there but may be constantly shifting in fashion and form.
Here is a list of some of the most popular searchable sites and search engines for the invisible web. Although some of them provide additional metasearch options for searching with five, 10 or even more search engines, each of these has its own indexing capabilities for invisible web pages.
http://www.directsearch.net is considered one of the biggest and most intensively maintained resources for the invisible web. It is not only a search engine, but also a search directory with links in many categories. One of its advantages is the topical compilations, which are gatherings of links connected to a specific topic — for instance, Almanacs/Factbooks/Statistical Reports and Related Reference Tools.
http://www.queryscoop.com/ is a three-step multi-search tool that allows the user to search with multiple search engines, including Google, Bing, Yahoo and DuckDuckGo. It is very easy to use and covers any subject you wish to search for using keywords.
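The general idea of sending one set of keywords to several engines can be sketched in a few lines of Python. This is not how queryscoop itself is built, which I have not examined. The ENGINES table and the fan_out function are hypothetical names of my own, and the search addresses are the engines' public URL patterns as I understand them, which may change.

```python
from typing import Dict
from urllib.parse import urlencode

# Assumed public search addresses and the query parameter each one expects.
ENGINES = {
    "Google": ("https://www.google.com/search", "q"),
    "Bing": ("https://www.bing.com/search", "q"),
    "Yahoo": ("https://search.yahoo.com/search", "p"),
    "DuckDuckGo": ("https://duckduckgo.com/", "q"),
}


def fan_out(keywords: str) -> Dict[str, str]:
    """Build one search URL per engine for the same keywords."""
    return {
        name: f"{base}?{urlencode({param: keywords})}"
        for name, (base, param) in ENGINES.items()
    }


if __name__ == "__main__":
    for engine, url in fan_out("john doe").items():
        print(engine, url)
```

Each URL can then be opened in a browser tab, which gives a rough do-it-yourself version of a multi-search.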
Lifewire's 47 Alternatives to Wikipedia (www.lifewire.com/alternatives-to-wikipedia-3482764) is a valuable resource for the invisible web because it provides links to over 70,000 searchable databases and specialty search engines. The trick with dynamic searchable databases is that they are more difficult to crawl. Because of this, they are rarely indexed by major search engines. When a search for a term is performed, it is submitted to multiple databases simultaneously, so in a sense this is also a metasearch engine.
http://www.geniusfind.com/ is similar to the now defunct Complete Planet in that it is a database and search engine finder. Although it does not have the rich resources of the other search engines, Geniusfind offers topical search engines and databases. It can be used to locate an engine or a database for a topic of interest, which a searcher can then use to perform an additional search.
The all-purpose search engines for the invisible web can be used to directly find the information in which you are interested. But often it is faster and easier to use an all-purpose search engine to locate a topical search engine or a database. After you find search engines or databases on topics of interest, you can go to the site to see whether they provide the information you need. There are search engines and searchable databases for almost any topic imaginable.
I could list many other invisible web databases, but I think that after providing you with ideas about how to find them, you can make your own list based on your specific needs, so let the hunt begin! Good luck and good hunting until we meet again.