2 How to do it
2.3 Searching for information on the Web
What do you do if you don't know the URL of the website you are looking for, or haven't been able to browse to it? The Web is not like a library – it isn't carefully organised and catalogued, and it is growing all the time. Luckily, there are search sites that can help you find what you want.
2.3.1 Portals
Activity 3
Visit the Excite home page. Spend no more than a few minutes getting a sense of what information is available from this page.
Comment
Some websites are set up as web portals: they aim to provide you with a one-stop shop for everything on the Web. They provide their own editorial material, news headlines, weather and other up-to-date information, as well as links to commercial partners and paid advertisements. They also provide a search facility – did you spot the one on Excite?
Internet service providers (ISPs) usually provide their own portal and may configure your browser to use it as your home page. (Note that ‘home page’ here means the default page your browser displays when first started.)
2.3.2 Search sites
Other sites such as Google and Yahoo! concentrate on providing search facilities.
Yahoo! started out as a list of useful websites put together by two Stanford University students, but has grown considerably since then. It still offers a web directory: a huge list of useful web pages, collected by Yahoo! staff, that you can browse. The directory is organised by category in the same way as a classified phone directory, except that each category can be opened to reveal subcategories, so you can browse in successively greater detail.
Activity 4
Visit Yahoo! Find the Web Site Directory (see image above) and follow links to a topic that interests you.
Comment
For example, I followed these links: Science > Animals > Mammals > Primates > Apes > Gorillas to reach this page:
Web directories can be a useful starting point if you are looking for information in a general area. If you are looking for more specific information or want to look more widely, a full-text search engine provides an alternative.
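If it helps to picture how a directory is organised, the short Python sketch below models one as nested categories and follows a path through it. It is only an illustration, not how any real directory is stored: the `DIRECTORY` data and the `browse` function are hypothetical, and the extra categories beside the path in the example above (such as 'Monkeys' and 'Astronomy') are invented.

```python
# Hypothetical miniature web directory: each category maps to its subcategories.
# An empty dict marks a category with nothing further below it.
DIRECTORY = {
    "Science": {
        "Animals": {
            "Mammals": {
                "Primates": {
                    "Apes": {"Gorillas": {}, "Chimpanzees": {}},
                    "Monkeys": {},
                },
            },
        },
        "Astronomy": {},
    },
}


def browse(directory, path):
    """Follow a list of category names and return the subcategories found there."""
    node = directory
    for depth, name in enumerate(path):
        if name not in node:
            where = " > ".join(path[:depth]) or "the top level"
            raise KeyError(f"No category {name!r} under {where}")
        node = node[name]
    return sorted(node)


if __name__ == "__main__":
    # Each step narrows the topic, just like clicking through the directory links.
    print(browse(DIRECTORY, ["Science", "Animals", "Mammals", "Primates"]))
    print(browse(DIRECTORY, ["Science", "Animals", "Mammals", "Primates", "Apes"]))
```

Each name in the path plays the part of one link you click, and the list returned is what you would see on the page you reach.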
Full-text search
A full-text search looks through the full text of the original source, rather than only the keywords associated with it.
Search engines attempt to search all the text on all the pages of the Web. They use software spiders to seek out and index web pages, storing the results in huge databases. We will see how this is done later in this unit.
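To give a rough feel for what 'indexing' means, here is a minimal Python sketch of one common approach, an inverted index, which records for every word the pages that contain it. This is an assumption made for illustration; it is not a description of how any particular search engine works. The three pages, their addresses and the function names are all made up.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for the Web: address -> full text of the page.
PAGES = {
    "zoo.example/gorillas": "Gorillas are large apes that live in African forests",
    "zoo.example/chimps": "Chimpanzees are apes that use simple tools",
    "news.example/forests": "African forests are shrinking",
}


def tokenize(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Record, for every word, the set of pages whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


INDEX = build_index(PAGES)

if __name__ == "__main__":
    print(sorted(search(INDEX, "apes")))
    print(sorted(search(INDEX, "African forests")))
```

Because the index is built once in advance, answering a query means looking up a few words rather than rereading every page.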
Spiders
Programs that crawl over the Web, fetching web pages by following links. Spiders are used by search engines to find pages for indexing.
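The idea of following links can be sketched in a few lines of Python. To keep the example self-contained, the sketch below does not fetch anything from the real Web: the `LINKS` table is a made-up set of pages and the links each one contains, and `crawl` is a hypothetical function that visits pages breadth-first, which is one reasonable order among several.

```python
from collections import deque

# Hypothetical link structure: page address -> addresses that the page links to.
LINKS = {
    "a.example/": ["a.example/apes", "b.example/"],
    "a.example/apes": ["a.example/gorillas", "a.example/"],
    "a.example/gorillas": [],
    "b.example/": ["a.example/gorillas"],
    "c.example/": ["a.example/"],  # no page links here, so the spider never finds it
}


def crawl(start, links):
    """Visit every page reachable from start by following links, each page once."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)  # a real spider would fetch and index the page here
        for target in links.get(page, []):
            if target not in seen:  # skip pages already found, to avoid going in circles
                seen.add(target)
                queue.append(target)
    return visited


if __name__ == "__main__":
    for page in crawl("a.example/", LINKS):
        print(page)
```

Notice that `c.example/` is never visited: a spider that relies on links alone can only find pages that some page it already knows about links to.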
Activity 5
Visit Google. Search for a topic that interests you.
Comment
For example, I searched for ‘gorilla’, with the following results:
Search sites often provide both directories and full-text search, and will combine results to offer you the best of both worlds.
2.3.3 Search results
Let us look at the results returned by a search engine. I've chosen to use Google, but you may use another search engine; the layout is likely to be different in detail but most of the same elements will be present.
Activity 6
Visit Google or another search site and search for a topic that interests you.
Comment
The page of results may include hits from several different sources. Google, for example, may include some results from current news stories. Search sites will often include prominent results that are paid for by advertisers.
Hits
Documents that meet your search criteria.
Activity 7
Look at some results. Can you distinguish those that come from advertisers?
Comment
On Google some results are marked as ‘sponsored links’ – businesses and organisations pay Google a fee so that their pages are associated with particular keywords.
Each hit returned provides several pieces of information to help you decide whether to visit the page. This may include the title of the page, a short extract with search terms highlighted, and the domain (which will give you clues to the publisher of the page). For pages that also appear in the search site's directory there may be a short description and a link to the category in which the page appears.
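As a small illustration of how these pieces of information might be assembled, the Python sketch below builds a three-line summary of a hit: its title, an extract with the search terms marked, and the domain taken from the page's address. The function names, the use of asterisks for highlighting and the example address are all assumptions made for this sketch; real search sites lay out their results in their own ways.

```python
import re
from urllib.parse import urlparse


def domain_of(url):
    """Return the domain part of a web address."""
    return urlparse(url).netloc


def highlight(extract, terms):
    """Mark whole-word matches of the search terms with asterisks, ignoring case."""
    if not terms:
        return extract
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: "*" + match.group(0) + "*", extract)


def format_hit(title, url, extract, terms):
    """Assemble the title, highlighted extract and domain for one hit."""
    return f"{title}\n{highlight(extract, terms)}\n{domain_of(url)}"


if __name__ == "__main__":
    print(format_hit(
        "All about gorillas",
        "http://www.zoo.example/gorillas.html",
        "The gorilla is the largest living ape.",
        ["gorilla"],
    ))
```

The domain is often the quickest clue to who published a page, which is why it is worth showing alongside the title and extract.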
A search engine may return a huge number of hits, but surprisingly often the information you wanted can be found by following one of the first few links. This is because the results are ranked to offer you the ‘best’ first.
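What counts as 'best' is up to each search engine, and real ranking methods are far more elaborate than anything shown here. Purely as an illustration of the idea of ordering hits, the Python sketch below assumes a very simple rule: a page scores one point for every occurrence of a search term in its text, and higher scores come first. The pages, addresses and function names are hypothetical.

```python
import re

# Hypothetical pages: address -> full text of the page.
PAGES = {
    "b.example/zoo": "Our zoo has one gorilla",
    "a.example/gorilla": "Gorilla facts: the gorilla diet and gorilla habitat",
    "c.example/cars": "Used cars for sale",
}


def words_in(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(text, query):
    """Count how many words in the text match one of the query terms."""
    terms = set(words_in(query))
    return sum(1 for word in words_in(text) if word in terms)


def rank(pages, query):
    """Return the pages that match the query, highest score first."""
    scored = [(score(text, query), url) for url, text in pages.items()]
    hits = [(points, url) for points, url in scored if points > 0]
    # Sort by descending score; ties are broken alphabetically by address.
    hits.sort(key=lambda item: (-item[0], item[1]))
    return [url for points, url in hits]


if __name__ == "__main__":
    for url in rank(PAGES, "gorilla"):
        print(url)
```

Even this crude rule puts the page that is mostly about gorillas ahead of the page that mentions one in passing, and leaves out the page that does not mention them at all.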