Skip to main content

Web Search Technology [Lecture notes Information retrieval]



Web search engines work by storing information about many web pages, which they retrieve from the Web itself. Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site [1]. This is what it means when someone refers to a site being "spidered" or "crawled". The spider returns to the site on a regular basis, such as every month or two, to look for changes. Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalogue, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with the new information. Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed". Until it is indexed (added to the index), it is not available to those searching through the search engine [2, 1]. The third element is a ranking algorithm. Search engines use a ranking algorithm to determine the order in which matching web pages are returned on the results page [2]. They build indices mostly based on keyword occurrence, link popularity and frequency, for query negotiation using these indices. Using these connectivity-based algorithms, they measure the quality of each individual page so that users will receive a ranked page list for their queries. The working of a search engine can be summarized in three simple steps as follows:

a. Crawling the Web
b. Matching the keyword with the web pages available in the Web repository
c. Providing result to the user’s query

1. Sergey Brin and Larry Page. Google search engine, http://google.stanford.edu.
2.  Brin S. and  Page L.,"The Anatomy of a Large-scale Hypertextual Web Search Engine", in Proceedings of WWW, 1997.


Comments

Popular posts from this blog

Inter-Organizational Value Chain

The value chain of   a company is part of over all value chain. The over all competitive advantage of an organization is not just dependent on the quality and efficiency of the company and quality of products but also upon the that of its suppliers and wholesalers and retailers it may use. The analysis of overall supply chain is called the value system. Different parts of the value chain 1.  Supplier     2.  Firm       3.   Channel 4 .   Buyer
Advantages and Disadvantages of EIS Advantages of EIS Easy for upper-level executives to use, extensive computer experience is not required in operations Provides timely delivery of company summary information Information that is provided is better understood Filters data for management Improves to tracking information Offers efficiency to decision makers Disadvantages of EIS System dependent Limited functionality, by design Information overload for some managers Benefits hard to quantify High implementation costs System may become slow, large, and hard to manage Need good internal processes for data management May lead to less reliable and less secure data

System Analysis and Design (SAD)

Introduction to System Analysis and Design (SAD) System are created to solve Problems. One can think of the systemsapproch as an organised way of dealing with a problem. In this dynamic world , the subject system analysis and design, mainly deals with the software development activities. This post include:- What is System? What are diffrent Phases of System Development Life Cycle? What are the component of system analysis? What are the component of system designing? What is System? A collection of components that work together to realize some objectives forms a system. Basically there are three major components in every system, namely input, processing and output. In a system the different components are connected with each other and they are interdependent. For example, human body represents a complete natural system. We are also bound by many national systems such as political system, economic system, educational system and so forth. The objective of the system demands tha...