
Web Search Technology [Lecture notes Information retrieval]



Web search engines work by storing information about many web pages, which they retrieve from the Web itself. Crawler-based search engines have three major elements.

The first is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site [1]. This is what it means when someone refers to a site being "spidered" or "crawled". The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second element of the search engine, the index. The index, sometimes called the catalogue, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, this book is updated with the new information. It can take a while for new pages or changes that the spider finds to be added to the index, so a web page may have been "spidered" but not yet "indexed". Until a page is indexed (added to the index), it is not available to those searching through the search engine [2, 1].

The third element is a ranking algorithm, which the search engine uses to determine the order in which matching web pages are returned on the results page [2]. Search engines build their indices based mostly on keyword occurrence and frequency and on link popularity, and they answer queries from these indices. Using connectivity-based (link analysis) algorithms, they estimate the quality of each individual page so that users receive a ranked list of pages for their queries.

The working of a search engine can be summarized in three simple steps:

a. Crawling the Web
b. Matching the query keywords against the web pages stored in the Web repository
c. Returning ranked results for the user's query
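
The three steps above can be illustrated with a minimal Python sketch. It is not how any real search engine is implemented: the toy in-memory "web" (TOY_WEB), the function names, and the scoring rule (keyword frequency first, then a plain count of incoming links as a stand-in for link popularity) are all assumptions made for illustration. A real spider would fetch pages over HTTP, and a real ranking algorithm would be far more elaborate.

```python
from collections import defaultdict, deque

# Hypothetical toy "web": URL -> (page text, outgoing links). Not real sites.
TOY_WEB = {
    "a.example/home": ("web search engines crawl pages",
                       ["a.example/about", "b.example/index"]),
    "a.example/about": ("a spider follows links between pages",
                        ["a.example/home"]),
    "b.example/index": ("an index maps keywords to pages",
                        ["a.example/home", "a.example/about"]),
    "c.example/orphan": ("no page links to this page", []),
}

def crawl(seed):
    """Step a: breadth-first spider that reads a page and follows its links."""
    repository = {}
    queue, seen = deque([seed]), {seed}
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        repository[url] = text          # keep a copy of the page
        for link in links:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return repository

def build_index(repository):
    """Inverted index: keyword -> {url: number of occurrences on that page}."""
    index = defaultdict(dict)
    for url, text in repository.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def inlink_counts(repository):
    """Crude link popularity: how many crawled pages link to each page."""
    counts = {url: 0 for url in repository}
    for url in repository:
        for link in TOY_WEB[url][1]:
            if link in counts and link != url:
                counts[link] += 1
    return counts

def search(query, index, inlinks):
    """Steps b and c: match query keywords, then return URLs in ranked order."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, freq in index.get(word, {}).items():
            scores[url] += freq
    # Assumed ranking rule: keyword frequency first, incoming links second.
    return sorted(scores, key=lambda u: (scores[u], inlinks[u]), reverse=True)

repo = crawl("a.example/home")
idx = build_index(repo)
links = inlink_counts(repo)
print(search("spider", idx, links))   # ['a.example/about']
```

Note that c.example/orphan never enters the repository: no crawled page links to it, so the spider never finds it and it can never appear in a result list. This mirrors the point that a page is not available to searchers until it has been both spidered and indexed.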

1. Brin S. and Page L., Google search engine, http://google.stanford.edu.
2. Brin S. and Page L., "The Anatomy of a Large-Scale Hypertextual Web Search Engine", in Proceedings of WWW, 1998.

