
Posts

Web Search Technology [Lecture notes Information retrieval]

Web search engines work by storing information about many web pages, which they retrieve from the Web itself. Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site [1]. This is what it means when someone refers to a site being "spidered" or "crawled". The spider returns to the site on a regular basis, such as every month or two, to look for changes. Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalogue, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with the new information. Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed". Until ...
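The spider-and-index pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular search engine is built: the in-memory `PAGES` dictionary stands in for the Web, and the names `crawl` and `build_index` are hypothetical. The word-to-URL lookup table is an added assumption; the notes themselves only say that the index keeps a copy of every page.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "site/home": ("welcome to the home page", ["site/about", "site/news"]),
    "site/about": ("about this site", ["site/home"]),
    "site/news": ("news about search engines", ["site/about", "site/archive"]),
    "site/archive": ("old news archive", []),
}


def crawl(start, pages):
    """Visit pages breadth-first from `start`, following links; return visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # broken link: there is nothing to read
        order.append(url)
        _text, links = pages[url]
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, pages):
    """Keep a copy of each crawled page and map each word to the URLs containing it."""
    copies = {}
    postings = {}  # assumption: a simple word -> set-of-URLs lookup table
    for url in urls:
        text = pages[url][0]
        copies[url] = text
        for word in set(text.split()):
            postings.setdefault(word, set()).add(url)
    return copies, postings


crawled = crawl("site/home", PAGES)
copies, postings = build_index(crawled, PAGES)
print(crawled)
print(sorted(postings["news"]))
```

Keeping `crawl` and `build_index` as separate steps mirrors the point made in the notes: a page can already be "spidered" (present in `crawled`) before it is "indexed" (present in `copies`).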

Lecture Notes Information Retrieval (cont)

Acronyms and Abbreviations

CSV: Comma Separated Values
DARPA: Defense Advanced Research Projects Agency
EOI: Effort of Improvement
ISP: Internet Service Provider
NIST: National Institute of Standards and Technology
PEOU: Perceived Ease of Use
PU: Perceived Usefulness
RBS: Rule Based System
SEO: Search Engine Optimization
SERP: Search Engine Result Pages
SU: System Usage
TAM: Technology Acceptance Model
TREC: Text Retrieval Conference
UDA: User Dependency Algorithm
UI: User Intention
URL: Uniform Resource Locator
VDC: Vicious Dependency Cycle
WG: Web Graph

Modern Web IR

Evolution of Modern WebIR

In 1995, everything changed with the creation of the Web. Web objects are the largest collection of information ever created by humans, and this collection changes continuously as new objects are created and old ones are removed. To adapt to this changed scenario, a new discipline has been created: Web Information Retrieval [1,2,3]. It uses some concepts of traditional IR and introduces many innovative ones. Modern WebIR [4] is a discipline that has exploited some of the classical results of information retrieval, thereby developing innovative models of information access. A recent report showed that 80% of Web surfers discover the new sites they visit through search engines [4,5] (such as Ask, Google, MSN or Yahoo).

1.3.1 Types of Modern WebIR

Information retrieval on the Web can be broadly classified into two technologi...

Classic Model of Web IR

The classic model for IR

An IR system typically consists of three main subsystems: document representation, representation of the user's requirements (queries), and the algorithms used to match user requirements (queries) with document representations. A document collection consists of many documents containing information about various subjects or topics of interest [1]. Document contents are transformed, either manually or automatically, into a document representation that is easy to match against queries and that correctly reflects the author's intention [2]. The primary concern in representation is how to select proper index terms. Typically, representation proceeds by extracting keywords that are considered content identifiers and organizing them into a given format. Queries transform the user's information need into a form that correctly represents the user's underlying information requirement and is suitable...
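The three subsystems named above (document representation, query representation, and a matching algorithm) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the tiny stop-word list, the sample documents in `DOCS`, and the choice to score by the number of shared index terms, which is one simple matching rule among many and is not prescribed by the notes.

```python
# Hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "for", "to", "is", "on"}


def represent(text):
    """Reduce text to a set of index terms: lowercased words minus stop words."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}


def match(query, doc_reprs):
    """Rank documents by the number of index terms they share with the query."""
    q = represent(query)  # queries get the same representation as documents
    scores = {doc_id: len(q & terms) for doc_id, terms in doc_reprs.items()}
    # Highest score first; ties are broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(doc_id, score) for doc_id, score in ranked if score > 0]


# Hypothetical three-document collection.
DOCS = {
    "d1": "The history of information retrieval.",
    "d2": "Electronic payments in e-commerce.",
    "d3": "Retrieval models for web search and information access.",
}

doc_reprs = {doc_id: represent(text) for doc_id, text in DOCS.items()}
results = match("information retrieval models", doc_reprs)
print(results)
```

Using the same `represent` function for documents and queries is the design choice that makes matching a plain set intersection; a different matching rule could be swapped in without touching the representation step.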

A Brief History of Information Retrieval

The Classic Model for Information Retrieval

Information retrieval (IR) can be understood as the task of finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers). Formal retrieval models have formed the basis of IR research. Since the early 1960s, a number of different models have been developed to describe aspects of the retrieval task: document content and structure, inter-document linkage, queries, users, their information needs and the context in which the retrieval task is embedded. The reliance on formal retrieval models is one of the great strengths of IR research [1,2,3,4]. While using an IR system, a user, driven by an information need, constructs a query in some query language. The query is then submitted to a system...

Journey of Information Retrieval

Background

The explosion of the World Wide Web (more commonly referred to as the Web) as an important information source has moulded the behaviour of many information seekers and consumers [1,2,3]. Given the Web's popularity, a new discipline called Web information retrieval (WebIR) has been created; it builds on the concepts of traditional information retrieval (IR) and introduces many innovative ones. In 1999, the Web was estimated to have only one to two billion publicly accessible pages, but was growing exponentially. Search systems, primarily viewed as tools for topical research, are now often used in a growing number of tasks, including navigation and shopping assistance. As more and more users are relying on the Internet for information, search engin...

Electronic Payments

Introduction

Online electronic payments are not the same thing as electronic payments. Electronic means of payment existed well before the emergence of e-commerce, with credit cards as the typical example: cards could be swiped in shopping malls, hotels and many other places, and payments were made through POS terminals and ATM cash withdrawals. Online electronic payments, also known as online payments or electronic currency, refer broadly to the exchange of funds in an online transaction. They are a form of network-based electronic finance that uses various kinds of electronic tools and cards as the medium of transaction and computer and communications technologies as the means. Funds are stored as electronic (binary) data in the bank's computer system and are transferred and paid as a flow of electronic information through computer networks. Electronic Payment System is the basis for online payments, and online payments system development is a higher form of electronic...