Classic Model of Web IR

The classic model for IR

An IR system typically consists of three main subsystems: document representation, representation of the user's requirements (queries), and the algorithms that match those queries against the document representations. A document collection consists of many documents containing information about various subjects or topics of interest [1]. Document contents are transformed, either manually or automatically, into a document representation that should be easy to match against queries and should correctly reflect the author's intention [2]. The primary concern in representation is how to select proper index terms. Typically, representation proceeds by extracting keywords that are considered content identifiers and organizing them into a given format. Queries transform the user's information need into a form that correctly represents the underlying requirement and is suitable for the matching process [3,4]. A matching algorithm compares a user's queries with the document representations and retrieves the documents that are most likely to be relevant to the user. Many theoretical models and techniques drawn from natural language processing, statistical text analysis, word stemming, stop lists and information theory have been tried in IR systems.

Two paradigms for finding useful information are well established in traditional information retrieval. Searching is a discovery paradigm suited to a user who knows precisely what to look for, while browsing suits a user who is either unfamiliar with the content of the data collection or has only casual knowledge of the jargon used in a particular discipline. Browsing and searching complement each other, and they are most effective when used together [5,6].
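The three subsystems described above (document representation, query representation, and matching) can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not a description of any particular IR system: the stop list, the sample documents, and the function names `represent`, `score`, and `search` are all hypothetical, and the matching step is a plain term-overlap count rather than one of the statistical models mentioned in the text.

```python
import re
from collections import Counter

# Hypothetical, very small stop list; real systems use far larger ones.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "to", "is", "for", "on", "with"}

def represent(text):
    """Build a representation: a bag of index terms (lowercased, stop words dropped)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

def score(query_rep, doc_rep):
    """Matching step: total occurrences in the document of the query's terms."""
    return sum(doc_rep[t] for t in query_rep)

def search(query, documents):
    """Return (doc_id, score) pairs with a positive score, best match first."""
    query_rep = represent(query)  # queries get the same treatment as documents
    ranked = [(doc_id, score(query_rep, represent(text)))
              for doc_id, text in documents.items()]
    ranked = [(doc_id, s) for doc_id, s in ranked if s > 0]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical sample collection, for illustration only.
documents = {
    "d1": "Evaluation of information retrieval techniques",
    "d2": "Fuzzy logic introduction for radar remote sensing",
    "d3": "Identification of user goals in web search and retrieval",
}
results = search("retrieval of web documents", documents)
print(results)  # [('d3', 2), ('d1', 1)]
```

Here the query and the documents pass through the same `represent` function, so that both sides are expressed in the same index terms before matching; "d2" shares no term with the query and is therefore not retrieved.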

Since human–computer interaction factors and cognitive aspects play a significant role in the Web context [7], it is useful to detail this model further. IR systems recognize that the information need is associated with some task. This need is verbalized (usually mentally, not aloud) and translated into a query posed to a search engine. This process of deriving a query from an information need in the Web context has received a great deal of attention.

1. Allan J., Carterette B., and Lewis J., "When will information retrieval be good enough?", In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 433–440, New York, NY, USA, ACM Press, 2005.

2. Lee U. et al., "Automatic identification of user goals in web search", Proceedings of WWW 2005, ACM Press.

3. Järvelin K. and Kekäläinen J., "Cumulated gain-based evaluation of IR techniques", ACM Trans. Inf. Syst., 20(4):422–446, 2002.

4. Järvelin K. and Kekäläinen J., "IR evaluation methods for retrieving highly relevant documents", In Proceedings of the ACM Conference on Research and Development on Information Retrieval (SIGIR), 2000.

5. Hellmann M., "Fuzzy Logic Introduction", Epsilon Nought Radar Remote Sensing Tutorials, 2001.

6. Montebello M., "Wrapping WWW Information Sources", Proceedings of the 2000 International Database Engineering and Applications Symposium (IDEAS'00).

7. Lieberman H. and Selker T., "Out of context: Computer systems that adapt to, and learn from, context", IBM Systems Journal, 39(3 & 4), 2000.
