[eu_members at aclweb dot org] Book Announcement:: Statistical Language Models for Information Retrieval


Statistical Language Models for Information Retrieval

ChengXiang Zhai (University of Illinois, Urbana-Champaign)

Synthesis Lectures on Human Language Technologies #1 (Morgan & 
Claypool Publishers), 2009, 141 pages

As online information grows dramatically, search engines such as 
Google play an increasingly important role in our lives. 
Critical to all search engines is the problem of designing an 
effective retrieval model that can rank documents accurately for a 
given query. This has been a central research problem in information 
retrieval for several decades. In the past ten years, a new generation 
of retrieval models, often referred to as statistical language models, 
has been successfully applied to solve many different information 
retrieval problems. Compared with traditional models such as the 
vector space model, these new models have a sounder statistical 
foundation and can leverage statistical estimation to optimize 
retrieval parameters. They can also be more easily adapted to model 
non-traditional and complex retrieval problems. Empirically, they tend 
to perform comparably to or better than a traditional model, 
with less effort spent on parameter tuning. This book systematically reviews 
the large body of literature on applying statistical language models 
to information retrieval with an emphasis on the underlying 
principles, empirically effective language models, and language models 
developed for non-traditional retrieval tasks. All the relevant 
literature has been synthesized to make it easy for a reader to digest 
the research progress achieved so far and see the frontier of research 
in this area. The book also offers practitioners an informative 
introduction to a set of practically useful language models that can 
effectively solve a variety of retrieval problems. No prior knowledge 
of information retrieval is required, but some basic knowledge 
of probability and statistics would be useful for fully digesting 
all the details.

Table of Contents: Introduction / Overview of Information Retrieval 
Models / Simple Query Likelihood Retrieval Model / Complex Query 
Likelihood Model / Probabilistic Distance Retrieval Model / Language 
Models for Special Retrieval Tasks / Language Models for Latent Topic 
Analysis / Conclusions


This title is available online without charge to members of 
institutions that have licensed the Synthesis Digital Library of 
Engineering and Computer Science.  Members of licensing institutions 
have unlimited access to download, save, and print the PDF without 
restriction; use of the book as a course text is encouraged.  To find 
out whether your institution is a subscriber, visit 
<http://www.morganclaypool.com/page/licensed>, or just click on the 
book's URL above from an institutional IP 
address and attempt to download the PDF.  Others may purchase the book 
from this URL as a PDF download for US$30 or in print for US$40.  
Printed copies are also available from Amazon and from booksellers 
worldwide at approximately US$40 or local currency equivalent.