Statistical language models for information retrieval
As online information grows dramatically, search engines such as Google are playing a more and more important role in our lives. Critical to all search engines is the problem of designing an effective retrieval model that can rank documents accurately for a given query. This has been a central resea...
Main Author: | |
---|---|
Format: | Electronic |
Language: | English |
Published: | San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool Publishers, c2008. |
Series: | Synthesis lectures on human language technologies (Online) ; # 1. |
Subjects: | |
Online Access: | Abstract with links to full text |
Table of Contents:
- Introduction
- Basic concepts in information retrieval
- Statistical language models
- Overview of information retrieval models
- Similarity-based models
- Probabilistic relevance models
- Probabilistic inference models
- Axiomatic retrieval framework
- Decision-theoretic retrieval framework
- Summary
- Simple query likelihood retrieval model
- Basic idea
- Event models for θ_d
- Multinomial θ_d
- Multiple Bernoulli θ_d
- Multiple Poisson θ_d
- Comparison of the three models
- Estimation of θ_d
- A general smoothing strategy using collection language model
- Jelinek-Mercer smoothing (fixed coefficient interpolation)
- Dirichlet prior smoothing
- Absolute discounting smoothing
- Interpolation vs. backoff
- Other smoothing methods
- Comparison of different smoothing methods
- Smoothing and TF-IDF weighting
- Two-stage smoothing
- Exploit document prior
- Summary
- Complex query likelihood retrieval model
- Document-specific smoothing of θ_d
- Cluster-based smoothing
- Document expansion
- Beyond unigram models
- Parsimonious language models
- Full Bayesian query likelihood
- Translation model
- Summary
- Probabilistic distance retrieval model
- Difficulty in supporting feedback with query likelihood
- Kullback-Leibler divergence retrieval model
- Estimation of query models
- Model-based feedback
- Markov chain query model estimation
- Relevance model
- Structured query models
- Negative relevance feedback
- Summary
- Language models for special retrieval tasks
- Cross-lingual information retrieval
- Distributed information retrieval
- Structured document retrieval and combining representations
- Personalized and context-sensitive search
- Expert finding
- Passage retrieval
- Subtopic retrieval
- Other retrieval-related tasks
- Modeling redundancy and novelty
- Predicting query difficulty
- Summary
- Language models for latent topic analysis
- Probabilistic latent semantic analysis (PLSA)
- Latent Dirichlet allocation (LDA)
- Extensions of PLSA and LDA
- Topic model labeling
- Using topic models for retrieval
- Summary
- Conclusions
- Language models vs. traditional retrieval models
- Summary of research progress
- Future directions