Abstract

Relevance-based language models operate by estimating the probabilities of observing words in documents relevant (or pseudo-relevant) to a topic. However, these models assume that if a document is relevant to a topic, then all tokens in the document are relevant to that topic. This assumption can limit model robustness and effectiveness. In this study, we propose a Latent Dirichlet relevance model, which relaxes this assumption. Our approach derives from current research on Latent Dirichlet Allocation (LDA) topic models. LDA has been extensively explored, especially for discovering a set of topics from a corpus. LDA itself, however, has a limitation that is also addressed in our work: topics generated by LDA from a corpus are synthetic, i.e., they do not necessarily correspond to topics identified by humans for the same corpus. In contrast, our model explicitly considers the relevance relationships between documents and given topics (queries). Thus, unlike standard LDA, our model is directly applicable to goals such as relevance feedback for query modification and text classification, where topics (classes and queries) are provided upfront. Although the focus of our paper is on improving relevance-based language models, in effect our approach bridges relevance-based language models and LDA, addressing limitations of both.
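For context, below is a minimal Python sketch of the classical relevance-model estimate the abstract builds on (RM1-style pseudo-relevance feedback, in the spirit of Lavrenko and Croft), not the paper's Latent Dirichlet relevance model. The function names, Dirichlet smoothing parameter, and data layout are illustrative assumptions; the sketch only makes concrete the "all tokens are relevant" assumption that the paper sets out to relax.

    from collections import Counter

    def doc_lm(doc_tokens, collection_tf, collection_len, mu=2000.0):
        # Dirichlet-smoothed unigram document model P(w | D).
        counts = Counter(doc_tokens)
        n = len(doc_tokens)
        return lambda w: (counts[w] + mu * collection_tf.get(w, 0) / collection_len) / (n + mu)

    def query_likelihood(query_tokens, p_wd):
        # P(Q | D): product of per-term probabilities under the document model.
        score = 1.0
        for w in query_tokens:
            score *= p_wd(w)
        return score

    def relevance_model(query_tokens, feedback_docs, collection_tf, collection_len):
        # P(w | R) ~ sum over feedback documents D of P(w | D) * P(Q | D), normalized.
        # Every token of every feedback document contributes in proportion to the
        # whole document's query likelihood: this is the document-level relevance
        # assumption that the Latent Dirichlet relevance model relaxes.
        models = [doc_lm(d, collection_tf, collection_len) for d in feedback_docs]
        weights = [query_likelihood(query_tokens, m) for m in models]
        z = sum(weights) or 1.0
        vocab = {w for d in feedback_docs for w in d}
        return {w: sum(wt * m(w) for wt, m in zip(weights, models)) / z
                for w in vocab}

    # Toy usage: two pseudo-relevant documents and a two-term query.
    docs = [["latent", "dirichlet", "allocation", "topics"],
            ["relevance", "feedback", "query", "topics"]]
    ctf = Counter(w for d in docs for w in d)
    rm = relevance_model(["topics", "relevance"], docs, ctf, sum(ctf.values()))

The proposed model would instead allow individual tokens within a relevant document to be attributed to different latent topics, rather than crediting the entire document to the query's topic as above.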

Links and resources

URL:
BibTeX key:
viet2009latent