
Information retrieval test collection for searching spontaneous Czech speech

  • Pavel Ircing
  • Pavel Pecina
  • Douglas W. Oard
  • Jianqiang Wang
  • Ryen W. White
  • Jan Hoidekr

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

8 Scopus citations

Abstract

This paper describes the design of the first large-scale IR test collection built for the Czech language. Creating the collection was particularly challenging because it is based on a continuous text stream produced by automatic transcription of spontaneous speech and therefore lacks clearly defined document boundaries. All aspects of building the collection are presented, together with some general findings from initial experiments.

Original language: English
Title of host publication: Text, Speech and Dialogue - 10th International Conference, TSD 2007, Proceedings
Publisher: Springer Verlag
Pages: 439-446
Number of pages: 8
ISBN (Print): 9783540746270
State: Published - 2007
Event: 10th International Conference on Text, Speech and Dialogue, TSD 2007 - Pilsen, Czech Republic
Duration: Sep 3 2007 to Sep 7 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4629 LNAI

Conference

Conference: 10th International Conference on Text, Speech and Dialogue, TSD 2007
Country/Territory: Czech Republic
City: Pilsen
Period: 09/3/07 to 09/7/07
