Information Retrieval Test Collection for Searching Spontaneous Czech Speech

Publication at Faculty of Mathematics and Physics

2007

Abstract

This paper describes the design of the first large-scale IR test collection built for the Czech language. The collection is also very challenging: it is based on a continuous text stream produced by automatic transcription of spontaneous speech and therefore lacks clearly defined document boundaries.

All aspects of building the collection are presented, together with some initial experiments.
