The paper presents ORAL2008, a balanced corpus designed to represent authentic spoken Czech. It concentrates on the data collection, its broad coverage, and the transcription system, and it outlines possible findings based on the data.