|LDC Catalog No.:||LDC98S68|
|Sample Type:||1-channel pcm|
|Data Source(s):||telephone speech|
|Application(s):||speech recognition, speaker identification|
LDC User Agreement for Non-Members
|Online Documentation:||LDC98S68 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Reynolds, Douglas. LLHDB LDC98S68. Web Download. Philadelphia: Linguistic Data Consortium, 1998.|
LLHDB consists of recordings of people speaking into ten different telephone handsets. The aim was to create a corpus for studying telephone transducer effects on speech while minimizing confounding factors such as variable telephone channels and background noise. LLHDB was created by having volunteers speak prompted and extemporaneous speech into different transducers in a soundproof room and directly digitizing the transducer output on a SunSparc A/D converter at an 8 kHz sampling rate and 16-bit resolution.
Three types of speech were recorded for each handset. First, the speaker read the "rainbow passage" [Nolan 83], a 97-word passage sometimes used in phonetic research. Second, the speaker read ten sentences extracted from TIMIT (LDC93S1). Finally, the speaker was asked to describe a photograph for approximately 40 seconds (a different photograph was used for each handset). LLHDB contains speech from 53 speakers (24 males and 29 females) recruited from the laboratory.
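As a rough guide to data volume, the stated format (1 channel, 8 kHz, 16-bit samples) fixes the size of uncompressed audio for a given duration. The sketch below works this out for the roughly 40-second photograph description. The function name `pcm_bytes` is illustrative only, and the figure excludes any file header and assumes the samples are stored uncompressed.

```python
# Format parameters taken from the corpus description above.
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling rate
BYTES_PER_SAMPLE = 2    # 16-bit resolution
CHANNELS = 1            # 1-channel PCM


def pcm_bytes(duration_s: float) -> int:
    """Return the size in bytes of uncompressed PCM audio of the given duration.

    Illustrative helper (not part of any LDC tooling); header bytes are not counted.
    """
    n_samples = int(round(duration_s * SAMPLE_RATE_HZ))
    return n_samples * BYTES_PER_SAMPLE * CHANNELS


if __name__ == "__main__":
    # Approximate size of one 40-second photograph description.
    print(pcm_bytes(40))  # 640000
```

At this rate each second of speech occupies 16,000 bytes, so a 40-second recording is about 640 kB before any header.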
Because the same handsets are used in both HTIMIT (LDC98S67) and LLHDB, it is possible to compare the effects of the two different recording methods.
Relative to the original CD-ROMs produced in 1998 by the Linguistic Data Consortium, the extension of the audio files has been changed from ".wav" to ".sph".
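The ".sph" extension conventionally denotes NIST SPHERE files, which begin with a plain-text header: a "NIST_1A" line, a line giving the header size in bytes, then "name -type value" fields terminated by "end_head". The following is a minimal sketch of reading such a header, assuming the LLHDB files follow that convention. The function name `read_sphere_header` is illustrative, and the demo header is synthetic; the fields actually present in LLHDB files are not listed here and should be checked against the corpus documentation.

```python
import io


def read_sphere_header(stream):
    """Parse a NIST SPHERE header from a binary stream.

    Returns (header_size, fields). Illustrative helper, not LDC tooling.
    """
    magic = stream.readline().strip()
    if magic != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    header_size = int(stream.readline().strip())

    # Re-read the whole header block from the start of the file.
    stream.seek(0)
    header = stream.read(header_size).decode("ascii", errors="replace")

    fields = {}
    for line in header.splitlines()[2:]:  # skip the magic and size lines
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, ftype, value = parts
        if ftype == "-i":        # integer field
            fields[name] = int(value)
        elif ftype == "-r":      # real-valued field
            fields[name] = float(value)
        else:                    # string field, e.g. -s3
            fields[name] = value
    return header_size, fields


if __name__ == "__main__":
    # Synthetic header for demonstration only (values mirror the stated format).
    text = (
        b"NIST_1A\n   1024\n"
        b"channel_count -i 1\n"
        b"sample_rate -i 8000\n"
        b"sample_n_bytes -i 2\n"
        b"sample_coding -s3 pcm\n"
        b"end_head\n"
    )
    demo = io.BytesIO(text.ljust(1024, b" "))
    size, fields = read_sphere_header(demo)
    print(size, fields)
```

With a real file, the stream would come from `open(path, "rb")`, and the audio samples start at byte offset `header_size`.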