CHAracterizing INdividual Speakers (CHAINS)
|Item Name:||CHAracterizing INdividual Speakers (CHAINS)|
|Author(s):||Fred Cummins, Marco Grimaldi, Thomas Leonard, Juraj Simko|
|LDC Catalog No.:||LDC2008S09|
|Release Date:||November 18, 2008|
|Sample Type:||16 bit linear PCM|
|Data Source(s):||microphone speech|
|License(s):||LDC User Agreement for Non-Members|
|Online Documentation:||LDC2008S09 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Cummins, Fred, et al. CHAracterizing INdividual Speakers (CHAINS) LDC2008S09. Web Download. Philadelphia: Linguistic Data Consortium, 2008.|
CHAINS was created by researchers at University College Dublin and contains recordings of thirty-six English speakers reading fables and selected sentences in different speaking styles. The data was obtained in two different sessions with a time separation of about two months. The goal of the corpus is to provide a range of speaking styles and voice modifications for speakers sharing the same accent. Other existing corpora, in particular CSLU Speaker Recognition Version 1.1, TIMIT and the IViE corpus (English Intonation in the British Isles), served as referents in the selection of material. This design decision was made to ensure that methods designed and evaluated on the CHAINS corpus might be directly testable on these other corpora, which were recorded using quite different dialects and channel characteristics.
Additional documentation about the corpus and its methodology is available at the CHAINS website.
The data was collected in two recording sessions in a total of six different speaking styles. The first recording session was carried out in a professional recording studio in December 2005. Speakers were recorded in a sound-attenuated booth reading text in the solo, synchronous and retell styles using a Neumann U87 condenser microphone. Additional tracks using other microphones (near and far-field) were also recorded and may be made available upon request to the authors. The second recording session took place from March 2006 to May 2006 in a quiet office environment, using an AKG C420 headset condenser microphone. Speakers read text in the rsi, whisper and fast modes. The six different speaking styles were:
- solo reading
- synchronous reading
- spontaneous speech (retell)
- repetitive synchronous imitation (rsi)
- whispered reading
- fast speech reading
In two of the speaking conditions, speakers modified their speech in a constrained fashion towards a known target: in the synchronous condition, the speech of the co-speaker served as the target, while in rsi there was an explicit, known, static target. The presence of a known target that speakers aim to copy raises the bar for the discovery and design of procedures for automatic speaker identification, as the target speech provides a potentially highly confusing foil. The whisper and fast speech conditions are also well-defined speaking styles that require substantial voice modification by the speaker.
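The session and style layout described above can be summarised in a small lookup table, which may be convenient when partitioning the recordings for speaker identification experiments. The following Python sketch is illustrative only: the class name, field names and helper functions are hypothetical, and the short labels (solo, synchronous, retell, rsi, whisper, fast) are taken from this description rather than from the directory or file names of the release, which should be checked against the corpus documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakingStyle:
    """One CHAINS speaking style (hypothetical helper, not part of the corpus)."""
    label: str              # short label used in this description
    description: str        # wording from the list of styles
    session: int            # 1 = studio session, 2 = office session
    has_known_target: bool  # True where speakers aimed at a known target


# Microphone used for the released recordings in each session, per the description.
SESSION_MICROPHONES = {
    1: "Neumann U87 condenser microphone",
    2: "AKG C420 headset condenser microphone",
}

# Assumption: labels mirror the names used in the text, not the on-disk layout.
STYLES = [
    SpeakingStyle("solo", "solo reading", 1, False),
    SpeakingStyle("synchronous", "synchronous reading", 1, True),
    SpeakingStyle("retell", "spontaneous speech", 1, False),
    SpeakingStyle("rsi", "repetitive synchronous imitation", 2, True),
    SpeakingStyle("whisper", "whispered reading", 2, False),
    SpeakingStyle("fast", "fast speech reading", 2, False),
]


def styles_for_session(session: int) -> list[SpeakingStyle]:
    """Return the styles recorded in the given session, in list order."""
    return [s for s in STYLES if s.session == session]


def target_styles() -> list[SpeakingStyle]:
    """Return the styles in which speakers modified their speech towards a known target."""
    return [s for s in STYLES if s.has_known_target]


if __name__ == "__main__":
    for session, microphone in SESSION_MICROPHONES.items():
        labels = ", ".join(s.label for s in styles_for_session(session))
        print(f"Session {session} ({microphone}): {labels}")
```

A table of this kind makes it straightforward to, for example, train on the session 1 styles and evaluate on the session 2 styles, or to isolate the two known-target conditions as the hardest foils.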
Participants were recruited through University College Dublin and were paid for their participation. No participant had any known speech or hearing deficit. The speakers were from the United Kingdom, the eastern part of Ireland (Dublin and adjacent counties) and the United States. Further information about the speakers, their gender and dialect is available in the documentation released with this corpus.
For an example of the data in this corpus, please examine this sound file of the fast reading style.