OLAC Language Resource Catalog


CSLU: Multilanguage Telephone Speech Version 1.2
Title:
CSLU: Multilanguage Telephone Speech Version 1.2
ID:
LDC2006S35
https://catalog.ldc.upenn.edu/LDC2006S35
ISBN: 1-58563-390-9
ISLRN: 871-936-811-171-7
Online:
Yes
Archive:
The LDC Corpus Catalog
Date:
2006
Publisher:
Linguistic Data Consortium
https://www.ldc.upenn.edu
Description:
*Introduction* The Multilanguage Telephone Speech corpus consists of telephone speech from 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese. The corpus contains fixed-vocabulary utterances (e.g., days of the week) as well as fluent continuous speech. The current release includes recorded utterances from about 2,052 speakers, for a total of about 38.5 hours of speech. Time-aligned phonetic transcriptions for 619 of the utterances are also included.
*Data* Each subject called the CSLU data collection system by dialing a toll-free number. An analog telephone line was connected to a Gradient Technologies box, which recorded the data from incoming calls. The sampling rate was 8 kHz, and the files were stored in 16-bit linear format on a UNIX file system. Each utterance was recorded as a separate file.
*Samples* For an example of the data in this corpus, please listen to these audio samples in Tamil and English.
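As a rough illustration of what the stated recording format implies, the sketch below converts between duration and raw audio size for 8 kHz, 16-bit linear samples. It assumes single-channel audio with no file headers or compression, none of which the record states; the function names are hypothetical and are not part of any LDC or CSLU tooling.

```python
# Values taken from the record's description.
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling rate
BYTES_PER_SAMPLE = 2    # 16-bit linear samples

# Assumption (not stated in the record): one channel per recording.
CHANNELS = 1


def pcm_bytes(duration_s: float) -> int:
    """Raw sample bytes needed for a recording of the given duration."""
    return round(duration_s * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


def pcm_duration_s(num_bytes: int) -> float:
    """Duration in seconds represented by the given number of raw sample bytes."""
    return num_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


if __name__ == "__main__":
    total = pcm_bytes(38.5 * 3600)  # about 38.5 hours of speech
    print(f"{total} bytes, roughly {total / 1e9:.1f} GB of raw samples")
```

Under these assumptions, one second of speech takes 16,000 bytes, so the roughly 38.5 hours in the release come to about 2.2 GB of raw samples. The actual distribution size may differ if the files carry headers or are compressed.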
Content language:
Vietnamese
Tamil
Spanish
Iranian Persian
Korean
Japanese
Hindi
French
English
German
Mandarin Chinese
Linguistic type:
Primary text
DCMI type:
Sound
Other format:
Sampling Rate: 8000
Sampling Format: pcm
Distribution: Web Download
Other language:
Vietnamese
Tamil
Spanish
Iranian Persian
Korean
Japanese
Hindi
French
English
German
Mandarin Chinese
Other rights:
Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf
Rights holder: Portions © 1992, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2006 Trustees of the University of Pennsylvania

Find Related Information:

Archive: The LDC Corpus Catalog
Contributor: Cole, Ronald Allan
Contributor: Muthusamy, Yeshwant
Contributor: Oshika, Beatrice