OLAC Language Resource Catalog

Collins Multilingual database (MLD) - WordBank
Title:
Collins Multilingual database (MLD) - WordBank
ID:
ELRA-T0376
Link to the object:
Online:
Yes
Archive:
ELRA Catalogue of Language Resources
Date:
2016-07-12
Publisher:
ELRA (European Language Resources Association)
Description:
Terminological Resources
The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank) and a multilingual set of sentences in 28 languages (the PhraseBank, distributed separately under reference ELRA-T0377). The WordBank contains 10,000 words for each language (Arabic, Chinese, Croatian, Czech, Danish, Dutch, American English, British English, Finnish, French, German, Greek, Italian, Japanese, Korean, Norwegian, Polish, Portuguese (Iberian), Portuguese (Brazilian), Russian, Spanish (Iberian), Spanish (Latin American), Swedish, Thai, Turkish, Vietnamese, Hindi, Tamil, Bengali, Malayalam, Romanian, Ukrainian), XML-annotated for part-of-speech, gender, irregular forms and disambiguating information for homographs. An additional dataset of 10,000 headwords is included for 12 languages (Chinese, American and British English, French, German, Italian, Japanese, Korean, Iberian and Brazilian Portuguese, Iberian and Latin American Spanish). All English headwords contain Cobuild learner's dictionary style definitions and one or more examples of the word in context. Lemmatized lists and verb tables are available for English, French, German, Spanish and Italian. Romanization is provided for Chinese, Japanese, Korean and Thai. The corresponding audio files are available for 26 of the 32 languages (all except Hindi, Tamil, Bengali, Malayalam, Romanian and Ukrainian) and are distributed in a package referenced as ELRA-S0382.
This multilingual lexicon covers Real Life Daily vocabulary in 32 languages. It contains 10,000 words for each language, XML-annotated for part-of-speech, gender, irregular forms and disambiguating information for homographs, and 10,000 additional headwords for 12 languages.
Content language:
Arabic
Chinese
Czech
Danish
Dutch
English
Finnish
French
German
Modern Greek (1453-)
Italian
Japanese
Korean
Norwegian
Polish
Portuguese
Russian
Spanish
Swedish
Thai
Turkish
Vietnamese
Hindi
Tamil
Bengali
Malayalam
Romanian
Ukrainian
Linguistic type:
Primary text
DCMI type:
Text
Other language:
Arabic
Chinese
Croatian
Czech
Danish
Dutch, Flemish
English
Finnish
French
German
Greek, Modern (1453-)
Italian
Japanese
Korean
Norwegian
Polish
Portuguese
Russian
Spanish, Castilian
Swedish
Thai
Turkish
Vietnamese
Hindi
Tamil
Bengali
Malayalam
Romanian
Ukrainian
Complete OLAC record:
Link for this page:

Find Related Information:

Archive: ELRA Catalogue of Language Resources