OLAC Language Resource Catalog

UMC 0.1: Czech-Russian-English Multilingual Corpus
Title:
UMC 0.1: Czech-Russian-English Multilingual Corpus
Online:
Yes
Contributor:
Klyueva, Natalia (author)
Bojar, Ondřej (author)
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Description:
UMC 0.1 Czech-English-Russian is a multilingual parallel corpus of Czech, Russian and English texts with automatic pairwise sentence alignment. The primary aim of UMC is to extend the set of languages covered by the CzEng corpus, mainly for the purposes of machine translation. All texts were downloaded from a single source, Project Syndicate (Copyright: Project Syndicate 1995-2008), which hosts a large collection of high-quality news articles and commentaries. We were given permission to use the texts for research and non-commercial purposes.
FP6-IST-5-034291-STP (EuroMatrix)
Content language:
Czech
Linguistic type:
Primary text
DCMI type:
Text
Other date:
2011-06-28T10:42:32Z
2008-10-02T00:00:00Z
Other rights:
Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)
http://creativecommons.org/licenses/by-nc-nd/3.0/
Other subject:
multi-language corpus
Other type:
corpus
Archive: LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University