OLAC Language Resource Catalog

Amharic-English bilingual corpus
Title:
Amharic-English bilingual corpus
ID:
ELRA-W0074
Link to the object:
Online:
Yes
Archive:
Date:
2013-12-17
Publisher:
ELRA (European Language Resources Association)
Description:
Written Corpora
The Amharic-English bilingual corpus contains parallel text from the legal and news domains in Amharic script, in transliterated form, and in English. The corpus comprises 232,653 words in Amharic and 291,701 words in English. The two domains are processed separately. The Amharic documents were prepared in the language's own script, which differs from the Latin alphabet. For ease of use and processing, as well as for normalization, the Amharic documents are transliterated and the English documents are converted to lower case. In addition, clean documents were prepared without separating the two domains. Amharic is a Semitic language spoken in Ethiopia.
Content language:
Amharic
English
Linguistic type:
Primary text
DCMI type:
Text
Other language:
Amharic
English
Other rights:
Rights available for: Research Use, Commercial Use
Complete OLAC record:
Link for this page:

Find Related Information:

Archive: ELRA Catalogue of Language Resources