OLAC Language Resource Catalog

The Open ANC (OANC) -- Corpus abierto del lenguaje americano
Title:
The Open ANC (OANC) -- Corpus abierto del lenguaje americano
Link to the object:
Online:
Yes
Archive:
Speech and Language Data Repository (SLDR/ORTOLANG)
Contributor:
Ide, Nancy (author)
Reppen, Randi (author)
Suderman, Keith (author)
National Science Foundation (BCS-98009, KDI, SBE) (sponsor)
TalkBank project (sponsor)
Department of Computer Science, Vassar College (New York US) (depositor)
Date:
2011-05-31
Publisher:
Department of Computer Science, Vassar College (New York US)
http://www.cs.vassar.edu
Description:
The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development. This open portion of the American National Corpus (OANC) contains approximately 15 million words from the full corpus.
The following corpora are included:

Spoken
- Charlotte
- Switchboard

Written
- Eggan (fiction)
- Slate
- Verbatim
- ICIC
- OUP
- 911 Report
- Biomed
- Government
- PLOS
- Berlitz

The following annotations are also included:
- Structural markup (divisions, paragraphs, etc.) down to the paragraph level
- Sentence boundaries
- Tokens with Hepple (Penn) part-of-speech annotations
- Noun chunks
- Verb chunks
Content language:
English
Subject language:
English
Language family:
Indo-European
Germanic
Country:
United States
Linguistic type:
Primary text
Linguistic field:
Text and corpus linguistics
Discourse analysis
Language documentation
Discourse type:
Narrative
DCMI type:
Sound
Format:
application/xml
application/zip
Other language:
English, American
Other rights:
info:eu-repo/date/submitted/2011-05-29
info:eu-repo/semantics/openAccess
Free access
The ANC has so far released 22 million words of American English, which are available from the Linguistic Data Consortium.
© Ide, Nancy, and Suderman, Keith (2007). The Open American National Corpus (OANC). http://www.AmericanNationalCorpus.org/OANC
Freely communicable documents. (Code du Patrimoine, art. L. 211-1, L. 211-4, L. 213-1)
Other subject:
Information Retrieval
Parsing
Sense Disambiguation
Discourse Modeling
Language Teaching
Text Databases
Human Machine Communication
English, American
Other type:
info:eu-repo/semantics/dataset

Find Related Information:

Archive: Speech and Language Data Repository (SLDR/ORTOLANG)