Abstract
This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. We then present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we demonstrate how to explore these corpora using Xara and other corpus tools.
Original language | English |
---|---|
Publication status | Published - 2004 |
Event | 4th Workshop on Asian Language Resources - Sanya, China. Duration: 25 Mar 2004 → … |