This paper presents the newly released Lancaster Corpus of Mandarin Chinese (LCMC), a Chinese match for the FLOB and Frown corpora of British and American English. We first discuss the major decisions we took when building the corpus, which relate to sampling, text collection, mark-up, and annotation. We then use the corpus to study aspect marking in Chinese and in British and American English. The study shows that, although Chinese and English are typologically different, aspect markers in the two languages show a strikingly similar distribution pattern, especially across the two broad categories of narrative and expository texts. The study also reveals some important differences in the distribution of aspect markers between Chinese and English, and between British and American English, across fifteen text categories, and provides an account of these differences.
McEnery, T., Xiao, R., & Mo, L. (2003). Aspect Marking in English and Chinese: using the Lancaster Corpus of Mandarin Chinese for contrastive language study. Literary and Linguistic Computing, 18(4), 361-378. https://doi.org/10.1093/llc/18.4.361