Quantifying constructions in English and Chinese: A corpus-based contrastive study

T. McEnery, R. Xiao

    Research output: Contribution to conference › Paper › peer-review


    Quantifiers are a linguistic concept that mirrors quantity in reality. They indicate ‘how many’ or ‘how much’: for example, the number of entities denoted by a noun, the count of actions or events, the length of time, and the distance in space. All human languages have linguistic devices that express such ideas, though the encoding of natural language semantics can vary from language to language. This paper compares quantifying constructions in English and Chinese on the basis of comparable corpora of spoken and written data in the two languages. We will focus on classifiers in Chinese and their counterparts in English, as well as the interaction between quantifying constructions and progressives, which is normally ruled out by aspect theory, with the aim of addressing the following research questions:

    • What linguistic devices are used in Chinese and English for quantification?
    • How different (or similar) are classifiers in Chinese as a classifier language and in English as a non-classifier language?
    • Can quantifiers interact with progressives in English and Chinese if such interactions are theoretically ruled out by aspect theory?

    Before these research questions are explored in detail, it is appropriate to first present the principal data used in this study, which includes two written corpora and two spoken corpora. The Freiburg-LOB (FLOB) corpus, a recent update of LOB, comprises approximately one million tokens of written British English sampled proportionally from fifteen text categories published in the early 1990s (Hundt et al. 1998). The Lancaster Corpus of Mandarin Chinese (LCMC) was designed as a Chinese match for FLOB and created using the same sampling criteria, representing written Mandarin Chinese published in China in the corresponding sampling period (McEnery et al. 2003). The two spoken corpora are BNCdemo and CallHome Mandarin. BNCdemo is the demographically sampled component of the British National Corpus (BNC), which contains four million tokens of transcripts of conversations recorded around the early 1990s. The CallHome Mandarin Transcripts corpus, which was released by the LDC, comprises 120 transcripts of 5-to-10-minute telephone conversations recorded in the first half of the 1990s between native Chinese speakers living overseas and their families in China, amounting to approximately 300,000 tokens. While telephone calls differ from face-to-face conversations along some dimensions (Biber 1988), the sampling periods of the two spoken corpora are roughly comparable. A practical reason for using the CallHome corpus is that it is the closest match to BNCdemo available to us.

    In the remaining sections of this article, we will first explore classifiers in Chinese and English, on the basis of which the two will be compared. We will then discuss the interaction of the progressive with quantifying constructions in the two languages.
    Original language: English
    Publication status: Published - 2007
    Event: 4th Corpus Linguistics Conference - University of Birmingham, United Kingdom
    Duration: 28 Jul 2007 to 30 Jul 2007


    Conference: 4th Corpus Linguistics Conference
    Country/Territory: United Kingdom

