Language & Education Network Research Seminar - Investigating vocabulary in academic spoken English of Business lectures in China

Event details

Investigating vocabulary in academic spoken English of Business lectures in China

Chen Chen (University of Exeter & Xi’an Jiaotong-Liverpool University)

Recent years have witnessed a rapid increase in the number of English-medium-instruction (EMI) educational institutions worldwide in all phases of education, especially in higher education (Macaro et al., 2018). As this phenomenon has grown, particular concerns have been raised about students' comprehension of EMI lectures, and vocabulary knowledge has been identified as one of the key impediments to comprehension (Ellili-Cherif & Alkhateeb, 2015; Goh, 2013; Wang & Treffers-Daller, 2017; Xie, 2020). This project therefore aims to investigate vocabulary in EMI lectures in the discipline of Business at an EMI university in China.

To fulfil this aim, the project developed an EMI spoken academic corpus in Business (EMIB) comprising 120 lectures collected from 54 lecturers with nine different L1s, totalling 1.12 million tokens. Three linked studies were conducted on the basis of this corpus.

Study 1 drew on usage-based approaches (Ellis & Wulff, 2020) and investigated how word properties including frequency, salience, and contingency contribute to word difficulty, and how frequency, contextual distinctiveness, and keyness contribute to vocabulary usefulness. Multiple regression models were built, and the results show that word frequency and contingency significantly predicted word difficulty, while frequency, semantic diversity, and keyness significantly predicted word usefulness.

Study 2 compared the lexical difficulty and diversity of EMI Business lectures in China with those of academic spoken English in Anglophone and non-Anglophone settings, represented by the British Academic Spoken English Corpus (BASE) and the Corpus of English as a Lingua Franca in Academic Settings (ELFA), respectively. Lexical difficulty was operationalised as vocabulary frequency profile and average frequency score, and lexical diversity was measured with VOCD-D. The results show that the BASE has the highest level of both lexical difficulty and lexical diversity. The ELFA has greater lexical difficulty than the EMIB, but the two corpora do not differ significantly in lexical diversity.

Study 3 developed a vocabulary list based on frequency, range, dispersion, and keyness. The list reflects not only how prevalent the vocabulary is in the EMIB corpus but also the difficulty and usefulness of the words needed by EMI students on Business-related programmes.

Theoretical, pedagogical, and methodological implications of these findings are discussed.