

Faculty of Modern and Medieval Languages and Linguistics


Dr Paula Buttery

Reader in Computing and Language
Affiliated Researcher, Department of Theoretical and Applied Linguistics
Contact details: 

Gonville and Caius College
Trinity Street


Dr Buttery is a computational linguist with research interests in both computer applications (Natural Language Processing) and language cognition (Computational Psycholinguistics). She is the Director of the Automated Language Teaching and Assessment (ALTA) Institute. Her work within ALTA focuses on the spoken language of learners of English; she uses natural language processing and machine learning techniques to provide formative feedback to learners through computer applications. More generally, she is interested in building Natural Language Processing tools that work with non-canonical forms of natural language (spoken language, learner language, aphasic speech) and with low-resource languages (endangered languages, dialects). She is also interested in the computational modelling of first and second language acquisition and of language evolution.

Dr Buttery won a Cambridge University Pilkington Prize for teaching in 2015.

Teaching interests: 
  • Computational Linguistics
  • Computational Language Learning
  • Computational Psycholinguistics
Recent research projects: 

Computer-Assisted Language Learning (CALL) for the revitalization of endangered languages: The case of Runyakitara. - Funded by the Cambridge-Africa Alborada Research Fund and the Cambridge Africa Programme.

The Institute for Research in Automated Language Teaching and Assessment (ALTA). - Funded by Cambridge English Language Assessment.

Understanding the Effects of Topic on ‘Opportunity of Use’. - Funded by Cambridge English Language Assessment.

UK version of PubMed Central (UKPMC). - Funded at the European Bioinformatics Institute by the Wellcome Trust and the UKPMC funders group.

Computational Natural Language Processing and the Neuro-Cognition of Language. - Funded by the Engineering and Physical Sciences Research Council (EPSRC) as a Cognitive Systems Foresight Project.

English Profile: Reference Level Descriptions for English. - Funded by Cambridge English Language Assessment.

Published works: 

2015 Bentz, C., Verkerk, A., Kiela, D., Hill, F. and Buttery, P. Adaptive communication: Languages with more non-native speakers tend to have fewer word forms. PLOS ONE.

2015 Moore, R., Caines, A., Buttery, P. and Calbert, G. Incremental dependency parsing and disfluency detection in spoken learner English. Text, Speech, and Dialogue. Pilsen, Czech Republic.

2015 Thwaites, A., Nimmo-Smith, I., Fonteneau, E., Patterson, R., Buttery, P. and Marslen-Wilson, W. Tracking cortical entrainment in neural activity: Auditory processes in human temporal cortex. Frontiers in Computational Neuroscience. 9:5 doi:10.3389/fncom.2015.00005

2014 Caines, A. and Buttery, P. The effect of disfluencies and learner errors on the parsing of spoken learner language. Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages p74–81. Computational Linguistics (CoLing), Dublin, Ireland.

2014 Bentz, C. and Buttery, P. Towards a computational model of grammaticalization and lexical diversity. Proceedings of the 5th Workshop on Cognitive Aspects of Computational Language Learning (CogACLL). Gothenburg, Sweden. p38–42

2014 Bentz, C., Kiela, D., Hill, F. and Buttery, P. Zipf’s law and the grammar of languages: A quantitative study of Old and Modern English parallel texts. Corpus Linguistics and Linguistic Theory, 10:2, p175–211.

2013 Buttery, P., McCarthy, M. and Carter, R. Chatting in the academy: informality in spoken academic discourse. In: Suganthi, C., Groom, N. and Charles, M. (eds), Corpus and Academic Discourse. Amsterdam: John Benjamins. (33 pages)

2012 Hawkins, J. and Filipovic, L. with Buttery, P., Capel, A., Hawkey, R., Salamoura, A., Saville, N., and Trim, J. Criterial features in L2 English: specifying the reference levels of the Common European Framework. In: Milanovic, M. and Saville, N. (eds), English Profile Studies, Volume 1, Cambridge: Cambridge University Press.

2012 Buttery, P. and Caines, A. Normalising Frequency Counts to Account for ‘opportunity of use’ in Learner Corpora. In: Tono, Y., Kawaguchi, Y., and Minegishi, M. (eds.), Developmental and Crosslinguistic Perspectives in Learner Corpus Research. Amsterdam: John Benjamins. p187–204

2012 Buttery, P. and Caines, A. Reclassifying subcategorization frames for experimental analysis and stimulus generation. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12). Istanbul, Turkey. p1694–1698

2012 Caines, A. and Buttery, P. Annotating progressive aspect constructions in the spoken section of the British National Corpus. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12). Istanbul, Turkey. p1699–1703

2011 Buttery, P. and McCarthy, M. Lexis in Spoken Discourse. In: Gee, J., and Handford, M. (eds.), The Routledge Handbook of Discourse Analysis. London: Routledge. p285–300

2010 McEntyre, J.M., Ananiadou, S., Andrews, S., Black, W.J., Boulderstone, R., Buttery, P., Chaplin, D., Chevuru, S., Cobley, N., Coleman, L., Davey, P., Gupta, B., Haji-Gholam, L., Hawkins, C., Horne, A., Kim, J., Lewin, I., Lyte, V., MacIntyre, R., Mansoor, S., Mason, L., McNaught, J., Newbold, E., Nobata, C., Ong, E., Pillai, S., Rebholz-Schuhmann, D., Rosie, H., Rowbotham, R., Rupp, C.J., Stoehr, P., and Vaughan, P. UKPMC: a full text article resource for the life sciences. Nucleic Acids Research, 39: D58–D65

2010 Caines, A. and Buttery, P. ‘You Talking to Me?’ A predictive model for zero-auxiliary constructions. Proceedings of the Workshop on NLP and Linguistics, Finding the Common Ground, Association for Computational Linguistics (ACL-2010). Uppsala, Sweden. p43–51

2010 Hawkins, J. and Buttery, P. Criterial Features in Learner Corpora: Theory and Illustrations. English Profile Journal, 1:

2010 Thwaites, A., Geertzen, J., Marslen-Wilson, W., and Buttery, P. Lexical Isolation Point Software. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). Malta. p3727–3731

2010 Williams, C., Thwaites, A., Buttery, P., Geertzen, J., Randall, B., Shafto, M., Devereux, B., and Tyler, L. The Cambridge Cookie-Theft Corpus: A Corpus of Spontaneous and Directed Speech of Brain-Damaged Patients and Healthy Individuals. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). Malta. p2824–2830

2009 Vlachos, A., Buttery, P., Ó Séaghdha, D., and Briscoe, E. J. Biomedical Event Extraction without Training Data. Proceedings of BioNLP, North American Chapter of the Association for Computational Linguistics (NAACL 2009), Boulder, Colorado. p37–40

2009 Rice, A., Buttery, P., Rai, I., and Beresford, A. Language learning on a next-generation service platform for Africa. Africa Perspective on the Role of Mobile Technologies in Fostering Social and Economic Development, W3C Workshop. Maputo, Mozambique.

2009 Hawkins, J. and Buttery, P. Using Learner Language from Corpora to Profile Levels of Proficiency: Insights from the English Profile Programme. In: Taylor, L. and Weir, C. (eds.), Language Testing Matters: Investigating the Wider Social and Educational Impact of Assessment. Cambridge: Cambridge University Press. p158–175

2008 Briscoe, E. J. and Buttery, P. Linguistic adaptions for resolving ambiguity. In: Smith, A. D. M., Smith, K., and Ferrer i Cancho, R. (eds.), The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7). Barcelona, Spain. Singapore: World Scientific Press. p51–58

2007 Buttery, P. and Korhonen, A. ‘I will shoot your shopping down and you can shoot all my tins’; Automatic lexical acquisition from the CHILDES Database. Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition, Association for Computational Linguistics (ACL-2007). Prague, Czech Republic. p33–40

2007 Buttery, P., Villavicencio, A., and Korhonen, A. (eds.). Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition. Prague, Czech Republic. Association for Computational Linguistics.

2006 Buttery, P. Computational models for first language acquisition. Ph.D. thesis published as a technical report UCAM-CL-TR-675. Computer Laboratory, University of Cambridge.

2005 Buttery, P. and Korhonen, A. Large-scale analysis of verb subcategorization differences between child directed speech and adult speech. Verb Workshop 2005: Interdisciplinary Workshop on the Identification and Representation of Verb Features and Verb Classes. Saarland University, Germany.

2004 Buttery, P. A quantitative evaluation of naturalistic models of language acquisition; the efficiency of the Triggering Learning Algorithm compared to a Categorial Grammar Learner. Proceedings of the Workshop on Psycho-computational Models of Human Language Acquisition. COLING-2004, Association for Computational Linguistics. Geneva, Switzerland. p1–8

2004 Buttery, P. and Briscoe, E. J. The significance of errors to parametric models of language acquisition. In: Cohen, P., Clark, A., Hovy, E., Oates, T., and Witbrock, M. (eds.), Language Learning: An Interdisciplinary Perspective. Papers from the AAAI Spring Symposium. Association for the Advancement of Artificial Intelligence. Stanford, CA. p15–20

2003 Buttery, P. A computational model of first language acquisition. Proceedings of the 6th Annual CLUK Research Colloquium (CLUK-6). Edinburgh. p1–8