

Faculty of Modern and Medieval Languages and Linguistics


Professor Nigel Collier

Professor of Natural Language Processing
Fellow of the Alan Turing Institute
Co-Director of the Language Technology Lab
Professorial Fellow of Murray Edwards College
Nigel Collier is accepting applications from prospective PhD and MPhil students and is available for consultancy.
Theoretical and Applied Linguistics
Faculty of Modern & Medieval Languages & Linguistics
Contact details: 
Telephone number: 
+44 (0)1223 760373

Language Technology Lab
Theoretical and Applied Linguistics
Faculty of Modern and Medieval Languages
English Faculty Building
University of Cambridge
9 West Road, Cambridge CB3 9DP
United Kingdom
Office: TR-12
Phone: (+44) 1223 767 389



Nigel’s main research interests centre on machine learning for Natural Language Processing (NLP). He is active in Information Extraction and Text Mining, Social Media, Textual Inference, Generation, and Real-World Applications. He has experience in applications for the life sciences (including global public health) and finance.

He joined the Department of Theoretical and Applied Linguistics in 2015 as Director of Research in Computational Linguistics and is currently both a University Lecturer and an EPSRC Experienced Research Fellow. He has a joint affiliation with the Alan Turing Institute for data science and artificial intelligence, where he holds a fellowship. From 2012 to 2014 he was a Marie Curie Research Fellow at the European Bioinformatics Institute in Cambridge; before that he was an Associate Professor at the National Institute of Informatics in Tokyo, where he led the NLP laboratory. He received his doctorate from the University of Manchester in 1996 and held post-doctoral positions at Toshiba Corporation and the University of Tokyo. His research has been funded by UK, EU and Japanese research councils (JSPS, JST, FP7, EPSRC, MRC).


Teaching interests: 

Natural Language Processing/Computational Linguistics


Research interests: 
  • Information Extraction and Text Mining
  • Integration of Text and Knowledge Graph
  • Social Media
  • Fact Verification
  • Textual Inference
  • Text Generation
  • NLP for Real-World Applications


Recent research projects: 

2020 – 2023, EPI-AI (ESRC funded), Automated Understanding and Alerting of Disease Outbreaks from Global News Media

2015 – 2020, SIPHS (EPSRC funded), Semantic interpretation of personal health messages

2012 – 2014, PhenoMiner (EC FP7 funded), Semantic mining of phenotype associations from the scientific literature

2006 – 2012, BioCaster (JST funded), Detecting public health rumors with a Web-based text mining system


Published works: 
  • Zaiqiao Meng, Fangyu Liu, Thomas Clark, Ehsan Shareghi and Nigel Collier. Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT. EMNLP 2021 (Main).

  • Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang and Nigel Collier. Plan-then-Generate: Controlled Data-to-Text Generation via Planning. EMNLP 2021 (Findings).

  • Fangyu Liu, Emanuele Bugliarello, Edoardo Maria Ponti, Siva Reddy, Nigel Collier and Desmond Elliott. Visually Grounded Reasoning across Languages and Cultures. EMNLP 2021 (Main).

  • Yixuan Su, Zaiqiao Meng, Simon Baker and Nigel Collier. Few-Shot Table-to-Text Generation with Prototype Memory. EMNLP 2021 (Findings).

  • Fangyu Liu, Ivan Vulić, Anna Korhonen and Nigel Collier. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders. EMNLP 2021 (Main).

  • Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen and Ivan Vulić. On Eliciting Word-in-Context Representations from Pretrained Language Models. CoNLL 2021.

  • Victor Prokhorov, Yingzhen Li, Ehsan Shareghi and Nigel Collier. Learning Sparse Sentence Encoding without Supervision: An Exploration of Sparsity in Variational Autoencoders. RepL4NLP 2021.

  • Fangyu Liu, Ivan Vulić, Anna Korhonen and Nigel Collier. Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking. ACL 2021 (Main).

  • Yixuan Su, Deng Cai, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier and Yan Wang. Dialogue Response Selection with Hierarchical Curriculum Learning. ACL 2021 (Main).

  • Yixuan Su, David Vandyke, Simon Baker, Yan Wang and Nigel Collier. Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation. ACL 2021 (Findings).