
Language Technology Lab
Theoretical and Applied Linguistics
Faculty of Modern and Medieval Languages
English Faculty Building
University of Cambridge
9 West Road, Cambridge CB3 9DP
United Kingdom
Office: TR-12
Phone: (+44) 1223 767 389
Nigel’s main research interests centre on machine learning for Natural Language Processing (NLP). He is active in the areas of Information Extraction and Text Mining, Social Media, Textual Inference, Text Generation and Real-World Applications. He has experience in applications for the life sciences (including global public health) and finance.
He joined the Department of Theoretical and Applied Linguistics in 2015 as Director of Research in Computational Linguistics and is currently both a University Lecturer and an EPSRC Experienced Research Fellow. He has a joint affiliation with the Alan Turing Institute for data science and artificial intelligence, where he holds a fellowship. From 2012 to 2014 he was a Marie Curie Research Fellow at the European Bioinformatics Institute in Cambridge and, prior to this, an Associate Professor at the National Institute of Informatics in Tokyo, where he led the NLP laboratory. He received his doctorate from the University of Manchester in 1996 and held post-doctoral positions at Toshiba Corporation and the University of Tokyo. His research has been funded by UK, EU and Japanese research councils (JSPS, JST, FP7, EPSRC, MRC).
Natural Language Processing/Computational Linguistics
- Information Extraction and Text Mining
- Integration of Text and Knowledge Graph
- Social Media
- Fact Verification
- Textual Inference
- Text Generation
- NLP for Real-World Applications
2015 – 2020, SIPHS (EPSRC funded), Semantic interpretation of personal health messages
2012 – 2014, PhenoMiner (EC FP7 funded), Semantic mining of phenotype associations from the scientific literature
2006 – 2012, BioCaster (JST funded), Detecting public health rumors with a Web-based text mining system
Pilehvar, M. T., Prokhorov, V., Kartsaklis, D. and Collier, N. (2018), “CARD-660: A Reliable Evaluation Framework for Rare Word Representation Models”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium (in press).
Kartsaklis, D., Pilehvar, M. T. and Collier, N. (2018), “Mapping Text to Knowledge Graph Entities using Multi-Sense LSTMs”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium.
Le, H. Q., Can, D. C., Vu, T. S., Dang, T. H., Pilehvar, M. T. and Collier, N. (2018), “Large-scale Exploration of Neural Relation Classification Architectures”, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium.
Gritta, M., Pilehvar, M. T. and Collier, N. (2018), “Which Melbourne? Augmenting Geocoding with Maps”, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, pp. 1285-1296.
Conforti, C., Pilehvar, M. T. and Collier, N. (2018), “Towards Automatic Fake News Detection: Asymmetric Stance Detection in News Articles”, in Proceedings of the First Workshop on Fact Extraction and Verification at EMNLP 2018, Brussels, Belgium.
Conforti, C., Pilehvar, M. T. and Collier, N. (2018), “Modeling the Fake News Challenge as an Asymmetric Stance Detection Task”, in Proceedings of the 2nd International Workshop on Rumours and Deception in Social Media (RDSM) at CIKM 2018, Turin, Italy.
Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N. (2017), “Vancouver Welcomes You! Minimalist Location Metonymy Resolution”, in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, pp. 1248-1259.
Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N. (2017), “What’s missing in geographical parsing?”, Language Resources and Evaluation, pp. 1-21.
Pilehvar, M. T., Camacho-Collados, J., Navigli, R. and Collier, N. (2017), “Towards a Seamless Integration of Word Senses into Downstream NLP Applications”, in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, August, pp. 1857-1869.
Pilehvar, M. T. and Collier, N. (2017), “Inducing Representations for Rare Words by Leveraging Lexical Resources”, in Proceedings of the European Chapter of the Association for Computational Linguistics (EACL), Valencia, Spain, pp. 388-393.