Marieke Meelen | Faculty of Modern and Medieval Languages and Linguistics

About

Marieke Meelen’s research interests include information structure, comparative syntax and historical linguistics. She is currently part of two projects, ‘Emergence of Egophoricity’ (with Prof Hill at SOAS, University of London) and ‘PaganTibet’ (with Prof Ramble, EPHE-PSL, Paris), and was part of the recently finished AHRC-funded ‘The History of Subject Pronouns’ (with Prof Willis at Oxford University and Prof Meier in Berlin). As the PI of an ELDP-funded research project documenting endangered languages in Nepal, she is interested in NLP and corpus creation for low-resource languages, having developed both ASR and HTR models for various Tibeto-Burman varieties.

As part of her British Academy postdoctoral fellowship, she worked on the history of V2 word orders across Indo-European languages and developed a historical treebank of Welsh. Her doctoral thesis combined methods from computational and historical linguistics to reconstruct verb-initial and verb-second word order patterns and information structure in Welsh in their Celtic historical context. She is also a computational linguistic consultant for a project on the annotation of Middle Welsh texts at the Philipps-Universität in Marburg.

Marieke was awarded her PhD at Leiden University in 2016, supervised by Prof Lisa Cheng and Prof Alexander Lubotsky.

Research

Recent research projects:

ERC AG ‘PaganTibet’ (2023-2028)
AHRC ‘Emergence of Egophoricity’ (2022-2026)
AHRC-DFG ‘History of Subject Pronouns in Northern Europe’ (2021-2024)
ELDP SG ‘An Audio-Visual Archive of South Mustang Tibetan’ (2022-2023)

Published works:

Meelen, M. (in press) Syntactic reconstruction in Celtic. In Carnie et al. (eds.) Formal Approaches to Celtic Linguistics, Language Science Press.
Meelen, M. (in press) Middle Cornish syntax. In Nurmio et al. (eds.) Palgrave Handbook of Celtic Languages & Linguistics.
Meelen, M. and Willis, D. (2024). The diachrony of Welsh subject pronouns in Elliott Lash (ed.) Studia Celtica Posnaniensia Vol 9. Special Issue: Noun phrase and pronominal syntax in medieval and early modern Celtic languages. 85-112.
O’Neill, A. and Meelen, M. (2024). Diachronic Annotated Corpus of Newar (DACON): from Manuscript to Morphosyntax in Cahiers Linguistique Asie Orientale, 1-30. DOI: 10.1163/19606028-bja10047
Meelen, M, Faggionato, C. and Hill, N. eds (2024). Tibetan digital humanities and natural language processing. Proceedings of the IATS 2022 panel as a Special Issue of the Revue d’Etudes Tibétaines 72.
Meelen, M, Nehrdich, S. and Keutzer, K. (2024). Breakthroughs in Tibetan NLP & Digital Humanities. In Meelen, M, Faggionato, C. and Hill, N. (eds) Tibetan digital humanities and natural language processing. Proceedings of the IATS 2022 panel as a Special Issue of the Revue d’Etudes Tibétaines. pp. 5-25. https://d1i1jdw69xsqx0.cloudfront.net/digitalhimalaya/collections/journals/ret/pdf/ret_72.pdf
Meelen, M, O’Neill, A and Coto-Solano, R. (2024). End-to-End Speech Recognition for Endangered Languages of Nepal in Moeller et al (eds.) Proceedings of the Comput-EL workshop at the EACL, pp. 83-93. https://aclanthology.org/2024.computel-1.12
Meelen, M, Hill, N. and Fellner, H. (2022) What are cognates? in Papers in Historical Phonology vol 7. DOI: https://doi.org/10.2218/pihph.7.2022.7405
Meelen, M. and Willis, D. eds. (2022). Creating annotated corpora for historical languages Special Issue for Journal of Historical Syntax, Vol. 6, pp. 1-6. https://ojs.ub.uni-konstanz.de/hs/index.php/hs/issue/view/47
Felbur, R, Meelen, M and Vierthaler, P. (2022). Crosslinguistic Semantic Textual Similarity for Classical Tibetan & Chinese in Journal of Open Humanities Data 8, 23. DOI: http://doi.org/10.5334/johd.86
Faggionato, C., Hill, N., & Meelen, M. (2022). NLP Pipeline for Annotating (Endangered) Tibetan and Newar Varieties. LREC-EURALI Proceedings, pp. 1-6. https://aclanthology.org/2022.eurali-1.1
Darling, M., Meelen, M., & Willis, D. (2022). Towards coreference resolution for Early Irish. LREC-CLTW Proceedings, pp. 85-93, https://aclanthology.org/2022.cltw4-1.1
Meelen, M and Willis, D. (2022). Towards a historical treebank of Middle and Modern Welsh: Syntactic parsing in Meelen & Willis (eds). Annotating Historical Corpora: Special Issue for Journal of Historical Syntax, 5:1-32, https://doi.org/10.18148/hs/2022v6i4-11.135
Meelen, M and Pujol I Campeny, A. (2021). Old Catalan Morphosyntax: Developing an Annotated Corpus in Journal of Open Humanities Data, 7:30, pp. 1–12. DOI: https://doi.org/10.5334/johd.54
Meelen, M., Roux, É., & Hill, N. (2021). Optimisation of the Largest Annotated Tibetan Corpus Combining Rule-based, Memory-based, and Deep-learning Methods. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 20(1), 1-11.
Meelen, M. (2020). The Emergence of V2 in Welsh. In Woods & Wolfe (eds) Rethinking V2, Oxford University Press, pp. 426-454.
Meelen, M and Roux, E. (2020). Meta-dating the PArsed Corpus of Tibetan (PACTib) in Kilian Evang, Laura Kallmeyer, Rafael Ehren, Simon Petitjean, Esther Seyffarth, Djamé Seddah (eds.) Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories, pp. 31–42.
Meelen, Marieke & Nurmio, Silva (2020) 'Adjectival agreement in Middle Welsh translated prose' in Journal of Celtic Linguistics. pp. 1-28.
Meelen, Marieke, Mourigh, Khalid & Cheng, Lisa (2020) 'V3 word order in Dutch urban varieties', in András Bárány, Theresa Biberauer, Jamie Douglas and Sten Vikner (eds.) Clausal Architecture and Its Consequences: Synchronic and Diachronic Perspectives, pp. 55-84.
Faggionato, C., & Meelen, M. (2019). Developing the old Tibetan treebank. In Proceedings of the RANLP. pp. 304-312.
Hill, N. W., & Meelen, M. (2017). Segmenting and POS tagging Classical Tibetan using a memory-based tagger. Himalayan Linguistics, 16(2), 64-86.
Meelen, Marieke & Hill, Nathan (2017) Segmenting and POS tagging Classical Tibetan, Himalayan Linguistics 16 (2), pp. 64-89.
Meelen, Marieke, Hill, Nathan, & Handy, Christopher. (2017a) The Annotated Corpus of Classical Tibetan (ACTib), Part I - Segmented version, based on the BDRC digitised text collection, tagged with the Memory-Based Tagger from TiMBL [Data set]. Zenodo. http://doi.org/10.5281/zenodo.823707
Meelen, Marieke, Hill, Nathan, & Handy, Christopher. (2017b) The Annotated Corpus of Classical Tibetan (ACTib), Part II - POS-tagged version, based on the BDRC digitised text collection, tagged with the Memory-Based Tagger from TiMBL [Data set]. Zenodo. http://doi.org/10.5281/zenodo.822537
Meelen, Marieke (2017) 'Object-initial word order in Middle Welsh narrative prose' in Widmer & Poppe (eds.) Referential Properties and Their Impact on the Syntax of Insular Celtic Languages. pp. 145-178.
Meelen, Marieke (2016) Why Jesus and Job spoke bad Welsh: the origin and distribution of V2 orders in Middle Welsh, Utrecht: LOT publications.
Van Baren, Eva, Meelen, Marieke & Meijs, Lucas (2015) 'Promoting Youth Development Worldwide: The Duke of Edinburgh’s International Award' in Journal of Youth Development 10 (1).

Teaching and supervision

Teaching interests:

Historical Linguistics, NLP for low-resource languages (from a linguistics perspective).

I’m particularly interested in mentoring postdocs and supervising PhD students with a strong linguistics background hoping to work on Celtic or Tibeto-Burman languages in areas of my research interests:

Historical Linguistics (Syntax, Reconstruction and Information Structure)
Grammaticalisation & Pragmaticalisation
NLP for low-resource languages
Celtic & Tibeto-Burman languages

Course contact for:

Li4 Linguistic variation and change

Li11 Historical Linguistics

Li13 History of English

MPhil Seminar in Historical Linguistics & History of English
