Celtic Languages in the Digital Age (CLIDA 2024)

Lancaster University – 9 April 2024

🎇 This event is now over. The talks are available online at: https://t.co/ejXz5qjP7J


The “Celtic Languages in the Digital Age” workshop is designed to meet the critical need for the development of linguistic resources that support Celtic languages, which are steeped in rich oral traditions and have deep historical roots.

Join us on 9 April 2024 for a pioneering workshop dedicated to examining the requirements for future Celtic language resource development, that is, resources flexible enough to support Welsh, Irish, Scottish Gaelic, Cornish, Breton and Manx.



Organisers:

Dr Mo El-Haj, Senior Lecturer in NLP and Co-Director of the UCREL NLP Group at Lancaster University.

Dr Saad Ezzini, Lecturer in Computer Science and member of Software Engineering at the UCREL NLP Group at Lancaster University.

Hello, we’re thrilled to host this workshop dedicated to language model creation for under-resourced languages. With a strong focus on linguistic diversity, Lancaster University has organised previous workshops and undertaken projects on languages such as Welsh, Igbo and Luxembourgish, as well as various Arabic dialects. CLIDA 2024 represents our ongoing commitment to linguistic innovation and digital support for these languages.

Keynote Speakers:

Professor Dawn Knight, Cardiff University

Professor Kevin Scannell, Cadhan Aonair

Dr Daniel Cunliffe, University of South Wales

Mr Gruffudd Prys, Bangor University

Dr Inge Birnie, University of Strathclyde

Dr Merryn Davies-Deacon, Queen’s University Belfast

Dr David Howcroft, Edinburgh Napier University

Dr Mícheál J. Ó Meachair, Dublin City University

Dr Cedric Lothritz, University of Luxembourg

Professor Paul Rayson, Lancaster University

Dr Abigail Walsh, ADAPT Centre at Dublin City University

Dr Ignatius Ezeani, Lancaster University

Dr Nouran Khallaf, Lancaster University

Dr Mo El-Haj, Lancaster University

Dr Saad Ezzini, Lancaster University


The UCREL NLP Group at Lancaster University has a long history of engagement with Celtic languages, most recently through collaborative work on the Welsh language. In partnership with Cardiff University, and with funding from the ESRC and AHRC, we have collaborated on the development of CorCenCC (the National Corpus of Contemporary Welsh: www.corcencc.org) and FreeTxt (www.freetxt.app), in addition to the Welsh Summarisation, Thesaurus, and Welsh Digital Grid (www.digigrid.cymru) projects, which were funded by the Welsh Government. These collaborations have been instrumental in the development of tools such as the Welsh Semantic Tagger and the FreeTxt analysis tool, marking significant advances in language technology for Celtic languages, with a particular focus on Welsh.

Building on its established foundation in Welsh language technologies, the UCREL NLP Group is now eager to help widen the scope of computational methods to encompass other Celtic languages, including Irish, Scottish Gaelic, Manx, Breton, and Cornish. Our objective is to contribute to the development of resources and NLP technologies that support and enhance the digital presence of these linguistically rich but under-resourced languages.

Our workshop will feature an array of speakers and panellists, each bringing a unique perspective on the challenges and opportunities inherent in developing linguistic resources and NLP applications for Celtic languages. Discussions will focus on addressing the scarcity of resources available to Celtic languages, the development of language models tailored to these languages, and the exploration of existing datasets and ongoing efforts in the field. This event promises to be an invaluable platform for fostering collaboration, sharing cutting-edge research, and discussing strategies to enhance digital support for these languages.