WACL-4 at COLING 2025
with focus on Arabic Dialects
The workshop will be held online on January 20th, 2025 in conjunction with the 31st edition of COLING in 2025 in Abu Dhabi (UAE).
Submission Deadline: November 8th, 2024

Description
The field of Arabic language research using corpora and corpus methods has experienced significant growth and development in recent years. What once were isolated efforts have now transformed into a vibrant and expansive area of study, advancing rapidly across multiple dimensions in both corpus and computational linguistics. Building upon the success of previous editions—WACL-1 in 2011, WACL-2 in 2013 in conjunction with the Corpus Linguistics Conference at Lancaster University, and WACL-3 in 2019 at the Corpus Linguistics 2019 conference at Cardiff University—we are excited to announce the fourth edition of the Workshop on Arabic Corpus Linguistics (WACL-4).
The primary objectives of WACL-4 are to highlight the latest developments in the creation, annotation, and application of Arabic corpora, including the introduction of new corpora and advancements in annotation techniques, while fostering collaboration among researchers from diverse institutions and regions to stimulate joint research projects and interdisciplinary initiatives. This edition will place a special emphasis on the study of Arabic dialects, including non-standard and regional varieties, to broaden the understanding of Arabic in its various manifestations and support research on under-resourced linguistic varieties. Additionally, WACL-4 aims to encourage the development and refinement of Natural Language Processing (NLP) systems and tools tailored for Arabic, integrating corpora into NLP workflows, creating new computational tools, and evaluating existing systems to improve their efficacy in processing Arabic text.
Rationale
There are 22 Arabic-speaking countries in the Arab League: Algeria, Bahrain, Comoros, Djibouti, Egypt, Iraq, Jordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, Palestine, Qatar, Saudi Arabia, Somalia, Sudan, Syria, Tunisia, UAE, and Yemen. Each of these countries has its own specific Arabic dialects. Additionally, Arabic is spoken in other countries outside the Arab League, contributing to a global total of over 400 million Arabic speakers. This significant number highlights the urgent need for dedicated research efforts to study and document these diverse dialects thoroughly.
Arabic is a rich and diverse language, characterised by a wide collection of dialects. Despite the significant efforts made in developing tools and corpora for Modern Standard Arabic (MSA), many Arabic dialects remain under-studied, primarily due to limited resources such as research funding and available datasets. This lack of comprehensive study leaves significant gaps in our understanding and documentation of these dialects. WACL-4 aims to address this issue by providing a platform for scholars to share resources, methodologies, and findings, thereby advancing the study of Arabic in its various forms.
The fourth edition of the Workshop on Arabic Corpus Linguistics (WACL-4) is motivated by this pressing need to bridge the research gap, with Arabic dialects as its focus. With the increasing importance of language models in the field of computational linguistics, WACL-4 will also highlight their role in analysing and understanding Arabic dialects. These models are central to processing and generating natural language, offering new insights and tools for researchers. This workshop will play a crucial role in fostering collaboration and innovation, ultimately contributing to a more comprehensive understanding of the Arabic language and its dialectal richness.
Call For Papers
The workshop topics include but are not limited to:
- Development and Utilisation of Arabic Dialectal Corpora
- Advancements in Natural Language Processing Techniques for Arabic Dialects
- Applications and Challenges of Large Language Models in Understanding and Generating Arabic Dialects
- Morphological and Syntactical Challenges in Arabic Dialects
- Dialect Identification and Classification
- Speech Recognition and Synthesis for Arabic Dialects
- Machine Translation involving Arabic Dialects
- Sentiment Analysis and Opinion Mining in Arabic Dialects
- Named Entity Recognition and Information Extraction for Arabic Dialects
- Development of Open Access Resources for Arabic Dialects
- Text Processing and Transliteration Challenges for Arabic Dialects
- Cultural and Sociolinguistic Considerations in NLP Applications for Arabic Dialects
- Resources and Tools for Computational Analysis of Arabic Dialects
- Applications of Arabic Dialects NLP in Real-World Scenarios

Summary of the Call:
We welcome submissions of papers centred around Arabic Dialects NLP and resources, focusing on supporting and advancing language technologies tailored to the diverse range of Arabic dialects. We encourage submissions that span a spectrum from theoretical investigations to practical applications, aiming to address the unique challenges, solutions, and insights that Arabic dialects introduce to the field of NLP.
Submissions should adhere to the COLING 2025 standards. Authors are strongly encouraged to review and follow the COLING 2025 submission guidelines and author kit, available at https://coling2025.org/.
If authors are describing dialectal variations, we request that they include relevant linguistic details and sociolinguistic contexts to enrich the understanding of the presented work.
Submissions may be of two types:
- Long papers – up to eight (8) pages excluding references, presenting substantial, original, completed, and unpublished work.
- Short papers – up to four (4) pages excluding references, describing a small focused contribution, negative results, or a system demonstration.
Important dates:
- 1st Call for Papers Announcement: 13 August 2024
- 2nd Call for Papers Announcement: 1 October 2024
- Paper Submission Deadline: 8 November 2024
- Notification of Paper Acceptance: 8 December 2024
- Camera-ready Paper Deadline: 13 December 2024
- Workshop Date: 20 January 2025
Format:
The workshop will consist of a mix of invited talks, contributed talks, and panel discussions. It will be held entirely online, allowing for greater accessibility and participation from scholars and researchers around the world. We anticipate 30 attendees and 2 invited speakers. Scheduled for 20 January 2025, the workshop will be held in conjunction with the 31st edition of COLING 2025 in Abu Dhabi, UAE.
Anti-Harassment policy:
The workshop supports the COLING anti-harassment policy: https://coling2022.org/policy
Keynote Speaker:
- Speaker 1: Imed Zitouni, Google, USA (Confirmed)
Organisation
Organising Committee:
- Saad Ezzini, King Fahd University of Petroleum and Minerals, Saudi Arabia (General Chair)
- Hamza Alami, Sidi Mohamed Ben Abdellah University, Morocco (Programme Co-Chair)
- Ismail Berrada, Mohammed VI Polytechnic University, Morocco (Programme Co-Chair)
- Abdessamad Benlahbib, Sidi Mohamed Ben Abdellah University, Morocco (Programme Co-Chair)
- Abdelkader El Mahdaouy, Mohammed VI Polytechnic University, Morocco (Review Chair)
- Hatim Derrouz, Ibn Tofail University, Morocco (Publication Chair)
- Salima Lamsiyah, University of Luxembourg, Luxembourg (Publicity Co-Chair)
- Amal Haddad, University of Granada, Spain (Publicity Co-Chair)
- Mustafa Jarrar, Birzeit University, Palestine (Advisory Committee)
- Mo El-Haj, Lancaster University, UK (Advisory Committee)
- Ruslan Mitkov, Lancaster University, UK (Advisory Committee)
- Paul Rayson, Lancaster University, UK (Advisory Committee)
Programme Committee:
- Almoataz B. Al-Said, Cairo University, Egypt
- Abdessamad Benlahbib, Sidi Mohamed Ben Abdellah University, Morocco
- Ashraf Boumhidi, Sidi Mohamed Ben Abdellah University, Morocco
- Abdelkader El Mahdaouy, Mohammed VI Polytechnic University, Morocco
- Hamza Alami, Sidi Mohamed Ben Abdellah University, Morocco
- Hatim Derrouz, Ibn Tofail University, Morocco
- Hicham Hammouchi, University of Luxembourg, Luxembourg
- Ismail Berrada, Mohammed VI Polytechnic University, Morocco
- Maram Alharbi, Lancaster University, UK
- Nagham F. Hamad, Birzeit University, Palestine
- Nizar Habash, New York University Abu Dhabi, UAE
- Nora Al-Twairesh, King Saud University, Saudi Arabia
- Noorhan Abbas, Leeds University, UK
- Saad Ezzini, King Fahd University of Petroleum and Minerals, Saudi Arabia
- Salima Lamsiyah, University of Luxembourg, Luxembourg
- Salmane Chafik, Mohammed VI Polytechnic University, Morocco
- Samir El Amrani, University of Luxembourg, Luxembourg
- Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar