WACL-4 at COLING 2025

with a focus on Arabic Dialects


The workshop will be held online on 20 January 2025, in conjunction with COLING 2025, the 31st edition of the conference, in Abu Dhabi, UAE.

Description

The field of Arabic language research using corpora and corpus methods has experienced significant growth and development in recent years. What once were isolated efforts have now transformed into a vibrant and expansive area of study, advancing rapidly across multiple dimensions in both corpus and computational linguistics. Building upon the success of previous editions—WACL-1 in 2011, WACL-2 in 2013 in conjunction with the Corpus Linguistics Conference at Lancaster University, and WACL-3 in 2019 at the Corpus Linguistics 2019 conference at Cardiff University—we are excited to announce the fourth edition of the Workshop on Arabic Corpus Linguistics (WACL-4).

The primary objectives of WACL-4 are to highlight the latest developments in the creation, annotation, and application of Arabic corpora, including the introduction of new corpora and advancements in annotation techniques, while fostering collaboration among researchers from diverse institutions and regions to stimulate joint research projects and interdisciplinary initiatives. This edition will place a special emphasis on the study of Arabic dialects, including non-standard and regional varieties, to broaden the understanding of Arabic in its various manifestations and support research on under-resourced linguistic varieties. Additionally, WACL-4 aims to encourage the development and refinement of Natural Language Processing (NLP) systems and tools tailored for Arabic, integrating corpora into NLP workflows, creating new computational tools, and evaluating existing systems to improve their efficacy in processing Arabic text.

Rationale

The Arab League comprises 22 Arabic-speaking countries: Algeria, Bahrain, Comoros, Djibouti, Egypt, Iraq, Jordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, Palestine, Qatar, Saudi Arabia, Somalia, Sudan, Syria, Tunisia, UAE, and Yemen. Each of these countries has its own Arabic dialects. Arabic is also spoken in countries outside the Arab League, bringing the global total to over 400 million speakers. The size and diversity of this population highlight the need for dedicated research to study and document these dialects thoroughly.

Arabic is a rich and diverse language, characterised by a wide range of dialects. Despite significant efforts to develop tools and corpora for Modern Standard Arabic (MSA), many Arabic dialects remain under-studied, primarily because of limited research funding and a shortage of available datasets. This leaves significant gaps in our understanding and documentation of these dialects. WACL-4 aims to address this issue by providing a platform for scholars to share resources, methodologies, and findings, thereby advancing the study of Arabic in its various forms.

The fourth edition of the Workshop on Arabic Corpus Linguistics (WACL-4) is motivated by this pressing need to bridge the research gap, with a particular focus on Arabic dialects. With the increasing importance of language models in computational linguistics, WACL-4 will also highlight their role in analysing and understanding Arabic dialects. These models are central to processing and generating natural language, and offer new insights and tools for researchers. The workshop aims to foster collaboration and innovation, ultimately contributing to a more comprehensive understanding of the Arabic language and its dialectal richness.


Call For Papers

The workshop topics include but are not limited to:

  • Development and Utilisation of Arabic Dialectal Corpora
  • Advancements in Natural Language Processing Techniques for Arabic Dialects
  • Applications and Challenges of Large Language Models in Understanding and Generating Arabic Dialects
  • Morphological and Syntactical Challenges in Arabic Dialects
  • Dialect Identification and Classification
  • Speech Recognition and Synthesis for Arabic Dialects
  • Machine Translation involving Arabic Dialects
  • Sentiment Analysis and Opinion Mining in Arabic Dialects
  • Named Entity Recognition and Information Extraction for Arabic Dialects
  • Development of Open Access Resources for Arabic Dialects
  • Text Processing and Transliteration Challenges for Arabic Dialects
  • Cultural and Sociolinguistic Considerations in NLP Applications for Arabic Dialects
  • Resources and Tools for Computational Analysis of Arabic Dialects
  • Applications of Arabic Dialect NLP in Real-World Scenarios

Summary of the Call:

We welcome papers on NLP and resources for Arabic dialects, with a focus on supporting and advancing language technologies tailored to the diverse range of these dialects. We encourage submissions ranging from theoretical investigations to practical applications that address the unique challenges, solutions, and insights Arabic dialects bring to the field of NLP.

Submissions should adhere to the COLING 2025 standards. Authors are strongly encouraged to review and follow the COLING 2025 submission guidelines and author kit, available at https://coling2025.org/.

If authors are describing dialectal variations, we request that they include relevant linguistic details and sociolinguistic contexts to enrich the understanding of the presented work.

Submissions may be of two types:

  • Long papers – up to eight (8) pages including references, presenting substantial, original, completed, and unpublished work.
  • Short papers – up to four (4) pages including references, describing a small focused contribution, negative results, or a system demonstration.

Important dates:

  • 1st Call for Papers Announcement: 25 July 2024
  • 2nd Call for Papers Announcement: 31 August 2024
  • Paper Submission Deadline: 10 October 2024
  • Notification of Paper Acceptance: 1 November 2024
  • Camera-ready Paper Deadline: 15 November 2024
  • Workshop Date: 20 January 2025

Format

The workshop will consist of a mix of invited talks, contributed talks, and panel discussions. It will be held entirely online, allowing for greater accessibility and participation from scholars and researchers around the world. We anticipate 30 attendees and 2 invited speakers. Scheduled for 20 January 2025, the workshop will be held in conjunction with COLING 2025, the 31st edition of the conference, in Abu Dhabi, UAE.

Anti-Harassment policy:

The workshop supports the COLING anti-harassment policy: https://coling2022.org/policy

Keynote Speakers*:

  1. Speaker 1: Imed Zitouni, Google, USA
  2. Speaker 2: Hend Alkhalifa, King Saud University, Saudi Arabia

*Invitations are pending confirmation.

Organisation

Organising Committee:

  • Saad Ezzini, Lancaster University, UK (General Chair)
  • Hamza Alami, Sidi Mohamed Ben Abdellah University, Morocco (Programme Co-Chair)
  • Ismail Berrada, Mohammed VI Polytechnic University, Morocco (Programme Co-Chair)
  • Abdessamad Benlahbib, Sidi Mohamed Ben Abdellah University, Morocco (Programme Co-Chair)
  • Abdelkader El Mahdaouy, Mohammed VI Polytechnic University, Morocco (Review Chair)
  • Salima Lamsiyah, University of Luxembourg, Luxembourg (Publication Chair)
  • Nouran Khallaf, Leeds University, UK (Publicity Co-Chair)
  • Hatim Derrouz, Ibn Tofail University, Morocco (Publicity Co-Chair)
  • Amal Haddad, University of Granada, Spain (Publicity Co-Chair)
  • Mustafa Jarrar, Birzeit University, Palestine (Advisory Committee)
  • Mo El-Haj, Lancaster University, UK (Advisory Committee)
  • Ruslan Mitkov, Lancaster University, UK (Advisory Committee)
  • Paul Rayson, Lancaster University, UK (Advisory Committee)

Programme Committee:

  • Ahmed Ali, Qatar Computing Research Institute (QCRI), Qatar
  • Ahmed Abdelali, Qatar Computing Research Institute (QCRI), Qatar
  • Almoataz B. Al-Said, Cairo University, Egypt
  • Eric Atwell, Leeds University, UK
  • Haithem Afli, Dublin City University, Ireland
  • Hazem Hajj, American University of Beirut, Lebanon
  • Ignatius Ezeani, Lancaster University, UK
  • Imed Zitouni, Google, USA
  • Karim Bouzoubaa, Mohammed V University, Morocco
  • Khaled Shaban, Qatar University, Qatar
  • Abdessamad Benlahbib, Sidi Mohamed Ben Abdellah University, Morocco
  • Lama Alsudias, Lancaster University, UK
  • Mo El-Haj, Lancaster University, UK
  • Mariam Aboelezz, Birkbeck, University of London, UK
  • Nadi Tomeh, University of Paris 13, France
  • Nizar Habash, New York University Abu Dhabi, UAE
  • Nora Al-Twairesh, King Saud University, Saudi Arabia
  • Abdelkader El Mahdaouy, Mohammed VI Polytechnic University, Morocco
  • Paul Rayson, Lancaster University, UK
  • Scott Piao, Lancaster University, UK
  • Taha Zerrouki, Ecole Nationale Supérieure d’Informatique, Algeria
  • Tamer Elsayed, Qatar University, Qatar
  • Violetta Cavalli-Sforza, Al Akhawayn University, Morocco
  • Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar
  • Hanane El Faik, Chouaïb Doukkali University, Morocco
  • Wassim El-Hajj, American University of Beirut, Lebanon
  • Ashraf Boumhidi, Sidi Mohamed Ben Abdellah University, Morocco
  • Khadidja Merakchi, Heriot-Watt University
  • Ed-Drissiya El-Allaly, University of Moulay Ismail, Morocco
  • Driss Aboulhoucine, EMRO, WHO
  • El Habib Nfaoui, Sidi Mohamed Ben Abdellah University, Morocco
  • Salima Lamsiyah, University of Luxembourg, Luxembourg
  • Khaled Shaalan, The British University in Dubai, UAE
  • Ismail Berrada, Mohammed VI Polytechnic University, Morocco
  • Maram Alharbi, Lancaster University, UK
  • Hatim Derrouz, Ibn Tofail University, Morocco
  • Nouran Khallaf, Leeds University, UK
  • Hamza Alami, Sidi Mohamed Ben Abdellah University, Morocco
  • Mustafa Jarrar, Birzeit University, Palestine
  • Hanane Grissette, Cadi Ayyad University, Morocco