{"id":4,"date":"2021-05-06T13:50:26","date_gmt":"2021-05-06T13:50:26","guid":{"rendered":"http:\/\/wp.lancs.ac.uk\/acc\/?page_id=4"},"modified":"2024-03-27T18:43:03","modified_gmt":"2024-03-27T18:43:03","slug":"home","status":"publish","type":"page","link":"https:\/\/wp.lancs.ac.uk\/acc\/","title":{"rendered":"Adnodd Creu Crynodebau (ACC) Welsh Summary Creator (WSC)"},"content":{"rendered":"<div class=\"entry-content\">\n<p>&nbsp;<\/p>\n<\/div>\n<p><span style=\"font-family: verdana, geneva;color: #ff6600\"><strong>Project overview<\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">We are developing a publicly available Welsh-language automatic text summarisation tool: Adnodd Creu Crynodebau (ACC). ACC will contribute to the automated tools available in the Welsh language and facilitate the work of those involved in document preparation, proof-reading, and (in certain circumstances) translation. ACC will also allow professionals to quickly summarise long documents for efficient presentation. For instance, ACC will allow educators to adapt long documents for use in the classroom. It is also envisaged that ACC will benefit the wider public, who may prefer to read a summary of complex information presented on the internet or who may have difficulties reading translated versions of information on websites.<\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: verdana, geneva;color: #ff6600\"><strong>What is text summarisation?<\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Text summarisation is a digital approach to summarising \u2018key\u2019 information contained within texts, and the creation of shortened versions of texts based on this content. This is to provide succinct and coherent summaries to users, something that is often time-consuming and difficult to conduct manually. 
Summarisation is useful in the modern digital world, where the creation and sharing of text is ever-increasing, as it enables users to navigate, and make sense of, the wealth of information available with ease.<\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: verdana, geneva;color: #ff6600\"><strong>Approaches to text summarisation<\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">The main approaches to text summarisation are extraction-based summarisation and abstraction-based summarisation. The former extracts specific words\/phrases from the text to create the summary, while the latter produces paraphrased summaries (i.e. not directly extracted) of the source text. The successful extraction\/abstraction of content, when using summarisation tools\/approaches, depends on the accuracy of automatic algorithms (which require training using hand-coded gold-standard datasets).<\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Work on automatic text summarisation has a long history in NLP (Natural Language Processing). This work originally focused only on English, but has since been extended to a range of other languages, including French, Spanish, Hindi and Arabic. The &#8216;MultiLing&#8217; project and its associated conference series are noteworthy champions of developing text summarisation across a range of the world\u2019s 7000+ languages. The project website, <a href=\"http:\/\/multiling.iit.demokritos.gr\">http:\/\/multiling.iit.demokritos.gr<\/a>, provides an open repository of test\/training data for summarisation tasks, model summaries and other resources. 
Missing from current summarisation resources are tools that effectively work with the Welsh language &#8211; this is the research gap that the proposed research project aims to fill.\u00a0<\/span><\/p>\n<hr \/>\n<p><span style=\"color: #ff6600\"><strong><span style=\"font-family: verdana, geneva\">Project Team<\/span><\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\"><a href=\"https:\/\/www.cardiff.ac.uk\/people\/view\/142032-knight-dawn\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft\" src=\"https:\/\/corcencc.org\/wp-content\/uploads\/2021\/06\/dawn-knight-1.png\" width=\"104\" height=\"104\" \/>Dr Dawn Knight, Cardiff University (PI)<\/a><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Dr. Dawn Knight is a Reader in Applied Linguistics at Cardiff University, UK, and Chair of the British Association for Applied Linguistics (BAAL). She was the Principal Investigator (PI) of the CorCenCC (National Corpus of Contemporary Welsh) project and has expertise in corpus linguistics, discourse analysis, digital interaction and non-verbal communication. Dawn is the PI of the Welsh Automatic Text Summarisation project.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-family: verdana, geneva\"><a href=\"https:\/\/www.cardiff.ac.uk\/people\/view\/78474-morris-jonathan\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft\" src=\"https:\/\/corcencc.org\/wp-content\/uploads\/2021\/06\/LLUN-PROFFIL.jpg\" width=\"102\" height=\"130\" \/>Dr Jonathan Morris, Cardiff University (CI)<\/a><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Dr. Jonathan Morris is a Senior Lecturer in Welsh linguistics at Cardiff University. Jonathan\u2019s research focuses on sociolinguistic aspects of bilingualism. 
His publications include work on cross-linguistic phonological interactions and sociophonetic variation in Welsh-English bilinguals\u2019 speech, and research on the use of the Welsh language among young people and families.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-family: verdana, geneva\"><a href=\"https:\/\/www.lancaster.ac.uk\/staff\/elhaj\/\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-17 alignleft\" src=\"http:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2.jpg\" alt=\"\" width=\"101\" height=\"101\" srcset=\"https:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2.jpg 932w, https:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2-298x300.jpg 298w, https:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2-150x150.jpg 150w, https:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2-768x772.jpg 768w, https:\/\/wp.lancs.ac.uk\/acc\/files\/2021\/06\/mohaj2-180x180.jpg 180w\" sizes=\"auto, (max-width: 101px) 100vw, 101px\" \/>Dr Mahmoud El-Haj, Lancaster University (CI)<\/a><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Dr. Mahmoud El-Haj, also known as Mo, is an NLP Lecturer in Computer Science at the School of Computing and Communications at Lancaster University. Mo received his PhD in Computer Science from the University of Essex, where he worked on multi-document summarisation. His work focuses mainly on summarisation, information extraction, financial NLP and multilingual NLP, and has been applied to many languages including English, Arabic, Spanish, Portuguese and Welsh. 
He has an interest in under-resourced languages and building NLP datasets.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-family: verdana, geneva\"><a href=\"https:\/\/www.lancaster.ac.uk\/scc\/about-us\/people\/ignatius-ezeani\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft\" src=\"https:\/\/corcencc.org\/wp-content\/uploads\/2021\/06\/ignatius_ezeani2-002.jpg\" width=\"111\" height=\"103\" \/>Dr Ignatius Ezeani, Lancaster University\u00a0(RA)<\/a><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">Dr Ignatius Ezeani is a Senior Teaching\/Research Associate at Lancaster University. He is interested in the application of NLP techniques to building resources for low-resource languages, including Igbo and Welsh. He works on the efficient adaptation of existing NLP tools and techniques to create task-oriented systems for low-resource languages.<\/span><\/p>\n<hr \/>\n<div class=\"wp-block-media-text alignwide is-stacked-on-mobile\">\n<div class=\"wp-block-media-text__content\">\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<p><span style=\"font-family: verdana, geneva;color: #ff6600\"><strong>Technical details<\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">To learn more about the technical development of ACC, and for access to the tools and dataset being created as part of this project, please visit our GitHub site.<\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: verdana, geneva;color: #ff6600\"><strong>Accessing ACC<\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">ACC will be available soon. 
Updates on the development of ACC will be posted on this website and the project\u2019s GitHub site, and tweeted via the <a href=\"https:\/\/twitter.com\/corcencc?lang=en\">@CorCenCC<\/a> Twitter account.<\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: verdana, geneva\"><strong><span style=\"color: #ff6600\">Funding acknowledgement<\/span> <\/strong><\/span><\/p>\n<p><span style=\"font-family: verdana, geneva\">This project, which runs from 2021 to 2022, is funded by the Welsh Government as part of the \u2018Welsh Automatic Text Summarisation\u2019 project.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; Project overview We are developing a publicly available Welsh-language automatic text summarisation tool: Adnodd Creu Crynodebau (ACC). ACC will contribute to the automated tools available in the Welsh language and facilitate the work of those involved in document preparation, proof-reading, and (in certain circumstances) translation. 
ACC will also allow professionals to quickly summarise long [&hellip;]<\/p>\n","protected":false},"author":660,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-4","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/pages\/4","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/users\/660"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/comments?post=4"}],"version-history":[{"count":17,"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/pages\/4\/revisions"}],"predecessor-version":[{"id":8,"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/pages\/4\/revisions\/8"}],"wp:attachment":[{"href":"https:\/\/wp.lancs.ac.uk\/acc\/wp-json\/wp\/v2\/media?parent=4"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}