FNS-2023 Shared Task: “Financial Narrative Summarisation”
To be held at The 5th Financial Narrative Processing Workshop (FNP 2023), Sorrento, Italy, 15-18 December 2023.
NEW: Registration is now open. Please use the participation form to register your team for FNS 2023.
- 1st Call for papers & shared task participants: July 01, 2023
- 2nd Call for papers & shared task participants: July 15, 2023
- Training set release: July 20, 2023
- Blind test set release: September 15, 2023
- Systems submission: September 30, 2023
- Release of results: October 10, 2023
- Paper submission deadline: October 22, 2023 (Anywhere on Earth)
- Papers notification of acceptance: November 05, 2023
- Camera-ready submission deadline: November 15, 2023
- Workshop date (1-day event): December 15-18, 2023 (exact date to be announced)
The volume of available financial information is increasing sharply, and therefore the study of NLP methods that automatically summarise content has grown rapidly into a major research area.
The Financial Narrative Summarisation shared task (FNS 2023) aims to demonstrate the value and challenges of applying automatic text summarisation to financial text written in English, Spanish and Greek, usually referred to as financial narrative disclosures. The task dataset has been extracted from annual reports published in PDF format by firms listed in the UK, Spain and Greece.
Participants are asked to provide a single structured summary for each report, based on real-world, publicly available financial annual reports, by extracting information from different key sections. The summaries should reflect the analysis and assessment of the financial trend of the business over the past year, as provided in the annual reports.
For the evaluation, we aim to use the ROUGE 2.0 Java package.
ROUGE-2 will be the main metric for comparing and ranking submissions. Teams will be ranked according to the highest ROUGE-2 F1 score on the test set.
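To make the ranking metric concrete, the following is a minimal sketch of ROUGE-2 F1 as bigram overlap between a candidate summary and a reference. The official scoring uses the ROUGE 2.0 Java package, which additionally handles stemming, stopword options and multiple references, so treat this as an illustration only.

```python
# Minimal sketch of ROUGE-2 F1: harmonic mean of bigram precision and recall.
# The official evaluation uses the ROUGE 2.0 Java package, not this code.
from collections import Counter


def bigrams(tokens):
    """Multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate: str, reference: str) -> float:
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped bigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A candidate identical to the reference scores 1.0; a candidate sharing no bigrams scores 0.0.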
For the FNS 2023 task, we focus on annual reports produced by UK, Spanish and Greek firms listed on the stock exchange of each of those countries. In the UK and elsewhere, annual report structure is much less rigid than that of reports produced in the US. Companies produce glossy brochures with a much looser structure, which makes automatic summarisation of narratives in UK annual reports a challenging task, because the structure of those documents must be extracted before the narrative sections can be summarised. This can be done by detecting the narrative sections, which usually contain the management disclosures rather than the financial statements.
In this task, we introduce a new summarisation task which we call Financial Narrative Summarisation, in which the summary requires extraction from different key sections found in the annual reports. Those sections are usually referred to as “narrative” or “front-end” sections, and they typically contain textual information and reviews by the firm’s management and board of directors. Sections containing financial statements in the form of tables and numbers are usually referred to as “back-end” sections and are not supposed to be part of the narrative summaries. UK annual reports are lengthy documents of around 80 pages on average, and some span more than 250 pages, making the summarisation task challenging but academically interesting.
For the purpose of this task, we will ask the participants to produce one summary for each annual report. The summary length should not exceed 1,000 words. We advise that the summary be generated/extracted from the narrative sections; therefore, the participating summarisers need to be trained to detect narrative sections before creating the summaries. The MultiLing team, with help from Barcelona’s UPF summarisation team, will help organise the shared task, including the generation of the evaluation results and final proceedings. The MultiLing team has rich experience in organising summarisation tasks, dating back to 2011.
Q1: Is this an Extractive or Abstractive summarisation task?
A1: The process is about extracting (or regenerating) relevant information from a document. The process can therefore be either extractive or abstractive; the choice is yours.
Q2: Is this similar to summarising news documents?
A2: Financial annual reports differ somewhat from news articles: they are much larger and contain a great deal of information that is repetitive or irrelevant to a summary of the year’s performance.
Q3: What part of the annual reports are you interested in summarising?
A3: We are basically interested in summarising the narrative sections.
Q4: How did you generate the gold-standard summaries?
A4: The sections of an annual report are written by human experts in each language, with the aim of summarising the previous year’s performance. To clarify, we did not commission these summaries, as we are not involved in creating the annual reports. Instead, we asked the experts who created the reports to tell us which sections can be considered a summary of the whole report, and those sections were used as the gold-standard summaries.
Q5: Are the gold standard summaries contained within the annual reports?
A5: Yes, they are contained within the reports in each of the three languages (English, Spanish and Greek). Having said that, such information is not always in the same order, location, format or even content. Therefore, detecting which information in an annual report should be extracted into the final summary is a challenge even for experts in the field.
Q6: What is the word limit for the generated summaries?
A6: You are to generate a summary of no more than 1000 words.
Q7: Can we submit results for more than one summarisation system?
A7: Yes, you can submit up to three systems (one run each).
Q8: Will there be a leader-board?
A8: More details soon.
Q9: Is the dataset extracted from PDF to txt following the procedure in this referenced paper?
Q10: In what language are those reports?
A10: The reports are in three languages: English, Spanish and Greek.
Q11: What evaluation methods will be used?
A11: We aim to use the ROUGE 2.0 Java package from https://github.com/kavgan/ROUGE-2.0. ROUGE-2 will be the main metric for comparing and ranking submissions. Teams will be ranked according to the highest ROUGE-2 F1 score on the test set for each of the three languages. The scores are weighted differently between the languages: English (50%), Spanish (25%) and Greek (25%).
Q12: Do we need to participate in all three languages?
A12: Yes. The summarisation methods provided should work with all three languages; a team can submit either one summariser for all three languages or a separate summariser for each language. The team with the best ROUGE-2 scores across all three languages will win the competition. The scores are weighted differently: English (50%), Spanish (25%) and Greek (25%).
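The stated language weighting (English 50%, Spanish 25%, Greek 25%) amounts to a weighted average of per-language ROUGE-2 F1 scores; this small sketch assumes that interpretation, since the organisers have not published the exact aggregation code.

```python
# Sketch of the stated language weighting for the final ranking score.
# Assumes the weights are applied to per-language ROUGE-2 F1 scores.
WEIGHTS = {"english": 0.50, "spanish": 0.25, "greek": 0.25}


def weighted_rouge2(scores: dict) -> float:
    """scores: per-language ROUGE-2 F1, e.g. {"english": 0.41, "spanish": 0.38, "greek": 0.35}."""
    return sum(WEIGHTS[lang] * scores[lang] for lang in WEIGHTS)
```

For example, equal per-language scores of 0.40 give a weighted score of 0.40, while a strong English-only system is capped at half its English score.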
Q13: Is it possible to provide us with the original PDF files so we can look into the structures?
A13: Unfortunately, we cannot provide the original PDF files due to copyright issues. The idea is to work on extracting a structure for the unstructured plain text file formats.
Q14: Would you clarify about the quoted sentences in the gold summaries? Those seem to be not quotations from the full text.
A14: The quotations could be a result of highlighted text (usually floating text) in the original PDF files but they are not intended to affect or guide the summarisation process in any way.
Q15: Is it safe to ignore the new lines in the gold summary? I assume they are results of PDF conversions.
A15: Yes, we will not look into line breaks so feel free to ignore them.
Q16: Do we need to submit a paper describing our system(s)?
A16: Yes! All shared task papers will be published in IEEE Big Data 2023 proceedings.
If in doubt, please do not hesitate to contact us at email@example.com
Please note that only registered teams will be considered for the competition. To register, please use the following link: Participation form
For the creation of the financial narrative summarisation dataset, we used 3,863 annual reports. We randomly split the English dataset into training (c. 75%) and testing plus validation (c. 25%). The split differs for Greek and Spanish, where the datasets are smaller: for each of these languages, it is training c. 60% and testing plus validation c. 40%. Tables 1, 2 and 3 show the dataset details. We will provide the participants with the training and validation sets, including the full text of each annual report, along with the extracted sections and gold-standard summaries. At a later stage, the participants will be given the testing data. On average, each annual report has at least two gold-standard summaries. In addition, we will provide a number of Greek and Spanish annual reports with their gold-standard summaries.
Table 1: English Dataset
| | Training | Validation | Testing | Total |
| Report full text | 3,050 | 413 | 550 | 4,013 |
Table 2: Greek Dataset
| | Training | Validation | Testing | Total |
| Report full text | 212 | 50 | 50 | 312 |
Table 3: Spanish Dataset
| | Training | Validation | Testing | Total |
| Report full text | 162 | 50 | 50 | 262 |
- Please name your summaries following a naming convention of the form: (annual_report_filename)_(system_name).txt. For example, if your system’s name is myAwesomeSystem, then a summary produced by this system for the annual report named 330.txt should get the name 330_myAwesomeSystem.txt
- If you participate with multiple systems, please submit summaries of each system in a separate zip file. The number of files in the submitted (and compressed) folder must be identical to the number of annual reports in the test set.
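The naming and packaging rules above can be scripted. This sketch builds one zip per system, naming each file `(annual_report_filename)_(system_name).txt` as required; the in-memory `summaries` mapping is a hypothetical stand-in for however your system stores its output.

```python
# Sketch of packaging one system's summaries for submission.
# `summaries` maps a report filename like "330.txt" to its summary text.
import zipfile
from pathlib import Path


def package_submission(summaries: dict, system_name: str, out_dir: Path) -> Path:
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / f"{system_name}.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        for report_name, text in summaries.items():
            stem = Path(report_name).stem  # "330.txt" -> "330"
            zf.writestr(f"{stem}_{system_name}.txt", text)
    return zip_path
```

For a system named `myAwesomeSystem`, the report `330.txt` yields `330_myAwesomeSystem.txt` inside `myAwesomeSystem.zip`; run the function once per system so each system's summaries land in a separate zip file.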
Shared Task Co-Organisers
- Mahmoud El-Haj (Lancaster University)
- Nikiforos Pittaras (SKEL Lab, NCSR Demokritos)
- Marina Litvak (Sami Shamoon College of Engineering)
- George Giannakopoulos (SKEL Lab, NCSR Demokritos)
- Ilias Zavitsanos (SKEL Lab, NCSR Demokritos)
- Aris Kosmopoulos (SciFY PNPC)
- Antonio Moreno Sandoval (UAM, Madrid)
- Blanca Carbajo Coronado (UAM, Madrid)