Internationally, civil protection, police forces and emergency response agencies are under increasing pressure to respond to emergency situations more quickly and effectively. Moreover, such emergencies are common and recurring. For example, on average 50,000 people per year die during natural disasters internationally.
The mass adoption of mobile internet-enabled devices, paired with widespread use of social media platforms for communication and coordination, has created ways for the public on the ground to contact response services. Moreover, a recent study reported that 63% of people expect responders to answer calls for help on social media.
With the rise of social media, emergency service operators are now expected to monitor those channels and answer questions from the public. However, they do not have adequate tools or manpower to monitor social media effectively, due to the large volume of information posted on these platforms and the need to categorise, cross-reference and verify that information.
The Text Retrieval Conference (TREC) is a combined conference and evaluation campaign that aims to encourage research into information retrieval technologies from large test collections. It is co-sponsored by the National Institute of Standards and Technology (NIST) Information Technology Laboratory's (ITL) Retrieval Group of the Information Access Division (IAD), and has run annually for over 25 years.
TREC consists of a set of tracks, areas of focus in which particular retrieval tasks are defined. The tracks serve several purposes. First, tracks act as incubators for new research areas: the first running of a track often defines what the problem really is, and a track creates the necessary infrastructure (test collections, evaluation methodology, etc.) to support research on its task. The tracks also demonstrate the robustness of core retrieval technology in that the same techniques are frequently appropriate for a variety of tasks. Finally, they make TREC attractive to a broader community by providing tasks that match the research interests of more groups.
Incident Streams is a TREC track designed to bring together academia and industry to research technologies that automatically process social media streams during emergency situations, with the aim of categorizing information and aid requests made on social media for emergency service operators.
The TREC-IS task is to produce a series of curated feeds containing social media posts, where each feed corresponds to a particular type of information request, aid request, or report containing a particular type of information. These "types" are defined based on existing hierarchical incident management information ontologies, such as MOAC (Management of a Crisis). For instance, for a flash flooding event, feeds might include "requests for food/water", "reports of road blockages", and "evacuation requests". In this way, during an emergency, individual emergency management operators and other stakeholders can register to access the subset of feeds within their domain of responsibility, giving them access to relevant social media content.
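To make the idea of curated feeds concrete, here is a minimal sketch of routing posts into per-type feeds. It is an illustration only, not the track's method or any participant's system: the feed names are the three examples from the paragraph above, while the keyword lists, `FEED_KEYWORDS`, `assign_feeds` and `build_feeds` are hypothetical names invented for this example. A real system would use a trained classifier over the track's ontology rather than keyword matching.

```python
from collections import defaultdict

# Hypothetical keyword lists per feed; illustrative only and NOT taken
# from the TREC-IS ontology or guidelines.
FEED_KEYWORDS = {
    "requests for food/water": ["need water", "need food", "no drinking water"],
    "reports of road blockages": ["road closed", "road blocked", "bridge closed"],
    "evacuation requests": ["evacuate", "stranded", "trapped"],
}


def assign_feeds(text):
    """Return every feed whose keywords appear in the post (may be several)."""
    lowered = text.lower()
    return [
        feed
        for feed, keywords in FEED_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def build_feeds(posts):
    """Group posts into feeds; a post can land in more than one feed."""
    feeds = defaultdict(list)
    for post in posts:
        for feed in assign_feeds(post):
            feeds[feed].append(post)
    return dict(feeds)


if __name__ == "__main__":
    sample_posts = [
        "We need water at the shelter",
        "Main Street road closed, family trapped on roof",
        "Lovely weather today",
    ]
    for feed, posts in build_feeds(sample_posts).items():
        print(feed, "->", posts)
```

A post can match several feeds at once, so an operator registered only for "evacuation requests" would still see the second sample post even though it also reports a road blockage.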
The next upcoming edition of the track is 2019-B, with submission in September. To get started, read the detailed Task Guidelines and have a look at the Ontology of information types. We also recommend you read our overview paper from the first edition of the track.
TREC-IS ran twice in 2019. 2019-B is the second of these. The single task, classifying tweets by information type (high-level), was continued for this edition. Task guidelines are provided and participants could use the 20k examples from the 2018 edition and 10k examples from the 2019-A edition as training data.
You can download the tweet streams from the 2018 and 2019-A editions in conjunction with their assessor labels as training data. See the dataset download page for instructions for the tweets; the labels can be accessed via the button below.
The full overview paper for 2019 (A and B) will be released in early February. Below you can find the overview slides presented at TREC 2019 along with the plenary intro slides that summarize the planned changes for 2020.
TREC-IS will run twice in 2019. 2019-A is the first of these. The single task, classifying tweets by information type (high-level), is continued for this edition. Task guidelines are provided and participants can use the 20k examples from the 2018 edition as training data.
You can download the tweet streams from the 2018 edition in conjunction with the 2018 assessor labels as training data. See the 2019-A dataset page for download instructions for the tweets; the labels can be accessed via the button below.
We have updated the evaluation metrics for 2019-A. More information can be found in the document below.
Participants should apply their system to a set of 'test' event streams. Instructions for accessing these events can be found below:
Your system output can be submitted to the organisers for evaluation. The deadline for submission is June 3rd. Submission instructions can be found below:
A single task was run for the first year of the track (2018): classifying tweets by information type (high-level). Task guidelines are provided, along with a training dataset where information types and priority levels have been manually annotated for a set of tweets. 'User' profiles are provided for each event type, defining what is relevant for each event type (this is what the assessors are provided when judging).
Instructions for downloading the test dataset, including the topics, ontology and tweets, can be found at the link below.
A summary of the evaluation metrics for the 2018 edition can be found at the following link.
The 2018 edition of TREC-IS has finished. Here you can find the reports, presentations, publications, open source systems and resources produced from this edition.
The 2019-B edition is underway. We have released the guidelines.
You can join our Google Group with the button below to receive updates and ask questions about the track.