2020-A TREC-IS Submission

This page provides submission instructions for the 2020-A edition of TREC-IS, tasks 1, 2 and 3.

Preparing Your Submission

General Guidelines

For the 2020-A edition of the track, participants can submit the output of different versions of their solution to the organisers for each task. All submissions will be evaluated against human annotations of the tweets for the test events. Evaluated submissions will be ranked on a target metric that rewards the correct classification of tweets containing critical information (we mostly care about getting actionable information to the response officer, and less about other categories such as sentiment or general news reporting). A participating group can provide a maximum of 4 new submissions per task. This is to stop a participant from spamming hundreds of submissions with minor parameter tweaks. You may also submit the output of systems from previous TREC-IS editions; these do not count towards the 4-system limit. If you participated in previous editions, please do submit the output of those older systems so we can better track performance across editions.

The advantage of submitting runs is that your performance results will be returned to you in June and your system will appear in an overview paper that the track organisers will produce. The evaluation scripts and human-labelled data will be released later (likely in July) for those who could not participate. The organisers will also archive your submitted runs and will distribute them on request. This helps promote reproducible research and enables future systems to be compared with the state of the art at submission time.

Run Format

To construct a submission, you need a system that outputs predicted information categories and a priority level for each of the tweets in each of the test events. In effect, we expect participants to have a system that ingests and processes the tweet stream for each event in time order and emits one line in a text file per tweet, containing the following information separated by tabs:

The incident identifier (the contents of the "num" tags in the topic files for Task 1/2 and/or Task 3).
A literal string "Q0" (it's there to support compatibility with an old piece of software)
The tweet ID of the tweet, an 18-19 digit number
The tweet number in the stream, sometimes referred to as the rank. Start at 1 for each event and count up.
The importance/criticality label, either Low, Medium, High or Critical. (Note that this is different from prior editions where a score rather than a label was submitted!)
The information types within the ontology for the current tweet. The valid information types depend on which task you are submitting for. This should be a comma-delimited list.
The runtag: a unique identifier for your system and institution. For example, UoGlasgow-DLRun1.
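
The seven fields above can be assembled along the following lines. This is only a minimal sketch, not an official tool: the function name `format_run_lines`, the incident identifier `EXAMPLE-EVENT-1` and the tweet IDs are made up for illustration, and joining the information types with plain commas (no quoting or brackets) is an assumption, since the exact list syntax is not shown on this page.

```python
from collections import defaultdict

# The four priority labels accepted in the 2020-A edition.
PRIORITIES = ("Low", "Medium", "High", "Critical")


def format_run_lines(predictions, runtag):
    """Yield one tab-separated run line per tweet.

    `predictions` is an iterable of
    (incident_id, tweet_id, priority, info_types) tuples,
    already sorted in stream (time) order for each event.
    """
    rank = defaultdict(int)  # per-event counter; first tweet gets rank 1
    for incident_id, tweet_id, priority, info_types in predictions:
        if priority not in PRIORITIES:
            raise ValueError(f"unknown priority label: {priority!r}")
        rank[incident_id] += 1
        yield "\t".join([
            incident_id,             # incident identifier
            "Q0",                    # literal string
            str(tweet_id),           # tweet ID
            str(rank[incident_id]),  # tweet number in the stream
            priority,                # importance/criticality label
            ",".join(info_types),    # assumed plain comma-delimited list
            runtag,                  # run identifier
        ])


# Hypothetical predictions for two made-up events.
example = [
    ("EXAMPLE-EVENT-1", 100000000000000001, "High",
     ["Report-Weather", "Report-Location"]),
    ("EXAMPLE-EVENT-1", 100000000000000002, "Low", ["Other-Sentiment"]),
    ("EXAMPLE-EVENT-2", 100000000000000003, "Critical",
     ["Request-SearchAndRescue"]),
]
lines = list(format_run_lines(example, "UoGlasgow-DLRun1"))
```

Writing `"\n".join(lines)` to the run file then gives one line per tweet, with the rank restarting at 1 for each event.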

Per-Task Guidelines

As described in the participation section, we are running three different tasks for the 2020-A edition. You do not need to submit to all tasks. If you submit to Task 1, we will consider that to be also a submission to Task 2 (we will simply ignore the category labels you provide which are not used in Task 2, as the event set is the same across tasks). You may also submit further runs targeting only Task 2. The submission deadline for each task is given below:

Task 1: Crises / 25 Information Types

In this task you will process a stream of tweets from different events and you need to assign one or more of 25 information type labels and one priority label (Critical, High, Medium or Low) for each tweet. This task uses 15 crisis events, either bombings, earthquakes, floods, typhoons, hurricanes, wildfires or shootings. The information type categories are:

Request-GoodsServices
Request-SearchAndRescue
Request-InformationWanted
CallToAction-Volunteer
CallToAction-Donations
CallToAction-MovePeople
Report-FirstPartyObservation
Report-ThirdPartyObservation
Report-Weather
Report-Location
Report-EmergingThreats
Report-NewSubEvent
Report-MultimediaShare
Report-ServiceAvailable
Report-Factoid
Report-Official
Report-News
Report-CleanUp
Report-Hashtags
Report-OriginalEvent
Other-ContextualInformation
Other-Advice
Other-Sentiment
Other-Discussion
Other-Irrelevant

--- SUBMISSION DEADLINE: June 1st 2020 ---

Task 2: Crises / 12 Information Types

In this task you will process a stream of tweets from different events and you need to assign one or more of a reduced set of 12 information type labels and one priority label (Critical, High, Medium or Low) for each tweet. This task uses 15 crisis events, either bombings, earthquakes, floods, typhoons, hurricanes, wildfires or shootings. The information type categories are:

Request-GoodsServices
Request-SearchAndRescue
Request-InformationWanted
CallToAction-Volunteer
CallToAction-MovePeople
Report-FirstPartyObservation
Report-Location
Report-EmergingThreats
Report-NewSubEvent
Report-MultimediaShare
Report-ServiceAvailable
Other-Any

--- SUBMISSION DEADLINE: June 1st 2020 ---
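
If you want to preview how a Task 1 run would look once the labels not used in Task 2 are ignored, something like the following could be used. It is a sketch only: `restrict_to_task2` is a hypothetical helper, not part of any official tooling, and what happens to a tweet left with no labels after filtering is not specified on this page, so the sketch simply returns an empty list in that case.

```python
# The 12 Task 2 information types, copied from the list above.
TASK2_TYPES = frozenset({
    "Request-GoodsServices",
    "Request-SearchAndRescue",
    "Request-InformationWanted",
    "CallToAction-Volunteer",
    "CallToAction-MovePeople",
    "Report-FirstPartyObservation",
    "Report-Location",
    "Report-EmergingThreats",
    "Report-NewSubEvent",
    "Report-MultimediaShare",
    "Report-ServiceAvailable",
    "Other-Any",
})


def restrict_to_task2(info_types):
    """Keep only the labels that exist in Task 2, preserving order."""
    return [label for label in info_types if label in TASK2_TYPES]


# A Task 1 prediction with two labels that Task 2 does not use.
task1_labels = ["Report-Weather", "Report-Location", "CallToAction-Donations"]
task2_labels = restrict_to_task2(task1_labels)
```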

Task 3: COVID-19 / 9 Information Types

In this task you will process a stream of tweets about the COVID-19 outbreak in different affected regions and you need to assign one or more of a reduced set of 9 information type labels and one priority label (Critical, High, Medium or Low) for each tweet. The information type categories are:

Request-GoodsServices
Request-InformationWanted
CallToAction-Volunteer
CallToAction-MovePeople
Report-EmergingThreats
Report-NewSubEvent
Report-ServiceAvailable
Other-Advice
Other-Any

--- SUBMISSION DEADLINE: June 8th 2020 ---

Performing Submission

Preparing Your Submission

To create a run submission, you need to create a ZIP file with the name '[runtag].2020A.task[X].zip' (replace [runtag] with the name of your run, as given in the last field of your run file, and [X] with the task number), where the ZIP contains two files:

Your run file (see the run format description above), where the filename is [runtag].2020A.task[X].run.
A readme.txt file, containing the following:

the name of your company or university
the name of the person submitting the run
the email address of that person
your organisation type: either 'academic', 'public sector', or 'industry'
the run's runtag
2-3 sentences describing the key aspects of your approach (it helps us get an overview of what people tried)
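
The packaging step described above can be scripted with Python's standard `zipfile` module. This is a minimal sketch under the naming rules given here; the function `package_run` and its parameters are illustrative names, not an official tool, and it assumes you have already written the run file and readme.txt to disk.

```python
import zipfile
from pathlib import Path


def package_run(run_path, readme_path, runtag, task, out_dir="."):
    """Bundle a run file and its readme into '[runtag].2020A.task[X].zip'."""
    base = f"{runtag}.2020A.task{task}"
    zip_path = Path(out_dir) / f"{base}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the files under the names the instructions ask for,
        # regardless of what they are called locally.
        zf.write(run_path, arcname=f"{base}.run")
        zf.write(readme_path, arcname="readme.txt")
    return zip_path
```

For example, `package_run("predictions.tsv", "readme.txt", "UoGlasgow-DLRun1", 1)` would produce `UoGlasgow-DLRun1.2020A.task1.zip` in the current directory (the local file name `predictions.tsv` is hypothetical).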

Performing a Submission

For the 2020-A edition, submissions are made to the University of Glasgow via a file transfer service:

http://transfer.gla.ac.uk/

On this page, select 'Drop-off'. You will be asked to verify your contact details; enter your details and press 'Send Confirmation'. You will shortly receive an automatic email entitled '[UofG Transfer] You are trying to drop off some files'. Press the link in the email and you will be taken back to the transfer service, with a box open asking for recipients. Enter the following details:

Name: Richard McCreadie (TREC-IS)
Email: richard.mccreadie@glasgow.ac.uk

In the 'Short note to the Recipients:' box, enter 'TREC-IS 2020-A Submission'. Then press 'Click to Add Files or Drag Them Here' and select your submission zip. If you have multiple submissions, you can add them all here. Once done, press 'Drop-off Files'. You should see a final screen indicating that your files have been successfully sent. You will also receive an email some time later when I download your submission, so you know I got it.

Personal Data Processing Policy

As with downloading the datasets, by uploading your runs you agree to the University of Glasgow processing your personal data, as defined by the EU General Data Protection Regulation (GDPR): your name and email in this case. Queries about data processing and access/deletion requests should be sent to me via email. We will store your data for as long as the track is ongoing and up to 2 years beyond that. I may contact you using the details provided to notify you about changes in the datasets or track, to provide information or ask you questions about your participation, or otherwise contact you about topics relevant to emergency management. We may collate statistics from the provided information that will be published, but we will not release individual names or email addresses. Best effort will be made to ensure that your runs are permanently stored, and we may share them with third parties (e.g. other research groups or NIST), but we will not share your personal data with them (i.e. we will only share the run file, not the readme.txt).
