CrisisFACTS is an open data challenge for state-of-the-art temporal summarization technologies to support disaster-response managers' use of online data sources during crisis events.
Tracking developments in topics and events has been studied at TREC and other venues for several decades (e.g., from DARPA’s early Topic Detection and Tracking initiative to the more recent Temporal Summarization and Real-Time Summarization TREC tracks). In today’s high-velocity, multi-stream information ecosystem, however, critical information and new developments are easily missed, especially during crises. While modern search engines are adept at providing users with search results relevant to an event, they are ill-suited to multi-stream fact-finding and summarization needs. The CrisisFACTS track aims to foster research that closes these gaps.
CrisisFACTS is making available multi-stream datasets from several disasters, covering Twitter, Reddit, Facebook, and online news sources. We supplement these datasets with queries defining the information needs of disaster-response stakeholders (extracted from FEMA ICS209 forms). Participants’ systems should integrate these streams into temporally ordered lists of important facts, which we can aggregate into summaries for disaster response personnel.
This track’s core information need is:
What critical new developments have occurred that I need to know about?
Many pieces of information posted during a disaster are not essential for responders or disaster-response managers. To make these needs explicit, we have compiled a list of general and disaster-specific queries/“user profiles”, available here. These queries capture information a responder might consider important, such as the example queries shown in Fig 2 below.
Emergency response staff typically want to receive a summary of this information at particular points during the emergency. Such a summary might be generated at the start of a new shift so the next group of team members can be informed about new developments. Alternatively, local government or media agencies might request updates on the emergency.
Currently, these information needs are fulfilled via manual summarization, e.g., by filling in an incident report such as a FEMA ICS209 form.
The 2022 track will have a single fact-extraction task, where systems consume a multi-stream dataset for a given disaster, broken into disaster-day pairs. From this stream, a system should produce a minimally redundant list of atomic facts, each with an importance score denoting how critical the fact is for responders. CrisisFACTS organizers will aggregate these facts into daily summaries for these disasters, along the following lines:
Fig 0. ConOps/High-Level System Overview
Inputs to participant systems include:
{
"eventID": "CrisisFACTS-001",
"trecisId": "TRECIS-CTIT-H-092",
"dataset": "2017_12_07_lilac_wildfire.2017",
"title": "Lilac Wildfire 2017",
"type": "Wildfire",
"url": "https://en.wikipedia.org/wiki/Lilac Fire",
"description": "The Lilac Fire was a fire that burned in northern San Diego County, California, United States, and the second-costliest one one of multiple of multiple wildfires that erupted in Southern California in December 2017."
}
Fig 1. Example Event Definition for the 2017 Lilac Fire
[{
"queryID": "CrisisFACTS-General-q001",
"indicativeTerms": "airport closed",
"query": "Have airports closed",
"trecisCategoryMapping": "Report-Factoid"
},
{
"queryID": "CrisisFACTS-General-q002",
"indicativeTerms": "rail closed",
"query": "Have railways closed",
"trecisCategoryMapping": "Report-Factoid"
},
{
"queryID": "CrisisFACTS-General-q003",
"indicativeTerms": "water supply",
"query": "Have water supplies been contaminated",
"trecisCategoryMapping": "Report-EmergingThreats"
},
...,
{
"queryID": "CrisisFACTS-Wildfire-q001",
"indicativeTerms": "acres size",
"query": "What area has the wildfire burned",
"trecisCategoryMapping": "Report-Factoid"
},
{
"queryID": "CrisisFACTS-Wildfire-q002",
"indicativeTerms": "wind speed",
"query": "Where are wind speeds expected to be high",
"trecisCategoryMapping": "Report-Weather"
},
...
]
Fig 2. Example Query Definition
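Systems may use these query definitions however they see fit. As one rough illustration, the Python sketch below matches a stream item's text against each query's indicativeTerms using simple token overlap; the scoring scheme and helper names are our own illustrative assumptions, not part of the track.

```python
# Minimal sketch: match stream items against query indicativeTerms by token overlap.
# The scoring scheme and helper names are illustrative assumptions, not a track requirement.
from typing import Dict, List


def tokenize(text: str) -> set:
    """Lower-case, punctuation-light tokenization."""
    return {tok.strip(".,!?#@:;\"'()") for tok in text.lower().split()}


def match_queries(item_text: str, queries: List[Dict]) -> List[str]:
    """Return the queryIDs whose indicativeTerms overlap with the item text."""
    item_tokens = tokenize(item_text)
    matched = []
    for query in queries:
        if tokenize(query["indicativeTerms"]) & item_tokens:
            matched.append(query["queryID"])
    return matched


queries = [
    {"queryID": "CrisisFACTS-General-q001", "indicativeTerms": "airport closed",
     "query": "Have airports closed", "trecisCategoryMapping": "Report-Factoid"},
    {"queryID": "CrisisFACTS-Wildfire-q002", "indicativeTerms": "wind speed",
     "query": "Where are wind speeds expected to be high", "trecisCategoryMapping": "Report-Weather"},
]

print(match_queries("Big increase in the wind plus drop in humidity tonight", queries))
# ['CrisisFACTS-Wildfire-q002']
```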
[{
"eventID": "CrisisFACTS-001",
"requestID": "CrisisFACTS-001-r3",
"dateString": "2017-12-07",
"startUnixTimestamp": 1512604800,
"endUnixTimestamp": 1512691199
},
...,
{
"eventID": "CrisisFACTS-001",
"requestID": "CrisisFACTS-001-r4",
"dateString": "2017-12-08",
"startUnixTimestamp": 1512691200,
"endUnixTimestamp": 1512777599
}]
Fig 3. Example Summary Requests
[{
"event": "CrisisFACTS-001",
"streamID": "CrisisFACTS-001-Twitter-14023-0",
"unixTimestamp": 1512604876,
"text": "Big increase in the wind plus drop in humidity tonight into Thursday for San Diego County #SanDiegoWX https://t.co/1pV0ZAhsJH",
"sourceType": "Twitter"
},
{
"event": "CrisisFACTS-001",
"streamID": "CrisisFACTS-001-Twitter-27052-0",
"unixTimestamp": 1512604977,
"text": "Prayers go out to you all! From surviving 2 massive wild fires in San Diego and California in general we have all c… https://t.co/B5Y7KLY0uS",
"sourceType": "Twitter"
},
{
"event": "CrisisFACTS-001",
"streamID": "CrisisFACTS-001-Twitter-43328-0",
"unixTimestamp": 1512691164,
"text": "If you're in the San Diego area (or north of it), you should probably turn on tweet notifs from @CALFIRESANDIEGO fo… https://t.co/hNjEuEfKaB",
"sourceType": "Twitter"
}]
Fig 4. Three Event Snippets for Event CrisisFACTS-001
Your system should produce one summary for each request, using only the content provided for that event and only content that falls between the request's starting and ending timestamps.
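As a minimal sketch of that constraint, the snippet below filters an event's stream items down to a single summary request's time window, assuming local JSON copies of the requests (Fig 3) and stream items (Fig 4); the file names are hypothetical.

```python
# Minimal sketch: restrict an event's stream items to one summary request's time window.
# File names are hypothetical; requests and stream items follow Fig 3 and Fig 4.
import json


def load_json(path: str):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def items_for_request(items, request):
    """Keep only stream items inside the request's start/end timestamps."""
    return [
        item for item in items
        if request["startUnixTimestamp"] <= item["unixTimestamp"] <= request["endUnixTimestamp"]
    ]


requests = load_json("CrisisFACTS-001.requests.json")  # hypothetical local copy of Fig 3
items = load_json("CrisisFACTS-001.items.json")        # hypothetical local copy of Fig 4

for request in requests:
    candidates = items_for_request(items, request)
    print(request["requestID"], len(candidates))
```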
This task differs from traditional summarization in that you should not simply produce a block of text of a set length. Instead, this track’s “summaries” consist of the facts describing the target disaster’s evolution: an itemised list of ‘facts’, each matching one or more of the user information needs.
We will use these top-k “most important” facts from a given event-day pair as the summary for that event’s day.
Each fact should contain the following:

- requestID: the summary request this fact responds to
- factText: the fact itself, either extracted from the stream or abstractively generated
- unixTimestamp: the time at which the fact was reported
- importance: your system's estimate of how critical the fact is for responders
- sources: the streamIDs of the stream items supporting this fact
- streamID: the streamID of your fact's original item if it is extractive, or null otherwise
- informationNeeds: the queryIDs of the information needs this fact addresses

Examples of system output are as follows:
{
"requestID": "CrisisFACTS-001-r3",
"factText": "Increased threat of wind damage in the San Diego area.",
"unixTimestamp":1512604876,
"importance": 0.71,
"sources": [
"CrisisFACTS-001-Twitter-14023-0"
],
"streamID": null,
"informationNeeds": ["CrisisFACTS-General-q015"]
}
...
Fig 5. Example System Output with Abstractive Facts. The streamID field is empty as this fact may not appear in the dataset verbatim. It is, however, supported by one Twitter message.
{
"requestID": "CrisisFACTS-001-r3",
"factText": "Big increase in the wind plus drop in humidity tonight into Thursday for San Diego County #SanDiegoWx https://t.co/1pVOZAhsJH",
"unixTimestamp":1512604876,
"importance": 0.71,
"sources": [
"CrisisFACTS-001-Twitter-14023-0"
],
"streamID": "CrisisFACTS-001-Twitter-14023-0",
"informationNeeds": ["CrisisFACTS-General-q015"]
}
...
Fig 6. Example System Output with Extractive Facts. The streamID field is populated with the ID of the Twitter document from which this text was taken.
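As a rough illustration of the extractive case, the sketch below maps a stream item (Fig 4 format) onto an output record (Fig 6 format); the importance function is a placeholder assumption rather than a recommended model.

```python
# Minimal sketch: build an extractive fact record (cf. Fig 6) from a stream item (cf. Fig 4).
# score_importance is a placeholder assumption; real systems use their own importance model.
def score_importance(text: str) -> float:
    """Placeholder importance model (assumption): constant score."""
    return 0.5


def extractive_fact(item: dict, request_id: str, query_ids: list) -> dict:
    """Map a stream item to an extractive fact record."""
    return {
        "requestID": request_id,
        "factText": item["text"][:200],           # facts may not exceed 200 characters
        "unixTimestamp": item["unixTimestamp"],
        "importance": score_importance(item["text"]),
        "sources": [item["streamID"]],
        "streamID": item["streamID"],              # set to None instead for abstractive facts
        "informationNeeds": query_ids,
    }
```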
Participant systems may produce as many facts as they wish for a specific summary request. However, to keep summary lengths manageable, each fact may contain no more than 200 characters.
For days after the first, your system should avoid returning information that has been reported in previous summaries for the same event. Furthermore, evaluation will be performed at a predetermined number of facts (not revealed in advance). To truncate your list of facts, we will rank them by importance score and cut at a specific rank k – which will vary across event-day pairs.
We recommend that you return at least 100 facts per summary request.
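Since the cutoff rank k is not known in advance, a reasonable strategy is to emit a generously sized, importance-ranked list and to screen out near-duplicates of facts returned on earlier days yourself. The sketch below does both with a simple token-overlap (Jaccard) test; the 0.6 threshold and helper names are illustrative assumptions, not track requirements.

```python
# Minimal sketch: rank facts by importance and drop facts that largely repeat
# facts already returned for earlier days of the same event. The Jaccard test
# and the 0.6 threshold are illustrative assumptions, not track requirements.
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def dedup_and_rank(facts, previous_facts, threshold=0.6):
    """Sort facts by descending importance, removing near-duplicates of earlier output."""
    kept = []
    for fact in sorted(facts, key=lambda f: f["importance"], reverse=True):
        if all(jaccard(fact["factText"], prev["factText"]) < threshold
               for prev in previous_facts + kept):
            kept.append(fact)
    return kept
```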
For each day during an event, content from the Twitter, Reddit, Facebook, and online news streams is available.
CrisisFACTS has transitioned to the ir_datasets infrastructure for making data available to the community. We provide a GitHub repository with Jupyter notebooks and a Google Colab notebook to accelerate participants’ access to this data.
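As a rough sketch of what data access can look like, the snippet below iterates over one event-day's documents via ir_datasets; the dataset identifier and document fields used here are assumptions, so defer to the track's notebooks for the actual registration steps and schema.

```python
# Rough sketch of data access via ir_datasets. The dataset ID pattern and the
# document fields below are assumptions; the track's GitHub/Colab notebooks
# document the actual registration steps and schema.
import ir_datasets

dataset = ir_datasets.load("crisisfacts/001/2017-12-07")  # hypothetical event-day ID
for doc in dataset.docs_iter():
    print(doc.doc_id, getattr(doc, "text", ""))
    break
```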
We provide these data streams to participants as the source of content for their summaries. Statistics for each of the eight 2022 events are listed below:
eventID | Title | Type | Tweets | News | ||
---|---|---|---|---|---|---|
CrisisFACTS-001 | Lilac Wildfire 2017 | Wildfire | 41,346 | 1,738 | 2,494 | 5,437 |
CrisisFACTS-002 | Cranston Wildfire 2018 | Wildfire | 22,974 | 231 | 1,967 | 5,386 |
CrisisFACTS-003 | Holy Wildfire 2018 | Wildfire | 23,528 | 459 | 1,495 | 7,016 |
CrisisFACTS-004 | Hurricane Florence 2018 | Hurricane | 41,187 | 120,776 | 18,323 | 196,281 |
CrisisFACTS-005 | Maryland Flood 2018 | Flood | 33,584 | 2,006 | 2,008 | 4,148 |
CrisisFACTS-006 | Saddleridge Wildfire 2019 | Wildfire | 31,969 | 244 | 2,267 | 3,869 |
CrisisFACTS-007 | Hurricane Laura 2020 | Hurricane | 36,120 | 10,035 | 6,406 | 9,048 |
CrisisFACTS-008 | Hurricane Sally 2020 | Hurricane | 40,695 | 11,825 | 15,112 | 48,492 |
Runs will be submitted through the NIST submission system at trec.nist.gov. Runs that do not pass validation will be rejected outright. Submitters will be asked to specify the following for each run:
Each run submission must indicate whether the run is manual or automatic. An automatic run is any run that receives no human intervention once the system is started and provided with the task inputs. We expect most CrisisFACTS runs to be automatic.
Results for manual runs will be specifically identified when results are reported. A manual run is any run in which a person manually changes, summarises, or re-ranks the queries, the system, or the system’s lists of facts. Simple bug fixes that address only format handling do not make a run manual, but such changes should be described.
The submission format for CrisisFACTS is a newline-delimited JSON file, where each entry contains the fields outlined in the System Output section above. Each submission file corresponds to a single run (i.e., it covers all event-day pairs for all events), with the submission’s runtag included in the filename.
Example submissions are available in Output Examples.
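For instance, a run file might be written as below; the runtag, file name, and facts variable are illustrative, and each line carries the fields described in the System Output section.

```python
# Minimal sketch: write a run as newline-delimited JSON, one fact per line.
# The runtag, file name, and `facts` list are illustrative assumptions.
import json

runtag = "myTeam-run1"   # hypothetical runtag
facts = []               # all fact records, covering every event-day pair in the run

with open(f"{runtag}.json", "w", encoding="utf-8") as out:
    for fact in facts:
        out.write(json.dumps(fact) + "\n")
```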
Submitted systems will be evaluated using two sets of approaches in this first year. In both approaches, participant systems’ lists of facts will be truncated at a private rank k derived from the NIST assessors’ judgments.
Milestone | Date |
---|---|
Guidelines released | 17 June 2022 |
Submissions Due | 26 September 2022 |
NIST-Assessor Evaluation | 29 Sept - 27 Oct 2022 |
Scores returned to participants | 28 October 2022 |
TREC Notebook Drafts Due | 7 November 2022 (Tentative) |
TREC Conference | 14 November 2022 |
Cody Buntain
@cbuntain
he/him
College of Information Studies, University of Maryland, College Park.
Richard McCreadie
@richardm_
he/him
School of Computing Science, University of Glasgow.