ReCoRD Dataset

ReCoRD Cloze Reading Comprehension Dataset

120K cloze-style reading comprehension questions based on CNN/Daily Mail news, requiring commonsense reasoning to select the correct entity. It is a core task of the SuperGLUE benchmark.

120K+ Questions | 65K+ Passages | SuperGLUE License | Zhang et al. (2018)
Cloze Questions: 120K+
News Passages: 65K+
Answer Format: Entity Spans
Benchmark Task: SuperGLUE

Dataset Highlights

A large-scale cloze-style reading comprehension benchmark that tests commonsense reasoning

Real News Corpus

All passages come from real CNN and Daily Mail news reports covering politics, sports, technology, entertainment, and more, written in natural, authentic language.

Cloze-style Format

Each question replaces a key entity in the query sentence with @placeholder, and the model must pick the entity from the passage that correctly fills the blank. The format is concise yet challenging.
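A minimal sketch of how the cloze format can be expanded into one candidate statement per entity, which a model could then score. The helper names `fill_query` and `expand_candidates` are hypothetical and not part of any ReCoRD tooling; the query and entities are taken from the sample record in the Data Preview.

```python
PLACEHOLDER = "@placeholder"


def fill_query(query: str, candidate: str) -> str:
    """Return the query with its first @placeholder replaced by the candidate entity."""
    if PLACEHOLDER not in query:
        raise ValueError("query has no @placeholder token")
    return query.replace(PLACEHOLDER, candidate, 1)


def expand_candidates(query: str, candidates: list) -> list:
    """Build one filled-in statement per candidate entity."""
    return [fill_query(query, c) for c in candidates]


query = ("President @placeholder has urged lawmakers to act swiftly "
         "on the flood insurance legislation.")
candidates = ["U.S. Senate", "Superstorm Sandy", "Barack Obama", "Congress"]

statements = expand_candidates(query, candidates)
for s in statements:
    print(s)
```

Each printed statement is a complete sentence, so a model can rank the candidates by how plausible each filled-in sentence is given the passage.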

Commonsense Reasoning Driven

Answering requires the model to go beyond surface text matching and to combine world knowledge with commonsense reasoning about the passage, which tests deep language understanding.

πŸ†

Core SuperGLUE Task

As an important part of the SuperGLUE benchmark, ReCoRD is a standard test for evaluating pre-trained language models' reading comprehension and reasoning abilities.

Large-scale Annotation

Contains over 120K question-answer pairs built on more than 65K news passages, providing ample data for training and evaluating large language models.

Authoritative Academic Source

Released in 2018 by Zhang et al. (Johns Hopkins University and Microsoft Research), widely cited in the NLP research community, and a standard benchmark in reading comprehension research.

Use Cases

From model evaluation to academic research, covering diverse NLP task needs

Reading Comprehension

Train and evaluate models on accurately extracting entities from news passages to answer cloze questions

Commonsense Reasoning

Test whether models can leverage commonsense knowledge for reasoning beyond simple pattern matching and keyword retrieval

Cloze Tasks

Train models on the standard cloze format to improve entity-level semantic understanding

πŸ†

SuperGLUE Evaluation

Part of the SuperGLUE benchmark for systematic evaluation of pre-trained language models' comprehensive language understanding performance
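SuperGLUE reports exact match (EM) and token-level F1 for ReCoRD, taking the best score over all gold answers. The sketch below illustrates both metrics under a simplified normalization (lowercasing and whitespace splitting only). This is an assumption made for brevity: official scoring scripts may normalize text differently, so numbers from this sketch should not be compared with leaderboard results.

```python
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Simplified normalization: lowercase, then split on whitespace."""
    return text.lower().split()


def exact_match(prediction: str, answers: List[str]) -> float:
    """1.0 if the prediction equals any gold answer after normalization, else 0.0."""
    pred = normalize(prediction)
    return float(any(pred == normalize(a) for a in answers))


def token_f1(prediction: str, answer: str) -> float:
    """Token-level F1 between a prediction and a single gold answer."""
    pred, gold = normalize(prediction), normalize(answer)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, answers: List[str]) -> float:
    """Best F1 over all gold answers (expects at least one answer)."""
    return max(token_f1(prediction, a) for a in answers)


gold = ["Barack Obama"]
print(exact_match("barack obama", gold), best_f1("Obama", gold))
```

A partial prediction such as "Obama" scores zero on exact match but still earns partial credit on F1, which is why the two metrics are usually reported together.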

Reading Comprehension | Cloze | Commonsense Reasoning | SuperGLUE | Entity Extraction

Data Preview

The following sample ReCoRD record shows the structure of the passage, query, and answers

JSON
{
  "passage": {
    "text": "CNN -- The U.S. Senate on Thursday passed a bill that would provide $9.7 billion in flood insurance to victims of Superstorm Sandy. The measure, which passed 62-32, now goes to the House. President Barack Obama has urged Congress to pass the bill quickly.",
    "entities": [
      {"start": 11, "end": 21, "text": "U.S. Senate"},
      {"start": 114, "end": 129, "text": "Superstorm Sandy"},
      {"start": 198, "end": 209, "text": "Barack Obama"},
      {"start": 221, "end": 228, "text": "Congress"}
    ]
  },
  "query": "President @placeholder has urged lawmakers to act swiftly on the flood insurance legislation.",
  "answers": ["Barack Obama"],
  "idx": 0
}
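A short sketch of how `start`/`end` offsets can be turned back into entity strings and sanity-checked against the stored `text` field. It assumes that `end` is inclusive and that offsets index directly into the passage string; both are assumptions to verify against the downloaded files. The helper names and the shortened passage are illustrative only.

```python
def entity_texts(passage: dict) -> list:
    """Slice each entity out of the passage text (assumes inclusive end offsets)."""
    text = passage["text"]
    return [text[e["start"]: e["end"] + 1] for e in passage["entities"]]


def mismatched_entities(passage: dict) -> list:
    """Return entities whose stored "text" disagrees with the slice at their offsets."""
    text = passage["text"]
    return [
        e for e in passage["entities"]
        if "text" in e and text[e["start"]: e["end"] + 1] != e["text"]
    ]


# Shortened, illustrative passage with hand-checked offsets.
passage = {
    "text": "President Barack Obama has urged Congress to pass the bill quickly.",
    "entities": [
        {"start": 10, "end": 21, "text": "Barack Obama"},
        {"start": 33, "end": 40, "text": "Congress"},
    ],
}

candidates = entity_texts(passage)
print(candidates)
print(mismatched_entities(passage))
```

Running a check like `mismatched_entities` over a whole split is a cheap way to catch off-by-one offset conventions before training.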

3 Steps to Get Started Quickly

From browsing to training, start your NLP research in minutes

01

Browse the Dataset

View dataset details on the Ace Data Cloud platform, including field descriptions, sample size, and SuperGLUE license metadata.

02

Download Data

Download the training, validation, and test JSON files of ReCoRD. The data structure is clear and fields are well-defined, ready to use out of the box.

03

Load and Train

Use json.load() or the Hugging Face datasets library to load the data, then start fine-tuning, evaluating, and experimenting.
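A minimal loading sketch using only the standard library. The file name `train.json` and the helper `load_examples` are hypothetical, and the sketch assumes each file is either a single JSON document or a JSON Lines file with one record per line; check the downloaded files for the actual layout. With the Hugging Face datasets library the equivalent would be a `load_dataset` call, but the dataset identifier depends on where the copy is hosted, so it is not shown here.

```python
import json
from pathlib import Path


def load_examples(path) -> list:
    """Load records from a .json file (one document) or a .jsonl file (one record per line)."""
    path = Path(path)
    with path.open(encoding="utf-8") as f:
        if path.suffix == ".jsonl":
            # JSON Lines: parse each non-empty line separately.
            return [json.loads(line) for line in f if line.strip()]
        data = json.load(f)
    # A single JSON document may hold one record or a list of records.
    return data if isinstance(data, list) else [data]


# Hypothetical usage; replace the path with the file you downloaded.
# examples = load_examples("train.json")
# print(len(examples), examples[0]["query"])
```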

Start Exploring the ReCoRD Reading Comprehension Data

A core SuperGLUE benchmark with 120K commonsense reasoning questions. Whether you are evaluating pre-trained models or exploring the frontiers of reading comprehension, this dataset is indispensable.