A benchmark for measuring whether AI agents can forecast future events. It is large in scale, as the authors describe: "We finalized a collection of 991,759 GDELT event records, corresponding to 59,161 unique events and 296,630 unique news articles. Our test set contains 705 query and answer pairs on forecasting an event of given timestamp between two countries, with a 100 balanced test subset." (GDELT refers to The GDELT Project.)