FreshWiki

FreshWiki is a dataset curated to study the generation of Wikipedia-like articles from scratch while mitigating data leakage from LLM pre-training.

From the STORM paper:

  • Contains 100 high-quality Wikipedia articles drawn from the most-edited pages between February 2022 and September 2023.
  • Articles are filtered to those of B-class quality or above, as assessed by ORES.
  • Used to evaluate the pre-writing stage of STORM.
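
The selection criteria above (most-edited pages, B-class quality or above, 100 articles) can be sketched roughly as follows. This is only an illustrative sketch, not the STORM authors' pipeline: the Candidate record, its field names, and select_articles are hypothetical, and the edit counts and quality labels are assumed to have been collected beforehand rather than fetched from Wikipedia or ORES here.

```python
from dataclasses import dataclass

# Wikipedia article quality classes, ordered from lowest to highest.
QUALITY_ORDER = ["Stub", "Start", "C", "B", "GA", "FA"]


@dataclass
class Candidate:
    """Hypothetical record for one candidate page (not a real API type)."""
    title: str
    edit_count: int  # assumed: number of edits within the collection window
    quality: str     # assumed: quality class predicted for the page


def select_articles(candidates, min_quality="B", limit=100):
    """Keep pages at or above min_quality, most-edited first, up to limit."""
    threshold = QUALITY_ORDER.index(min_quality)
    kept = [c for c in candidates if QUALITY_ORDER.index(c.quality) >= threshold]
    kept.sort(key=lambda c: c.edit_count, reverse=True)
    return kept[:limit]
```

Calling select_articles on a list of Candidate records returns at most 100 of them by default, all rated B, GA, or FA, sorted by descending edit count.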