FreshWiki
FreshWiki is a dataset curated to study the generation of Wikipedia-like articles from scratch while mitigating data leakage from LLM pre-training.
From the STORM paper:
- Contains 100 high-quality Wikipedia articles drawn from the most-edited pages between February 2022 and September 2023.
- Articles are filtered to those of B-class quality or above, as assessed by ORES.
- Used to evaluate the pre-writing stage of STORM.
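
The selection criteria above can be illustrated with a minimal sketch. It is not the STORM authors' pipeline: the `Candidate` record, the `select_articles` helper, and the filter-then-rank order are assumptions made for illustration, and the quality label is assumed to have been predicted already (for example by ORES) rather than fetched here. The class ordering follows Wikipedia's content-assessment scale.

```python
from dataclasses import dataclass

# Wikipedia content-assessment classes, ordered from lowest to highest.
QUALITY_ORDER = ["Stub", "Start", "C", "B", "GA", "FA"]


@dataclass
class Candidate:
    """Hypothetical record for one candidate page (not a FreshWiki schema)."""
    title: str
    edit_count: int   # edits within the collection window
    quality: str      # predicted quality class, e.g. from ORES


def select_articles(candidates, min_quality="B", limit=100):
    """Keep pages at or above min_quality, then take the most-edited ones."""
    threshold = QUALITY_ORDER.index(min_quality)
    kept = [c for c in candidates
            if QUALITY_ORDER.index(c.quality) >= threshold]
    # Rank the surviving pages by edit count, most-edited first.
    kept.sort(key=lambda c: c.edit_count, reverse=True)
    return kept[:limit]
```

For example, calling `select_articles(pages, limit=100)` on a list of `Candidate` objects would drop every Stub, Start, and C-class page and return up to 100 of the remaining pages, most-edited first.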