- hotpot_qa / distractor / validation
- kg_exploration_hotpotqa_distractor_v1
- sentence (supporting-facts setting)
- example (per-example closed-world context)
- n_queries=50, seed=0
- hotpotqa_official_metrics.json (version v1)

| Metric group | Metric | Value |
|---|---|---|
| Supporting facts (micro) | F1 | 0.9130434783 |
| Supporting facts (micro) | Precision | 0.9292035398 |
| Supporting facts (micro) | Recall | 0.8974358974 |
| Supporting facts (micro) | EM | 0.7200000000 |
| Supporting facts (macro) | F1 | 0.9229523810 |
| Supporting docs (micro) | F1 | 1.0000000000 |
| Answer (macro) | EM | 0.7200000000 |
| Answer (macro) | F1 | 0.8666269841 |
| Joint (macro) | EM | 0.5200000000 |
| Joint (macro) | F1 | 0.8086674049 (80.87) |
| Joint (macro) | F1 95% bootstrap CI | [0.7293608588, 0.8755161213] |
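
The micro F1 in the table can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below does that check and also shows one common way a 95% bootstrap interval for a macro-averaged score can be obtained (a percentile bootstrap over per-example scores). The source does not say which bootstrap variant or how many resamples were used, so `bootstrap_ci`, its parameters, and the `demo_scores` list are hypothetical placeholders for illustration, not the pipeline's actual code or data.

```python
import random


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-example scores.

    Assumption: this is a generic percentile bootstrap; the source does
    not document the exact procedure behind its reported interval.
    """
    rng = random.Random(seed)
    n = len(scores)
    # Resample n scores with replacement and record the mean each time.
    means = sorted(
        sum(rng.choices(scores, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Values copied from the "Supporting facts (micro)" rows of the table.
sp_precision = 0.9292035398
sp_recall = 0.8974358974
sp_f1_reported = 0.9130434783

sp_f1 = f1_from_pr(sp_precision, sp_recall)
# The recomputed F1 agrees with the reported value up to rounding.
assert abs(sp_f1 - sp_f1_reported) < 1e-8

# Hypothetical per-example joint F1 scores (NOT the real run's data),
# sized to match n_queries=50.
demo_rng = random.Random(0)
demo_scores = [demo_rng.uniform(0.4, 1.0) for _ in range(50)]
ci_lower, ci_upper = bootstrap_ci(demo_scores)
print(f"micro SP F1 recomputed: {sp_f1:.10f}")
print(f"demo 95% CI: [{ci_lower:.4f}, {ci_upper:.4f}]")
```

With only 50 queries, an interval of this kind is expected to be fairly wide, which matches the reported joint F1 interval spanning roughly 0.73 to 0.88.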

Official HotpotQA benchmark homepage and leaderboard: