MATH-500
math
A 500-problem subset of the MATH dataset, spanning subjects from algebra to number theory at five difficulty levels, used to evaluate step-by-step mathematical reasoning.
Official benchmark page
Model rankings on MATH-500
No verified scores for this benchmark yet. We only list results with a primary source.
Scores are self-reported or from primary evaluations, each linked to its source. Test conditions (tools, shots, prompt) vary between labs — see the source for details.