Developers at Scale AI and the Center for AI Safety have created "Humanity's Last Exam" (HLE), a benchmark of 2,500 expert-level questions spanning 100 topics.