Data Explorer

Below is a small subset of OmniBench subtasks that you can explore in detail.

Workflow
Task 1
Task ID
Instruction

Trajectory
Evaluation
def evaluate_task():
    # TODO: Implement evaluation logic
    return "Not implemented"