Custom Evaluation Dataset
Introduction
CSGHub provides model evaluation tools and supports custom evaluation datasets: users can upload their own datasets and use them to evaluate model performance. This document explains how to prepare a custom evaluation dataset.
EvalScope Custom Dataset Usage
Multiple Choice Questions (MCQ)
CSV Format
Directory structure:
mcq/
├── example_dev.csv # (Optional) File name format: `{subset_name}_dev.csv`, used for few-shot evaluation
└── example_val.csv # File name format: `{subset_name}_val.csv`, contains the data used for the actual evaluation
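The naming convention above can be checked before uploading. The following is a minimal sketch, not part of CSGHub or EvalScope: the helper name `find_subsets` and its return shape are hypothetical, and it only assumes the `{subset_name}_val.*` and optional `{subset_name}_dev.*` pattern described in the listing.

```python
from pathlib import Path


def find_subsets(dataset_dir, ext="csv"):
    """Map each subset name to its val file and its optional dev file.

    Hypothetical helper: it relies only on the file name pattern
    {subset_name}_val.<ext> / {subset_name}_dev.<ext>.
    """
    root = Path(dataset_dir)
    suffix = f"_val.{ext}"
    subsets = {}
    for val_path in sorted(root.glob(f"*{suffix}")):
        # Strip the "_val.<ext>" suffix to recover the subset name.
        name = val_path.name[: -len(suffix)]
        dev_path = root / f"{name}_dev.{ext}"
        subsets[name] = {
            "val": val_path,
            # The dev file is optional, so report None when it is absent.
            "dev": dev_path if dev_path.exists() else None,
        }
    return subsets
```

Calling `find_subsets("mcq")` on the directory shown above would be expected to return a single `example` entry. A subset whose `dev` value is `None` simply has no few-shot file.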
CSV files should follow this format:
id,question,A,B,C,D,answer
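1,"In general, the number of amino acids that make up animal proteins is ____",4 types,22 types,20 types,19 types,C
2,"Among the following substances found in blood, the one that is not a metabolic end product is ____.",Urea,Uric acid,Pyruvic acid,Carbon dioxide,C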
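A question that contains a comma must be quoted, otherwise the row splits into more than seven fields. The sketch below uses Python's standard `csv` module to write rows with correct quoting and to check a file against the column layout shown above. The function names `to_mcq_csv` and `validate_mcq_csv` are hypothetical, and the check assumes exactly the seven columns and the four answer letters from this example.

```python
import csv
import io

# Column layout taken from the example above (assumed to be fixed).
EXPECTED_HEADER = ["id", "question", "A", "B", "C", "D", "answer"]


def to_mcq_csv(records):
    """Serialize dicts keyed by EXPECTED_HEADER into CSV text.

    csv.DictWriter quotes any field that contains a comma or a quote.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=EXPECTED_HEADER, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def validate_mcq_csv(text):
    """Return a list of problems found in MCQ CSV text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["header must be " + ",".join(EXPECTED_HEADER)]
    problems = []
    for row_no, row in enumerate(rows[1:], start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(EXPECTED_HEADER):
            problems.append(f"row {row_no}: expected 7 fields, got {len(row)}")
        elif row[-1] not in ("A", "B", "C", "D"):
            problems.append(f"row {row_no}: answer must be one of A, B, C, D")
    return problems
```

Running `validate_mcq_csv` on the contents of `example_val.csv` before uploading catches unquoted commas early, since they show up as rows with the wrong field count.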
JSONL Format
Directory structure:
mcq/
├── example_dev.jsonl # (Optional) File name format: `{subset_name}_dev.jsonl`, used for few-shot evaluation
└── example_val.jsonl # File name format: `{subset_name}_val.jsonl`, contains the data used for the actual evaluation
JSONL files should follow this format:
{"id": "1", "question": "