MMSciBench

Paper | MMSciBench Dataset
MMSciBench is a benchmark of Chinese multimodal mathematics and physics problems that evaluates the scientific reasoning capabilities of language models. This repository contains the code for the benchmark; the dataset is available on Hugging Face as the MMSciBench Dataset.
If you use this benchmark in your research, please cite our paper:
```bibtex
@article{ye2025mmscibench,
  title={MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems},
  author={Ye, Xinwu and Li, Chengfan and Chen, Siming and Wei, Wei and Tang, Xiangru},
  journal={Findings of the Association for Computational Linguistics: ACL 2025},
  year={2025}
}
```
Clone the repository:

```shell
git clone https://github.com/xinwuye/MMSciBench-code.git
cd MMSciBench-code
```

The dataset for MMSciBench is available on Hugging Face: MMSciBench Dataset.
To evaluate models on the benchmark, run the following scripts:

```shell
python exp.py
python exp_hf.py
python eval1.py
```

Once evaluation is complete, the results will be saved.
This project is licensed under the Apache-2.0 License.
We welcome contributions! To contribute:
- Fork the repository.
- Create a new branch: `git checkout -b feature-branch-name`
- Commit your changes: `git commit -m 'Add a new feature'`
- Push to the branch: `git push origin feature-branch-name`
- Open a pull request.
We thank the open-source community and previous works that inspired this benchmark.
For questions or collaborations, please open an issue.