The objective of this challenge is to understand some of the problems of automatic natural language processing (NLP) through an automatic prediction task. This consists in processing texts corresponding to film reviews from the AlloCiné website, in order to automatically deduce the scores attributed to films by the authors of these reviews.
- Bouferroum Aymen Salah Eddine.
- Kracheni Zakaria.
- Vincent Labatut
- Mickaël Rouvier
$ pip install -r requirements.txtin /prepare_data/ directory :
- Execute
xml_parsing_train_dev.py - Execute
tokenization_train: x_train.txt and y_train.txt will be generated - Execute
xml_parsing_test.py - Execute
tokenization_test: x_test.txt will be generated