GoldCorpus

class (v2)
An annotated corpus, using the JSON file format

This class manages annotations for tagging, dependency parsing and NER.

GoldCorpus.__init__ method

Create a GoldCorpus. If the input data is an iterable, each item should be a (text, paragraphs) tuple, where each paragraph is a tuple (sentences, brackets), and each sentence is a tuple (ids, words, tags, heads, ner). See the implementation of gold.read_json_file for further details.
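The nesting described above can be easier to follow with a concrete item. The sketch below builds one in plain Python, taking the field layout only from the description above; it has not been checked against gold.read_json_file. The example sentence, the tag and NER label values, the use of token indices for heads, and the helpers make_sentence and check_item are all illustrative assumptions and are not part of spaCy.

```python
# A minimal sketch of one training item, following the layout described above:
#   item      = (text, paragraphs)
#   paragraph = (sentences, brackets)
#   sentence  = (ids, words, tags, heads, ner)
# All values are made-up example data.

def make_sentence(words, tags, heads, ner):
    """Hypothetical helper: build one (ids, words, tags, heads, ner) tuple."""
    ids = list(range(len(words)))  # one id per token, counted from 0
    return (ids, words, tags, heads, ner)

sentence = make_sentence(
    words=["Apple", "is", "hiring"],
    tags=["NNP", "VBZ", "VBG"],
    heads=[2, 2, 2],             # assumption: head given as a token index
    ner=["U-ORG", "O", "O"],     # assumption: BILUO-style entity labels
)

brackets = []                    # no bracket annotations in this toy example
paragraph = ([sentence], brackets)
item = ("Apple is hiring", [paragraph])
train_data = [item]              # an iterable of (text, paragraphs) tuples

def check_item(item):
    """Hypothetical validator: confirm the nesting unpacks as described and
    that every per-token list in a sentence has the same length."""
    text, paragraphs = item
    for sentences, brackets in paragraphs:
        for ids, words, tags, heads, ner in sentences:
            n = len(words)
            if not (len(ids) == len(tags) == len(heads) == len(ner) == n):
                return False
    return True

print(check_item(item))
```

An iterable such as train_data is the kind of value the train and dev arguments are described as accepting, as an alternative to a file or directory path.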

Name  | Type                      | Description
train | unicode / Path / iterable | Training data, as a path (file or directory) or iterable.
dev   | unicode / Path / iterable | Development data, as a path (file or directory) or iterable.