Brendan Quinn edited this page Jul 12, 2024 · 13 revisions

Welcome to the IPTC Sport Schema wiki!

Here is where we did the initial analysis and design work towards "the next generation of SportsML", which ended up being called IPTC Sport Schema.

The public-facing documentation now lives at https://sportschema.org/

NOTE: Some of the information in this wiki may be out of date. See the published ontology and docs at sportschema.org for the final spec.

Scope

We are creating an RDF model that represents schedules, statistics and results for all levels of all sports, for both human and machine consumption.

Participants

Please add your name and/or fix your details here.

  • Paul Kelly - product consultant, lead of Sports Content Working Group
  • Silver Oliver - data architect at Data Language
  • Nathan Matten - XML Team
  • Jim Howard - product consultant
  • Paul Wilton - data architect at Data Language
  • Drew Wanczowski - Principal Solutions Engineer at Progress Software/MarkLogic
  • Brendan Quinn - Managing Director at IPTC

Background and motivation

We (mostly Jim Howard) developed a document showing what could be done with a semantic sports model in the future:

Read it here: IPTC Sport Development

To help prioritise our development, we developed some Use Cases - mostly sample questions that people would want to ask of sports data.

We used these to define the scope and prioritise our work and to explore the complexity of different implementation options.

Requirements

  • plays well with other vocabularies
  • works on the web (e.g. works with schema.org)
  • can generalise and map other proprietary formats (IOC, OPTA, STATS, etc.)
  • plays well with current dev tech (e.g. GraphQL)
  • flexible regarding serialisation format (JSON, XML, Turtle, etc.)

Modelling and Format

Our initial attempts used an RDF model in Turtle format.

Turtle and SPARQL samples

Design Approaches

Generic versus Specific

Should we model specific properties or have a generic property that is effectively a key/value pair where the key is not defined in advance?
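To make the two options concrete, here is a minimal sketch in plain Python that treats triples as tuples. Everything named here is a hypothetical illustration: the `ex:` prefix, `ex:stat`, `ex:key`, `ex:value` and `ex:goalsScored` are assumptions made for this example and are not terms from the published ontology.

```python
# Hypothetical illustration: triples are (subject, predicate, object) tuples.
# None of the "ex:" names below are real IPTC Sport Schema terms.

# Generic: one catch-all property pointing to a key/value node.
generic_triples = [
    ("ex:player1", "ex:stat", "_:s1"),
    ("_:s1", "ex:key", "goals-scored"),
    ("_:s1", "ex:value", "2"),
]

# Specific: each statistic is its own property with a typed value.
specific_triples = [
    ("ex:player1", "ex:goalsScored", 2),
]


def generic_lookup(triples, subject, key):
    """Find the value for `key` by walking through the key/value node."""
    stat_nodes = [o for s, p, o in triples if s == subject and p == "ex:stat"]
    for node in stat_nodes:
        keys = [o for s, p, o in triples if s == node and p == "ex:key"]
        if key in keys:
            values = [o for s, p, o in triples if s == node and p == "ex:value"]
            return values[0] if values else None
    return None


def specific_lookup(triples, subject, prop):
    """Find the value by matching a single triple pattern."""
    for s, p, o in triples:
        if s == subject and p == prop:
            return o
    return None
```

In this sketch the generic lookup needs a two-step walk and hands back an untyped string, while the specific lookup is a single pattern match that returns a typed value.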

Generic approach

Pros:

  • Very flexible model
  • We have existing controlled vocabularies (CVs) from SportsML that we can use as keys

Cons:

  • No typing
  • Hard to validate
  • Doesn't follow the semantic web model very well

Specific approach

Pros:

  • brevity (data and query code)
  • data typing
  • IDs readymade

Decision:

We generally chose to create sport-specific ontologies from our SportsML CVs, turning each term into a property. That way we can add validation (via SHACL) and typing (via RDF Schema).
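The idea of turning each CV term into a typed, checkable property can be sketched as follows. This is a plain Python stand-in, not SHACL or RDF Schema, and the CV terms, the `ex:` prefix and the helper names (`term_to_property`, `validate`) are assumptions invented for illustration.

```python
# Hypothetical sketch: a stand-in for the idea of generating typed properties
# from a controlled vocabulary (CV). This is plain Python, not SHACL or RDFS,
# and the CV terms and "ex:" names are invented for illustration.

# Each CV term is paired with the Python type we expect its values to have.
cv_terms = {
    "goals-scored": int,
    "minutes-played": int,
    "pass-accuracy": float,
}


def term_to_property(term):
    """Turn a hyphenated CV term into a camelCase property name."""
    head, *rest = term.split("-")
    return "ex:" + head + "".join(word.capitalize() for word in rest)


# One property per CV term, each with a declared value type.
property_types = {term_to_property(t): typ for t, typ in cv_terms.items()}


def validate(record):
    """Return a list of problems: unknown properties or wrongly typed values."""
    problems = []
    for prop, value in record.items():
        expected = property_types.get(prop)
        if expected is None:
            problems.append(f"unknown property: {prop}")
        elif not isinstance(value, expected):
            problems.append(f"{prop}: expected {expected.__name__}")
    return problems
```

Because every property is declared in advance, a misspelt property or a string where a number is expected can be reported, which is the kind of check the generic key/value approach makes hard.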
