Experimentation Lab/Contextual attributes

From Wikitech

Contextual attributes are fields in the event data that provide information about the performer who triggered the event and the wiki where the event occurred. The values of contextual attributes that are included in the stream configuration are populated automatically by Metrics Platform when the event is generated. This page documents the contextual attributes that are supported by each Metrics Platform client.

For attribute descriptions, types and other constraints (like their maximum length), see the base schema definitions.

JavaScript

The JavaScript client supports these contextual attributes:[1]

Included automatically:[2]

  • agent_client_platform
  • agent_client_platform_family

Optional:

  • agent_ua_string
  • page_id
  • page_title
  • page_namespace_id
  • page_namespace_name
  • page_revision_id
  • page_wikidata_id
  • page_wikidata_qid
  • page_content_language
  • page_is_redirect
  • page_user_groups_allowed_to_move
  • page_user_groups_allowed_to_edit
  • mediawiki_skin
  • mediawiki_version
  • mediawiki_is_production
  • mediawiki_is_debug_mode
  • mediawiki_database
  • mediawiki_site_content_language
  • mediawiki_site_content_language_variant
  • performer_is_logged_in
  • performer_id
  • performer_name
  • performer_session_id
  • performer_active_browsing_session_token
  • performer_pageview_id
  • performer_groups
  • performer_is_bot
  • performer_is_temp
  • performer_language
  • performer_language_variant
  • performer_can_probably_edit_page
  • performer_edit_count
  • performer_edit_count_bucket
  • performer_registration_dt

PHP

The PHP client supports these contextual attributes:[3]

Included automatically:[4]

  • agent_client_platform
  • agent_client_platform_family

Optional:

  • agent_ua_string
  • page_id
  • page_title
  • page_namespace_id
  • page_namespace_name
  • page_revision_id
  • page_wikidata_id
  • page_wikidata_qid
  • page_content_language
  • page_is_redirect
  • page_user_groups_allowed_to_move
  • page_user_groups_allowed_to_edit
  • mediawiki_skin
  • mediawiki_version
  • mediawiki_is_production
  • mediawiki_is_debug_mode
  • mediawiki_database
  • mediawiki_site_content_language
  • mediawiki_site_content_language_variant
  • performer_is_logged_in
  • performer_id
  • performer_name
  • performer_groups
  • performer_is_bot
  • performer_is_temp
  • performer_language
  • performer_language_variant
  • performer_can_probably_edit_page
  • performer_edit_count
  • performer_edit_count_bucket
  • performer_registration_dt

Java

The Java client supports these contextual attributes:[5]

Included automatically:[6]

  • agent_app_flavor
  • agent_app_install_id
  • agent_app_theme
  • agent_app_version
  • agent_app_version_name
  • agent_client_platform
  • agent_client_platform_family
  • agent_device_family
  • agent_device_language
  • agent_release_status

Optional:

  • mediawiki_database
  • page_id
  • page_title
  • page_namespace_id
  • page_namespace_name
  • page_revision_id
  • page_wikidata_qid
  • page_content_language
  • performer_id
  • performer_name
  • performer_is_logged_in
  • performer_is_temp
  • performer_session_id
  • performer_pageview_id
  • performer_groups
  • performer_language_groups
  • performer_language_primary
  • performer_registration_dt

Swift

The Swift client supports these contextual attributes:[7]

Included automatically:[8]

  • agent_app_install_id
  • agent_client_platform
  • agent_client_platform_family

Optional:

  • page_id
  • page_title
  • page_namespace
  • page_namespace_name
  • page_revision_id
  • page_wikidata_id
  • page_content_language
  • page_is_redirect
  • page_user_groups_allowed_to_edit
  • page_user_groups_allowed_to_move
  • mediawiki_skin
  • mediawiki_version
  • mediawiki_is_production
  • mediawiki_is_debug_mode
  • mediawiki_database
  • mediawiki_site_content_language
  • mediawiki_site_content_language_variant
  • performer_is_logged_in
  • performer_id
  • performer_name
  • performer_session_id
  • performer_pageview_id
  • performer_groups
  • performer_is_bot
  • performer_language
  • performer_language_variant
  • performer_can_probably_edit_page
  • performer_edit_count
  • performer_edit_count_bucket
  • performer_registration_dt

Enabling attributes

Optional attributes are collected only if they are enabled in the instrument's event stream configuration. To enable a contextual attribute, list the attribute name in the provide_values array under producers.metrics_platform_client. For more information, see the stream configuration guide.

For example, this stream configuration enables the collection of the page_id, page_title, performer_is_logged_in, and performer_is_temp contextual attributes:

ext-EventStreamConfig.php
// …
'mediawiki.interwiki_link_hover' => [
    'schema_title' => 'analytics/product_metrics/web/base',
    'destination_event_service' => 'eventgate-analytics-external',
    'producers' => [
        'metrics_platform_client' => [
            // Contextual attributes to add to the event before it is submitted to this stream.
            'provide_values' => [
                "page_id",
                "page_title",

                // We recommend collecting the performer_is_logged_in and performer_is_temp
                // attributes at the same time. See https://phabricator.wikimedia.org/T374812#10953216
                "performer_is_logged_in",
                "performer_is_temp",
            ],
        ],
    ],
],
// …
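Because the supported attributes differ between clients (for example, the PHP list on this page does not include performer_session_id), it can help to check a provide_values list before deploying it. The following is a minimal sketch in Python, not part of Metrics Platform: SUPPORTED, unsupported_attributes, and missing_login_pair are hypothetical names, and the sets are short excerpts of the lists on this page rather than the full lists.

```python
# Hypothetical helper, not part of Metrics Platform.
# Excerpts of the "Optional" lists on this page; extend them as needed.
SUPPORTED = {
    "javascript": {
        "page_id",
        "page_title",
        "performer_is_logged_in",
        "performer_is_temp",
        "performer_session_id",
        "performer_pageview_id",
    },
    "php": {
        "page_id",
        "page_title",
        "performer_is_logged_in",
        "performer_is_temp",
    },
}

# The two attributes that the example configuration recommends collecting together.
LOGIN_PAIR = {"performer_is_logged_in", "performer_is_temp"}


def unsupported_attributes(client, provide_values):
    """Return the requested attributes that the client does not list, in request order."""
    supported = SUPPORTED[client]
    return [name for name in provide_values if name not in supported]


def missing_login_pair(provide_values):
    """Return the attribute of the recommended pair that was left out, if only one was requested."""
    requested = LOGIN_PAIR & set(provide_values)
    if not requested:
        return []
    return sorted(LOGIN_PAIR - requested)
```

For instance, unsupported_attributes("php", ["page_id", "performer_session_id"]) flags performer_session_id, while the same list passes for "javascript".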

Privacy considerations

The Data Collection Guidelines outline best practices at the Wikimedia Foundation for managing privacy risk in data collection. Some criteria in the guidelines depend on the specific data that an instrument collects. Because contextual attributes are the main way to collect data with Experimentation Lab, the attributes that you enable can increase the risk level of the instrument.

The following combinations of contextual attributes increase the risk level of an instrument. An instrument that collects none of these combinations can be classified as Tier 3: Low risk.

Combination | Risk level
agent_ua_string + performer_id/performer_name/agent_app_install_id | Tier 2: Medium risk
page_id/page_title + performer_id/performer_name | Tier 2: Medium risk
page_id/page_title + agent_app_install_id | Tier 2: Medium risk (only if the end user is logged in)
agent_ua_string + performer_id/performer_name + page_id/page_title | Tier 1: High risk

When you register or modify an instrument in xLab, xLab validates your selection and gives advice based on the combinations above.
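The combinations in the table can be sketched as a small function. This is only an illustration in Python of how the table reads, not xLab's actual validation logic: risk_tier and its end_user_logged_in parameter are hypothetical, and how the logged-in condition is determined in practice is an assumption.

```python
# Hypothetical illustration of the table above; not xLab's implementation.
IDENTITY = {"performer_id", "performer_name"}
PAGE = {"page_id", "page_title"}


def risk_tier(attributes, end_user_logged_in=False):
    """Return 1 (high), 2 (medium), or 3 (low) for a set of contextual attributes."""
    attrs = set(attributes)
    has_ua = "agent_ua_string" in attrs
    has_identity = bool(attrs & IDENTITY)
    has_install_id = "agent_app_install_id" in attrs
    has_page = bool(attrs & PAGE)

    # Tier 1: user agent + performer identity + page, all together.
    if has_ua and has_identity and has_page:
        return 1
    # Tier 2: user agent with a performer identity or an app install ID.
    if has_ua and (has_identity or has_install_id):
        return 2
    # Tier 2: page with a performer identity.
    if has_page and has_identity:
        return 2
    # Tier 2: page with an app install ID, only when the end user is logged in.
    if has_page and has_install_id and end_user_logged_in:
        return 2
    # No listed combination applies.
    return 3
```

Under this reading, the attributes enabled in the example stream configuration on this page (page_id, page_title, performer_is_logged_in, performer_is_temp) match none of the listed combinations, so the function returns 3.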

A custom schema can collect additional attributes besides the contextual ones. The instrument owner should check whether those additional attributes increase the risk level of the instrument. xLab does not currently support this case.

See the Regulation section guide for details on how to complete that section when registering an instrument or experiment.

References

  1. gitlab:repos/data-engineering/metrics-platform/-/blob/main/js/src/Context.js
  2. gitlab:repos/data-engineering/metrics-platform/-/blob/main/js/src/ContextController.js
  3. gitlab:repos/data-engineering/metrics-platform/-/blob/main/php/src/StreamConfig/StreamConfig.php
  4. gitlab:repos/data-engineering/metrics-platform/-/blob/main/php/src/ContextController.php
  5. gitlab:repos/data-engineering/metrics-platform/-/blob/main/java/src/main/java/org/wikimedia/metrics_platform/context/ContextValue.java
  6. gitlab:repos/data-engineering/metrics-platform/-/blob/main/java/src/main/java/org/wikimedia/metrics_platform/ContextController.java
  7. gitlab:repos/data-engineering/metrics-platform/-/blob/main/swift/Sources/WikimediaMetricsPlatform/StreamConfig/ContextAttribute.swift
  8. gitlab:repos/data-engineering/metrics-platform/-/blob/main/swift/Sources/WikimediaMetricsPlatform/Context/ContextController.swift#L16