Experimentation Lab/Contextual attributes
Contextual attributes are fields in the event data that provide information about the performer who triggered the event and the wiki where the event occurred. The values of contextual attributes that are included in the stream configuration are populated automatically by Metrics Platform when the event is generated. This page documents the contextual attributes that are supported by each Metrics Platform client.
For attribute descriptions, types and other constraints (like their maximum length), see the base schema definitions.
JavaScript
The JavaScript client supports these contextual attributes:[1]
Included automatically:[2]
agent_client_platform
agent_client_platform_family
Optional:
agent_ua_string
page_id
page_title
page_namespace_id
page_namespace_name
page_revision_id
page_wikidata_id
page_wikidata_qid
page_content_language
page_is_redirect
page_user_groups_allowed_to_move
page_user_groups_allowed_to_edit
mediawiki_skin
mediawiki_version
mediawiki_is_production
mediawiki_is_debug_mode
mediawiki_database
mediawiki_site_content_language
mediawiki_site_content_language_variant
performer_is_logged_in
performer_id
performer_name
performer_session_id
performer_active_browsing_session_token
performer_pageview_id
performer_groups
performer_is_bot
performer_is_temp
performer_language
performer_language_variant
performer_can_probably_edit_page
performer_edit_count
performer_edit_count_bucket
performer_registration_dt
PHP
The PHP client supports these contextual attributes:[3]
Included automatically:[4]
agent_client_platform
agent_client_platform_family
Optional:
agent_ua_string
page_id
page_title
page_namespace_id
page_namespace_name
page_revision_id
page_wikidata_id
page_wikidata_qid
page_content_language
page_is_redirect
page_user_groups_allowed_to_move
page_user_groups_allowed_to_edit
mediawiki_skin
mediawiki_version
mediawiki_is_production
mediawiki_is_debug_mode
mediawiki_database
mediawiki_site_content_language
mediawiki_site_content_language_variant
performer_is_logged_in
performer_id
performer_name
performer_groups
performer_is_bot
performer_is_temp
performer_language
performer_language_variant
performer_can_probably_edit_page
performer_edit_count
performer_edit_count_bucket
performer_registration_dt
Java
The Java client supports these contextual attributes:[5]
Included automatically:[6]
agent_app_flavor
agent_app_install_id
agent_app_theme
agent_app_version
agent_app_version_name
agent_client_platform
agent_client_platform_family
agent_device_family
agent_device_language
agent_release_status
Optional:
mediawiki_database
page_id
page_title
page_namespace_id
page_namespace_name
page_revision_id
page_wikidata_qid
page_content_language
performer_id
performer_name
performer_is_logged_in
performer_is_temp
performer_session_id
performer_pageview_id
performer_groups
performer_language_groups
performer_language_primary
performer_registration_dt
Swift
The Swift client supports these contextual attributes:[7]
Included automatically:[8]
agent_app_install_id
agent_client_platform
agent_client_platform_family
Optional:
page_id
page_title
page_namespace
page_namespace_name
page_revision_id
page_wikidata_id
page_content_language
page_is_redirect
page_user_groups_allowed_to_edit
page_user_groups_allowed_to_move
mediawiki_skin
mediawiki_version
mediawiki_is_production
mediawiki_is_debug_mode
mediawiki_database
mediawiki_site_content_language
mediawiki_site_content_language_variant
performer_is_logged_in
performer_id
performer_name
performer_session_id
performer_pageview_id
performer_groups
performer_is_bot
performer_language
performer_language_variant
performer_can_probably_edit_page
performer_edit_count
performer_edit_count_bucket
performer_registration_dt
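To compare the automatically included attributes across clients, the lists above can be written down as sets. The following is a minimal Python sketch, not part of any Metrics Platform client: the names AUTOMATIC_ATTRIBUTES and common_automatic_attributes are hypothetical, and the data is copied by hand from the lists on this page, so it may drift from the client source code.

```python
# Automatically included contextual attributes, copied from the lists above.
# AUTOMATIC_ATTRIBUTES is a hypothetical name used only for this illustration.
AUTOMATIC_ATTRIBUTES = {
    "JavaScript": {"agent_client_platform", "agent_client_platform_family"},
    "PHP": {"agent_client_platform", "agent_client_platform_family"},
    "Java": {
        "agent_app_flavor",
        "agent_app_install_id",
        "agent_app_theme",
        "agent_app_version",
        "agent_app_version_name",
        "agent_client_platform",
        "agent_client_platform_family",
        "agent_device_family",
        "agent_device_language",
        "agent_release_status",
    },
    "Swift": {
        "agent_app_install_id",
        "agent_client_platform",
        "agent_client_platform_family",
    },
}


def common_automatic_attributes(clients=AUTOMATIC_ATTRIBUTES):
    """Return the attributes that every listed client includes automatically."""
    return set.intersection(*clients.values())


if __name__ == "__main__":
    print(sorted(common_automatic_attributes()))
```

Based on the lists on this page, only agent_client_platform and agent_client_platform_family are included automatically by all four clients.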
Enabling attributes
All attributes must be enabled in the instrument's event stream configuration. To enable a contextual attribute, list the attribute name in the provide_values array. For more information, see the stream configuration guide.
For example, this stream configuration enables the collection of the page_id, page_title, performer_is_logged_in, and performer_is_temp contextual attributes:
// …
'mediawiki.interwiki_link_hover' => [
    'schema_title' => 'analytics/product_metrics/web/base',
    'destination_event_service' => 'eventgate-analytics-external',
    'producers' => [
        'metrics_platform_client' => [
            // Contextual attributes to add to the event before it is submitted to this stream.
            'provide_values' => [
                "page_id",
                "page_title",
                // We recommend collecting the performer_is_logged_in and performer_is_temp
                // attributes at the same time. See https://phabricator.wikimedia.org/T374812#10953216
                "performer_is_logged_in",
                "performer_is_temp",
            ],
        ],
    ],
],
// …
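The comment in the configuration above recommends collecting performer_is_logged_in and performer_is_temp together. As an illustration only, a check like the following Python sketch could flag a provide_values list that requests just one of the two. The helper missing_pair_members and the constant RECOMMENDED_PAIR are hypothetical names; nothing on this page indicates that Metrics Platform or xLab provides such a function.

```python
# Hypothetical helper: not part of Metrics Platform or xLab.
# The pair comes from the recommendation in the example configuration above.
RECOMMENDED_PAIR = ("performer_is_logged_in", "performer_is_temp")


def missing_pair_members(provide_values):
    """Return the members of RECOMMENDED_PAIR that are missing when only
    part of the pair is requested; return an empty list otherwise."""
    requested = set(provide_values)
    present = [name for name in RECOMMENDED_PAIR if name in requested]
    # Nothing to report if neither attribute or both attributes are requested.
    if not present or len(present) == len(RECOMMENDED_PAIR):
        return []
    return [name for name in RECOMMENDED_PAIR if name not in requested]


if __name__ == "__main__":
    # Only half of the pair is requested, so the other half is reported.
    print(missing_pair_members(["page_id", "performer_is_logged_in"]))
```

For the example configuration above, which lists both attributes, this sketch reports nothing missing.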
Privacy considerations
The Data Collection Guidelines outline best practices at the Wikimedia Foundation for managing privacy risk in data collection. Some criteria in these guidelines depend on the specific data that an instrument collects, so the attributes collected can increase the instrument's risk level. Because contextual attributes are the main way to collect data when using Experimentation Lab, the contextual attributes you enable can affect the risk level of your instrument.
The following combinations of contextual attributes increase the risk level of an instrument. For any other combination, the risk level of the instrument can be defined as Tier 3: Low risk.
Combination | Risk level |
---|---|
agent_ua_string + performer_id/performer_name/agent_app_install_id | Tier 2: Medium risk |
page_id/page_title + performer_id/performer_name | Tier 2: Medium risk |
page_id/page_title + agent_app_install_id | Tier 2: Medium risk (only if end-user is logged-in) |
agent_ua_string + performer_id/performer_name + page_id/page_title | Tier 1: High risk |
When you register or modify your instrument in xLab, it validates the contextual attributes and gives advice based on the combinations above.
When using a custom schema, an instrument can collect additional attributes beyond the contextual ones. The instrument owner should check whether those additional attributes increase the risk level of the instrument. xLab does not currently support this case.
For more details about how to complete that section when registering an instrument or experiment, see the Regulation section guide.
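The combinations in the table can be read as a small set of rules. The following Python sketch is one possible encoding of them, shown only to make the table easier to reason about; it is not xLab's validation logic. The function risk_tier and the end_user_logged_in parameter are hypothetical, and modelling the "only if end-user is logged-in" condition as a boolean flag is an assumption.

```python
# Hypothetical encoding of the risk table above; not xLab's actual validation.
PAGE = {"page_id", "page_title"}
PERFORMER = {"performer_id", "performer_name"}
UA = "agent_ua_string"
INSTALL_ID = "agent_app_install_id"


def risk_tier(provide_values, end_user_logged_in=False):
    """Return 1 (high), 2 (medium) or 3 (low) for a list of contextual attributes.

    end_user_logged_in is an assumed flag for the table row that applies
    only if the end-user is logged in.
    """
    attrs = set(provide_values)
    has_ua = UA in attrs
    has_page = bool(attrs & PAGE)
    has_performer = bool(attrs & PERFORMER)
    has_install_id = INSTALL_ID in attrs

    # Check the high-risk combination first, because it also matches Tier 2 rows.
    if has_ua and has_performer and has_page:
        return 1
    if has_ua and (has_performer or has_install_id):
        return 2
    if has_page and has_performer:
        return 2
    if has_page and has_install_id and end_user_logged_in:
        return 2
    # Any combination not listed in the table is treated as low risk.
    return 3


if __name__ == "__main__":
    print(risk_tier(["page_id", "page_title",
                     "performer_is_logged_in", "performer_is_temp"]))
```

Under this sketch, the attributes enabled in the example stream configuration on this page (page_id, page_title, performer_is_logged_in, performer_is_temp) match none of the listed combinations and so fall into Tier 3.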
References
- [1] gitlab:repos/data-engineering/metrics-platform/-/blob/main/js/src/Context.js
- [2] gitlab:repos/data-engineering/metrics-platform/-/blob/main/js/src/ContextController.js
- [3] gitlab:repos/data-engineering/metrics-platform/-/blob/main/php/src/StreamConfig/StreamConfig.php
- [4] gitlab:repos/data-engineering/metrics-platform/-/blob/main/php/src/ContextController.php
- [5] gitlab:repos/data-engineering/metrics-platform/-/blob/main/java/src/main/java/org/wikimedia/metrics_platform/context/ContextValue.java
- [6] gitlab:repos/data-engineering/metrics-platform/-/blob/main/java/src/main/java/org/wikimedia/metrics_platform/ContextController.java
- [7] gitlab:repos/data-engineering/metrics-platform/-/blob/main/swift/Sources/WikimediaMetricsPlatform/StreamConfig/ContextAttribute.swift
- [8] gitlab:repos/data-engineering/metrics-platform/-/blob/main/swift/Sources/WikimediaMetricsPlatform/Context/ContextController.swift#L16