Skip to main content

Content Data app

The Content Data app summarises and displays key statistics on the performance of pages on www.gov.uk.

The app mostly relies on GA4 data and is populated via a scheduled query and table in BigQuery, detailed below.

Access

The Content Data app itself requires a Signon account to access.

The data which populates the Content data app is stored in BigQuery in the govuk-content-data project. Access to this project and dataset is currently limited to individuals that are working on the project.

Location

The table directly connected to the tool can be found in BigQuery under the data set govuk-content-data.ga4.GA4 dataform. This table can be found in BigQuery under the data set govuk-content-data.ga4.ga4_two_years.

For more information on our Google Cloud Platform projects, see our GCP project documentation.

Schema

field name type mode
the_date STRING NULLABLE
cleaned_page_location STRING NULLABLE
unique_page_views INTEGER NULLABLE
total_page_views INTEGER NULLABLE
exits INTEGER NULLABLE
entrances INTEGER NULLABLE
total_searches INTEGER NULLABLE
unique_search_sessions INTEGER NULLABLE
unique_searchterms INTEGER NULLABLE
useful_yes INTEGER NULLABLE
useful_no INTEGER NULLABLE
session_starts INTEGER NULLABLE
session_engaged INTEGER NULLABLE

Code for pulling in GA4 data

Presently, the table which populates the Content Data app is generated via a scheduled query. The code for this query can be found below.

WITH
  CTE1 AS (
  SELECT
    user_pseudo_id,
    (
    SELECT
      value.string_value
    FROM
      UNNEST(event_params)
    WHERE
      KEY = 'search_term') AS search_term,
    SPLIT(SPLIT(REGEXP_REPLACE((
          SELECT
            value.string_value
          FROM
            UNNEST(event_params)
          WHERE
            KEY = 'page_location'), 'https://www.gov.uk', ''),'?')[SAFE_OFFSET(0)],'#')[SAFE_OFFSET(0)] AS cleaned_page_location,
    (
    SELECT
      value.string_value
    FROM
      UNNEST(event_params)
    WHERE
      KEY = 'ui_text') AS ui_text,
    CONCAT(user_pseudo_id, (
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_id')) AS unique_session_id,
    (
    SELECT
      value.int_value
    FROM
      UNNEST(event_params)
    WHERE
      KEY = 'entrances') AS entrances,
    event_name,
    event_timestamp,
    FORMAT_DATE('%Y-%m-%d', PARSE_DATE('%Y%m%d', event_date)) AS the_date,
    IFNULL((CAST((
          SELECT
            value.int_value
          FROM
            UNNEST(event_params)
          WHERE
            KEY = "session_engaged") AS STRING)),(
      SELECT
        value.string_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = "session_engaged")) AS session_engaged,
  IF
    (LEAD(event_name) OVER (PARTITION BY CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params)
          WHERE
            KEY = 'ga_session_id'))
      ORDER BY
        event_timestamp) IS NULL, 1, NULL) AS isExit,
  FROM
    `ga4-analytics-352613.analytics_330577055.events_*`
  WHERE
    _table_suffix = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) ),
  CTE2 AS (
  SELECT
    the_date,
    cleaned_page_location,
    COUNT(DISTINCT
      CASE
        WHEN event_name = 'page_view' THEN unique_session_ID
    END
      ) AS unique_page_views,
    COUNT(CASE
        WHEN event_name = 'page_view' THEN unique_session_ID
    END
      ) AS total_page_views,
    COUNT(DISTINCT
      CASE
        WHEN isexit = 1 THEN unique_session_ID
    END
      ) AS exits,
    COUNT(DISTINCT
      CASE
        WHEN entrances = 1 THEN unique_session_ID
    END
      ) AS entrances,
    COUNT(CASE
        WHEN event_name = 'search' THEN search_term
    END
      ) AS total_searches,
    COUNT(DISTINCT
      CASE
        WHEN event_name = 'search' THEN unique_session_id
    END
      ) AS unique_search_sessions,
    COUNT(DISTINCT
      CASE
        WHEN event_name = 'search' THEN search_term
    END
      ) AS unique_searchterms,
    COUNT(CASE
        WHEN event_name = 'form_submit' AND ui_text = 'Yes' THEN unique_session_id
    END
      ) AS useful_yes,
    COUNT(CASE
        WHEN event_name = 'form_submit' AND ui_text = 'No' THEN unique_session_id
    END
      ) AS useful_no,
    COUNT(DISTINCT
      CASE
        WHEN event_name = 'session_start' THEN unique_session_iD
    END
      ) AS session_starts,
    COUNT(DISTINCT
      CASE
        WHEN session_engaged = '1' THEN unique_session_id
    END
      ) AS session_engaged
  FROM
    CTE1
  GROUP BY
    cleaned_page_location,
    the_date )
SELECT
  *
FROM
  CTE2
This page was last reviewed on 16 May 2024. It needs to be reviewed again on 16 November 2024 .
This page was set to be reviewed before 16 November 2024. This might mean the content is out of date.