Skip to main content

Use the historic analytics (UA) data

This page is a work in progress.

Before Google Analytics 4, we used Universal Analytics (UA) to collect data on the usage of various government sites, including GOV.UK.

More information on the data sources themselves can be found on the historic UA data sources page.

Get access to UA data

For information on how to get access to the historic UA data, please see the historic data access policy.

How to use the UA data

The historic UA data can be queried directly in BigQuery, or it can be used in various reporting or data visualisation tools.

Whether you query it directly in BigQuery or use the data in a reporting tool, you will need a Google Cloud Platform project with billing enabled through which you can query the data. If you are GDS staff, please contact the #data-engineering community who will be able to provide you with a Google Cloud Platform billing project ID. If you work for another public sector organisation, your department or organisation will have to provide you with this ID. We are investigating potential solutions for public sector organisations who do not have access to Google Cloud Platform, so please do get in touch with the Analytics team via the GA4 support inbox if you have any issues.

If you are using the GOV.UK historic data, we recommend you use the flattened dataset as this is cheaper and easier to query.

Use the UA data in Looker Studio

The UA data stored in BigQuery can be used in Looker Studio.

To connect to the UA data, we recommend you use a custom query. You can do this by:

  1. Clicking to add a data source
  2. In the ‘Connect to data’ panel, selecting ‘BigQuery’
  3. In the left-hand menu, selecting ‘Custom query’
  4. Entering your Google Cloud Platform billing project ID
  5. Entering a query to select the fields needed using a WHERE statement to define the date range required
  6. Tick the checkbox to ‘enable date range parameters’ if needed
  7. Selecting ‘Add’

An example of a generic SQL query that could be used in step 5 above is:

SELECT * 
FROM `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE

This query selects all fields from the GOV.UK UA flattened tables and returns all the data for those fields between the dates you have selected in your Looker Studio report.

Sample queries

Below are some sample queries which may help you in exploring the historic data.

‘All pages’ UA report query

The below query can be used to replicate the ‘All Pages’ report in the ‘Site Content’ section of the Behaviour reports in the Universal Analytics interface. This again uses the GOV.UK UA flattened dataset and Looker Studio date parameters, although these could easily be switched for dates in the ‘YYYYMMDD’ format.

#Query to replicate the UA All Pages report
#Page views and unique page views
WITH
PVS AS (
SELECT
  date,
  pagePath,
  COUNT(*) AS pageviews,
  COUNT(DISTINCT session_id) AS unique_pageviews
FROM (
  SELECT
    date,
    pagePath,
    CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) AS session_id
  FROM
    `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
  WHERE
    _table_suffix BETWEEN @DS_START_DATE
    AND @DS_END_DATE
    AND type = 'PAGE')
GROUP BY
  pagePath,
  date
ORDER BY
  pageviews DESC ),
#Average time on page
ATOP AS (
SELECT
  date,
  pagePath,
  pageviews,
  exits,
  total_time_on_page,
  CASE
    WHEN pageviews = exits THEN 0
    ELSE total_time_on_page / (pageviews - exits)
END
  AS avg_time_on_page
FROM (
  SELECT
    date,
    pagePath,
    COUNT(*) AS pageviews,
    SUM(
    IF
      (isExit IS NOT NULL, 1, 0)) AS exits,
    SUM(time_on_page) AS total_time_on_page
  FROM (
    SELECT
      date,
      fullVisitorId,
      visitStartTime,
      pagePath,
      hit_time,
      type,
      isExit,
      CASE
        WHEN isExit IS NOT NULL THEN last_interaction - hit_time
        ELSE next_pageview - hit_time
    END
      AS time_on_page
    FROM (
      SELECT
        date,
        fullVisitorId,
        visitStartTime,
        pagePath,
        hit_time,
        type,
        isExit,
        last_interaction,
        LEAD(hit_time) OVER (PARTITION BY fullVisitorId, visitStartTime ORDER BY hit_time) AS next_pageview
      FROM (
        SELECT
          date,
          fullVisitorId,
          visitStartTime,
          pagePath,
          hit_time,
          type,
          isExit,
          last_interaction
        FROM (
          SELECT
            date,
            fullVisitorId,
            visitStartTime,
            pagePath,
            type,
            isExit,
            time / 1000 AS hit_time,
            MAX(
            IF
              (isInteraction IS NOT NULL, time / 1000, 0)) OVER (PARTITION BY fullVisitorId, visitStartTime) AS last_interaction
          FROM
            `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
          WHERE
            _table_suffix BETWEEN @DS_START_DATE
            AND @DS_END_DATE
            AND type = 'PAGE'))
      GROUP BY
        date,
        pagePath,
        fullVisitorId,
        visitStartTime,
        hit_time,
        type,
        isExit,
        last_interaction )
    GROUP BY
      date,
      pagePath,
      fullVisitorId,
      visitStartTime,
      hit_time,
      type,
      isExit,
      last_interaction,
      next_pageview)
  GROUP BY
    pagePath,
    date
  ORDER BY
    pageviews DESC ) ),
#Entrances
ENTRANCES AS (
SELECT
  date,
  pagePath,
  SUM(entrances) AS entrances
FROM (
  SELECT
    date,
    pagePath,
    CASE
      WHEN isEntrance IS NOT NULL THEN 1
      ELSE 0
  END
    AS entrances
  FROM
    `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
  WHERE
    _table_suffix BETWEEN @DS_START_DATE
    AND @DS_END_DATE)
GROUP BY
  pagePath,
  date
ORDER BY
  entrances DESC ),
#Bounce rate
BOUNCES AS (
SELECT
  date,
  pagePath,
  bounces,
  sessions,
  CASE
    WHEN sessions = 0 THEN 0
    ELSE bounces / sessions
END
  AS bounce_rate
FROM (
  SELECT
    date,
    pagePath,
    SUM(bounces) AS bounces,
    SUM(sessions) AS sessions
  FROM (
    SELECT
      date,
      fullVisitorId,
      visitStartTime,
      pagePath,
      CASE
        WHEN hitNumber = first_interaction THEN bounces
        ELSE 0
    END
      AS bounces,
      CASE
        WHEN hitNumber = first_hit THEN visits
        ELSE 0
    END
      AS sessions
    FROM (
      SELECT
        date,
        fullVisitorId,
        visitStartTime,
        pagePath,
        bounces,
        visits,
        hitNumber,
        MIN(
        IF
          (isInteraction IS NOT NULL, hitNumber, 0)) OVER (PARTITION BY fullVisitorId, visitStartTime) AS first_interaction,
        MIN(hitNumber) OVER (PARTITION BY fullVisitorId, visitStartTime) AS first_hit
      FROM
        `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
      WHERE
        _table_suffix BETWEEN @DS_START_DATE
        AND @DS_END_DATE
      GROUP BY
        fullVisitorId,
        visitStartTime,
        date,
        pagePath,
        bounces,
        visits,
        hitNumber,
        isInteraction)
    ORDER BY
      sessions DESC )
  GROUP BY
    pagePath,
    date) ),
#Exits and exit rate
EXITS AS (
SELECT
  date,
  pagePath,
  pageviews,
  exits,
  CASE
    WHEN pageviews = 0 THEN 0
    ELSE exits / pageviews
END
  AS exit_rate
FROM (
  SELECT
    date,
    pagePath,
    COUNT(*) AS pageviews,
    SUM(exits) AS exits
  FROM (
    SELECT
      date,
      pagePath,
      CASE
        WHEN isExit IS NOT NULL THEN 1
        ELSE 0
    END
      AS exits
    FROM
      `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
    WHERE
      type = 'PAGE'
      AND _table_suffix BETWEEN @DS_START_DATE
      AND @DS_END_DATE)
  GROUP BY
    pagePath,
    date))
#Joining everything together
SELECT
PVS.date,
PVS.pagePath AS page_path,
PVS.pageviews,
PVS.unique_pageviews,
ATOP.avg_time_on_page,
ATOP.total_time_on_page,
ATOP.exits,
ENTRANCES.entrances,
BOUNCES.bounces,
BOUNCES.sessions,
BOUNCES.bounce_rate,
EXITS.exit_rate
FROM
PVS
JOIN
ATOP
ON
PVS.pagePath = ATOP.pagePath
AND PVS.date = ATOP.date
JOIN
EXITS
ON
PVS.pagePath = EXITS.pagePath
AND PVS.date = EXITS.date
JOIN
ENTRANCES
ON
PVS.pagePath = ENTRANCES.pagePath
AND PVS.date = ENTRANCES.date
JOIN
BOUNCES
ON
PVS.pagePath = BOUNCES.pagePath
AND PVS.date = BOUNCES.date
ORDER BY
date DESC,
pageviews DESC,
unique_pageviews DESC

‘Top Events’ UA report query

The below query can be used to replicate the ‘Top Events’ report in the ‘Events’ section of the Behaviour reports in the Universal Analytics interface. This uses the GOV.UK UA raw dataset, but this could easily be switched to use any raw UA dataset by changing the dataset ID.

This query has the dates written in, although these could be switched for Looker Studio date parameters if used in a Looker Studio custom query (see the above queries for examples of this).

SELECT
  #Commenting out the date as it is not needed in the 'Top events' table, although it would be required for a chart of events over time
  #date,
  hits.eventInfo.eventCategory AS event_category,
  hits.eventInfo.eventAction AS event_action,
  hits.eventInfo.eventLabel AS event_label,
  COUNT(*) AS total_events,
  COUNT(DISTINCT CONCAT(CAST(fullVisitorId AS string), CAST(visitStartTime AS string))) AS unique_events,
FROM
  `govuk-bigquery-analytics.87773428.ga_sessions_*`,
  UNNEST(HITS) AS HITS
WHERE
  _table_suffix BETWEEN '20240601'
  AND '20240603'
  AND totals.visits = 1
  AND hits.type = 'EVENT'
  AND hits.eventInfo.eventCategory IS NOT NULL
GROUP BY
  #date,
  event_category,
  event_action,
  event_label
ORDER BY
  total_events DESC

Other resources

The below are some resources that may help those trying to find other bits of information in the UA data:

This page was last reviewed on 13 August 2024. It needs to be reviewed again on 13 February 2025 .
This page was set to be reviewed before 13 February 2025. This might mean the content is out of date.