Use the historic analytics (UA) data
Before Google Analytics 4, we used Universal Analytics (UA) to collect data on the usage of various government sites, including GOV.UK.
More information on the data sources themselves can be found on the historic UA data sources page.
Get access to UA data
For information on how to get access to the historic UA data, please see the historic data access policy.
How to use the UA data
The historic UA data can be queried directly in BigQuery, or it can be used in various reporting or data visualisation tools.
Whether you query it directly in BigQuery or use the data in a reporting tool, you will need a Google Cloud Platform project with billing enabled through which you can query the data. If you are GDS staff, please contact the #data-engineering community who will be able to provide you with a Google Cloud Platform billing project ID. If you work for another public sector organisation, your department or organisation will have to provide you with this ID. We are investigating potential solutions for public sector organisations who do not have access to Google Cloud Platform, so please do get in touch with the Analytics team via the GA4 support inbox if you have any issues.
If you are using the GOV.UK historic data, we recommend you use the flattened dataset as this is cheaper and easier to query.
Use the UA data in Looker Studio
The UA data stored in BigQuery can be used in Looker Studio.
To connect to the UA data, we recommend you use a custom query. You can do this by:
- Clicking to add a data source
- In the ‘Connect to data’ panel, selecting ‘BigQuery’
- In the left-hand menu, selecting ‘Custom query’
- Entering your Google Cloud Platform billing project ID
- Entering a query to select the fields needed using a WHERE statement to define the date range required
- Tick the checkbox to ‘enable date range parameters’ if needed
- Selecting ‘Add’
An example of a generic SQL query that could be used in step 5 above is:
SELECT *
FROM `govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE
This query selects all fields from the GOV.UK UA flattened tables and returns all the data for those fields between the dates you have selected in your Looker Studio report.
Sample queries
Below are some sample queries which may help you in exploring the historic data.
‘All pages’ UA report query
The below query can be used to replicate the ‘All Pages’ report in the ‘Site Content’ section of the Behaviour reports in the Universal Analytics interface. This again uses the GOV.UK UA flattened dataset and Looker Studio date parameters, although these could easily be switched for dates in the ‘YYYYMMDD’ format.
#Query to replicate the UA All Pages report
#Page views and unique page views
WITH
PVS AS (
SELECT
date,
pagePath,
COUNT(*) AS pageviews,
COUNT(DISTINCT session_id) AS unique_pageviews
FROM (
SELECT
date,
pagePath,
CONCAT(fullVisitorId, CAST(visitStartTime AS STRING)) AS session_id
FROM
`govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE
_table_suffix BETWEEN @DS_START_DATE
AND @DS_END_DATE
AND type = 'PAGE')
GROUP BY
pagePath,
date
ORDER BY
pageviews DESC ),
#Average time on page
ATOP AS (
SELECT
date,
pagePath,
pageviews,
exits,
total_time_on_page,
CASE
WHEN pageviews = exits THEN 0
ELSE total_time_on_page / (pageviews - exits)
END
AS avg_time_on_page
FROM (
SELECT
date,
pagePath,
COUNT(*) AS pageviews,
SUM(
IF
(isExit IS NOT NULL, 1, 0)) AS exits,
SUM(time_on_page) AS total_time_on_page
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
hit_time,
type,
isExit,
CASE
WHEN isExit IS NOT NULL THEN last_interaction - hit_time
ELSE next_pageview - hit_time
END
AS time_on_page
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
hit_time,
type,
isExit,
last_interaction,
LEAD(hit_time) OVER (PARTITION BY fullVisitorId, visitStartTime ORDER BY hit_time) AS next_pageview
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
hit_time,
type,
isExit,
last_interaction
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
type,
isExit,
time / 1000 AS hit_time,
MAX(
IF
(isInteraction IS NOT NULL, time / 1000, 0)) OVER (PARTITION BY fullVisitorId, visitStartTime) AS last_interaction
FROM
`govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE
_table_suffix BETWEEN @DS_START_DATE
AND @DS_END_DATE
AND type = 'PAGE'))
GROUP BY
date,
pagePath,
fullVisitorId,
visitStartTime,
hit_time,
type,
isExit,
last_interaction )
GROUP BY
date,
pagePath,
fullVisitorId,
visitStartTime,
hit_time,
type,
isExit,
last_interaction,
next_pageview)
GROUP BY
pagePath,
date
ORDER BY
pageviews DESC ) ),
#Entrances
ENTRANCES AS (
SELECT
date,
pagePath,
SUM(entrances) AS entrances
FROM (
SELECT
date,
pagePath,
CASE
WHEN isEntrance IS NOT NULL THEN 1
ELSE 0
END
AS entrances
FROM
`govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE
_table_suffix BETWEEN @DS_START_DATE
AND @DS_END_DATE)
GROUP BY
pagePath,
date
ORDER BY
entrances DESC ),
#Bounce rate
BOUNCES AS (
SELECT
date,
pagePath,
bounces,
sessions,
CASE
WHEN sessions = 0 THEN 0
ELSE bounces / sessions
END
AS bounce_rate
FROM (
SELECT
date,
pagePath,
SUM(bounces) AS bounces,
SUM(sessions) AS sessions
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
CASE
WHEN hitNumber = first_interaction THEN bounces
ELSE 0
END
AS bounces,
CASE
WHEN hitNumber = first_hit THEN visits
ELSE 0
END
AS sessions
FROM (
SELECT
date,
fullVisitorId,
visitStartTime,
pagePath,
bounces,
visits,
hitNumber,
MIN(
IF
(isInteraction IS NOT NULL, hitNumber, 0)) OVER (PARTITION BY fullVisitorId, visitStartTime) AS first_interaction,
MIN(hitNumber) OVER (PARTITION BY fullVisitorId, visitStartTime) AS first_hit
FROM
`govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE
_table_suffix BETWEEN @DS_START_DATE
AND @DS_END_DATE
GROUP BY
fullVisitorId,
visitStartTime,
date,
pagePath,
bounces,
visits,
hitNumber,
isInteraction)
ORDER BY
sessions DESC )
GROUP BY
pagePath,
date) ),
#Exits and exit rate
EXITS AS (
SELECT
date,
pagePath,
pageviews,
exits,
CASE
WHEN pageviews = 0 THEN 0
ELSE exits / pageviews
END
AS exit_rate
FROM (
SELECT
date,
pagePath,
COUNT(*) AS pageviews,
SUM(exits) AS exits
FROM (
SELECT
date,
pagePath,
CASE
WHEN isExit IS NOT NULL THEN 1
ELSE 0
END
AS exits
FROM
`govuk-bigquery-analytics.FlattenedDailyData.flattened_daily_ga_data_*`
WHERE
type = 'PAGE'
AND _table_suffix BETWEEN @DS_START_DATE
AND @DS_END_DATE)
GROUP BY
pagePath,
date))
#Joining everything together
SELECT
PVS.date,
PVS.pagePath AS page_path,
PVS.pageviews,
PVS.unique_pageviews,
ATOP.avg_time_on_page,
ATOP.total_time_on_page,
ATOP.exits,
ENTRANCES.entrances,
BOUNCES.bounces,
BOUNCES.sessions,
BOUNCES.bounce_rate,
EXITS.exit_rate
FROM
PVS
JOIN
ATOP
ON
PVS.pagePath = ATOP.pagePath
AND PVS.date = ATOP.date
JOIN
EXITS
ON
PVS.pagePath = EXITS.pagePath
AND PVS.date = EXITS.date
JOIN
ENTRANCES
ON
PVS.pagePath = ENTRANCES.pagePath
AND PVS.date = ENTRANCES.date
JOIN
BOUNCES
ON
PVS.pagePath = BOUNCES.pagePath
AND PVS.date = BOUNCES.date
ORDER BY
date DESC,
pageviews DESC,
unique_pageviews DESC
‘Top Events’ UA report query
The below query can be used to replicate the ‘Top Events’ report in the ‘Events’ section of the Behaviour reports in the Universal Analytics interface. This uses the GOV.UK UA raw dataset, but this could easily be switched to use any raw UA dataset by changing the dataset ID.
This query has the dates written in, although these could be switched for Looker Studio date parameters if used in a Looker Studio custom query (see the above queries for examples of this).
SELECT
#Commenting out the date as it is not needed in the 'Top events' table, although it would be required for a chart of events over time
#date,
hits.eventInfo.eventCategory AS event_category,
hits.eventInfo.eventAction AS event_action,
hits.eventInfo.eventLabel AS event_label,
COUNT(*) AS total_events,
COUNT(DISTINCT CONCAT(CAST(fullVisitorId AS string), CAST(visitStartTime AS string))) AS unique_events,
FROM
`govuk-bigquery-analytics.87773428.ga_sessions_*`,
UNNEST(HITS) AS HITS
WHERE
_table_suffix BETWEEN '20240601'
AND '20240603'
AND totals.visits = 1
AND hits.type = 'EVENT'
AND hits.eventInfo.eventCategory IS NOT NULL
GROUP BY
#date,
event_category,
event_action,
event_label
ORDER BY
total_events DESC
Other resources
The below are some resources that may help those trying to find other bits of information in the UA data:
- Article on replicating the ‘Top Events’ UA reporting in BigQuery (with other resources on querying UA data on adjacent pages)