Skip to main content

Find things in the GOV.UK GA4 data

Google Analytics 4 (GA4) is used to collect data on the usage of GOV.UK. There are a variety of different things you can learn from the GA4 data about pages and users on GOV.UK.

This page provides some guidance on how to find commonly requested information in the GA4 data.

More information on the data source itself can be found on the GOV.UK GA4 data source page. Information on best practice accessing and using GA4 data can be found in the ‘Use the GA4 data’ section.

The BigQuery SQL examples provided on this page make use of the GOV.UK GA4 flattened partitioned dataset which is easier and more efficient to query than the raw nested data, so should be used for most analysis and reporting.

Page views

There is a report in the GA4 interface which shows views by page path. In explorations and in Looker Studio, you can get numbers of page views either using a count of page_view events or using the ‘views’ metric (which also includes screen views if you have an app).

A count of page_view events in GA4 should provide you with a similar number to the total pageviews figure in Universal Analytics (UA).

‘Unique pageviews’ no longer exists as a metric in GA4, but the ‘sessions’ metric is conceptually similar, telling you how many sessions contained a visit to this page. However, the definition of a session has changed in GA4, so the figures produced here will deviate significantly from those in UA.

Page views by page path

We recommend using ‘page path’ over the ‘page location’ dimension as ‘page location’ is compatible with fewer other dimensions.

The ‘page path’ is the part of the URL after the hostname (not including any query strings) of a page.

Method - using the views metric in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Page path and screen class’ dimension and the ‘Views’ metric
  2. If you would like to only look at page view data for certain pages, filter the data using the ‘Page path’ dimension to select the paths of the pages you are interested in. For example, if you were only interested in ‘browse’ pages on GOV.UK, you could filter down to Page path contains ‘/browse’
  3. Change the date range and sort the table however you like

Remember to check the data quality icons when using reports or explorations in GA, and to keep an eye open for the impact of our high cardinality data in Looker Studio.

For an example in Looker Studio, see the ‘Top pages’ table in this sample Looker Studio report.

Method in BigQuery

The following SQL gets a count of page views by page path (cleaned_page_location in the flattened dataset) for the 1st January 2024:

SELECT
  cleaned_page_location,
  COUNT(*) AS pageviews,
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  event_date BETWEEN '2024-01-01'
  AND '2024-01-01'
  #Filter to just page_view events
  AND event_name = 'page_view'
GROUP BY
  cleaned_page_location
ORDER BY
  pageviews DESC

Page views filtered to a certain publishing organisation

In GA4, there are two custom dimensions you can use to identify different organisations:

  1. Publishing organisations IDs, which uses the organisation codes. However, each document can be tagged to multiple organisations. These tags are added by publishers
  2. Primary publishing organisation. This is the long form official name of the organisation. Only one organisation can be added per document. The long text is used in this dimension to make reports more easily understandable to those who does not recognise the codes

Method - using the views metric in an exploration

Steps:

  1. Create a blank exploration, and add in the ‘Page path and screen class’ and ‘Primary publishing organsation’ dimensions, along with the ‘Views’ metric, using the ‘+’ buttons in the ‘Variables’ tab
  2. Add the ‘Page path and screen class’ dimension and the ‘Views’ metric to the table. You can do this by either dragging and dropping/selecting the ‘Page path’ dimension in the ‘Rows’ area in the ‘Settings’ tab, and doing the same to the ‘Views’ metric for the ‘Values’ area, or by just double-clicking on the dimension and metrics you want to use (they should automatically be assigned to be a row and a value in the table settings when you do this)
  3. Filter the data using the ‘Primary publishing organisation’ dimension to select the organisation you are interested in. For example, if you were only interested in Home Office pages on GOV.UK, you could filter down to ‘Primary publishing organisation’ exactly matches ‘Home Office’
  4. Change the date range and sort the table however you like

Remember to check the data quality icons when using reports or explorations in GA, and to keep an eye open for the impact of our high cardinality data in Looker Studio.

Link clicks are now tracked as events with the event name ‘navigation’. External links will be navigation events with the ‘outbound’ parameter/custom dimension set to ‘true’. Internal links (links to other pages on GOV.UK) will not have the parameter/custom dimension ‘outbound’ equalling ‘true’. Not all internal links have specific tracking on them.

These events should all have dimensions such as the ‘Link URL’, ‘Link text’ and so on which should help you identify which links have been clicked.

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Link text’ or ‘Link URL’ dimension by the ‘Event count’ metric
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘navigation’. This selects the navigation event which fires when the user clicks a link
  3. Filter the data by the ‘Outbound’ dimension to equals/contains ‘true’. This selects link clicks away from GOV.UK
  4. Filter the data by the ‘Page path and screen class’ dimension to equal the path of the specific page you want link clicks from. The ‘page path’ is the part of the URL after the hostname (not including any query strings) of a page, so by using it here we are selecting the page the user clicked the link on
  5. Change the date range and sort the table however you like

A basic exploration containing external link clicks from GOV.UK can be accessed here. See also the ‘External link clicks’ table in this sample Looker Studio report.

The following SQL returns external links clicked (and how many times they were clicked) on the ‘Child adoption’ page on 1st January 2024:

SELECT
  link_url,
  COUNT(*) AS events,
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  event_date BETWEEN '2024-01-01'
  AND '2024-01-01'
  #Filter to just navigation events
  AND event_name = 'navigation'
  #Filter to just external link clicks
  AND outbound = 'true'
  #Filter to just the page of interest
  AND cleaned_page_location = '/child-adoption'
GROUP BY
  link_url
ORDER BY
  events DESC

File downloads

File downloads will have the event name ‘file_download’.

File downloads of a given file

You can count downloads of a file, regardless of which page the file was downloaded on, by using the file_download event and Link URL.

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Link URL’ dimension by the ‘Event count’ metric. Alternatively, you could add in the ‘Page path and screen class’ dimension to show you which pages your file is downloaded from, or the ‘Link text’ dimension to show you which text is clicked to download your file
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘file_download’. This selects the file_download event which fires when the user clicks on a link to download a file
  3. Filter the data by the ‘Link URL’ dimension to equal the URL of the specific file you want to find downloads of
  4. Sort the table however you like

File downloads on a given page

You can find all file downloads from a given page by using the file_download event and filtering to the desired page path.

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Link text’ or ‘Link URL’ dimension by the ‘Event count’ metric
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘file_download’. This selects the file_download event which fires when the user clicks on a link to download a file
  3. Filter the data by the ‘Page path and screen class’ dimension to equal the path of the specific page you want downloads from. The ‘page path’ is the part of the URL after the hostname (not including any query strings) of a page, so by using it here we are selecting the page the user clicked the download link on
  4. Sort the table however you like

Method in BigQuery

The following SQL gets a count of file downloads for the different files (using the Link URL) on the ‘Apply for a postal vote (paper forms)’ page on 1st January 2024:

SELECT
  link_url,
  COUNT(*) AS file_downloads,
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  #Filter to the right date
  event_date BETWEEN '2024-01-01'
  AND '2024-01-01'
  #Filter to just file_download events
  AND event_name = 'file_download'
  #Filter to just the page of interest
  AND cleaned_page_location = '/government/publications/apply-for-a-postal-vote'
GROUP BY
  link_url
ORDER BY
  file_downloads DESC

Search terms

The ‘Search term’ (or search_term, if using the BigQuery data) dimension is sent with:

  • the ‘search’ event - when the user searches
  • the ‘view_item_list’ event (ecommerce tracking) - the view of the search results page
  • the ‘select_item’ event (ecommerce tracking) - when the user selects a search result
  • the ‘page_view’ event - only populated on page views of search results pages

These different events sent with the search term should enable you to analyse which terms users are searching for, and what users do after searching for various terms. The fact the search term is sent with many events does however mean you need to filter to the event you are interested in to make sure you are not accidentally double or triple counting the search terms used.

A basic exploration containing search terms used on GOV.UK can be found here.

Search terms used to get to a specific destination page

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Search term’ dimension by the ‘Event count’ metric.
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘select_item’. This selects the select_item event when a user clicks on a search result.
  3. Filter the data by the ‘Link URL’ dimension to equal the path of the specific destination page you want search terms to. The ‘Link URL’ is the link the user clicked on from the search results, so by using it here we are selecting the search destination page.
  4. Change the date range and sort the table however you like!

This should give you a table showing the search terms used to get to your given page.

If you are unfamiliar with filtering GA4 data, it may be helpful to include the dimensions in your table to check that your filters have worked correctly (you should only be able to see ‘select_item’ in the ‘Event name’ column and the specific page you want search terms from in the ‘Link URL’ column).

For an example in Looker Studio, see the ‘Searches leading to these pages’ table in this sample Looker Studio report.

Search terms from a specific page

There are a couple of different ways that this can be done with the GA4 data.

Method one - using the search event in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Search term’ dimension by the ‘Event count’ metric
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘search’. This selects the search event which fires when a user searches
  3. Filter the data by the ‘Page path and screen class’ dimension to equal the path of the specific page you want search terms from. The ‘page path’ is the part of the URL after the hostname (not including any query strings) of a page, so by using it here we are selecting the page the user searched on
  4. Change the date range and sort the table however you like

This should give you a table showing the search terms used from your given page and the number of times each was searched for (the count).

For an example in Looker Studio, see the ‘Top searches from these pages’ table in this sample Looker Studio report.

Method two - using the view_item_list (search results) event in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Search term’ dimension by the ‘Event count’ metric
  2. Filter the data by the ‘Event name’ dimension to equals/contains ‘view_item_list’. This selects the view_item_list event which fires on views of search results lists
  3. Filter the data by the ‘Page referrer’ dimension to equal the specific page you want search terms from. The ‘page referrer’ is the page which led the user to their present page, so by using it here we are selecting the page before the search results
  4. Change the date range and sort the table

This should give you a table showing the search terms used from your given page and the number of times each was searched for (the count).

If you are unfamiliar with filtering GA4 data, it may be helpful to include the ‘Event name’ and ‘Page referrer’ dimensions in your table to check that your filters have worked correctly. You should only be able to see ‘view_item_list’ in the ‘Event name’ column and the specific page you want search terms from in the ‘Page referrer’ column.

Method in BigQuery

The following SQL gets a count of search terms used in searches on the ‘Sign in to your Universal Credit account’ page on 1st January 2024:

SELECT
  search_term,
  COUNT(*) AS searches,
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  #Filter to the right date
  event_date BETWEEN '2024-01-01'
  AND '2024-01-01'
  #Filter to just search events
  AND event_name = 'search'
  #Filter to just the page of interest
  AND cleaned_page_location = '/sign-in-universal-credit'
GROUP BY
  search_term
ORDER BY
  searches DESC

Users’ journeys

There are different ways you can investigate users’ journeys.

In the interface, you can look at users’ page locations and page referrers.

You can also use funnel explorations to examine users’ flow through pre-defined steps. Google have published some information on how to use funnel explorations here.

Using the BigQuery data, however, you can view users’ journeys in much more detail.

Using the custom timestamp

In the absence of a hit number sent with each GA4 event, we implemented a custom timestamp which can be used to order a user’s events.

We implemented this in January 2024, so this cannot be used to order events before then.

Method in BigQuery

The following SQL gets all page views on 26th August 2024, with a calculated ‘session ID’ and ‘hit number’, ordered by the session ID and custom timestamp:

SELECT
  #Create a session ID
  CONCAT(user_pseudo_id,"-", ga_sessionid) AS session_id,
  page_location,
  timestamp,
  #Calculate a 'hit number' based on the timestamp
  ROW_NUMBER() OVER (PARTITION BY CONCAT(user_pseudo_id, "-", ga_sessionid) ORDER BY timestamp ASC) AS hit_number,
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  #Page views only
  event_name="page_view"
  AND event_date = '2024-08-26'
ORDER BY
  session_id,
  timestamp

The result of this query could be further manipulated to enable analysis of users’ journeys.

Using the built-in batch_event_index

Google added batch ordering IDs to the GA4 dataset in mid-July 2024.

Google provide some sample code indicating how you can use these IDs to order users’ events in their guidance.

Engagement

There are a variety of metrics available to help you look into user engagement with content on GOV.UK.

User engagement time

The ‘User engagement’ and ‘Average engagement time’ metrics help you to understand how long a user has spent with your web page in focus.

This engaged time is calculated using the user_engagement events which fire any time the user changes page or ‘focuses away from the page’. Google compares the times that user_engagement events are sent with page load times to calculate the amount of time a user has been ‘engaged’ with a given page. Google’s documentation on user engagement contains further information.

Engaged sessions

Engaged sessions are all sessions that either last longer than 10 seconds, have more than one screen or page view, or contain a conversion event (ordinary events do not count and as of July 2024 we do not yet have any conversion events set up).

The ‘Engaged sessions’ should only be used with the ‘Landing page + query string’ page dimension (if you wish to use a page dimension - other dimensions such as device and source should work fine) as this is a ‘session-scoped’ metric tied to the session’s landing page. If you use this metric with another page dimension, such as page path, the GA4 interface will show you a figure but it will not be correct.

Method in BigQuery

Engagement metrics can be quite complicated to calculate from the raw data in BigQuery.

Whether or not a session is engaged is particularly complicated, because there is both a string value and an integer value provided for the ‘session_engaged’ flag. We believe that the string value is what Google collect in the raw analytics data as users are on the site (key ‘seg’ in the payload you can see in the Network tab). The integer value is then calculated and attached to the session_start event to be used for the engaged session and engagement rate metric calculations.

A session is engaged when the ‘session_engaged’ integer flag is ‘1’ (true) on that session’s landing page (the page where the session_start event fired).

The following SQL gets a count of sessions and engaged sessions by country for the 1st July 2024. This uses the flattened dataset:

SELECT
  country,
  COUNT(DISTINCT unique_session_id) AS sessions,
  COUNT(DISTINCT
    CASE
      WHEN session_engaged = '1' THEN unique_session_id
  END
    ) AS engaged_sessions
FROM
  `ga4-analytics-352613.flattened_dataset.partitioned_flattened_events`
WHERE
  event_date BETWEEN '2024-07-01'
  AND '2024-07-01'
GROUP BY
  country
ORDER BY
  engaged_sessions DESC

The following SQL gets the same count of sessions and engaged sessions by country for the 1st July 2024, but using the raw dataset:

WITH DATA AS (
  SELECT
    geo.country AS country,
    CONCAT(user_pseudo_id, (
      SELECT
        value.int_value
      FROM
        UNNEST(event_params)
      WHERE
        KEY = 'ga_session_id')) AS unique_session_id,
    (
    SELECT
      value.string_value
    FROM
      UNNEST(event_params)
    WHERE
      KEY = 'session_engaged') AS session_engaged
  FROM
    `ga4-analytics-352613.analytics_330577055.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20240701'
    AND '20240701' )
SELECT
  country,
  COUNT(DISTINCT unique_session_id) AS sessions,
  COUNT(DISTINCT
    CASE
      WHEN session_engaged = '1' THEN unique_session_id
  END
    ) AS engaged_sessions
FROM
  DATA
GROUP BY
  country
ORDER BY
  engaged_sessions DESC

Engagement rate

Engagement rate is a new metric in GA4, and represents the number of sessions which are engaged.

It is important to note that the engagement rate can only be used when reporting on landing pages (using the ‘Landing page + query string’ dimension). Do not use the engagement rate metric with more general page dimensions (such as ‘Page path and screen class’, ‘Page path + query string’, ‘Page location’ and so on). Although GA does not prevent you from adding engagement rate to such reports, the figures reported will not be accurate.

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Landing page + query string’ dimension by the ‘Engagement rate’ metric
  2. If you are interested in the session engagement rates from specific pages, filter the data by the ‘Landing page + query string’ dimension to equal/contain the path of the specific pages you want. The ‘Landing page + query string dimension’ only uses the path and query string - the part of the URL after the hostname
  3. Change the date range and sort the table however you like

Bounce rate

The definition of a bounce has changed from Universal Analytics to GA4. The bounce rate is now the percentage of sessions that are not engaged sessions (the bounce rate is the inverse of the engagement rate).

As with engaged sessions and the engagement rate, is important to note that the bounce rate can only be used when reporting on landing pages (using the ‘Landing page + query string’ dimension). Do not use the bounce rate metric with more general page dimensions (such as ‘Page path and screen class’, ‘Page path + query string’, ‘Page location’ and so on). Although GA does not prevent you from adding bounce rate to such reports, the figures reported will not be accurate!

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Landing page + query string’ dimension by the ‘Bounce rate’ metric
  2. If you are interested in the bounce rates for specific pages, filter the data by the ‘Landing page + query string’ dimension to equal/contain the path of the specific pages you want. The ‘Landing page + query string dimension’ only uses the path and query string - the part of the URL after the hostname
  3. Change the date range and sort the table however you like

A basic exploration showing the bounce rate on GOV.UK landing pages can be found here.

User and tech attributes

There are a variety of dimensions available to help you understand the attributes of users using GOV.UK.

Device category

The device category tells you whether a user was using a desktop, mobile, tablet, or other device to browse GOV.UK.

Method in an exploration

Steps:

  1. In an exploration, create a table showing the ‘Device category’ dimension by the ‘Sessions’ metric
  2. If you are interested in the device categories used on specific pages, you could filter the data by the ‘Page path and screen class’ dimension to equal/contain the path of the specific pages you want. This would then tell you the devices used in each session that included a view of the selected page
  3. Change the date range and sort the table however you like

Sorting and ordering data

You can sort the data in your exploration by clicking on the value column headings you wish to sort the data by.

Using regex

Regular expressions are very useful as they can help you create filters with more flexible definitions. For example, if you wanted to find all pages with a document type of either ‘taxon’ or ‘topic’, you could do this using the pipe character (|), which specifes an OR condition - so ‘taxon|topic’ means ‘taxon OR topic’ and will select pages with either document type.

Google’s documentation lists the metacharacters that can be used in regular expressions within Google Analytics. Note that the default regex behaviour in GA4 is ‘full match’, so the data must exactly match the pattern you provide.

Other resources

The below are some resources that may help those trying to find other bits of information in the GA4 data:

This page was last reviewed on 11 December 2024. It needs to be reviewed again on 11 June 2025 .
This page was set to be reviewed before 11 June 2025. This might mean the content is out of date.