GOV.UK GA4 access logs

The GOV.UK GA4 access logs dataset details usage of the GOV.UK Google Analytics GA4 data via the Google Analytics Data API. This includes usage through the GOV.UK GA4 user interface and Looker Studio connections, as well as direct querying of the API, but does not include usage of the data exported to BigQuery.

Content

Data was first collected into this dataset on 18 July 2023.

The fields in this dataset and their descriptions can be seen in the schema table below.

Access

Access to the BigQuery dataset is limited to GA4 user admins. However, the GA4 usage report which visualises this data is shared with all GDS performance analysts and the Data Services Google group.

Summarised data is also provided to SPOCs for their department.

Location

The data is located in BigQuery under the ga4-analytics-352613.ga4_logs dataset, in the GA4 Analytics project.

The ga4_logs table is partitioned on the epoch_time_micros timestamp column.
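Because the table is partitioned on epoch_time_micros, queries that filter on that column only scan the relevant partitions. A minimal sketch of building such a query, assuming the table path and field names documented on this page (the date range is illustrative):

```python
# Sketch: build a query that filters on the partitioning column
# (epoch_time_micros) so BigQuery only scans the relevant partitions.
# Table path and field names are taken from this page; dates are illustrative.
def daily_usage_query(start_date: str, end_date: str) -> str:
    return f"""
    SELECT DATE(epoch_time_micros) AS day,
           access_mechanism,
           SUM(access_count) AS accesses
    FROM `ga4-analytics-352613.ga4_logs.ga4_logs`
    WHERE epoch_time_micros >= TIMESTAMP('{start_date}')
      AND epoch_time_micros < TIMESTAMP('{end_date}')
    GROUP BY day, access_mechanism
    ORDER BY day
    """

# The resulting string can then be run with, for example,
# google.cloud.bigquery.Client().query(...).
print(daily_usage_query('2025-01-01', '2025-01-08'))
```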

Set-up

Data collection

This data is generated by querying the Google Analytics Admin API. A Google Cloud Run function, triggered daily at 6am GMT by a Cloud Scheduler job, retrieves the data and appends it to the ga4_logs table in the dataset mentioned above.

The Cloud Run function code can be seen in the ga4-access-report repository on GitHub.

If the table fails to populate for any reason, it can be backfilled using the following code:

```
!pip install -q google-analytics-admin

!gcloud auth application-default login --project=ga4-analytics-352613 --scopes=https://www.googleapis.com/auth/analytics.readonly,https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/cloud-platform

from google.analytics.admin import AnalyticsAdminServiceClient
import pandas as pd
from datetime import datetime
from google.auth import default
import os
import re

SCOPES = [
    'https://www.googleapis.com/auth/analytics.readonly',
    'https://www.googleapis.com/auth/bigquery']
PROJECT = 'ga4-analytics-352613'
creds, _ = default(
    scopes=SCOPES,
    default_scopes=SCOPES,
    quota_project_id=PROJECT)
GA4_ENTITY = 'properties/330577055'


def get_access_report(n):
    # client = AnalyticsAdminServiceClient(credentials=creds)
    client = AnalyticsAdminServiceClient()
    access_dict = {
        "entity": GA4_ENTITY,
        "limit": 100000,
        "date_ranges": [
            {"start_date": f"{n}", "end_date": f"{n}"}
        ],
        "dimensions": [
            {"dimension_name": "epochTimeMicros"},
            {"dimension_name": "userEmail"},
            {"dimension_name": "accessMechanism"},
            {"dimension_name": "accessorAppName"},
            {"dimension_name": "dataApiQuotaCategory"},
            {"dimension_name": "reportType"}
        ],
        "metrics": [
            {"metric_name": "accessCount"},
            {"metric_name": "dataApiQuotaPropertyTokensConsumed"}
        ]
    }

    access_records = client.run_access_report(access_dict)
    return access_records


def format_access_report(response):
    access_list = []

    for rowIdx, row in enumerate(response.rows):
        dims = {}

        for i, dimension_value in enumerate(row.dimension_values):
            dimension_name = response.dimension_headers[i].dimension_name
            if dimension_name.endswith("Micros"):
                # Convert microseconds since Unix Epoch to datetime object.
                dimension_value_formatted = datetime.utcfromtimestamp(
                    int(dimension_value.value) / 1000000
                )
            else:
                dimension_value_formatted = dimension_value.value
            dims[dimension_name] = dimension_value_formatted

        for i, metric_value in enumerate(row.metric_values):
            metric_name = response.metric_headers[i].metric_name
            dims[metric_name] = metric_value.value
        access_list.append(dims)

    df = pd.DataFrame(access_list)
    df = df.rename(columns={
        'epochTimeMicros': 'epoch_time_micros',
        'userEmail': 'user_email',
        'accessMechanism': 'access_mechanism',
        'accessorAppName': 'accessor_app_name',
        'dataApiQuotaCategory': 'api_quota_category',
        'reportType': 'report_type',
        'accessCount': 'access_count',
        'dataApiQuotaPropertyTokensConsumed': 'api_tokens_consumed'})

    df['access_count'] = pd.to_numeric(df['access_count'], errors='coerce')
    df['api_tokens_consumed'] = pd.to_numeric(df['api_tokens_consumed'], errors='coerce')
    df['domain'] = df['user_email'].apply(lambda x: ''.join(re.findall(r'(@.*$)', str(x))))

    return df


def send_to_bq(df):
    df.to_gbq(
        'ga4_logs.ga4_logs',
        project_id=PROJECT,
        chunksize=None,
        reauth=False,
        if_exists='append',
        auth_local_webserver=True,
        table_schema=None,
        location=None,
        credentials=creds
    )


def run(n='YYYY-MM-DD'):
    access_records = get_access_report(n)
    df = format_access_report(access_records)
    try:
        send_to_bq(df)
        return "all good"
    except Exception as e:
        print(df.shape)
        print(df.head(n=5))
        print(e)
        return "all bad"


# Insert the date to be backfilled here
run(n='2025-01-01')
```
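The run() function backfills a single day at a time. Where several days are missing, a small loop over the date range can cover them all; a minimal sketch using only the standard library (backfill_dates is a hypothetical helper, not part of the Cloud Run function):

```python
from datetime import date, timedelta

# Sketch: enumerate the ISO dates between two days (inclusive) so that
# run() can be called once per missing day. backfill_dates is a
# hypothetical helper, not part of the pipeline code above.
def backfill_dates(start: str, end: str) -> list:
    d0 = date.fromisoformat(start)
    d1 = date.fromisoformat(end)
    return [(d0 + timedelta(days=i)).isoformat()
            for i in range((d1 - d0).days + 1)]

# for day in backfill_dates('2025-01-01', '2025-01-03'):
#     run(n=day)
print(backfill_dates('2025-01-01', '2025-01-03'))
```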

Schema

| field name | type | mode | description |
| --- | --- | --- | --- |
| epoch_time_micros | TIMESTAMP | NULLABLE | The Unix microseconds since the epoch that the GA user accessed GA reporting data |
| user_email | STRING | NULLABLE | The user's email address |
| access_mechanism | STRING | NULLABLE | The mechanism through which a user accessed GA reporting data, for example 'Google Analytics User Interface' or 'Google Analytics API' |
| accessor_app_name | STRING | NULLABLE | The name of the application that accessed Google Analytics reporting data, for example 'Looker Studio' or 'Power BI' |
| api_quota_category | STRING | NULLABLE | The quota category for the Data API request, for example 'Core' or 'Realtime' |
| report_type | STRING | NULLABLE | The type of reporting data that the GA user accessed, for example 'Realtime' or 'Free form exploration' |
| access_count | INTEGER | NULLABLE | The number of times GA reporting data was accessed. Note that every report viewed can result in one or more data access events |
| api_tokens_consumed | INTEGER | NULLABLE | The number of property quota tokens consumed for Data API requests |
| domain | STRING | NULLABLE | The email domain, taken from the user's email address |
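The domain field is not returned by the API; it is derived from user_email using the regex shown in the backfill code above (everything from the '@' to the end of the address). A minimal sketch of that derivation, with an illustrative email address:

```python
import re

# Sketch: derive the `domain` field from `user_email`, mirroring the
# regex used in the backfill code. Returns '' when there is no '@'.
def email_domain(email: str) -> str:
    return ''.join(re.findall(r'(@.*$)', str(email)))

# Illustrative address, not a real user.
print(email_domain('analyst@example.gov.uk'))
```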

Retention

The data retention is currently set to 2 years.

This page was last reviewed on 11 December 2025. It needs to be reviewed again on 11 June 2026.