GA4 Project Use Dashboard
The GA4 Project Use Dashboard uses log data to show what BigQuery datasets and tables in the ga4-analytics-352613
project are being accessed from users across Government.
The report is a single page, and provides visualisations to show the most frequently referenced tables and datasets, and from which departments these are being accessed. The report’s main table breaks down the data by dataset, table name, the email address of the user running the query, and what organisation they belong to. For some queries it is possible to ascertain which connector was used, and this information will be displayed in the connector_type column
By monitoring this report, we can monitor the use of the datasets and tables contained within the GA4 GCP project, even from users who sit outside of GDS.
Using the GA4 Project Use Dashboard
Get access to the GA4 Project Use Dashboard
The report can be viewed by anyone with a @digital.cabinet-office.gov.uk email address.
It can be edited by anyone in the GDS performance analysts Google group.
How the GA4 Project Use Dashboard works
Data sources
The GA4 Project Use Dashboard uses a pre-aggregated BigQuery table as its data source. This table is called log_test_4
and can be found in the ga4_log_data dataset
within the gds-bq-data
project.
The table is updated each day from a scheduled query, and processes information from stored BigQuery logs.
The scheduled query in full
The scheduled query used to build the table is as follows:
with data as (
SELECT
auth_info.permission,
proto_payload.audit_log.method_name,
auth_info.resource AS accessed_resource,
proto_payload.audit_log.authentication_info.
principal_email,
proto_payload.audit_log.service_data,
CASE
WHEN JSON_VALUE(proto_payload.audit_log.service_data, '$.jobQueryResponse.job.jobConfiguration.labels.requestor') = 'looker_studio'
THEN 'Looker Studio'
WHEN JSON_VALUE(proto_payload.audit_log.service_data, '$.jobInsertRequest.resource.jobConfiguration.labels.sheets_connector') = 'connected_sheets'
THEN 'Sheets Connector'
WHEN JSON_VALUE(proto_payload.audit_log.service_data, '$.jobInsertResponse.resource.jobName.jobId') LIKE '%scheduled_query%'
THEN 'Scheduled Query'
ELSE 'Query/Other'
END AS connector_type,
REGEXP_EXTRACT(auth_info.resource, r'/datasets/([^/]+)/tables/') AS dataset_name,
REGEXP_EXTRACT(auth_info.resource, r'/tables/([^/]+)') AS table_name,
timestamp
FROM
`ga4-analytics-352613.gcp_logs._AllLogs`,
UNNEST(proto_payload.audit_log.authorization_info) AS auth_info
WHERE
auth_info.permission = 'bigquery.tables.getData'
),
data_2 as (
select
CAST(timestamp AS DATE) as date,
timestamp,
dataset_name,
principal_email,
CASE
WHEN table_name LIKE 'flattened_daily_ga_data%' THEN 'flattened_daily_ga_data*'
WHEN table_name LIKE 'events_intraday%' THEN 'events_intraday*'
WHEN table_name LIKE 'events_20%' THEN 'events_*'
WHEN table_name LIKE 'temp_%' THEN 'misc_temp_tables'
ELSE table_name
END AS table_name_clean,
connector_type,
method_name, service_data
from data
WHERE method_name = 'jobservice.insert'
OR method_name = 'jobservice.query'
ORDER BY date ASC)
SELECT distinct principal_email, dataset_name, table_name_clean, date, connector_type, TO_JSON_STRING(service_data) as service_data FROM data_2
It works by grabbing various fields from the _AllLogs
table, extracting with REGEX the dataset and table names from the auth_info.resource field. It also determines the user’s email and timestamp, as well as the connection type (scheduled query, Google Sheets, Looker Studio or other) from the log’s service_data field.
Finally, the results are filtered to include only distinct references to resources where the method is either ‘jobservice.insert’ or ‘jobservice.query’. This processing means that the table is then connected directly to the charts in Looker Studio without having to create a Looker Studio custom query.