GA4 backfill processes
Each morning we receive yesterday’s GOV.UK GA4 data in BigQuery.
Google administers the delivery of this data, and on occasion it arrives relatively late in the day. When this happens, several of the processes we use to transform the data fail, and the transformed data needs to be backfilled once the raw data does arrive.
This page summarises the backfilling processes the Analytics team follows when GOV.UK GA4 data arrives late.
Flattened sharded GA4 data
- First, delete the affected empty sharded table from the GA4 flattened dataset.
- Run the flattened sharded backfill SQL query, making sure to change the date at the top of the query.
- Once the query has run, view the results and save them to a BigQuery table in the ‘ga4-analytics-352613’ project and the ‘flattened_dataset’ dataset. Name the table ‘flattened_daily_ga_data_YYYYMMDD’, replacing YYYYMMDD with the date of the backfilled data, e.g. ‘flattened_daily_ga_data_20240902’.
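The table naming rule in the steps above can be sketched in Python as a quick way to avoid typos in the date suffix. This is only an illustrative helper, not part of the team's tooling: the function name `flattened_table_id` is hypothetical, and the `project.dataset.table` form is the usual way of writing a fully qualified BigQuery table ID.

```python
from datetime import date

# Values taken from the steps above.
PROJECT = "ga4-analytics-352613"
DATASET = "flattened_dataset"
TABLE_PREFIX = "flattened_daily_ga_data_"


def flattened_table_id(day: date) -> str:
    """Build the fully qualified ID of the flattened shard for one day.

    Hypothetical helper: it only formats the name and does not touch BigQuery.
    """
    # %Y%m%d gives the YYYYMMDD suffix used by the sharded tables.
    return f"{PROJECT}.{DATASET}.{TABLE_PREFIX}{day:%Y%m%d}"


if __name__ == "__main__":
    # Backfilling data for 2 September 2024.
    print(flattened_table_id(date(2024, 9, 2)))
```

The printed ID can be checked against the table you are about to delete and the destination you choose when saving the query results.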
Pogo-sticking
The data for the pogo-sticking dashboard relies on the flattened GOV.UK GA4 data arriving on time. If the flattened data is late, the dashboard data will need to be backfilled. Only steps 8, 9 and 10 need to be backfilled for the dashboard to function.
The process for backfilling the data for this dashboard is as follows:
- First, delete the affected empty table(s) for steps 8, 9 and 10 from the GA4_PogoSticking dataset in the gds-bq-data project.
- Then run the queries below, changing the date at the beginning of each query, and save the results to the corresponding tables in the same dataset, using the correct table names. For example, to backfill step 9 for 2 September 2024, run the step 9 backfill query and save the results to the ‘gds-bq-data’ project and the ‘GA4_PogoSticking’ dataset, with the table name ‘PogoStickStepNine_20240902’.
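As a rough sketch, the destination name in the step 9 example above can be generated from the date like this. The helper `pogo_table_id` is hypothetical. Only the step 9 name, ‘PogoStickStepNine’, is confirmed on this page, so the step word is left as a parameter; check the existing tables in the dataset for the exact names used by steps 8 and 10 rather than assuming they follow the same pattern.

```python
from datetime import date

# Values taken from the steps above.
PROJECT = "gds-bq-data"
DATASET = "GA4_PogoSticking"


def pogo_table_id(step_word: str, day: date) -> str:
    """Build a fully qualified pogo-sticking table ID for one step and day.

    Hypothetical helper. `step_word` is the spelled-out step number as it
    appears in the table name, e.g. "Nine" for step 9 (the only confirmed case).
    """
    return f"{PROJECT}.{DATASET}.PogoStickStep{step_word}_{day:%Y%m%d}"


if __name__ == "__main__":
    # Step 9 backfill for 2 September 2024.
    print(pogo_table_id("Nine", date(2024, 9, 2)))
```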
Step 8
Pogo-sticking step 8 backfill query
Step 9
Pogo-sticking step 9 backfill query
Step 10
Pogo-sticking step 10 backfill query
Partitioning
The two partitioned tables built from GOV.UK GA4 data are set up to backfill automatically, even if the raw data arrives late. The backfill queries for each are linked below for reference.
Raw Partitioning
Raw partitioning backfill query
Flattened Partitioning
Flattened partitioning backfill query