Use the GOV.UK intent detector
The intent detector is a tool that analyses GOV.UK user journeys to better understand user intents.
User intents definition
Intents summarise what a user came to do on GOV.UK and what their information needs are.
At a high level, inferred user intents are a summary of the content visited by a user in a session.
More specifically, a user journey’s intent is a list of weighted keywords. The weights represent how important and distinctive to that journey each keyword is.
This definition assumes that the content that a user visits as part of a session is relevant and informative to what the user wants to know or do.
This is based on cognitive theories of web navigation, which state that:
- users navigate content by following relevant information clues
- the content users visit is relevant to their information goals
- therefore, the content that a user visits represents their intents
Even if this assumption is false, many of the techniques for intent detection can still summarise a user’s session.
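To make "a list of weighted keywords" concrete, the following minimal sketch scores the words in one session's pages with a simple TF-IDF weighting. A word scores highly when it is frequent in the session (important) and rare across other sessions (distinctive). This is an illustration only and is not the method used by any of the intent detector algorithms. The function name, the toy page titles and the choice of TF-IDF are all assumptions made for the example.

```python
import math
from collections import Counter


def weighted_keywords(session_pages, all_sessions, top_n=3):
    """Return the top_n (keyword, weight) pairs for one session.

    Hypothetical helper for illustration: weights are TF-IDF scores,
    treating each session as a single document.
    """
    # Term frequency: how often each word appears in this session.
    words = " ".join(session_pages).lower().split()
    term_counts = Counter(words)

    # Document frequency: in how many sessions each word appears.
    doc_freq = Counter()
    for pages in all_sessions:
        doc_freq.update(set(" ".join(pages).lower().split()))

    n_sessions = len(all_sessions)
    scores = {}
    for word, count in term_counts.items():
        tf = count / len(words)
        # max(..., 1) avoids dividing by zero for words seen in no session.
        idf = math.log(n_sessions / max(doc_freq[word], 1))
        scores[word] = tf * idf

    # Highest weight first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy data (made up): each session is a list of visited page titles.
sessions = [
    ["renew passport", "passport fees", "passport photo rules"],
    ["register to vote", "vote by post"],
    ["renew driving licence", "driving licence fees"],
]
print(weighted_keywords(sessions[0], sessions))
```

In this toy data, "passport" receives the largest weight for the first session because it appears often there and in no other session, whereas "renew" and "fees" are down-weighted because another session shares them.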
Intent detector algorithms
There are 3 algorithms to infer and detect intents:
- an Inferring User Needs by Information Scents (IUNIS) algorithm
- a less sophisticated URL-derived algorithm
- a performance analyst-focused algorithm
Before you start
Before you use any of the intent detector algorithms, you must complete the following prerequisites.
Clone the intent-detector repo
- Go to the govuk-intent-detector GitHub repo.
- Select Code and then select the appropriate option under Clone.
- Go to the root of the govuk-intent-detector directory on your local machine.
Install software dependencies
Install the following software dependencies in the root of the govuk-intent-detector directory on your local machine.
- Install Python 3.8 or later.
- Install R 4.0.4 or later.
- Run make requirements in the command line to install the Python packages that the intent detector needs to run.
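As a quick way to confirm the version requirements above, the following sketch compares the running Python version and the installed R version against the stated minimums. It is not part of the govuk-intent-detector repo. The helper names are hypothetical, and it assumes that R, if installed, is on your PATH and reports its version through R --version.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 8)
MIN_R = (4, 0, 4)


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum):
    """Compare version tuples, for example (4, 1, 2) >= (4, 0, 4)."""
    return tuple(version) >= tuple(minimum)


def installed_r_version():
    """Return the installed R version as a tuple, or None if it cannot be found."""
    if shutil.which("R") is None:
        return None
    result = subprocess.run(
        ["R", "--version"], capture_output=True, text=True, check=False
    )
    try:
        return parse_version(result.stdout or result.stderr)
    except ValueError:
        return None


if __name__ == "__main__":
    python_ok = meets_minimum(sys.version_info[:3], MIN_PYTHON)
    print(f"Python {sys.version.split()[0]}: {'OK' if python_ok else 'too old'}")

    r_version = installed_r_version()
    if r_version is None:
        print("R: not found on PATH")
    else:
        r_ok = meets_minimum(r_version, MIN_R)
        r_text = ".".join(str(part) for part in r_version)
        print(f"R {r_text}: {'OK' if r_ok else 'too old'}")
```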
Download a copy of the GOV.UK mirror
Download a copy of the GOV.UK mirror, which is a static version of the entire GOV.UK website.
See the documentation on using the GOV.UK mirror for more information on how to download a copy of the mirror.
Install direnv
You should use direnv to load environment variables, as this program makes sure you only have project-specific variables loaded when you are inside the project. This prevents accidental conflicts with identically named environment variables.
Run the following in the command line to install direnv using Homebrew:
brew install direnv
Add the shell hooks to your bash profile:
echo 'eval "$(direnv hook bash)"' >> ~/.bash_profile
Check that you have added the shell hooks correctly:
cat ~/.bash_profile
If you have added the shell hooks correctly, you should see eval "$(direnv hook bash)" in the output.
Restart your command line interface to finish installing direnv.
Load the secrets environment variable
Ask the GOV.UK Data Infrastructure team for a Google Cloud Platform (GCP) credentials JSON file. These credentials must give permission to execute select queries on BigQuery.
When you receive this GCP credentials JSON file, store it on your local machine.
Go to the root of the govuk-intent-detector directory on your local machine and create a .secrets file:
touch .secrets
Add the following line to the .secrets file:
export GOOGLE_APPLICATION_CREDENTIALS="<SECRETS-FILE-ABSOLUTE-FILEPATH-AND-FILENAME>"
Make sure source_env ".secrets" is not commented out in the .envrc file. This will make sure direnv loads the .secrets file using .envrc without changing the version of the .secrets file.
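Once the secrets are in place, a short check like the following can confirm that the environment variable is loaded and points at a readable JSON file. This is a sketch only and is not part of the govuk-intent-detector repo. The function names are hypothetical. It checks only that the file exists and parses as JSON, not that the credentials grant BigQuery access.

```python
import json
import os
from pathlib import Path


def check_credentials(env=None):
    """Return a list of problems with GOOGLE_APPLICATION_CREDENTIALS (empty if none)."""
    env = os.environ if env is None else env
    path_text = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path_text:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]

    path = Path(path_text)
    if not path.is_absolute():
        return [f"{path_text} is not an absolute file path"]
    if not path.is_file():
        return [f"{path_text} does not exist or is not a file"]

    try:
        json.loads(path.read_text())
    except (OSError, ValueError) as error:
        return [f"{path_text} could not be read as JSON: {error}"]
    return []


def envrc_loads_secrets(envrc_text):
    """Return True if .envrc text has an uncommented source_env line for .secrets."""
    for line in envrc_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if stripped.startswith("source_env") and ".secrets" in stripped:
            return True
    return False


if __name__ == "__main__":
    # Run from the root of the project directory so that .envrc is found.
    for problem in check_credentials():
        print("Problem:", problem)
    envrc = Path(".envrc")
    if envrc.is_file() and not envrc_loads_secrets(envrc.read_text()):
        print('Problem: no uncommented source_env ".secrets" line in .envrc')
```

If the first check reports that the variable is not set, the usual causes are that direnv has not loaded the project's .envrc or that the command line interface has not been restarted since direnv was installed.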