We’ve been working on a prototype that asks users to tell us the GP practice they are registered with. We’ve prototyped pages like this before.

Even though it wasn’t “real”, participants filled in the fields as if it was. We saw users enter:

  • the name of their practice, or something close to it
  • the name of the road their practice is on
  • the name of their GP, or the GP they normally see
  • their own home postcode

To learn anything more we need to be able to watch users interacting with real data, to see how they react when it isn’t hardcoded to find the right practice.

To build the prototype we needed access to basic data about all the GP practices in the country, and all the GPs practising at them. Getting hold of that data has been a trial.


First problem: access to data

  • It’s not all easy to find. There’s no obvious place to go to find data. Searching the internet is a reasonable start, but there were data sets we didn’t find out about until someone approached us after a show and tell to say they could point us to a better data set if we sent them an email.
  • Some public data is only available for money. The General Medical Council keep the register of general medical practitioners. Every practitioner is obliged by law to register with them before they can practise. To get full access to the list costs £720 per year – a significant barrier for small teams on a shoestring budget.
  • Restrictive licensing. The data available through the NHS Choices API is licensed under the NHS Choices Syndication Licence, which restricts what you can make with the data and how you can share what you make.

Second problem: data quality

  • Weird data formats. The NHS Choices data is described as being in CSV format (comma-separated values), but it uses the not sign (¬) as a separator and the encoding isn’t specified (we guessed ISO-8859-1 but it could be Windows-1252). In practice CSV is a pretty flexible format, but we’ve had a specification for it (RFC 4180) for over 10 years now.
  • ALL THE ORGANISATION DATA SERVICE DATA IS UPPERCASE. We want to show this data on screen in mixed case. Going from mixed case to uppercase is easy, but the reverse is almost impossible with a complex data set.
  • The ODS data for General Medical Practitioners contains a lot of records that aren’t people. We need a list of practitioner names to search against, but almost half of the records have names like “GP IN ED PRESCRIBER”, “DR AT MIDDLE CHARE GROUP”, or “POOLED LIST”.
  • Duplicate data. NHS Choices publishes data for “GPs”, “GP Practices” and “GPs’ Staff”. The data for “GPs” and “GP Practices” are very similar and there’s no information on what they’re for.
  • Fields contain mixed data. The GP Practices data provided by NHS Choices has phone numbers and website URLs in the “County” field.
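
To make a few of these problems concrete, here is a minimal Python sketch of the kind of defensive loading they force on a team. It is an illustration under stated assumptions, not the code we used: the sample rows, the column names (OrganisationName, County, Postcode) and the helper names are all made up, and the encoding is still a guess. Windows-1252 and ISO-8859-1 differ only in the byte range 0x80 to 0x9F, so either guess decodes the ¬ separator the same way.

```python
import csv
import io
import re

# Hypothetical sample in the layout described above: "¬" as the separator,
# bytes in a single-byte Western encoding. Column names are assumptions.
SAMPLE_BYTES = (
    "OrganisationName¬County¬Postcode\r\n"
    "EXAMPLE SURGERY¬Somerset¬AB1 2CD\r\n"
    "SAMPLE MEDICAL CENTRE¬01234 567890¬EF3 4GH\r\n"
    "DEMO HEALTH CENTRE¬http://www.example.org¬IJ5 6KL\r\n"
).encode("cp1252")

# Rough heuristics for values that do not belong in a "County" field.
PHONE_RE = re.compile(r"^[\d\s()+-]{7,}$")
URL_RE = re.compile(r"^(https?://|www\.)", re.IGNORECASE)


def read_rows(raw, encoding="cp1252"):
    """Decode with a guessed encoding, then parse using "¬" as the delimiter."""
    text = raw.decode(encoding)
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter="¬")
    return list(reader)


def suspicious_county(value):
    """Return True when the value looks like a phone number or a URL."""
    value = value.strip()
    return bool(PHONE_RE.match(value) or URL_RE.match(value))


rows = read_rows(SAMPLE_BYTES)
flagged = [r["OrganisationName"] for r in rows if suspicious_county(r["County"])]
print(flagged)

# Naive title-casing shows why uppercase-to-mixed-case is hard:
# str.title() capitalises after every non-letter and knows nothing about names.
naive = [s.title() for s in ["MCDONALD ROAD SURGERY", "ST JOHN'S SURGERY"]]
print(naive)
```

The second print shows "Mcdonald" and "John'S", neither of which we would want to put in front of a user. Fixing cases like these means maintaining exception lists by hand, which is the work a well-kept source data set would spare every team downstream.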

A better way

More than three-quarters of the time we’ve spent working on this prototype has been spent fighting with data. It’s messy now, but we can build a brighter future. A future where teams can worry about their service and not about the data behind it. To do that we need to embrace registers.

Registers are sources of trustworthy data that service teams, analysts and policy experts can put to good use.

They are designed to be part of a wider ecosystem of interconnected parts. They specialise in storing and maintaining data, in a helpful, accessible, useable way.

We can use registers to break down the data into small chunks, cared for by the right organisation. For us, that could be three registers:

  1. A register of all practitioners allowed to practise, cared for by the General Medical Council.
  2. A register of all GP practices, cared for by the Organisation Data Service.
  3. A register of practitioners prescribing at practices, cared for by the NHS Prescription Service.
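
As a rough sketch of how a service might draw on three registers like these, the Python below models each one as a small collection of records and joins them to answer the question our prototype asks: "which practices does this GP prescribe at?". Everything here is hypothetical, including the class names, the identifier fields and the sample records. Real registers would define their own fields and identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Practitioner:          # hypothetical record in register 1
    practitioner_id: str
    name: str


@dataclass(frozen=True)
class Practice:              # hypothetical record in register 2
    practice_id: str
    name: str
    postcode: str


@dataclass(frozen=True)
class PrescribingLink:       # hypothetical record in register 3
    practitioner_id: str
    practice_id: str


# Made-up sample data, keyed by identifier for easy lookup.
practitioners = {
    p.practitioner_id: p
    for p in [Practitioner("P1", "Dr A Example"), Practitioner("P2", "Dr B Sample")]
}
practices = {
    p.practice_id: p
    for p in [
        Practice("X1", "Example Surgery", "AB1 2CD"),
        Practice("X2", "Sample Medical Centre", "EF3 4GH"),
    ]
}
links = [
    PrescribingLink("P1", "X1"),
    PrescribingLink("P2", "X1"),
    PrescribingLink("P2", "X2"),
]


def practices_for_name(query):
    """Find practice names where a practitioner matching the query prescribes."""
    q = query.casefold()
    ids = {p.practitioner_id for p in practitioners.values() if q in p.name.casefold()}
    found = {link.practice_id for link in links if link.practitioner_id in ids}
    return sorted(practices[i].name for i in found)


print(practices_for_name("sample"))
```

Each register holds only the data its custodian is responsible for, and the third register carries nothing but the identifiers that link the other two. That keeps each chunk small enough for one organisation to look after.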

To properly serve the public interest, the registers should be free and open. They should be licensed under the Open Government Licence, which allows users to do anything with the data as long as they provide attribution.

Building those registers won’t be easy. We’ll need to work with the custodians to make sure they have the digital expertise they need both to manage their registers, and to understand their importance.

We must seek out opportunities to make data open as a side effect of delivering services. Open, well documented, cared-for data enables smaller, less well-connected teams to do more and to do it faster. It will reduce the barriers to entry that favour large internal projects and large external suppliers. It’s our job to open up the information of the health service so that anyone, no matter their size or connections, can use it as their foundations.


  1. Comment from Stephen

    In NI we have this under OGL all open for use.

  2. This post nails it. While building https://openprescribing.net for https://ebmdatalab.net I faced a lot of these problems.

    Completely agree about it being hard to find the canonical source for each dataset. This wastes a lot of time.

    By trial and error, I eventually found that the best data source for GP information was HSCIC’s ODS, which is largely OGL. The data is clean and well-documented (though still UPPERCASE): http://systems.hscic.gov.uk/data/ods/datadownloads

    I also found that HSCIC were great and speedy at answering questions – many thanks to the people there who helped me.

    I faced the same challenges (hard to find definitive source, messy data, licensing questions, patchy documentation) finding several other datasets that we needed for the project: BNF codes, practice list sizes, CCG boundaries.

    It would be great to establish the canonical publishers for each dataset within the NHS, and to ensure that the datasets are interoperable, clean and up-to-date. It would make building services far easier – especially, as you say, for smaller teams.

  3. Comment from David Stone

    Just wondering why you want patients to tell you. HSCIC hold the master patient index that links all patients registered with a GP to the practice. The data is selectively available.

    If your purpose is to link patients to their data, then you need to speak to NHS England or HSCIC about the citizen identification management programme, which will be providing the authentication engine for access to online services.

    • Comment from Dominic Baggott

      Hi David,

      We see identity verification as a curve, not as a binary. What’s the minimum amount of identity verification we can get away with for this particular service? That won’t be the same for all services. To book an appointment at a GP practice warrants a much lower level of verification than changing the delivery address for a repeat prescription.

      Many of our prototypes have focused on what we can provide a user *without* knowing their identity. To know how a blood test or a foot check is delivered for you we don’t need to know exactly who you are – the process is normally the same for all patients registered at a GP practice.

      Most people we’ve done research with know the practice they’re registered at. Many of them struggle when it comes to stronger identity checks, even if that’s just the username and password their practice gave them.

      One principle throughout our prototyping has been to present users with the lowest friction interface that will get the job done. If we started every transaction with identity verification it would probably make the rest of the transaction simpler. But if it created a wall that a significant number of users couldn’t get past, then it comes at too high a price.

      The prototypes I linked to from this post all form part of transactions that we think don’t require full identity verification. The GP lookup prototype we’ve created is to start to find out whether it’s low friction enough.

  4. This is a fantastic post — I can very much sympathise with the challenges of wrestling with (what should be canonical) NHS reference data. Just curious… is a register of GP practices on your roadmap?

    • Comment from Dominic Baggott

      Thanks Hadley!

      We want to drive implementation of registers like these by building real services.

      When we build a service for registering with a GP practice or booking an appointment, that service will need access to data like this. We’ll aim to make our service draw purely from registers, which means we’ll have to make sure those registers actually exist.

      By doing that we hope to help build registers that serve real user needs, and not just registers that open up data the NHS happens to have.

      That was a long-winded way of saying yes, I’m pretty sure a register of GP practices is on our roadmap; I’m just not sure where or when.

