How NHS data analysts can succeed, become data experts and more than just the ‘number crunchers’

The request

The manager was very clear. “We’ve had a request for how many toes were done during the last financial year. Please will you get me the number.” The analyst knew that the first thing to do was to get the procedure codes for toes from the clinical coders, so she sent off an email to the Clinical Coding Manager. What came back was a longer list than expected:

OPCS4 code  Description
W01.1       Microvascular transfer of toe to thumb
W03.3       Total correction of claw toe
W59.4       Fusion of interphalangeal joint of great toe
W59.5       Fusion of interphalangeal joint of toe NEC
W59.6       Revision of fusion of joint of toe
W59.8       Other specified fusion of joint of toe
W59.9       Unspecified fusion of joint of toe
W79.3       Syndactylisation of lesser toes
W79.8       Other specified soft tissue operations on joint of toe
W79.9       Unspecified soft tissue operations on joint of toe
X02.3       Replantation of toe
X11.1       Amputation of great toe
X11.2       Amputation of phalanx of toe
X11.8       Other specified amputation of toe
X11.9       Unspecified amputation of toe
X27.2       Release of syndactyly of toes
X27.3       Amputation of supernumerary toe
X27.4       Correction of curly fifth toe
X27.5       Correction of congenital crossed toes

At this point the analyst thought that she should go back to her manager and ask: “Which codes should be included in the search?” The manager was busy and so gave the analyst the name of the person who had made the request in the first place.

Although she hadn’t been in post very long the analyst knew that surgeons who operated on toes were known as ‘orthopods’, and that data about their patients was recorded on the Patient Administration System (PAS) under the specialty of ‘Trauma & orthopaedics’. So she was a bit surprised that the name she had been given was that of the head of the cardiology service. But by now she had learnt enough about medicine to know that circulatory problems could lead to toes being amputated and was pleased that she was beginning to piece together for herself the background to the request.

She rang the head of cardiology. “I’m calling about your request for the number of toes,” she said. “There are about 50 operation codes, so I need to ask for a few more details from you, please. For example, are you just interested in amputations?”

The head’s response was sharp: “Transoesophageal echocardiograms! That’s what I mean by TOEs!”

Turning an information request into a query specification (i.e. a set of instructions to apply to the data) is not necessarily straightforward. What are the principles that should be followed to make sure that the information provided will meet the real needs of the person asking the question?

You know what the requester has asked for, but what do they really need?

Who is asking the question? Why do they want to know? What is the background to their request?

These open* questions will generate important information to help you to turn the initial question into one that is going to get to the crux of what the requester is really interested in, and one that you can answer from the data available to you.

Consider the following example. The trauma and orthopaedics service manager phones and says she needs to know how many hips the Trust did last year. By asking some open questions the analyst should be able to work out exactly what the manager needs. For example:

What’s the background? “A meeting held last week with the commissioners.”

Which commissioners? “North Forest CCG.”

OK. So is your request referring just to patients whose treatment has been commissioned by that CCG? “Yes.”

What was the meeting about? “The size of the waiting list.”

OK. So is your request referring just to elective admissions? “Obviously.”

What’s the medical term for the operation? “Primary hip replacement.”

You have enough information now to start writing the selection statements, something like the following:

SELECT IF SPELL-END DATE = 2018

SELECT IF ADMISSION METHOD = ELECTIVE

SELECT IF COMMISSIONER = NORTH FOREST

(Note: the instructions above are illustrative only: the syntax required will be a bit more detailed than this.)
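As one possible way to make the illustrative statements above concrete, here is a minimal Python sketch. The record layout, the field names, the text labels for admission method and the second commissioner ("South Vale CCG") are all assumptions made for this example; a real PAS extract will use its own field names and the national codes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Spell:
    """One admitted patient spell (hypothetical, simplified record layout)."""
    spell_end_date: date
    admission_method: str  # assumed text label, e.g. "Elective-waiting list"
    commissioner: str


def select_spells(spells, year, commissioner):
    """Apply the three illustrative selection statements to a list of spells."""
    return [
        s for s in spells
        if s.spell_end_date.year == year                # SPELL-END DATE = 2018
        and s.admission_method.startswith("Elective")   # ADMISSION METHOD = ELECTIVE
        and s.commissioner == commissioner              # COMMISSIONER = NORTH FOREST
    ]


# Made-up sample data, for illustration only.
sample = [
    Spell(date(2018, 3, 14), "Elective-waiting list", "North Forest CCG"),
    Spell(date(2018, 7, 2), "Emergency", "North Forest CCG"),
    Spell(date(2018, 9, 30), "Elective-booked", "South Vale CCG"),
    Spell(date(2017, 12, 31), "Elective-planned", "North Forest CCG"),
]

selected = select_spells(sample, 2018, "North Forest CCG")
print(len(selected))
```

Only the first sample spell satisfies all three conditions. Each condition sits on its own line, so it is easy to see which statement excludes which patients, and easy to drop one later if the request changes.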

There are people around in the organisation who really know how patients flow through it and how the data about these encounters are recorded – ward clerks, health records staff, PAS specialists, medical secretaries and clinical coders. Make use of these front-line experts and learn to be forever curious!

You can ask the clinical coding manager for the OPCS4 codes for primary hip replacement. You should get the following:

W37.1  Primary total prosthetic replacement of hip joint using cement
W38.1  Primary total prosthetic replacement of hip joint not using cement
W39.1  Primary total prosthetic replacement of hip joint NEC

Look up these codes in the OPCS4 manual. What about the codes next to those provided, i.e. W37.2, W37.3, etc? Ask the clinical coding manager why he hasn’t included these other codes, particularly W37.9 (Unspecified total prosthetic replacement of hip joint using cement). Could this code have been used in a situation when it was unclear that the replacement was a primary?

Is there even the remotest possibility that some of these other codes would be used when a ‘primary hip replacement’ has been done? Tell the clinical coding manager the background to your request: he has a lot of knowledge about anatomy, physiology, terminology, how services are organised locally and the local documentation practices of your hospital’s clinical staff. And he will appreciate the fact that you are consulting with him not only because of his coding expertise, but also because he is one of the hospital’s front-line experts.
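Before that conversation it can help to see how often the neighbouring codes are actually used locally. The following is a minimal sketch, assuming procedure codes are held without the dot (W371 rather than W37.1) and that a simple list of recorded codes is available; the sample values are made up.

```python
from collections import Counter


def neighbouring_code_counts(procedure_codes, categories):
    """Count every recorded code that falls in one of the three-character categories."""
    counts = Counter(
        code for code in procedure_codes
        if code[:3] in categories
    )
    return dict(sorted(counts.items()))


# Made-up procedure codes as they might appear on spell records (dot removed).
recorded = ["W371", "W371", "W379", "W381", "W373", "W391", "W401", "W379"]

counts = neighbouring_code_counts(recorded, {"W37", "W38", "W39"})
for code, n in counts.items():
    print(code, n)
```

A count like this does not say whether a code such as W37.9 was used for a primary replacement; it only shows which codes are worth raising with the clinical coding manager.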

How can you really delight your customer by giving them more than they asked for, in the form of new knowledge that they will appreciate? This approach may also reduce the need for a supplementary request.

If you tell the coding manager that you are just going to search for elective admissions, then he may tell you that sometimes one of the codes he’s provided is used for an emergency admission: for example, a fractured hip that can best be repaired by doing a primary hip replacement procedure. (Hence the importance of asking open questions; and preferably verbally, followed up with a confirmatory email!)

If so, then this may make you think more about the selection statements you were intending to use. It might make sense not to select just ELECTIVE patients, but to present the results in a table showing the elective/non-elective breakdown.

Similarly, if this request has come from a meeting with one set of commissioners then it is likely that there will be a similar meeting with other commissioners next week. So why not scrap the COMMISSIONER selection statement so that your table is based on all patients treated, and instead present the results giving a breakdown by commissioner?

You now have enough information to write the query to answer the request:

SELECT IF SPELL-END DATE = 2018

SELECT IF OPCS4 CODE IN LIST (W371, W381, W391)

TABULATE ADMISSION METHOD BY COMMISSIONER

Title = Admitted patient spells ending during 2018 where a primary hip replacement was carried out: Admission method by Commissioner.
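The query above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the syntax of any particular reporting tool: the field names are hypothetical, each spell is assumed to carry a single procedure code, and the sample records (including "South Vale CCG") are made up.

```python
from collections import Counter

PRIMARY_HIP_CODES = {"W371", "W381", "W391"}  # the list used in the query above


def normalise_code(code):
    """Remove the dot and upper-case the code, so 'W37.1' matches 'W371'."""
    return code.replace(".", "").strip().upper()


def tabulate(spells, year):
    """Count primary hip replacement spells by (admission method, commissioner)."""
    table = Counter()
    for spell in spells:
        if spell["spell_end_year"] != year:
            continue
        if normalise_code(spell["opcs4_code"]) not in PRIMARY_HIP_CODES:
            continue
        table[(spell["admission_method"], spell["commissioner"])] += 1
    return table


# Made-up sample data, for illustration only.
spells = [
    {"spell_end_year": 2018, "opcs4_code": "W37.1",
     "admission_method": "Elective-waiting list", "commissioner": "North Forest CCG"},
    {"spell_end_year": 2018, "opcs4_code": "W381",
     "admission_method": "Elective-waiting list", "commissioner": "North Forest CCG"},
    {"spell_end_year": 2018, "opcs4_code": "W371",
     "admission_method": "Emergency", "commissioner": "North Forest CCG"},
    {"spell_end_year": 2018, "opcs4_code": "W391",
     "admission_method": "Elective-booked", "commissioner": "South Vale CCG"},
    {"spell_end_year": 2018, "opcs4_code": "W401",
     "admission_method": "Elective-booked", "commissioner": "South Vale CCG"},
    {"spell_end_year": 2017, "opcs4_code": "W371",
     "admission_method": "Elective-planned", "commissioner": "North Forest CCG"},
]

table = tabulate(spells, 2018)
for (method, commissioner), n in sorted(table.items()):
    print(f"{method:25} {commissioner:20} {n}")
```

Because nothing is filtered on admission method or commissioner, the emergency spell appears in the table rather than silently disappearing, which is the point of presenting a breakdown instead of a single number.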

If you show all the individual admission methods (rather than just the Elective group and the Non-elective group) then the requester will be able to distinguish between Elective-waiting list, Elective-booked and Elective-planned. (You will probably have to explain what ‘planned’ means in this context – by reference to the NHS Data Model and Dictionary.)

The skill you have shown in asking the requester some open questions, and in consulting with the clinical coding manager, has now resulted in information for the manager that is most likely of greater value than they expected. It provides them with information about patients from all commissioners and it alerts them to the fact (or reminds them) that not all primary hip replacements are done on elective admissions. This may be helpful in planning for the number of prostheses that are used. People are usually grateful for information that increases their knowledge and helps them to do their job better in future.

And so it goes on…

But be prepared! When the manager shows the table to the surgeons and the commissioners someone may ask why only primary replacements have been included: they will have been alerted to this by the clear titles in your table. You will then get a follow-up request for the number of ‘revisions’. Getting the selection and presentation right for this request will require a further close consultation with the clinical coding manager, because when a surgeon uses the term ‘revision’ she is referring to procedures which are described in the OPCS4 classification (and hence understood by the clinical coders) by all of the following terms: revision, conversion or attention to.
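A text search of the code descriptions for those three terms can produce a first candidate list to take to the clinical coding manager. The sketch below assumes a code-to-description lookup is available; the sample entries are copied from the toe list earlier in this article purely as test data, so they are not hip codes.

```python
REVISION_TERMS = ("revision", "conversion", "attention to")


def find_codes_by_terms(code_descriptions, terms):
    """Return codes whose description contains any of the terms (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return sorted(
        code for code, description in code_descriptions.items()
        if any(t in description.lower() for t in lowered)
    )


# Sample lookup taken from the toe list earlier in the article (illustration only).
sample_lookup = {
    "W59.4": "Fusion of interphalangeal joint of great toe",
    "W59.6": "Revision of fusion of joint of toe",
    "X11.1": "Amputation of great toe",
}

matches = find_codes_by_terms(sample_lookup, REVISION_TERMS)
print(matches)
```

A keyword match is a starting point for the conversation, not a substitute for it: the coding manager will know which matching codes are relevant and which relevant codes the search has missed.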

Requests involve knowledge of how this trust records its data

In 2012 a letter in the BMJ (‘The importance of knowing the context of hospital episode statistics (‘HES’) when reconfiguring the NHS’) gave some examples of what are – at first sight – surprising statistics produced from the national HES data. The authors suspected that ‘the numbers may, at least partly, reflect data errors’. But to information analysts who have worked in a trust for several years, the numbers can probably be explained by knowledge gained from using the local data and querying the ‘surprises’ with the front-line experts who understand how the local configuration and delivery of services impacts on the numbers produced. These staff groups possess a wealth of knowledge to satisfy the curious analyst – if they are asked. The analyst will probably need to talk to more than one of these experts because they won’t all understand all the relevant parts of the organisation’s processes that affect how – and what – data are recorded. Let’s take each of the scenarios quoted in the letter and attempt to give an explanation that would be obtained by consulting with these experts.

It is important to remember that these likely explanations are based on the experience of working in one trust, but all trusts are different: the way a trust’s services are configured is based on its geography, the history of the people who have worked in it and set up the services, and the functionality of the computer systems that have been used to record the data. Hence no two trusts will record all data in the same way (Beattie, 2018).

Discussion

In the same year that the letter from Brennan and colleagues appeared in the BMJ, the Audit Commission published a report (‘By definition: Improving data definitions and their use by the NHS’) which clearly explains the reasons for the issues we have identified with routinely collected NHS data. A key paragraph (number 45) says:

The current guidance is a compromise between the treatment taking place and the existing data models. This results in the NHS trying to fit the changing way they deliver care to the existing datasets and their guidance. Clinicians do not recognise their service from the data, which impacts on their engagement in data quality and financial management issues.

The report also identifies an issue that will be familiar to experienced analysts: the functionality of the organisation’s PAS (Patient Administration System) can sometimes be the determining factor in how the data are recorded. Appendix D explains:

Similarly, a Trust may wish to use the APC (Admitted Patient Care) waiting list functionality of their PAS to manage the wait for an operation – which leads into the APC module of the PAS. However, the classification of the patient should be determined by the treatment provided.

The Commission’s report therefore reminds us that how the data is recorded is influenced by both how the service is actually delivered by the local clinicians and the functionality of the computer system used to record it. Experienced analysts will be implicitly aware of these factors and how they influence the way the resulting analyses are received and interpreted by clinicians and other decision makers. Analysts need to be able to dig deep into all the knowledge they have internalised about hospital processes so they can fully tell the story behind the derivation of work they have produced – a story based in part on the knowledge they have gained from the front-line experts in their organisations.

These issues are not peculiar to the NHS, as a recent article from the Harvard Business Review makes clear (Haller and Satell). The authors report that as many as 85% of big data projects fail. “A big part of the problem is that numbers that show up on a computer screen take on a special air of authority. Once data are pulled in through massive databases and analysed through complex analytics software, we rarely ask where it came from, how it’s been modified or whether it’s fit for the purpose intended.” The authors remind us that “we need to learn how to ask [some] thoughtful questions: How was the data sourced? How was it analysed? What doesn’t the data tell us?”

Analysts of routinely collected NHS data do not only have a duty to understand ‘how was the data sourced’, but also to remind the managers of services and those who requested the analysis how data should be recorded according to the NHS Data Model and the consequences of not doing so. These points and the knowledge of the experts can sometimes turn out to be as important as the interpretation of the analysis. The words of statistician John Tukey echo here: “The more you know that is wrong with a figure, the more useful it becomes.” What is it about a service and how it is run that means the data can’t be recorded as the NHS Data Model states?

Conclusions

When data is used to judge or fund a service then there is an incentive to record all patients who have been seen (e.g. parents as well as the child referred; both sexual partners), using the computer system that makes for the easiest recording, even if doing so is not strictly in line with the NHS Data Model.

NHS data needs to be analysed by analysts who are not only good at number crunching and understand the national codes that all PAS systems use in the generation of data files, but who are also curious enough to query surprising results with the service that the data relates to and ask open questions of all the front-line experts involved in the data recording. Successful data analysts need to be able to continually update their ‘organisational intelligence’ (Beattie and Bartoli, 2020). The examples described in the BMJ letter provide a helpful starting point of surprising patient groups to try to explain; analysts and the front-line experts within hospitals could no doubt provide their own equally surprising scenarios.

Deeny and Steventon wrote that “major methodological challenges in using routine data arise from the difficulty of understanding the gap between patient and their ‘data shadow’”. They conclude that “for routinely collected data to be used optimally within [a learning healthcare system], … development is needed in ways to understand how the data were generated”.

Curious analysts, working closely with front-line experts (or networking with colleagues who do) and asking open questions, will generate the necessary understanding of how the organisation really works and so how its data was generated. This way of working generates essential knowledge that will inform the correct interpretation of the results produced and is sure to keep all analysts on their toes.

*Open questions should deliberately seek long answers. Open questions begin with words such as: what, why, how, describe.

The views and opinions expressed in this article are personal and are not necessarily those of Northumbria Healthcare NHS FT.

References

Audit Commission. By definition: Improving data definitions and their use by the NHS; A briefing from the Payment by Results data assurance programme. 2012

Beattie A. What if M&S was run like the NHS? Health Service Journal. 2018. https://www.hsj.co.uk/service-design/what-if-mands-was-run-like-the-nhs/7023058.article

Beattie A, Bartoli B. Organisational intelligence and successful change in NHS organisations. British Journal of Healthcare Management. 2020; 26(3). https://doi.org/10.12968/bjhc.2019.0060

Brennan L, Watson M, Klaber R, Tagore C. The importance of knowing the context of hospital episode statistics when reconfiguring the NHS. BMJ. 2012; 344:e2432. https://doi.org/10.1136/bmj.e2432

Deeny SR, Steventon A. Making sense of the shadows: priorities for creating a learning healthcare system based on routinely collected data. BMJ Qual Saf 2015;24:505-515. https://qualitysafety.bmj.com/content/24/8/505

Haller E, Satell G. Data-Driven Decisions Start with These 4 Questions. Harvard Business Review. 2020.

NHS Data Model and Dictionary. https://digital.nhs.uk/services/nhs-data-model-and-dictionary-service

June 2020
