Why is everyone so transfixed by ambient voice scribes when the potentially transformative impact - for good and ill - lies in GenAI clinical decision support?
Promise and peril
Artificial intelligence has been part of general practice for years in various guises: think rule-based prompts, appointment automation, diagnostic coding, and machine learning-based population segmentation.1 GP use of AI tools has risen by >25% in the past 2 years, and much of this rise is due to the introduction of generative AI (GenAI) in the form of ambient voice technology.2,3 Heidi and AccuRx transcribe a growing number of our consultations, albeit with varying degrees of accuracy: one of my recent discussions about adjusting a mood stabiliser was comically transmogrified into a conversation about salt reduction when sodium was transcribed in place of lithium. This error serves as a reminder that GenAI systems do not, in any true sense, understand the content they generate (although some credit is due for remaining within Group 1 metals); they simply predict the most likely next word. For now, the burden of vigilance remains entirely with the clinician.
“GenAI isn’t good or bad in itself; it is a tool with known flaws that we need to learn to apply with expertise.”
Just as ‘AI’ refers to a diverse range of tools and technologies, there is similar heterogeneity in GP uptake. The recent RCGP Voice survey found that male GPs are a third more likely to use AI than female colleagues, and use is highest in practices serving more affluent populations.2 Beyond scribing, other common applications include administrative and learning support, reported by around half of all AI users. In contrast, only a minority of survey respondents reported using AI for clinical decision support.2
The current focus on using AI for documentation and administration reflects the grinding pressures of contemporary practice: GPs are desperate to reduce the overwhelming administrative workload that obstructs direct patient care. The focus on back-room operations also reflects well-founded disquiet around the risks of relying on generative AI for high-stakes clinical reasoning. GPs and patients alike are rightly concerned about hallucinations; the absence of coherent guidance and regulatory oversight; unclear medicolegal liability; data privacy and patient consent; bias in training data; model drift; digital exclusion; uncertain effects on the doctor–patient relationship; and the risk of eroding clinical skills.1,2,4–7
And yet … the suggestion to ‘ask ChatGPT’ is becoming a coffee-time leitmotif as clinicians discuss tricky cases. We’re rightly wary of the risks and limitations for clinical decision making - but we’re also wowed by the rapidly improving competence and capabilities of these tools. While AI chatbots haven’t been specifically trained on UK data, don’t know anything about our patient populations, and don’t draw on NHS clinical guidance (without prompting), they still manage to produce astonishingly cogent advice.
Anecdotally, GPs are increasingly using GenAI for sense-checking management plans, suggesting differential diagnoses, recommending investigations, interpreting imaging and blood results (‘is this secondary or tertiary hyperparathyroidism?’), summarising medical information (‘what is tertiary hyperparathyroidism?’), seeking second opinions on skin lesion photographs, and running post-hoc ‘shadow consultations’ to check if they should have done things differently. Most GPs use off-the-shelf general purpose chatbots like ChatGPT, Gemini, Claude, and *shudder* Microsoft Copilot rather than specific medical agents,8 which is scary for a number of reasons, not least because no one seems to have received any training on how to craft sensible prompts.
Flying blind
Actual data on how GPs are using GenAI for clinical decision support are vanishingly sparse. We currently have zero qualitative data (such as observations, interviews, or focus group discussions) to really understand what clinicians are asking GenAI for help with, how sophisticated their prompts are, at what points before, during, or after the consultation they are seeking support, and how they interpret, appraise, and apply AI-generated output. Without a clear view of real-world engagement, we cannot meaningfully judge the impact of AI decision support on patient care. The 10-Year Plan aspires to ‘make AI every doctor’s trusted assistant’9 but the NHS currently has no idea what GPs actually want or need.
This needs to change. While the current focus on streamlining administrative tasks is important, the application of GenAI to clinical decision support has profound and transformational implications for GPs and our patients.10,11 By expanding our ability to access and apply the totality of evidence and tailor guidelines to the patient sat in front of us, GenAI could potentially super-power GPs and reduce unwarranted variation in care quality so that we never miss a red flag or rare disease again. It could help us reconcile competing guidelines for patients with multimorbidity and draw on our patients’ entire clinical records, including all notes, letters, lab results, and scans - spotting patterns and diagnosing problems years before the onset of the first symptom. Connecting these agents with genomic, wearable, and exposomic data could fundamentally transform medicine. While we are currently justifiably worried about the risks, it could soon become morally indefensible not to consult GenAI when making clinical decisions.
At the same time, using GenAI as a backstop for clinical reasoning will probably lead to degradation of cognitive and clinical skills, as seen across other safety-critical professions.12 Worryingly, emerging evidence shows that clinicians working with AI support can underperform compared with clinicians or AI working alone, partly because of automation bias: the tendency to accept whatever the AI suggests.13 In the longer term, the gradual accretion of clinical reasoning by AI could make us redundant for the higher-order elements of care that we find stimulating and rewarding. The stakes are high.
Pragmatic adoption
As societal and economic forces move us from the dyadic doctor–patient consultation towards ‘triadic’ doctor–patient–AI consultations,14,15 it is critical that we resist the introduction of tools that undermine safe and effective patient care, while embracing technology that can potentially reduce the clinical errors that occur in 4% of consultations - that is, one per day for the average GP.16 As a profession we will need to proactively improve our AI literacy so that we can prompt intelligently, critically appraise output, and apply GenAI tools in ways that first do no harm. Rather than reflexively rejecting AI because of its current limitations, we need to remember that GenAI isn’t good or bad in itself; it is a tool with known flaws that we need to learn to apply with expertise.
Rapid AI adoption among clinicians and patients is already fundamentally changing the nature of primary care. More than any other tool, GenAI poses the most profound risks and benefits for patient care, and the most radical challenge to the role of the GP. The genie isn’t getting back into the bottle. We need to take a clear-eyed approach to upskilling ourselves and evaluating new triadic models of care, with a focus on mitigating the limitations and maximising the benefits for our patients.
Notes
Provenance
Commissioned; not externally peer reviewed.
- Received December 10, 2025.
- Revision received January 30, 2026.
- Accepted February 9, 2026.
- © British Journal of General Practice 2026