Testing AI scribing technology for children's neurodiversity services in Wales

Demand for neurodevelopmental assessments in Wales has risen sharply in recent years and is projected to keep increasing. As part of our project to improve children’s neurodiversity services, we’ve been exploring how digital tools could ease pressure on the professionals involved in assessments. We want to make it easier for them to do their jobs effectively and reduce the waiting time for children and their families needing a neurodevelopmental assessment.

In our alpha phase, we found that healthcare professionals working in children’s neurodiversity services are often burdened with large amounts of paperwork and documentation. This leaves less time for the work that matters most - supporting children and their families. That's why we’ve begun testing an AI scribing tool that could help reduce this burden.

What is AI scribing?

An AI scribe is a digital assistant that can listen to conversations and generate structured notes in real time. The technology we've been testing is called Magic Notes, developed by Beam. It records conversations, transcribes them, and creates summaries that could feed into patient assessments and records. This isn’t about replacing clinical judgement, but about reducing admin overhead so that clinicians can focus on the parts of the job that only humans can do.

Putting it to the test

Over 2 days of testing, we deliberately tried to ‘break’ the system. We were fortunate to carry out the testing at Swansea University's incredible simulation suite, a realistic clinical setting where we could test the AI scribe’s ability to transcribe speech to text under challenging conditions, without involving real patients. We ran 50 different test scenarios with actors to better understand the challenges and limitations of the technology.

We tested how the AI scribe worked with:

different accents
bilingual and Welsh conversations
background noises (e.g. doors closing, children running around the room)
quiet voices
complex speech patterns
technical challenges like poor Wi-Fi connections.

Dave and Alaw from the CDPS team with one of our actors, Manon, in the simulation suite at Swansea University.

What we discovered

The results were encouraging. The AI scribe achieved a low word error rate of 6.6%, comparable to the 4-5% typical of professional human transcribers.

It performed well at:

working in noisy environments - even with children talking and alarms going off
capturing medical information accurately - details like developmental milestones
understanding speech patterns - it correctly identified things like echolalia and the speaker spontaneously switching topics
handling different accents, idioms and colloquialisms
maintaining connection - transcription continued even when Wi-Fi was poor or intermittent.

However, we also identified some limitations:

names were sometimes misspelled - something clinicians would need to review
it didn’t always understand the Welsh language - sometimes Welsh was put in brackets as "Welsh" without transcription, or incorrectly translated to English
pauses and hesitations were flattened - important clinical indicators could be lost
there was a higher word error rate when someone spoke in a very quiet voice, including some "hallucinations" where the AI inserted words that weren't spoken
during 1 of the 50 tests, transcription stopped without any alert to warn the user, although the audio was still recorded (recording didn’t fail in any of the 50 tests).

We will feed back some of the problems we encountered to Beam.
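
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the AI transcript into a correct reference transcript, divided by the number of words in the reference. The Python sketch below shows that standard calculation. It is an illustration only: the function name and the example sentences are made up, and it does not describe how our figures were produced or which text normalisation was applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase and split on whitespace; real evaluations
    # often also strip punctuation and normalise numbers.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words (classic Levenshtein table, row by row).
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1       # reference word missing from transcript
            insertion = current[j - 1] + 1   # extra word added by the transcript
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up example sentences, not taken from our test scenarios.
reference = "he started walking at fourteen months and said his first words at two"
hypothesis = "he started walking at forty months and said his first word at two"
print(f"{word_error_rate(reference, hypothesis):.1%}")
```

In this made-up example, 2 of the 13 reference words are wrong, giving a word error rate of about 15%. A rate of 6.6% corresponds to roughly 1 error in every 15 words.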

Dave from CDPS and Manon, one of our actors testing the AI scribe tool in Swansea University's simulation suite. The suite has projections on the wall that make it look like a doctor's office. Manon has her back to Dave, to see how the scribe records her speech.

What this means in practice

Although AI scribing technology isn’t perfect yet, the value it could bring is enormous. From our testing, we’ve learnt a few principles for working safely with these tools:

review, don't assume - always double-check names, numbers and technical terms
expect the unexpected - background noise and technical issues may affect performance, so it’s still important to take notes or details could be lost
use as support, not substitute - the technology should reduce admin burden, not replace clinical judgement.

Next steps

Looking forward, we are hoping to support a pilot with Cwm Taf University Health Board. We can’t wait to see how this tool might work in a real-life trial, and to see how it performs in different environments, with different pressures. We're currently waiting for final approvals before this trial can begin.

Importantly, we're keeping this work supplier-neutral. Our goal isn't to promote one product, but to understand whether AI scribing as a concept is effective and valuable for healthcare teams working with neurodivergent children.

The potential is exciting. If we can reduce the administrative burden on clinical teams, we can free up more time for the direct support that children and families need. We're committed to testing thoroughly and implementing carefully, putting clinical effectiveness first.

We'll continue to share updates as this work progresses.