
Improving access with AI: Understanding accents, dialects and languages at scale

  • ICS AI
  • Jul 22
  • 8 min read

AI that understands regional accents and dialect

In a country as linguistically diverse as the UK, the ability to understand different accents, dialects, and regional expressions isn’t just a nice-to-have – it's essential for truly inclusive public services. At ICS.AI, we're focused on making AI technology that works for everyone: we help councils and public sector organisations across the UK improve access to services, no matter where citizens are from, what language they speak, or what accent they have. To that end, we’ve developed an innovative approach that ensures our AI phone assistants serve all communities effectively, regardless of how they speak.


From Glaswegian to West Midlands vernacular, our models are demonstrating that understanding someone doesn't depend on sounding like them. Instead, it depends on learning intelligently from data.


The Evidence: AI That Understands How People Speak


Our implementations across multiple UK councils provide compelling evidence of our success in handling diverse regional language. With over 1.68 million calls processed across our council customers, and more than 2.2 million queries handled at just one council alone, we have robust data demonstrating that AI can effectively serve diverse communities:


  • Regional term recognition rate of 98.7%

  • Dialect categorisation accuracy of 94.8%

  • Only 2% of queries fail due to dialect or language interpretation (these calls are then successfully transferred to a human advisor)

  • Understanding of 15,000+ unique phrases across regional variants


These aren't just theoretical claims – they're real-world results from serving communities across the UK, from the Scottish Highlands to the Midlands to the Southwest.


Overall AI Performance Across Councils


Our technology has been successfully deployed across multiple councils, each with its own regional language characteristics:

Council      | Total Calls | Deflection Rate | Deflected Calls
Council # 1  | 970,964     | 45%             | 432,913
Council # 2  | 172,902     | 28%             | 48,983
Council # 3  | 114,319     | 42%             | 47,957
Council # 4  | 193,905     | 31%             | 60,712
Council # 5  | 232,419     | 41%             | 94,711
Total        | 1,684,239   | 41%             | 685,276

What is a “deflection rate”? This is the percentage of calls that the AI successfully handles without needing to be transferred to a human agent. A higher deflection rate means more queries are being resolved by the AI system, freeing up staff time for more complex cases.
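
As a worked example using the totals in the table above, the deflection rate is simply deflected calls divided by total calls. A minimal Python sketch:

deflected_calls = 685_276
total_calls = 1_684_239
print(f"Deflection rate: {deflected_calls / total_calls:.0%}")  # ≈ 41%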


Data and dialects


By combining supervised learning, generative refinement and expert human input, we're building a system that doesn't just understand a language, but understands the way the language is used in real communities.


What is supervised learning? This is a technique where we train our AI by showing it examples of conversations and their correct interpretations, helping it learn to understand different ways people express the same idea.


What is generative refinement? This is our approach where we use AI to improve understanding by automatically rephrasing unclear statements in ways that maintain the original meaning but are easier for our systems to process.
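
To make the supervised-learning idea concrete, here is a minimal, hypothetical sketch – a toy scikit-learn pipeline, not our production stack – that learns to map different phrasings onto the same intent:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled examples: several phrasings per intent (illustrative only).
examples = [
    ("my bin was not collected today", "missed_bin"),
    ("me rubbish still sat there full as owt", "missed_bin"),
    ("I need to pay my council tax", "council_tax"),
    ("how do I sort out me council tax, duck", "council_tax"),
]
texts, labels = zip(*examples)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(list(texts), list(labels))
print(model.predict(["my bin still has not been emptied"]))  # likely ['missed_bin']

In practice, the training examples come from real, anonymised conversations rather than hand-written pairs.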


Our data shows that our latest GenAI-powered solutions achieve a 26.4% improvement in successful interaction rates compared to traditional NLP approaches.

AI Type            | Total Calls | Deflection Rate | Deflected Calls
Client # 1 – NLP   | 920,764     | 44%             | 405,136
Client # 2 – GenAI | 49,930      | 55.6%           | 27,777
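
For clarity, the 26.4% figure quoted above is a relative improvement – the GenAI deflection rate measured against the NLP baseline:

nlp_rate, genai_rate = 0.44, 0.556
print(f"{(genai_rate - nlp_rate) / nlp_rate:.1%}")  # 26.4%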

What is NLP? Natural Language Processing (NLP) is technology that helps computers understand human language. It's the foundation of how our systems interpret what people are saying.


What is GenAI? Generative AI is more advanced technology that can create new content and better understand context. It's what powers our newer systems and delivers better results for understanding regional speech patterns.


This advancement is particularly important for processing regional expressions and colloquialisms that might otherwise be misunderstood.


In fact, data shows that audio fidelity impacts understanding more than accents do. When it comes to our AI phone models, fewer than 2% of calls encounter any issues linked to interpretation, thanks in part to our use of generative AI.


What is audio fidelity? This refers to the quality of the sound coming through the phone line. Sometimes what sounds like an accent problem is actually just poor call quality or background noise.


Using generative AI, our platform automatically rephrases low-confidence utterances in real time, improving comprehension without needing to pre-train for every accent. Our SMART: Mesh capability continuously reviews real-world calls, identifying colloquial terms and retraining the model to recognise them. This dual approach means the system improves organically – not just across accents, but across language habits.


What are low-confidence utterances? These are phrases or sentences that our AI isn't completely sure it understands correctly. Instead of guessing, our system uses generative AI to reframe these statements in ways it can better process.


What is SMART: Mesh? This is our proprietary technology that continuously learns from real conversations, automatically identifying new phrases, regional terms, and expressions to improve our AI's understanding over time.
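
Putting those pieces together, the call flow looks roughly like the sketch below. This is a simplified illustration only – the helper functions are hypothetical stand-ins for the real speech-to-text, intent and generative models:

LOW_CONFIDENCE = 0.6  # illustrative threshold, not our production value

def classify_intent(text):
    # Stand-in for a real intent model: returns (intent, confidence).
    known = {"my bin was not collected": ("missed_bin", 0.92)}
    return known.get(text, ("unknown", 0.30))

def llm_rephrase(text):
    # Stand-in for generative rephrasing: restate the utterance in
    # plainer wording that preserves the caller's meaning.
    return "my bin was not collected" if "rubbish" in text else text

def understand(text):
    intent, conf = classify_intent(text)
    if conf < LOW_CONFIDENCE:                    # low-confidence utterance
        intent, conf = classify_intent(llm_rephrase(text))
    return intent if conf >= LOW_CONFIDENCE else "transfer_to_advisor"

print(understand("me rubbish still sat there full as owt"))  # missed_bin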


Even distinctly regional expressions are handled well. Consider this real example: “Ey up! Me rubbish still sat there full as owt...”. Our system correctly categorised this as a missed bin collection enquiry, demonstrating deep semantic understanding beyond mere word recognition.


Regional Term Recognition


Our analysis shows the system performs consistently well with dialect-specific vocabulary:

Term | Occurrences | Successful | Escalated | Deflection Rate
Duck | 243         | 149        | 94        | 61.3%
Pet  | 29          | 23         | 6         | 79.3%
Love | 22          | 9          | 13        | 40.9%
Cob  | 4           | 4          | 0         | 100%

These results confirm that regional language use does not degrade AI performance, with deflection rates broadly in line with or exceeding those for standard-language calls.


Measurable improvements across implementations


In an initial proof of concept back in 2021 with a Scottish council, we tested the efficacy of regional accent handling. After establishing a robust baseline, testing demonstrated significant improvements:

Query Type             | Before | After | Improvement
Bin Collections        | 46%    | 73%   | +27%
Council Tax            | 66%    | 73%   | +7%
Regional Term “Uplift” | 21%    | 74%   | +53%

This methodical approach to language adaptation shows quantifiable enhancement that has real-world impact. We apply the same rigour to all our implementations, measuring performance, identifying patterns, enhancing models, and continuously improving based on actual interactions.


Scottish success shows what's possible


Take another one of our Scottish deployments as a case in point. Despite a mix of strong regional accents, the AI phone assistant there achieved impressive confidence scores:


  • A 64.3% high-confidence intent recognition rate

  • Only 16.8% of calls in the low-confidence category


What is intent recognition? This is how our AI understands what the caller wants to accomplish. High-confidence recognition means the system is very sure it understands what the person is trying to do or ask about.


High-confidence intent recognition rates consistently exceed 60%, underscoring the system’s reliability and strong comprehension even with varied regional expressions.


Even more impressive, the system successfully refined 1,869 low-confidence interactions into high-confidence outcomes through real-time generative refinement. This demonstrates robust fallback strategies that maintain service quality even when the system initially struggles to understand regional accents or expressions.


The answer found rate showed significant improvement over time, rising from 15.9% at launch to 28.9% in just five months, with a peak of 31.2% in March:

Metric               | December | March  | Change
Total calls          | 18,791   | 25,562 | +6,771
Questions Asked      | 41,767   | 60,058 | +18,291
Question Answer Rate | 13.20%   | 31.22% | +18.01%
Auto-Escalation Rate | 30.11%   | 26.28% | -3.73%

What is Question Answer Rate? This is the percentage of questions that our AI can successfully answer from its knowledge base. An increasing rate shows our system is getting better at finding the right information for callers.


What is Auto-Escalation Rate? This is when our system automatically transfers a call to a human agent, typically because the query is too complex or requires human judgment. A decreasing rate means our AI is handling more queries independently.
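
Both rates fall out of simple counts. For instance, the March Question Answer Rate implies roughly how many questions were answered outright:

questions_asked = 60_058   # March figure from the table above
qa_rate = 0.3122           # 31.22%
print(round(questions_asked * qa_rate))  # ≈ 18,750 questions answered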


This steady growth shows how continuous, real-world feedback loops help our models adapt to the nuances of human speech over time.


Technical quality that makes a difference


The foundation of our success lies in superior technical implementation. The average call conversation time is 2 minutes 42 seconds in the generative AI version, compared with 1 minute 49 seconds in the NLP version. Longer GenAI sessions reflect richer, more human-like interactions and indicate effective dialogue with colloquial and regional speech.


Our speech-to-text technology achieves a Word Error Rate of just 9.09% (rated “very good”), outperforming alternatives.


What is Word Error Rate? This is a measure of how accurately our system converts spoken words into text. A lower percentage means better accuracy – 9.09% is considered very good in the industry.
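
For the technically curious, WER is the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference transcript. This minimal sketch, using one of the accented transcriptions shown later in this post, illustrates why raw transcription error can be high even when intent remains recoverable:

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[-1][-1] / len(ref)

print(wer("my recycling bin was not collected today",
          "ma recycling ben whizz nail collected the day"))  # ≈ 0.86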


First Contact Resolution rates of 57.02% at Council # 1, as an example, indicate that the majority of users have their queries resolved in their first interaction.


What is First Contact Resolution? This means the caller got their issue completely resolved during their very first interaction with our system, without needing to call back about the same issue. It's a key measure of service quality, which we calculate based on whether the user calls back within 5 days.
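
As a simplified sketch of how that 5-day rule can be applied to call logs (hypothetical data, not our production pipeline):

from datetime import datetime, timedelta

# Hypothetical log: a first contact counts as resolved if the same
# caller does not ring back within 5 days.
calls = [
    ("caller_a", datetime(2025, 3, 3)),
    ("caller_a", datetime(2025, 3, 5)),  # call-back within 5 days: unresolved
    ("caller_b", datetime(2025, 3, 4)),  # no call-back: resolved
]

WINDOW = timedelta(days=5)
by_caller = {}
for caller, when in sorted(calls, key=lambda c: c[1]):
    by_caller.setdefault(caller, []).append(when)

resolved = sum(1 for times in by_caller.values()
               if len(times) == 1 or times[1] - times[0] > WINDOW)
print(f"FCR: {resolved / len(by_caller):.0%}")  # 50%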


Real-World Examples


Our system successfully handles various speech recognition challenges, including strong regional expressions and accents that can be difficult to transcribe, like:


  • "Ma recycling ben whizz nail collected the day" → Understood as "My recycling bin was not collected today"

  • "Me far very lark needs a ramp to getting tiz proper tear" → Understood as "My father-in-law needs a ramp to get into his property"

  • "I spoke to yow last week a boat the rolled service on the boards lay circus island" → Understood as "I spoke to you last week about the road surface on the Bordsley Circus roundabout"

  • "Ey up" Me rubbish still sat there full as owt..." → Correctly categorised as a missed bin collection enquiry


These examples showcase how our system goes beyond word-by-word transcription to understand underlying intent, even when speech-to-text may struggle with strong accents or dialect variations.


Inclusive service across communities: It's about more than just accents


While our technology is already strong at recognising UK dialects, it goes further – into multilingual capabilities designed to support entire communities. For instance, Derby City Council's 'Darcie', built on ICS.AI's technology, can already communicate intelligently in ten different languages.


This focus on multilingual support reflects our commitment to local language inclusion, not just English dialect recognition. We recognise that communities aren't defined just by accent, but by the languages they live and work in. Whether it's Polish or Punjabi, our models are designed to engage and understand.


Balancing understanding with safety


Our systems include important content safety protocols that protect against inappropriate language or misuse. These safeguards are an essential feature requested by councils to ensure appropriate interactions, especially when serving vulnerable residents.


What are content safety protocols? These are protective measures in our AI that prevent inappropriate responses and ensure the system maintains appropriate boundaries. They help ensure conversations remain respectful and safe for all users.


Occasionally, these protocols may temporarily flag benign regional terms for review, but this represents a tiny fraction of interactions and is continuously refined through our learning process.


Looking ahead: research, linguistics and continuous evolution


ICS.AI is beginning engagements with linguists and academic researchers to take this even further. Our goal is to understand not just how people speak, but why certain dialects pose challenges – and how to eliminate those barriers entirely.


This journey toward truly inclusive AI is ongoing. With each interaction, with each new regional expression processed, with each language added, our systems become more attuned to the rich diversity of human communication.


Our AI is learning how to listen more attentively and speak in ways that make everyone feel truly heard and understood – regardless of accent, dialect, or language preference.


Data Sources:


The statistics and performance metrics cited in this blog post are derived from:


  • ICS.AI internal data analysis of 1.68 million customer interactions from across five UK council phone AI implementations (2021-2025)

  • Performance reports from our Scottish council implementations (2024-2025)

  • Proof of concept study with a Scottish council (2021)


Performance metrics were calculated using standard industry methodologies. For more information about our data analysis approach or to request additional information, please contact our team.


*All data used in this analysis was anonymised and aggregated in accordance with data protection regulations. No personally identifiable information was used in the creation of this report.

