The listening layer for voice-native agents

Your agent can hear them. Now it can grade them.

Chivox MCP turns raw speech into a dense, agent-ready payload (phoneme scores, stress, tone, fluency, audio quality) in one MCP call, for any LLM. The listening layer under every voice-native agent you’re about to ship.

Deep linguistic understanding
Go beyond transcripts.
Enterprise-ready
Secure. Scalable. Reliable.
Real-time intelligence
React in the moment.
[Demo animation: tone sandhi detected on 你好 (T3 + T3 → T2 + T3), score 92. Phoneme diagnosis on "think" (/θ/ /ɪ/ /ŋ/ /k/): 3 issues detected (missing, weak, mispronounced), all corrected; scores shown: 58 and 92.]
/highlights
01
One MCP. Every agent runtime.
Plug Chivox into Claude, Cursor, Cline, LangChain, or any custom loop in minutes.
One npx command — no SDK to install.
Same payload for Mandarin and English.
Works with any MCP-compatible client.
/what-it-does

The listening layer, as four MCP tools

Twenty years of pronunciation-assessment R&D, exposed as a structured payload your LLM can reason over. Drop into LangChain, LlamaIndex, the OpenAI Agents SDK or any custom loop — skip the months of DSP work.

overall 84 · accuracy 78 · fluency 88 · rhythm 73
/assess

Score a learner’s speech

Stream mic audio or post a file. Get overall / accuracy / integrity / fluency / rhythm scores, plus word and phoneme-level diagnostics.

overall · accuracy · fluency · rhythm · phoneme
zh-CN · en-US · one flag
你好 nǐ hǎo · tones · pinyin
Hello /həˈloʊ/ · stress · CEFR
/languages

Mandarin & English, natively

Tones, pinyin, neutral tone, erhua, tone sandhi for Chinese. Stress, rhythm, CEFR-aligned scoring for English. One flag switches between them.

zh-CN · en-US · pinyin · tones · CEFR
AI: Describe your hometown in three sentences.
user · 00:14
fluency 82 · content 76 · grammar 88 · accuracy 79 · rhythm 81
/converse

Score free-flow dialogue

Open-ended AI-talk evaluation returns 5-dimensional scores on fluency, content, grammar, accuracy and rhythm — ready for the next LLM turn.

AI-talk · open-question · 5-dim · streaming
personalized drill
/θ/ minimal pairs · think · sink · thank · sank
GPT · Claude · Gemini · Qwen
/drill

Personalize the next practice

Feed the JSON straight to GPT / Claude / Gemini. Use the shipped prompt-skill to generate targeted drills for weak phonemes or tones.

GPT · Claude · Gemini · Qwen · DeepSeek
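A minimal sketch of the drill step, assuming a payload in which `details[]` holds words with per-phoneme rows. The field names (`phonemes`, `symbol`, `score`), the threshold and the prompt wording are illustrative assumptions, not the documented Chivox schema or the shipped prompt-skill.

```python
from collections import defaultdict

# Hypothetical payload: field names are assumptions for illustration,
# not the documented Chivox schema.
payload = {
    "details": [
        {"word": "think", "phonemes": [
            {"symbol": "θ", "score": 41},
            {"symbol": "ɪ", "score": 90},
            {"symbol": "ŋ", "score": 86},
            {"symbol": "k", "score": 93},
        ]},
        {"word": "thank", "phonemes": [
            {"symbol": "θ", "score": 55},
            {"symbol": "æ", "score": 88},
            {"symbol": "ŋ", "score": 84},
            {"symbol": "k", "score": 91},
        ]},
    ]
}


def weak_phonemes(payload, threshold=60):
    """Return phoneme symbols whose mean score is below threshold, weakest first."""
    scores = defaultdict(list)
    for word in payload["details"]:
        for ph in word["phonemes"]:
            scores[ph["symbol"]].append(ph["score"])
    means = {sym: sum(vals) / len(vals) for sym, vals in scores.items()}
    return sorted((s for s, m in means.items() if m < threshold), key=means.get)


def drill_prompt(payload, threshold=60):
    """Build a plain-text prompt asking an LLM for targeted practice."""
    weak = weak_phonemes(payload, threshold)
    if not weak:
        return "No weak phonemes found. Suggest a harder sentence."
    targets = ", ".join(f"/{s}/" for s in weak)
    return f"Write five minimal-pair drills targeting {targets}."


print(drill_prompt(payload))
```

The resulting string can be sent to whichever model the agent already uses; nothing here depends on a specific LLM vendor.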
/proof

Mandarin depth, production scale — same payload.

Toggle zh / en to see the same pron.* / details[] contract. Benchmarks on the right are live numbers you can sanity-check against your own eval harness.

Mandarin · tone accuracy

你好,今天天气……

nǐ hǎo, jīn tiān tiān qì
sentence score: 78/100

nǐ · T3 · 85
hǎo · T3 · 72
jīn · T1 · 88
tiān · T1 · 88
tiān · T1 · 58
qì · T4 · 91

tones: T1 · T2 · T3 · T4
LLM hint · the second tiān collapsed into T4. Keep the pitch high and steady: it’s a T1.
95%+
agreement with human experts
r ≈ 0.95

Scores track certified human expert ratings with a Pearson correlation of about 0.95. Validated by national standardized speaking tests in 100+ cities.

0.95+ · Pearson r vs experts
<2 pts · Mean absolute error
500K+ · Calibration utterances
  • Per-dimension rubrics: pron, fluency, completeness, prosody.
  • Calibration corpus refreshed quarterly across L1/L2 cohorts.
  • Stable across mic quality, room noise and child voices.
Validated against national speaking-test rubrics · ISO/IEC 17025-aligned labs
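To sanity-check these figures in your own eval harness, one approach is to compute Pearson r and mean absolute error between machine scores and your human ratings. The sketch below uses only the standard library; the two score lists are made-up placeholders, not Chivox benchmark data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mean_absolute_error(xs, ys):
    """Average absolute difference, in score points."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Placeholder numbers for illustration only: replace them with the
# machine scores and human ratings from your own evaluation set.
machine = [84, 78, 88, 73, 92, 58]
human = [85, 76, 90, 72, 91, 60]

print(f"Pearson r: {pearson_r(machine, human):.3f}")
print(f"MAE: {mean_absolute_error(machine, human):.2f} pts")
```

A meaningful comparison needs far more than six utterances, and both sequences must vary (a constant list makes the correlation undefined).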
/quickstart

Production-ready in 3 steps

Watch it run. Paste config → server connects → your LLM calls a tool and gets structured scores back.

01

Grab an API key

Sign up, confirm your email, copy the key. Free trial credits included.

Get a key
02

Add one block to your MCP config

Paste the snippet into Cursor, Claude Desktop, or your custom agent — pick a tab on the right.
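As a rough sketch, the block might look like the JSON this script prints. It assumes the `mcpServers` layout used by clients such as Claude Desktop and Cursor, wrapped around the `npx -y @chivox/mcp` command; the server label `chivox` and the `CHIVOX_API_KEY` variable name are assumptions, so check the official snippet for the real names.

```python
import json


def build_mcp_config(api_key):
    """Return an MCP client config dict that launches the Chivox server via npx."""
    return {
        "mcpServers": {
            "chivox": {  # server label, chosen here for illustration
                "command": "npx",
                "args": ["-y", "@chivox/mcp"],
                # Assumed variable name: the real one may differ.
                "env": {"CHIVOX_API_KEY": api_key},
            }
        }
    }


# Print the block so it can be pasted into the client's config file.
print(json.dumps(build_mcp_config("YOUR_API_KEY"), indent=2))
```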

03

Call a tool from your LLM

Hand your model the audio. It gets back nested JSON: pron sub-scores, fluency + WPM, audio SNR, and details[] with ms ranges, stress, liaison and per-phoneme rows.
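A sketch of what consuming that nested JSON could look like. The key names below (`pron`, `fluency.wpm`, `audio.snr_db`, `start_ms`, `end_ms`, `phonemes`) are guesses based on the description above, not the documented schema; see the API reference for the actual fields.

```python
import json

# Hypothetical response body: the structure mirrors the description in
# the text, but every key name here is an assumption.
raw = """
{
  "pron": {"overall": 84, "accuracy": 78, "integrity": 95, "rhythm": 73},
  "fluency": {"score": 88, "wpm": 112},
  "audio": {"snr_db": 24.5},
  "details": [
    {"word": "think", "start_ms": 120, "end_ms": 540, "score": 58,
     "phonemes": [{"symbol": "θ", "score": 41}, {"symbol": "ɪ", "score": 90}]},
    {"word": "so", "start_ms": 560, "end_ms": 800, "score": 91,
     "phonemes": [{"symbol": "s", "score": 92}, {"symbol": "oʊ", "score": 90}]}
  ]
}
"""


def summarize(result, word_threshold=70):
    """Condense the payload into short lines an LLM can reason over."""
    fluency = result["fluency"]
    lines = [
        f"overall {result['pron']['overall']}, "
        f"fluency {fluency['score']} at {fluency['wpm']} WPM"
    ]
    for word in result["details"]:
        if word["score"] < word_threshold:
            worst = min(word["phonemes"], key=lambda p: p["score"])
            lines.append(
                f"{word['word']} ({word['start_ms']}-{word['end_ms']} ms): "
                f"weakest phoneme /{worst['symbol']}/ at {worst['score']}"
            )
    return lines


for line in summarize(json.loads(raw)):
    print(line)
```

Condensing the payload before the next LLM turn keeps the prompt short while preserving the millisecond ranges needed to point at a specific moment in the audio.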

API reference
Live playground · no mic
Run a real Mandarin + English demo
Watch raw JSON → teacher diagnosis → auto-generated drill. No signup, no setup.
Open the playground
$ npx -y @chivox/mcp
/use-cases

Built for what developers actually ship

Tutors, coaches, companions, QA tooling: pick the scenario that’s yours and see how the agent loop looks in practice.

nǐ T3 · hǎo T3 · jīn T1
tone score · 88

The only MCP that feeds LLMs phoneme-level Mandarin

Tone objects, sandhi resolution and per-phoneme windows are returned in the same payload shape every other language ships. Your agent reasons over 睡觉 (shuì jiào, "to sleep") vs. 水饺 (shuǐ jiǎo, "dumplings") at the acoustic layer, not the transcript, which is signal a Whisper-stack integration simply can’t surface.

0:06 · Scored · overall 84 · fluency 78
ai: Try it again, focus on /θ/
voice · live

Score candidate speech, not just transcripts

Screen English fluency, pronunciation confidence and rhythm at scale. Your LLM reasons over numbers, not vibes — explainable rubrics every HR team will trust.

00:00 · 01:24 retake · 02:48
AI QA · live

Agent training & call-script compliance

Evaluate standard-phrase delivery, articulation, pacing and keyword hits for call-center reps. Flag exactly which second drifted off-script and auto-generate coaching drills.

chivox · Cursor · Claude · Cline · LangChain · Zed · Dify
MCP · 1 config
+ LlamaIndex · OpenAI Agents SDK

Voice-gated NPCs and pronunciation-powered gameplay

Players unlock spells, dialogues or levels by saying the phrase correctly. Get a pass/fail plus the exact phoneme that missed, at <300 ms p95 — fast enough for real-time game loops.
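A minimal sketch of such a gate, assuming the result carries an `overall` score plus per-phoneme rows under `details[]`. The field names and the pass threshold of 80 are illustrative assumptions to tune per game.

```python
def gate(result, pass_score=80):
    """Return (passed, missed_phoneme) for a voice-gated action.

    missed_phoneme is the lowest-scoring phoneme symbol on a fail,
    or None when the attempt passes or no phoneme rows are present.
    """
    if result["overall"] >= pass_score:
        return True, None
    rows = (ph for word in result["details"] for ph in word["phonemes"])
    worst = min(rows, key=lambda p: p["score"], default=None)
    return False, worst["symbol"] if worst else None


# Hypothetical result for a player saying "think".
attempt = {
    "overall": 58,
    "details": [
        {"word": "think", "phonemes": [
            {"symbol": "θ", "score": 41},
            {"symbol": "ɪ", "score": 90},
        ]},
    ],
}

passed, missed = gate(attempt)
print("unlock" if passed else f"try again, watch /{missed}/")
```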

Ready to wire it up?

Same payload. Your agent. Your production loop.

Drop Chivox MCP into Cursor, Claude Desktop, or any agent SDK. One npx and you’re reading the same JSON you just saw above.

Free trial · spend caps · low-balance alerts · zero audio retention