AI Engineer – Speech Modelling & Quality

Company: Quantalent AI
Location: Bangalore
Job Description:

AI Engineer – Speech Modelling & Quality (STT / TTS)

Location: Bangalore/Mumbai/Hyderabad/Gurgaon/Indore

Work from Office

Role Overview

The Speech Modelling & Quality Senior Engineer is responsible for end-to-end ownership of speech quality delivered by the Indic Speech AI platform. This role directly determines how accurately speech is recognized and how natural, intelligible, and expressive synthesized speech sounds across all supported Indic languages.

This role exists to ensure that improvements in model capability translate into measurable, sustained gains in real-world user experience, and that quality does not regress as the platform scales, new languages are added, or models are upgraded.

This role owns outcome-level quality, not just model execution.

Core Responsibilities

The role defines and owns quality metrics for speech-to-text and text-to-speech systems, including word error rate, substitution and deletion patterns, punctuation accuracy, pronunciation correctness, prosody, intelligibility, and naturalness.
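
For illustration only, the sketch below shows one common way to compute word error rate along with its substitution, deletion, and insertion counts, using a word-level edit distance. The names `ErrorCounts` and `count_errors` are hypothetical, and whitespace tokenization is an assumption; real Indic-language evaluation would likely need script-aware text normalization first.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, where N is the number of reference words.
        # An empty reference is reported as 0.0 here (an assumption).
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length if self.reference_length else 0.0


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align hypothesis to reference with a word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # Each cell holds (total_edits, substitutions, deletions, insertions).
    prev = [(j, 0, 0, j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [(i, 0, i, 0)]
        for j in range(1, len(hyp) + 1):
            t, s, d, n = prev[j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)          # words match, no edit
            else:
                diagonal = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = prev[j]
            deletion = (t + 1, s, d + 1, n)      # reference word missing
            t, s, d, n = curr[j - 1]
            insertion = (t + 1, s, d, n + 1)     # extra hypothesis word
            curr.append(min(diagonal, deletion, insertion, key=lambda c: c[0]))
        prev = curr
    _, subs, dels, ins = prev[-1]
    return ErrorCounts(subs, dels, ins, len(ref))
```

Keeping the three error types separate, rather than reporting a single WER figure, is what makes substitution and deletion patterns visible per language or acoustic condition.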

The role performs deep error analysis across languages, accents, acoustic conditions, device types, and usage contexts to identify systematic weaknesses in speech recognition and synthesis.

The role drives language-specific optimization strategies, ensuring that each Indic language is tuned independently and not treated as a secondary outcome of multilingual training.

The role collaborates with ML engineering and training teams to define data requirements, sampling strategies, and curriculum approaches required to improve quality.

The role ensures that improvements in one language or model dimension do not introduce regressions in others, enforcing strict quality isolation and regression testing.
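
As a rough illustration of per-language regression testing, the sketch below flags any language whose candidate WER is worse than its baseline by more than a tolerance, instead of comparing averages. The function name `find_regressions`, the language codes, and the default tolerance are all assumptions made for this example, not values from the platform.

```python
from typing import Dict, Optional


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.005,  # hypothetical allowed absolute WER increase
) -> Dict[str, Optional[float]]:
    """Return languages that regress, mapped to their WER increase.

    A language missing from the candidate results maps to None and is
    treated as a failure, so an unevaluated language cannot pass silently.
    """
    regressions: Dict[str, Optional[float]] = {}
    for lang, base_wer in baseline.items():
        cand_wer = candidate.get(lang)
        if cand_wer is None:
            regressions[lang] = None
            continue
        delta = cand_wer - base_wer
        if delta > tolerance:
            regressions[lang] = delta
    return regressions
```

With made-up numbers, a baseline of `{"hi": 0.120, "ta": 0.180, "bn": 0.150}` and a candidate of `{"hi": 0.100, "ta": 0.195, "bn": 0.151}` improves the average WER, yet the check still flags `"ta"`. That is the kind of regression an averaged metric would hide.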

The role validates that training gains are preserved through inference, ensuring no quality loss due to quantization, batching, streaming, or runtime optimizations.

Operational Ownership

The Speech Modelling & Quality Lead owns quality regressions in production. If recognition accuracy drops, synthesized speech quality degrades, or users experience noticeable deterioration, this role is accountable.

The role owns the pre- and post-release quality validation process, including baseline comparisons, A/B evaluations, and rollout gating criteria.

The role is responsible for ensuring that model upgrades, retraining, or data changes do not negatively impact user-facing quality metrics.

The role participates in incident analysis when customer complaints, usage drop-offs, or monetization anomalies are traced back to speech quality issues.

Key Interfaces

This role works closely with the PyTorch & Python ML Engineering Lead to translate quality findings into concrete model changes.

The role interfaces with the PyTorch Lightning Training Lead to ensure training strategies align with quality improvement goals.

The role collaborates with the GPU Inference Optimization Lead to ensure inference optimizations do not compromise quality.

The role works with Language Guardrails teams to ensure safety mechanisms do not distort or degrade speech output unintentionally.

The role coordinates with Monetization Analytics & Billing teams when quality changes correlate with usage or revenue shifts.

Explicit Non-Responsibilities

This role does not own training infrastructure, GPU scheduling, or Kubernetes operations.

This role does not own raw ML pipeline implementation or inference service engineering.

This role does not define system architecture or networking behaviour.

Role Expectation

The Speech Modelling & Quality Lead is expected to operate with a user-centric and language-centric mindset, treating speech quality as the primary product outcome.

Success in this role is measured by:

Sustained reduction in word error rates

Improved naturalness and intelligibility of synthesized speech

Language-specific quality leadership rather than averaged performance

Absence of silent quality regressions in production

Clear correlation between quality improvements and user adoption or retention

Posted: March 30th, 2026