fix: improve Telugu accuracy and final transcript reliability
Summary: Improves the Telugu live transcription pipeline so that WebSocket streaming is more reliable, stable, and accurate, without changing the ASR model.
Changes:
- Improved WebSocket session handling:
  - clean new session initialization
  - reset support for new recordings
  - safer finalization before socket close
  - better logging for chunk processing and responses
- Improved transcription pipeline:
  - partials use fast PCM-based path for low latency
  - final output uses full `transcribe_audio(...)` path for better accuracy and punctuation
- Added streaming configuration:
  - partials toggle (`STREAMING_ENABLE_PARTIALS`)
  - increased partial and silence thresholds for more stable output
- Improved speech detection and buffering for cleaner final results
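
For reviewers, the partial/final split and the partials toggle described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `StreamingSession`, `env_flag`, `fast_partial`, `full_final`, and `min_partial_bytes` are hypothetical names, and the two callables stand in for the fast PCM path and the full `transcribe_audio(...)` path, whose real signatures are not shown here.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, Optional


def env_flag(name: str, default: bool = True) -> bool:
    """Read a boolean toggle such as STREAMING_ENABLE_PARTIALS from the environment."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass
class StreamingSession:
    # Hypothetical stand-ins: the real fast path and transcribe_audio(...) may differ.
    fast_partial: Callable[[bytes], str]
    full_final: Callable[[bytes], str]
    enable_partials: bool = True
    # Assumed threshold: about 1 s of 16 kHz, 16-bit mono PCM.
    min_partial_bytes: int = 32000
    buffer: bytearray = field(default_factory=bytearray)

    def reset(self) -> None:
        """Drop buffered audio so a new recording starts clean."""
        self.buffer.clear()

    def add_chunk(self, chunk: bytes) -> Optional[str]:
        """Buffer a chunk; return a partial only if enabled and enough audio exists."""
        self.buffer.extend(chunk)
        if not self.enable_partials or len(self.buffer) < self.min_partial_bytes:
            return None
        return self.fast_partial(bytes(self.buffer))

    def finalize(self) -> str:
        """Run the slower, more accurate path on all buffered audio, then reset."""
        text = self.full_final(bytes(self.buffer))
        self.reset()
        return text
```

A session would be created with `enable_partials=env_flag("STREAMING_ENABLE_PARTIALS")`. Raising `min_partial_bytes` trades latency for fewer unstable partials, which is the direction the threshold change in this PR takes.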
Impact:
- Fixes missing final transcript in UI after stop
- Reduces noisy/unstable partial outputs
- Improves streaming session reliability
- Enhances transcription accuracy without model change
Testing:
- Verified WebSocket flow and final transcript delivery
- Confirmed no premature socket closure
- Backend runs without errors after changes
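
The ordering checked above (final transcript delivered, then socket closed) could be covered by a small regression check along these lines. This is a sketch under assumptions: `FakeSocket`, `stop_recording`, and the `{"type": "final", ...}` message shape are hypothetical and are not taken from the backend in this PR.

```python
import asyncio
import json
from typing import Callable, List


class FakeSocket:
    """In-memory stand-in for a WebSocket connection that records call order."""

    def __init__(self) -> None:
        self.events: List[str] = []

    async def send(self, message: str) -> None:
        self.events.append("send:" + message)

    async def close(self) -> None:
        self.events.append("close")


async def stop_recording(socket, audio: bytes, transcribe: Callable[[bytes], str]) -> None:
    """Send the final transcript first; close the socket only afterwards."""
    try:
        text = transcribe(audio)
        # Hypothetical message shape; the real payload may differ.
        await socket.send(json.dumps({"type": "final", "text": text}))
    finally:
        # Close even if transcription or sending fails, but never before the send.
        await socket.close()


def run_check() -> List[str]:
    """Drive one stop event against the fake socket and return the recorded events."""
    sock = FakeSocket()
    asyncio.run(stop_recording(sock, b"\x00\x00", lambda _audio: "hello"))
    return sock.events
```

Asserting that the last recorded event is `close` and that a `send:` event precedes it would catch a regression to premature socket closure.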
Closes #15
Edited by Sahasra Reddy