feature(backend): Phase 2 WebSocket streaming for real-time ASR (non-blocking, low-latency)

Feature/websocket streaming

🚀 Phase 2: Real-Time Streaming via WebSocket

Summary

Adds a WebSocket endpoint (/api/stream) to enable low-latency, real-time transcription. Audio is streamed in chunks, processed incrementally, and partial/final transcripts are returned to the client.

Key Features

  • WebSocket streaming endpoint: /api/stream
  • Chunked buffering (~1s audio) with in-place operations
  • Non-blocking ASR execution via asyncio.to_thread
  • Incremental responses: { text, is_final }
  • Robust handling of disconnects and malformed frames
  • Memory-safe buffer cap
  • Structured [STREAM] logging
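
To make the buffering and non-blocking behaviour listed above concrete, here is a minimal, framework-agnostic sketch. It is not the code from this merge request: the names `ChunkBuffer`, `BufferOverflow`, `stream_transcripts`, and `fake_transcribe` are hypothetical, as are the 16 kHz / 16-bit PCM format and the 10-chunk cap. Only `asyncio.to_thread` and the `{ text, is_final }` response shape come from the description.

```python
import asyncio
import json

# Assumptions for illustration only: 16 kHz, 16-bit mono PCM.
SAMPLE_RATE = 16_000
BYTES_PER_SAMPLE = 2
CHUNK_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE  # roughly 1 s of audio
MAX_BUFFER_BYTES = 10 * CHUNK_BYTES           # hypothetical cap


class BufferOverflow(Exception):
    """Raised when a frame would push the buffer past its cap."""


class ChunkBuffer:
    """Accumulates raw frames and hands out fixed-size chunks."""

    def __init__(self, chunk_bytes=CHUNK_BYTES, max_bytes=MAX_BUFFER_BYTES):
        self._buf = bytearray()
        self.chunk_bytes = chunk_bytes
        self.max_bytes = max_bytes

    def feed(self, frame: bytes) -> list[bytes]:
        # Reject the frame before appending so memory stays bounded.
        if len(self._buf) + len(frame) > self.max_bytes:
            raise BufferOverflow("frame exceeds buffer cap")
        self._buf += frame  # in-place extend, no new bytearray
        chunks = []
        while len(self._buf) >= self.chunk_bytes:
            chunks.append(bytes(self._buf[: self.chunk_bytes]))
            del self._buf[: self.chunk_bytes]  # in-place drain
        return chunks

    def flush(self) -> bytes:
        rest = bytes(self._buf)
        self._buf.clear()
        return rest


def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for the real (blocking) ASR call.
    return f"<{len(chunk)} bytes>"


async def stream_transcripts(frames, transcribe=fake_transcribe, buffer=None):
    """Turn incoming frames into JSON messages shaped {text, is_final}."""
    if buffer is None:
        buffer = ChunkBuffer()
    messages = []
    for frame in frames:
        if not isinstance(frame, (bytes, bytearray)):
            continue  # skip malformed (non-binary) frames
        for chunk in buffer.feed(bytes(frame)):
            # Run the blocking ASR call in a worker thread so the
            # event loop stays free for other connections.
            text = await asyncio.to_thread(transcribe, chunk)
            messages.append(json.dumps({"text": text, "is_final": False}))
    tail = buffer.flush()
    text = await asyncio.to_thread(transcribe, tail) if tail else ""
    messages.append(json.dumps({"text": text, "is_final": True}))
    return messages
```

In a real endpoint the `frames` iterable would be replaced by the WebSocket receive loop and each message would be sent as soon as it is produced. In this sketch a disconnect simply ends the loop, and the remaining audio is flushed as the final transcript.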

Design Notes

  • No changes to Phase 1 (async job system)
  • No additional dependencies introduced
  • Temp file lifecycle handled within ASR layer
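
As an illustration of the last design note, the sketch below shows one way an ASR layer can own its temp file from creation to cleanup, so the streaming handler only ever passes bytes. `transcribe_chunk` and `run_model` are hypothetical names, and the `.raw` suffix is an assumption; the actual ASR layer in this project may work differently.

```python
import os
import tempfile


def transcribe_chunk(pcm: bytes, run_model) -> str:
    """Write audio to a temp file, run the model on it, always clean up.

    `run_model` is a hypothetical callable taking a file path and
    returning the transcript text.
    """
    fd, path = tempfile.mkstemp(suffix=".raw")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(pcm)
        return run_model(path)
    finally:
        # Runs on success and on error, so no temp files are left behind.
        os.remove(path)
```

Keeping the `try`/`finally` inside the ASR function means callers such as a WebSocket handler need no cleanup logic of their own, even when the model raises.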

Status

Production-ready; ready to merge.