Refactor transcription flow to use single backend job per audio file (!33) · Merge requests · VISWAM / apps / Speech / Voice App Frontend

srilatha bandari requested to merge feat/models into feat/develop-pro May 14, 2026

Summary

This MR refactors the normal backend transcription flow so that each uploaded audio file is processed by a single backend transcription job instead of a series of chunk uploads.

Previously, the frontend decoded each audio file and split it into chunks client-side. Each chunk independently called api.transcribeAudio(...), which produced multiple /api/transcribe requests, multiple job_ids, and chunk-wise transcription output.

With this update, the frontend uploads the complete audio file once using api.transcribeFile(...) and polls a single job_id until processing completes.
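
The upload-once, poll-once flow described above can be sketched roughly as follows. This is an illustration only, not the code in this MR: the MR names api.transcribeFile(...) and job_id but not their signatures, so the TranscriptionApi interface, the getJobStatus method, the JobStatus shape, and the polling interval and attempt limit are all assumptions.

```typescript
// Hypothetical shapes: only transcribeFile and job_id are named in the MR.
interface JobStatus {
  status: "pending" | "processing" | "completed" | "failed";
  text?: string;
  error?: string;
}

// Structural stand-in for a browser File, so the sketch has no DOM dependency.
interface AudioFileLike {
  name: string;
  size: number;
}

interface TranscriptionApi {
  transcribeFile(file: AudioFileLike): Promise<{ job_id: string }>;
  // Assumed wrapper around whatever status endpoint the backend exposes.
  getJobStatus(jobId: string): Promise<JobStatus>;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Uploads the whole file once, then polls the single resulting job
// until it completes, fails, or the attempt limit is reached.
async function transcribeWholeFile(
  api: TranscriptionApi,
  file: AudioFileLike,
  intervalMs = 1000,
  maxAttempts = 300,
): Promise<string> {
  const { job_id } = await api.transcribeFile(file);
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await api.getJobStatus(job_id);
    if (job.status === "completed") return job.text ?? "";
    if (job.status === "failed") {
      throw new Error(job.error ?? "Transcription failed");
    }
    await sleep(intervalMs);
  }
  throw new Error(`Job ${job_id} did not finish after ${maxAttempts} polls`);
}
```

Passing the API object in as a parameter keeps the sketch testable with a fake that counts calls, which is one way to check that exactly one upload is made per file.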

Changes Made

Removed chunk-based transcription flow for normal backend mode.
Updated frontend to upload the entire audio file in a single request.
Refactored polling logic to handle only one transcription job.
Updated UI state handling to display transcription output only once processing has fully completed.
Preserved existing chunking behavior for diarization mode where required.
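
One way to picture the single-job UI state handling listed above is a small state machine that exposes transcript text only in its final phase. This is a hypothetical sketch: the phase names, event names, and the nextState and visibleTranscript helpers are invented for illustration and are not taken from the MR's code.

```typescript
// Hypothetical UI state for one transcription job (names are assumptions).
type TranscriptionState =
  | { phase: "idle" }
  | { phase: "uploading" }
  | { phase: "polling"; jobId: string }
  | { phase: "done"; jobId: string; text: string }
  | { phase: "error"; message: string };

type TranscriptionEvent =
  | { type: "uploadStarted" }
  | { type: "jobCreated"; jobId: string }
  | { type: "jobCompleted"; text: string }
  | { type: "jobFailed"; message: string };

// Pure transition function: events that arrive in the wrong phase are ignored.
function nextState(
  state: TranscriptionState,
  event: TranscriptionEvent,
): TranscriptionState {
  switch (event.type) {
    case "uploadStarted":
      return { phase: "uploading" };
    case "jobCreated":
      return state.phase === "uploading"
        ? { phase: "polling", jobId: event.jobId }
        : state;
    case "jobCompleted":
      return state.phase === "polling"
        ? { phase: "done", jobId: state.jobId, text: event.text }
        : state;
    case "jobFailed":
      return { phase: "error", message: event.message };
  }
}

// The UI reads transcript text through this helper, so nothing is
// rendered until the single job has completed.
function visibleTranscript(state: TranscriptionState): string | null {
  return state.phase === "done" ? state.text : null;
}
```

Because only one jobId is tracked at a time, there is no per-chunk bookkeeping, and partial output has no state in which it could be rendered.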

Benefits

Aligns frontend behavior with backend processing flow.
Reduces unnecessary API calls and polling requests.
Improves transcription consistency and response handling, since output comes from one job rather than independently transcribed chunks.
Prevents partial/chunk-wise transcription rendering.
Simplifies job tracking and state management.

Affected Areas

Audio transcription flow
API handling (api.ts)
Job polling mechanism
Transcription result rendering

Testing Performed

Verified that a single /api/transcribe request is created per uploaded audio file.
Confirmed that only one job_id is generated and polled.
Validated that the complete transcription output is returned as a single response.
Ensured diarization mode functionality remains unaffected.
