Refactor transcription flow to use single backend job per audio file

srilatha bandari requested to merge feat/models into feat/develop-pro

Summary

This MR refactors the normal backend transcription flow to process the entire uploaded audio file using a single backend transcription job instead of chunk-based uploads.

Previously, the frontend decoded each audio file and split it into chunks client-side. Each chunk independently triggered api.transcribeAudio(...), which produced multiple /api/transcribe requests, multiple job_ids, and chunk-wise transcription output.

With this update, the frontend uploads the complete audio file once using api.transcribeFile(...) and polls a single job_id until processing completes.
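
The new flow can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the transcribeFile signature, the getJobStatus call, the status values, and the polling interval are all assumptions, since api.ts is not shown here.

```typescript
// Hypothetical job shape; the real response fields in api.ts may differ.
type JobStatus = "pending" | "processing" | "completed" | "failed";

interface JobState {
  status: JobStatus;
  text?: string;
  error?: string;
}

// TFile would be File or Blob in the browser; kept generic so the sketch is self-contained.
interface TranscriptionApi<TFile> {
  // Named after the call in this MR; the signature is an assumption.
  transcribeFile(file: TFile): Promise<{ job_id: string }>;
  // Hypothetical status lookup; the MR does not name the polling call.
  getJobStatus(jobId: string): Promise<JobState>;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Upload the whole file once, then poll the single job_id until it finishes.
async function transcribeWholeFile<TFile>(
  api: TranscriptionApi<TFile>,
  file: TFile,
  intervalMs = 2000,
  maxAttempts = 300,
): Promise<string> {
  const { job_id } = await api.transcribeFile(file);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await api.getJobStatus(job_id);
    if (job.status === "completed") return job.text ?? "";
    if (job.status === "failed") {
      throw new Error(job.error ?? "Transcription failed");
    }
    await sleep(intervalMs);
  }
  throw new Error(`Job ${job_id} did not finish after ${maxAttempts} polls`);
}
```

Passing the API object in as a parameter keeps the sketch easy to exercise with a fake in tests; the actual frontend may call the api module directly.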

Changes Made

  • Removed chunk-based transcription flow for normal backend mode.
  • Updated frontend to upload the entire audio file in a single request.
  • Refactored polling logic to handle only one transcription job.
  • Updated UI state handling to display transcription output only after processing has fully completed.
  • Preserved existing chunking behavior for diarization mode where required.
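
One way to model the "display only after completion" rule is a discriminated union for the job state. The sketch below is illustrative only: the state names and the visibleTranscript helper are hypothetical and do not come from this MR's code.

```typescript
// Hypothetical UI state for the single transcription job.
type TranscriptionState =
  | { phase: "idle" }
  | { phase: "uploading" }
  | { phase: "polling"; jobId: string }
  | { phase: "completed"; jobId: string; text: string }
  | { phase: "failed"; message: string };

// Only a completed job exposes text, so partial output cannot be rendered.
function visibleTranscript(state: TranscriptionState): string | null {
  return state.phase === "completed" ? state.text : null;
}
```

Because text exists only on the completed variant, the compiler rejects any attempt to read a transcript while the job is still uploading or polling.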

Benefits

  • Aligns frontend behavior with backend processing flow.
  • Reduces unnecessary API calls and polling requests.
  • Improves transcription consistency and response handling.
  • Prevents partial/chunk-wise transcription rendering.
  • Simplifies job tracking and state management.

Affected Areas

  • Audio transcription flow
  • API handling (api.ts)
  • Job polling mechanism
  • Transcription result rendering

Testing Performed

  • Verified that a single /api/transcribe request is sent per uploaded audio file.
  • Confirmed only one job_id is generated and polled.
  • Validated complete transcription output is returned as a single response.
  • Ensured diarization mode functionality remains unaffected.
