Phase 6: Benchmarking

Build a benchmarking engine to evaluate ASR model performance.
Implement API endpoints to run and retrieve benchmark jobs.
Accept audio input as uploaded files.
Use the model-router to run inference on each audio sample.
Support benchmarking for multiple models: Swecha Gonthuka Whisper
Follow an asynchronous job pattern:
  POST /benchmark → returns job_id
  GET /benchmark/{job_id} → returns results
Generate a structured JSON report of the computed metrics.
Ensure compatibility with CI dashboards for automated evaluation.