feat: benchmarking separate models (!35) · Merge requests · VISWAM / apps / Speech / Voice App Backend

vyshnavi requested to merge asr-benchmarking into develop May 10, 2026

Summary

Implemented benchmarks for all major models used in the ASR pipeline. Each model is isolated and evaluated on its own so that its performance characteristics can be analyzed independently of the other pipeline stages.

Models Benchmarked

Transcription Models

swecha_gonthuka
distil-whisper/distil-large-v3

Speaker Diarization

pyannote/speaker-diarization-3.1

Punctuation Restoration

ModelsLab/punctuate-indic-v1

Language Recognition

openai/whisper-small

Benchmarking Metrics

The following metrics were collected for each model:

Model Load Time
Transcription/Inference Time
RAM Usage / Memory Consumption
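
For reviewers, here is a minimal sketch of how these three metrics could be captured for a single model. It is not the code in this MR: `benchmark_model`, `BenchmarkResult`, `fake_load`, and `fake_infer` are hypothetical names, and the stand-in "model" is a plain list so the example runs without any ML dependencies. Memory is tracked with the standard-library `tracemalloc`, which only sees allocations made through Python's allocator; measuring process RSS instead would likely be closer to real RAM usage for models backed by native tensors.

```python
import time
import tracemalloc
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class BenchmarkResult:
    load_seconds: float        # wall-clock time spent in load_fn
    inference_seconds: float   # wall-clock time spent in infer_fn
    peak_python_bytes: int     # peak Python-level allocation seen by tracemalloc


def benchmark_model(load_fn: Callable[[], Any],
                    infer_fn: Callable[[Any], Any]) -> BenchmarkResult:
    """Time model loading and one inference call, tracking peak Python memory."""
    tracemalloc.start()
    try:
        start = time.perf_counter()
        model = load_fn()
        load_seconds = time.perf_counter() - start

        start = time.perf_counter()
        infer_fn(model)
        inference_seconds = time.perf_counter() - start

        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return BenchmarkResult(load_seconds, inference_seconds, peak)


# Hypothetical stand-ins for "load a model" and "run inference on audio".
def fake_load() -> list:
    return [0.0] * 100_000


def fake_infer(model: list) -> float:
    return sum(model)


result = benchmark_model(fake_load, fake_infer)
print(result)
```

In a real run, `load_fn` and `infer_fn` would wrap the actual model loader and a call on one audio sample; repeating the inference call several times and averaging would reduce timing noise.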

Benchmark Dataset

Benchmarking was performed using both Telugu and English audio samples with varying durations:

30 seconds
60 seconds (1 minute)

Purpose

The benchmarking was conducted to:

Measure individual model performance in isolation
Compare inference efficiency across models
Analyze memory utilization and scalability
Identify performance bottlenecks in the ASR pipeline

Notes

Each model was benchmarked independently to avoid interference from other pipeline components.
Results can be used for optimization, deployment planning, and future model selection decisions.
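
One possible way to keep models from interfering with each other is to run each benchmark in a fresh Python interpreter, so memory and cached state from one model cannot leak into the next measurement. The sketch below illustrates that idea only; the MR does not state how isolation was achieved. `run_isolated` and `CHILD_CODE` are hypothetical names, and the list allocation in the child process is a placeholder for loading a real model.

```python
import json
import subprocess
import sys

# Code executed in the child interpreter. With `python -c`, sys.argv[1] and
# sys.argv[2] are the first two extra command-line arguments.
CHILD_CODE = """
import json, sys, time
start = time.perf_counter()
model = [0.0] * int(sys.argv[2])   # placeholder for loading a real model
elapsed = time.perf_counter() - start
print(json.dumps({"model": sys.argv[1], "load_seconds": elapsed}))
"""


def run_isolated(model_name: str, size: int) -> dict:
    """Run one placeholder benchmark in its own process and parse its JSON output."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, model_name, str(size)],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return json.loads(proc.stdout)


# Model names are taken from the list above; the size argument is arbitrary.
results = [
    run_isolated(name, 50_000)
    for name in ("openai/whisper-small", "pyannote/speaker-diarization-3.1")
]
for row in results:
    print(row)
```

Because each child process exits after reporting, its memory is returned to the operating system before the next model is measured.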

Closes #23

Edited May 10, 2026 by vyshnavi
