feat: benchmarking separate models
Summary
Added benchmarks for every major model in the ASR pipeline. Each model is loaded and evaluated in isolation so that its load time, inference time, and memory usage can be measured independently of the other pipeline stages.
Models Benchmarked
Transcription Models
- swecha_gonthuka
- distil-whisper/distil-large-v3
Speaker Diarization
- pyannote/speaker-diarization-3.1
Punctuation Restoration
- ModelsLab/punctuate-indic-v1
Language Recognition
- openai/whisper-small
Benchmarking Metrics
The following metrics were collected for each model:
- Model Load Time
- Transcription/Inference Time
- RAM Usage / Memory Consumption
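For reviewers, the three metrics above can be collected with a small harness along the following lines. This is a minimal sketch, not the code in this change: `benchmark_model`, `BenchmarkResult`, `fake_load`, and `fake_infer` are hypothetical names, and the stand-in "model" is just a Python list. Note that `tracemalloc` only sees allocations made through Python's allocator, so for real models with native or GPU buffers a process-level RSS reading would likely be more representative.

```python
import time
import tracemalloc
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class BenchmarkResult:
    load_seconds: float
    inference_seconds: float
    peak_python_mib: float  # peak memory traced by tracemalloc, in MiB


def benchmark_model(
    load_fn: Callable[[], Any],
    infer_fn: Callable[[Any], Any],
) -> BenchmarkResult:
    """Time model loading and one inference call, tracking peak Python memory."""
    tracemalloc.start()
    try:
        start = time.perf_counter()
        model = load_fn()
        load_seconds = time.perf_counter() - start

        start = time.perf_counter()
        infer_fn(model)
        inference_seconds = time.perf_counter() - start

        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    return BenchmarkResult(
        load_seconds=load_seconds,
        inference_seconds=inference_seconds,
        peak_python_mib=peak_bytes / (1024 * 1024),
    )


# Hypothetical stand-ins for a real model loader and inference call.
def fake_load() -> list:
    return list(range(100_000))


def fake_infer(model: list) -> int:
    return sum(model)


result = benchmark_model(fake_load, fake_infer)
print(result)
```

In practice `load_fn` would wrap the model's loading call and `infer_fn` would run it on one audio sample; both are left abstract here because the actual loaders are not shown in this description.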
Benchmark Dataset
Benchmarking was performed using both Telugu and English audio samples with varying durations:
- 30 seconds
- 60 seconds (1 minute)
Purpose
The benchmarking was conducted to:
- Measure individual model performance in isolation
- Compare inference efficiency across models
- Analyze memory utilization and scalability
- Identify performance bottlenecks in the ASR pipeline
Notes
- Each model was benchmarked independently to avoid interference from other pipeline components.
- Results can be used for optimization, deployment planning, and future model selection decisions.
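One way to get the isolation described in the notes is to run each model's benchmark in a fresh interpreter, so memory held by one model cannot affect the readings for the next. The sketch below illustrates that idea under stated assumptions: `WORKER` and `run_isolated` are hypothetical, and the worker only simulates a load step rather than loading the named model.

```python
import json
import subprocess
import sys

# Hypothetical worker script. A real worker would load the model named in
# sys.argv[1] and report its measured metrics; this one times a stand-in.
WORKER = """
import json, sys, time
name = sys.argv[1]
start = time.perf_counter()
data = [0] * 100_000  # stand-in for loading the model
elapsed = time.perf_counter() - start
print(json.dumps({"model": name, "load_seconds": elapsed}))
"""


def run_isolated(model_name: str) -> dict:
    """Run the worker in a separate Python process and parse its JSON output."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER, model_name],
        capture_output=True,
        text=True,
        check=True,  # raise if the worker exits with a non-zero status
    )
    return json.loads(proc.stdout)


results = [
    run_isolated(name)
    for name in ["distil-whisper/distil-large-v3", "openai/whisper-small"]
]
for row in results:
    print(row)
```

Starting a new process per model costs some startup time, but that overhead sits outside the timed region, so it should not distort the reported figures.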
Closes #23
Edited by vyshnavi