test: improve the test coverage

Vandana reddy Balannagari requested to merge feat/tests into main

Merge Request: Improve Test Coverage

Overview

This MR significantly improves test coverage for the ASR Frontend service from 18% to 81%, adding comprehensive unit tests for all major modules.

Changes

Test Coverage Improvements

Module              Before  After  Improvement (pts)
app/asr_service.py  7%      77%    +70
app/main.py         34%     88%    +54
app/config.py       0%      100%   +100
Overall             18%     81%    +63

Test Files Modified

tests/test_asr_service.py

Added comprehensive tests for:

  • Punctuation restoration (empty text, exceptions, pipeline failures)
  • Chunk normalization (tuple/list/separate field formats, invalid timestamps)
  • Synthetic chunk creation (empty text, audio duration handling)
  • Timestamp formatting (SRT/VTT, edge cases)
  • Subtitle generation (empty chunks, overlapping timestamps, format selection)
  • ASR pipeline lazy loading (local vs remote model paths)
  • Audio conversion (ffmpeg errors, timeouts)
  • Word timestamp generation (fallback paths)
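To illustrate the kind of timestamp-formatting cases listed above, here is a minimal pytest-style sketch. The function format_timestamp and its signature are hypothetical stand-ins, not the actual code in app/asr_service.py, and clamping negative input to zero is an assumption. The only format facts relied on are that SRT separates milliseconds with a comma and WebVTT with a period.

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Hypothetical stand-in for the service's formatter (assumed name and signature)."""
    # Clamp negatives to zero; this is an assumption about invalid-input handling.
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    sep = "," if fmt == "srt" else "."  # SRT uses a comma, WebVTT a period
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def test_srt_uses_comma_separator():
    assert format_timestamp(3661.5, "srt") == "01:01:01,500"


def test_vtt_uses_period_separator():
    assert format_timestamp(3661.5, "vtt") == "01:01:01.500"


def test_edge_cases():
    assert format_timestamp(0, "srt") == "00:00:00,000"
    # Rounding up must carry into the seconds field rather than emit 1000 ms.
    assert format_timestamp(59.9996, "vtt") == "00:01:00.000"
    assert format_timestamp(-2.0, "srt") == "00:00:00,000"
```

Working in integer milliseconds before splitting into fields avoids the carry bug that the rounding case checks for.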

tests/test_main.py

Added tests for:

  • Login page with redirect parameter handling
  • Transcription endpoint exception handling
  • Subtitle generation with text-only input
  • Subtitle empty content validation
  • Subtitle JSON decode error handling
  • Subtitle generic exception handling

Test Statistics

  • Total Tests: 92 passing tests
  • Test Files: 4 files (test_app.py, test_asr_service.py, test_config.py, test_main.py)
  • Execution Time: ~20 seconds

Key Features Tested

ASR Service (app/asr_service.py)

  • Punctuation restoration with various entity formats
  • Chunk normalization across multiple timestamp formats
  • Synthetic chunk creation for audio without timestamps
  • SRT/VTT subtitle generation with edge cases
  • Model loading (local and remote paths)
  • Audio conversion error handling
  • Transcription timeout and exception handling
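The audio conversion error paths can be exercised without ffmpeg installed by patching subprocess.run. The sketch below assumes the service shells out to ffmpeg through subprocess.run and wraps failures in its own exception; convert_to_wav and AudioConversionError are hypothetical names used only for illustration.

```python
import subprocess
from unittest import mock


class AudioConversionError(Exception):
    """Hypothetical error type; the real service may signal failure differently."""


def convert_to_wav(src: str, dst: str, timeout: float = 60.0) -> str:
    """Hypothetical stand-in for the service's ffmpeg wrapper."""
    cmd = ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]
    try:
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise AudioConversionError("ffmpeg timed out") from exc
    except subprocess.CalledProcessError as exc:
        raise AudioConversionError("ffmpeg failed") from exc
    return dst


def _conversion_error_message(side_effect) -> str:
    """Run convert_to_wav with subprocess.run patched to raise side_effect."""
    with mock.patch("subprocess.run", side_effect=side_effect):
        try:
            convert_to_wav("in.mp3", "out.wav")
        except AudioConversionError as exc:
            return str(exc)
    raise AssertionError("expected AudioConversionError")


def test_timeout_is_wrapped():
    err = subprocess.TimeoutExpired(cmd="ffmpeg", timeout=60)
    assert _conversion_error_message(err) == "ffmpeg timed out"


def test_nonzero_exit_is_wrapped():
    err = subprocess.CalledProcessError(returncode=1, cmd="ffmpeg")
    assert _conversion_error_message(err) == "ffmpeg failed"


def test_success_returns_destination():
    # The patched run() returns a MagicMock, so no real process is started.
    with mock.patch("subprocess.run") as fake_run:
        assert convert_to_wav("in.mp3", "out.wav") == "out.wav"
    fake_run.assert_called_once()
```

Patching at the subprocess boundary keeps these tests fast and independent of the host's ffmpeg build.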

API Endpoints (app/main.py)

  • Health check endpoint
  • Status endpoint with model info
  • Login/logout authentication flow
  • File transcription endpoint
  • Subtitle generation endpoint
  • Error handling for invalid inputs
  • CORS middleware
  • Static file mounting
  • Cache control headers

Configuration (app/config.py)

  • All default settings values
  • Audio extension validation
  • MIME type validation
  • Environment variable overrides
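Environment variable overrides are typically tested by patching os.environ for the duration of a single test. The following is a sketch only: the Settings class, its fields, its defaults, and the ASR_* variable names are all hypothetical and do not come from app/config.py.

```python
import os
from dataclasses import dataclass
from unittest import mock


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; field names and defaults are illustrative."""
    model_name: str
    max_upload_mb: int


def load_settings() -> Settings:
    # Read the environment at call time so each test sees its own overrides.
    return Settings(
        model_name=os.environ.get("ASR_MODEL_NAME", "default-model"),
        max_upload_mb=int(os.environ.get("ASR_MAX_UPLOAD_MB", "25")),
    )


def test_defaults_apply_when_env_is_empty():
    # clear=True removes every variable inside the block, then restores them.
    with mock.patch.dict(os.environ, {}, clear=True):
        settings = load_settings()
    assert settings == Settings(model_name="default-model", max_upload_mb=25)


def test_env_vars_override_defaults():
    overrides = {"ASR_MODEL_NAME": "custom-model", "ASR_MAX_UPLOAD_MB": "100"}
    with mock.patch.dict(os.environ, overrides):
        settings = load_settings()
    assert settings.model_name == "custom-model"
    assert settings.max_upload_mb == 100
```

Because mock.patch.dict restores the original mapping on exit, the tests do not leak state into each other regardless of execution order.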

Remaining Coverage Gaps (19%)

The remaining uncovered code consists of:

  1. Complex ML Operations (asr_service.py lines 333-446)

    • Word-level timestamp alignment using CTC models
    • Requires actual model loading and tensor operations
    • Would need integration tests with real models
  2. URL Transcription (main.py lines 207-222)

    • Imports requests inside the function body rather than at module level
    • Difficult to mock without refactoring
  3. Specific Error Paths (various lines)

    • Exception logging statements
    • Duplicate endpoint definitions
    • Edge case error handling

These gaps are acceptable as they represent:

  • Complex integration scenarios better tested with integration tests
  • Error handling that's difficult to trigger without breaking dependencies
  • Code that would require significant refactoring to test in isolation
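For the URL transcription gap, one option that might work without refactoring is to substitute the requests module in sys.modules while the test runs, since a function-local import statement is resolved through sys.modules each time it executes. This sketch assumes the endpoint follows the common pattern of import requests followed by requests.get(...); the MR does not show that code, and fetch_audio is a hypothetical stand-in.

```python
import sys
import types
from unittest import mock


def fetch_audio(url: str) -> bytes:
    """Hypothetical stand-in for a helper that imports inside the function."""
    import requests  # looked up in sys.modules on every call

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.content


def test_fetch_audio_with_stubbed_requests():
    fake_response = mock.Mock(content=b"RIFF-fake-audio")
    fake_requests = types.ModuleType("requests")
    fake_requests.get = mock.Mock(return_value=fake_response)

    # Swap in the stub module; patch.dict restores sys.modules afterwards.
    with mock.patch.dict(sys.modules, {"requests": fake_requests}):
        data = fetch_audio("https://example.invalid/clip.wav")

    assert data == b"RIFF-fake-audio"
    fake_requests.get.assert_called_once_with(
        "https://example.invalid/clip.wav", timeout=30
    )
    fake_response.raise_for_status.assert_called_once_with()
```

Whether this applies depends on how main.py actually uses the import, so it is a possible follow-up rather than a change proposed in this MR.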

Testing Instructions

# Activate virtual environment
source venv/bin/activate

# Run all tests with coverage
python -m pytest tests/ --cov=app --cov-report=term-missing

# Run specific test file
python -m pytest tests/test_asr_service.py -v
python -m pytest tests/test_main.py -v

# Run tests with HTML coverage report
python -m pytest tests/ --cov=app --cov-report=html
# Open htmlcov/index.html in browser

Benefits

  1. Confidence in Refactoring: High coverage allows safe refactoring
  2. Regression Prevention: Catches bugs before they reach production
  3. Documentation: Tests serve as living documentation of expected behavior
  4. Code Quality: Writing tests revealed edge cases and potential issues
  5. Faster Development: Quick feedback loop during development

Breaking Changes

None. This MR only adds tests and makes no production code changes.

Dependencies

No new dependencies added. Uses existing:

  • pytest (already in requirements)
  • pytest-cov (already in requirements)

Checklist

  • All tests passing (92/92)
  • Coverage improved to 81%
  • No production code changes
  • No new dependencies
  • Tests execute in reasonable time (<30s)
  • Test code follows project conventions

Related Issues

  • Improves test coverage for ASR transcription service
  • Adds comprehensive API endpoint testing
  • Covers edge cases in subtitle generation
