feat(records): add human speech ratio and quality tier to audio quality analysis

What changed

  • app/utils/snr_frequency.py: Added calculate_bandpass_ratio(), which measures the share of signal energy in the 300–3400 Hz human speech band using an ffmpeg highpass+lowpass filter chain and returns a value from 0.0 to 1.0. Also added the _extract_rms_level() helper and classify_quality_tier(), which maps SNR to high (≥35 dB), medium (≥20 dB), or low (<20 dB)
  • app/models/record.py: Added human_speech_ratio (Float, nullable) and quality_tier (String(10), nullable) fields to Record
  • app/api/v1/endpoints/records.py: Both values are calculated during chunked-upload finalize alongside snr_frequency and stored on the record; added a quality_tier filter to GET /records/
  • app/schemas/__init__.py: Exposed human_speech_ratio and quality_tier in RecordRead
  • alembic/versions/a1b2c3d4e5f7: Adds human_speech_ratio and quality_tier columns to the record table (down_revision 77cae534757f)
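
For reviewers, the tier thresholds and the ratio calculation described above can be sketched roughly as follows. This is a minimal illustration, not the code in app/utils/snr_frequency.py: the None handling is an assumption based on the nullable columns, and band_energy_ratio() is a hypothetical helper that assumes the ratio is derived from two RMS levels in dB (band-filtered vs. unfiltered).

```python
from typing import Optional


def classify_quality_tier(snr_db: Optional[float]) -> Optional[str]:
    """Map an SNR value in dB to a tier label using the thresholds listed above."""
    if snr_db is None:
        # Assumption: no SNR measurement means no tier (the column is nullable).
        return None
    if snr_db >= 35.0:
        return "high"
    if snr_db >= 20.0:
        return "medium"
    return "low"


def band_energy_ratio(band_rms_db: float, full_rms_db: float) -> float:
    """Hypothetical helper: turn two RMS levels (dB) into an energy ratio in [0.0, 1.0]."""
    # RMS level in dB is 20*log10(rms), so the energy (power) ratio
    # between two signals is 10 ** (delta_dB / 10).
    ratio = 10.0 ** ((band_rms_db - full_rms_db) / 10.0)
    # Clamp so filter ripple or rounding cannot push the result outside 0.0-1.0.
    return max(0.0, min(1.0, ratio))
```

Under these assumptions, an SNR of 5.0 dB falls in the "low" tier, which matches the finalize result reported in the test plan.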

Test plan

  • Migration a1b2c3d4e5f7 applies cleanly: alembic upgrade a1b2c3d4e5f7
  • calculate_bandpass_ratio() verified in container: a 1000 Hz sine wave returns ratio=0.9844 (fully within the speech band)
  • Chunked upload of audio file → finalize returns snr_frequency=5.0, human_speech_ratio=0.8206, quality_tier="low"
  • GET /records/c8d8122a-3d70-4262-8e56-707569607005 — all three quality fields present in response
  • GET /records/?quality_tier=low — returns only records with matching tier
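
The expected behavior of the quality_tier filter can be illustrated with an in-memory sketch. RecordStub, VALID_TIERS, and filter_by_quality_tier() are hypothetical names used only for illustration; the actual endpoint presumably filters in the database query, and rejecting unknown tier values is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Tier labels taken from the description above.
VALID_TIERS = {"high", "medium", "low"}


@dataclass
class RecordStub:
    """Hypothetical stand-in for the Record model, reduced to the fields used here."""
    id: str
    quality_tier: Optional[str]


def filter_by_quality_tier(
    records: Iterable[RecordStub], quality_tier: Optional[str] = None
) -> List[RecordStub]:
    """Return only records whose tier matches; no filter value returns everything."""
    if quality_tier is None:
        return list(records)
    if quality_tier not in VALID_TIERS:
        # Assumption: unknown tier values are rejected rather than silently ignored.
        raise ValueError(f"unknown quality_tier: {quality_tier!r}")
    return [r for r in records if r.quality_tier == quality_tier]


records = [
    RecordStub("a", "low"),
    RecordStub("b", "high"),
    RecordStub("c", None),  # record uploaded before the new columns existed
    RecordStub("d", "low"),
]
low_ids = [r.id for r in filter_by_quality_tier(records, "low")]
```

Records with a null tier are excluded whenever a filter value is supplied, which is the behavior an equality filter on a nullable column would give.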

Closes #51 (closed)