feat: add audio metadata extraction pipeline and update API branding to metadata-extractor

Mohana Sri Bhavitha requested to merge audio-L1 into develop

🚀 Overview

This MR introduces audio metadata extraction support and updates API branding to align with the broader metadata extraction scope.


Changes

🎧 Audio Extraction Pipeline

  • Implemented audio metadata extraction using ffprobe
  • Extracted key fields:
    • duration, bitrate, sample rate, channels, codec, format, file size
  • Added robust parsing and normalization:
    • type-safe conversions
    • bitrate normalization
    • duration rounding
  • Integrated audio processing into the existing /extract pipeline
  • Added routing support for audio file types (mp3, wav, flac)
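A minimal sketch of how the ffprobe-based extraction and normalization described above could look. The function names (`probe_audio`, `normalize_audio_metadata`), the output keys, and the chosen units (kbps for bitrate, two-decimal rounding for duration) are illustrative assumptions, not the code in this MR.

```python
import json
import subprocess
from typing import Any, Optional


def _to_float(value: Any) -> Optional[float]:
    """Return value as float, or None if it is missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def _to_int(value: Any) -> Optional[int]:
    """ffprobe reports most numbers as strings; return an int or None."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def normalize_audio_metadata(probe: dict) -> dict:
    """Map parsed ffprobe JSON to a flat dict (key names are hypothetical)."""
    fmt = probe.get("format") or {}
    streams = probe.get("streams") or []
    # Use the first audio stream; fall back to an empty dict if none exists.
    audio = next((s for s in streams if s.get("codec_type") == "audio"), {})

    duration = _to_float(fmt.get("duration"))
    bit_rate = _to_int(fmt.get("bit_rate") or audio.get("bit_rate"))

    return {
        # Assumed rounding: two decimal places.
        "duration_seconds": round(duration, 2) if duration is not None else None,
        # Assumed normalization: bits per second to kilobits per second.
        "bitrate_kbps": round(bit_rate / 1000) if bit_rate is not None else None,
        "sample_rate_hz": _to_int(audio.get("sample_rate")),
        "channels": _to_int(audio.get("channels")),
        "codec": audio.get("codec_name"),
        "format": fmt.get("format_name"),
        "file_size_bytes": _to_int(fmt.get("size")),
    }


def probe_audio(path: str) -> dict:
    """Run ffprobe on a file and return normalized metadata."""
    cmd = [
        "ffprobe", "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return normalize_audio_metadata(json.loads(result.stdout))
```

Keeping normalization separate from the subprocess call means the parsing logic can be exercised with plain dicts, including ones with missing fields, without needing ffprobe installed.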

🧩 Pipeline Integration

  • Added process_audio() to the pipeline
  • Extended routing logic to support audio files
  • Maintained separation of concerns (no extraction logic in endpoint)
  • Ensured no regression in existing PDF and image flows
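The routing described above could be sketched roughly as follows. Only `process_audio` and the audio extensions (mp3, wav, flac) come from this MR; `process_pdf`, `process_image`, `select_processor`, `extract`, and the image extension list are hypothetical stand-ins used for illustration.

```python
from pathlib import Path
from typing import Callable


# Stand-in processors; the real ones live in the pipeline module.
def process_pdf(path: str) -> dict:
    return {"kind": "pdf", "path": path}


def process_image(path: str) -> dict:
    return {"kind": "image", "path": path}


def process_audio(path: str) -> dict:
    return {"kind": "audio", "path": path}


AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed, not listed in the MR


def select_processor(filename: str) -> Callable[[str], dict]:
    """Pick a processor from the file extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return process_pdf
    if suffix in IMAGE_EXTENSIONS:
        return process_image
    if suffix in AUDIO_EXTENSIONS:
        return process_audio
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def extract(path: str) -> dict:
    """What the endpoint would call; it contains no extraction logic itself."""
    return select_processor(path)(path)
```

With this shape, the endpoint only delegates to the pipeline, and adding a new format means adding one processor and one routing branch.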

🏷️ API Branding Update

  • Updated FastAPI title to "metadata-extractor"
  • Updated README to reflect multi-format support:
    • documents (PDF)
    • images
    • audio (new)
    • video (upcoming)

Testing

  • Verified PDF extraction remains unaffected
  • Verified image extraction remains unaffected
  • Tested audio extraction with valid .mp3 files
  • Verified handling of edge cases:
    • corrupted/low-quality files
    • missing metadata fields

Notes

  • Python package name (bookextractor) is intentionally unchanged to avoid breaking imports
  • A full package rename will be handled in a separate refactor MR once the feature stabilizes

🔜 Next Steps

  • Add video metadata extraction pipeline
  • Introduce unified file-type detection (magic bytes)
  • Make the response schema consistent across all supported file types
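For the planned magic-byte detection, one possible starting point is shown below. It is a sketch only: the function names are hypothetical, and the signature list covers just the formats mentioned in this MR plus two common image types.

```python
from typing import Optional


def detect_file_type(header: bytes) -> Optional[str]:
    """Guess a file type from its leading bytes; None if unrecognized."""
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"fLaC"):
        return "flac"
    # WAV is a RIFF container with the form type "WAVE" at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3 files either start with an ID3v2 tag or directly with an MPEG
    # frame, whose header begins with 11 set bits (frame sync).
    if header.startswith(b"ID3"):
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return None


def detect_path(path: str) -> Optional[str]:
    """Read only the first 12 bytes, enough for every check above."""
    with open(path, "rb") as handle:
        return detect_file_type(handle.read(12))
```

Checking content instead of the extension would let the router reject mislabeled uploads before they reach a processor. The frame-sync check for MP3 is a heuristic and can produce false positives on arbitrary binary data.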
