feat: add automatic media type detection for bulk upload

Kuruva Laxmi requested to merge feature/auto-media-type-detection into main

🚀 Summary

This MR implements automatic media type detection for bulk uploads in the CLI.

Previously, users had to select a single media type (e.g., image, audio, document), which was then applied to every file in the directory. For directories containing mixed media, this assigned the wrong media type to some files and could trigger API validation errors (e.g., HTTP 422).

This change introduces per-file media type detection based on file extensions.


Changes Made

  • Added detect_media_type(file_path) function
  • Introduced extension-based media type mapping
  • Integrated Python's mimetypes module as a fallback
  • Implemented per-file media type detection during upload
  • Added Auto-detect (default) option in media type selection
  • Maintained backward compatibility with manual override
  • Added logging for detected media type per file
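
The per-file flow listed above (auto-detect by default, manual override preserved, one log line per file) could be sketched roughly as follows. This is an illustration, not the code in this MR: `resolve_media_types`, the `"auto"` sentinel, and the `detect` parameter are hypothetical names, and the real CLI may structure this differently.

```python
import logging
from pathlib import Path

logger = logging.getLogger("bulk_upload")  # hypothetical logger name

AUTO = "auto"  # hypothetical sentinel for the "Auto-detect (default)" option


def resolve_media_types(directory, media_type=AUTO, detect=None):
    """Return (path, media_type) pairs for every regular file in `directory`.

    With media_type == "auto", each file is classified by calling `detect(path)`.
    Any other value is treated as a manual override and applied to all files,
    which matches the pre-existing single-type behavior.
    """
    if media_type == AUTO and detect is None:
        raise ValueError("auto-detection requires a detect(path) callable")

    results = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        chosen = detect(path) if media_type == AUTO else media_type
        logger.info("Detected media type for %s: %s", path.name, chosen)
        results.append((path, chosen))
    return results
```

Passing the detector in as a callable keeps the override logic testable without touching the upload API; in the CLI it would presumably be `detect_media_type`.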

📁 Media Type Detection

File Type            Detected As
.png, .jpg, .jpeg    image
.mp3, .wav           audio
.mp4                 video
.txt, .md            text
others               document
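
A minimal sketch of how `detect_media_type(file_path)` might implement the table above. The mapping mirrors the table, but the function body is an assumption rather than the code in this MR. In particular, the way the `mimetypes` fallback is used here (keeping only the top-level MIME type when it is image, audio, video, or text) is a guess at the intended behavior.

```python
import mimetypes
from pathlib import Path

# Extension-to-media-type mapping, taken from the table in this MR.
EXTENSION_MEDIA_TYPES = {
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".mp3": "audio",
    ".wav": "audio",
    ".mp4": "video",
    ".txt": "text",
    ".md": "text",
}

# Assumption: only these top-level MIME types are accepted from the fallback.
FALLBACK_MEDIA_TYPES = {"image", "audio", "video", "text"}


def detect_media_type(file_path):
    """Classify a file as image, audio, video, text, or document."""
    suffix = Path(file_path).suffix.lower()
    if suffix in EXTENSION_MEDIA_TYPES:
        return EXTENSION_MEDIA_TYPES[suffix]

    # Fallback: ask the standard library to guess a MIME type from the name.
    guessed, _encoding = mimetypes.guess_type(str(file_path))
    if guessed:
        top_level = guessed.split("/", 1)[0]
        if top_level in FALLBACK_MEDIA_TYPES:
            return top_level

    # Anything unrecognized is treated as a document.
    return "document"
```

Under these assumptions, `detect_media_type("photo.JPG")` yields `image`, and a file with no extension falls through to `document`.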

🧪 Testing

  • Tested with mixed file types (images + text)
  • Verified correct media type assignment per file
  • Verified manual override still works
  • Tested unknown file fallback → document
  • No errors during upload

Problems Solved

  • Incorrect metadata assignment for mixed uploads
  • API validation errors (e.g., HTTP 422)
  • Poor user experience due to single media type selection

🔗 Related Issue

Closes #2 (closed)

Screenshots

Before: [image]
After: [image]


💡 Impact

  • Improves data accuracy
  • Enhances CLI usability
  • Supports real-world mixed media uploads
  • Reduces upload failures
