Merge branch 'develop' into 'main'
Summary
This merge request consolidates all development work from the develop branch into main for release. It includes multiple features, fixes, refactoring efforts, and CI/CD improvements.
Features
- Glob Pattern File Filtering: Added support for glob patterns in file filtering (*.mp3, **/*.wav, *.jpg, *.mp4, etc.) via the --pattern flag
- Profile Command: Consolidated status command into a unified profile command
- Bulk Upload Media Type Detection: Auto-detect media type per file during bulk upload instead of using a single media type for all files
- DotSocr JSON Format: Added support for dots.ocr JSON format in extracted text upload
- Description Validation: Enforced 32-character minimum for description field in uploads
- Phone Number Auto-Prefix: Automatically add +91 prefix to phone numbers during login
- S3-compatible bucket upload support
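The per-file media type detection, description check, and phone prefixing listed above could look roughly like the sketch below. It is illustrative only and not taken from this merge request: the function names, the "unknown" fallback label, the whitespace stripping, and the rule that numbers already starting with "+" are left alone are all assumptions. Only the 32-character minimum and the +91 prefix come from the description.

```python
import mimetypes
from pathlib import Path

MIN_DESCRIPTION_LENGTH = 32     # minimum stated in this merge request
DEFAULT_COUNTRY_PREFIX = "+91"  # prefix stated in this merge request


def detect_media_type(path: Path) -> str:
    """Guess a coarse media type ("audio", "video", "image", ...) per file.

    Hypothetical helper: uses the standard-library MIME table and keeps
    only the top-level type, so every file in a bulk upload gets its own
    media type instead of one type shared by the whole batch.
    """
    mime, _encoding = mimetypes.guess_type(path.name)
    if mime is None:
        return "unknown"  # assumption: the real fallback label may differ
    return mime.split("/", 1)[0]


def validate_description(description: str) -> str:
    """Reject descriptions shorter than the 32-character minimum."""
    text = description.strip()  # assumption: surrounding whitespace is ignored
    if len(text) < MIN_DESCRIPTION_LENGTH:
        raise ValueError(
            f"Description must be at least {MIN_DESCRIPTION_LENGTH} characters "
            f"(got {len(text)})."
        )
    return text


def add_phone_prefix(phone: str) -> str:
    """Prepend +91 unless the number already carries a country prefix."""
    number = phone.strip()
    if number.startswith("+"):
        return number  # assumption: already-prefixed numbers pass through
    return DEFAULT_COUNTRY_PREFIX + number
```

For example, `detect_media_type(Path("song.mp3"))` yields the top-level type of the guessed MIME type, so a mixed batch of audio, video, and image files can be labeled file by file.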
Bug Fixes
- Fixed glob pattern resolution with a new resolve_file_pattern() helper for testable pattern resolution
- Fixed parameter mutation bug in pattern handling
- Added bare extension normalization (mp3 → *.mp3, wav → *.wav)
- Improved error handling to surface backend error messages in CLI uploads
- Centralized state paths and cleared extracted upload log
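A minimal sketch of how pattern resolution with bare extension normalization might be structured follows. The name resolve_file_pattern() comes from this description, but its signature, the normalize_pattern() helper, and the matching rules are assumptions made for illustration, not the project's actual implementation. The sketch builds a new list of normalized patterns rather than editing the caller's list, which is one way to avoid the kind of parameter mutation bug mentioned above.

```python
from __future__ import annotations

from pathlib import Path


def normalize_pattern(pattern: str) -> str:
    """Turn a bare extension such as "mp3" into the glob "*.mp3".

    Assumption: a purely alphanumeric pattern is treated as an extension;
    anything containing glob characters, dots, or slashes is left as-is.
    """
    stripped = pattern.strip()
    if stripped.isalnum():
        return "*." + stripped
    return stripped


def resolve_file_pattern(base_dir: Path, patterns: list[str]) -> list[Path]:
    """Return sorted, de-duplicated files under base_dir matching any pattern.

    Hypothetical signature. The input list is only read, never modified.
    """
    normalized = [normalize_pattern(p) for p in patterns]  # new list, no mutation
    matches: set[Path] = set()
    for pattern in normalized:
        # Path.glob supports "**" for recursive matching.
        matches.update(p for p in base_dir.glob(pattern) if p.is_file())
    return sorted(matches)
```

Keeping the resolution logic in a pure helper that takes a directory and a list of patterns makes it straightforward to test against a temporary directory, without going through the CLI.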
Refactoring
- Converted test_glob_patterns.py to pytest with type annotations
- Added type hints to test functions for mypy compliance
- Fixed redundant type casts and improved version testing
- Addressed type errors across source and tests
Testing
- Added comprehensive tests for pattern resolution logic
- Added CLI --pattern flag integration tests
- Added assertions to nonexistent path tests
- Updated test files for compatibility
CI/CD & DevOps
- Added GitLab CI configuration
- Moved static analysis jobs to lint stage
- Configured pipeline to run on all commits and MRs
- Aligned GitLab CI with pre-commit configuration
- Removed redundant needs directive from GitLab CI jobs
Documentation
- Updated README with new features
- Improved help page for extract command with format details
Code Quality
- Fixed ruff lint and format issues
- Improved mypy compliance across codebase
Edited by Ahlad Pataparla