fix: migrate S3 bucket interactions to Python MinIO SDK
Summary
Migrate from aioboto3/botocore to the official MinIO Python SDK (minio>=7.2.5) for S3-compatible object storage operations. This change aligns the client with corpus-server-app, which already uses the MinIO SDK, and resolves the async-behavior and configuration issues encountered with Boto.
Changes
Dependency Migration
- Replace aioboto3>=13.1.1 and botocore>=13.1.1 with minio>=7.2.5 in pyproject.toml
- Update uv.lock with new dependency tree
Code Updates (upload.py)
- Replace aioboto3.Session and botocore.config.Config with minio.Minio client
- Update endpoint URL parsing to extract host and secure flag for MinIO client initialization
- Convert async Boto calls to threaded blocking calls using asyncio.to_thread():
  - client.fget_object() for file downloads
  - client.list_objects() for object listing
- Remove Boto-specific paginator logic in favor of MinIO's list_objects() API
- Remove unused urllib.parse.urlencode import (now using urlparse)
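
As a rough illustration of the code updates listed above, the sketch below shows one way the endpoint parsing and the threaded calls could fit together. It is not the code from upload.py: the helper names (parse_endpoint, make_client, download_object, list_keys), their parameters, and the choice to default to a secure connection when no scheme is given are all assumptions made for this example. Only Minio(...), fget_object(), list_objects(), urlparse(), and asyncio.to_thread() are real APIs.

```python
import asyncio
from urllib.parse import urlparse


def parse_endpoint(endpoint_url: str) -> tuple[str, bool]:
    """Split an endpoint URL into the (host[:port], secure) pair that Minio expects."""
    parsed = urlparse(endpoint_url)
    if not parsed.netloc:
        # No "scheme://" prefix (e.g. "localhost:9000"): pass the string through
        # unchanged and assume TLS. This default is an assumption of the sketch.
        return endpoint_url, True
    return parsed.netloc, parsed.scheme == "https"


def make_client(endpoint_url: str, access_key: str, secret_key: str):
    """Build a Minio client from a full endpoint URL (hypothetical helper)."""
    # Imported lazily so the helpers below can be exercised without the SDK installed.
    from minio import Minio

    host, secure = parse_endpoint(endpoint_url)
    return Minio(host, access_key=access_key, secret_key=secret_key, secure=secure)


async def download_object(client, bucket: str, key: str, dest_path: str) -> None:
    """Download one object to dest_path without blocking the event loop."""
    # fget_object is a blocking call, so it runs in a worker thread.
    await asyncio.to_thread(client.fget_object, bucket, key, dest_path)


async def list_keys(client, bucket: str, prefix: str = "") -> list[str]:
    """Return the names of all objects under prefix."""

    def _collect() -> list[str]:
        # list_objects returns a lazy iterator that fetches pages as it is
        # consumed, so the whole iteration happens inside the worker thread.
        return [
            obj.object_name
            for obj in client.list_objects(bucket, prefix=prefix, recursive=True)
        ]

    return await asyncio.to_thread(_collect)
```

Because list_objects() handles pagination internally while iterating, no separate paginator is needed; the cost is that the full listing is materialized in memory before the coroutine returns.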
Test Updates (test_s3_upload.py)
- Update tests to reflect MinIO SDK usage
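
For readers unfamiliar with testing against the MinIO SDK, here is a minimal sketch of how such a test might stub the client. The download_object helper and the test name are hypothetical stand-ins, not the actual contents of upload.py or test_s3_upload.py; the point is only that a MagicMock can replace minio.Minio because the calls are plain synchronous methods run through asyncio.to_thread().

```python
import asyncio
from unittest.mock import MagicMock


async def download_object(client, bucket: str, key: str, dest_path: str) -> None:
    # Hypothetical stand-in for the download helper under test.
    await asyncio.to_thread(client.fget_object, bucket, key, dest_path)


def test_download_uses_fget_object() -> None:
    client = MagicMock()  # stands in for a minio.Minio instance

    asyncio.run(download_object(client, "bucket", "data/a.txt", "/tmp/a.txt"))

    # The blocking SDK method should be called exactly once with these arguments.
    client.fget_object.assert_called_once_with("bucket", "data/a.txt", "/tmp/a.txt")
```

Since the SDK methods are synchronous, a plain MagicMock is enough here; no AsyncMock or async context-manager stubbing is required.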
Why This Change?
- Compatibility: Aligns with corpus-server-app which uses the MinIO SDK
- Boto Issues: Resolves compatibility issues with Boto's async behavior and configuration complexity
- Simpler API: MinIO SDK provides a cleaner, more direct API for S3-compatible operations
- Reduced Dependencies: Removes aioboto3 and botocore in favor of a single lightweight SDK