refactor(upload): replace custom chunk manager with TUS protocol
Refactoring Request Template
Title
refactor(upload): replace custom chunk manager with TUS protocol support
Description
This merge request refactors the existing upload system by replacing the custom chunk-based upload manager with a TUS-based resumable upload implementation.
The previous upload workflow relied on a custom chunk_manager.py implementation responsible for:
- storing file chunks in temporary directories
- validating uploaded chunk order
- tracking upload progress manually
- assembling chunks into a final file
- detecting missing chunks
- cleaning stale uploads using scheduled Celery maintenance jobs
This approach meant maintaining custom upload logic and handling edge cases such as retries, resumes, and missing chunks in-house.
This refactor introduces a TUS-compatible upload flow that provides protocol-standard resumable uploads and removes the need for custom chunk orchestration.
Problem Statement
The legacy upload architecture had several limitations:
- Custom chunk lifecycle management
- Manual temp-file orchestration in /tmp/chunks
- Dedicated cleanup task for incomplete uploads
- Tight coupling between backend chunk APIs and frontend implementation
- Increased maintenance cost for retry/resume edge cases
- No interoperability with standard upload clients
Solution Implemented
This MR replaces the custom chunk upload flow with a TUS-based upload manager and routing layer.
Upload Flow Changes
Old flow:
Frontend Upload → Custom Chunk API → chunk_manager.py → Temporary Chunk Storage → Chunk Merge → Cleanup Task
New flow:
Frontend Upload → TUS Endpoint → tus_manager.py → Resumable Upload Storage → Upload Finalization → Automatic Cleanup Lifecycle
Refactor Details
1. Removed Legacy Chunk Upload System
The following legacy components were removed:
- custom chunk upload orchestration
- chunk progress tracking
- manual merge logic
- chunk validation logic
- incomplete chunk cleanup dependency
Removed files:
app/utils/chunk_manager.py
tests/unit/utils/test_chunk_manager.py
2. Added TUS Upload Support
Introduced a TUS-compatible upload manager and API layer.
New functionality includes:
- resumable uploads
- upload offset tracking
- upload resume after interruption
- upload metadata handling
- upload deletion support
- protocol-compliant lifecycle
Added files:
app/api/v1/endpoints/tus_upload.py
app/utils/tus_manager.py
tests/unit/api/v1/endpoints/test_tus_upload.py
tests/unit/utils/test_tus_manager.py
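A minimal sketch of how offset tracking makes resume possible, assuming an in-memory store. The class and method names (InMemoryTusManager, create, append, offset, delete) are illustrative only and are not taken from tus_manager.py, which presumably persists data to resumable upload storage instead.

```python
from dataclasses import dataclass, field
from typing import Dict
from uuid import uuid4


class OffsetMismatch(Exception):
    """Raised when a client writes at an offset the server does not expect."""


@dataclass
class Upload:
    length: int  # total size declared when the upload was created
    data: bytearray = field(default_factory=bytearray)

    @property
    def offset(self) -> int:
        # The offset is the number of bytes received so far.
        return len(self.data)


class InMemoryTusManager:
    """Hypothetical stand-in for a TUS manager; keeps uploads in a dict."""

    def __init__(self) -> None:
        self._uploads: Dict[str, Upload] = {}

    def create(self, length: int) -> str:
        upload_id = uuid4().hex
        self._uploads[upload_id] = Upload(length=length)
        return upload_id

    def offset(self, upload_id: str) -> int:
        return self._uploads[upload_id].offset

    def append(self, upload_id: str, offset: int, chunk: bytes) -> int:
        upload = self._uploads[upload_id]
        # Reject writes that do not start exactly where the last one ended.
        if offset != upload.offset:
            raise OffsetMismatch(f"expected {upload.offset}, got {offset}")
        if upload.offset + len(chunk) > upload.length:
            raise ValueError("chunk exceeds declared upload length")
        upload.data.extend(chunk)
        return upload.offset

    def delete(self, upload_id: str) -> None:
        del self._uploads[upload_id]
```

After an interruption, a client would ask for the current offset and continue from there, for example `manager.append(uid, manager.offset(uid), remaining_bytes)`. No chunk index or merge step is involved.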
3. API Routing Updates
Updated API routing to include TUS upload endpoints.
Modified:
app/api/v1/api.py
TUS endpoints now support:
- POST → create upload
- HEAD → retrieve upload status
- PATCH → upload file content
- DELETE → remove upload
- OPTIONS → protocol negotiation
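For reviewers less familiar with the protocol, the sketch below summarises the status codes and headers that TUS 1.0.0 associates with each method. It is a framework-free illustration, not the code in tus_upload.py: the function name tus_response, the placeholder Location value, and the advertised extension list are assumptions.

```python
from typing import Dict, Optional, Tuple

TUS_VERSION = "1.0.0"
PATCH_CONTENT_TYPE = "application/offset+octet-stream"


def tus_response(
    method: str,
    current_offset: int = 0,
    headers: Optional[Dict[str, str]] = None,
    body: bytes = b"",
) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for one request against one upload."""
    headers = headers or {}
    base = {"Tus-Resumable": TUS_VERSION}

    if method == "OPTIONS":
        # Protocol negotiation: advertise version and supported extensions.
        return 204, {**base, "Tus-Version": TUS_VERSION,
                     "Tus-Extension": "creation,termination"}
    if method == "POST":
        # "/files/abc123" is a placeholder URL, not the project's real route.
        return 201, {**base, "Location": "/files/abc123"}
    if method == "HEAD":
        # Status lookup: the client learns where to resume from.
        return 200, {**base, "Upload-Offset": str(current_offset),
                     "Cache-Control": "no-store"}
    if method == "PATCH":
        if headers.get("Content-Type") != PATCH_CONTENT_TYPE:
            return 415, base
        if headers.get("Upload-Offset") != str(current_offset):
            return 409, base  # client and server disagree on the offset
        return 204, {**base, "Upload-Offset": str(current_offset + len(body))}
    if method == "DELETE":
        return 204, base
    return 405, base
```

The 409 branch is the one that matters most for resume: a client that gets it is expected to issue a HEAD request, read Upload-Offset, and retry the PATCH from that position.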
4. Upload Finalization Refactor
Updated upload completion flow to work with TUS uploads instead of chunk aggregation.
Modified:
app/api/v1/endpoints/records.py
Changes:
- removed dependency on chunk-based completion
- upload completion is now validated against the TUS upload state
- replaced legacy upload finalization logic
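The completion check can be sketched as follows, assuming the TUS manager exposes the declared length and the current offset of an upload. TusUploadState, UploadIncompleteError, and ensure_upload_complete are hypothetical names used for illustration; the actual check in records.py may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TusUploadState:
    """Hypothetical snapshot of an upload as reported by the TUS manager."""
    upload_length: int  # total bytes declared at creation
    upload_offset: int  # bytes received so far


class UploadIncompleteError(Exception):
    """Raised when finalization is requested before all bytes have arrived."""


def ensure_upload_complete(state: TusUploadState) -> None:
    # With TUS there are no chunks to count: an upload is complete
    # exactly when the offset has reached the declared length.
    if state.upload_offset != state.upload_length:
        missing = state.upload_length - state.upload_offset
        raise UploadIncompleteError(f"{missing} bytes still missing")
```

Compared with chunk aggregation, this replaces "are all chunks present and in order" with a single integer comparison.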
5. Validation Schema Updates
Updated validation models to remove chunk-specific requirements.
Modified:
app/schemas/upload_validation.py
Changes:
- removed legacy total_chunks dependency
- aligned validation with TUS lifecycle
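As a rough illustration of validation keyed to the TUS lifecycle rather than to a chunk count, the sketch below checks the headers of an upload creation request. It uses plain functions instead of the project's schema models, and the requirement that a filename metadata key be present is an assumption made for the example, not something stated in this MR.

```python
import base64
from typing import Dict


def parse_upload_metadata(header: str) -> Dict[str, str]:
    """Decode a TUS Upload-Metadata header into a plain dict.

    The header is a comma-separated list of "key base64value" pairs;
    a key may also appear without a value.
    """
    metadata: Dict[str, str] = {}
    for pair in filter(None, (p.strip() for p in header.split(","))):
        key, _, encoded = pair.partition(" ")
        metadata[key] = base64.b64decode(encoded).decode("utf-8")
    return metadata


def validate_creation(headers: Dict[str, str]) -> Dict[str, object]:
    """Validate creation headers; no total_chunks field is involved."""
    length = int(headers["Upload-Length"])  # raises ValueError if not an int
    if length < 0:
        raise ValueError("Upload-Length must be non-negative")
    metadata = parse_upload_metadata(headers.get("Upload-Metadata", ""))
    if "filename" not in metadata:  # hypothetical project rule
        raise ValueError("filename metadata is required")
    return {"length": length, "metadata": metadata}
```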
6. Cleanup & Maintenance Refactor
Updated cleanup architecture to remove chunk cleanup scheduling.
Modified:
app/tasks/maintenance.py
app/core/celery_app.py
Changes:
- removed chunk cleanup task
- introduced stale TUS upload cleanup
- updated Celery schedule integration
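The selection step of a stale-upload cleanup could look roughly like this. It is a sketch under assumptions: find_stale_uploads and its parameters are hypothetical, the retention window is not specified in this MR, and the real task in maintenance.py would additionally delete the selected uploads and be registered in the Celery schedule.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def find_stale_uploads(
    last_modified: Dict[str, datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return IDs of uploads that have not been written to within max_age.

    last_modified maps upload ID to the time of its most recent write.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    # Sorted only to make the result deterministic for logging and tests.
    return sorted(uid for uid, ts in last_modified.items() if ts < cutoff)
```

Passing `now` explicitly keeps the function easy to unit test without patching the clock.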
Files Changed
Added
app/api/v1/endpoints/tus_upload.py
app/utils/tus_manager.py
tests/unit/api/v1/endpoints/test_tus_upload.py
tests/unit/utils/test_tus_manager.py
Modified
app/api/v1/api.py
app/api/v1/endpoints/records.py
app/core/celery_app.py
app/schemas/upload_validation.py
app/tasks/maintenance.py
tests/unit/tasks/test_maintenance.py
tests/unit/tasks/test_maintenance_v2.py
Removed
app/utils/chunk_manager.py
tests/unit/utils/test_chunk_manager.py
Benefits
- Protocol-standard resumable uploads
- Better recovery for interrupted uploads
- Reduced custom upload complexity
- Cleaner backend architecture
- Easier maintenance
- Better interoperability with standard clients
- Reduced manual cleanup handling
Testing & Validation
Validated:
- TUS upload route registration
- Upload creation and resume flow
- Offset validation handling
- Upload cleanup lifecycle
- Removal of legacy chunk references
- Updated maintenance tasks
- Unit test coverage for TUS manager and endpoints
Checklist
- Removed custom chunk manager
- Added TUS upload endpoints
- Added TUS upload manager
- Updated upload finalization flow
- Removed chunk cleanup dependency
- Updated Celery cleanup handling
- Added unit tests
- Removed legacy upload implementation
Related Issue
Closes: #101