refactor(upload): replace custom chunk manager with TUS protocol

Laxman Reddy requested to merge refactor/tus-upload-protocol into develop

Refactoring Request Template

Title

refactor(upload): replace custom chunk manager with TUS protocol support


Description

This merge request refactors the existing upload system by replacing the custom chunk-based upload manager with a TUS-based resumable upload implementation.

The previous upload workflow relied on a custom chunk_manager.py implementation responsible for:

  • storing file chunks in temporary directories
  • validating uploaded chunk order
  • tracking upload progress manually
  • assembling chunks into a final file
  • detecting missing chunks
  • cleaning stale uploads using scheduled Celery maintenance jobs

This approach meant the upload logic had to be maintained in-house, including the retry, resume, and missing-chunk edge cases.

This refactor introduces a TUS-compatible upload flow that provides protocol-standard resumable uploads and removes the need for custom chunk orchestration.


Problem Statement

The legacy upload architecture had several limitations:

  • The chunk lifecycle was managed entirely by custom code
  • Temporary files had to be orchestrated manually in /tmp/chunks
  • A dedicated cleanup task was needed for incomplete uploads
  • Backend chunk APIs were tightly coupled to the frontend implementation
  • Retry/resume edge cases increased maintenance cost
  • Standard upload clients could not interoperate with the custom API

Solution Implemented

This MR replaces the custom chunk upload flow with a TUS-based upload manager and routing layer.

Upload Flow Changes

Old flow:

Frontend Upload → Custom Chunk API → chunk_manager.py → Temporary Chunk Storage → Chunk Merge → Cleanup Task

New flow:

Frontend Upload → TUS Endpoint → tus_manager.py → Resumable Upload Storage → Upload Finalization → Scheduled Stale Upload Cleanup


Refactor Details

1. Removed Legacy Chunk Upload System

The following legacy components were removed:

  • custom chunk upload orchestration
  • chunk progress tracking
  • manual merge logic
  • chunk validation logic
  • incomplete chunk cleanup dependency

Removed files:

  • app/utils/chunk_manager.py
  • tests/unit/utils/test_chunk_manager.py

2. Added TUS Upload Support

Introduced a TUS-compatible upload manager and API layer.

New functionality includes:

  • resumable uploads
  • upload offset tracking
  • upload resume after interruption
  • upload metadata handling
  • upload deletion support
  • protocol-compliant lifecycle
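
The offset tracking and resume behaviour listed above can be illustrated with a minimal in-memory sketch. This is not the code in app/utils/tus_manager.py; the class name InMemoryTusManager, the OffsetMismatch exception, and all method names are hypothetical, and a real manager would persist data to storage rather than a dict.

```python
import uuid


class OffsetMismatch(Exception):
    """Raised when a client writes at an offset the server does not expect."""


class InMemoryTusManager:
    """Hypothetical stand-in for a TUS manager; keeps uploads in a dict."""

    def __init__(self):
        self._uploads = {}

    def create(self, length, metadata=None):
        # Register a new upload with its declared total length.
        upload_id = uuid.uuid4().hex
        self._uploads[upload_id] = {
            "length": length,
            "data": bytearray(),
            "metadata": dict(metadata or {}),
        }
        return upload_id

    def offset(self, upload_id):
        # The offset is simply the number of bytes received so far.
        return len(self._uploads[upload_id]["data"])

    def append(self, upload_id, offset, chunk):
        upload = self._uploads[upload_id]
        current = len(upload["data"])
        if offset != current:
            # A TUS server would answer this case with 409 Conflict.
            raise OffsetMismatch(f"expected offset {current}, got {offset}")
        if current + len(chunk) > upload["length"]:
            raise ValueError("chunk exceeds declared upload length")
        upload["data"].extend(chunk)
        return len(upload["data"])

    def is_complete(self, upload_id):
        upload = self._uploads[upload_id]
        return len(upload["data"]) == upload["length"]

    def delete(self, upload_id):
        del self._uploads[upload_id]


manager = InMemoryTusManager()
upload_id = manager.create(length=11, metadata={"filename": "hello.txt"})
manager.append(upload_id, 0, b"hello ")
# Simulated interruption: the client asks for the offset and resumes from it.
resume_from = manager.offset(upload_id)
manager.append(upload_id, resume_from, b"world")
```

Because the server owns the offset, the client needs no chunk index or chunk count; it only asks where to continue.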

Added files:

  • app/api/v1/endpoints/tus_upload.py
  • app/utils/tus_manager.py
  • tests/unit/api/v1/endpoints/test_tus_upload.py
  • tests/unit/utils/test_tus_manager.py

3. API Routing Updates

Updated API routing to include TUS upload endpoints.

Modified:

  • app/api/v1/api.py

TUS endpoints now support:

  • POST → create a new upload
  • HEAD → retrieve the current upload offset
  • PATCH → append file content at the current offset
  • DELETE → remove an upload
  • OPTIONS → discover server protocol capabilities
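
As a rough guide to what each method exchanges under TUS 1.0.0, the sketch below builds the success status and response headers per method and decodes an Upload-Metadata header. The function names, the /files/abc URL, and the advertised extension list are assumptions for illustration; they are not taken from app/api/v1/endpoints/tus_upload.py.

```python
import base64

TUS_VERSION = "1.0.0"


def parse_upload_metadata(header_value):
    """Decode Upload-Metadata: comma-separated 'key base64value' pairs."""
    metadata = {}
    for pair in header_value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, _, encoded = pair.partition(" ")
        # A key may appear without a value; treat that as an empty string.
        metadata[key] = base64.b64decode(encoded).decode("utf-8") if encoded else ""
    return metadata


def success_response(method, upload_url="", offset=0, length=0):
    """Return (status, headers) for a successful request of each kind."""
    headers = {"Tus-Resumable": TUS_VERSION}
    if method == "OPTIONS":
        headers["Tus-Version"] = TUS_VERSION
        # Assumed extension list: creation (POST) and termination (DELETE).
        headers["Tus-Extension"] = "creation,termination"
        return 204, headers
    if method == "POST":
        headers["Location"] = upload_url
        return 201, headers
    if method == "HEAD":
        headers["Upload-Offset"] = str(offset)
        headers["Upload-Length"] = str(length)
        headers["Cache-Control"] = "no-store"
        return 200, headers
    if method == "PATCH":
        # The new offset after the request body has been appended.
        headers["Upload-Offset"] = str(offset)
        return 204, headers
    if method == "DELETE":
        return 204, headers
    raise ValueError(f"unsupported method: {method}")


encoded_name = base64.b64encode(b"hello.txt").decode("ascii")
metadata = parse_upload_metadata(f"filename {encoded_name},is_private")
created = success_response("POST", upload_url="/files/abc")
status = success_response("HEAD", offset=6, length=11)
```

A PATCH request itself carries Content-Type: application/offset+octet-stream and an Upload-Offset header; a mismatch with the server's offset results in 409 Conflict.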

4. Upload Finalization Refactor

Updated upload completion flow to work with TUS uploads instead of chunk aggregation.

Modified:

  • app/api/v1/endpoints/records.py

Changes:

  • removed dependency on chunk-based completion
  • upload completion is now validated against the TUS upload state
  • replaced legacy upload finalization logic
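
One way to validate completion from TUS state, shown here only as a sketch: an upload counts as finished when its received offset equals the length declared at creation. The UploadState and UploadIncomplete names are hypothetical and do not come from records.py.

```python
from dataclasses import dataclass


@dataclass
class UploadState:
    """Hypothetical view of what the TUS manager knows about an upload."""
    upload_id: str
    offset: int  # bytes received so far
    length: int  # total size declared when the upload was created


class UploadIncomplete(Exception):
    """Raised when finalization is requested before all bytes arrived."""


def ensure_complete(state):
    # No chunk counting is needed: offset == length is the whole check.
    if state.offset != state.length:
        missing = state.length - state.offset
        raise UploadIncomplete(
            f"upload {state.upload_id} is missing {missing} bytes"
        )
    return state.upload_id


finished = ensure_complete(UploadState("abc", offset=11, length=11))
try:
    ensure_complete(UploadState("def", offset=6, length=11))
    error_message = ""
except UploadIncomplete as exc:
    error_message = str(exc)
```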

5. Validation Schema Updates

Updated validation models to remove chunk-specific requirements.

Modified:

  • app/schemas/upload_validation.py

Changes:

  • removed legacy total_chunks dependency
  • aligned validation with TUS lifecycle
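
To illustrate what a payload without total_chunks might look like, here is a sketch using a standard-library dataclass. The field names, the UploadFinalizeRequest class, and the decision to silently drop a leftover total_chunks key are all assumptions; the actual models in app/schemas/upload_validation.py may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadFinalizeRequest:
    """Hypothetical finalize payload: identified by upload ID, no chunk count."""
    upload_id: str
    filename: str

    def __post_init__(self):
        if not self.upload_id:
            raise ValueError("upload_id is required")
        if not self.filename:
            raise ValueError("filename is required")


def parse_finalize_payload(payload):
    """Build the request from a dict, dropping the legacy total_chunks key."""
    cleaned = {k: v for k, v in payload.items() if k != "total_chunks"}
    return UploadFinalizeRequest(**cleaned)


request = parse_finalize_payload(
    {"upload_id": "abc", "filename": "hello.txt", "total_chunks": 4}
)
```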

6. Cleanup & Maintenance Refactor

Updated cleanup architecture to remove chunk cleanup scheduling.

Modified:

  • app/tasks/maintenance.py
  • app/core/celery_app.py

Changes:

  • removed chunk cleanup task
  • introduced stale TUS upload cleanup
  • updated Celery schedule integration
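
A stale-upload cleanup could work roughly as sketched below: remove upload files whose last modification is older than a cutoff. The function names, the age threshold, the task path, and the hourly schedule are hypothetical; the real task lives in app/tasks/maintenance.py and is wired up in app/core/celery_app.py.

```python
import os
import tempfile
import time


def find_stale_uploads(upload_dir, max_age_seconds, now=None):
    """Return paths of files not modified within max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(upload_dir):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            stale.append(entry.path)
    return sorted(stale)


def cleanup_stale_uploads(upload_dir, max_age_seconds, now=None):
    """Delete stale upload files and return the paths that were removed."""
    removed = find_stale_uploads(upload_dir, max_age_seconds, now)
    for path in removed:
        os.remove(path)
    return removed


# Hypothetical Celery beat entry; the task name and interval are assumptions.
BEAT_SCHEDULE = {
    "cleanup-stale-tus-uploads": {
        "task": "app.tasks.maintenance.cleanup_stale_tus_uploads",
        "schedule": 3600.0,  # seconds between runs
    }
}

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as upload_dir:
    now = time.time()
    for name, age in (("old.bin", 2 * 86400), ("fresh.bin", 60)):
        path = os.path.join(upload_dir, name)
        with open(path, "wb") as handle:
            handle.write(b"partial data")
        # Backdate the file so it appears to have been idle for `age` seconds.
        os.utime(path, (now - age, now - age))
    removed = cleanup_stale_uploads(upload_dir, max_age_seconds=86400, now=now)
    removed_names = [os.path.basename(p) for p in removed]
    remaining_names = sorted(os.listdir(upload_dir))
```

Keying on modification time means an upload that is still receiving PATCH requests keeps getting refreshed and is not removed mid-transfer.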

Files Changed

Added

  • app/api/v1/endpoints/tus_upload.py
  • app/utils/tus_manager.py
  • tests/unit/api/v1/endpoints/test_tus_upload.py
  • tests/unit/utils/test_tus_manager.py

Modified

  • app/api/v1/api.py
  • app/api/v1/endpoints/records.py
  • app/core/celery_app.py
  • app/schemas/upload_validation.py
  • app/tasks/maintenance.py
  • tests/unit/tasks/test_maintenance.py
  • tests/unit/tasks/test_maintenance_v2.py

Removed

  • app/utils/chunk_manager.py
  • tests/unit/utils/test_chunk_manager.py

Benefits

  • Protocol-standard resumable uploads
  • Better recovery for interrupted uploads
  • Reduced custom upload complexity
  • Cleaner backend architecture
  • Easier maintenance
  • Better interoperability with standard clients
  • Reduced manual cleanup handling

Testing & Validation

Validated:

  • TUS upload route registration
  • Upload creation and resume flow
  • Offset validation handling
  • Upload cleanup lifecycle
  • Removal of legacy chunk references
  • Updated maintenance tasks
  • Unit test coverage for TUS manager and endpoints

Checklist

  • Removed custom chunk manager
  • Added TUS upload endpoints
  • Added TUS upload manager
  • Updated upload finalization flow
  • Removed chunk cleanup dependency
  • Updated Celery cleanup handling
  • Added unit tests
  • Removed legacy upload implementation

Related Issue

Closes: #101