refactor(docker): optimize orchestration, dynamic vparse, and model caching
Closes #8
💡 What does this MR do?
This Merge Request completely overhauls the Docker infrastructure to make it significantly more modular, faster to build, and resilient across different Docker environments.
🛠️ Key Technical Changes
- Dynamic VParse Orchestration via Git Context
  - Removed: All `Dockerfile.vparse*` variants have been deleted to reduce repository bloat.
  - Added: VParse is now entirely optional and hidden behind Docker Compose profiles (`pipeline`, `vlm`, `hybrid`). When a profile is invoked, Docker Compose clones the upstream `mineru-dots` repository and builds the corresponding backend on the fly.
Legacy Docker Compose Compatibility Fix
-
Fixed:
model-downloaderpreviously useddockerfile_inline, causing silent build failures (and subsequentModuleNotFoundErrorcrashes) on older Docker installations that fallback to the legacy builder. This logic has been extracted into a dedicatedDockerfile.model-downloader.
-
Fixed:
-
Optimized Initial Model Fetching
-
Fixed:
setup_models.pyno longer forces a 2GB download of Paddle OCR models on default startup. It now strictly downloads the Gemma LLM for BookExtractor. VParse will dynamically handle downloading its own OCR models when a user spins up a specific profile.
-
Fixed:
-
Persistent HuggingFace Cache Mapping
-
Fixed: Added
- models:/root/.cache/huggingfacevolume mappings to all VParse profiles. MinerU expects models at this hardcoded internal path. This ensures VParse recognizes previously downloaded models and stops wasting 15+ minutes re-downloading them on every container restart.
-
Fixed: Added
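For reviewers who want a picture of how these changes fit together, here is a minimal Compose sketch. It is illustrative only and is not the file from this MR. The service name `vparse-pipeline`, the `models` volume, the `pipeline` profile, and `Dockerfile.model-downloader` come from the description above. The Git URL is a placeholder, and the sketch assumes the upstream repository has a `Dockerfile` at its root.

```yaml
# docker-compose.yml (illustrative sketch, not the actual file in this MR)
services:
  model-downloader:
    build:
      context: .
      # A dedicated Dockerfile replaces dockerfile_inline, so installations
      # that fall back to the legacy builder can still build this image.
      dockerfile: Dockerfile.model-downloader

  vparse-pipeline:
    # Started only when the profile is requested:
    #   docker compose --profile pipeline up -d
    profiles: ["pipeline"]
    build:
      # Remote Git build context. Placeholder URL (assumption); substitute
      # the real upstream mineru-dots repository location.
      context: https://example.com/your-org/mineru-dots.git
    volumes:
      # MinerU looks for models at this hardcoded path, so mounting the
      # named volume here keeps downloads across container restarts.
      - models:/root/.cache/huggingface

volumes:
  models:
```

The `vlm` and `hybrid` profiles would presumably follow the same pattern with their own service definitions. How each backend is selected at build time is not shown here because the description does not specify it.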
🧪 How to test this PR
- Test Default Startup: Run `docker compose up --build -d`. Verify that only the BookExtractor and model-downloader start, and only the Gemma model is downloaded.
- Test Profile Startup: Run `docker compose --profile pipeline up -d`. Verify that it successfully builds VParse from the remote repo.
- Test Cache Persistence: Restart the VParse container (`docker compose restart vparse-pipeline`). Check the logs to ensure it uses the cached models immediately instead of re-downloading them.
📸 Documentation
The README.md has been fully updated to reflect the new profile-based startup instructions and the removal of the old VPARSE_DOCKERFILE environment variable overrides.
Edited by Lakshy Yarlagadda