feat(job): Implement job tracking system with CLI commands and history support
📌 Overview
This MR introduces a comprehensive job tracking system for the Corpus Client CLI.
It allows users to view, monitor, and inspect past upload and extract jobs directly from the command line.
The feature improves visibility into job execution, progress, and metadata. It reads local state only, so no API interaction is required.
✨ Features Added
🔹 Job Listing
corpus-client job
- Displays all jobs in a formatted table
- Shows:
  - Job index
  - Mode (upload / extract)
  - Status (completed / failed / in-progress)
  - Progress (completed/total)
  - Last updated timestamp
🔹 Filtering Support
- --mode upload | extract
- --status completed | failed | in-progress
- --limit <n>
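As a rough illustration of how these three options could combine, here is a minimal sketch. `JobSummary` and `filter_jobs` are hypothetical names, not taken from `cli.py`, and the order in which the filters are applied (mode, then status, then limit) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobSummary:
    # Hypothetical record; the real Job class may expose different fields.
    index: int
    mode: str    # "upload" or "extract"
    status: str  # "completed", "failed", or "in-progress"


def filter_jobs(jobs, mode: Optional[str] = None,
                status: Optional[str] = None,
                limit: Optional[int] = None):
    """Apply --mode and --status first, then truncate to --limit (assumed order)."""
    selected = [
        job for job in jobs
        if (mode is None or job.mode == mode)
        and (status is None or job.status == status)
    ]
    return selected if limit is None else selected[:limit]


jobs = [
    JobSummary(1, "upload", "completed"),
    JobSummary(2, "extract", "in-progress"),
    JobSummary(3, "upload", "failed"),
]
uploads = filter_jobs(jobs, mode="upload")
first_upload = filter_jobs(jobs, mode="upload", limit=1)
```

Treating an omitted option as `None` keeps the unfiltered `corpus-client job` call on the same code path.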
🔹 Detailed Job View
corpus-client job show <index>
- Displays:
  - Status and progress
  - Metadata (language, categories, media type, etc.)
  - Failed files
  - Completed files (summarized)
🔹 Automatic Job Tracking
- Jobs are created when:
  - `upload-files` is executed
  - `extract` is executed
- Job files are:
  - Stored in `~/.corpus-cli/jobs/`
  - Updated during execution
  - Reused during `resume`
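The create, update, and resume lifecycle described above might look roughly like the sketch below. The JSON keys (`mode`, `total`, `completed`, `failed`) and the helper names are assumptions for illustration; the actual job file schema is not shown in this MR. The sketch also assumes that a resumed run retries previously failed files.

```python
import json
import tempfile
from pathlib import Path


def start_job(job_path: Path, mode: str, files: list) -> None:
    """Create the job file, or leave it untouched when resuming."""
    job_path.parent.mkdir(parents=True, exist_ok=True)
    if job_path.exists():
        return  # resume: the same job continues
    state = {"mode": mode, "total": len(files), "completed": [], "failed": []}
    job_path.write_text(json.dumps(state, indent=2))


def record_result(job_path: Path, file_name: str, ok: bool) -> None:
    """Incremental update written after each file is processed."""
    state = json.loads(job_path.read_text())
    # A retried file leaves "failed" before being recorded again.
    state["failed"] = [f for f in state["failed"] if f != file_name]
    state["completed" if ok else "failed"].append(file_name)
    job_path.write_text(json.dumps(state, indent=2))


def pending_files(job_path: Path, files: list) -> list:
    """On resume, skip files the job has already completed."""
    done = set(json.loads(job_path.read_text())["completed"])
    return [f for f in files if f not in done]


# Demo in a temporary directory instead of ~/.corpus-cli/jobs/
with tempfile.TemporaryDirectory() as tmp:
    job_path = Path(tmp) / "jobs" / "upload_2026-04-16_103000.json"
    files = ["a.txt", "b.txt", "c.txt"]
    start_job(job_path, "upload", files)
    record_result(job_path, "a.txt", ok=True)
    record_result(job_path, "b.txt", ok=False)
    start_job(job_path, "upload", files)  # simulated resume, state is kept
    remaining = pending_files(job_path, files)
    final_state = json.loads(job_path.read_text())
```

Rewriting the whole file after every update is simple, but an interrupted write can leave a truncated file behind, which is one way corrupted job files can arise.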
🛠 Implementation Details
📁 Job System
- Added `job.py`:
  - `Job` class for parsing and processing job data
  - Status computation logic
  - Progress tracking utilities
  - Safe handling of corrupted files
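For reviewers who want a mental model of the status and progress logic, here is a minimal sketch. The field names and the status rule are assumptions, not the actual contents of `job.py`: a job counts as in-progress until every file is accounted for, and any failed file then marks the job as failed.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical fields; the real schema in job.py may differ.
    mode: str
    total: int
    completed: list = field(default_factory=list)
    failed: list = field(default_factory=list)

    @property
    def progress(self) -> str:
        """Progress in the completed/total form used by the list view."""
        return f"{len(self.completed)}/{self.total}"

    @property
    def status(self) -> str:
        # Assumed rule: unprocessed files mean the job is still running;
        # otherwise any failed file marks the whole job as failed.
        if len(self.completed) + len(self.failed) < self.total:
            return "in-progress"
        return "failed" if self.failed else "completed"


done = Job("upload", total=2, completed=["a.txt", "b.txt"])
running = Job("extract", total=20, completed=[f"f{i}.txt" for i in range(5)])
broken = Job("upload", total=2, completed=["a.txt"], failed=["b.txt"])
print(done.status, done.progress)
print(running.status, running.progress)
print(broken.status, broken.progress)
```

Deriving the status from the file lists, rather than storing it, means it cannot drift out of sync with the recorded progress.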
🧾 Job Storage
- Each job is stored as a timestamped JSON file: `~/.corpus-cli/jobs/<mode>_<timestamp>.json`
- Example: `upload_2026-04-16_103000.json`
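The naming scheme can be sketched as below. The timestamp format `%Y-%m-%d_%H%M%S` is inferred from the example file name rather than confirmed by the implementation, and both helper names are hypothetical.

```python
from datetime import datetime
from pathlib import Path

# Inferred from "upload_2026-04-16_103000.json"; an assumption, not confirmed.
TIMESTAMP_FORMAT = "%Y-%m-%d_%H%M%S"


def job_file_name(mode: str, when: datetime) -> str:
    """Build <mode>_<timestamp>.json for a new job."""
    return f"{mode}_{when.strftime(TIMESTAMP_FORMAT)}.json"


def parse_job_file_name(name: str):
    """Recover (mode, timestamp) from a job file name."""
    stem = Path(name).stem
    # The mode contains no underscore, so the first "_" separates the parts.
    mode, _, stamp = stem.partition("_")
    return mode, datetime.strptime(stamp, TIMESTAMP_FORMAT)


name = job_file_name("upload", datetime(2026, 4, 16, 10, 30, 0))
mode, created = parse_job_file_name(name)
```

A fixed-width timestamp like this sorts chronologically when sorted as text within a single mode.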
🖥 CLI Integration
- Updated `cli.py`:
  - Added `job` command (list view)
  - Added filtering options
  - Added `job show` command for detailed inspection
- Implemented using Typer and Rich for better UX
🔄 Integration with Existing Commands
- Modified:
  - `upload.py`
  - `extracted_text_upload.py`
- Added:
  - Job creation at start
  - Incremental updates during execution
  - Resume support (same job continues)
🧪 How to Test
1. Run Upload/Extract
uv run corpus-client upload-files ...
uv run corpus-client extract ...
2. List Jobs
uv run corpus-client job
3. Apply Filters
uv run corpus-client job --mode upload
uv run corpus-client job --status completed
4. View Job Details
uv run corpus-client job show 1
📊 Example Output
Job List
Index | Mode    | Status         | Progress | Last Updated
1     | upload  | ✓ completed    | 10/10    | 2026-04-16 10:30
2     | extract | ⏳ in-progress | 5/20     | 2026-04-16 11:00
Job Details
Job #1: Upload Job
Status: ✓ completed
Progress: 10/10
Last Updated: 2026-04-16 10:30
Metadata:
  Language: en
Failed Files: None
Completed Files: 10 files
⚠️ Edge Cases Handled
- Missing job directory → graceful message
- Corrupted job files → skipped with warning
- Invalid job index → handled safely
- Missing metadata fields → safely ignored
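One way these cases could be handled is sketched below. The function names, the warning text, and the 1-based index (suggested by `job show 1` and the example table) are assumptions, not the shipped behavior.

```python
import json
import tempfile
from pathlib import Path


def load_jobs(jobs_dir: Path):
    """Return (jobs, messages) without raising on missing or corrupted input."""
    if not jobs_dir.is_dir():
        return [], [f"No jobs found in {jobs_dir}"]
    jobs, messages = [], []
    for path in sorted(jobs_dir.glob("*.json")):
        try:
            data = json.loads(path.read_text())
            if not isinstance(data, dict):
                raise ValueError("job file is not a JSON object")
        except (OSError, ValueError) as exc:
            # Corrupted file: skip it and keep a warning for the user.
            messages.append(f"Skipping {path.name}: {exc}")
            continue
        jobs.append(data)
    return jobs, messages


def get_job(jobs: list, index: int):
    """Look up a job by its 1-based list index; None if out of range."""
    if 1 <= index <= len(jobs):
        return jobs[index - 1]
    return None


with tempfile.TemporaryDirectory() as tmp:
    jobs_dir = Path(tmp)
    (jobs_dir / "upload_2026-04-16_103000.json").write_text('{"mode": "upload"}')
    (jobs_dir / "extract_2026-04-16_110000.json").write_text("{not valid json")
    jobs, messages = load_jobs(jobs_dir)
    no_jobs, no_dir_messages = load_jobs(jobs_dir / "does-not-exist")

# Missing metadata fields fall back to None instead of raising.
language = jobs[0].get("metadata", {}).get("language")
```

Returning the messages alongside the jobs leaves it to the CLI layer to decide how to render them.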
📝 Notes
- No authentication required (local state only)
- Existing functionality remains unaffected
- Backward compatible with current state files
✅ Status
- Feature complete
- Tested (manual + unit tests)
- Ready for review
🚀