
feat(job): Implement job tracking system with CLI commands and history support

Mohana Sri Bhavitha requested to merge add-job-feature into develop

📌 Overview

This MR introduces a comprehensive job tracking system for the Corpus Client CLI.
It allows users to view, monitor, and inspect past upload and extract jobs directly from the command line.

The feature exposes job execution status, progress, and metadata from local job files, so no API calls are needed.


Features Added

🔹 Job Listing

  • corpus-client job
  • Displays all jobs in a formatted table
  • Shows:
    • Job index
    • Mode (upload / extract)
    • Status (completed / failed / in-progress)
    • Progress (completed/total)
    • Last updated timestamp

🔹 Filtering Support

  • --mode upload | extract
  • --status completed | failed | in-progress
  • --limit <n>
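A minimal sketch of how the three filters could compose, assuming each job is available as a dict with `mode` and `status` keys. The helper name `filter_jobs` and the dict shape are illustrative assumptions, not the actual code in cli.py:

```python
from typing import Optional


def filter_jobs(jobs: list[dict], mode: Optional[str] = None,
                status: Optional[str] = None,
                limit: Optional[int] = None) -> list[dict]:
    """Apply --mode, --status and --limit in that order (hypothetical helper)."""
    selected = [
        job for job in jobs
        if (mode is None or job.get("mode") == mode)
        and (status is None or job.get("status") == status)
    ]
    # Assumption: --limit keeps the first n rows after the other filters run.
    return selected if limit is None else selected[:limit]


jobs = [
    {"mode": "upload", "status": "completed"},
    {"mode": "extract", "status": "in-progress"},
    {"mode": "upload", "status": "failed"},
]
uploads = filter_jobs(jobs, mode="upload", limit=1)
```

Here `uploads` holds only the first upload job. Omitting all three arguments returns the full list unchanged.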

🔹 Detailed Job View

  • corpus-client job show <index>
  • Displays:
    • Status and progress
    • Metadata (language, categories, media type, etc.)
    • Failed files
    • Completed files (summarized)

🔹 Automatic Job Tracking

  • Jobs are created when:
    • upload-files is executed
    • extract is executed
  • Job files are:
    • Stored in ~/.corpus-cli/jobs/
    • Updated during execution
    • Reused during resume
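The create / update / resume lifecycle above can be sketched roughly as follows. The field names (`total`, `completed`, `failed`) and the function names are assumptions made for illustration; the real job file schema may differ:

```python
import json
import tempfile
from pathlib import Path


def save_job(job_path: Path, job: dict) -> None:
    """Write to a temp file, then rename, to reduce the risk of a half-written job file."""
    job_path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = job_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(job, indent=2), encoding="utf-8")
    tmp_path.replace(job_path)


def open_job(job_path: Path, mode: str, total: int) -> dict:
    """Load the existing job file on resume, or create a fresh one."""
    if job_path.exists():
        return json.loads(job_path.read_text(encoding="utf-8"))
    job = {"mode": mode, "total": total, "completed": [], "failed": []}
    save_job(job_path, job)
    return job


def record_result(job_path: Path, job: dict, file_name: str, ok: bool) -> None:
    """Incremental update after each processed file."""
    job["completed" if ok else "failed"].append(file_name)
    save_job(job_path, job)


# The real location is ~/.corpus-cli/jobs/; a temp directory keeps this demo side-effect free.
jobs_dir = Path(tempfile.mkdtemp()) / "jobs"
path = jobs_dir / "upload_2026-04-16_103000.json"

job = open_job(path, "upload", total=2)
record_result(path, job, "a.txt", ok=True)

resumed = open_job(path, "upload", total=2)  # same file, so the same job continues
```

Because the file is rewritten after every processed item, a resumed run sees exactly the progress that was recorded before the interruption.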

🛠 Implementation Details

📁 Job System

  • Added job.py:
    • Job class for parsing and processing job data
    • Status computation logic
    • Progress tracking utilities
    • Safe handling of corrupted files
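As an illustration of the status and progress logic, here is a hypothetical stand-in for the Job class. The field names and the rule "any failed file marks the job as failed" are assumptions for this sketch; job.py may compute status differently:

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for the Job class in job.py."""
    mode: str
    total: int
    completed: list[str] = field(default_factory=list)
    failed: list[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: dict) -> "Job":
        # Missing fields fall back to defaults instead of raising.
        return cls(
            mode=data.get("mode", "unknown"),
            total=int(data.get("total", 0)),
            completed=list(data.get("completed", [])),
            failed=list(data.get("failed", [])),
        )

    @property
    def progress(self) -> str:
        # Rendered as completed/total, matching the Progress column.
        return f"{len(self.completed)}/{self.total}"

    @property
    def status(self) -> str:
        # Assumed rule: any failed file marks the whole job as failed.
        if self.failed:
            return "failed"
        if self.total > 0 and len(self.completed) >= self.total:
            return "completed"
        return "in-progress"


job = Job.from_dict({"mode": "extract", "total": 20, "completed": ["f"] * 5})
```

With this input, `job.progress` is `"5/20"` and `job.status` is `"in-progress"`.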

🧾 Job Storage

  • Each job is stored as a timestamped JSON file:

    ~/.corpus-cli/jobs/<mode>_<timestamp>.json

  • Example:

    upload_2026-04-16_103000.json
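A small sketch of how such a filename could be built and parsed. The timestamp format `%Y-%m-%d_%H%M%S` is inferred from the example above, and both function names are hypothetical:

```python
from datetime import datetime


def job_filename(mode: str, started_at: datetime) -> str:
    """Build '<mode>_<timestamp>.json' (timestamp format inferred from the example)."""
    return f"{mode}_{started_at.strftime('%Y-%m-%d_%H%M%S')}.json"


def parse_job_filename(name: str) -> tuple[str, datetime]:
    """Split a job filename back into its mode and start time."""
    stem = name.removesuffix(".json")
    # The mode is everything before the first underscore.
    mode, _, stamp = stem.partition("_")
    return mode, datetime.strptime(stamp, "%Y-%m-%d_%H%M%S")


name = job_filename("upload", datetime(2026, 4, 16, 10, 30, 0))
```

One consequence of this format is that, within a single mode, sorting filenames alphabetically also sorts them chronologically.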


🖥 CLI Integration

  • Updated cli.py:
    • Added job command (list view)
    • Added filtering options
    • Added job show command for detailed inspection
  • Implemented using Typer and Rich for better UX

🔄 Integration with Existing Commands

  • Modified:
    • upload.py
    • extracted_text_upload.py
  • Added:
    • Job creation at start
    • Incremental updates during execution
    • Resume support (same job continues)

🧪 How to Test

1. Run Upload/Extract

    uv run corpus-client upload-files ...
    uv run corpus-client extract ...

2. List Jobs

uv run corpus-client job

3. Apply Filters

    uv run corpus-client job --mode upload
    uv run corpus-client job --status completed

4. View Job Details

uv run corpus-client job show 1


📊 Example Output

Job List

    Index | Mode    | Status         | Progress | Last Updated
    1     | upload  | ✓ completed    | 10/10    | 2026-04-16 10:30
    2     | extract | ⏳ in-progress | 5/20     | 2026-04-16 11:00


Job Details

    Job #1: Upload Job

    Status: ✓ completed
    Progress: 10/10
    Last Updated: 2026-04-16 10:30

    Metadata:
      Language: en

    Failed Files: None
    Completed Files: 10 files


Edge Cases Handled

  • Missing job directory → graceful message
  • Corrupted job files → skipped with warning
  • Invalid job index → handled safely
  • Missing metadata fields → safely ignored
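The first three cases can be illustrated with a defensive loader. This is a sketch under assumptions: the function names, the warning texts, and the 1-based index (suggested by `job show 1`) are not taken from the actual implementation:

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_jobs(jobs_dir: Path) -> tuple[list[dict], list[str]]:
    """Return (jobs, warnings) without raising on a missing directory or a bad file."""
    if not jobs_dir.is_dir():
        return [], [f"No job directory found at {jobs_dir}"]
    jobs: list[dict] = []
    warnings: list[str] = []
    for path in sorted(jobs_dir.glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError) as exc:  # JSONDecodeError is a ValueError
            warnings.append(f"Skipping corrupted job file {path.name}: {exc}")
            continue
        if isinstance(data, dict):
            jobs.append(data)
        else:
            warnings.append(f"Skipping unexpected content in {path.name}")
    return jobs, warnings


def get_job(jobs: list[dict], index: int) -> Optional[dict]:
    """Look up a job by its 1-based index; return None when out of range."""
    if 1 <= index <= len(jobs):
        return jobs[index - 1]
    return None


# Demo with one valid and one corrupted file in a temp directory.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "upload_2026-04-16_103000.json").write_text('{"mode": "upload"}', encoding="utf-8")
(demo_dir / "extract_2026-04-16_110000.json").write_text("{not json", encoding="utf-8")

jobs, warnings = load_jobs(demo_dir)
missing_jobs, missing_warnings = load_jobs(demo_dir / "does-not-exist")
```

Returning warnings alongside the results, rather than printing inside the loader, leaves it to the CLI layer to decide how to display them.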

📝 Notes

  • No authentication required (local state only)
  • Existing functionality remains unaffected
  • Backward compatible with current state files

Status

  • Feature complete
  • Tested (manual + unit tests)
  • Ready for review 🚀
Edited by Mohana Sri Bhavitha
