Chore: Enforce Automated Code Quality, Security Scanning, and Test Coverage Thresholds

Summary

This MR integrates the pre-commit framework, raises total test coverage from roughly 80% to 99%, and enforces a 95% minimum coverage gate in both local hooks and CI/CD.


Architectural Changes

1. Validation Workflow (Standardized)

  • Framework: Implemented .pre-commit-config.yaml using standard Python ecosystem tools.
  • Installation: Standardized via uv run pre-commit install.

Hooks Integrated:

Hook         Purpose
-----------  ----------------------------------------
Commitizen   Enforces Conventional Commits
Ruff         High-performance linting and formatting
Bandit       Automated security vulnerability scanning
Mypy         Strict static type checking
Vulture      Dead code detection
Pytest-Cov   Test execution with coverage enforcement

2. Test Coverage Excellence (99%)

  • Gap Analysis: Resolved major coverage gaps in vlm_client.py (0% → 100%) and models.py (0% → 100%).
  • Advanced Testing: Implemented 19+ new test cases covering:
    • image_utils.py: Complex EXIF/GPS extraction and image manipulation utilities.
    • pipeline.py: LLM instantiation logic and malformed OCR data fallbacks.
    • main.py: Singleton pipeline initialization and CLI operational modes.
    • validation.py: Exhaustive edge cases for ISBN-10 and ISBN-13 formats.
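To illustrate the kind of ISBN edge cases such tests exercise, here is a minimal sketch of checksum validation with a few assertions. The function names (is_valid_isbn10, is_valid_isbn13) and the test function are hypothetical and are not taken from the project's validation.py; only the standard ISBN checksum rules are assumed.

```python
# Hypothetical helpers for illustration; not the project's validation.py API.
DIGITS = set("0123456789")


def _strip(raw: str) -> str:
    """Drop the hyphens and spaces that commonly separate ISBN groups."""
    return raw.replace("-", "").replace(" ", "")


def is_valid_isbn10(raw: str) -> bool:
    """ISBN-10: weights 10..1, sum divisible by 11, final 'X' means 10."""
    s = _strip(raw).upper()
    if len(s) != 10:
        return False
    body, check = s[:9], s[9]
    if not set(body) <= DIGITS or not (check in DIGITS or check == "X"):
        return False
    values = [int(c) for c in body] + [10 if check == "X" else int(check)]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def is_valid_isbn13(raw: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    s = _strip(raw)
    if len(s) != 13 or not set(s) <= DIGITS:
        return False
    return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s)) % 10 == 0


def test_isbn_edge_cases() -> None:
    assert is_valid_isbn10("0-306-40615-2")
    assert is_valid_isbn10("0-8044-2957-x")        # lowercase check digit
    assert not is_valid_isbn10("0-306-40615-3")    # wrong check digit
    assert not is_valid_isbn10("03064X6152")       # 'X' outside last position
    assert is_valid_isbn13("978-0-306-40615-7")
    assert not is_valid_isbn13("978-0-306-40615-8")
    assert not is_valid_isbn13("")                 # empty input
```

Cases like a misplaced "X", a lowercase check digit, or empty input are the sort of branches that tend to stay uncovered until tested explicitly.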

3. CI/CD & Quality Enforcement

  • Strict Gates: Configured a mandatory 95% minimum coverage threshold. Local commits (via the Pytest-Cov hook) and CI pipelines fail if coverage drops below this limit.
  • GitLab Visualization: Configured .gitlab-ci.yml with specific regex parsing to enable native GitLab coverage reporting (visible on Pipeline and MR pages).
  • Security Fixes: Addressed Bandit-flagged issues, including replacing hardcoded /tmp paths with tempfile.TemporaryDirectory.
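The /tmp migration mentioned above can be sketched as follows. This is an illustrative pattern only: the function name process_in_scratch_dir and the file name scratch.bin are hypothetical, and the actual call sites in this MR may look different. Bandit reports hardcoded temporary paths under check B108.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def process_in_scratch_dir(payload: bytes) -> tuple[Path, int]:
    """Write payload to a private scratch directory and report its size.

    Hypothetical example; the name and file layout are assumptions.
    """
    # Before (flagged by Bandit B108, hardcoded temp directory):
    #     scratch = Path("/tmp/scratch.bin")
    # After: a uniquely named directory that is removed on exit.
    with tempfile.TemporaryDirectory() as tmp_dir:
        scratch = Path(tmp_dir) / "scratch.bin"
        scratch.write_bytes(payload)
        size = scratch.stat().st_size
    # The directory and its contents no longer exist at this point.
    return scratch, size
```

Besides avoiding predictable paths in a shared directory, the context manager guarantees cleanup even when the body raises.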

Coverage Statistics

Module          Before  After
--------------  ------  -----
Total Project   ~80%    99%
vlm_client.py   0%      100%
models.py       0%      100%
image_utils.py  73%     100%
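As a rough sketch of how a coverage gate and the GitLab regex parsing fit together, the snippet below extracts the total from a pytest-cov style terminal report and compares it with the 95% threshold. The pattern is an assumption modeled on the commonly used GitLab setting coverage: '/TOTAL.*\s+(\d+%)$/'; the exact regex in this MR's .gitlab-ci.yml is not shown here, and the helper names are hypothetical.

```python
import re

# Assumed pattern; the regex actually configured in .gitlab-ci.yml may differ.
TOTAL_LINE = re.compile(r"^TOTAL.*\s+(\d+)%$", re.MULTILINE)
THRESHOLD = 95  # minimum coverage stated in this MR


def parse_total_coverage(report: str) -> int:
    """Return the integer percentage from the TOTAL line of a report."""
    match = TOTAL_LINE.search(report)
    if match is None:
        raise ValueError("no TOTAL line found in coverage report")
    return int(match.group(1))


def gate_passes(report: str, threshold: int = THRESHOLD) -> bool:
    """True when total coverage meets or exceeds the threshold."""
    return parse_total_coverage(report) >= threshold


# Illustrative report text; the statement counts are made up.
SAMPLE_REPORT = """\
Name      Stmts   Miss  Cover
-----------------------------
TOTAL       400      4    99%
"""

if __name__ == "__main__":
    print(parse_total_coverage(SAMPLE_REPORT), gate_passes(SAMPLE_REPORT))
```

In practice pytest-cov can enforce the same limit directly through its --cov-fail-under option, so a parser like this is only needed where the percentage has to be read back out of log text, as GitLab does.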

How to Onboard

# 1. Sync dependencies
uv sync

# 2. Install hooks
uv run pre-commit install

# 3. Verify everything
uv run pre-commit run --all-files

Checklist

  • All modules exceed 95% test coverage.
  • Pre-commit hooks are passing locally.
  • GitLab CI is configured for coverage visualization.
  • Security vulnerabilities (Bandit) resolved.
  • README.md updated with standard developer instructions.

Closes #9

Edited by Kushal Lagichetty
