feat: add record listing command with pagination, filtering, and sorting
Overview
Adds a corpus-client list command that lets users browse their uploaded records directly from the CLI with pagination, filtering, and sorting. This implements ROADMAP item 2.1 (Record Listing) — previously users had no way to view records without switching to the web frontend.
What does this MR do and why?
Implements ROADMAP 2.1 - adds corpus-client list command that fetches
records from GET /api/v1/records/ with support for media type, language,
status, and date range filters plus sorting by date, title, or status.
Motivation: The CLI supports uploading and extracting but had no way to view uploaded records. Users had to switch to the React web app to check record status or browse contributions, breaking the CLI-only workflow.
Approach: Follows the existing architecture — thin Typer command in cli.py delegating to a new records.py module with a single run_list_records() entry point. This is the same pattern used by upload.py and extracted_text_upload.py.
Trade-offs: The API response format is handled defensively (results, items, or direct list) since the exact backend response structure hasn't been confirmed. Unsupported query params are silently ignored by the API.
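The defensive handling described above could look roughly like the sketch below. It is not the code in `records.py`: the function name `parse_response` and the `count` / `total` field names are assumptions made for illustration, since the backend contract has not been confirmed.

```python
from typing import Any


def parse_response(payload: Any) -> tuple[list, int]:
    """Return (records, total) for the three response shapes named above."""
    if isinstance(payload, list):
        # Direct list: the body is the list of records itself.
        return payload, len(payload)
    if isinstance(payload, dict):
        for key in ("results", "items"):
            records = payload.get(key)
            if isinstance(records, list):
                # "count" and "total" are assumed names for the total field;
                # fall back to the page length when neither is present.
                total = next(
                    (payload[k] for k in ("count", "total")
                     if isinstance(payload.get(k), int)),
                    len(records),
                )
                return records, total
    # Unknown shape: report no records instead of raising.
    return [], 0
```

Returning an empty result for an unrecognised shape keeps the command on the "No records found." path rather than crashing; whether that is preferable to surfacing an error is a judgment call.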
Changes Made
| File | Action | Purpose |
|---|---|---|
| `src/corpus_client_cli/records.py` | Created | API fetching, response parsing, Rich table rendering |
| `src/corpus_client_cli/cli.py` | Modified | Added list command with 9 Typer options, imported records module |
| `docs/ROADMAP.md` | Modified | Marked 2.1 items as [x] complete |
Technical Details
Architecture:
- `cli.py` adds a `list_records()` command that checks auth, creates an `aiohttp.ClientSession`, and delegates to `records.run_list_records()`
- `records.py` contains 4 focused functions:
  - `_build_params()`: builds the query params dict, excluding None values
  - `_parse_response()`: extracts the records list and total count from flexible API response formats
  - `_display_records()`: renders a Rich table with emoji media type labels and pagination info
  - `run_list_records()`: single async entry point that fetches, parses, displays, and handles errors
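As an illustration of the param-building step, here is a minimal sketch of how None values might be dropped and how `sort_order` might be tied to `sort_by`. The function name and signature are hypothetical, the `page` and `size` param names are assumptions, and the remaining names are the ones this MR itself flags as unconfirmed.

```python
from typing import Optional


def build_params(
    page: int = 1,
    size: int = 20,
    media_type: Optional[str] = None,
    language: Optional[str] = None,
    status: Optional[str] = None,
    sort_by: Optional[str] = None,
    sort_order: str = "desc",
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> dict[str, str]:
    """Build the query string dict, leaving out anything the user did not set."""
    candidates = {
        "page": page,  # assumed param name
        "size": size,  # assumed param name
        "media_type": media_type,
        "language": language,
        "status": status,
        "sort_by": sort_by,
        # Only meaningful alongside sort_by, so it is dropped otherwise.
        "sort_order": sort_order if sort_by else None,
        "date_from": date_from,
        "date_to": date_to,
    }
    return {key: str(value) for key, value in candidates.items() if value is not None}
```

For example, `build_params(media_type="audio", language="telugu")` yields only `page`, `size`, `media_type`, and `language`, so no empty filters reach the API.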
CLI options:
| Option | Type | Default | Description |
|---|---|---|---|
| `--page` / `-p` | int | 1 | Page number |
| `--size` / `-s` | int | 20 | Records per page |
| `--type` / `-t` | str | None | Filter: text, audio, video, document, image |
| `--language` / `-l` | str | None | Filter: language code |
| `--status` | str | None | Filter: record status |
| `--sort` | str | None | Sort field: date, title, status |
| `--order` | str | desc | Sort order: asc, desc |
| `--from` | str | None | Start date (YYYY-MM-DD) |
| `--to` | str | None | End date (YYYY-MM-DD) |
Type of Change
- 🐛 Bug fix (non-breaking change that fixes an issue)
- ✨ New feature (non-breaking change that adds functionality)
- 💥 Breaking change (fix or feature that would cause existing functionality to change)
- 📝 Documentation update
- 🎨 UI/UX improvement
- ♻️ Refactor (no functional changes)
- ⚡ Performance improvement
- 🧪 Test update
- 🔧 Configuration change
- 🚨 Security fix
Related Issues / References
- Related to: ROADMAP Priority 2 (Fetch/Retrieve Features)
- Enables: 2.2 Record Details (depends on record IDs visible from list output)
- Depends on: Backend `GET /api/v1/records/` endpoint
Screenshots or Screen Recordings
corpus-client list --help output:
Usage: corpus-client list [OPTIONS]
📋 List uploaded records with filters and pagination
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --page -p INTEGER Page number [default: 1] │
│ --size -s INTEGER Records per page [default: 20] │
│ --type -t TEXT Filter by media type (text, audio, video, │
│ document, image) │
│ --language -l TEXT Filter by language │
│ --status TEXT Filter by status │
│ --sort TEXT Sort by field (date, title, status) │
│ --order TEXT Sort order (asc, desc) [default: desc] │
│ --from TEXT Filter from date (YYYY-MM-DD) │
│ --to TEXT Filter to date (YYYY-MM-DD) │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
Table output (mock data):
📋 Records (Page 1)
┌────┬─────────┬──────────────┬──────────────┬────────────┬──────────────┬──────────────┐
│ # │ ID │ Title │ Type │ Language │ Status │ Created │
├────┼─────────┼──────────────┼──────────────┼────────────┼──────────────┼──────────────┤
│ 1 │ abc-123 │ Sample Audio │ 🎵 Audio │ telugu │ approved │ 2025-06-15 │
│ 2 │ def-456 │ My Document │ 📑 Document │ hindi │ pending │ 2025-07-20 │
└────┴─────────┴──────────────┴──────────────┴────────────┴──────────────┴──────────────┘
Showing page 1 of 1 (2 total records)
How to Set Up and Validate Locally
- Pull this branch: `git checkout feat/record-listing-harsha`
- Install dependencies: `uv sync`
- Verify command exists: `corpus-client list --help`
- Test auth guard (without logging in): `corpus-client list` (expected: "Not logged in. Run: corpus-client login" and exit code 1)
- Test with authentication:
  - `corpus-client login`
  - `corpus-client list`
  - `corpus-client list --type audio --language telugu`
  - `corpus-client list --page 2 --size 10`
  - `corpus-client list --sort title --order asc`
  - `corpus-client list --from 2025-01-01 --to 2025-12-31`
- Expected: Rich table with records or "No records found." if no matches.
Testing Done
- Manual testing completed
- Unit tests run (13 tests via inline test script)
Test Cases Covered:
| Scenario | Expected Result | Status |
|---|---|---|
| List records (default pagination) | Table with up to 20 records, pagination footer | |
| Pagination (page 2, size 10) | Row numbering starts at 11, correct page count | |
| Filter by media type | Only matching records shown | |
| Filter by language | Only matching records shown | |
| Filter by status | Only matching records shown | |
| Filter by date range | date_from/date_to params sent | |
| Sort by title asc | sort_by + sort_order params sent | |
| Empty results | "No records found." in yellow | |
| Unauthenticated user | Auth error message, exit code 1 | |
| Expired token (401) | "Unauthorized. Please login again." | |
| Connection error | "Connection error: ..." | |
| Generic API error (500) | "API error (HTTP 500): ..." | |
| Title truncation (>28 chars) | Truncated to 25 + "..." | |
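The pagination and truncation rows in the table come down to a little arithmetic, sketched below. The helper names are hypothetical and not taken from `records.py`; the minimum of one page for an empty result is an assumption.

```python
import math


def truncate_title(title: str, limit: int = 28) -> str:
    """Keep titles up to `limit` chars; longer ones become 25 chars plus '...'."""
    if len(title) <= limit:
        return title
    return title[: limit - 3] + "..."


def first_row_number(page: int, size: int) -> int:
    """Row numbering continues across pages: page 2 with size 10 starts at 11."""
    return (page - 1) * size + 1


def total_pages(total: int, size: int) -> int:
    """Page count for the footer; at least 1 so the footer never reads 'of 0'."""
    return max(1, math.ceil(total / size))
```

With the mock data shown earlier (2 records, size 20), `total_pages(2, 20)` gives 1, matching the "Showing page 1 of 1" footer.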
Code Quality Checklist
Code Standards
- Code follows project conventions (async pattern, module delegation, Rich output)
- No console.log() or debugger statements left in code
- No unused imports, variables, or functions
- No duplicate code and use of existing components for reusability
- `ruff check` passes on `records.py` (0 errors)
Python / CLI Best Practices
- Follows existing module pattern (`run_*` entry point per module)
- Auth check consistent with other commands (`upload_files`, `upload_extracted`)
- Rich console passed from `cli.py` (single instance, not recreated)
- aiohttp session created in `cli.py` and passed to module (consistent with existing commands)
- Error handling covers: 200, 401, other HTTP errors, connection exceptions
API & Data Fetching
- Bearer token auth header included
- Query params built cleanly (None values excluded)
- Response format handled defensively (`results` / `items` / list)
- HTTP error codes handled (401 specific message, generic for others)
- Connection errors caught and displayed
Error Handling
- Errors caught and handled gracefully
- User-friendly error messages displayed with Rich formatting
- Network failures handled with actionable message
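The status-to-message mapping implied by the test cases can be summarised in a small pure function. This is a sketch only: the function names are hypothetical, the message strings are the ones quoted in the test table, and any Rich styling applied by the real code is left out.

```python
from typing import Optional


def error_message(status: int, detail: str = "") -> Optional[str]:
    """Return the user-facing message for an HTTP status, or None on success."""
    if status == 200:
        return None
    if status == 401:
        # Expired or invalid token gets a specific, actionable message.
        return "Unauthorized. Please login again."
    # Everything else falls through to the generic form.
    return f"API error (HTTP {status}): {detail}"


def connection_error_message(exc: Exception) -> str:
    """Message for failures that happen before any HTTP status is received."""
    return f"Connection error: {exc}"
```

Keeping the mapping free of I/O makes it straightforward to cover with pytest once formal tests are added.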
Documentation
- ROADMAP.md updated (2.1 items marked complete)
- README.md updated: not needed, no setup changes
- user-manual.md updated: should be updated in a follow-up to document the `list` command
Known Limitations / Technical Debt
- API response format assumed: The `_parse_response()` function handles `results`, `items`, and direct list formats defensively. The actual backend response format should be confirmed.
- Query param names assumed: `media_type`, `language`, `status`, `sort_by`, `sort_order`, `date_from`, `date_to` may need adjustment based on the actual API contract.
- No formal pytest tests: Verification done via inline test scripts (13 tests). Formal pytest tests should be added as the test infrastructure matures.
- user-manual.md not updated: The `list` command should be documented in a follow-up MR.
Additional Notes
- This is a read-only command: no state files created, no data modified.
- The `sort_order` param is only sent when `sort_by` is provided (avoids a meaningless param).
- Pre-existing lint warnings in `cli.py` (unused `logging` import, unused `glob` import, ambiguous variable `l`) are not addressed in this MR; they're unrelated to this feature.
MR Acceptance Checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.