feat: add corpus-client list command for browsing uploaded records
Feature Summary
Add a corpus-client list command that lets users browse their uploaded records directly from the CLI with pagination, filtering (media type, language, status, date range), and sorting (date, title, status).
Problem Statement
- Current limitation: The CLI supports uploading and extracting but has no way to view or browse uploaded records. Users must switch to the React web frontend (`corpus-client-app`) to see their records.
- Who experiences this: All CLI users who want a terminal-only workflow for managing corpus records.
- Frequency: Every time a user wants to verify uploads, check record status, or browse contributions.
- Workaround: Open the web app at the same API base URL and browse records there — breaks the CLI-only workflow.
Proposed Solution
Add a list command to the CLI that:
- Calls `GET /api/v1/records/` with query params for pagination, filtering, and sorting
- Renders results as a formatted Rich table in the terminal
- Follows existing architecture: thin Typer command in `cli.py` delegating to a new `records.py` module
User flow:
corpus-client login # authenticate
corpus-client list # see all records (page 1, 20 per page)
corpus-client list --type audio --language telugu # filter by type + language
corpus-client list --page 2 --size 10 # paginate
corpus-client list --sort title --order asc # sort
corpus-client list --from 2025-01-01 --to 2025-12-31 # date range filter
Test-Driven Development
Acceptance Criteria (Given-When-Then)
Primary User Flow
Scenario 1: List records with default pagination
Given the user is authenticated
And records exist in the system
When the user runs `corpus-client list`
Then a table of up to 20 records is displayed
And each row shows: #, ID, Title, Type, Language, Status, Created date
And pagination info shows "Showing page 1 of N (X total records)"
Alternative Flows
Scenario 2: Filter by media type
Given the user is authenticated
And records of type "audio" exist
When the user runs `corpus-client list --type audio`
Then only records with media_type "audio" are displayed
And the Type column shows the emoji label for each media type
Scenario 3: Paginate through results
Given the user is authenticated
And more than 10 records exist
When the user runs `corpus-client list --page 2 --size 10`
Then records 11-20 are displayed
And row numbering starts at 11
And pagination info shows "Showing page 2 of N"
Scenario 4: Sort records
Given the user is authenticated
When the user runs `corpus-client list --sort title --order asc`
Then records are displayed sorted by title ascending
Scenario 5: Filter by date range
Given the user is authenticated
When the user runs `corpus-client list --from 2025-01-01 --to 2025-06-30`
Then only records created within the date range are displayed
Scenario 6: Combined filters
Given the user is authenticated
When the user runs `corpus-client list --type audio --language telugu --sort date`
Then only Telugu audio records are displayed, sorted by date
Edge Cases
Scenario 7: Empty result set
Given the user is authenticated
And no video records exist
When the user runs `corpus-client list --type video`
Then the message "No records found." is displayed in yellow
Scenario 8: Validation - Unauthenticated user
Given the user is NOT authenticated
When the user runs `corpus-client list`
Then the message "Not logged in. Run: corpus-client login" is displayed
And the CLI exits with code 1
Scenario 9: Error - API connection failure
Given the user is authenticated
And the API server is unreachable
When the user runs `corpus-client list`
Then the message "Connection error: ..." is displayed in red
Scenario 10: Error - Expired token
Given the user has an expired auth token
When the user runs `corpus-client list`
And the API returns HTTP 401
Then the message "Unauthorized. Please login again." is displayed in red
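Scenarios 9 and 10 imply a small routing step between the fetch result and the printed message. A minimal sketch of that step follows, assuming `fetch_records()` reports failures as a dict shaped like `{"error": ..., "status_code": ...}`. That shape, the helper name `error_message`, and the wording of the generic fallback are illustrative assumptions, not part of the spec.

```python
from typing import Any, Optional


def error_message(result: Any) -> Optional[str]:
    """Return the message to print for a failed fetch, or None on success.

    Assumed (hypothetical) error shape:
        {"error": <text>, "status_code": <int, or None if no response>}
    """
    if not isinstance(result, dict) or "error" not in result:
        return None  # a normal payload (dict or bare list), nothing to report
    status = result.get("status_code")
    if status == 401:
        # Scenario 10: expired or invalid token
        return "Unauthorized. Please login again."
    if status is None:
        # Scenario 9: the request never reached the server
        return f"Connection error: {result['error']}"
    # Generic fallback; this wording is an assumption
    return f"Error {status}: {result['error']}"
```

Keeping this as a pure function lets `run_list_records` be tested without a terminal or a live API.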
Unit Test Requirements
Components/Functions to Create:
| File Path | Component/Function | Test Coverage Required |
|---|---|---|
| `src/corpus_client_cli/records.py` | `fetch_records()` | Correct query params, auth header, 200/non-200/exception handling |
| `src/corpus_client_cli/records.py` | `display_records()` | Table rendering, empty state, pagination math, title truncation, media type labels |
| `src/corpus_client_cli/records.py` | `run_list_records()` | Orchestration, error routing (401, connection, generic) |
| `src/corpus_client_cli/cli.py` | `list_records()` | Auth guard, parameter passthrough |
Test Assertions Required:
- `fetch_records` sends correct query params to API
- `fetch_records` includes `Authorization: Bearer <token>` header
- `fetch_records` returns parsed JSON on HTTP 200
- `fetch_records` returns error dict on non-200 responses
- `fetch_records` returns error dict on connection exceptions
- `fetch_records` conditionally includes optional params only when provided
- `display_records` renders Rich table with 7 columns (#, ID, Title, Type, Language, Status, Created)
- `display_records` handles `results`, `items`, and direct list response formats
- `display_records` shows "No records found." for empty results
- `display_records` calculates correct total pages
- `display_records` truncates titles longer than 28 characters to 25 + "..."
- `display_records` maps media types to emoji labels (text, audio, video, document, image)
- `run_list_records` prints specific error for 401 responses
- `run_list_records` prints specific error for connection failures
- `list_records` CLI command exits with code 1 for unauthenticated users
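Two of the `display_records` assertions above are easy to pin down as pure helpers: accepting the three response shapes, and the 28-character title rule. The sketch below is one possible reading of those requirements. The helper names `extract_records` and `truncate_title` are hypothetical, as is the choice to return an empty list when neither key is present.

```python
from typing import Any, Dict, List, Union


def extract_records(payload: Union[Dict[str, Any], List[Any]]) -> List[Any]:
    """Pull the record list out of any of the three accepted shapes."""
    if isinstance(payload, list):
        return payload  # direct list response
    for key in ("results", "items"):
        value = payload.get(key)
        if isinstance(value, list):
            return value
    return []  # assumption: an unknown shape is treated as "no records"


def truncate_title(title: str, limit: int = 28) -> str:
    """Keep titles up to `limit` chars; otherwise cut to limit - 3 plus '...'."""
    if len(title) <= limit:
        return title
    return title[: limit - 3] + "..."
```

With the default limit, a 29-character title becomes 25 characters plus `...`, so no rendered title exceeds 28 characters.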
Technical Specification
Files to Create/Modify
| File Path | Action | Purpose |
|---|---|---|
| `src/corpus_client_cli/records.py` | Create | API fetching, response parsing, Rich table rendering |
| `src/corpus_client_cli/cli.py` | Modify | Add list command + records module import |
| `pyproject.toml` | Modify | Add `rich>=13.0.0` as explicit dependency |
| `docs/ROADMAP.md` | Modify | Mark 2.1 items as [x] |
API Requirements
| Endpoint | Method | Request Body | Response | Status |
|---|---|---|---|---|
| `/api/v1/records/` | GET | N/A (query params: page, page_size, media_type, language, status, sort_by, sort_order, date_from, date_to) | `{"results": [...], "total": N}` | [x] Exists (per ROADMAP) |
Query Parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| `page` | int | 1 | Page number |
| `page_size` | int | 20 | Records per page |
| `media_type` | string | - | Filter: text, audio, video, document, image |
| `language` | string | - | Filter: language code |
| `status` | string | - | Filter: record status |
| `sort_by` | string | - | Sort field: date, title, status |
| `sort_order` | string | desc | Sort order: asc, desc |
| `date_from` | string | - | Start date (YYYY-MM-DD) |
| `date_to` | string | - | End date (YYYY-MM-DD) |
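The table above, together with the requirement that optional params are sent only when provided, suggests a small builder along these lines. This is a sketch under assumptions: the function name `build_query_params` is hypothetical, and always sending `page`, `page_size`, and `sort_order` (the three params with defaults) is one reasonable interpretation rather than confirmed behaviour.

```python
from typing import Dict, Optional, Union

ParamValue = Union[int, str]


def build_query_params(
    page: int = 1,
    page_size: int = 20,
    media_type: Optional[str] = None,
    language: Optional[str] = None,
    status: Optional[str] = None,
    sort_by: Optional[str] = None,
    sort_order: str = "desc",
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> Dict[str, ParamValue]:
    """Build the query string dict for GET /api/v1/records/."""
    # Params with documented defaults are always included (assumption).
    params: Dict[str, ParamValue] = {
        "page": page,
        "page_size": page_size,
        "sort_order": sort_order,
    }
    optional = {
        "media_type": media_type,
        "language": language,
        "status": status,
        "sort_by": sort_by,
        "date_from": date_from,
        "date_to": date_to,
    }
    # Filters without a default are only sent when the user supplied them.
    params.update({k: v for k, v in optional.items() if v is not None})
    return params
```

For example, `build_query_params(media_type="audio", language="telugu")` would carry the two filters plus the three defaulted params, and nothing else.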
UI/UX Specification
Design Requirements
Terminal table output using Rich:
📋 Records (Page 1)
┌────┬────────┬──────────────┬────────────┬──────────┬────────────┬────────────┐
│ # │ ID │ Title │ Type │ Language │ Status │ Created │
├────┼────────┼──────────────┼────────────┼──────────┼────────────┼────────────┤
│ 1 │ abc-12 │ Sample Audio │ 🎵 Audio │ telugu │ approved │ 2025-06-15 │
│ 2 │ def-45 │ My Document │ 📑 Document│ hindi │ pending │ 2025-07-20 │
└────┴────────┴──────────────┴────────────┴──────────┴────────────┴────────────┘
Showing page 1 of 3 (45 total records)
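The footer and the `#` column both follow from simple pagination arithmetic. The sketch below reproduces the numbers used in this issue (45 records at 20 per page gives 3 pages; page 2 at size 10 starts at row 11). The helper names are hypothetical, `page_size` is assumed to be positive, and reporting at least one page for an empty total is an assumption.

```python
import math


def total_pages(total: int, page_size: int) -> int:
    """Number of pages needed for `total` records (at least 1, by assumption)."""
    return max(1, math.ceil(total / page_size))


def first_row_number(page: int, page_size: int) -> int:
    """Row number shown in the '#' column for the first record on a page."""
    return (page - 1) * page_size + 1


def pagination_footer(page: int, page_size: int, total: int) -> str:
    """Footer line in the format shown in the mock-up above."""
    pages = total_pages(total, page_size)
    return f"Showing page {page} of {pages} ({total} total records)"
```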
Media type emoji labels:
- text → 📄 Text
- audio → 🎵 Audio
- video → 🎬 Video
- document → 📑 Document
- image → 🖼️ Image
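The label mapping can live in a plain dictionary. In this sketch the names `MEDIA_TYPE_LABELS` and `media_type_label` are hypothetical, and falling back to the raw value for an unrecognised media type is an assumption; the issue does not say what should happen in that case.

```python
# Emoji labels as listed above, keyed by the API's media_type value.
MEDIA_TYPE_LABELS = {
    "text": "📄 Text",
    "audio": "🎵 Audio",
    "video": "🎬 Video",
    "document": "📑 Document",
    "image": "🖼️ Image",
}


def media_type_label(media_type: str) -> str:
    """Return the emoji label, or the raw value if the type is unknown."""
    return MEDIA_TYPE_LABELS.get(media_type, media_type)
```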
Color scheme (consistent with existing commands):
- Column headers: default Rich table styling
- ID: cyan
- Title: white
- Type: magenta
- Language: green
- Status: yellow
- Created / row number: dim
Accessibility Requirements
- Works in any terminal with Unicode support
- No color-only information (emoji labels supplement color)
- Clear error messages with actionable guidance
Definition of Done
Development
- All acceptance criteria (10 Given-When-Then scenarios) pass
- Unit tests written for `fetch_records`, `display_records`, `run_list_records`
- Code follows project conventions (async pattern, module delegation, Rich output)
- No lint errors (`ruff` passes)
- `rich` dependency explicitly declared in `pyproject.toml`
Testing
- Manual testing completed against dev API (https://dev.api.corpus.swecha.org)
- `corpus-client list --help` shows all 9 options correctly
- `corpus-client --help` shows `list` command in command list
- Authenticated and unauthenticated flows verified
- Empty result and error flows verified
Documentation
- ROADMAP.md updated (2.1 items marked complete)
Code Review
- Code reviewed and approved
Additional Context
Open Questions
- Does the backend `GET /api/v1/records/` endpoint support all the proposed query params (`date_from`, `date_to`, `sort_by`, `sort_order`)? If not, the command still works: unsupported params are ignored server-side.
- What is the exact response format? Implementation defensively handles `results`, `items`, or direct list.
- Does the API support filtering by `status`? What are the valid status values?
Related Issues
- Related to: ROADMAP Priority 2 (Fetch/Retrieve Features)
- Enables: 2.2 Record Details (depends on record IDs visible from list output)
- Depends on: Backend `GET /api/v1/records/` endpoint availability
References
- API endpoint: `GET /api/v1/records/` (per `docs/ROADMAP.md` line 78)
- CLI framework: Typer (https://typer.tiangolo.com/)
- Output formatting: Rich (https://rich.readthedocs.io/)
- Existing pattern reference: `upload.py` → `run_record_upload()` delegation