feat: corpus integration in the team leaderboard
## Overview
This MR enhances the **Team Leaderboard** with two major capabilities:
1. **Corpus Contribution Integration** — Members' Corpus media activity (audio, image, video, file uploads) is now fetched, displayed per-member, and factored into attendance/consistency index calculations alongside GitLab contributions.
2. **Individual Metrics Excel Export** — A new downloadable Excel report provides granular, per-member contribution breakdowns (MRs, Issues, Commits, Corpus files, Active Days, Attendance %, Consistency %) in addition to the existing full team report.
---
## What does this MR do and why?
Previously the leaderboard only tracked GitLab activity (MRs, Issues, Commits). This MR closes the gap by pulling Corpus media contributions into the leaderboard pipeline, giving team leads a holistic view of member productivity. Additionally, the new **Individual Metrics export** enables offline analysis of each member's raw contribution data per team.
---
## Changes Made
### `src/gitlab_compliance_checker/infrastructure/corpus/client.py` *(NEW)*
- Introduced a dedicated `CorpusClient` that authenticates via token and exposes endpoints to list members' uploaded files (audio, image, video, generic files) filtered by date range.
### `src/gitlab_compliance_checker/ui/leaderboard.py`
- Added `_render_corpus_login()` sidebar widget for users to supply Corpus credentials; stores authenticated client and token in `st.session_state`.
- Added `_fetch_corpus_media_for_team()` helper that concurrently fetches Corpus files per member using `corpus_username` → GitLab username mapping.
- Added `get_corpus_list_html()` to render stylised file lists (with icons, bucket colours) inside member expanders.
- Extended `_render_team_result()` to show a Corpus file section (audio / image / video / file buckets) beneath existing GitLab activity panels; shows login prompt if credentials are absent.
- Extended team data assembly loop to:
  - Attach `corpus_username` and empty `corpus_files` scaffold from roster metadata.
  - After GitLab data fetch, call the Corpus fetcher and attach results per member.
  - Pre-calculate `Active Days`, `Total Days`, `Consistency %`, `Attendance %`, `Working Days` for each member row (including Corpus data) before rendering.
- Added `_build_individual_metrics_excel_export()` to produce a per-member Excel workbook with individual contribution breakdowns.
- Exposed a second **"Download Individual Metrics (Excel)"** button beside the existing full-report button in a two-column layout.
- Added a leaderboard-level warning banner when Corpus credentials are missing.
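
The concurrent per-member fetch described for `_fetch_corpus_media_for_team()` could look roughly like the sketch below. It is not the code in this MR: the function name `fetch_corpus_media_for_team`, the `fetch_files` callable, the `username` key, and the choice to fall back to an empty list on errors are all assumptions made for illustration. Only the `corpus_username` field comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fetch_corpus_media_for_team(
    members: list[dict],
    fetch_files: Callable[[str], list[dict]],
    max_workers: int = 8,
) -> dict[str, list[dict]]:
    """Return Corpus files keyed by GitLab username (hypothetical helper)."""
    # Every member gets an entry, so members without a Corpus account map to [].
    results: dict[str, list[dict]] = {m["username"]: [] for m in members}
    # Map GitLab username -> Corpus username, skipping members without one.
    targets = {
        m["username"]: m["corpus_username"]
        for m in members
        if m.get("corpus_username")
    }
    if not targets:
        return results

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            gitlab_user: pool.submit(fetch_files, corpus_user)
            for gitlab_user, corpus_user in targets.items()
        }
        for gitlab_user, future in futures.items():
            try:
                results[gitlab_user] = future.result()
            except Exception:
                # Assumption: a failed lookup leaves that member's list empty.
                results[gitlab_user] = []
    return results
```

Passing the fetcher in as a callable keeps the sketch independent of the real `CorpusClient` and makes it easy to substitute a fake in tests.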
### `src/gitlab_compliance_checker/services/roster_service.py`
- Minor: `corpus_username` field now surfaced from roster metadata during member row construction.
### `tests/test_leaderboard_ui_extended.py`
- Updated `test_render_team_result` mock to patch `streamlit.columns` with 6 columns (up from 5) to reflect the new Corpus file section column layout.
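
A minimal sketch of why the mock has to return six columns: a renderer that unpacks six values from `st.columns(...)` raises `ValueError` if the stub supplies only five. The `make_streamlit_stub` and `render_member_row` helpers are hypothetical stand-ins, not the project's actual test or `_render_team_result()`.

```python
from unittest.mock import MagicMock


def make_streamlit_stub(n_columns: int) -> MagicMock:
    """Build a stand-in for the `streamlit` module whose columns() yields n mocks."""
    st = MagicMock(name="streamlit")
    st.columns.return_value = [MagicMock(name=f"col{i}") for i in range(n_columns)]
    return st


def render_member_row(st) -> None:
    """Hypothetical renderer that unpacks six columns, the last one for Corpus."""
    c1, c2, c3, c4, c5, corpus_col = st.columns(6)
    with corpus_col:  # MagicMock supports the context-manager protocol
        st.caption("Corpus files")
```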
---
## Technical Details
- **Corpus auth flow**: Token stored in `st.session_state["_lb_corpus_token"]`; client stored in `st.session_state["_lb_corpus_client"]`. This avoids re-authentication on every rerun.
- **Date filtering**: Corpus API receives `YYYY-MM-DD` date parts sliced from the existing ISO timestamp filter, ensuring consistency with GitLab date range.
- **Attendance metric**: `_get_contribution_index()` now receives `corpus_files` kwarg so audio/media days count towards active-day computation, not just GitLab events.
- **Excel export**: `_build_individual_metrics_excel_export()` iterates `team_data` and writes one sheet per team with columns: Name, Username, MRs, Issues, Commits, Audio Files, Image Files, Video Files, Files, Active Days, Working Days, Consistency %, Attendance %.
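
To make the date-filtering and attendance bullets concrete, here is a small sketch of slicing the `YYYY-MM-DD` part from an ISO timestamp and merging GitLab and Corpus activity into one set of active days. The function names and the `created_at` field on Corpus files are assumptions; the real `_get_contribution_index()` may compute this differently.

```python
from typing import Iterable


def iso_date_part(timestamp: str) -> str:
    """Return the leading YYYY-MM-DD part of an ISO 8601 timestamp."""
    return timestamp[:10]


def active_days(
    gitlab_event_times: Iterable[str],
    corpus_files: Iterable[dict] = (),
) -> set[str]:
    """Union of days with GitLab events and days with Corpus uploads."""
    days = {iso_date_part(ts) for ts in gitlab_event_times}
    # Assumption: each Corpus file dict carries an ISO `created_at` timestamp.
    days |= {
        iso_date_part(f["created_at"])
        for f in corpus_files
        if f.get("created_at")
    }
    return days
```

With this shape, a member who only uploaded Corpus media on a given day still gets that day counted as active.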
---
## Type of Change
- [x] ✨ New feature (non-breaking change that adds functionality)
- [x] ♻️ Refactor (no functional changes — attendance pre-calc moved to data assembly)
- [x] 🧪 Test update
---
## Related Issues / References
- Corpus contribution visibility gap in team leaderboard
- Per-member export for offline HR/management review
---
## Screenshots or Screen Recordings
> _(Attach screenshots of the Corpus login sidebar, the Corpus file panels in member expanders, and the new Export section with two download buttons)_
---
## How to Validate Locally
```bash
# 1. Start the app
uv run streamlit run app.py
# 2. In the sidebar, enter Corpus credentials and click "Login to Corpus"
# 3. Navigate to the Team Leaderboard, select a team, and run analytics
# 4. Expand any member card and verify:
#    - Corpus audio/image/video/file buckets appear
#    - "Login to Corpus" caption shows when not authenticated
# 5. Click "Download Individual Metrics (Excel)" and verify the file contains one sheet per team
#    with per-member breakdown columns
```
---
## Testing Done
- Unit tests added/updated

**Test Cases Covered:**

| Scenario | Expected Result | Status |
|---|---|---|
| Corpus credentials NOT provided | Warning banner shown; member expander shows "Login to Corpus" caption instead of file buckets | |
| Corpus credentials provided, member has files | Audio/Image/Video/File buckets rendered with correct counts inside member expander | |
| Member has no Corpus files in date range | "No Corpus files found" caption displayed | |
| Download Individual Metrics button clicked | Excel file downloaded with correct per-member columns (MRs, Issues, Commits, Corpus buckets, Attendance %, Consistency %) | |
| `test_render_team_result` unit test | Passes with 6-column mock (updated from 5) | |

**Test Commands Run:**

- `pytest tests/test_leaderboard_ui_extended.py -v`
- `pytest --tb=short`
---
## Code Quality Checklist
### Code Standards
- Code follows project conventions (naming, structure, formatting)
- No debug statements or commented-out code left
- No unused imports, variables, or functions
- No duplicate code (DRY principle followed)
- Type hints are properly defined
### Python & Streamlit Best Practices
- Functions follow single-responsibility principle
- Session state used correctly for auth persistence
- Error handling is comprehensive (`try`/`except` on Excel export)
### Security
- No sensitive data logged (Corpus token stored only in session state, never printed)
- Authentication tokens handled securely
---
## Known Limitations / Technical Debt
- Corpus fetching is currently synchronous within the `st.spinner` block; it could be made async for large teams.
- `corpus_username` must be manually entered in the roster admin panel; there is no auto-discovery from the GitLab username.
---
## Additional Notes
The `_build_individual_metrics_excel_export` function reuses the pre-computed attendance/consistency metrics already stored in `mrow`, so the export does not recompute them.
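
As a rough illustration of why the export is cheap, the sketch below only reshapes already-computed member rows into one table per team, using the column list given under Technical Details. The `team_data` layout (`name`, `members`) and the idea that each `mrow` is keyed by column title are assumptions, and writing the tables to a workbook is left to whichever Excel library the project uses.

```python
COLUMNS = [
    "Name", "Username", "MRs", "Issues", "Commits",
    "Audio Files", "Image Files", "Video Files", "Files",
    "Active Days", "Working Days", "Consistency %", "Attendance %",
]


def build_individual_metric_sheets(team_data: list[dict]) -> dict[str, list[list]]:
    """Map each team to a header row plus one row per member (hypothetical)."""
    sheets: dict[str, list[list]] = {}
    for team in team_data:
        rows: list[list] = [list(COLUMNS)]
        for mrow in team.get("members", []):
            # Values are read as-is; nothing is recomputed here.
            rows.append([mrow.get(col, "") for col in COLUMNS])
        # Excel limits worksheet names to 31 characters.
        sheets[team["name"][:31]] = rows
    return sheets
```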

---
## MR Acceptance Checklist
### Quality & Correctness
- Code works as intended and solves the stated problem
- No bugs introduced (existing functionality not broken)
- Edge cases handled appropriately (missing Corpus token, no files, parse errors on joining date)
### Maintainability
- Code is readable and well-organised
- Code is testable and well-tested
- Follows project patterns and conventions
### Acceptance Review
- Reviewed by at least 1 teammate
- Reviewed by product owner