Skip to content

fix: removed manual cache entirely on queries

Bhaskar Battula requested to merge fix/cache_in_analytics_service into develop

Merge Request

Overview

This MR refactors the analytics service layer to eliminate manual in-memory caching and resolve N+1 query issues. The previous implementation relied on a global cache to reduce repeated database queries, which introduced risks such as stale data, memory growth, and lack of thread safety.

This change replaces the caching mechanism with optimized SQLAlchemy query patterns using eager loading and batch queries, ensuring consistent, performant, and maintainable data access.


What does this MR do and why?

The previous implementation used a global _cache dictionary to store medicines and doctors. While this reduced query count, it introduced several architectural issues:

  • Stale data due to lack of proper invalidation
  • Memory growth without eviction policy
  • Not thread-safe in concurrent environments
  • Masked underlying N+1 query problems instead of fixing them

Approach Taken

  • Removed global cache (_cache, _get_*, _preload_*)
  • Replaced with:
    • SQLAlchemy eager loading (selectinload) where applicable
    • Batch queries using .in_() for medicines and doctors
  • Refactored builder functions to use ORM objects directly instead of cache lookups

Trade-offs

  • Slight increase in query complexity
  • But significantly improved:
    • Data correctness
    • Maintainability
    • Predictability of performance

Changes Made

  • Removed:

    • _cache
    • _clear_cache
    • _get_medicine, _get_doctor
    • _preload_medicines, _preload_doctors
  • Updated:

    • _get_camp_consultations_with_prescriptions to use optimized fetching
    • Analytics functions to avoid cache usage
    • Builder functions to use ORM objects / maps
  • Improved:

    • Query efficiency (eliminated N+1 queries)
    • Type consistency (UUID handling, string IDs)

Technical Details

Root Cause

The original implementation suffered from N+1 query problems, where each doctor/medicine lookup triggered additional queries. This was mitigated using a manual cache instead of addressing the query design.

Fix

  • Used bulk queries:
    db.query(Medicine).filter(Medicine.medicine_id.in_(ids))
    

Used ORM relationships and eager loading:

selectinload(PatientConsultation.prescriptions) selectinload(PatientConsultation.doctor)

Ensured all IDs are handled consistently (UUID vs string mismatch fixed)

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 📝 Documentation update
  • ️ Refactor (no functional changes)
  • Performance improvement
  • 🧪 Test update
  • 🔧 Configuration change
  • 🚨 Security fix
  • 🗑️ Deprecation (removing deprecated code)

Related Issues / References

How to Validate Locally

  1. Start the application
uvicorn app.main:app --reload

2. Test the analytics endpoints

Use Postman / curl / browser:

- /analytics/camp
- /analytics/medicines
- /analytics/doctors
- /analytics/doctor-specific

Verify:

- No errors (no 500 responses)
- Data is returned correctly
- Fields are populated as expected

3. Enable SQL query logging

Update your DB engine:

engine = create_engine(DB_URL, echo=True)

Restart the server.

4. Verify query behavior (IMPORTANT)
❌ Incorrect (N+1 problem)

You should NOT see repeated queries like:

SELECT * FROM doctor WHERE id = ...
SELECT * FROM doctor WHERE id = ...
SELECT * FROM doctor WHERE id = ...

✅ Correct behavior

You SHOULD see batched queries like:

SELECT * FROM patient_consultation ...
SELECT * FROM prescription WHERE consultation_id IN (...)
SELECT * FROM medicine WHERE medicine_id IN (...)


## Code Quality Checklist


### Code Standards

- [ ] Code follows project conventions (naming, structure, formatting)
- [ ] No debug statements or commented-out code left (unless necessary and intended)
- [ ] No unused imports, variables, or functions
- [ ] No duplicate code (DRY principle followed)
- [ ] Type hints are properly defined (no `Any` unless justified and no mypy type check errors)
- [ ] Ruff checks pass:
  ```bash
  ruff check .
  ruff format . --check

Python & FastAPI Best Practices

  • Functions follow single-responsibility principle
  • Async/await used correctly (no blocking calls in async functions)
  • Dependency injection used appropriately
  • Pydantic models used for request/response validation
  • SQLAlchemy queries are optimized (no N+1 queries)
  • Error handling is comprehensive (try/except with proper logging)

API Design

  • RESTful conventions followed
  • Proper HTTP status codes returned
  • Input validation implemented
  • Authentication/authorization enforced
  • Role Base access control used for user restriction
  • API documentation (docstrings) updated

Database & Migrations

  • Database migrations created (if schema changed)
  • Database migrations version is pointing to the latest version (and version name follows project conventions)
  • Migrations are reversible (migrations contain downgrade scripts)
  • Indexes added for frequently queried fields
  • No raw SQL queries (using SQLAlchemy ORM)
  • Data integrity constraints maintained

Security

  • No sensitive data logged (passwords, tokens, PII)
  • SQL injection prevention verified (ORM used)
  • Input sanitization implemented
  • Authentication tokens handled securely
  • CORS settings appropriate
  • Security scan passes:
    bandit -r app/

Error Handling

  • Errors are caught and handled gracefully
  • User-friendly error messages returned
  • Errors are logged appropriately
  • HTTP error responses follow API standards

Documentation

  • README.md updated (if setup steps changed)
  • .env.example updated (if new env vars added)
  • API documentation updated (docstrings, OpenAPI specs)
  • CHANGELOG.md will be updated (if applicable)
  • Code comments explain complex logic (not what, but why)

Known Limitations / Technical Debt

Additional Notes


MR Acceptance Checklist

Quality & Correctness

  • Code works as intended and solves the stated problem
  • No bugs introduced (existing functionality not broken)
  • Edge cases handled appropriately

Maintainability

  • Code is readable and well-organized
  • Code is testable and well-tested
  • Follows project patterns and conventions

Acceptance Review

  • Reviewed by at least 1 teammate
  • Reviewed by product owner

Merge request reports

Loading