Internationalization and localization implemented

🌍 Internationalization (i18n) Implementation Guide

This PR adds comprehensive internationalization (i18n) support to the Indic Corpus Server, enabling the application to serve content in multiple languages.

🧩 Description

The initial implementation supports English (en) and Hindi (hi) with an extensible architecture.


🧱 Key Components

1. Core Files

Added:

  • app/core/i18n_config.py
    • Central configuration for i18n
    • Implements translation function _()
    • Handles locale detection from requests
    • Provides context manager for temporary locale changes
    • Includes i18n middleware for FastAPI

2. Locales Directory

  • locales/hi/LC_MESSAGES/messages.po → Hindi translations
  • locales/hi/LC_MESSAGES/messages.mo → Compiled Hindi translations

3. Babel Configuration

  • babel.cfg
    • Configuration for Babel string extraction
    • Defines which files to scan for translations
    • Specifies translation function names

4. Translation Template

  • messages.pot
    • Template file for translations
    • Contains all translatable strings

5. Tests

  • tests/unit/core/test_actual_i18n.py
    • Comprehensive test suite for i18n functionality

6. Modified Files

  • app/main.py
    • Integrated i18n middleware
    • Added Babel initialization
    • Updated request handling for i18n
  • pyproject.toml
    • Dependencies now include fastapi-babel

How It Works

Translation Flow

Request Handling

The middleware intercepts each request and determines the locale, checking the following sources in order of priority:

  1. URL query parameter → ?lang=hi
  2. Accept-Language header
  3. Default locale (en)
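The priority order above can be sketched roughly as follows. This is an illustration only, not the code in app/core/i18n_config.py: the names parse_accept_language and detect_locale are hypothetical, headers are modelled as a plain dict with lowercase keys, and the choice to fall through to the Accept-Language header when ?lang= names an unsupported language is an assumption.

```python
from typing import List, Mapping

SUPPORTED_LOCALES = ["en", "hi"]
DEFAULT_LOCALE = "en"


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language value, highest q first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag.strip().lower()))
    # Sort by descending quality, keeping header order for ties.
    return [tag for _, _, tag in sorted(weighted)]


def detect_locale(query_params: Mapping[str, str], headers: Mapping[str, str]) -> str:
    """Hypothetical helper: query parameter, then header, then default."""
    # 1. URL query parameter, e.g. ?lang=hi
    lang = query_params.get("lang", "").lower()
    if lang in SUPPORTED_LOCALES:
        return lang
    # 2. Accept-Language header; "hi-IN" is matched on its primary subtag "hi"
    for tag in parse_accept_language(headers.get("accept-language", "")):
        primary = tag.split("-")[0]
        if primary in SUPPORTED_LOCALES:
            return primary
    # 3. Default locale
    return DEFAULT_LOCALE
```

For example, detect_locale({"lang": "fr"}, {}) returns "en" because French is not in SUPPORTED_LOCALES and no header is present.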

String Translation

  • Use _('text to translate') to mark translatable strings.
  • Translations are looked up based on the current locale.
  • If no translation is found, _() falls back to the original string.
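The fallback behaviour can be illustrated with Python's standard gettext module, which reads the same compiled .mo catalogs that pybabel produces. This is a sketch and not the PR's fastapi-babel based implementation; the function name translate and its parameters are hypothetical.

```python
import gettext


def translate(message: str, locale: str,
              localedir: str = "locales", domain: str = "messages") -> str:
    """Look up `message` for `locale`; return it unchanged if nothing matches."""
    # fallback=True makes gettext return a NullTranslations object when no
    # compiled catalog exists for the locale, so the lookup never raises.
    catalog = gettext.translation(
        domain, localedir=localedir, languages=[locale], fallback=True
    )
    # For a msgid missing from the catalog, gettext() returns the msgid itself.
    return catalog.gettext(message)
```

With no compiled catalog for the requested locale, translate("Welcome to Indic Corpus Server", "mr") returns the English string unchanged.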

🌐 Adding a New Language

1. Initialize New Language

# Extract translatable strings
pybabel extract -F babel.cfg -o messages.pot .

# Initialize new language (e.g., for Marathi 'mr')
pybabel init -i messages.pot -d locales -l mr

# Add translations to locales/mr/LC_MESSAGES/messages.po
# Then compile:
pybabel compile -d locales

2. Update Configuration

Add the new language code to SUPPORTED_LOCALES in app/core/i18n_config.py:

SUPPORTED_LOCALES = ["en", "hi", "mr"]  # Add new language code

3. Add Translations

Open locales/[lang]/LC_MESSAGES/messages.po, fill in the msgstr for each msgid, then compile:

pybabel compile -d locales
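After compiling, a quick check can confirm that every code in SUPPORTED_LOCALES has a compiled catalog. The helper below is a hypothetical sketch, not part of this PR; it assumes the locales/<code>/LC_MESSAGES/messages.mo layout shown above and that the default locale (en) is the source language and needs no catalog.

```python
from pathlib import Path
from typing import Iterable, List


def missing_catalogs(locales_dir: str, supported: Iterable[str],
                     default: str = "en", domain: str = "messages") -> List[str]:
    """Return supported locale codes that have no compiled .mo file."""
    root = Path(locales_dir)
    return [
        code
        for code in supported
        # Assumption: the default locale is the source language, so it is skipped.
        if code != default
        and not (root / code / "LC_MESSAGES" / f"{domain}.mo").is_file()
    ]
```

Calling missing_catalogs("locales", ["en", "hi", "mr"]) before the Marathi catalog has been compiled would list "mr", which is a hint that pybabel compile still needs to run.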

Useful Commands

# Extract translatable strings
pybabel extract -F babel.cfg -o messages.pot .

# Initialize new language
pybabel init -i messages.pot -d locales -l [lang_code]

# Update translations after adding new strings
pybabel update -i messages.pot -d locales

# Compile translations
pybabel compile -d locales

# Run tests
pytest tests/unit/core/test_actual_i18n.py -v

🧷 Test Coverage

1. Translation Tests

  • Verifies direct translation functionality
  • Tests both English and Hindi translations
  • Validates fallback for missing translations

2. Locale Detection Tests

  • Tests locale detection from query parameters
  • Validates Accept-Language header handling
  • Ensures proper fallback to default locale

3. Context Manager Tests

  • Verifies locale switching within context
  • Ensures proper cleanup after context exit
  • Tests nested context managers
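The behaviour these context manager tests check (switching, cleanup, nesting) can be sketched with contextvars. The names use_locale and get_locale are hypothetical stand-ins; the actual context manager in app/core/i18n_config.py may be named and implemented differently.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator

# Per-context storage, so concurrent requests do not see each other's locale.
_current_locale: ContextVar[str] = ContextVar("current_locale", default="en")


def get_locale() -> str:
    """Return the locale active in the current context."""
    return _current_locale.get()


@contextmanager
def use_locale(locale: str) -> Iterator[None]:
    """Temporarily switch the locale; restore the previous one on exit."""
    token = _current_locale.set(locale)
    try:
        yield
    finally:
        # reset() restores whatever was active before, which is what makes
        # nesting work and guarantees cleanup even if the body raises.
        _current_locale.reset(token)
```

Nesting use_locale("hi") and use_locale("mr") restores "hi" when the inner block exits and the default "en" when the outer block exits.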

4. API Integration Tests

  • Tests end-to-end translation through API endpoints
  • Verifies correct content negotiation
  • Validates response headers

Quick Testing with cURL

Test Default Language (English)

curl -X 'GET' 'http://localhost:8000/'

Expected: Returns content in English (default)

Test Hindi Translation

curl -X 'GET' 'http://localhost:8000/?lang=hi'

Expected: Returns content in Hindi (if available)

Test Fallback to English (Unsupported Language)

curl -X 'GET' 'http://localhost:8000/?lang=fr'

Expected: Falls back to English (default) as French is not supported

Test with Accept-Language Header

curl -H "Accept-Language: hi" -X 'GET' 'http://localhost:8000/'

Expected: Returns content in Hindi based on the Accept-Language header


🧾 Example Responses

Hindi

{
  "message": "भारतीय कॉर्पस सर्वर में आपका स्वागत है",
  "language": "hi"
}

English (Fallback)

{
  "message": "Welcome to Indic Corpus Server",
  "language": "en"
}

Checklist

  • The feature has been fully implemented.
  • Tests for the new feature are included and passing.
  • User documentation/guides have been updated (if applicable).
  • Impact on existing functionality has been considered.

🔗 Related Issue(s)

#52
