Internationalization and localization implemented
🌍 Internationalization (i18n) Implementation Guide
This PR adds comprehensive internationalization (i18n) support to the Indic Corpus Server, enabling the application to serve content in multiple languages.
🧩 Description
The initial implementation supports English (en) and Hindi (hi) with an extensible architecture.
🧱 Key Components
1. Core Files
Added:
- app/core/i18n_config.py - Central configuration for i18n
  - Implements translation function _()
  - Handles locale detection from requests
  - Provides context manager for temporary locale changes
  - Includes i18n middleware for FastAPI
2. Locales Directory
- locales/hi/LC_MESSAGES/messages.po → Hindi translations
- locales/hi/LC_MESSAGES/messages.mo → Compiled Hindi translations
3. Babel Configuration
- babel.cfg - Configuration for Babel string extraction
  - Defines which files to scan for translations
  - Specifies translation function names
4. Translation Template
- messages.pot - Template file for translations
  - Contains all translatable strings
5. Tests
- tests/unit/core/test_actual_i18n.py - Comprehensive test suite for i18n functionality
6. Modified Files
- app/main.py - Integrated i18n middleware
  - Added Babel initialization
  - Updated request handling for i18n
- pyproject.toml - Dependencies now include fastapi-babel
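The i18n middleware listed under Core Files is not reproduced in this description. As a rough illustration of the general idea, the sketch below is a plain ASGI middleware that sets a per-request locale and restores the previous value afterwards. The names LocaleMiddleware, current_locale and demo_app are hypothetical, only the lang query parameter is handled for brevity, and the real implementation built on fastapi-babel may well be structured differently.

```python
import asyncio
from contextvars import ContextVar
from urllib.parse import parse_qs

SUPPORTED_LOCALES = ["en", "hi"]
DEFAULT_LOCALE = "en"

# Hypothetical per-request storage for the active locale.
current_locale = ContextVar("current_locale", default=DEFAULT_LOCALE)


class LocaleMiddleware:
    """Plain ASGI middleware that sets the locale for one request."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Only HTTP requests carry a query string; pass everything else through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        params = parse_qs(scope.get("query_string", b"").decode("latin-1"))
        requested = params.get("lang", [DEFAULT_LOCALE])[0]
        locale = requested if requested in SUPPORTED_LOCALES else DEFAULT_LOCALE

        token = current_locale.set(locale)
        try:
            await self.app(scope, receive, send)
        finally:
            # Restore the previous locale even if the app raised.
            current_locale.reset(token)


async def demo_app(scope, receive, send):
    # Stand-in for the real application: report the active locale.
    print("locale during request:", current_locale.get())


if __name__ == "__main__":
    app = LocaleMiddleware(demo_app)
    asyncio.run(app({"type": "http", "query_string": b"lang=hi"}, None, None))
```

Because the class follows the ASGI callable convention, a middleware of this shape could in principle be registered with FastAPI's add_middleware, though that wiring is not shown here.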
⚙️ How It Works
Translation Flow
Request Handling
The middleware intercepts each request and determines the locale in the following order of priority:
- URL query parameter → ?lang=hi
- Accept-Language header
- Default locale (en)
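The priority order above can be sketched as a small standalone function. This is an illustration only, not the code from app/core/i18n_config.py: the names resolve_locale and parse_accept_language are hypothetical, and the sketch assumes that an unsupported ?lang value falls through to the Accept-Language header before reaching the default.

```python
from typing import List, Optional

SUPPORTED_LOCALES = ["en", "hi"]
DEFAULT_LOCALE = "en"


def parse_accept_language(header: str) -> List[str]:
    """Return language tags from an Accept-Language header, best quality first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag without an explicit q-value has quality 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def resolve_locale(query_lang: Optional[str], accept_language: Optional[str]) -> str:
    """Pick a locale: query parameter, then Accept-Language, then the default."""
    if query_lang and query_lang.lower() in SUPPORTED_LOCALES:
        return query_lang.lower()
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]  # "hi-IN" -> "hi"
        if primary in SUPPORTED_LOCALES:
            return primary
    return DEFAULT_LOCALE


if __name__ == "__main__":
    print(resolve_locale("hi", None))
    print(resolve_locale(None, "hi-IN,hi;q=0.9,en;q=0.8"))
    print(resolve_locale("fr", None))
```

Matching on the primary subtag means a browser sending hi-IN is still served the hi catalog; whether the actual middleware does this is an assumption.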
String Translation
- Use _('text to translate') to mark translatable strings.
- Translations are looked up based on the current locale.
- Falls back to the original string if no translation is found.
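The fallback behaviour can be illustrated with Python's standard gettext module, which is swapped in here for the project's own _() helper (that helper lives in app/core/i18n_config.py and is not shown in this description). The translate function and the LOCALE_DIR and DOMAIN constants are hypothetical names; the domain "messages" is assumed from the messages.mo file name.

```python
import gettext

LOCALE_DIR = "locales"  # directory containing <lang>/LC_MESSAGES/messages.mo
DOMAIN = "messages"     # assumed to match the messages.mo file name


def translate(text: str, locale: str) -> str:
    """Look up text in the catalog for locale, returning text itself if absent."""
    # fallback=True yields a NullTranslations object when no compiled
    # catalog exists for the locale, so lookups return the original string.
    catalog = gettext.translation(
        DOMAIN, localedir=LOCALE_DIR, languages=[locale], fallback=True
    )
    return catalog.gettext(text)


if __name__ == "__main__":
    # With a compiled Hindi catalog present this would print the Hindi string;
    # without one, or for an unknown msgid, the English text is printed.
    print(translate("Welcome to Indic Corpus Server", "hi"))
```

The same rule applies at both levels: a missing catalog and a missing msgid each result in the untranslated string being returned rather than an error.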
🌐 Adding a New Language
1. Initialize New Language
# Extract translatable strings
pybabel extract -F babel.cfg -o messages.pot .
# Initialize new language (e.g., for Marathi 'mr')
pybabel init -i messages.pot -d locales -l mr
# Add translations to locales/mr/LC_MESSAGES/messages.po
# Then compile:
pybabel compile -d locales
2. Update Configuration
Add the new language code to SUPPORTED_LOCALES in app/core/i18n_config.py:
SUPPORTED_LOCALES = ["en", "hi", "mr"] # Add new language code
3. Add Translations
Open locales/[lang]/LC_MESSAGES/messages.po and add a translation (msgstr) for each msgid, then compile:
pybabel compile -d locales
Useful Commands
# Extract translatable strings
pybabel extract -F babel.cfg -o messages.pot .
# Initialize new language
pybabel init -i messages.pot -d locales -l [lang_code]
# Update translations after adding new strings
pybabel update -i messages.pot -d locales
# Compile translations
pybabel compile -d locales
# Run tests
pytest tests/unit/core/test_actual_i18n.py -v
🧷 Test Coverage
1. Translation Tests
- Verifies direct translation functionality
- Tests both English and Hindi translations
- Validates fallback for missing translations
2. Locale Detection Tests
- Tests locale detection from query parameters
- Validates Accept-Language header handling
- Ensures proper fallback to default locale
3. Context Manager Tests
- Verifies locale switching within context
- Ensures proper cleanup after context exit
- Tests nested context managers
4. API Integration Tests
- Tests end-to-end translation through API endpoints
- Verifies correct content negotiation
- Validates response headers
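The context-manager behaviour covered by the tests above (switching, cleanup on exit, nesting) can be sketched with contextvars. The names use_locale and current_locale are hypothetical stand-ins for whatever app/core/i18n_config.py actually exposes, and the sketch makes no claim about how the real helper stores its state.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical holder for the active locale; "en" is the default.
current_locale = ContextVar("current_locale", default="en")


@contextmanager
def use_locale(locale):
    """Temporarily switch the active locale, restoring the previous one on exit."""
    token = current_locale.set(locale)
    try:
        yield
    finally:
        # reset() restores the value that was active before set(),
        # which is what makes nesting and error cleanup work.
        current_locale.reset(token)


if __name__ == "__main__":
    with use_locale("hi"):
        print(current_locale.get())
        with use_locale("en"):
            print(current_locale.get())
        print(current_locale.get())
    print(current_locale.get())
```

Using a token-based reset rather than writing back a hard-coded default keeps nested blocks correct, since each exit restores exactly the value its own entry replaced.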
⚡ Quick Testing with cURL
Test Default Language (English)
curl -X 'GET' 'http://localhost:8000/'
Expected: Returns content in English (default)
Test Hindi Translation
curl -X 'GET' 'http://localhost:8000/?lang=hi'
Expected: Returns content in Hindi (if available)
Test Fallback to English (Unsupported Language)
curl -X 'GET' 'http://localhost:8000/?lang=fr'
Expected: Falls back to English (default) as French is not supported
Test with Accept-Language Header
curl -H "Accept-Language: hi" -X 'GET' 'http://localhost:8000/'
Expected: Returns content in Hindi based on the Accept-Language header
🧾 Example Responses
Hindi
{
"message": "भारतीय कॉर्पस सर्वर में आपका स्वागत है",
"language": "hi"
}
English (Fallback)
{
"message": "Welcome to Indic Corpus Server",
"language": "en"
}
✅ Checklist
- The feature has been fully implemented.
- Tests for the new feature are included and passing.
- User documentation/guides have been updated (if applicable).
- Impact on existing functionality has been considered.