Improve Model Loading UX and Add Precision-Aware Punctuation Model Loading with Documentation Updates

srilatha bandari requested to merge feat/load into develop Apr 28, 2026

Summary

This MR improves the application’s model loading experience and enhances the punctuation pipeline by introducing precision-aware model loading.

For model loading, the application now shows the loading screen only during first-time model initialization. Once the ASR and punctuation models are downloaded and cached, subsequent visits skip the loading screen and render the main UI directly, which improves perceived performance.

For punctuation, the system now supports dynamic precision modes (FP16/INT8) and switches to a new model repository. It also optimizes model loading by preventing redundant reloads and safely handling concurrent requests.

Changes Made

Model Load Persistence

Added persistent flags (modelLoaded, punctModelLoaded) using localStorage
Flags are set after successful model initialization
Ensures the loaded state is retained across page refreshes and return visits
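
A minimal sketch of how such flags could be written and read. The key names come from this MR; the stored value `"true"`, the helper names, and the in-memory stub are assumptions for illustration and may not match the actual implementation.

```typescript
// Minimal subset of the Web Storage API, so the helpers also work with a stub
// outside the browser (in the app, `localStorage` would be passed in).
interface FlagStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Key names are taken from the MR description; the value "true" is an assumption.
const MODEL_FLAGS = ["modelLoaded", "punctModelLoaded"] as const;
type ModelFlag = (typeof MODEL_FLAGS)[number];

// Call after a model has initialized successfully.
function markModelLoaded(storage: FlagStorage, flag: ModelFlag): void {
  try {
    storage.setItem(flag, "true");
  } catch {
    // Storage writes can throw (quota, restricted modes). Losing the flag
    // only means the loading screen is shown again on the next visit.
  }
}

// True only when every model has been marked as loaded.
function allModelsLoaded(storage: FlagStorage): boolean {
  try {
    return MODEL_FLAGS.every((flag) => storage.getItem(flag) === "true");
  } catch {
    return false;
  }
}

// In-memory stand-in for localStorage (hypothetical helper for tests).
function createMemoryStorage(): FlagStorage {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In the browser, the same helpers would be called as `markModelLoaded(localStorage, "modelLoaded")`.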

Conditional UI Rendering

Added startup check for model load status
Updated rendering flow:
- Show loading screen only on first load
- Skip loading screen if models are already cached
Improves perceived performance for returning users
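
The startup check could look roughly like the sketch below. `chooseStartupScreen` and the `StartupScreen` type are hypothetical names, and the flag value `"true"` is an assumption; only the two flag keys come from this MR.

```typescript
type StartupScreen = "loading" | "main";

// Flag keys from the MR description.
const REQUIRED_FLAGS = ["modelLoaded", "punctModelLoaded"];

// Decide which screen to render first. The flag reader is injected so the
// decision can be exercised without a browser; in the app it would be
// (key) => localStorage.getItem(key).
function chooseStartupScreen(readFlag: (key: string) => string | null): StartupScreen {
  const modelsCached = REQUIRED_FLAGS.every((key) => readFlag(key) === "true");
  return modelsCached ? "main" : "loading";
}
```

One design point to keep in mind: the flags live in localStorage while the model files live in the browser cache, so a flag can outlive the cached files. In that case the main UI would render immediately and the models would need to be fetched again in the background.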

Background Initialization

Enabled non-blocking/background model initialization
Prevented unnecessary re-triggering of loading logic
Kept the UI responsive while models initialize

Precision-Aware Punctuation Loading

Switched model source to therajasekhar/punctuate-indic-v1-ONNX
Added support for precision-based loading:
- FP16 → when highPrecision = true and WebGPU is available
- INT8 → default (faster and lightweight)
Enables flexible trade-off between performance and accuracy
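
The selection rule above can be sketched as follows. The function names and the `Precision` type are illustrative assumptions; the repository id and the FP16/INT8 rule are taken from this MR.

```typescript
type Precision = "fp16" | "int8";

// Repository id as stated in the MR description.
const PUNCTUATION_MODEL_ID = "therajasekhar/punctuate-indic-v1-ONNX";

// FP16 only when the caller asks for high precision AND WebGPU is usable;
// INT8 in every other case.
function selectPrecision(highPrecision: boolean, webgpuAvailable: boolean): Precision {
  return highPrecision && webgpuAvailable ? "fp16" : "int8";
}

// Feature check for WebGPU. `navigator.gpu` being present does not guarantee
// that an adapter can be acquired, so this is a necessary condition only.
function isWebGpuAvailable(): boolean {
  const nav = (globalThis as unknown as { navigator?: object }).navigator;
  return nav !== undefined && nav !== null && "gpu" in nav;
}
```

A caller would combine the two as `selectPrecision(highPrecision, isWebGpuAvailable())`, so a high-precision request on a browser without WebGPU falls back to INT8.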

Worker & Service Enhancements

`src/ai/punctuateWorker.ts`

Added highPrecision support in worker messages
loadPunctuationModel() now:
- Accepts options
- Dynamically selects model format (FP16/INT8)
Tracks loaded model via loadedModelFile
Clears and reloads pipeline only when precision changes
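
A rough sketch of the reload rule in the worker, under stated assumptions: the file names in `MODEL_FILES` are hypothetical placeholders, pipeline creation is injected as `createPipeline` rather than calling any real library, and the WebGPU check is left out for brevity. Only `loadPunctuationModel`, `highPrecision`, and `loadedModelFile` are names from this MR.

```typescript
type Precision = "fp16" | "int8";

interface LoadOptions {
  highPrecision?: boolean;
}

// Hypothetical file names, used only for illustration.
const MODEL_FILES: Record<Precision, string> = {
  fp16: "model_fp16.onnx",
  int8: "model_int8.onnx",
};

// `createPipeline` stands in for whatever actually builds the punctuation pipeline.
function createPunctuationLoader<P>(createPipeline: (modelFile: string) => Promise<P>) {
  let pipeline: P | null = null;
  let loadedModelFile: string | null = null;

  return async function loadPunctuationModel(options: LoadOptions = {}): Promise<P> {
    const precision: Precision = options.highPrecision ? "fp16" : "int8";
    const modelFile = MODEL_FILES[precision];

    // Same precision already loaded: reuse the existing pipeline.
    if (pipeline !== null && loadedModelFile === modelFile) {
      return pipeline;
    }

    // Precision changed (or nothing loaded yet): clear, then reload.
    pipeline = null;
    loadedModelFile = null;
    const next = await createPipeline(modelFile);
    pipeline = next;
    loadedModelFile = modelFile;
    return next;
  };
}
```

This sketch does not deduplicate overlapping calls by itself; in this MR that responsibility sits in the service layer.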

`src/services/punctuateWorkerService.ts`

Added state tracking:
- loadedHighPrecision
- pendingHighPrecision
Optimized loading:
- Skips reload if same precision is already loaded
- Reuses in-flight load requests
Ensures punctuate(text, highPrecision) always runs against the model that matches the requested precision
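
The skip-and-reuse behavior could be sketched like this. `createPunctuationModelGate`, `ensureLoaded`, and `requestLoad` are hypothetical names; `loadedHighPrecision` and `pendingHighPrecision` are the state fields named in this MR. Queuing a different-precision load behind the in-flight one is an assumption of this sketch, not a confirmed detail of the implementation.

```typescript
// `requestLoad` stands in for the message round-trip that asks the worker to load a model.
function createPunctuationModelGate(requestLoad: (highPrecision: boolean) => Promise<void>) {
  let loadedHighPrecision: boolean | null = null;
  let pendingHighPrecision: boolean | null = null;
  let pendingLoad: Promise<void> | null = null;

  return function ensureLoaded(highPrecision: boolean): Promise<void> {
    // Reuse the in-flight request when it targets the same precision.
    if (pendingLoad !== null && pendingHighPrecision === highPrecision) {
      return pendingLoad;
    }
    // Skip the reload when that precision is already loaded and nothing is pending.
    if (pendingLoad === null && loadedHighPrecision === highPrecision) {
      return Promise.resolve();
    }
    // Otherwise start a new load, queued behind any in-flight load so loads never overlap.
    const previous: Promise<void> = pendingLoad ?? Promise.resolve();
    const load: Promise<void> = previous
      .catch(() => undefined)
      .then(() => requestLoad(highPrecision))
      .then(
        () => {
          loadedHighPrecision = highPrecision;
        },
        (error: unknown) => {
          loadedHighPrecision = null; // a failed load leaves no usable model
          throw error;
        },
      );
    pendingLoad = load;
    pendingHighPrecision = highPrecision;

    // Clear the pending state once this load settles, unless a newer load replaced it.
    const clearPending = (): void => {
      if (pendingLoad === load) {
        pendingLoad = null;
        pendingHighPrecision = null;
      }
    };
    load.then(clearPending, clearPending);
    return load;
  };
}
```

With a gate like this, `punctuate(text, highPrecision)` would await `ensureLoaded(highPrecision)` before sending the text to the worker, so concurrent calls with the same precision share a single load.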

Code Improvements

Simplified model loading flow
Reduced redundant UI states and reloads
Improved separation of concerns (UI vs model logic)
Added safeguards for concurrent requests

Documentation Update

Updated README to:
- Reflect new model source
- Document precision-based loading behavior
- Remove outdated or irrelevant content
Keeps the README aligned with the implementation

Net Effect

Faster app startup after first load (no repeated loading screen)
Improved performance with INT8 as default
Higher accuracy option via FP16 when enabled
No redundant model reloads
Stable handling of concurrent requests
Cleaner and more maintainable architecture

Testing Done

Verified first-time load shows loading screen
Verified subsequent refresh skips loading screen
Confirmed models are not re-downloaded (Network tab)
Tested precision switching (FP16 ↔ INT8)
Verified no duplicate loads during concurrent requests
Tested after clearing browser cache
Ensured no UI flicker or regressions

Checklist

Model load persistence implemented
Loading screen shown only on first load
Main UI rendered directly on subsequent visits
Precision-based model loading implemented
No redundant model reloads
Concurrent request handling optimized
README updated
Tested locally
No known regressions
Ready for review

Closes #16 (closed) #17 (closed)

Edited Apr 28, 2026 by srilatha bandari
