
Improve Model Loading UX and Add Precision-Aware Punctuation Model Loading with Documentation Updates

srilatha bandari requested to merge feat/load into develop

Summary

This MR improves the application’s model loading experience and enhances the punctuation pipeline by introducing precision-aware model loading.

For model loading, the application now shows the loading screen only during first-time model initialization. Once the ASR and punctuation models are downloaded and cached, subsequent visits skip the loading screen and render the main UI directly, which improves perceived performance.

For punctuation, the system now supports dynamic precision modes (FP16/INT8) and switches to a new model repository. It also optimizes model loading by preventing redundant reloads and safely handling concurrent requests.


Changes Made

Model Load Persistence

  • Added persistent flags (modelLoaded, punctModelLoaded) using localStorage
  • Flags are set after successful model initialization
  • Keeps the loaded state across page refreshes and return visits
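
A minimal sketch of how such flags could be written and read. The flag names come from this MR; the stored value "true", the FlagStore interface, and the helper names are assumptions made for illustration, not the code in this branch. In the browser, window.localStorage satisfies FlagStore.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface FlagStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Flag names are taken from the MR description; the value "true" is an assumption.
const ASR_FLAG = "modelLoaded";
const PUNCT_FLAG = "punctModelLoaded";

// Runs an initializer and sets the flag only if it completes without throwing.
async function initAndMark(
  store: FlagStore,
  flag: string,
  init: () => Promise<void>,
): Promise<void> {
  await init(); // a rejection propagates and leaves the flag unset
  store.setItem(flag, "true");
}

// True only when both the ASR and punctuation flags have been set.
function allModelsLoaded(store: FlagStore): boolean {
  return store.getItem(ASR_FLAG) === "true" && store.getItem(PUNCT_FLAG) === "true";
}

// In-memory store for tests and non-browser environments (hypothetical helper).
function createMemoryStore(): FlagStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

Setting the flag after the awaited initializer means a failed download never marks the model as loaded.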

Conditional UI Rendering

  • Added startup check for model load status

  • Updated rendering flow:

    • Show loading screen only on first load
    • Skip loading screen if models are already cached
  • Improves perceived performance for returning users
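
The startup check could look roughly like the sketch below. The function name, the StartupScreen type, and the injected read accessor are hypothetical; only the two flag names come from this MR.

```typescript
type StartupScreen = "loading" | "main";

// `read` is a hypothetical accessor, e.g. (key) => localStorage.getItem(key).
function chooseStartupScreen(read: (key: string) => string | null): StartupScreen {
  const asrReady = read("modelLoaded") === "true";
  const punctReady = read("punctModelLoaded") === "true";
  // Show the loading screen unless both models were initialized on an earlier visit.
  return asrReady && punctReady ? "main" : "loading";
}
```

Passing the accessor in, rather than reading localStorage directly, keeps the decision testable outside the browser.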


Background Initialization

  • Enabled non-blocking/background model initialization
  • Prevented unnecessary re-triggering of loading logic
  • Kept the UI responsive while models initialize in the background
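
One common way to avoid re-triggering the loading logic is to wrap the initializer so it starts at most once. This is a sketch under that assumption; the helper name once and the retry-on-failure behavior are illustrative and not taken from this branch.

```typescript
// Wraps an async initializer so repeated calls share one in-progress or finished run.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let started: Promise<T> | null = null;
  return () => {
    if (started === null) {
      started = init().catch((err) => {
        started = null; // allow a retry after a failed attempt
        throw err;
      });
    }
    return started;
  };
}
```

A caller can start the work without awaiting it (for example `void ensureModels();`) and render the UI immediately; any later call to the same wrapper reuses the running promise instead of starting a second load.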

Precision-Aware Punctuation Loading

  • Switched model source to therajasekhar/punctuate-indic-v1-ONNX

  • Added support for precision-based loading:

    • FP16 → when highPrecision = true and WebGPU is available
    • INT8 → default (faster and lightweight)
  • Enables flexible trade-off between performance and accuracy
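
The selection rule above can be sketched as a pure function. The Precision type and both function names are hypothetical. The WebGPU check only tests for the presence of navigator.gpu, which does not guarantee that an adapter can actually be obtained.

```typescript
type Precision = "fp16" | "int8";

// FP16 only when the caller asks for high precision AND WebGPU is available;
// INT8 in every other case.
function selectPrecision(highPrecision: boolean, webgpuAvailable: boolean): Precision {
  return highPrecision && webgpuAvailable ? "fp16" : "int8";
}

// Feature check that also works where `navigator` is not defined (e.g. some workers or Node).
function hasWebGPU(): boolean {
  const nav = (globalThis as { navigator?: object }).navigator;
  return nav !== undefined && "gpu" in nav;
}
```

Usage: `selectPrecision(highPrecision, hasWebGPU())` falls back to INT8 when high precision is requested on a browser without WebGPU.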


Worker & Service Enhancements

src/ai/punctuateWorker.ts

  • Added highPrecision support in worker messages

  • loadPunctuationModel() now:

    • Accepts options
    • Dynamically selects model format (FP16/INT8)
  • Tracks loaded model via loadedModelFile

  • Clears and reloads pipeline only when precision changes
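
A simplified sketch of the reload-only-on-change behavior. The loadedModelFile name comes from this MR; the model file names, the injected load function, and createPipelineCache are placeholders for illustration and do not reflect the repository's actual layout or the worker's real signature.

```typescript
type Loader<P> = (modelFile: string) => Promise<P>;

// Placeholder file names, NOT the actual files in the model repository.
function modelFileFor(highPrecision: boolean): string {
  return highPrecision ? "model_fp16.onnx" : "model_int8.onnx";
}

function createPipelineCache<P>(load: Loader<P>): (highPrecision: boolean) => Promise<P> {
  let pipeline: P | null = null;
  let loadedModelFile: string | null = null;

  return async function getPipeline(highPrecision: boolean): Promise<P> {
    const wanted = modelFileFor(highPrecision);
    if (pipeline !== null && loadedModelFile === wanted) {
      return pipeline; // requested precision is already loaded: reuse it
    }
    // Precision changed (or nothing loaded yet): clear, then load the other file.
    pipeline = null;
    loadedModelFile = null;
    const fresh = await load(wanted);
    pipeline = fresh;
    loadedModelFile = wanted;
    return fresh;
  };
}
```

This sketch does not deduplicate overlapping calls; two concurrent requests issued before the first load finishes would each call load.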

src/services/punctuateWorkerService.ts

  • Added state tracking:

    • loadedHighPrecision
    • pendingHighPrecision
  • Optimized loading:

    • Skips reload if same precision is already loaded
    • Reuses in-flight load requests
  • Ensures punctuate(text, highPrecision) always runs against the model that matches the requested precision
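
The service-side state tracking could be sketched as follows. loadedHighPrecision and pendingHighPrecision are the names listed above; pendingLoad, createPrecisionLoader, and the injected load function are assumptions, and queuing a request for the other precision behind the in-flight load is one possible policy rather than the confirmed behavior of this branch.

```typescript
type LoadFn = (highPrecision: boolean) => Promise<void>;

function createPrecisionLoader(load: LoadFn): (highPrecision: boolean) => Promise<void> {
  let loadedHighPrecision: boolean | null = null; // precision currently loaded
  let pendingHighPrecision: boolean | null = null; // precision of the in-flight load
  let pendingLoad: Promise<void> | null = null;

  return function ensureLoaded(highPrecision: boolean): Promise<void> {
    if (pendingLoad !== null) {
      // Reuse the in-flight request when it targets the same precision.
      if (pendingHighPrecision === highPrecision) return pendingLoad;
    } else if (loadedHighPrecision === highPrecision) {
      return Promise.resolve(); // already loaded: skip the reload
    }

    // Otherwise queue this load behind any in-flight load of the other precision.
    const previous: Promise<void> = pendingLoad ?? Promise.resolve();
    const settle = (): void => {
      if (pendingLoad === next) {
        pendingLoad = null;
        pendingHighPrecision = null;
      }
    };
    const next: Promise<void> = previous
      .catch(() => undefined) // a failed earlier load must not block this one
      .then(() => load(highPrecision))
      .then(
        () => {
          loadedHighPrecision = highPrecision;
          settle();
        },
        (err) => {
          settle();
          throw err;
        },
      );
    pendingLoad = next;
    pendingHighPrecision = highPrecision;
    return next;
  };
}
```

With this shape, a punctuate(text, highPrecision) call would await ensureLoaded(highPrecision) before sending text to the worker.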


Code Improvements

  • Simplified model loading flow
  • Reduced redundant UI states and reloads
  • Improved separation of concerns (UI vs model logic)
  • Added safeguards for concurrent requests

Documentation Update

  • Updated README to:

    • Reflect new model source
    • Document precision-based loading behavior
    • Remove outdated or irrelevant content
  • Keeps the README aligned with the implementation


Net Effect

  • Faster app startup after first load (no repeated loading screen)
  • Improved performance with INT8 as default
  • Higher accuracy option via FP16 when enabled
  • No redundant model reloads
  • Stable handling of concurrent requests
  • Cleaner and more maintainable architecture



Testing Done

  • Verified first-time load shows loading screen
  • Verified subsequent refresh skips loading screen
  • Confirmed models are not re-downloaded (Network tab)
  • Tested precision switching between FP16 and INT8
  • Verified no duplicate loads during concurrent requests
  • Tested after clearing browser cache
  • Ensured no UI flicker or regressions

Checklist

  • Model load persistence implemented
  • Loading screen shown only on first load
  • Main UI rendered directly on subsequent visits
  • Precision-based model loading implemented
  • No redundant model reloads
  • Concurrent request handling optimized
  • README updated
  • Tested locally
  • No known regressions
  • Ready for review

Closes #16 (closed) and #17 (closed)

Edited by srilatha bandari
