Implement Speaker Segment Ordering and Convert Speaker Labels to 1-Based Indexing

Vemuri priya requested to merge index into speaker-diarization

Description

This merge request fixes issues related to speaker diarization output ordering and speaker label formatting. The implementation ensures that speaker segments are displayed in chronological order and that speaker labels follow a clean and consistent 1-based indexing format.

Speaker Segment Ordering

  • Added sorting logic to arrange all diarization segments by start_time in ascending order
  • Ensured the conversation is displayed in the order it was spoken
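
As a rough illustration of the sorting step, the sketch below orders segments by start_time. It assumes each segment is a dict with speaker, start_time, and end_time keys; only start_time is named in this merge request, so the other keys and the sort_segments helper are hypothetical.

```python
from operator import itemgetter


def sort_segments(segments):
    """Return a new list of segments ordered by ascending start_time."""
    # sorted() is stable, so segments that share a start_time keep
    # their original relative order.
    return sorted(segments, key=itemgetter("start_time"))


# Hypothetical diarization output, deliberately out of order.
segments = [
    {"speaker": "SPEAKER_01", "start_time": 4.2, "end_time": 6.0},
    {"speaker": "SPEAKER_00", "start_time": 0.0, "end_time": 3.9},
    {"speaker": "SPEAKER_00", "start_time": 6.5, "end_time": 8.1},
]

ordered = sort_segments(segments)
for segment in ordered:
    print(segment["start_time"], segment["speaker"])
```

Returning a new list instead of sorting in place leaves the raw diarization output untouched, which can help when comparing before and after during debugging.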

Speaker Label Remapping

  • Converted speaker labels from zero-based indexing to one-based indexing
    • SPEAKER_00 → SPEAKER_01
    • SPEAKER_01 → SPEAKER_02
  • Implemented sequential speaker numbering without gaps or inconsistencies
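
One possible way to build such a remapping is sketched below. It assumes the original labels are zero-padded strings such as SPEAKER_00, so that plain string sorting matches numeric order; build_label_map is a hypothetical helper name, and the actual change may derive the numbering differently (for example, by order of first appearance).

```python
def build_label_map(original_labels):
    """Map each distinct original label to a gap-free, 1-based label."""
    # Assumption: labels are zero-padded ("SPEAKER_00", "SPEAKER_01", ...),
    # so sorting the strings keeps them in numeric order.
    distinct = sorted(set(original_labels))
    # enumerate(..., start=1) produces 1, 2, 3, ... with no gaps,
    # even if the original IDs skipped a number.
    return {
        old: f"SPEAKER_{index:02d}"
        for index, old in enumerate(distinct, start=1)
    }


# Contiguous input: every label shifts up by one.
shifted = build_label_map(["SPEAKER_00", "SPEAKER_01", "SPEAKER_00"])

# Input with a gap (no SPEAKER_01): the output is still sequential.
compacted = build_label_map(["SPEAKER_00", "SPEAKER_02"])

print(shifted)
print(compacted)
```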

Consistent Speaker Mapping

  • Added mapping logic to maintain consistency between original speaker IDs and remapped labels
  • Updated speaker labels across:
    • Segment outputs
    • speaker_durations dictionary
    • Response formatting and processing logic
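
A minimal sketch of applying one mapping to both structures follows. The segment layout (dicts with a speaker key), the assumption that speaker_durations maps a label to seconds spoken, and the apply_label_map helper are illustrative guesses, not the code in this merge request.

```python
def apply_label_map(segments, speaker_durations, label_map):
    """Return copies of both structures with every speaker label remapped."""
    # Look every label up in the same table so segments and durations
    # cannot drift apart.
    remapped_segments = [
        {**segment, "speaker": label_map[segment["speaker"]]}
        for segment in segments
    ]
    remapped_durations = {
        label_map[old_label]: seconds
        for old_label, seconds in speaker_durations.items()
    }
    return remapped_segments, remapped_durations


# Hypothetical inputs.
label_map = {"SPEAKER_00": "SPEAKER_01", "SPEAKER_01": "SPEAKER_02"}
segments = [
    {"speaker": "SPEAKER_00", "start_time": 0.0, "end_time": 3.9},
    {"speaker": "SPEAKER_01", "start_time": 4.2, "end_time": 6.0},
]
speaker_durations = {"SPEAKER_00": 3.9, "SPEAKER_01": 1.8}

new_segments, new_durations = apply_label_map(
    segments, speaker_durations, label_map
)
print(new_segments)
print(new_durations)
```

Building new structures from a single lookup table avoids a subtle failure of renaming in place one label at a time: after SPEAKER_00 becomes SPEAKER_01, a later SPEAKER_01 → SPEAKER_02 pass would shift it a second time.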

Formatting Improvements

  • Standardized speaker label formatting throughout the response pipeline
  • Improved readability and structure of diarization results

Validation Performed

  • Verified all segments are ordered chronologically
  • Confirmed no occurrence of SPEAKER_00 in output
  • Tested multiple diarization inputs with varying speaker counts
  • Ensured speaker labels remain consistent across segments and duration mappings
  • Validated overall response readability and logical conversation flow
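
The first, second, and fourth checks above could be automated along these lines. This is a sketch under the same assumed data layout (segment dicts with speaker and start_time keys, plus a speaker_durations dict keyed by label); find_output_problems is a hypothetical helper, not part of this merge request.

```python
def find_output_problems(segments, speaker_durations):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Check 1: segments must be in ascending start_time order.
    starts = [segment["start_time"] for segment in segments]
    if starts != sorted(starts):
        problems.append("segments are not in chronological order")

    # Check 2: the zero-based label must not survive anywhere.
    segment_labels = {segment["speaker"] for segment in segments}
    if "SPEAKER_00" in segment_labels or "SPEAKER_00" in speaker_durations:
        problems.append("zero-based label SPEAKER_00 is still present")

    # Check 3: segments and durations must use the same set of labels.
    if segment_labels != set(speaker_durations):
        problems.append("segment labels and speaker_durations keys differ")

    return problems


# Hypothetical remapped output that should pass all checks.
good_segments = [
    {"speaker": "SPEAKER_01", "start_time": 0.0, "end_time": 3.9},
    {"speaker": "SPEAKER_02", "start_time": 4.2, "end_time": 6.0},
]
good_durations = {"SPEAKER_01": 3.9, "SPEAKER_02": 1.8}

# Hypothetical raw output: unsorted, zero-based, and mismatched keys.
bad_segments = [
    {"speaker": "SPEAKER_01", "start_time": 4.2, "end_time": 6.0},
    {"speaker": "SPEAKER_00", "start_time": 0.0, "end_time": 3.9},
]
bad_durations = {"SPEAKER_00": 3.9}

print(find_output_problems(good_segments, good_durations))
print(find_output_problems(bad_segments, bad_durations))
```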

Outcome

This update improves the clarity, consistency, and usability of speaker diarization outputs, making transcripts easier to read and more suitable for downstream processing and UI integration.

Closes #20 (closed)
