Research Article

Utterance Clustering Using Stereo Audio Channels

Figure 1

Visualization of the audio signal processing applied to each speaker. Boxes of the same color mark waveforms from the same speech segment. (a) Stereo waveform of a speaker’s speech audio, (b) the stereo waveform divided into 0.5-second segments, (c) mono waveforms of the left- and right-channel signals extracted from each 0.5-second segment, and (d) the processed waveforms for each 0.5-second segment.
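
The segmentation and channel-separation steps in panels (a) to (c) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_stereo_segments`, the `(num_samples, 2)` array layout, the 16 kHz sample rate, and the choice to drop a trailing partial segment are all assumptions made for illustration. The processing that produces panel (d) is not specified in the caption, so it is left out.

```python
import numpy as np


def split_stereo_segments(stereo, sample_rate, segment_seconds=0.5):
    """Cut a stereo signal into fixed-length segments and separate channels.

    Hypothetical helper: `stereo` is assumed to have shape (num_samples, 2),
    with column 0 as the left channel and column 1 as the right channel.
    Returns two arrays of shape (num_segments, samples_per_segment).
    """
    stereo = np.asarray(stereo)
    if stereo.ndim != 2 or stereo.shape[1] != 2:
        raise ValueError("expected an array of shape (num_samples, 2)")

    seg_len = int(round(segment_seconds * sample_rate))
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")

    # Assumption: a trailing partial segment is discarded.
    num_segments = stereo.shape[0] // seg_len
    trimmed = stereo[: num_segments * seg_len]

    # Panel (b): stereo waveform grouped into 0.5-second segments.
    segments = trimmed.reshape(num_segments, seg_len, 2)

    # Panel (c): mono left and right waveforms for each segment.
    left = segments[:, :, 0]
    right = segments[:, :, 1]
    return left, right


if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate, not taken from the article
    t = np.arange(2 * sample_rate) / sample_rate  # 2 seconds of audio
    tone = np.sin(2 * np.pi * 440.0 * t)
    # Synthetic stereo signal: the right channel is a quieter copy of the left.
    stereo = np.stack([tone, 0.5 * tone], axis=1)

    left, right = split_stereo_segments(stereo, sample_rate)
    print(left.shape, right.shape)
```

With 2 seconds of audio at the assumed 16 kHz rate, each channel yields 4 segments of 8000 samples. Keeping the two channels as separate arrays with matching row indices makes it straightforward to process the left and right waveforms of the same segment together.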