Method of speaker clustering for unknown speakers in conversational audio data