Speaker recognition and speaker verification


Speaker recognition is the process of automatically recognizing who is speaking from the speaker-specific information carried in the speech signal. It can be used to verify the identity claimed by a person accessing a system; that is, it enables voice-based access control for various services.

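Speaker verification (also called speaker authentication) contrasts with speaker identification: verification makes a one-to-one decision on whether a voice matches a single claimed identity, whereas identification makes a one-to-many decision on which enrolled speaker, if any, produced the utterance.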


2-stage speaker recognition: Enroll and verify

Before we can verify a speaker, the user needs to enroll their voice:


“Hello” 3 times

and random strings 2 times

Store the averaged embedding vector in a database for use in speaker recognition.
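The enrollment step can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the speaker encoder is not described here, so random vectors stand in for its output, and the names `enroll`, `l2_normalize`, `speaker_db`, the user id "alice", and the 256-dimensional embedding size are all assumptions made for the example.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    v = np.asarray(v, dtype=np.float64)
    return v / max(np.linalg.norm(v), eps)

def enroll(utterance_embeddings):
    """Average the per-utterance embeddings into one speaker profile."""
    embs = np.stack([l2_normalize(e) for e in utterance_embeddings])
    return l2_normalize(embs.mean(axis=0))

# Stand-ins for encoder outputs: 3 x "Hello" plus 2 x random strings.
# A real system would obtain these from its speaker encoder.
rng = np.random.default_rng(0)
enrollment_embeddings = [rng.normal(size=256) for _ in range(5)]

profile = enroll(enrollment_embeddings)
speaker_db = {"alice": profile}  # hypothetical in-memory "database"
```

Normalizing each embedding before averaging keeps a single loud or long utterance from dominating the profile; whether the real system does this is not stated in the slides.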


Generalized End-to-End (GE2E) loss for creating the embedding vector:

Case 5 - Solution
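As a rough illustration of the softmax variant of the GE2E loss, the sketch below computes it with NumPy for a batch of N speakers with M utterances each. Each utterance is compared with every speaker centroid through a scaled cosine similarity, and the centroid of its own speaker is computed without that utterance. The function name `ge2e_softmax_loss`, the fixed values `w=10.0` and `b=-5.0` (learnable parameters in a real training loop), and the toy data are assumptions for the example; a real implementation would use an autodiff framework.

```python
import numpy as np

def unit(x):
    """L2-normalize along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def ge2e_softmax_loss(emb, w=10.0, b=-5.0):
    """emb has shape (N speakers, M utterances, D dims), with M >= 2."""
    n, m, _ = emb.shape
    emb = unit(emb)
    centroids = emb.mean(axis=1)                              # (N, D)
    # Leave-one-out centroid: own-speaker centroid without the utterance itself.
    loo = (emb.sum(axis=1, keepdims=True) - emb) / (m - 1)    # (N, M, D)

    # Cosine of every utterance (j, i) to every centroid k: shape (N, M, N).
    cos = np.einsum("jid,kd->jik", emb, unit(centroids))
    own = np.einsum("jid,jid->ji", emb, unit(loo))            # (N, M)
    idx = np.arange(n)
    cos[idx, :, idx] = own          # replace the k == j entries

    sim = w * cos + b
    # Numerically stable log-sum-exp over centroids.
    mx = sim.max(axis=-1, keepdims=True)
    lse = mx[..., 0] + np.log(np.exp(sim - mx).sum(axis=-1))
    return float((lse - sim[idx, :, idx]).sum())

# Toy data: 3 speakers, 3 utterances each, 8 dimensions.
rng = np.random.default_rng(0)
centers = np.eye(3, 8)
noise = 0.01 * rng.normal(size=(3, 3, 8))
clustered = centers[:, None, :] + noise                       # tight speaker clusters
mixed = centers[(np.arange(3)[:, None] + np.arange(3)) % 3] + noise  # speakers scrambled

good = ge2e_softmax_loss(clustered)
bad = ge2e_softmax_loss(mixed)
```

The loss is small when each speaker's utterances cluster around their own centroid (`good`) and large when utterances sit closer to other speakers' centroids (`bad`), which is the behaviour training is meant to exploit.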

Verification decision by thresholding cosine similarity:

Case 5
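The decision rule can be sketched as below: the embedding of the test utterance is compared with the enrolled profile, and the speaker is accepted when the cosine similarity reaches a threshold. The function names, the toy 3-dimensional vectors, and the threshold of 0.7 are assumptions for illustration; in practice the threshold would be tuned on held-out data, for example at the equal error rate.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two vectors (eps avoids division by zero)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / max(np.linalg.norm(a) * np.linalg.norm(b), eps))

def verify(test_embedding, enrolled_profile, threshold=0.7):
    """Return (accept, score); accept is True when score >= threshold."""
    score = cosine_similarity(test_embedding, enrolled_profile)
    return score >= threshold, score

# Toy vectors standing in for real embeddings.
profile = np.array([1.0, 0.0, 0.0])      # enrolled speaker
same = np.array([0.9, 0.1, 0.0])         # close to the profile
other = np.array([0.0, 1.0, 0.0])        # orthogonal to the profile

accepted, score = verify(same, profile)
rejected, _ = verify(other, profile)
```

Raising the threshold lowers the false-accept rate at the cost of more false rejects, so the value is a trade-off between security and convenience.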