Google explains how it works its AI magic to label speakers in Pixel Recorder
- Google has published a blog post detailing what went into creating the new Speaker Labels feature on Tensor-powered Pixels.
- Google also revealed that it’s working to make the feature less power-hungry.
Google recently added Speaker Labels to the super helpful Pixel Recorder app. The feature automatically recognizes different speakers in a recording and assigns them unique labels in the transcript. Users can then assign speaker names to those labels. It sounds simple, but a lot of thought and work went into Recorder’s on-device solution for labeling speakers.
Google explains in a blog post that Speaker Labels are powered by its new speaker diarization system, named Turn-to-Diarize. Speaker diarization is the task of working out who spoke when. The system relies on several highly optimized machine learning models and algorithms to diarize hours of audio in real time with the limited computational resources available on Pixel phones.
The system first detects speaker changes, then uses an encoder model to extract voice characteristics from each speaker turn. A multi-stage clustering algorithm then assigns a speaker label to each turn.
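To make the clustering step more concrete, here is a minimal sketch in Python. It is not Google's multi-stage algorithm, which the article does not detail. It swaps in a much simpler greedy approach, and it assumes the encoder has already turned each speaker turn into a non-zero embedding vector. The function names, the toy two-dimensional vectors, and the 0.8 similarity threshold are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def label_turns(turn_embeddings, threshold=0.8):
    """Greedy clustering stand-in (hypothetical, not Turn-to-Diarize itself).

    Each turn joins the most similar known speaker if the similarity
    reaches `threshold`; otherwise it starts a new speaker.
    """
    speaker_sums = []  # per-speaker running sum of embeddings
    labels = []
    for emb in turn_embeddings:
        best, best_sim = None, threshold
        for i, total in enumerate(speaker_sums):
            sim = cosine(emb, total)  # same direction as the mean embedding
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No known speaker is close enough: create a new label.
            speaker_sums.append(list(emb))
            labels.append(len(speaker_sums) - 1)
        else:
            # Fold this turn into the matched speaker's running sum.
            speaker_sums[best] = [t + e for t, e in zip(speaker_sums[best], emb)]
            labels.append(best)
    return labels

# Toy embeddings: turns 0, 1 and 4 sound alike, as do turns 2 and 3.
turns = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]]
print(label_turns(turns))
```

The integer labels stand in for the generic speaker tags that users can later rename in the Recorder transcript.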
Google explains that audio recordings from the Recorder app can be as short as a few seconds or as long as 18 hours. As the model consumes more audio, it becomes more confident in predicting speaker labels. It also occasionally corrects earlier speaker labels that it predicted with low confidence. The Recorder app automatically updates the speaker labels on the screen during the recording to reflect the latest and most accurate predictions.
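The idea of revisiting low-confidence labels can be illustrated with another short Python sketch. The article does not say how Google measures confidence or decides when to relabel, so everything here is an assumption: confidence is taken to be the gap between a turn's best and second-best speaker match, and `revise`, `nearest`, and the 0.2 cutoff are hypothetical names and values.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(emb, centroids):
    """Return (label, confidence) for one turn embedding.

    Confidence (an assumed definition) is the margin between the best
    and second-best similarity; a small margin means an ambiguous turn.
    """
    sims = sorted(((cosine(emb, c), i) for i, c in enumerate(centroids)), reverse=True)
    best_sim, best = sims[0]
    runner_up = sims[1][0] if len(sims) > 1 else -1.0
    return best, best_sim - runner_up

def revise(embeddings, labels, confidences, centroids, min_conf=0.2):
    """Re-check only turns labeled with low confidence.

    Updates `labels` and `confidences` in place and returns the indices
    of turns whose label changed, so a UI could refresh just those.
    """
    changed = []
    for i, emb in enumerate(embeddings):
        if confidences[i] >= min_conf:
            continue  # confident labels are left alone
        new_label, new_conf = nearest(emb, centroids)
        if new_label != labels[i]:
            labels[i] = new_label
            changed.append(i)
        confidences[i] = new_conf
    return changed

# Turn 1 was labeled speaker 0 early on, with low confidence. Once the
# speaker centroids have settled, it turns out to sit closer to speaker 1.
centroids = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[1.0, 0.0], [0.45, 0.55]]
labels = [0, 0]
confidences = [1.0, 0.05]
print(revise(embeddings, labels, confidences, centroids), labels)
```

Returning only the changed indices mirrors what the article describes on screen: most labels stay put, and a few earlier ones get updated as the recording goes on.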
Seems quite magical that your phone can do all that, right?
Google says the Speaker Labels feature will consume less power in the future. Currently, the system runs on the CPU block of Google’s Tensor chips. The company is working on delegating more computational tasks to the TPU block, which should make the diarization system more power efficient.