Summarize Audio Recordings: Gemini Nano On The Pixel 8 Pro Is Ingenious!


With the last Google Feature Drop, the AI known as Gemini Nano also arrived on the Google Pixel 8 Pro (review) for the very first time. You can now use the smallest version of the Large Language Model in the Google Recorder app. What can it do? For starters, the artificial intelligence can create a summary of your audio recordings. We will now show you how this works.

Last year, Google distributed the last quarterly Pixel Feature Drop for 2023 after a slight delay. The focus was on the Google Pixel 8 Pro in particular, which was given an improved video mode with the Video Boost function. The AI ensures better image stabilization, less noise, and optimized lighting.

Using the Gemini Nano AI on your Google Pixel 8 Pro

Today, we will focus on a function that is only available on the Google Pixel 8 Pro: the in-house Gemini Nano AI. The smallest of Google’s large language models also runs offline, i.e. without an internet connection. It creates written summaries of voice recordings. I’ll explain why this works without an internet connection at the end of this article. As a preview: the magic word is Android AICore.

Set the system language to English (US)

Unfortunately, there is still a catch: the feature is only available in English (US). Google will gradually deliver additional language packs down the road. So if you want to test Gemini Nano on your Google Pixel 8 Pro, simply set your system language to English (US):

  • Go to Settings.
  • Select System.
  • Scroll to Languages, then open System languages.
First change the language to English (US). / © nextpit
  • Now, download the English (US) language pack (if you have not done so already).
  • Move the language to position one (long-press the two dashes next to it and drag it to the top).
  • You are now ready for Google’s AI Gemini Nano!
Gemini Nano currently only runs in English. / © nextpit

The Google Recorder app

A Pixel Feature Drop a year ago already gave the Recorder app the ability to transcribe audio recordings. The AI was also able to differentiate between several speakers and create a dialog from the recording. The first time you use the Recorder, Google will ask you whether you want to activate speaker recognition.

In my little test with my colleague Rubens, it didn’t work 100% as expected: the app recognized three speakers even though only two of us were talking, and the last sentence, “Yes, now we check it,” came from me again (Speaker 1 & 2). There is a caveat though: Neither of us is a native English speaker.

The person recognition doesn’t quite work yet. / © nextpit

First voice recording and summary

To get a summary, the transcribed voice recording must not be too short. My first recording turned out to be too short for this.

This voice recording was probably too short for the AI. / © nextpit

I then read a few paragraphs from my review of the Amazon Echo Show 5 (2023). At this point, the smartphone also prompted me to download the Gemini Nano large language model (LLM), which is required for the summary.

Once the audio recording and transcript are ready, Gemini Nano creates three key points with a summary of your file. / © nextpit

It takes a moment for the artificial intelligence to create the summary of the transcript.

The Gemini Nano LLM runs on Android AICore

Gemini is Google’s latest and most capable AI model. It comes in three versions:

  • Gemini Ultra – Google’s largest and most powerful model for highly complex tasks.
  • Gemini Pro – Google’s best model for scaling across a wide range of tasks.
  • Gemini Nano – the most efficient model from Mountain View, for on-device tasks.
The Android AICore manages the model, runtime, and security functions. / © Google

Gemini Nano uses a system service called Android AICore. It works independently from the rest of the system and does not require access to your network. This is particularly advantageous for tasks that involve end-to-end encryption, such as messaging in WhatsApp, where you want to ensure effective data protection.

Suggestions and replies from Gemini Nano do not leave your smartphone, so your data stays on the device. This is also particularly important for “Smart Reply” in Gboard. In the Developer Preview, the Pixel 8 Pro delivered high-quality reply suggestions with conversational awareness for messenger apps like Line, KakaoTalk, and WhatsApp with the help of Gemini Nano.

How many of you already use artificial intelligence regularly? Which one do you use and for what purpose? Let us know your experiences in the comments below.

