

---
title: "What is Gemini Live"
tags: ["ai"]
categories: ["ai"]
image: "https://i0.wp.com/9to5google.com/wp-content/uploads/sites/4/2024/08/Gemini-Live-on-Pixel-9-Pro.jpg?resize=1200%2C628&quality=82&strip=all&ssl=1"
url: "/news/what-is-gemini-live"
date: 2024-08-15T00:54:25+08:00
descrption: "What is Gemini Live"
draft: false
display: true
type: "featured"
---

## Gemini Live & new voices rolling out on Pixel, Samsung


As announced during Made by Google 2024, Gemini Live is rolling out and we’re seeing wider availability on Pixel and other Android phones today.

The Gemini Live icon is a waveform badged with a sparkle that appears in the bottom-right corner of the Gemini overlay and fullscreen app.

> Hi, I’m Gemini. We’re about to go live where you can explore complex topics or ideas just by talking.

An introductory prompt explains how you can “Hold” or “End” the conversation with the big buttons at the bottom, or say “stop.” Right out of the gate, Google says Gemini Extensions aren’t available in Live yet, but they are coming later to let you control your phone and access other apps (Gmail, YouTube, etc.) via voice.

The fullscreen Gemini Live UI is quite clean, but you can exit the app to go about using your phone or lock/turn off the screen to keep talking. In that case, you get a “Live with Gemini” notification that notes how the “mic is on” with an “End Live mode” button.

After you end a conversation, a text transcript showing your prompts and Gemini’s responses will appear. It appears in the “Recent” history list like every other text chat. You can restart a conversation by tapping the Live button in the corner.

In Gemini Settings, you have a new toggle for “Interrupt Live responses” (an aspect Google is particularly proud of, since it lets users interject) and a “Gemini’s voice” option to pick from 10 voices that are fittingly star- and space-themed (thanks Omega192). The new voices are available outside of Live as well.

- Nova: Calm • Mid-range voice
- Ursa: Engaged • Mid-range voice
- Vega: Bright • Higher voice
- Pegasus: Engaged • Deeper voice
- Orbit: Energetic • Deeper voice
- Lyra: Bright • Higher voice
- Orion: Bright • Deeper voice
- Dipper: Engaged • Deeper voice
- Eclipse: Energetic • Mid-range voice
- Capella: British Accent • Higher voice

So far, we’re seeing this on Pixel and Samsung devices, and the wider rollout of Gemini Live to Android is underway. It requires a Gemini Advanced subscription and works in English globally.


For years, we’ve relied on digital assistants to set timers, play music or control our smart homes. This technology has made it easier to get things done and saved valuable minutes each day.

Now with generative AI, we can provide a whole new type of help for complex tasks that can save you hours. With Gemini, we’re reimagining what it means for a personal assistant to be truly helpful. Gemini is evolving to provide AI-powered mobile assistance that will offer a new level of help — all while being more natural, conversational and intuitive.

Learn more about the new Gemini features, which will be available on both Android and iOS.


ChoozMo is a leading generative AI application company in Taiwan, providing AI news anchor services for major Taiwanese TV stations, including SET iNews, CTS, and Hakka TV. It has developed LLM-based AI customer service for Taipei 101, supporting Chinese, English, Japanese, and Korean, and is also developing LLMs for the ESG field.

Google is rolling out a new voice chat mode for Gemini, called Gemini Live, the company announced at its Pixel 9 event today. Available for Gemini Advanced subscribers, it works a lot like ChatGPT’s voice chat feature, with multiple voices to choose from and the ability to speak conversationally, even to the point of interrupting it without tapping a button.

Google says that conversations with Gemini Live can be “free-flowing,” so you can do things like interrupt an answer mid-sentence or pause the conversation and come back to it later. Gemini Live will also work in the background or when your phone is locked. Google first announced that Gemini Live was coming during its I/O developer conference earlier this year, where it also said Gemini Live would be able to interpret video in real time.

Gemini Live adds voice chatting to Google’s AI assistant. GIF: Google

Google also has 10 new Gemini voices for users to pick from, with names like Ursa and Dipper. The feature has started rolling out today, in English only, for Android devices. The company says it will come to iOS and get more languages “in the coming weeks.”

In addition to Gemini Live, Google announced other features for its AI assistant, including new extensions coming later on, for apps like Keep, Tasks, Utilities, and YouTube Music. Gemini is also gaining awareness of the context of your screen, similar to AI features Apple announced at WWDC this year. After users tap “Ask about this screen” or “Ask about this video,” Google says Gemini can give you information, including pulling out details like destinations from travel videos to add to Google Maps.


It was inevitable. With Gemini taking form in all parts of Google’s ecosystem, it was apparent that Android would eventually follow suit. Alongside the new family of Pixel hardware, Google announced it is doubling down on its promise of AI by introducing a handful of new Gemini-enabled features, including Gemini Live, which lets you chat with it as if it’s right in your ear. Google calls this the newly rebuilt “assistant experience with Gemini.”

Android users are getting a new Gemini overlay. Like the Assistant before it, Gemini can pop in at any time you need with the long press of the power button and offer context about what’s on the screen. This works with several different apps in varying ways. Google’s examples include asking for additional information about what you’re watching on a YouTube video. Or, use it for image generation in an app like Google Messages. Circle to Search also gets a small feature bump on most Android devices. You can select and share material right as you interact with it.

Then, there’s Gemini Live, which launches today. This experience feels the most like the indie sleaze-era film Her, but the Google way and without the problematic ScarJo. You can speak “naturally” to Gemini, just as you would with another person, just like Joaquin Phoenix did with that earbud. And yes, the new Pixel Buds Pro 2 will enable this feature. Google says the new Gemini Live can understand intent, follow a train of thought, and do complex tasks the Assistant couldn’t do before. Gemini Live will even let you chat with it about life and track any ideas you might have. The company suggests using it to “brainstorm potential jobs” suited to your skill set. Let a machine help you figure out where you belong in the machine.

Gemini will be Google’s most widely available AI assistant, just as the Google Assistant was/is. The Assistant still exists in the Nest ecosystem to some extent, but Gemini is its replacement as the handy helper. The main difference is the way you prompt it. Gemini relies more on imagery and direct prompting, and that’s not how we talk to Google Assistant. We trained ourselves to dial down the prompts once we realized digital assistants weren’t in it for the “casual conversation,” as we’d hoped. Perhaps Gemini will be that for Android.

Google promises Gemini is private. Gemini requires your permission before accessing all parts of your life in the ecosystem, after which it can interact with your email and documents and serve as the assistant it’s billed to be. Some of Gemini’s Android features are processed in the cloud, while most “sensitive use cases” stay on-device with Gemini Nano.

If you’re down to using Gemini as Google intended, features like Gemini Live will start rolling out for Gemini Advanced subscribers. If you buy a Pixel 9 or Pixel 9 Pro, Google will include a year of the Google One AI Premium Plan as a treat, which includes free access to Gemini Advanced for a year.



One of generative AI's most useful (and needed) applications is enhancing voice assistants, which have remained relatively unchanged for years. Now, Google is making several upgrades to its voice assistant experience with the help of Gemini.

At the company's Made by Google event on Tuesday, Google made Gemini its default voice assistant, replacing Google Assistant with a smarter alternative that can be interrupted, is aware of your Google apps, and can even help answer questions about the contents of your screen.


Arguably the biggest Gemini announcement is that Google made Gemini Live available three months after announcing it at Google I/O.

Gemini Live is an advanced voice assistant that can hold human-like, multi-turn verbal conversations (that is, conversations spanning several back-and-forth exchanges) on complex topics and even give you advice. For example, when speaking to the assistant, you can interrupt it mid-sentence, and the assistant will still understand you. You can also pick from multiple voices to enhance your conversation experience.

However, there's a catch: only Gemini Advanced subscribers on Android devices can access it. The feature is already being rolled out to both Samsung and Pixel devices.

As a bonus, Pixel 9 Pro users get access to the Google One AI Premium Plan, which includes access to Gemini Advanced (and, therefore, Gemini Live) at no additional cost for the first year. But for all other Android users, it's hard to say whether Gemini Live is worth paying $20 per month for a Google One AI Premium Plan. If you want to see whether the plan is worth it, you can try it for free via a one-month trial.


When announced at Google I/O, Gemini Live also had multimodal capabilities, which allowed it to use the camera to see the world around you and ingest that as context for answers. That feature, however, has not been released yet.

Gemini Live is a direct competitor to GPT-4o's new and improved Voice Mode, which has the same conversational and multimodal capabilities. Like Google, OpenAI has yet to make video and screen-sharing capabilities available.