Gemini breaks new ground with a faster model, longer context, AI agents and more

1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more. This is because it’s been trained by 1.5 Pro through a process called “distillation,” where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.
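The announcement does not say which distillation recipe was used, so the following is only a minimal sketch of the classic soft-label formulation, where a small "student" model is trained to match the temperature-softened output distribution of a larger "teacher". The function names, the temperature value and the toy logits are all illustrative assumptions, not details from Gemini.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Assumption: this is the textbook soft-label loss, not necessarily
    the objective used to train 1.5 Flash.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student (predicted) distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over three classes (hypothetical values).
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

A student whose outputs track the teacher's gets a loss near zero, while one that disagrees is penalized more heavily; minimizing this loss over many examples is what pushes the teacher's behavior into the smaller model.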

Read more about 1.5 Flash on the Gemini technology page, and learn about 1.5 Flash’s availability and pricing. We’ll share more details in an updated Gemini 1.5 technical report soon.

Significantly improving 1.5 Pro

Over the last few months, we’ve significantly improved 1.5 Pro, our best model for general performance across a wide range of tasks.

Beyond extending its context window to 2 million tokens, we’ve enhanced its code generation, logical reasoning and planning, multi-turn conversation, and audio and image understanding through data and algorithmic advances. We see strong improvements on public and internal benchmarks for each of these tasks.

1.5 Pro can now follow increasingly complex and nuanced instructions, including ones that specify product-level behavior involving role, format and style. We’ve improved control over the model’s responses for specific use cases, like crafting the persona and response style of a chat agent or automating workflows through multiple function calls. And we’ve enabled users to steer model behavior by setting system instructions.

We added audio understanding in the Gemini API and Google AI Studio, so 1.5 Pro can now reason across image and audio for videos uploaded in Google AI Studio. And we’re now integrating 1.5 Pro into Google products, including Gemini Advanced and Workspace apps.

Read more about 1.5 Pro on the Gemini technology page. More details are coming soon in our updated Gemini 1.5 technical report.

Gemini Nano understands multimodal inputs

Gemini Nano is expanding beyond text-only inputs to include images. Starting with Pixel, applications using Gemini Nano with Multimodality will be able to understand the world the way people do: not just through text, but also through sight, sound and spoken language.

Read more about Gemini 1.0 Nano on Android.
