As VP of Product at Google Cloud, Michael Gerstenhaber works primarily on Vertex, the company’s unified platform for deploying enterprise AI. That gives him a broad view of how companies are actually using AI models and what it takes to realize the full potential of agentic AI.
When I spoke with Michael, I was particularly struck by an idea I had never heard before. As he put it, AI models are pushing against three frontiers at once: raw intelligence, response time, and a third quality that has less to do with raw capability than with cost: whether a model can be deployed cheaply enough to run at large, unpredictable scale. It is a new way of thinking about model capabilities, and an especially valuable one for anyone trying to push frontier models in new directions.
This interview has been edited for length and clarity.
Why don’t you start by walking us through your experience in AI so far, and what you do at Google?
I’ve been working in AI for about two years. I was at Anthropic for a year and a half, and I’ve been at Google for about six months. I run Vertex, Google’s developer platform. Most of our customers are engineers building their own applications. They want access to agent patterns. They want access to an agent platform. They want access to inference from the smartest models in the world. I provide that, but I don’t provide the applications themselves. Those are for Shopify, Thomson Reuters, and our various customers to provide in their own domains.
What attracted you to Google?
I think Google is unique in the world in that we have everything from the interface to the infrastructure layer. We can build data centers. We can buy electricity and build power plants. We have our own chips. We have our own models. There is a reasoning layer that we control. There is an agent layer that we control. There are APIs for memory and for writing interleaved code. On top of that is an agent engine that ensures compliance and governance. And we also have chat interfaces, with Gemini Enterprise and Gemini chat for consumers, right? One of the reasons I came here is that Google is uniquely vertically integrated, and I think that’s our strength.
It’s strange, because despite all the differences between the companies, the capabilities of the three major labs seem very similar. Is it simply a race for greater intelligence, or is it more complex than that?
I see three frontiers. Models like Gemini Pro are tuned for raw intelligence. Think about writing code. You just want the best code you can get. It doesn’t matter if it takes 45 minutes, because I have to maintain that code and put it into production. I just want the best.
Then there’s another frontier around latency. If you’re in customer support and need to know how to apply a policy, you need enough intelligence to apply that policy. Can I process a return? Can I upgrade my seat on the plane? But if it takes 45 minutes to get an answer, it doesn’t matter how right you are. So in those cases you want the most intelligent product within that latency budget, because once the person gets bored and hangs up, further intelligence no longer matters.
And the last bucket is someone like Reddit or Meta who wants to moderate the entire internet. They have big budgets, but they can’t take an enterprise risk on something without knowing how it will scale. They have no idea how many harmful posts there will be today or tomorrow. So they have to constrain their budget to the most intelligent model they can afford, in a way that scales to an effectively unlimited number of posts. For this reason, cost becomes very important.
One thing I’ve always wondered is why it’s taking so long for agentic systems to catch on. I feel like the models exist and we’ve seen great demos, but we aren’t seeing the big changes we expected a year ago. What do you think is holding it back?
The technology is essentially two years old, and there is still a lot of infrastructure missing. There is no pattern for auditing what agents are doing. There is no pattern for authorizing agents to access data. These patterns take work to get into production. And production is always a lagging indicator of what the technology is capable of. So two years is not enough time to see what the intelligence supports in production, and that’s where people are struggling.
I think it’s moving uniquely fast in software engineering because it fits well into the software development lifecycle. You have a development environment where it’s safe to break things, and then you promote from the development environment to a test environment. The process of writing code at Google requires two people to audit that code, and both of them make sure it’s good enough to put Google’s brand behind and deliver to our customers. So there are a lot of human-in-the-loop processes that make implementation very low-risk. But we need to create those patterns for other places and other professions.
