11Sight at Google Cloud AI Agents: Live + Labs - Building Production-Grade Voice AI for Dealerships

On September 10, 2025, at the Chase Center in San Francisco, Google Cloud hosted AI Agents: Live + Labs – Startups Edition. Founders, VCs, and builders gathered to answer a simple question: how do we turn AI agent potential into production results?

11Sight was there to represent a very specific, and very hard, problem: voice AI that augments customer-facing, entry-level roles at automotive dealerships. Our CEO, Aleks Gollu, PhD, joined Ofer Ronen (Tomato.ai) and Maxim Fateev (Temporal Technologies) on a panel moderated by Google to share what’s actually working.

Aleks Gollu (11Sight) with Ofer Ronen (Tomato.ai) and Maxim Fateev (Temporal Technologies) at Google Cloud AI Agents: Live + Labs – Startups Edition.

Core message: 80% isn’t enough

Here’s the thing: getting an AI agent to “work” 80% of the time is fast. You can reach that in days.
But enterprises don’t buy 80%. They need 100% of the targeted functionality to work, every time.

Aleks put it bluntly:

“Statistical correctness is fine for demos. In production, you need deterministic behavior. No hallucinations. No sometimes-right. If it’s customer-facing, it must be absolutely right.”

That last 20% (hardening, guardrails, fallbacks, monitoring, and backward-compatible releases over years) is where most systems fail. That’s the part we obsess over.

Don’t boil the ocean. Solve the hard, narrow problems.

We focus on a narrow set of planned outcomes (think 9–10, not 500). We augment, not replace:

Service desk: answer calls, schedule/confirm/modify appointments, escalate when needed
Front desk (auto + hospitality): reservations, changes, confirmations
Outbound reminders: recalls, maintenance, follow-ups

This focus does two things:

It makes the problem solvable to enterprise standards.
It protects the customer experience, because when our agent can’t resolve a request within 2–3 turns, it escalates immediately. No endless IVR loops. No “agent will try more questions” while the customer steams.
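The escalation rule above can be sketched as a small turn counter. This is only an illustration, not 11Sight's implementation: the names `CallState`, `handle_turn`, and `MAX_TURNS`, and the choice of 3 as the limit, are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: use the upper end of the "2-3 turns" mentioned in the text.
MAX_TURNS = 3


@dataclass
class CallState:
    """Hypothetical per-call state tracked by the agent."""
    turns_used: int = 0
    resolved: bool = False
    escalated: bool = False


def handle_turn(state: CallState, turn_resolved: bool) -> str:
    """Advance the call by one turn and return the next action."""
    if state.resolved or state.escalated:
        raise ValueError("call already finished")
    state.turns_used += 1
    if turn_resolved:
        state.resolved = True
        return "done"
    if state.turns_used >= MAX_TURNS:
        # Out of turns: hand off to a person instead of asking again.
        state.escalated = True
        return "escalate_to_human"
    return "continue"
```

The point of the hard cap is that the hand-off decision does not depend on the model's judgment: once the turn budget is spent, the call goes to a person.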

Production means a decade, not a demo

Enterprise production is not a launch date; it’s a 10-year commitment:

Releases without regressions
Backward compatibility
Encapsulation, modularity, and unit testing, rediscovered for AI systems
Real-time observability and issue triage

We designed 11Sight’s architecture to live in that reality.

Continuous improvement you can measure

We don’t guess. We measure and iterate.

Resolution rate progression across deployments

Initial: ~30%
Q2: ~48%
September: ~60%
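For readers who want the progression as numbers, here is a minimal sketch that computes the gain between checkpoints, both in percentage points and relative to the previous value. The rates are the approximate figures listed above; the function name `gains` is just a label for this example.

```python
# Approximate resolution rates from the list above, in percent.
rates = [("Initial", 30.0), ("Q2", 48.0), ("September", 60.0)]


def gains(series):
    """Return (label, percentage-point gain, relative gain in %) per step."""
    out = []
    for (_, prev), (label, cur) in zip(series, series[1:]):
        out.append((label, cur - prev, (cur - prev) * 100 / prev))
    return out


for label, points, relative in gains(rates):
    print(f"{label}: +{points:.0f} points ({relative:.0f}% relative gain)")
```

With these inputs, the step to Q2 is 18 points (a 60% relative gain) and the step to September is 12 points (a 25% relative gain).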

How we push it up:

Sentinel Agents continuously listen and categorize issues (latency, silence, wrong prompt, missing data, etc.).
We escalate only the high-signal call clusters to engineering (e.g., “These 15 calls fail for the same reason. Fix that.”).
Sentinel accuracy is tuned via confusion-matrix sampling; 90–95% is acceptable internally because it feeds improvement, not customer outcomes.
Roadmap: end-of-call CSAT prompts (“How satisfied were you? What could we do better?”) to pair human feedback with Sentinel telemetry.
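Two of the steps above lend themselves to a short sketch: grouping categorized calls into clusters large enough to send to engineering, and estimating Sentinel accuracy from a human-labeled sample via a confusion matrix. This is a hedged illustration, not the actual Sentinel code: the function names, the data shapes, and the threshold of 15 (borrowed from the example quote) are all assumptions.

```python
from collections import defaultdict

# Assumption: cluster size threshold, echoing the "15 calls" example.
MIN_CLUSTER_SIZE = 15


def high_signal_clusters(labeled_calls, min_size=MIN_CLUSTER_SIZE):
    """Group (call_id, category) pairs and keep categories with enough calls."""
    clusters = defaultdict(list)
    for call_id, category in labeled_calls:
        clusters[category].append(call_id)
    return {cat: ids for cat, ids in clusters.items() if len(ids) >= min_size}


def sampled_accuracy(sample):
    """Build a confusion matrix from (human_label, sentinel_label) pairs.

    Returns the matrix as a dict keyed by (human, sentinel) and the
    fraction of sampled calls where the two labels agree.
    """
    matrix = defaultdict(int)
    for human, sentinel in sample:
        matrix[(human, sentinel)] += 1
    total = sum(matrix.values())
    correct = sum(n for (h, s), n in matrix.items() if h == s)
    return dict(matrix), (correct / total if total else 0.0)
```

The off-diagonal cells of the matrix (where the human and Sentinel labels differ) show which categories get confused with each other, which is what makes the sampling useful for tuning.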

Bottom line: the customer-facing agent is deterministic; the monitoring agent can be statistical. That’s how you improve safely, without risking live CX.

Why dealerships are adopting now

Dealerships don’t have the luxury telcos do. If you frustrate a few customers, you feel it: these are relationships, not anonymous accounts.

What 11Sight delivers:

~60% first-call resolution today (and climbing)
Coverage of the ~38% of in-hours calls that go unanswered (NADA workforce study), turning missed demand into revenue
Net effect: ~28% more booked appointments
True 24/7/365 coverage. Yes, people schedule at 2 a.m., and they expect it to work
Clean escalation when the agent shouldn’t or can’t proceed

We’re not replacing teams. We’re giving them time back and protecting CSI scores.

Voice is harder. We chose it anyway.

Customer-facing + voice + entry-level workflows = zero-mistake tolerance.
That’s exactly why we chose it. If you can make voice work to enterprise standards, the rest follows.

We also meet customers where they are:

Phone calls, after-hours intake
Web voice widgets for “talk, don’t type” flows
Multi-attempt logic that never harms the experience

A balanced panel, a clear takeaway

Google curated a practical mix:

Tomato.ai: speech/accents and the human-to-AI interface
Temporal: durable workflow orchestration for evolving stacks
11Sight: the application layer solving a real, high-stakes CX problem in automotive

Our stance, restated:

“We build voice AI agents that augment customer-facing entry-level roles. We define narrow outcomes, escalate with dignity, and improve continuously.”

The market noise vs. the work

Yes, there’s capital flooding AI. Yes, “one-person unicorns” make headlines. Also true: most agents fail in production because they stop at 80%. We don’t.

We do the unglamorous work (guardrails, monitoring, regression-proof releases) so dealerships get something simple: answered calls, booked appointments, and happier customers.

Grateful to the Google Cloud team for hosting a serious, practice-oriented day. The mix of founders, operators, and investors made for candid conversations about what’s real and what still needs building.

See what “production-ready” feels like

If you’re running a service department or front desk and want fewer missed calls, more booked appointments, and better CSI, without making customers fight a machine:

Book a demo and ask us to show you:

Live call flows (with real escalation)
Sentinel monitoring in action
How we got from 30% → 48% → 60% resolution, and where we’re headed next

👉 Book a Demo with 11Sight
