AI Product Management

The State of YC AI Agents (2026)

Most teams have agents in production. Almost none know if they're actually working.

Tyler Postle
Feb 20, 2026

We surveyed YC companies building AI agent products and found that the hardest part is no longer getting agents into production — it's figuring out whether agents are actually helping users so adoption and usage can scale.

Two signals from the data stood out.

First, most agent deployments are still early in scale: 89% of production systems handle fewer than 10k conversations per month.

Second, even teams already using tools like LangSmith, Langfuse, or Grafana report that understanding and improving agent behavior is still the hardest operational problem.

To better understand how YC companies are deploying agents today, we asked founders about:

  • usage volume
  • agent use cases
  • interaction patterns
  • architecture design
  • operational challenges

The full report is exclusive to survey respondents, but here are a few key signals that emerged from the data.

🚀 1) Most agent startups already have production deployments

  • 86% of respondents already have agents live in production
  • 14% are still in development

This suggests YC teams are moving beyond demos quickly. Agents are already embedded into real workflows and products. This shouldn’t be a big surprise these days — but what comes after deployment (adoption, usage, churn) is what really matters.

📊 2) Most production agents are still early in scale

Among the companies that reported agents live in production, conversation scale is still mostly small:

  • 89% handle fewer than 10k conversations per month
  • Only one respondent reported volumes in the 1M–10M/month range

So while YC companies are already shipping agents to production, most deployments are still early in scale. This surprised me, considering all the LinkedIn, Reddit, and Twitter claims about how agents are everywhere doing everything. Good reminder not to spend too much time comparing yourself to what you see online.

⚙️ 3) What YC agents are actually doing

The most common agent use cases (multiple select) were:

  • 62% data extraction / processing
  • 62% workflow automation
  • 38% research and analysis
  • 38% content generation
  • 33% search / retrieval
  • 29% customer support

This suggests many YC agent products focus primarily on structured work and operational tasks, where outcome success is more clearly defined. This becomes relevant when we look at the key challenges facing agent products today.

🔄 4) Agents are both product features and background systems

One interesting pattern from the survey is that agents are not just chat interfaces — many run as background systems inside the product.

Across responses:

  • 76% expose agents directly to customers
  • 48% run agents in the background performing automated tasks
  • 33% run agents primarily for internal teams

This suggests many products now include both:

  • interactive agent experiences (chat, copilots, assistants)
  • background agents that execute tasks, process data, or automate workflows

In other words, agents are increasingly becoming core product infrastructure, not just user-facing chat features.

🧠 5) Agent architectures are converging on hybrid systems

Teams reported using a mix of architectural patterns:

  • 76% use iterative reasoning loops
  • 57% use deterministic workflows or state machines
  • 38% use multi-agent systems
  • 33% use single-pass tool calling
  • 24% use planner–executor architectures

The interesting part is the overlap.

Many teams reported using both iterative loops and deterministic workflows, suggesting a common architecture pattern is emerging: deterministic workflow orchestration combined with agent reasoning loops.

In practice, workflows often control the high-level product logic, while agent loops handle reasoning and tool use inside each step.
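That split can be sketched in a few lines of Python. Everything here is illustrative (the `run_step` loop, the `call_llm` stand-in, and the `TOOLS` table are hypothetical names, not any particular framework): a fixed, deterministic sequence of workflow steps, where each step internally runs a small agent reasoning loop that may call tools.

```python
# Hypothetical sketch: deterministic workflow orchestration on the outside,
# an iterative agent reasoning loop inside each step. All names are
# illustrative, not from a specific framework.

def call_llm(prompt: str) -> dict:
    # Stand-in for a real model call; a real implementation would return
    # either a tool request or a final answer.
    return {"type": "final", "content": f"handled: {prompt}"}

TOOLS = {
    "lookup": lambda query: f"result for {query}",
}

def run_step(task: str, max_iters: int = 5) -> str:
    """Agent reasoning loop: let the model call tools until it answers."""
    context = task
    for _ in range(max_iters):
        decision = call_llm(context)
        if decision["type"] == "final":
            return decision["content"]
        # The model asked for a tool: execute it and feed the result back.
        tool_result = TOOLS[decision["tool"]](decision["input"])
        context += f"\n[tool:{decision['tool']}] {tool_result}"
    return context  # give up after max_iters

def run_workflow(ticket: str) -> dict:
    # Deterministic orchestration: the step order is fixed product logic;
    # only what happens inside each step is agentic.
    extracted = run_step(f"extract fields from: {ticket}")
    analysis = run_step(f"analyze: {extracted}")
    reply = run_step(f"draft a reply for: {analysis}")
    return {"extracted": extracted, "analysis": analysis, "reply": reply}
```

The appeal of this shape is that the workflow layer stays testable and predictable, while the open-ended reasoning is contained within bounded, retryable steps.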

🧪 6) The biggest challenge: Evaluation tooling

Now comes the fun part — the pain points.

We asked respondents what their biggest challenge is with building, running, and scaling production agents. When we clustered the free-text responses, they grouped into three broad categories:

  • Agent quality & improvement (evals, reliability, monitoring)
  • System performance & infrastructure (latency, cost, traffic/availability, voice limitations)
  • Product & organizational constraints (discovering use cases, time/resources)

The largest cluster by far was the first one: agent quality and improvement.

About 38% of respondents explicitly mentioned evaluation challenges, including:

  • building eval suites
  • running A/B tests
  • improving agent behavior over time
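For concreteness, an eval suite at its smallest is just inputs paired with programmatic checks, run against the agent on every change. This is a minimal sketch under assumptions: the `agent` function, the cases, and the string checks are all hypothetical placeholders for a real system.

```python
# Minimal eval-suite sketch. `agent` stands in for the real agent under test;
# the cases and checks are illustrative only.

def agent(prompt: str) -> str:
    # Placeholder for the production agent being evaluated.
    return "the invoice total is 42 USD"

# Each case pairs an input with a simple pass/fail check on the output.
EVAL_CASES = [
    {"input": "What is the invoice total?", "check": lambda out: "42" in out},
    {"input": "What currency is used?",     "check": lambda out: "USD" in out},
]

def run_evals(agent_fn) -> float:
    """Run every case against the agent and return the pass rate."""
    passed = sum(1 for case in EVAL_CASES if case["check"](agent_fn(case["input"])))
    return passed / len(EVAL_CASES)
```

The hard part founders describe isn't writing this harness; it's keeping the cases representative of real usage as the agent and its users evolve.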

What’s interesting is that many founders mentioned using systems like LangSmith, Langfuse, Braintrust, or internal observability tooling, yet still struggle to understand what their agents are doing in the wild — and whether they are actually helping users.

(You can bet I forwarded this insight to our team at Voker — it’s a clear signal that reinforces our thesis that the space needs a purpose-built Agent Analytics platform.)

As product builders, when people tell us their pain, we should always ask: what’s the pain behind the pain?

When I did this, two takeaways stood out.

1️⃣ “Evals” are a symptom, not the full solution.

Many respondents already have evaluation tools in their stack. What they’re really describing is the broader problem:

AI promises magic, but delivers lumpy outcomes — sometimes magical, sometimes frustrating.

Evals help address part of this, but they’re only one tool in the box. In a previous survey we ran, a super-majority of respondents said evals often under-deliver because keeping them up to date becomes an impossible task.

2️⃣ Quality problems ultimately show up as product problems.

Why does quality matter? Because adoption, usage, and churn are on the line.

2025 was the year agents got into production.
2026 is the year teams have to harden and optimize them.

A huge wave of churn may be coming as the unbelievable promise of “Ask me anything” (the slogan behind almost every agentic product out there) starts to come under scrutiny from real users.

An early indicator is the low interaction volumes we saw in this survey.
The lagging indicator is the C-word: churn.

That’s why I believe founders are pointing to agent quality and improvement as their biggest challenge.

🧾 Final thoughts

A few early signals from this dataset:

  • YC companies are already shipping agents to production (duh)
  • Most deployments are still early in scale
  • Agents are primarily used for automation and structured work, not the unbounded intelligence we’re often promised
  • Architectures appear to be converging around hybrid workflow + agent systems, so it’s not as simple as “just prompt a model”
  • The most common operational challenge is evaluating and improving agent behavior

In the world of agent evaluation tools, here’s a quick cheat sheet:

Observability tools help engineers debug individual traces.
Eval tools help prevent unintended agent regressions when configurations change.
Agent Analytics tools (Voker) help engineers, product managers, and business leaders measure whether agents are actually helping users and identify performance patterns that need attention.

I plan to run this survey again next year to track how the agent space evolves.

In the meantime, if you’re building an Agent-First product, submit your response to our survey and I’ll send you the full report. We’ll continue updating the analysis as more data comes in.

And of course — if you have agents in production and are struggling to understand or improve agent behavior, let’s talk.

Tyler Postle
CEO, Voker
