What a day of agentic development actually looks like

By Victor Valenzuela

Most developers I know still use AI coding tools the way they used Stack Overflow in 2015: a window open to ask a question, copy an answer, paste, run and debug. Cursor's tab-complete, Copilot's inline suggestions, the occasional round trip through ChatGPT for a tricky function. Useful as a faster reference, not so much as a collaborator.

There's a different way to work, and the difference is measurable. We've spent the last few months at dala.care moving from chat-style assistance to what the industry now calls agentic development, where the agent owns a task end to end. It reads the ticket, maps the code, proposes a plan, writes the change, runs tests, opens the pull request (PR), and addresses review comments. I stay in the loop at every decision point, especially where a business rule is ambiguous or the architecture isn't obvious, but I'm not typing code by hand anymore.

The gain isn't what most people assume. I write less code than I used to, not more.

The number that matters

The metric I look for isn't ticket count or lines of code. It's the ratio of merged PRs to closed tickets, because closed tickets are what actually translate into shipped improvements our clients use.
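
To make that concrete, here's a rough sketch of how the ratio could be computed from exported data. The file layout and field names are hypothetical, not our actual tooling.

```python
# Hypothetical sketch: merged-PRs-per-closed-ticket from exported JSON.
# The file names and field names are illustrative, not our real exports.
import json
from collections import defaultdict

def prs_per_ticket(prs_path: str, tickets_path: str) -> float:
    with open(prs_path) as f:
        prs = json.load(f)       # e.g. [{"ticket": "DALA-123", "state": "merged"}, ...]
    with open(tickets_path) as f:
        tickets = json.load(f)   # e.g. [{"key": "DALA-123", "status": "Done"}, ...]

    closed = {t["key"] for t in tickets if t["status"] == "Done"}
    merged_by_ticket = defaultdict(int)
    for pr in prs:
        if pr["state"] == "merged" and pr["ticket"] in closed:
            merged_by_ticket[pr["ticket"]] += 1

    if not closed:
        return 0.0
    return sum(merged_by_ticket.values()) / len(closed)

# A quarter around 2.5 means every ticket needed multiple PRs to land;
# a quarter around 1.2 means most tickets closed with a single PR.
```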

Before the agentic workflow, I averaged about 2.5 PRs per closed ticket, counting the initial implementation, the cleanup commits, the review-round fixes, the formatting passes, and the fix-forward patches. Every ticket took multiple round trips to land. After the workflow came online, that ratio dropped to about 1.2. Almost every ticket now closes with a single focused PR.

The difference is the ceremony and rework that the workflow eliminates before the PR ever opens. Every PR I ship has already passed formatting, lint, typecheck, tests, and an adversarial code review by an agent running a different model. The problems that used to generate cleanup PRs and review round-trips get caught automatically, so the code is usually right the first time.
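
For a sense of what those gates look like as a pre-PR step, here's a minimal sketch. The specific commands are placeholders for whatever formatter, linter, type checker, and test runner a project uses; the adversarial review by a second model isn't shown.

```python
# Hypothetical pre-PR gate runner: every command must pass before the PR opens.
# The commands below are placeholders, not our actual stack.
import subprocess
import sys

GATES = [
    ["ruff", "format", "--check", "."],   # formatting
    ["ruff", "check", "."],               # lint
    ["mypy", "."],                        # typecheck
    ["pytest", "-q"],                     # tests
]

def run_gates() -> bool:
    for cmd in GATES:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"gate failed: {' '.join(cmd)}", file=sys.stderr)
            return False
    return True

if __name__ == "__main__":
    sys.exit(0 if run_gates() else 1)
```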

Let's look at my Done tickets per quarter at dala.care, pulled directly from our ticketing system. Every item in the graph represents a production-shipped, QA-passed ticket.

The chat-mode era covers everything before Q3 2025. Output varied from 76-79 Done tickets in productive quarters down to 25 when tickets got bigger and each one spanned weeks. The ramp starting in Q3 2025 tracks the rollout of the agentic workflow, and Q1 2026 is the first full quarter with the loop in place, ending at 135.

The ticket count went up because the tickets got smaller. The workflow pushes me toward splitting large features into stacked, reviewable pieces instead of shipping huge diffs. More tickets closed does not mean more code written: when I stripped generated files and lock files and measured organic lines of code for Q1 2026, my output was actually lower than in my pre-agentic quarters. I'm writing less code per quarter than I used to; the key difference is that much less of it is rework.
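
If you want to reproduce that kind of measurement, a sketch along these lines works. The exclusion patterns below are illustrative, not the exact filters I used.

```python
# Hypothetical sketch: count added lines in a date range, excluding generated
# files and lock files, using git's numstat output.
import subprocess

EXCLUDED_NAMES = {"package-lock.json", "yarn.lock", "poetry.lock"}
EXCLUDED_SUFFIXES = (".lock", ".snap", ".min.js")
EXCLUDED_DIRS = ("generated/", "dist/", "migrations/")

def organic_added_lines(since: str, until: str) -> int:
    out = subprocess.run(
        ["git", "log", "--numstat", "--pretty=format:",
         f"--since={since}", f"--until={until}"],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or parts[0] == "-":   # skip blanks and binary files
            continue
        added, _removed, path = parts
        name = path.rsplit("/", 1)[-1]
        if name in EXCLUDED_NAMES or path.endswith(EXCLUDED_SUFFIXES):
            continue
        if any(path.startswith(d) or f"/{d}" in path for d in EXCLUDED_DIRS):
            continue
        total += int(added)
    return total

print(organic_added_lines("2026-01-01", "2026-04-01"))
```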

The process

The agent handles most of the sequence around day-to-day work. A typical ticket involves a lot of small steps before and after the actual code writing: pulling context from our ticketing system and the codebase, checking the vault for past decisions, running lints and types and tests, writing the PR description. Some of that is fully automatic. Other parts (walking the diff, deciding how to address review comments) still need my judgment, but the workflow puts those steps in front of me instead of relying on me to remember them on a Friday afternoon.
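
Roughly, the per-ticket loop looks like this. The functions are stubs meant to show the shape of the loop and where the human checkpoints sit; they are not the harness itself.

```python
# Hypothetical outline of the per-ticket loop. Function bodies are stubs; the
# real workflow is driven by the agent harness, not a script like this one.
from dataclasses import dataclass

@dataclass
class Ticket:
    key: str
    description: str

def gather_context(ticket: Ticket) -> str:
    """Pull the ticket, the relevant code, and past decisions from the vault."""
    ...

def propose_plan(context: str) -> str:
    """Agent drafts a plan; I approve or edit it before any code is written."""
    ...

def implement_and_gate(plan: str) -> bool:
    """Agent writes the change; formatting, lint, typecheck, and tests must pass."""
    ...

def open_pr(ticket: Ticket) -> None:
    """Agent drafts the PR description; I walk the diff before it goes up."""
    ...

def handle_ticket(ticket: Ticket) -> None:
    context = gather_context(ticket)
    plan = propose_plan(context)      # human checkpoint
    if implement_and_gate(plan):
        open_pr(ticket)               # human checkpoint
```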

As I use these agents for my work, knowledge capture compounds. When I learn something non-obvious about the product (a business rule, a migration gotcha, a piece of history that explains why the code is shaped the way it is), the workflow captures it into a knowledge vault at the end of the session. The next time I pick up a ticket in that area, the relevant notes get loaded automatically, and the same mistake doesn't get made twice. Addy Osmani describes this as the self-improving agent flywheel: better context produces better agent work, which produces better learnings, which produce better context. That flywheel is real, and it's the reason the Q3-to-Q4-to-Q1 ramp in the chart hasn't flattened yet.
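
A vault doesn't need to be fancy. Here's a deliberately minimal sketch of the capture-and-reload idea; the storage format and the matching rule are illustrative, not our actual vault.

```python
# Hypothetical knowledge vault: append a note at the end of a session, load
# matching notes when a new ticket touches the same area.
import json
from pathlib import Path

VAULT = Path("vault/notes.jsonl")

def capture(area: str, note: str) -> None:
    VAULT.parent.mkdir(parents=True, exist_ok=True)
    with VAULT.open("a") as f:
        f.write(json.dumps({"area": area, "note": note}) + "\n")

def load_for(area: str) -> list[str]:
    if not VAULT.exists():
        return []
    notes = [json.loads(line) for line in VAULT.read_text().splitlines() if line]
    return [n["note"] for n in notes if n["area"] == area]

# End of a session working on billing:
capture("billing", "Invoices regenerate nightly; never mutate a sent invoice in place.")
# Start of the next billing ticket: these notes get prepended to the agent's context.
print(load_for("billing"))
```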

A published benchmark ran the identical model against the same task suite under different surrounding configurations. Success rate swung from 42% to 78%. Thirty-six points, with no change to the model.

The model itself is fixed. What you can tune is everything around it, and that environment is where the gain lives.

In Nonaka's knowledge-management model this process is called externalization: converting tacit knowledge into explicit, shareable form. Infrastructure-as-Code already proved the pattern: operational knowledge that used to live in one administrator's head became versioned files that anyone, human or machine, can read and apply. The agent is what finally pushed me to do the same externalization for product and codebase knowledge.

It isn't all upside. High-velocity PR output is a real load on the people who review my work. In Q1 2026 I opened 168 PRs, and that volume has to be absorbed somewhere. Any honest accounting of agentic productivity has to include the review cost it creates for everyone else.

Worth it?

If you're still in chat mode, the jump is worth making, but know what you're buying. You're not buying more code; you're buying less rework, fewer forgotten steps, and externalized knowledge that compounds. I went from 2.5 PRs per ticket to 1.2, and that gap is the ceremony and iteration that used to eat a lot of my time.

For this to work you have to build the environment around the models: the gates, the knowledge vault, the skill orchestrators, the review automation. That substrate will outlast whichever tool is in fashion next year.

I don't know which model I'll be using in twelve months, but I know the harness I'll be running it through.

About the author

Vic is a senior engineer at dala.care.