Mike Cann
6 days ago

Is Claude Code Better Than Cursor? A Real-World Test with Convex

A few weeks ago, I posted something on X that I wasn't sure I should say out loud: an honest, slightly existential reckoning with what it means to be a developer experience engineer when AI agents are writing the code, and with the direction the industry seems to be heading.

Then I decided to stop worrying about it and actually test the thing everyone's arguing about.

The Experiment: Building a Full-Stack Convex App Without Opening an IDE

The experiment was simple. I wanted to build a real, full-stack app using only Claude Code from the terminal, with no IDE, no Cursor, and no VS Code. Just a CLI agent and a Convex backend. The goal wasn't to crown the "best AI for coding", but to stress-test the agent-first workflow on a project with real complexity and see what actually happened.

The app itself is a portfolio site that pulls data from the YouTube API and the X API, parses content with AI, handles authentication, and includes admin tooling. It's powered entirely by Convex, including cloud functions, database, and file storage. You can see the finished product at Mike's Convex Portfolio, and if you want to watch the full build, the video walkthrough covers every step.

This wasn't a toy demo but a multi-integration project with enough moving parts to expose real friction (and it did).

What Claude Code Gets Right

Claude Code excels at plugin extensibility, autonomous plan mode, intelligent tool calling, and raw CLI speed. For greenfield projects where you want an agent to figure things out without hand-holding, it's genuinely impressive.

Let me break down the specific wins.

1. Plugins and Extensibility

The /plugin system in Claude Code is one of its strongest differentiators. Installing plugins is fast and frictionless, and MCP (Model Context Protocol) server support means you can extend the agent's capabilities in ways that feel native rather than bolted on.

During this build, these are the plugins that made a massive difference:

  • a dev browser plugin for previewing output
  • the Convex MCP for inspecting the database during development
  • Context7 MCP for pulling in documentation context
  • Claude skills for front-end design guidance

The Convex MCP was particularly useful. Being able to have the agent inspect the actual state of the database while it was writing queries and mutations meant fewer wrong turns, because the agent could verify its own assumptions about the data shape without me having to copy-paste schema definitions into the chat.

This is where Convex's architecture quietly pays dividends. Because queries in Convex are just TypeScript functions that run inside the database, the agent didn't need to learn a separate query language or translate between an ORM and raw SQL. It could read the schema, understand the types, and write a query, all in the same language it was already thinking in. That consistency made the MCP integration feel seamless rather than like a bolted-on inspection tool.

2. Plan Mode and Collaborative Problem Solving

Claude Code's plan mode is where the agent stops, thinks, and asks clarifying questions before acting, and this matters more than it sounds.

The key differentiator is that Claude Code can autonomously enter plan mode when it senses complexity. Rather than barreling ahead, it pauses, scopes the problem, and proposes an approach before writing a single line of code. In Cursor, plan mode exists too, but you have to manually invoke it, and that distinction changes the dynamic entirely. With Claude Code, the agent behaves more like a collaborator who knows when to slow down, whereas with Cursor, you're the one who has to recognize when the task is complex enough to warrant planning.

For a multi-step task like wiring up YouTube API fetching, parsing the response with AI, and storing the results in Convex, I was grateful for the autonomous planning. The agent broke the problem into discrete steps: first defining the schema, then writing the action to fetch from YouTube, then writing the mutation to store the parsed data, and finally wiring up the query to surface it on the frontend. Each step was scoped and sequenced before any code was written.

That kind of structured thinking is exactly what you want from an agent working on a real backend. When your data pipeline flows from an external API through a Convex action into a mutation and then out through a reactive query, getting the order of operations wrong means debugging across multiple layers, so the agent's willingness to plan first saved real time.
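The sequencing described above (schema first, then fetch and parse, then store, then query) can be sketched in plain TypeScript. This is a minimal in-memory model, not the code from the build and not Convex's API: `VideoDoc`, `RawItem`, `parseItems`, `storeVideos`, and `listVideos` are hypothetical names, the table is a plain `Map`, and the AI parsing step is reduced to a simple field check.

```typescript
// Step 1, the schema: the shape of a stored video (hypothetical fields).
interface VideoDoc {
  youtubeId: string;
  title: string;
  publishedAt: string; // ISO 8601 timestamp
}

// Hypothetical raw item as it might arrive from an external API.
interface RawItem {
  id?: string;
  title?: string;
  publishedAt?: string;
}

// In-memory stand-in for a database table, keyed by youtubeId.
const videosTable = new Map<string, VideoDoc>();

// Step 2, the "action": turn raw API items into documents,
// skipping incomplete items instead of storing partial data.
function parseItems(items: RawItem[]): VideoDoc[] {
  const docs: VideoDoc[] = [];
  for (const item of items) {
    if (!item.id || !item.title || !item.publishedAt) continue;
    docs.push({ youtubeId: item.id, title: item.title, publishedAt: item.publishedAt });
  }
  return docs;
}

// Step 3, the "mutation": upsert by youtubeId so re-running a sync
// overwrites existing rows instead of duplicating them.
function storeVideos(docs: VideoDoc[]): number {
  for (const doc of docs) videosTable.set(doc.youtubeId, doc);
  return docs.length;
}

// Step 4, the "query": newest first. ISO 8601 strings in the same
// timezone sort chronologically when compared as plain strings.
function listVideos(): VideoDoc[] {
  return Array.from(videosTable.values()).sort((a, b) =>
    a.publishedAt < b.publishedAt ? 1 : a.publishedAt > b.publishedAt ? -1 : 0,
  );
}

// Usage with made-up data; in a real build the raw items would come from an API call.
storeVideos(
  parseItems([
    { id: "a1", title: "Older video", publishedAt: "2024-01-01T00:00:00Z" },
    { id: "b2", title: "Newer video", publishedAt: "2025-01-01T00:00:00Z" },
    { title: "Missing id, skipped" },
  ]),
);
```

Each function depends only on the one before it, which is why getting the order wrong is costly: a mistake in the document shape at step 1 surfaces in every later step.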

3. Agent Intelligence and Tool Usage

Throughout the build, the agent consistently used the right tools at the right time. It reached for the Convex CLI to run mutations, used the browser plugin to verify visual output, and called MCP servers for documentation when it needed context.

That said, it's hard to separate the model from the orchestration around it. Both matter, and we need benchmarks that can tell them apart. Right now we're evaluating the whole stack as one unit.

The agent's tool usage was noticeably better when working with Convex than with some of the other integrations. Part of that is the Convex MCP doing its job well, and part of it is that Convex's API surface is consistent and predictable. There's one way to write a query, one way to write a mutation, one way to define a schema, and that consistency gives the agent fewer wrong paths to wander down.

4. CLI Speed

Type claude from any directory and you're working, with no project loading, no extension initialization, and no waiting for an IDE to index your codebase. The startup speed is instant, and for developers who already live in the terminal, that's compelling.

Even as someone who typically prefers IDE-based workflows, I found the immediacy appealing. There's something clarifying about stripping away the visual chrome and just talking to an agent in a terminal. You focus on what you want to build rather than on configuring the environment you're building in.

For quick iterations like spinning up a new Convex function, testing a query, or tweaking a mutation, that speed compounds. You're not waiting for anything; you're just working.

Where Claude Code Falls Short

Claude Code's terminal-based interface creates friction around discoverability, mouse interaction, screenshot handling, model switching, message queuing, and cost control, all areas where IDE-based tools currently have a meaningful advantage.

None of these are dealbreakers on their own, but together they add up to a noticeably rougher experience for certain workflows.

Terminal UI Limitations

The biggest adjustment is discoverability. In a GUI, features announce themselves through menus, buttons, and tooltips, but in Claude Code you need to read the docs or stumble onto commands, because there are no visual affordances telling you what's possible.

Mouse interaction is also minimal. Ctrl+O expands all tool calls at once, but you can't click into individual ones the way you would in a GUI-based agent. Text selection doesn't work as expected either: Cmd+A and Ctrl+A don't behave the way your muscle memory assumes, and pasting large text blocks has compaction and expansion UX issues that slow you down.

There's also no screenshot paste support, which is a real gap. In Cursor, you can drag and drop an image into the chat to show the agent what you're seeing, but in Claude Code, you can't. For front-end work where "this doesn't look right" is a common prompt, that missing capability adds friction every single time you need to communicate a visual problem.

For backend-heavy work like writing Convex functions, defining schemas, and wiring up integrations, the terminal-only interface is fine. The friction only shows up when you cross into the visual layer. If your project is mostly backend logic and data flow, you'll barely notice, but if you're doing significant UI work, you'll feel it.

Model Switching and Message Queuing

The /model command only works when the agent is idle, so you can't switch models mid-task. If the agent is working on something and you realize a cheaper model would be fine for the next step, you have to wait for the current operation to complete.

Message queuing is also unreliable, because messages sent while the agent is actively working may interrupt the current task rather than queue up for later. In Cursor, you can queue messages freely and use Alt+Enter to send immediately when you want to interrupt. The interaction model is more intuitive and more forgiving of the way developers actually think, which is often several steps ahead of what the agent is currently doing.

These feel like solvable problems, and I expect they'll improve in future releases, but right now they're friction points in a real workflow. When you're iterating quickly on a Convex backend (writing a function, testing it, adjusting the schema, testing again), the inability to fluidly switch between models or queue up your next instruction breaks the flow.

Token Usage and Cost

The total cost to build this app was $60, which breaks down to $20 for the Claude Code subscription plus $40 in additional API credits, using Opus 4.5 as the primary model.

Is that expensive? Compared to the hours of manual development time it replaced, no; compared to what the same project would likely cost in Cursor, probably yes. And the reason traces directly back to the model-switching friction described above.

In Cursor, it's easy to drop down to a cheaper, faster model for simple tasks like renaming variables, writing boilerplate, or fixing lint errors. In Claude Code, the difficulty of switching models mid-session means you end up running Opus on tasks that don't need it, which adds up quickly over the course of a full project build.

For developers who care about cost control, and most of us do, this is a meaningful consideration. The raw capability of Claude Code is impressive, but capability without cost efficiency limits who can realistically use it as a daily driver.
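The cost argument above comes down to simple arithmetic. The sketch below compares running every task on a large model against routing the simple tasks to a smaller one. Every number in it is a made-up relative cost unit chosen for illustration, not a real price or a measurement from this build, and `Task`, `COST_PER_TASK`, and `totalCost` are hypothetical names.

```typescript
type ModelTier = "large" | "small";

interface Task {
  name: string;
  needsLargeModel: boolean;
}

// Hypothetical relative cost units per task; NOT real pricing.
const COST_PER_TASK: Record<ModelTier, number> = { large: 5, small: 1 };

// Sum the cost of a task list under a routing rule that picks a tier per task.
function totalCost(tasks: Task[], route: (task: Task) => ModelTier): number {
  return tasks.reduce((sum, task) => sum + COST_PER_TASK[route(task)], 0);
}

// Illustrative task mix: two tasks that need the large model, three that do not.
const buildTasks: Task[] = [
  { name: "plan the data pipeline", needsLargeModel: true },
  { name: "design the schema", needsLargeModel: true },
  { name: "rename variables", needsLargeModel: false },
  { name: "write boilerplate", needsLargeModel: false },
  { name: "fix lint errors", needsLargeModel: false },
];

// Everything on the large model: 5 tasks x 5 units = 25 units.
const allLarge = totalCost(buildTasks, () => "large");

// Simple tasks routed to the small model: 2 x 5 + 3 x 1 = 13 units.
const routed = totalCost(buildTasks, (task) => (task.needsLargeModel ? "large" : "small"));
```

With these illustrative units, routing cuts the bill roughly in half. The real ratio depends on actual pricing and on how much of a project is simple work, but the shape of the saving is the same.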

How Did Convex Hold Up Under Agentic Development?

Claude Code wrote correct Convex backend code, including functions, queries, mutations, and CLI usage, without any rules files or templates guiding it. The TypeScript-native API and consistent patterns made Convex surprisingly agent-friendly out of the box.

This was one of the most interesting findings from the experiment. I deliberately did not use Convex's Cursor rules files or any of the built-in templates, because I wanted to see whether the agent could figure out Convex's patterns from its training data alone.

It managed to do exactly that. The agent correctly structured Convex queries and mutations, used ctx.db.query and ctx.db.insert appropriately, and called internal functions through the Convex CLI to interact with the server. It understood that Convex queries are reactive by default, meaning that when the underlying data changes, any component subscribed to that query automatically gets the new data. So it didn't try to build polling logic or manual refresh mechanisms. It just wrote the query and trusted the platform to handle the rest.
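To make "reactive by default" concrete, here is a toy model of a subscription in plain TypeScript. It is not how Convex is implemented and uses none of its API: `ReactiveTable`, `subscribe`, and `insert` are hypothetical names, and this version naively re-runs every subscribed query on every write. The point is only that fresh results are pushed to the subscriber, so no polling code is needed.

```typescript
type Listener<T> = (result: T) => void;

class ReactiveTable<Row> {
  private rows: Row[] = [];
  private subscriptions: Array<() => void> = [];

  // Register a query. It runs once immediately and again after every write.
  // Returns a function that cancels the subscription.
  subscribe<T>(query: (rows: readonly Row[]) => T, listener: Listener<T>): () => void {
    const run = () => listener(query(this.rows));
    this.subscriptions.push(run);
    run();
    return () => {
      this.subscriptions = this.subscriptions.filter((s) => s !== run);
    };
  }

  // A write notifies every active subscription with fresh query results.
  insert(row: Row): void {
    this.rows.push(row);
    for (const run of this.subscriptions) run();
  }
}

// Usage: the subscriber records the row count each time it is notified.
const reactiveVideos = new ReactiveTable<{ title: string }>();
const seenCounts: number[] = [];

const unsubscribe = reactiveVideos.subscribe(
  (rows) => rows.length,
  (count) => seenCounts.push(count),
);

reactiveVideos.insert({ title: "First" });
reactiveVideos.insert({ title: "Second" });
unsubscribe();
reactiveVideos.insert({ title: "Third" }); // no longer observed

// seenCounts is now [0, 1, 2]
```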

Why did this work so well? Because in Convex, the application and database types are automatically equivalent, and the entire data pipeline from frontend to database uses the same exact types and definitions. There's no adapter code between languages, no ORM translation layer, and no impedance mismatch between "how TypeScript works" and "how the backend works." For an AI agent that thinks in TypeScript, that's the ideal surface to work against. The agent didn't need to context-switch between languages or mental models; it just wrote TypeScript, and the backend understood it.
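The "same exact types" idea can be illustrated with a small sketch in which one schema object is the only source of truth and the document type is derived from it. This is an assumption-laden stand-in, not Convex's schema API: `FieldKind`, `DocOf`, `videoSchema`, and the functions below are hypothetical names chosen for the example.

```typescript
// Hypothetical field kinds standing in for a real schema validator library.
type FieldKind = "string" | "number" | "boolean";
type KindToType = { string: string; number: number; boolean: boolean };

// Derive the document type from the schema object: one source of truth.
type DocOf<S extends Record<string, FieldKind>> = {
  -readonly [K in keyof S]: KindToType[S[K]];
};

// Hypothetical "videos" table definition.
const videoSchema = {
  youtubeId: "string",
  title: "string",
  viewCount: "number",
} as const;

// Equivalent to { youtubeId: string; title: string; viewCount: number }
type Video = DocOf<typeof videoSchema>;

// "Backend": storage and reads are typed by the derived Video type.
const videoRows: Video[] = [];

function insertVideo(doc: Video): void {
  videoRows.push(doc);
}

function topVideos(minViews: number): Video[] {
  return videoRows.filter((video) => video.viewCount >= minViews);
}

// "Frontend": consumes the same type, so a renamed or retyped field
// fails at compile time instead of at runtime.
function renderCard(video: Video): string {
  return `${video.title} (${video.viewCount} views)`;
}

insertVideo({ youtubeId: "a1", title: "Intro to the project", viewCount: 120 });
insertVideo({ youtubeId: "b2", title: "Deep dive", viewCount: 15 });
// insertVideo({ youtubeId: "c3", title: "Oops", viewCount: "15" });
// ^ would not compile: a string is not assignable to a number field

const cards = topVideos(100).map(renderCard);
```

Changing `viewCount` to `"string"` in `videoSchema` would turn the `>=` comparison and the numeric inserts into compile errors in one pass, which is the kind of boundary checking the paragraph above describes.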

The Convex MCP plugin added another layer of reliability by letting the agent inspect the database state directly, which meant it could verify its own work without me acting as a middleman. When it wrote a mutation to store parsed YouTube data, it could then query the database to confirm the data landed correctly. That feedback loop of write, verify, adjust made the agentic workflow noticeably more reliable than it would have been with a backend that required separate tooling for inspection.

Where did the agent struggle? The same places most developers struggle when they're new to Convex: unit testing patterns, migration workflows, and optimistic updates. These are areas where the patterns aren't immediately obvious from the API surface alone, so while the agent could write a basic query or mutation without guidance, it needed more direction when it came to testing that mutation in isolation or handling a schema change gracefully.

This is where Convex-specific Claude skills could make a real difference. The foundation is strong, since the agent can already write correct Convex code from scratch, and the opportunity is in building agent-specific guidance for the non-obvious patterns: how to structure tests around Convex functions, how to think about migrations when your schema evolves, and how to implement optimistic updates that feel instant to the user. Those are the areas where a little structured knowledge would turn a capable agent into an expert one.
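As one example of the kind of non-obvious pattern meant here, the core idea behind an optimistic update can be sketched without any framework: show the change immediately, then replace it with the server's result or roll it back on failure. This is a generic illustration in plain TypeScript, not Convex's optimistic update API; `OptimisticList` and its methods are hypothetical names.

```typescript
class OptimisticList<T> {
  private confirmed: T[] = [];
  private pending = new Map<number, T>();
  private nextId = 0;

  // Show the item right away and return a handle for the in-flight write.
  addOptimistic(item: T): number {
    const id = this.nextId++;
    this.pending.set(id, item);
    return id;
  }

  // The server accepted the write: swap the local guess for the server's version.
  confirm(id: number, serverItem: T): void {
    if (this.pending.delete(id)) this.confirmed.push(serverItem);
  }

  // The server rejected the write: drop the local guess.
  rollback(id: number): void {
    this.pending.delete(id);
  }

  // What the UI renders: confirmed items followed by in-flight ones.
  view(): T[] {
    return this.confirmed.concat(Array.from(this.pending.values()));
  }
}

// Usage: two writes are shown instantly, then one succeeds and one fails.
const messages = new OptimisticList<string>();
const firstWrite = messages.addOptimistic("hello (sending)");
const secondWrite = messages.addOptimistic("world (sending)");
const whileInFlight = messages.view().join(", ");

messages.confirm(firstWrite, "hello");
messages.rollback(secondWrite);
const settled = messages.view().join(", ");
```

The part that tends to need guidance is the reconciliation step: the confirmed value comes from the server, not from the local guess, so any server-assigned fields replace whatever the UI showed while the write was in flight.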

If you want to try building your own project on Convex, the quickstart guide is the fastest way to get running. You'll have a reactive backend with TypeScript functions, a database, and file storage in minutes, and if you decide to point an AI agent at it, you'll find the surface remarkably easy for the agent to work with.

Claude Code vs. Cursor: Which AI Coding Tool Should You Use?

For greenfield projects where you want maximum agent autonomy and you're comfortable in the terminal, Claude Code is genuinely impressive. For most day-to-day development, especially when you need cost control, model flexibility, and a more intuitive interaction model, Cursor still has the edge.

That's the honest answer, not "it depends" in the hand-wavy sense, but a real trade-off that maps to how you actually work.

Here's how they compare across the dimensions that mattered most during this build:

Dimension          | Claude Code                           | Cursor
-------------------+---------------------------------------+----------------------------------
Interface          | Terminal CLI                          | Full IDE
Plan Mode          | Autonomous (agent-initiated)          | Manual (user-initiated)
Plugin System      | Excellent (MCP, skills, plugins)      | Good (extensions, rules files)
Model Switching    | Only when idle                        | Anytime, flexible
Cost Control       | Limited (hard to use cheaper models)  | Strong (easy model switching)
Screenshot Support | None                                  | Drag-and-drop image input
Message Queuing    | Unreliable mid-task                   | Reliable, with interrupt option
Startup Speed      | Instant                               | Requires IDE load
Agent Autonomy     | High                                  | Moderate

If you're building something from scratch (exploring a new idea, prototyping a product, or scaffolding a full-stack app on a platform like Convex), Claude Code is the more interesting tool right now, since its plugin system and autonomous planning give it a genuine edge for agentic workflows where you want the agent to take the lead.

If you're doing daily development on an existing codebase, switching between tasks frequently, or working on front-end code where visual feedback matters, Cursor's IDE integration and interaction model will save you time and money. The ability to switch models on the fly, queue messages reliably, and paste screenshots into the chat are practical advantages that compound over a full workday.

And here's the thing most comparison articles won't tell you: you don't have to pick one. I use both, and the best coding AI setup might well be Claude Code for greenfield exploration and Cursor for iteration and refinement. They're complementary tools, not competing religions.

What AI Coding Assistants vs. AI Coding Agents Actually Means

AI coding assistants (like autocomplete and inline chat features) suggest code within your existing workflow, whereas AI coding agents (like Claude Code's agentic mode or Cursor's Composer) autonomously plan, execute multi-step tasks, call tools, and modify multiple files. The difference is that agents don't just suggest things; they actually do them.

This distinction matters because the market is bifurcating and most developers are conflating the two categories. An autocomplete suggestion that fills in a function signature is a fundamentally different interaction than an agent that reads your schema, plans a data pipeline, writes the code across four files, runs the Convex CLI to deploy it, and then queries the database to verify the result.

When you're evaluating agentic coding tools, the questions that matter are about autonomy, tool usage, planning quality, and cost, whereas for assistants the questions are about latency, suggestion accuracy, and inline UX. They're different tools solving different problems, and comparing them on the same axis leads to confused decisions.

The reason this matters for your choice of backend is that agents perform differently depending on the surface they're working against. An agent writing raw SQL has to context-switch between TypeScript application logic and a completely different query language with different semantics, whereas an agent writing Convex functions stays in TypeScript the entire time. The types flow from the schema definition through the function signature to the frontend component, so there's no translation layer where the agent can introduce bugs.

That's not a theoretical advantage. During this build, the agent made zero type-related errors when writing Convex functions. Not a single one. The TypeScript compiler caught everything at the boundary, and because Convex functions are TypeScript, the agent's existing knowledge of the language was sufficient to write correct backend code. That's what "no impedance mismatch" looks like in practice.

What Agent-First Development Means for the Near Future

This experiment changed how I think about my own role. Not in a "developers are obsolete" way, but in a "the skill distribution is shifting" way.

Building this app from the terminal forced me to operate less as a coding expert and more as a software development expert. The difference is subtle but real: I spent less time writing code and more time evaluating code, defining architecture, specifying behavior, and catching the agent's mistakes before they compounded. The skill that mattered most wasn't TypeScript fluency but knowing what good software looks like and being able to articulate it clearly enough for an agent to execute.

That's the trajectory. The developers who thrive in an agent-first world won't be the fastest typists; they'll be the ones who can spec a feature precisely, evaluate generated code critically, and understand system architecture deeply enough to guide an agent through complex decisions.

Emerging patterns are already pointing in this direction. Spec-driven agentic development, where you write a detailed specification and let the agent implement it, is becoming a real workflow, and kanban-style coding, where you break work into discrete tasks and assign them to agents in parallel, is being explored by teams building on platforms where the backend architecture supports that kind of composability.

Convex is particularly well-suited to this pattern. Because each Convex function is a self-contained unit with clear inputs, outputs, and types, it maps naturally to a discrete task that an agent can own end-to-end. You can spec a mutation, hand it to the agent, and verify the result by querying the database, all without the agent needing to understand the entire codebase. The function boundary is the task boundary, and that's not an accident of design but what happens when your backend is built around composable primitives rather than monolithic query layers.

I wrote that existential crisis post because the shift felt disorienting, and after building this app, it still feels disorienting. But it also feels like the beginning of something genuinely better for developers who are willing to adapt: the backend gets out of your way, the agent handles the boilerplate, and you focus on the decisions that actually matter, like what to build, how it should behave, and why it matters to the people who'll use it.

If you want to talk about any of this, whether it's the agent workflow, the Convex integration, or just the general existential weirdness of watching an AI build your app, the Convex Discord is where a lot of these conversations are happening in real time.

Frequently Asked Questions

Q: Is Claude Code better than Cursor?

Claude Code excels at autonomous agentic tasks and plugin extensibility, but Cursor offers better UX, cost control, and model flexibility for most development workflows. The best choice depends on whether you prioritize agent autonomy (Claude Code) or interaction polish and cost efficiency (Cursor). Many developers use both: Claude Code for greenfield builds and Cursor for daily iteration.

Q: How much does Claude Code cost to build an app?

In this test, building a full-stack Convex app with Claude Code using Opus 4.5 cost $60 total: $20 for the subscription and $40 in additional API credits. Costs could be lower with more aggressive model switching, but Claude Code's current UX makes that difficult. The same project in Cursor would likely cost less due to easier access to cheaper models for simple tasks.

Q: Can AI coding tools write Convex backend code?

Yes. Claude Code wrote correct Convex functions, queries, and mutations without rules files or templates, and intelligently used the Convex CLI for server operations. Convex's TypeScript-native API makes it particularly agent-friendly because there's no impedance mismatch between the language the agent knows and the backend's API surface. The agent made zero type-related errors when writing Convex functions during this build.

Q: What's the difference between AI coding assistants and AI coding agents?

Assistants suggest code inline within your existing workflow: think autocomplete and inline chat. Agents autonomously plan, execute multi-step tasks, call external tools, and modify multiple files across a project. The distinction matters for evaluation: agents are judged on autonomy and planning quality, while assistants are judged on suggestion accuracy and latency. Claude Code and Cursor's Composer are agents, whereas GitHub Copilot's inline suggestions are an assistant.

Q: What is the best AI coding tool for TypeScript development?

For TypeScript full-stack development, both Cursor and Claude Code are strong choices. Cursor suits IDE-integrated workflows with frequent model switching and visual feedback, whereas Claude Code suits terminal-based agentic development with maximum autonomy and plugin extensibility. Both work well with TypeScript-native backends like Convex, where the type system flows from frontend to database without adapter code, giving the agent a consistent, predictable surface to write against.

Q: Why does the choice of backend matter for AI coding agents?

AI agents perform better when the backend API is consistent, typed, and uses the same language as the rest of the application. Backends that require context-switching between TypeScript and SQL, or between application code and ORM configuration, introduce more opportunities for the agent to make mistakes. Convex's TypeScript-native approach means the agent stays in one language and one mental model from frontend to database, which reduces errors and speeds up development.
