Ian Macartney

Migrating Data With Mutations

Implementing a migration with our "migration" helper

No one is perfect, and that applies to designing the schema of your app. As your understanding of the problem evolves, you will inevitably change your mind about the ideal way to store information.

In this post, we’ll look at strategies for migrating backend data. We’ll be working specifically with Convex, but the concepts are universal.

To learn about migrations at a high level and some best practices, see this primer.

To use the migration helpers I wrote, you can find the code here, and read the section below to see how they work.

Schema Migrations

One thing to call out explicitly: with Convex, you don’t have to write migration code like “add column” or “add index.” All you need to do is update your schema.ts file. Convex isn’t rigidly structured the way most SQL databases are. If you change a field from v.string() to v.union(v.string(), v.number()), Convex doesn’t have to reformat the data or table. You can even turn off schema validation and throw unstructured data into Convex and it will all just work.¹

With schema validation enabled, Convex will help your code and data stay in sync by only letting you push schemas that match the current data. To add a string field to an object, for instance, you’d probably push a schema where that field is v.optional(v.string()). Later, if there is a string on every object, Convex will let you push a schema that is just v.string() and will enforce that the field will always be set and be a string.

In this way, Convex gives you the ease of just defining your types declaratively, while also guaranteeing that they match the reality of the data at rest when you deploy your code and schema. It’s also worth mentioning that transitions from one schema definition and code version to the next are atomic, thanks to Convex coordinating both the functions and the database.

The rest of this post is about how you go about changing the underlying data.

Data Migrations using Mutations

To migrate data in Convex, you can use a mutation to transform your data. To make this easy, I’ve made some helper functions, linked here, that wrap up the patterns shown below.

If your table is small enough (let’s say a few thousand rows, as a guideline), you could just do it all in one mutation. For example:

import { internalMutation } from "./_generated/server";

export const doMigration = internalMutation(async ({ db }) => {
  const teams = await db.query("teams").collect();
  for (const team of teams) {
    // modify the team and write it back to the db here
  }
});
This would define the doMigration mutation, which you could run from the dashboard or via npx convex run. I made it an internalMutation so it wouldn’t be available on the public API. If you wanted to allow a client to call it publicly, you would probably want to add authentication and authorization, especially if it isn’t safe to run multiple times.

Let’s look at what it would look like to add a default value or delete a field.

Adding a new field with a default value

To add a field with some value to all teams, you might write:

if (!team.plan) {
  await db.patch(team._id, { plan: "basic" });
}

Note: this doesn’t have to be a static value. You could write the value based on other fields in the document, or whatever custom logic you like.
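For instance, here is a minimal plain-TypeScript sketch of deriving the value from the rest of the document; the memberCount field and the tier cutoff are hypothetical, purely for illustration:

```typescript
// Hypothetical: derive the default plan from another field on the
// document rather than using a constant.
type Team = { memberCount: number; plan?: "basic" | "pro" };

function defaultPlan(team: Team): "basic" | "pro" {
  return team.memberCount > 10 ? "pro" : "basic";
}

// Inside the migration loop you would then patch the derived value:
// await db.patch(team._id, { plan: defaultPlan(team) });
```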

As a reminder for those who skipped the primer, to do this correctly, you’d also want to update your code to start writing the default field value on new documents before running this mutation to avoid missing any documents.

Schema stages

If you’re using a schema and validation, for the above example, you’d likely update the team’s schema first to define “plan” as:

plan: v.optional(v.union(v.literal("basic"), v.literal("pro")))

Then, after all the fields have a value, you’d change it to:

plan: v.union(v.literal("basic"), v.literal("pro"))

Convex won’t let you deploy a schema that doesn’t conform to the data unless you turn off schema validation. As a result, you can safely trust that the TypeScript types inferred from your schema match the actual data.

Deleting a field

If you’re sure you want to get rid of data, you could do the following:

import { internalMutation } from "./_generated/server";

export const removeBoolean = internalMutation(async ({ db }) => {
  const teams = await db.query("teams").collect();
  for (const team of teams) {
    if (team.isPro !== undefined) {
      delete team.isPro;
      await db.replace(team._id, team);
    }
  }
});

As mentioned in the migration primer, I advise deprecating fields over deleting them when real user data is involved.

Big tables

For larger tables, reading the whole table in a single mutation becomes infeasible, since a transaction can only read a limited amount of data. Even with smaller tables, if there are a lot of active writes happening to the table, you might want to break the work into smaller chunks to avoid conflicts. Convex will automatically retry failed mutations up to a limit, and mutations don’t block queries, but it’s still best to avoid scenarios that make conflicts likely.

There are a few ways you could break up the work, but I’d recommend using our pagination functionality. Each mutation operates on one batch of documents and records how far it got, so the next worker can efficiently pick up the next batch. One nice benefit is that you can track your progress: if the migration fails on some batch, you can note the cursor that batch started with and restart the migration there. Thanks to Convex’s transactional guarantees, either all of a batch’s writes or none of them will have committed. A mutation that works with a page of data might look like this:

import { internalMutation } from "./_generated/server";

export const myMigrationBatch = internalMutation(
  async ({ db }, { cursor, numItems }) => {
    const data = await db.query("mytable").paginate({ cursor, numItems });
    const { page, isDone, continueCursor } = data;
    for (const doc of page) {
      // modify doc
    }
    return { cursor: continueCursor, isDone };
  }
);
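The Convex snippet above won’t run outside a Convex deployment, so here is the same cursor-and-batch idea as a self-contained plain-TypeScript sketch: an in-memory array stands in for the table, and numeric offsets stand in for Convex’s opaque pagination cursors.

```typescript
// Each call processes one page starting at `cursor` and returns a
// cursor the next call can resume from, mirroring paginate().
type Doc = { id: number; plan?: string };

function migrateBatch(
  table: Doc[],
  cursor: number | null,
  numItems: number
): { cursor: number; isDone: boolean } {
  const start = cursor ?? 0;
  const page = table.slice(start, start + numItems);
  for (const doc of page) {
    if (doc.plan === undefined) doc.plan = "basic"; // the migration itself
  }
  const continueCursor = start + page.length;
  return { cursor: continueCursor, isDone: continueCursor >= table.length };
}
```

Because the returned cursor is all the state a batch needs, a run that fails partway can be resumed by passing back the last cursor that succeeded.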

Running a batch from the dashboard

To try out your migration, run it on one chunk of data by going to the functions panel on the dashboard and clicking “Run function.” To run from the beginning of the table, you’d pass as an argument:

{ cursor: null, numItems: 1 }

It would then run and return the next cursor (and print it to the console so you can look it up later if you lose track of it). To run the next batch, just pass the returned cursor string instead of null.

You could keep running it from here, but it might start to feel tedious. Once you have confidence in the code and batch size, you can start running the rest. You can even pass in the cursor you got from testing on the dashboard to skip the documents you’ve already processed!

Looping from an action

To iterate through chunks, you can call it from an action in a loop:

import { internalAction } from "./_generated/server";

export const runMigration = internalAction(
  async ({ runMutation }, { name, cursor, batchSize }) => {
    let isDone = false;
    while (!isDone) {
      const args = { cursor, numItems: batchSize };
      ({ isDone, cursor } = await runMutation(name, args));
    }
  }
);

You can then go to the dashboard page for the runMigration function and test-run it with the arguments { name: "myMigrationBatch", cursor: null, batchSize: 1 }.

Here "myMigrationBatch" is whatever your mutation’s path is, e.g. if it’s in the file convex/migrations/someMigration.js, it would be "migrations/someMigration:myMigrationBatch".

To use the CLI, you could run:

npx convex run migrations:runMigration '{"name": "myMigrationBatch", "cursor": null, "batchSize": 1}'

It is also possible to loop from a client, such as the ConvexHttpClient, if you make it a public mutation. Recursively scheduling the mutation to run is left as an exercise for the reader.
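The driver loop above boils down to a simple state machine: call a batch, feed its cursor into the next call, stop at isDone. Here is that loop as a self-contained plain-TypeScript sketch, with a stub standing in for runMutation (in the real action, each call is awaited); all names are illustrative, not the Convex API:

```typescript
// Repeatedly invoke a batch function until it reports completion,
// threading the cursor from one call into the next.
type BatchResult = { cursor: number | null; isDone: boolean };

function runToCompletion(
  runBatch: (cursor: number | null) => BatchResult
): number {
  let cursor: number | null = null;
  let isDone = false;
  let batches = 0;
  while (!isDone) {
    ({ cursor, isDone } = runBatch(cursor)); // the real action awaits here
    batches++;
  }
  return batches; // how many batches it took
}
```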

An aside on serial vs. parallelizing

You might be wondering whether we should be doing all of this in parallel. I’d urge you to start doing it serially, and only add parallelization gradually if it’s actually too slow. As a general principle with backend systems, avoid sending big bursts of traffic when possible. Even without causing explicit failures, it could affect latencies for user requests if you flood the database with too much traffic at once. This is a different mindset from an analytics database where you’d optimize for throughput. I think you’ll be surprised how fast a serial approach works in most cases.

The migrations helper library

To make this easy, I’ve made some helper functions you can use in your project that wrap up some of the patterns above, so you only have to write the code relevant to updating documents. You can find the code here:

The migration function wrapper

It allows you to write a migration that looks like:

import { migration } from "./lib/migrations";

export const myMigration = migration({
  table: "mytable",
  batchSize: 100, // optional, defaults to 100
  migrateDoc: async ({ db }, doc) => {
    // change document
  },
});

You can then test this mutation from the dashboard with the parameters:

{ numItems: 1 }

Or starting from some cursor with:

{ cursor: "..." }

And to test running the code but not committing the batch:

{ dryRun: true }

For those curious, check out the repo to see the code.
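The { dryRun: true } option hints at a neat transactional trick: do the migration work, then throw before the transaction commits, so the database rolls the writes back. I believe that's roughly how the helper implements it, but treat this as an assumption; here is the idea in self-contained plain TypeScript (no Convex runtime, illustrative names):

```typescript
// Sketch: a "transaction" applies changes to a copy and only commits
// (returns the modified copy) if the callback doesn't throw.
type Doc = { plan?: string };

function runTransaction(docs: Doc[], fn: (docs: Doc[]) => void): Doc[] {
  const copy = docs.map((d) => ({ ...d }));
  try {
    fn(copy);
    return copy; // commit
  } catch {
    return docs; // rolled back: original data untouched
  }
}

// A dry run performs the migration, then throws to force a rollback.
const dryRun = true;
const result = runTransaction([{}, {}], (docs) => {
  for (const d of docs) d.plan = "basic";
  if (dryRun) throw new Error("dry run: roll back");
});
```

This lets you exercise the real migration code against real data without committing anything.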

The runMigration action

To run the mutation over the whole table in batches, use the runMigration action in the "convex/lib/migrations" module, passing the name of a function you’ve wrapped with the above migration wrapper. You can call it from the dashboard with:

{ name: "path/to:myMigration" }

As in previous sections, you can optionally pass it a cursor to start from where another run left off. You can also override the batch size with a batchSize parameter. To test a migration, call your migration directly, as in the migration function wrapper section above.


In this post, we looked at a strategy for migrating data in Convex using mutation functions. As with other posts, the magic is in composing helper functions and leveraging the fact that you get to write JavaScript or TypeScript rather than divining the right SQL incantation. The code for the helpers is on GitHub here, and if you have any questions don’t hesitate to reach out in Discord.


  1. Technically, there are some restrictions on Convex values, such as array lengths and object key names, that you can read about here.
