

Migrating data from Postgres to Convex

This guide will help you migrate data to Convex from Postgres on any provider (Supabase, Neon, RDS, etc.).
Exporting your Postgres data
Dumping to a file using psql
If you have relatively small amounts of data (less than a few gigabytes), you can likely just dump your data to a file from psql and import it into Convex.
First, use psql to connect to your Postgres instance. To dump the table `people`, run a command like the following. It will write one JSON object per row, in JSONL format.

```sql
\copy (SELECT row_to_json(people) FROM people) TO '/path/to/people.jsonl';
```
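Each line of the output file is one row serialized as JSON. For the example `people` table used later in this guide, a line would look something like this (the values are illustrative):

```json
{"id":"42","name":"Eric Blair","pseudonym":"George Orwell"}
```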
Adding it to Convex
If you already have a Convex project, simply cd into the directory of that project. If you don't have one yet, you can create a new project with:

```sh
npm create convex@latest
```
Import the data
Once you are in the project directory, you can import your JSONL file into your Convex deployment:

```sh
cd /my/convex/project
npx convex import --format jsonLines --replace --table people /path/to/people.jsonl
```

Take a look at your new data in the dashboard:

```sh
npx convex dashboard
```
Define your schema
You can add a Convex schema to your project by following the guide at https://docs.convex.dev/database/schemas#writing-schemas.
Tip: there is a button in the dashboard to generate this automatically based on your data.
It might end up looking like this:
```typescript
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  people: defineTable({
    id: v.string(),
    name: v.string(),
    pseudonym: v.string(),
  }),
});
```
Your favorite LLMs (e.g. Cursor) can be rather helpful here for generating a schema. You can dump the Postgres schema from psql with `\d people` and feed that to a prompt to generate the Convex schema.
Reconnect your table relationships
If you have a field referring to documents in other tables, the idiomatic Convex way to model it is a `v.id("people")` field. Note: this field can't be an arbitrary string; it must be the person's `_id` value.
You're welcome to continue using your other ID with an index on that field, but you will likely find benefit from using `v.id`, for instance runtime validation that ensures the IDs point to the correct tables.
Migrating relationships to Convex IDs generally looks like:

- Import the data with existing primary keys (e.g. an `id` field).
- Add an index on that field so you can look documents up by `id`.
- Update relationship fields (e.g. an `authorId: v.string()` field) to be `v.union(v.string(), v.id("people"))`. At this point your schema looks like:

  ```typescript
  import { defineSchema, defineTable } from "convex/server";
  import { v } from "convex/values";

  export default defineSchema({
    people: defineTable({
      id: v.string(),
      name: v.string(),
      pseudonym: v.string(),
    }).index("id", ["id"]),
    blogPosts: defineTable({
      authorId: v.union(v.string(), v.id("people")),
      content: v.string(),
    }),
  });
  ```

- Write a function to get the document by either ID:

  ```typescript
  import { QueryCtx } from "./_generated/server";
  import { Id } from "./_generated/dataModel";

  async function getPersonById(ctx: QueryCtx, id: string | Id<"people">) {
    // normalizeId returns null if the string isn't a valid ID for "people".
    const convexId = ctx.db.normalizeId("people", id);
    if (convexId !== null) {
      return ctx.db.get(convexId);
    } else {
      // Fall back to looking up the legacy Postgres ID via the index.
      return ctx.db
        .query("people")
        .withIndex("id", (q) => q.eq("id", id))
        .unique();
    }
  }
  ```

- Write a migration to walk the table and update the `authorId` references:

  ```typescript
  export const updateAuthorId = migrations.define({
    table: "blogPosts",
    migrateOne: async (ctx, doc) => {
      const author = await getPersonById(ctx, doc.authorId);
      if (!author) throw new Error("Can't find author");
      if (author._id !== doc.authorId) {
        await ctx.db.patch(doc._id, { authorId: author._id });
      }
    },
  });
  ```

  See the migration docs for setting up and running migrations; a minimal setup sketch also follows this list.

- Update your schema to use the `v.id` validator:

  ```typescript
  blogPosts: defineTable({
    authorId: v.id("people"),
    content: v.string(),
  }),
  ```

  Note: thanks to schema validation, this will only successfully deploy if every blog post has an `authorId` that is a valid Convex ID. Unless you disable schema validation, the deployed Convex schema is guaranteed to match the data at rest!
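The `migrations` object used above isn't defined in the snippet. Here is a minimal setup sketch, assuming the `@convex-dev/migrations` component; defer to the migration docs for the authoritative setup, including registering the component in `convex.config.ts`:

```typescript
// convex/migrations.ts — a minimal sketch, not the guide's exact setup.
import { Migrations } from "@convex-dev/migrations";
import { components } from "./_generated/api";
import { DataModel } from "./_generated/dataModel";

export const migrations = new Migrations<DataModel>(components.migrations);
// Expose a generic runner so migrations can be started from the CLI.
export const run = migrations.runner();
```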
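Once `authorId` is a real Convex ID, fetching a post's author becomes a direct, type-safe `ctx.db.get`. As a sketch of the payoff (the `postWithAuthor` query is illustrative, not part of the original guide):

```typescript
import { query } from "./_generated/server";
import { v } from "convex/values";

export const postWithAuthor = query({
  args: { postId: v.id("blogPosts") },
  handler: async (ctx, { postId }) => {
    const post = await ctx.db.get(postId);
    if (post === null) return null;
    // authorId is now typed as Id<"people">, so this lookup is checked
    // at both compile time and runtime.
    const author = await ctx.db.get(post.authorId);
    return { ...post, author };
  },
});
```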
Alternative: Streaming import using Airbyte
If you have a larger amount of data, or want to continuously stream data, you can try out the Airbyte integration. Check out more details at https://www.convex.dev/can-do/airbyte.