
# Migrating data from Postgres to Convex

This guide will help you migrate data to Convex from Postgres on any provider (Supabase, Neon, RDS, etc.).
## Exporting your Postgres data
### Dumping to a file using psql
If you have a relatively small amount of data (less than a few gigabytes), you can likely just dump your data to a file from psql and import it into Convex.

First, use psql to connect to your Postgres instance. To dump the table `people`, run a command like this one, which writes the data in JSONL format (one JSON object per line):

```sql
\copy ( SELECT row_to_json(people) FROM people ) TO '/path/to/people.jsonl';
```
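If you'd rather script the export than run it from psql (say, to reshape rows on the way out), a rough TypeScript equivalent using the node-postgres (`pg`) client might look like the sketch below; the connection string and output path are placeholders:

```ts
// A minimal sketch of the same export using node-postgres ("pg").
// Assumes: npm install pg, and DATABASE_URL pointing at your instance.
import { writeFileSync } from "node:fs";
import { Client } from "pg";

async function exportPeople() {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  // row_to_json serializes each row the same way the psql \copy above does.
  const result = await client.query(
    "SELECT row_to_json(people) AS data FROM people",
  );
  const jsonl = result.rows.map((r) => JSON.stringify(r.data)).join("\n");
  writeFileSync("/path/to/people.jsonl", jsonl + "\n");
  await client.end();
}

exportPeople().catch(console.error);
```

For large tables you'd want to stream rows with a cursor rather than buffering everything in memory, but the shape of the output is the same.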
## Adding it to Convex
If you already have a Convex project, simply `cd` into the directory of that project. If you don't have one yet, you can make a new project with:

```sh
npm create convex@latest
```
### Import the data
Once you are in the project directory, you can import your JSONL file into your Convex deployment:

```sh
cd /my/convex/project
npx convex import --format jsonl --replace --table people /path/to/people.jsonl
```
Take a look at your new data in the dashboard:

```sh
npx convex dashboard
```
### Define your schema
You can add a Convex schema to your project by following the guide at https://docs.convex.dev/database/schemas#writing-schemas.

Tip: there is a button in the dashboard to generate this automatically based on your data.

It might end up looking like this:
```ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  people: defineTable({
    id: v.string(),
    name: v.string(),
    pseudonym: v.string(),
  }),
});
```
Your favorite LLMs (e.g. Cursor) can be rather helpful here for generating a schema. You can dump the Postgres schema from psql with `\d people` and feed that to a prompt to generate the Convex schema.
## Reconnect your table relationships
If you have a field referring to documents in other tables, the idiomatic Convex way to model it is with `v.id("people")` as the field's validator. Note: this field can't be an arbitrary string; it must be the `person._id` value. You're welcome to continue using your other ID with an index on that field, but you will likely find benefit from using `v.id`, for instance runtime validators that ensure the IDs point to the correct tables.
An example schema using `v.id` for a relationship might look like this (using the `people` and `blogPosts` tables from the rest of this guide):
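```ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  people: defineTable({
    name: v.string(),
    pseudonym: v.string(),
  }),
  blogPosts: defineTable({
    // Must be the _id of a document in the `people` table.
    authorId: v.id("people"),
    content: v.string(),
  }),
});
```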
Migrating relationships to Convex IDs generally looks like this:

1. Import the data with existing primary keys (e.g. an `id` field).

2. Add an index on that field to look documents up by `id`.

3. Update relationship fields (e.g. an `authorId: v.string()` field) to be `v.union(v.string(), v.id("people"))`. At this point your schema looks like:

   ```ts
   import { defineSchema, defineTable } from "convex/server";
   import { v } from "convex/values";

   export default defineSchema({
     people: defineTable({
       id: v.string(),
       name: v.string(),
       pseudonym: v.string(),
     }).index("id", ["id"]),
     blogPosts: defineTable({
       authorId: v.union(v.string(), v.id("people")),
       content: v.string(),
     }),
   });
   ```
4. Write a function to get the document by either ID:

   ```ts
   import { QueryCtx } from "./_generated/server";
   import { Id } from "./_generated/dataModel";

   async function getPersonById(ctx: QueryCtx, id: string | Id<"people">) {
     // normalizeId returns null if the string isn't a Convex ID for `people`.
     const convexId = ctx.db.normalizeId("people", id);
     if (convexId !== null) {
       return ctx.db.get(convexId);
     } else {
       // Fall back to looking up the legacy Postgres primary key by index.
       return ctx.db
         .query("people")
         .withIndex("id", (q) => q.eq("id", id))
         .unique();
     }
   }
   ```
5. Write a migration to walk the table and update the `authorId` references:

   ```ts
   export const updateAuthorId = migrations.define({
     table: "blogPosts",
     migrateOne: async (ctx, doc) => {
       const author = await getPersonById(ctx, doc.authorId);
       if (!author) throw new Error("Can't find author");
       if (author._id !== doc.authorId) {
         await ctx.db.patch(doc._id, { authorId: author._id });
       }
     },
   });
   ```

   See the migration docs for setting up and running migrations; a minimal setup sketch appears after this list.
6. Update your schema to use the `v.id` validator:

   ```ts
   blogPosts: defineTable({
     authorId: v.id("people"),
     content: v.string(),
   }),
   ```

Note: thanks to schema validation, this will only successfully deploy if all blog posts have an `authorId` that is a valid Convex ID. Unless you disable schema validation, the deployed Convex schema is guaranteed to match the data at rest!
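The `migrations` object used in step 5 comes from a migrations component. As a minimal sketch, assuming you use the `@convex-dev/migrations` component, the setup might look roughly like this:

```ts
// convex/migrations.ts — a sketch assuming the @convex-dev/migrations
// component is installed and registered in convex/convex.config.ts.
import { Migrations } from "@convex-dev/migrations";
import { components } from "./_generated/api";
import { DataModel } from "./_generated/dataModel";

export const migrations = new Migrations<DataModel>(components.migrations);

// A general-purpose runner, invocable from the CLI, e.g.:
//   npx convex run migrations:run '{"fn": "migrations:updateAuthorId"}'
export const run = migrations.runner();
```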
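With the `v.id` validator in place, fetching related documents becomes a direct `ctx.db.get`. Here's a quick illustrative sketch (this `getPostWithAuthor` query is hypothetical, not part of the migration itself):

```ts
import { query } from "./_generated/server";
import { v } from "convex/values";

export const getPostWithAuthor = query({
  args: { postId: v.id("blogPosts") },
  handler: async (ctx, { postId }) => {
    const post = await ctx.db.get(postId);
    if (!post) return null;
    // authorId is validated to point at the `people` table, so this
    // get() can never fetch a document from the wrong table.
    const author = await ctx.db.get(post.authorId);
    return { ...post, author };
  },
});
```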
## Alternative: Streaming import using Airbyte

If you have a larger amount of data, or want to continuously stream data, you can try out the Airbyte integration. Check out more details at https://www.convex.dev/can-do/airbyte.
Convex is the backend platform with everything you need to build your full-stack AI project. Cloud functions, a database, file storage, scheduling, workflow, vector search, and realtime updates fit together seamlessly.