Mike Cann
5 days ago

Code Spelunking: Uncovering Convex's API Generation Secrets

I'm the newest member of the Convex team, and as a newcomer, I know I have a lot to learn. One feature that has fascinated me is Convex's automatic API creation, a powerful capability that is one of the reasons why development with Convex feels so fast.

For example, I can write a function like this on the server:

// In "/convex/foo/myQueries.ts"

export const listMessages = query({
  args: {
    fromUserId: v.id("users"),
  },
  handler: (ctx, args) => {
    ...
  },
});

Then on the client I can access this (from React for simplicity) using a really clean, type-safe interface:

export const UserMessages: React.FC<Props> = ({ fromUserId }) => {
  const messages = useQuery(api.foo.myQueries.listMessages, { fromUserId });
  ...
}

I have attempted to write something like this in the past but never did a great job, so I'm super curious to see how the team has managed to pull this off.

So join me as I go spelunking through the Convex codebase and work out how this magic works!

Method

Before diving in, I want to mention that there are several ways to approach this kind of investigation.

You could start with the documentation, explore the codebase directly, search online, or ask the system's creators themselves.

I considered taking a shortcut by asking an AI agent like Cursor or GitHub Copilot. While that might get me quick answers, I realized this is one of those cases where the journey matters more than the destination. The small insights you gather along the way often prove invaluable when tackling future challenges.

Sometimes asking your all-knowing friend for the answer isn't the best way to learn!

So I decided to go back to first principles and follow the code's breadcrumbs to see where they lead.

Getting Started

Let's start with what we know: when you run the convex dev command in the terminal, a process generates files in the convex/_generated directory.

https://github.com/a16z-infra/ai-town/blob/e66da914f0418a202098b01a16b3d5a38cac2997/convex/_generated/api.d.ts

There are a few files here, but the one I'm most interested in is api.d.ts, as it's the one that seems to be where all that automatic type-safe API magic is coming from.

So I guess a sensible place to start would be to check out the convex dev command. Let's start by looking at the npm package for the convex CLI command and seeing where it points to.

From the referenced source code we can see that the package.json lists “bin”, which is how Node knows what to do when you run a command provided by a package.

  ...
  "bin": {
    "convex": "bin/main-dev",
    "convex-bundled": "bin/main.js"
  },
  ...

If we look in the bin directory in the repo and open the main-dev file, we can see that it's a simple bash script that does something slightly different depending on whether it's running on macOS/Linux or Windows.

#!/bin/bash
# Run the Convex CLI directly from source code.

if [ "$(uname)" == "Darwin" ] || [ "$(expr substr $(uname -s) 1 5)" == "Linux" ]; then
  SCRIPTDIR="$(echo "$0" | python3 -c 'import os; print(os.path.dirname(os.path.realpath(input())))')"
  CONVEX_RUNNING_LIVE_IN_MONOREPO=1 "exec" "$SCRIPTDIR/../node_modules/.bin/tsx" "$SCRIPTDIR/../src/cli/index.ts" "$@"
else # it's probably Windows
  # This doesn't follow symlinks quite as correctly as the Mac/Linux solution above
  CONVEXDIR="$(dirname "$(dirname "$0")")"
  CONVEX_RUNNING_LIVE_IN_MONOREPO=1 "exec" "$CONVEXDIR/node_modules/.bin/tsx" "$CONVEXDIR/src/cli/index.ts" "$@"
fi

It seems like no matter the platform, you are always going to be running the code defined in the /src/cli/index.ts file, so let's look there next.

...
const program = new Command();
  program
    .name("convex")
    .usage("<command> [options]")
    .description("Start developing with Convex by running `npx convex dev`.")
    .addCommand(login, { hidden: true })
    .addCommand(init, { hidden: true })
    .addCommand(reinit, { hidden: true })
    .addCommand(dev) // <-- this is the one we are looking for
    .addCommand(deploy)
    .addCommand(deployments, { hidden: true })
    .addCommand(run)
    .addCommand(convexImport)
    .addCommand(dashboard)
    ....

As expected this file lists the main “program” and is using the excellent Commander library to help with the CLI.

Let's just confirm this by typing convex --help and noting the output.

Cool, let's continue exploring the dev command specifically, which is handily defined in dev.ts.

Right at the bottom of the file we have what looks like the main part of the command:

  promises.push(
      watchAndPush(
        ctx,
        {
          ...credentials,
          verbose: !!cmdOptions.verbose,
          dryRun: false,
          typecheck: cmdOptions.typecheck,
          typecheckComponents: !!cmdOptions.typecheckComponents,
          debug: false,
          debugBundlePath: cmdOptions.debugBundlePath,
          codegen: cmdOptions.codegen === "enable",
          liveComponentSources: !!cmdOptions.liveComponentSources,
        },
        cmdOptions,
      ),
    );

As a side note, GitHub's reference sidebar has been invaluable in helping me navigate the codebase and track function references.

So digging into the watchAndPush function, it looks like it contains the main infinite “dev” loop:

 ...
 while (true) {    
    const start = performance.now();
    tableNameTriggeringRetry = null;
    shouldRetryOnDeploymentEnvVarChange = false;
    const ctx = new WatchContext(cmdOptions.traceEvents);
    showSpinner(ctx, "Preparing Convex functions...");
    try {
      await runPush(ctx, options);
      const end = performance.now();
    ...

That "Preparing Convex functions..." is a good sign that we are on the right track as this is what you see when you make a change to a Convex file right before the codegen does its thing.

The runPush function looks to be the next port of call on our journey.

We are now in components.ts and this function:

export async function runPush(ctx: Context, options: PushOptions) {
  const { configPath, projectConfig } = await readProjectConfig(ctx);
  const convexDir = functionsDir(configPath, projectConfig);
  const componentRootPath = await findComponentRootPath(ctx, convexDir);
  if (ctx.fs.exists(componentRootPath)) {
    await runComponentsPush(ctx, options, configPath, projectConfig);
  } else {
    await runNonComponentsPush(ctx, options, configPath, projectConfig);
  }
}

The word "components" in the code is likely referring to Convex's new Components system. This appears to be an abstraction layer to handle codebases with and without component support.

Rather than diving into the Pull Request history of this file, let's stay focused on our main investigation.

For now, let's follow the runNonComponentsPush code path and see where it takes us.

 ...
 if (!options.codegen) {
    logMessage(
      ctx,
      chalk.gray("Skipping codegen. Remove --codegen=disable to enable."),
    );
    // Codegen includes typechecking, so if we're skipping it, run the type
    // check manually on the query and mutation functions
    const funcDir = functionsDir(configPath, projectConfig);
    await typeCheckFunctionsInMode(ctx, options.typecheck, funcDir);
  } else {
    await doCodegen(
      ctx,
      functionsDir(configPath, projectConfig),
      options.typecheck,
      options,
    );
    if (verbose) {
      logMessage(ctx, chalk.green("Codegen finished."));
    }
  }
  ...

Hmm, this if block looks promising. As a side note, it's interesting that you can turn off codegen. I'm not exactly sure why you would want to, but it's there regardless 🤷

The doCodegen function sounds like what we are after, so let's explore it a bit more.

export async function doCodegen(
  ctx: Context,
  functionsDir: string,
  typeCheckMode: TypeCheckMode,
  opts?: { dryRun?: boolean; generateCommonJSApi?: boolean; debug?: boolean },
) {
  const { projectConfig } = await readProjectConfig(ctx);
  const codegenDir = await prepareForCodegen(ctx, functionsDir, opts);

  await withTmpDir(async (tmpDir) => {
    // Write files in dependency order so a watching dev server doesn't
    // see inconsistent results where a file we write imports from a
    // file that doesn't exist yet. We'll collect all the paths we write
    // and then delete any remaining paths at the end.
    const writtenFiles = [];

    // First, `dataModel.d.ts` imports from the developer's `schema.js` file.
    const schemaFiles = await doDataModelCodegen(
      ctx,
      tmpDir,
      functionsDir,
      codegenDir,
      opts,
    );
    writtenFiles.push(...schemaFiles);

    // Next, the `server.d.ts` file imports from `dataModel.d.ts`.
    const serverFiles = await doServerCodegen(ctx, tmpDir, codegenDir, opts);
    writtenFiles.push(...serverFiles);

    // The `api.d.ts` file imports from the developer's modules, which then
    // import from `server.d.ts`. Note that there's a cycle here, since the
    // developer's modules could also import from the `api.{js,d.ts}` files.
    const apiFiles = await doApiCodegen(
      ctx,
      tmpDir,
      functionsDir,
      codegenDir,
      opts?.generateCommonJSApi || projectConfig.generateCommonJSApi,
      opts,
    );
    writtenFiles.push(...apiFiles);

    // Cleanup any files that weren't written in this run.
    for (const file of ctx.fs.listDir(codegenDir)) {
      if (!writtenFiles.includes(file.name)) {
        recursivelyDelete(ctx, path.join(codegenDir, file.name), opts);
      }
    }

    // Generated code is updated, typecheck the query and mutation functions.
    await typeCheckFunctionsInMode(ctx, typeCheckMode, functionsDir);
  });
}

Let's examine this function in detail since it has some interesting components and helpful comments that make it easy to follow.

The await withTmpDir(async (tmpDir) => helper is particularly elegant—it provides a clean way to handle temporary files that get automatically cleaned up after use. While there's a small risk that temporary files might remain if the CLI crashes during codegen, the operating system should eventually handle cleanup.

The comment about writing files in dependency order is intriguing. Though I'd like to explore this concept further, let's bookmark it for now and move forward.
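As a quick aside, the "write in dependency order, then delete whatever was not written" pattern can be sketched in isolation. This is only an illustration under stated assumptions: the `FakeDir` type, the `CodegenStep` type and `runCodegenSteps` are hypothetical names, and a plain in-memory object stands in for the real `_generated` directory.

```typescript
// Hypothetical in-memory stand-in for the `_generated` directory:
// file name -> file contents.
type FakeDir = Record<string, string>;

// Each step writes its files and reports their names, loosely mirroring
// how the do*Codegen helpers above return the files they wrote.
type CodegenStep = (dir: FakeDir) => string[];

function runCodegenSteps(dir: FakeDir, steps: CodegenStep[]): string[] {
  const written: string[] = [];
  // Steps run in the order given, so a file is only written after the
  // files it imports from already exist.
  for (const step of steps) {
    written.push(...step(dir));
  }
  // Anything still in the directory that this run did not write is stale.
  for (const name of Object.keys(dir)) {
    if (written.indexOf(name) === -1) {
      delete dir[name];
    }
  }
  return written;
}

const generatedDir: FakeDir = { "oldFile.d.ts": "// left over from a previous run" };
const writtenNames = runCodegenSteps(generatedDir, [
  (dir) => { dir["dataModel.d.ts"] = "// data model types"; return ["dataModel.d.ts"]; },
  (dir) => { dir["server.d.ts"] = "// imports dataModel"; return ["server.d.ts"]; },
  (dir) => { dir["api.d.ts"] = "// imports user modules"; return ["api.d.ts"]; },
]);
```

After the run, `generatedDir` holds only the three files that were written in this pass, and the stale `oldFile.d.ts` is gone.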

Among several possible paths to explore, I'm particularly interested in the generation of the api.d.ts file, so let's investigate the doApiCodegen function next.

  ...
  const absModulePaths = await entryPoints(ctx, functionsDir);
  const modulePaths = absModulePaths.map((p) => path.relative(functionsDir, p));

  const apiContent = apiCodegen(modulePaths);
  await writeFormattedFile(
    ctx,
    tmpDir,
    apiContent.JS,
    "typescript",
    path.join(codegenDir, "api.js"),
    opts,
  );
  ...

Most of this function seems to deal with actually writing the api files out to disk. We may return to this but for now I'm interested in how it works out which files and functions it should export out to the api.d.ts file.

It looks like entryPoints might be a good place to head next, as the name seems to suggest it's responsible for finding the “entry points” into the API.

export async function entryPoints(
  ctx: Context,
  dir: string,
): Promise<string[]> {
  const entryPoints = [];

  for (const { isDir, path: fpath, depth } of walkDir(ctx.fs, dir)) {
    if (isDir) {
      continue;
    }
    const relPath = path.relative(dir, fpath);
    const parsedPath = path.parse(fpath);
    const base = parsedPath.base;
    const extension = parsedPath.ext.toLowerCase();
    ...

So the first thing I see here is the walkDir function, which appears to be a recursive directory walker. This makes sense, as Convex functions could be nested arbitrarily deep within directories, so you need a way to recursively iterate over this tree structure.
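To make the idea of a recursive walker concrete, here is a minimal sketch. It is not the real walkDir (I haven't shown its source here): the `TreeNode` type and `walkTree` function are hypothetical, and an in-memory tree replaces the file system. The entries only borrow the `isDir`, `path` and `depth` field names from the loop above.

```typescript
// Hypothetical in-memory tree; the real walkDir reads from ctx.fs instead.
type TreeNode = { name: string; children?: TreeNode[] };

// Same field names as the entries destructured in entryPoints above.
type WalkEntry = { isDir: boolean; path: string; depth: number };

function walkTree(node: TreeNode, prefix = "", depth = 0): WalkEntry[] {
  const path = prefix === "" ? node.name : prefix + "/" + node.name;
  const entries: WalkEntry[] = [
    { isDir: node.children !== undefined, path, depth },
  ];
  // Recurse into each child, one level deeper.
  for (const child of node.children ?? []) {
    entries.push(...walkTree(child, path, depth + 1));
  }
  return entries;
}

const convexDirTree: TreeNode = {
  name: "convex",
  children: [
    { name: "messages.ts" },
    { name: "foo", children: [{ name: "myQueries.ts" }] },
  ],
};

// Skip directories, just like the `if (isDir) continue;` check above.
const filePaths = walkTree(convexDirTree)
  .filter((entry) => !entry.isDir)
  .map((entry) => entry.path);
// filePaths is ["convex/messages.ts", "convex/foo/myQueries.ts"]
```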

Hmmm. It seems like most of the ~100 line entryPoints function deals with logging. There is a little bit at the end that deals with excluding TypeScript files that don't contain export or import in them:

 // If using TypeScript, require that at least one line starts with `export` or `import`,
  // a TypeScript requirement. This prevents confusing type errors described in CX-5067.
  const nonEmptyEntryPoints = entryPoints.filter((fpath) => {
    // This check only makes sense for TypeScript files
    if (!fpath.endsWith(".ts") && !fpath.endsWith(".tsx")) {
      return true;
    }
    const contents = ctx.fs.readUtf8File(fpath);
    if (/^\s{0,100}(import|export)/m.test(contents)) {
      return true;
    }
    ...

This is interesting, but it's not what I was expecting to see. I was expecting some kind of AST parser that works out whether the given file contains Convex functions or not, and if it does, includes it in the API.

Instead, what I'm seeing is that as long as a file includes import or export, and it's not a _deps, _generated, or HTTP router file, it's considered to be a file that has an “entry point” in it.
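Here is a small sketch of just the TypeScript check, reusing the regular expression from the snippet above. The `isCandidateEntryPoint` helper is a hypothetical name, and it deliberately ignores the other exclusions (such as _generated files) that the real function handles.

```typescript
// Same pattern as in entryPoints: some line must start with optional
// whitespace followed by `import` or `export`.
const looksLikeModule = /^\s{0,100}(import|export)/m;

// Hypothetical helper: only the TypeScript check, nothing else.
function isCandidateEntryPoint(fileName: string, contents: string): boolean {
  // The check only applies to .ts and .tsx files; other files pass through.
  const isTypeScript = /\.tsx?$/.test(fileName);
  return !isTypeScript || looksLikeModule.test(contents);
}

const plainJs = isCandidateEntryPoint("helpers.js", "const x = 1;"); // true
const scriptTs = isCandidateEntryPoint("notes.ts", "const x = 1;"); // false
const moduleTs = isCandidateEntryPoint(
  "messages.ts",
  "// list messages\nexport const a = 1;",
); // true
```

Note that nothing here looks for `query` or `mutation`: a TypeScript file that exports a plain helper passes just as easily as one that exports a Convex function.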

So what's going on here? How does this code work?

export const UserMessages: React.FC<Props> = ({ fromUserId }) => {
  const messages = useQuery(api.foo.myQueries.listMessages, { fromUserId });
  ...
}

How do TypeScript and the JS runtime know that there is a callable Convex function at api.foo.myQueries.listMessages?

I think at this point it is probably a good idea to take a step back and reassess how I thought API generation works.

TypeScript secrets

Let's take a deeper look at an actual api.d.ts file. Using the one from the excellent AI Town project as an example, we see:

declare const fullApi: ApiFromModules<{
  "agent/conversation": typeof agent_conversation;
  "agent/embeddingsCache": typeof agent_embeddingsCache;
  "agent/memory": typeof agent_memory;
  "aiTown/agent": typeof aiTown_agent;
  "aiTown/agentDescription": typeof aiTown_agentDescription;
  "aiTown/agentInputs": typeof aiTown_agentInputs;
  "aiTown/agentOperations": typeof aiTown_agentOperations;
  "aiTown/conversation": typeof aiTown_conversation;
  "aiTown/conversationMembership": typeof aiTown_conversationMembership;
  "aiTown/game": typeof aiTown_game;
  "aiTown/ids": typeof aiTown_ids;
  "aiTown/inputHandler": typeof aiTown_inputHandler;
  "aiTown/inputs": typeof aiTown_inputs;
  "aiTown/insertInput": typeof aiTown_insertInput;
  "aiTown/location": typeof aiTown_location;
  "aiTown/main": typeof aiTown_main;
  "aiTown/movement": typeof aiTown_movement;
  "aiTown/player": typeof aiTown_player;
  "aiTown/playerDescription": typeof aiTown_playerDescription;
  "aiTown/world": typeof aiTown_world;
  "aiTown/worldMap": typeof aiTown_worldMap;
  constants: typeof constants;
  crons: typeof crons;
  "engine/abstractGame": typeof engine_abstractGame;
  "engine/historicalObject": typeof engine_historicalObject;
  http: typeof http;
  init: typeof init;
  messages: typeof messages;
  music: typeof music;
  testing: typeof testing;
  "util/FastIntegerCompression": typeof util_FastIntegerCompression;
  "util/assertNever": typeof util_assertNever;
  "util/asyncMap": typeof util_asyncMap;
  "util/compression": typeof util_compression;
  "util/geometry": typeof util_geometry;
  "util/isSimpleObject": typeof util_isSimpleObject;
  "util/llm": typeof util_llm;
  "util/minheap": typeof util_minheap;
  "util/object": typeof util_object;
  "util/sleep": typeof util_sleep;
  "util/types": typeof util_types;
  "util/xxhash": typeof util_xxhash;
  world: typeof world;
}>;
export declare const api: FilterApi<
  typeof fullApi,
  FunctionReference<any, "public">
>;
export declare const internal: FilterApi<
  typeof fullApi,
  FunctionReference<any, "internal">
>;

Here we can see all TypeScript modules are combined into one large object type. Interestingly, even modules like util/assertNever are included in this API type, despite containing just a single helper function:

// From https://www.typescriptlang.org/docs/handbook/unions-and-intersections.html#union-exhaustiveness-checking
export function assertNever(x: never): never {
  throw new Error(`Unexpected object: ${JSON.stringify(x)}`);
}

Aha! The magic that determines which files should be included in the client-accessible API happens at the TypeScript level, not during codegen!
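Put differently, the generator can stay very simple. The sketch below is my guess at the shape of that step, not the actual apiCodegen implementation: `moduleIdentifier` and `sketchFullApiDeclaration` are hypothetical names, the input paths are assumed to have their extensions already stripped, and the `/` to `_` mapping is inferred from the AI Town file above (other characters are not handled).

```typescript
// Assumption: identifiers come from replacing "/" with "_", as the
// AI Town api.d.ts suggests ("agent/memory" -> agent_memory).
function moduleIdentifier(modulePath: string): string {
  return modulePath.split("/").join("_");
}

// Builds only the `fullApi` declaration. Every module is listed,
// whether or not it contains a Convex function.
function sketchFullApiDeclaration(modulePaths: string[]): string {
  const lines = modulePaths.map(
    (p) => "  " + JSON.stringify(p) + ": typeof " + moduleIdentifier(p) + ";",
  );
  return ["declare const fullApi: ApiFromModules<{", ...lines, "}>;"].join("\n");
}

const declaration = sketchFullApiDeclaration([
  "agent/conversation",
  "util/assertNever",
  "world",
]);
```

In this sketch every key is quoted, whereas the real file leaves simple names like `world` unquoted. Either way, the generator makes no decisions about what belongs in the API; the types do.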

This is fascinating. Let's verify this by examining the ApiFromModules type that fullApi uses.


/**
 * Given the types of all modules in the `convex/` directory, construct the type
 * of `api`.
 *
 * `api` is a utility for constructing {@link FunctionReference}s.
 *
 * @typeParam AllModules - A type mapping module paths (like `"dir/myModule"`) to
 * the types of the modules.
 * @public
 */
export type ApiFromModules<AllModules extends Record<string, object>> =
  FilterApi<
    ApiFromModulesAllowEmptyNodes<AllModules>,
    FunctionReference<any, any, any, any>
  >;

Ah, seeing that comment above the type makes things clearer.

Let's dig one type deeper and examine that FilterApi type:

/**
 * @public
 *
 * Filter a Convex deployment api object for functions which meet criteria,
 * for example all public queries.
 */
export type FilterApi<API, Predicate> = Expand<{
  [mod in keyof API as API[mod] extends Predicate
    ? mod
    : API[mod] extends FunctionReference<any, any, any, any>
      ? never
      : FilterApi<API[mod], Predicate> extends Record<string, never>
        ? never
        : mod]: API[mod] extends Predicate
    ? API[mod]
    : FilterApi<API[mod], Predicate>;
}>;

Oof, that is one mind-bending recursive conditional type. After a bit of consultation with my buddy ChatGPT, I can inform you that what it's doing is creating a nice type that reflects our API perfectly. It's going to exclude module exports that aren't Convex functions, and in turn exclude modules that contain no function references.
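To see the filtering idea in isolation, here is a cut-down sketch. It is a simplification, not Convex's actual types: `FunctionRef` is a hypothetical stand-in for FunctionReference that only tracks visibility, `SimpleFilter` drops the Expand wrapper, and the example API is invented for illustration.

```typescript
// Hypothetical stand-in for FunctionReference; only visibility matters here.
type FunctionRef<Visibility extends "public" | "internal"> = {
  readonly kind: "function";
  readonly visibility: Visibility;
};

// Keep keys whose value matches Predicate, recurse into nested modules,
// and drop any branch that ends up empty after filtering.
type SimpleFilter<API, Predicate> = {
  [K in keyof API as API[K] extends Predicate
    ? K
    : API[K] extends FunctionRef<any>
      ? never
      : SimpleFilter<API[K], Predicate> extends Record<string, never>
        ? never
        : K]: API[K] extends Predicate
    ? API[K]
    : SimpleFilter<API[K], Predicate>;
};

// Invented example API with one public and two internal functions.
type ExampleApi = {
  messages: { list: FunctionRef<"public">; purge: FunctionRef<"internal"> };
  util: { cleanup: FunctionRef<"internal"> };
};

type PublicApi = SimpleFilter<ExampleApi, FunctionRef<"public">>;

// `messages.list` survives the filter, so this annotation type-checks.
const listRef: PublicApi["messages"]["list"] = {
  kind: "function",
  visibility: "public",
};

// `util` has no public functions, so the whole module is filtered out.
const utilSurvives: "util" extends keyof PublicApi ? true : false = false;
```

The `as` clause in the mapped type is what removes keys: mapping a key to `never` drops it from the result.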

But wait a minute, if this is all just magical TypeScript type stuff, how are we able to write a chained object at runtime on the client and NOT have it throw a runtime error?

In search of magic

To explain what I mean here, open any webpage in Chrome, open the console (F12), and type this:

const api = {};
console.log(api.foo.myQueries.listMessages)

You will be greeted by a lovely "Cannot read properties of undefined" error.

This makes sense, right? Even though JS is quite permissive, it's not permissive enough to let you access properties of an object that has yet to be defined.

So back to the original question: how on earth does Convex give us that lovely experience where we can write a function reference like api.foo.myQueries.listMessages, if it's not the codegen in the CLI that is doing it?

That api object must be something special 🤔 Let's take a closer look.

Going back to the AI Town example we can see that api.js in the convex/_generated directory defines the api object as anyApi from the “convex/server” package:

import { anyApi } from "convex/server";

/**
 * A utility for referencing Convex functions in your app's API.
 *
 * Usage:
 * ```js
 * const myFunctionReference = api.myModule.myFunction;
 * ```
 */
export const api = anyApi;
export const internal = anyApi;

If we follow through to where this is defined in the api.ts in the “get-convex/convex-js” repo we find this export:

/**
 * A utility for constructing {@link FunctionReference}s in projects that
 * are not using code generation.
 *
 * You can create a reference to a function like:
 * ```js
 * const reference = anyApi.myModule.myFunction;
 * ```
 *
 * This supports accessing any path regardless of what directories and modules
 * are in your project. All function references are typed as
 * {@link AnyFunctionReference}.
 *
 *
 * If you're using code generation, use `api` from `convex/_generated/api`
 * instead. It will be more type-safe and produce better auto-complete
 * in your editor.
 *
 * @public
 */
export const anyApi: AnyApi = createApi() as any;

My Spidey Sense is tingling. I feel like we are getting close.

So what does createApi do?

/**
 * Create a runtime API object that implements {@link AnyApi}.
 *
 * This allows accessing any path regardless of what directories, modules,
 * or functions are defined.
 *
 * @param pathParts - The path to the current node in the API.
 * @returns An {@link AnyApi}
 * @public
 */
function createApi(pathParts: string[] = []): AnyApi {
  const handler: ProxyHandler<object> = {
    get(_, prop: string | symbol) {
      if (typeof prop === "string") {
        const newParts = [...pathParts, prop];
        return createApi(newParts);
      } else if (prop === functionName) {
        if (pathParts.length < 2) {
          const found = ["api", ...pathParts].join(".");
          throw new Error(
            `API path is expected to be of the form \`api.moduleName.functionName\`. Found: \`${found}\``,
          );
        }
        const path = pathParts.slice(0, -1).join("/");
        const exportName = pathParts[pathParts.length - 1];
        if (exportName === "default") {
          return path;
        } else {
          return path + ":" + exportName;
        }
      } else if (prop === Symbol.toStringTag) {
        return "FunctionReference";
      } else {
        return undefined;
      }
    },
  };

  return new Proxy({}, handler);
}

I knew it! All JS magic ultimately ends with Proxy objects, JavaScript's most magical of magical APIs.

If you aren't familiar with Proxies in JS, they let you intercept operations on an object, be it a get, a set, a function call, or a bunch of other things. It looks like you are using a normal JS object, but no, you are being tricked, much like being told the world is round when it is quite clearly a torus.

https://imgur.com/gallery/flat-earth-fan-club-AoXqdYG

So what the createApi function does is create a Proxy object that converts the "dot path" into a string when you use it.

For example, api.something.count gets transformed into "something:count". Note that the leading api is not part of the resulting string.
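A stripped-down sketch of the same trick makes the path building easier to see. This is not the library code: `makeRef` and `pathKey` are hypothetical names (`pathKey` stands in for the `functionName` symbol used by createApi), and the error handling and `default` export case are left out.

```typescript
// Hypothetical symbol standing in for `functionName` in createApi.
const pathKey = Symbol("pathKey");

function makeRef(parts: string[] = []): any {
  return new Proxy(
    {},
    {
      get(_target, prop) {
        if (typeof prop === "string") {
          // Every string property access returns a new proxy that
          // remembers one more path segment.
          return makeRef([...parts, prop]);
        }
        if (prop === pathKey) {
          // Module segments are joined with "/", the export name follows ":".
          const modulePath = parts.slice(0, -1).join("/");
          const exportName = parts[parts.length - 1];
          return modulePath + ":" + exportName;
        }
        return undefined;
      },
    },
  );
}

const demoApi = makeRef();
const nestedName: string = demoApi.foo.myQueries.listMessages[pathKey];
// nestedName is "foo/myQueries:listMessages"
const flatName: string = demoApi.something.count[pathKey];
// flatName is "something:count"
```

No property is ever actually defined on the object, which is why any path works at runtime and why the generated types are the only thing keeping you honest.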

This makes perfect sense when you think about it: the client needs to convert these function references into API calls to send to the server, so it needs to use the same format as Convex's REST API.

In fact the documentation has a section where it explains exactly this:

Client libraries in languages other than JavaScript and TypeScript use strings instead of API objects:

  • api.myFunctions.myQuery is "myFunctions:myQuery"
  • api.foo.myQueries.myQuery is "foo/myQueries:myQuery".
  • api.myFunction.default is "myFunction:default" or "myFunction".

Super cool!

This is also the syntax you would use if you were going to run a function from the CLI, for example: npx convex run something:count

Summary

So let's summarize what happens when we run convex dev:

  1. Codegen begins by identifying all "entry points". Every entry point gets exported, regardless of whether it contains a Convex function.
  2. The ApiFromModules and FilterApi types, along with their sub-types, handle the filtering. They remove any exported modules and functions that shouldn't be in the API, leaving us with a clean record type that maps the path to our Convex functions.
  3. The api object receives this type, but at runtime it's actually a Proxy object. This enables the client library to convert paths like api.foo.myQueries.myQuery into strings like "foo/myQueries:myQuery".

Where to from here?

Now that we have a better understanding of how things work under the covers, where can we go from here? Here are some ideas I've been thinking about:

  1. Exclusion list - It would be cool if we could specify a list of paths in the Convex config that should be excluded from the generated API. This would improve TypeScript performance and allow users to maintain an older API on the server while gently discouraging clients from using it, since the TypeScript type wouldn't include the excluded functions.
  2. Custom TypeScript plugin - did you know you could write plugins for the TypeScript compiler? I know right, super cool. I can imagine a plugin that leverages all this learning to help with function refactors.
  3. Function redirects - This idea is similar to the exclusion list, but instead of totally excluding a function, we could redirect it or have other functions redirect to it. This would help with API migrations for long-lived projects.

Since this post is getting quite long, these ideas will have to wait for next time. Stay tuned, and send me a message on the Convex Discord to let me know which one you think I should tackle first!
