Ian Macartney
2 years ago

Using Dall-E from Convex

A chat app with images generated by OpenAI

This is part of a series of posts about building a multiplayer game on Convex. In this post, we’ll explore using Dall-E via the OpenAI API. The code is here if you want to play around with it. We will make a chat app where you can type in an image description as a slash command, and it will generate and insert that image into the chat. On the server, it will:

  1. Configure an OpenAI client.
  2. Check that the prompt is not offensive using their moderation API.
  3. Get the short-lived image URL from Dall-E.
  4. Download the image.
  5. Upload the image into Convex File Storage, getting a storage ID.
  6. Insert the storage ID into a new message.
  7. When serving the message to the client, we will generate a URL based on the storage ID, which they can put in an <img> tag directly.

This is now possible using File Storage and a Convex Action, a serverless function that can have side effects¹, comparable to an AWS Lambda function. Read below for how to use OpenAI with Convex, or check out the code here.

Using OpenAI in Convex

Set up Convex

Create a Convex app by cloning the demo and running npm i; npm run dev. If you want a fresh project, see the Tutorial.

If you’re writing your app from scratch, create a file in the convex/ folder. Mine is convex/actions/sendDallE.js.

Configuring OpenAI

  1. To use OpenAI, you need an API account. Sign up here.

  2. Generate an API key here by clicking “Create new secret key” and copying the text.

  3. Store the API key in Convex’s Environment Variables, where you can keep secrets. Go to your project’s dashboard by running npx convex dashboard in your project directory, or find it on dashboard.convex.dev. Open the project settings on the left and add the key OPENAI_API_KEY with your secret as the value. I’d recommend saving it in both your Production and Development deployments unless you want the published version of your app to use a different key; there is a toggle to switch between them.

  4. Configure the client in your action. If you get type errors, npm install openai:

    "use node";
    import { Configuration, OpenAIApi } from "openai";
    const configuration = new Configuration({
      apiKey: process.env.OPENAI_API_KEY,
    });
    const openai = new OpenAIApi(configuration);
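
If the environment variable was never set on the current deployment, the client above will be configured with an undefined key and every request will fail. A small guard (an assumed helper, not part of the demo) fails fast with a clearer message instead:

```javascript
// Assumed helper (not in the demo): read a required environment variable,
// throwing a descriptive error if it was never set on this deployment.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(
      `Missing ${name}. Set it in the Convex dashboard under Settings > Environment Variables.`
    );
  }
  return value;
}

// Usage: const apiKey = requireEnv("OPENAI_API_KEY");
```

Remember to set the variable on both your Development and Production deployments, or the guard will trip when you deploy.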
    

Using the Moderation API

The moderation API is useful for checking whether a prompt is obscene or will be rejected by OpenAI. For our purposes, we’ll check if it flags the message:

const modResponse = await openai.createModeration({
  input: prompt,
});
const modResult = modResponse.data.results[0];
if (modResult.flagged) {
  throw new Error(
    `Your prompt was flagged: ${JSON.stringify(modResult.categories)}`
  );
}

The categories object holds a boolean per category, and the accompanying category_scores object has a numeric score per category, which you can use for more fine-grained moderation. Consult the moderation documentation for more details.
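
For example, instead of relying only on the flagged boolean, you could reject a prompt when any category score crosses a threshold of your choosing. This is a hypothetical helper, not part of the demo, and the 0.4 threshold is an arbitrary illustration:

```javascript
// Hypothetical helper: reject a prompt when any moderation score crosses a
// threshold. `category_scores` maps each category name to a number in [0, 1].
function isTooRisky(modResult, threshold = 0.4) {
  return Object.values(modResult.category_scores).some(
    (score) => score > threshold
  );
}

// Mock moderation result for illustration:
const mockResult = {
  flagged: false,
  categories: { hate: false, violence: false },
  category_scores: { hate: 0.01, violence: 0.55 },
};
console.log(isTooRisky(mockResult)); // true: violence exceeds 0.4
```
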

Generating an image with Dall-E

You can consult the image generation documentation for more details, but for our purposes, the code is:

const openaiResponse = await openai.createImage({
  prompt,
  n: 1,
  size: "256x256",
});
const dallEImageUrl = openaiResponse.data.data[0].url;

Now we have a URL to the image. However, this URL only lasts for an hour. If we want to be able to see this image in the chat longer term, we need to store it.

Downloading the image

To download the image, we use the node-fetch NPM library.

const imageResponse = await fetch(dallEImageUrl);
const image = await imageResponse.blob();

Now we have the image in memory and can store it in Convex.
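
The snippet above doesn’t check the HTTP status before reading the body. A defensive variant (an assumption on my part; the demo may handle errors elsewhere) could look like this, using the same fetch API:

```javascript
// Defensive download helper (assumed; the demo calls fetch directly):
// throws on a non-2xx status instead of silently storing an error page.
async function downloadBlob(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Image download failed with HTTP ${response.status}`);
  }
  return response.blob();
}
```
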

Storing the image in Convex

Storing the image in Convex is simple:

const storageId = await ctx.storage.store(image);

where ctx is the first parameter to the action function:

// in sendDallE.js
export default action(async (ctx, { prompt, author }) => {
  ...
});

Store the storage ID in the database

The storage ID returned from ctx.storage.store is now a permanent identifier for this data. You can store it in your database however you’d like. In our example, we store it in a new chat message.

// in sendDallE.js action
await ctx.runMutation(api.messages.send, {
  body: storageId,
  author,
  format: "dall-e",
});

And the mutation that stores it, for completeness:

// in messages.js
export const send = mutation(async (ctx, { body, author, format }) => {
  // A bit of a hack; we are storing the storage ID in the "body"
  await ctx.db.insert("messages", { body, author, format });
});

One thing to note is that this mutation can’t live in the same file as the openai action. Actions can run in a Node environment (with the "use node"; string at the top of the file), whereas queries and mutations run in our optimized runtime (which makes them wicked fast). You can read more about that here. I put mine in convex/messages.js.

Serving the image to the client

In Convex, clients subscribe to queries. For instance, in our frontend React component, we can do:

const messages = useQuery(api.messages.list) || [];
...
{messages.map((message) => (
  <Message
    author={message.author}
    body={message.body}
    format={message.format}
    key={message._id.toString()}
  />
))}

That query maps to a serverless function: the list export of convex/messages.js. To return the image URL in messages we return to clients, we generate a URL from the storage ID. For our app, we will change each storage ID into a URL. In convex/messages.js:

export const list = query(async (ctx) => {
  const messages = await ctx.db.query("messages").collect();
  for (const message of messages) {
    if (message.format === "dall-e") {
      // Replace the storage ID with a URL in the "body"
      message.body = await ctx.storage.getUrl(message.body);
    }
  }
  return messages;
});

This URL is stable across requests, so you don’t have to cache it on the client. The HTTP response also carries cache headers, so the browser handles client-side image caching for you as well.

Performance

One thing I discovered while building this is how slow the OpenAI API calls can be. In my testing, the moderation call took 500ms, and the image generation could take 5+ seconds, sometimes taking over 60s and timing out the request. To make this a good user experience, adding a loading indicator is important. On the client, I used this code:

// In React jsx:
const [sending, setSending] = useState(false);
const sendDallE = useAction(api.actions.sendDallE.default);
...
setSending(true);
try {
  await sendDallE({ prompt, author: name });
} finally {
  setSending(false);
}
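
The post mentions typing an image description as a slash command, but how the prompt is extracted from the message text isn’t shown above. A minimal, hypothetical parser (the demo’s exact parsing may differ) might look like:

```javascript
// Hypothetical slash-command parser: returns the Dall-E prompt for messages
// like "/dall-e a cat in space", and null for ordinary chat messages.
function parseDallECommand(text) {
  const match = text.match(/^\/dall-e\s+(.+)$/);
  return match ? match[1].trim() : null;
}

console.log(parseDallECommand("/dall-e a cat in space")); // "a cat in space"
console.log(parseDallECommand("hello world")); // null
```

On submit, the client would call sendDallE when the parser returns a prompt and the normal send-message path otherwise.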

Summary

In this post, we used Convex to fetch an image from OpenAI’s image generation service based on a user-provided prompt. Read about the game I've developed on Convex using Dall-E APIs here for more tips, or read the code to see how I added error handling and more.

Footnotes

  1. This differs from other Convex functions like query and mutation, which do not have side effects, allowing them to provide features like automatic retries, reactive updates, serializable transaction isolation, and more! Read more about it here.
