Ian Macartney
a year ago

Moderating ChatGPT Content: Full-Stack

Using the createModeration API to moderate content

Trying to build an app on top of OpenAI’s DALL·E or ChatGPT? Avoid violating the OpenAI usage policies or showing inappropriate content to your users by leveraging their Moderation API. In this post, we’ll look at how to use it to flag messages before sending them to ChatGPT. This builds on the Building a full-stack ChatGPT app post, but the approach is generally applicable.

New to Convex? Convex is a backend application platform: we make it easy to build apps where you need to run code on the server and store data in a database. We run your functions, host your data, and much more. It’s a great fit for working with APIs like OpenAI’s, since it lets you build a whole application around powerful AI models without running any servers yourself.

Without further ado, let’s look at how to use OpenAI’s moderation endpoints.

Initializing the API:

The core piece of code is in the convex/openai.js file. Initializing the API client takes less than a millisecond, so we can do it on each request.

import { Configuration, OpenAIApi } from "openai";

const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) {
  // `fail` is a helper from the chat app that surfaces the error to the user.
  await fail(
    "Add your OPENAI_API_KEY as an env variable in the " +
      "[dashboard](https://dashboard.convex.dev)"
  );
}
const configuration = new Configuration({ apiKey });
const openai = new OpenAIApi(configuration);
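
For context, this code runs inside a Convex action that handles each chat message. Here’s a rough skeleton; the action and argument names below are assumptions for illustration, not verbatim from the chat app post:

"use node"; // The openai npm package needs Convex's Node runtime.
import { internalAction } from "./_generated/server";
import { internal } from "./_generated/api";

// Hypothetical skeleton: moderation and the ChatGPT call live in one action.
export const chat = internalAction(
  async ({ runMutation }, { body, userMsgId, botMsgId }) => {
    // 1. Initialize the OpenAI client (the code above).
    // 2. Call createModeration and bail if the input is flagged (next section).
    // 3. Otherwise, call createChatCompletion and patch the bot's message.
  }
);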

Using the moderation API:

// Check if the message is offensive.
const modResponse = await openai.createModeration({
  input: body,
});
const modResult = modResponse.data.results[0];
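
The API returns one result per input. For reference, modResult has roughly this shape (values illustrative; see OpenAI’s moderation docs for the full category list):

// Example shape of modResult:
// {
//   flagged: true,
//   categories: { hate: false, violence: true, "self-harm": false, ... },
//   category_scores: { hate: 0.0001, violence: 0.98, "self-harm": 0.002, ... },
// }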

Updating the user’s message if the message was flagged:

if (modResult.flagged) {
  await runMutation(internal.messages.update, {
    messageId: userMsgId, 
    patch: {
      error:
        "Your message was flagged: " +
        Object.entries(modResult.categories)
          .filter(([, flagged]) => flagged)
          .map(([category]) => category)
          .join(", "),
    },
  });
  return;
}

As a reminder, this calls a mutation in convex/messages.js that patches the message, adding an error field:

import { internalMutation } from "./_generated/server";

export const update = internalMutation(async ({ db }, { messageId, patch }) => {
  await db.patch(messageId, patch);
});

Why not check beforehand?

It’s a good point that we could moderate the message before inserting it into the database in the first place. In my case, I wanted to show results immediately and commit the intent to send a message; if I waited for moderation, it would take an extra 500ms+ before the message was added and appeared in the UI. For moderating identity creation (see this post on adding identities), I moderate first for simplicity and return an error when the user tries to create a lousy identity. If it felt too slow, I could add a spinner while the identity is being created. One nice thing about Convex mutations and actions is that you can await their results in the UI to know when they succeed or fail, even if you don’t care about their return value. For instance:

setLoading(true);
setError(null);
const errorMsg = await addIdentity(
  newIdentityName,
  newIdentityInstructions
);
if (errorMsg) setError(errorMsg);
setLoading(false);
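
On the server, the corresponding action moderates first and returns an error string on failure. Here’s a minimal sketch of what that could look like; the addIdentity name matches the call above, but the argument shape and the internal.identities.insert mutation are assumptions for illustration:

import { action } from "./_generated/server";
import { internal } from "./_generated/api";

export const addIdentity = action(
  async ({ runMutation }, { name, instructions }) => {
    // Assumes the OpenAI client is initialized as shown earlier.
    const modResponse = await openai.createModeration({
      input: name + ": " + instructions,
    });
    const modResult = modResponse.data.results[0];
    if (modResult.flagged) {
      // Returning a string lets the awaiting UI code display the error.
      return (
        "Your identity was flagged: " +
        Object.entries(modResult.categories)
          .filter(([, flagged]) => flagged)
          .map(([category]) => category)
          .join(", ")
      );
    }
    // Hypothetical internal mutation that inserts into an "identities" table.
    await runMutation(internal.identities.insert, { name, instructions });
    // Returning nothing signals success to the client.
  }
);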

Refreshing the client’s copy of messages with the new field:

This happens automatically! One of the cool things about Convex is that queries are reactive by default. Here’s how it works:

  1. We originally fetched a list of the 100 most recent messages in the UI with a query called list in convex/messages.js, which runs in Convex’s cloud (sketched after this list). This result is cached automatically, thanks to guarantees provided by Convex’s deterministic runtime.
  2. When we sent a new message, we inserted a message into the “messages” table.
  3. Because the list query was for the most recent 100 messages, the results were automatically invalidated and re-executed. The new results get pushed to clients over a WebSocket, triggering a React render for the component(s) that called useQuery(api.messages.list) with the new messages.
  4. The same thing happens when we update a message to add an error field. As if by magic, the message with the error field shows up in the client, and we can render it differently.
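
To make that concrete, here’s roughly what the pieces look like. This is a sketch: the query really is called list in convex/messages.js, but its exact body and the component are illustrative.

// convex/messages.js (sketch):
import { query } from "./_generated/server";

export const list = query(async ({ db }) => {
  // Reactive by default: re-runs whenever the data it reads changes.
  return await db.query("messages").order("desc").take(100);
});

// React component (sketch):
import { useQuery } from "convex/react";
import { api } from "../convex/_generated/api";

function Messages() {
  // New results are pushed over the WebSocket and trigger a re-render.
  const messages = useQuery(api.messages.list) ?? [];
  return (
    <ul>
      {messages.map((message) => (
        <li key={message._id}>{message.body}</li>
      ))}
    </ul>
  );
}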

Rendering flagged messages in the UI:

To avoid showing flagged messages, we can do some standard React gating:

<span style={{ whiteSpace: "pre-wrap" }}>
  {message.error ? "⚠️ " + message.error : message.body ?? "..."}
</span>

Here we show the error with ⚠️ prepended if there is one, otherwise the body if there is one, otherwise “…” for messages we haven’t yet updated with the bot’s response.

Filtering flagged messages out of future bot input:

We avoided sending the message to ChatGPT when we initially detected bad input. Still, when we send our next set of messages to the API, we don’t want to include the bad input in historical messages either. To do that, we can adjust our filter in messages:send:

const messages = await db
  .query("messages")
  .filter((q) => q.eq(q.field("error"), undefined))
  .filter((q) => q.neq(q.field("body"), undefined))
  .take(21); // 10 pairs of prompt/response and our most recent message.

We skip any messages with an “error” field, as well as any without a body, for instance bot messages that were never updated (such as when we bailed due to flagged input).
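
As a side note, chained .filter calls combine with AND, so the same condition can be written as a single filter using q.and:

const messages = await db
  .query("messages")
  .filter((q) =>
    q.and(
      q.eq(q.field("error"), undefined),
      q.neq(q.field("body"), undefined)
    )
  )
  .take(21);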

Summary

In this post, we looked at how to moderate what content you send to ChatGPT, from the UI to the API calls. I hope it was helpful. Let us know in Discord what you think!
