2 years ago

Implementing Upvoting 4 Ways on Convex

Most of the time when I’m building a new feature for a project, and I need to decide on how to model the data, I’ll come up with a couple possible options, try to brainstorm any pros/cons with them just by thinking about it, and then pretty quickly just go with one and build it. If I discover as I’m building that it’s not working like I thought, sometimes I switch gears to another idea or modify as I go. But ultimately, some potential data models that I reject at the idea phase, I never actually end up exploring.

For this sample app, I decided to do something different: actually build every idea I could come up with. I wanted to learn about the tradeoffs of different data models to support the same feature, and to actually learn by doing — to build the code for a data model even if I wouldn’t think, at the outset, that it would be the “best” idea. So let me tell you what I learned about the tradeoffs of different data modeling patterns, by building using strategies that wouldn’t otherwise have seen the light of day!

The feature I’m building is a voting feature! Now of course to vote at all, you need topics to vote on… why not Quotes?

Screenshot of an app presenting three quotes with different numbers of upvotes

My voting sample app is pretty bare-bones … the topics we’re voting on are basically just strings. But voting as a feature could be applicable to practically any other use case. Maybe in a different use case the “vote” goes by the name “like”, “favorite”, “star”, etc and the things you’re voting on could be blog posts, photos, recipes… whatever other data types your app has!

I’m building this app on Convex, because it makes it really easy to start iterating on a database structure, without having to set up a server and database myself. But the options I’m exploring for how to model the various types of data and their relationship to each other could apply to any database system. And the tradeoffs, advantages and disadvantages of the different choices are pretty generally applicable, too.

Basic Voting using counters

To get started, and get a basic front-end in place, I used the most basic data model I could think of: we have a table of topics, which in this case we’re calling quotes , and each quote contains a counter for the number of votes that quote has received. I started from the Convex tutorial for a single counter, and didn’t even have to change it that much to make it store multiple counters, one for each quote we’re voting on.

Here’s my database schema (by the way, the code for the whole project is here if you want to explore it yourself!)

// schema.ts
quotes: defineTable({
    text: s.string(),
    votes: s.number(),
  }),

Getting all the quotes with their vote counts was just a matter of querying the quotes table

// getQuotes.ts convex query
export default query(async ({ db }) => {
  const quotesDocs = await db.query('quotes').collect();
  console.log(`Got ${quotesDocs.length} quotes`);
  return quotesDocs.sort((a, b) => b.votes - a.votes);
});

And to vote for a quote, we just need to increase its counter by one

// voteForQuote.ts convex mutation
export default mutation(
  async ({ db }, quoteId: Id<'quotes'>, increment: number) => {
    const quoteDoc = await db.get(quoteId);
    if (quoteDoc === null) {
      throw 'How did you vote for a quote that did not exist?.';
    } else {
      await db.patch(quoteDoc._id, { votes: quoteDoc.votes + increment });
      console.log(`This quote now has ${quoteDoc.votes + increment} votes.`);
    }
  }
);

That’s it! A voting feature, all done! Right?

Of course not, this article has just begun :p

The problem with counter-based voting

Clicking a button to vote for a topic is the bare minimum functionality for a voting app. What if I wanted to support features like:

Ensure each user can only vote once for any given topic.
Show users which topics they’ve already voted for.
Or for that matter, show users what topics another user has voted for.
Show all the users who have voted for a particular topic.

My counters data model just doesn’t contain enough info to enable these features — by just incrementing a counter when users vote, I’m losing all kinds of information about who voted for what. This data model is going to make it basically impossible to support any of these features!

To keep track of which users have voted for which topics, we need to structure our database to model that relationship. We’ve talked about keeping track of relationships between different types of data on Convex, for example relating each chat message to the user who sent it. But here there’s a key difference: users can vote for any number of topics (or none!), and topics can get votes from any number of users (or none!). This is a many-to-many relationship.

Here’s my brainstorm of possible table designs that could keep track of this many-to-many relationship:

On each topic, store a list of all users who have voted for that topic.
On each user, store a list of the topics they’ve voted for.
Create an entire new votes table, separate from either users or topics. Each record in the votes table is a single user’s vote for a single topic.

Now, ordinarily I would start listing some reasons why each of these might be a good or bad idea, different aspects of the code they might make easier or harder… but like I said at the start, this app is for getting my hands dirty, not armchair philosophizing. Let’s build them all!

Each of these ideas would be enough, on its own, to keep track of the full data about which users have voted for which topics. But could we pick more than one? what if both users and topics kept track of votes? There’s nothing stopping us from doing that, but it would mean that the same data is being stored in multiple different ways in our database. What if they get out of sync? which should be the source of truth? Rather than going down the mix-and-match route, I’m going to build each of the 3 ideas individually. And to keep the database models separate, we’ll pick a different topic to use as an example for each data modeling strategy.

Finally, lest this comparison post get way too long, I’m just going to build the first two features on our list of desirable voting features: users will be prevented from voting more than once, and they’ll see which topics they’ve already voted on because the vote button will be disabled!

Okay let’s go!

Topics keep track of votes

The first idea for modeling a many-to-many relationship is

On each topic, store a list of all users who have voted for that topic

For this strategy, we’re calling our topics “goats” — welcome to Goat Voter!

Photo of a goat

Here’s the database schema: I created a goats table to hold the topics, and a field voters which contains a list of user ids that have voted for that goat.

// schema.ts
goats: defineTable({
    text: s.string(),
    voters: s.array(s.id('users'))
  }),

This data model actually turned out to make several things easier than I expected! (and definitely easier than some of the other data models I tried… but we’ll get there!) Counting the votes for each goat was easy — just take the length of the voters list.

Preventing users from voting more than once was a bit tricker, but not much: we just have to check whether the current user’s id is in the list already before adding it. [can we have code snippets with line numbers? then I can reference line numbers here in the text description]

//voteForGoat.ts
export default mutation(
  async ({ db, auth }, goatId: Id<'goats'>) => {
    const user = await getUserDoc(db, auth);
    const goatDoc = await db.get(goatId);
    if (goatDoc === null) {
      console.log('How did you vote for a goat that did not exist?.')
    } else {
      console.log(goatDoc.voters, user._id)
      const existingIds = goatDoc.voters.map((v) => v.id);
      if (existingIds.indexOf(user._id.id) < 0) {
        goatDoc.voters.push(user._id)
        db.replace(goatDoc._id, goatDoc)
        console.log(`successfully voted! This goat now has ${goatDoc.voters.length} votes`)
      } else {
        console.log(`Current user ${user._id} has already voted for ${goatDoc.text}`)

      }
    }
  }
)

Now we have to do this check on the front-end too so we know whether to disable the vote button — otherwise a user would think they could vote again and it just wouldn’t work!

// If the current userId's id is in the list goat.voters, they've already voted
// so disable voting again
disableVoting: goat.voters.map((v) => v.id).indexOf(userId.id) >= 0,

Downsides

Doing this check to enforce “you can’t vote twice” does require iterating through the entire voters list though, which seems like it could get slow if there were lots of votes for a particular goat. In fact, that’s not the only problem with this design — the Convex array data type supports a max of 8192 values in a list — so our Goat Voter has an implicit limit of 8192 votes per goat… guess none of our goats can go viral 🐐😭

In fact, the Convex best practices recommend limiting lists and using relationship tables. Which makes sense, since iterating over all the voters for each goat to check for double-votes might get real slow if goats end up getting anywhere close to 8192 votes each!

When would this be a good fit?

Despite the size limitations of keeping a potentially large list in each topic, I think there are some use cases where this might be a good design for a many-to-many relationship. What if instead of “voting” on “goats”, users were “joining” small “groups” (different use case but still a many-to-many relationship), and you wanted the groups to have a maximum size? Each group could hold a list of its members, and the lists lengths probably wouldn’t get too out of hand!

Users keep track of votes

Okay, let’s leave our goats behind for a moment and move on to our next data modeling idea:

On each user, store a list of the topics they’ve voted for

We’ll need a new topic to vote on, since it wouldn’t make sense to combine this idea with the data model we’ve already done with goats. Hmmm, what else rhymes with “vote”… how about Notes!

Here’s my data model. In addition to adding a notes table, I modified our users table to have each user keep track of which notes they’ve voted for, in the noteVotes field.

//schema.ts
users: defineTable({
  name: s.string(),
  tokenIdentifier: s.string(),
  noteVotes: s.set(s.string()) // must be the raw id string for a `notes` doc
}),
notes: defineTable({
  text: s.string(),
}),

Since I used the Array data type for Goats, I wanted to branch out and use Sets for Note Voter. Sets seemed like a good match since each user can only vote once for each Note, and sets prevent duplicates. Just like Arrays, the 8192 limit in convex applies to Sets as well. But by contrast to the previous example, where it’s pretty possible to imagine a really popular Goat getting more thank 8192 votes and not being able to store them, it seems pretty unlikely that a single user will end up voting for more than 8192 different Notes (that’s a lot of clicking!).

Quick technical note: if you had an eagle eye for details, you might have noticed that noteVotes is a set of strings, and not a set of Note Ids! as it turns out, Convex Ids can’t be compared using === , so doing a set-contains operation to prevent double voting wasn’t working! Instead I made a set of the raw .id properties on each Id<'notes'> .

Using sets in this way allowed me to avoid iterating over the entire list of Notes that a user had voted for so far, when checking if they’d already voted:

// from getNotes.ts
disableVoting: user.noteVotes.has(note._id.id),

// from voteForNote.ts
if(user.noteVotes.has(noteDoc._id.id)) {
  console.log(`Current user ${user._id} has already voted for ${noteDoc.text}`)
  return;
}

So implementing the “one vote per user” feature was pretty straightforward. And users being able to see what they’d voted for was also straightforward — each “user” in the database had that information right at hand!

Downsides

So the extra feature ideas we had got easier — but at it’s root, my voting app was a list of notes with their vote counts. And with this data model, actually tallying the votes got a lot harder! I ended up having to iterate through every set on every user to tally the votes — each note was basically off in its own little world and didn’t have any concept of how many users had voted for it! Here’s the code for getNotes.ts — compare this to getQuotes.ts!

//getNotes.ts 
export default query(async ({ db , auth}): Promise<NoteWithVotingInfo[]> => {
  const user = await getUserDoc(db, auth);
  const notesDocs = await db
    .query('notes')
    .collect()
  console.log(`Got ${notesDocs.length} notes`);

  const notesWithVotingInfoByNoteId: Record<string, NoteWithVotingInfo> = {};

  notesDocs.forEach((note) => {
    // type GenericId<'notes'> cannot be used as an index type...
    notesWithVotingInfoByNoteId[note._id.id] = {
      disableVoting: user.noteVotes.has(note._id.id),
      votes: 0,
      ...note
    }
  })
  // To fill in the votes for each note, we need to look at *all* the users
  // and count up each of their votes
  const allUsers = await db
    .query("users").collect();
  allUsers.forEach(({noteVotes}) => {
    if (noteVotes) {
      noteVotes.forEach((id) => {
        notesWithVotingInfoByNoteId[id].votes += 1
      })
    }
  })

  const result = []
  for (let noteId in notesWithVotingInfoByNoteId) {
    result.push(notesWithVotingInfoByNoteId[noteId])
  }

  return result.sort((a,b)=> b.votes - a.votes);
})

[Aside: it would be possible to solve this by also keeping a counter on each note — but then there would be two sources of truth for the vote count data, which might possibly get out of sync. This is a whole other topic called Denormalization]

Overall I think this was the most annoying option I built:

Getting vote counts for each note seems really inefficient— you have to look at every single user!
it still runs into size limits! A user can only technically vote for up to 8192 notes. And while this is a Convex limitation, it’s not unique to Convex — in any database system you’d likely run into issues with the size of any individual record getting too large.

When would this be a good fit?

So what’s the saving grace (note)?

This might be a reasonable choice if your use case for a many-to-many relationship is more user-centered, something like a “favorites” feature. If your UI cared more about displaying to individual users their “favorite” Notes, (or even another user’s “favorite” notes) this would be a better fit — especially if it was relatively unimportant to show the mile-high aggregate view of all topics and the number of “favorites” each of them had.

A whole new table keeps track of votes

Let’s leave our Notes to hang out with the Goats and turn to our third idea:

Create an entire new votes table, separate from either users or topics. Each record in the votes table is a single user’s vote for a single topic.

For our topic type this time, I’ve saved the best for last: Boats (my personal favorite)

A large boat is launched from the pier into the water

For the boat voter, I’ve added not one but two new tables; one to hold the boats and another to hold all the votes!

//schema.ts
boats: defineTable({
  text: s.string(),
}),
boatVotes: defineTable({
  boat: s.id('boats'),
  user: s.id('users')
})

Checking for an existing vote to prevent a user from voting twice was really simple — just one Convex query

const existingBoatVote = await db.table('boatVotes')
  .filter(q => q.and(
    q.eq(q.field('user'), user._id),
    q.eq(q.field('boat'), boatDoc._id)))
  .unique();
if(existingBoatVote !== null) {
  console.log(`Current user ${user._id} has already voted for ${boatDoc.text}`)
  return;
}

This is simpler than either Goats or Notes, both of which required querying for a whole array or set of things and checking membership in typescript.

However, counting up the votes for each boat requires looking at every single record in the boatVotes table. One way to do this would be await db.table('boatVotes').collect()to get each of the boat Votes and group/count them in typescript. But I decided not to do it this way, since I’d done a bunch of that style of counting in NoteVoter.

Instead, I decided to use convex queries to count the number of votes for each boat:

const votes = (await db.table('boatVotes')
    .filter(q => q.eq(q.field('boat'), boatDoc._id))
    .collect()).length;

it was cool that the convex query API could give me exactly the votes I wanted. On the other hand, it meant I did a bunch of these queries, one for each boat; I would be super curious to see which of these approaches (one surgically precise query for each boat, or one big query → group&count in Typescript) would scale better to lots of boats and votes.

In terms of scaling, unlike either Goats or Notes, there are no technical limits to the number of boats a single user can vote for or the number of users that can vote for a single boat — depending on your use case and expected scale, that may be a really important property, or it may not matter as much.

When would this be a good fit?

This approach would be pretty well suited to the use cases of “showing a single boat and every person who voted for it” AND “showing a single user and every boat they voted for” — by contrast to either Notes or Goats, which would work well for one of those use cases but aren’t set up well to show the other (Goats could easily display every user who’s voted for them; each User could easily see which Notes they’d already voted for, but not vice versa).

This boatVotes table design could also be pretty natural if there were different “kinds” of votes, i.e. like, heart, wow, or even a “reactions” feature where there can be dozens or hundreds of different types of “vote” - because the vote type could be a single additional field in the boatVotes table. Adding multiple vote types to either Goats or Notes would require adding a new field per vote type, e.g:

// schema.ts
boatVotes: defineTable({
  boat: s.id('boats'),
  user: s.id('users')
	voteType: s.union(
    s.literal('vote'), 
    s.literal('like'), 
    s.literal('wow'), 
    // etc -- one field holds all vote types)
})
goats: defineTable({
  text: s.string(),
  voters: s.array(s.id('users')),
  likers: s.array(s.id('users')),
  wowers: s.array(s.id('users')),
  // ..etc -- potentially many very similar field types, have to query them all!
}),

What’s next?

Building voting a bunch of different ways for a bunch of different topics has been a great way for me to actually test out my intuitions about which data models would be better or worse for different use cases… and there’s plenty more to learn! Here are some more directions I could explore with Quotes, Goats, Notes and Boats; let us know in Discord which ones we should do, or try some out yourself! In case you missed the link at the start, the full code is available here.

Denormalization: for some of my data structures, just getting a simple count required far too complicated of a query! We could intentionally store that data in another, simpler way, to make our queries simpler, but that runs the risk of data inconsistency! Convex’s platform protects us from some sources of inconsistency here, but not all…
More features!
- Build a user portal where a single user can see a list of all the topics they’ve personally voted for; compare the query we need to write to support this - is it simpler or more efficient for Notes, Goats, or Boats??
- Alternately, build a profile page for each Quote, Note, Goat, Boat to reveal which users have voted for it. I have some suspicions about which will be easier and harder, but like in this article, there’s a lot to learn from actually testing my intuitions by building!
Do some load testing. Write code to generate and import large numbers of topics, users and votes, and do some profiling to see which data models are faster or slower in which ways.

Build in minutes, scale forever.

Convex is the sync platform with everything you need to build your full-stack project. Cloud functions, a database, file storage, scheduling, search, and realtime updates fit together seamlessly.

Get started