Don't Drop ACID
An old but a great idea
With new backends-as-a-service like Convex, we’re deciding our schemas when we feel like it, storing documents instead of flat columns, and updating data reactively with ease. These great new patterns for storing and syncing data may make it seem like every concept from legacy databases is about to be tossed out the window.
But some ideas are evergreen. Like today’s topic, ACID. ACID is awesome.
But what is it, exactly? Here’s a Wikipedia summary:
ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps.
In this article, we’ll focus on the most interesting two of the four1 properties: Isolation and Atomicity.
Inside Isolation
Consider a simple transaction which deducts $3 from a bank account:
var balance = readBalance();
balance -= 3; // bought a box of milk duds
writeBalance(balance);
Seems straightforward enough? But without isolation guarantees, the following could happen when two transactions run concurrently:
Step | TX1 operation | TX1 var | TX1 db | TX2 operation | TX2 var | TX2 db |
---|---|---|---|---|---|---|
1 | var balance = readBalance(); | 10 | 10 | - | - | |
2 | 10 | 10 | var balance = readBalance(); | 10 | 10 | |
3 | balance -= 3; | 7 | 10 | 10 | 10 | |
4 | 7 | 10 | balance -= 3; | 7 | 10 | |
5 | writeBalance(balance); | 7 | 7 | 7 | 7 | |
6 | 7 | 7 | writeBalance(balance); | 7 | 7 |
Each transaction individually did something reasonable. But because they ran at the same time, effectively someone got a free box of milk duds! Three dollars just disappeared, since ($10 - ($3 * 2)) should leave only $4 left in the account.
The problem here is the transactions aren’t strongly isolated from each other. If you’re familiar with multithreaded programming, you may be thinking right now: “oh, we just need a lock around these transactions!” And in fact, that is how many databases solve this problem, a method often called pessimistic locking. Another valid approach, which happens to be the one that Convex uses, is optimistic concurrency control.
But whatever method your database uses, it really should give you strong transactional isolation2 so you can confidently write your transactions without worrying about the complications of data races like these.
Isolation among friends
Granted, you’re probably not building an online bank. But nearly any application you can imagine benefits from transactional isolation!
Take Let’s Go, for example, a social gathering app built on top of Convex. In Let’s Go, you can create an event you plan, assign it a fixed number of empty seats, and then invite your whole friend group to join you if they’re free.
As an example, Ruchi might have two extra tickets to a hockey game. Let’s Go allows her to create a “Hockey Night with Ruchi” event with two seats available. Then she can invite all her friends and the first two that are interested and available can grab those seats and go watch the game with Ruchi.
But what if this particular hockey team is on a huge winning streak right now and attending the games is a hot ticket? Far more than two of Ruchi’s friends may all jump at the opportunity to tag along.
In this circumstance, we’d risk encountering a bug similar to the bank account one outlined above. If we experience a data race between checking if there is still a seat available and claiming that seat, Let’s Go could erroneously promise three of Ruchi’s friends a night out even though there are only two tickets available.
Let’s walk through the Convex mutation function that represents the transaction that has an attendee join an event.
First, we grab the event from the database:
async ({ db }, user, { eventId }: CreateAttendeeInput) => {
const event = await db.get(eventId);
if (!event) {
return null;
}
Next, we check to see that the number of attendees for this event is fewer than the number of free seats the event creator specified. Otherwise, the transaction can’t continue!
// Ensure there's still an open slot.
const attendees = await db
.query("attendees")
.filter((q) => q.eq(q.field("eventId"), eventId))
.collect();
if (attendees.length >= event?.slots) {
return null;
}
This function has an additional check… it makes sure you’re not already one of the attendees. Otherwise, one of Ruchi’s friends could claim both tickets:
// Prevent inserting duplicate attendees.
const existingAttendee = await db
.query("attendees")
.filter((q) =>
q.and(
q.eq(q.field("eventId"), eventId),
q.eq(q.field("userId"), user._id)
)
)
.first();
if (existingAttendee) {
return null;
}
Finally, if everything still looks good, we add them to the attendees table for this event.
await db.insert("attendees", {
userId: user._id,
eventId,
});
return eventId;
}
Hopefully, the parallels to our banking example are clear. Strong transaction isolation2 is necessary to ensure all of these critical validations remain true when we finally commit the attendee to the event. Otherwise, our app may place friendships in peril!
Atomic Scorekeeping
Isolation is great, but what about Atomicity? Let’s begin to understand atomicity’s utility by contrasting it with isolation.
In transactions, isolation only matters when you execute:
- At least one read. If you don’t base your write on a read, then you’re not exposed to a data race.
- At least one write. If you don’t write anything in your transaction, none of this matters!
In contrast, atomicity only matters when your transaction contains:
- Two or more writes
Why two or more? Because atomicity is about ensuring that transactions with multiple writes either entirely succeed or fail. Stated another way, either all writes commit, or none do.
Let’s see why this matters in a baseline banking context again by modeling a transaction transferring $5 from account A to account B.
var aAmount = readFromA();
var bAmount = readFromB();
writeToA(aAmount - 5);
// CRASH???
writeToB(bAmount + 5);
Imagine one of Ruchi’s friends, absolutely livid about an unfulfilled-hockey-seat, unplugged our bank’s non-ACID database server during this transaction. If our transaction stopped running after taking the $5 out of account A, but before putting it into account B, the money in transit would just disappear out of our system!
ACID databases with the appropriate atomicity ensure that this never happens. Upon failure, the entire transaction would be rolled back, and account A would still have the $5.
But when the stakes are even greater than money? When we involve victory, pride, and competitive Wordle?
Fast5 is a head-to-head Wordle-style game built on Convex. You compete against a friend to see who can guess a series of five words the fastest. And of course, we’re keeping score.
Specifically, when the game is over, the user account with the higher score gets credited with a win, and the loser's account... well, you get the idea.
This might sound too simple to need ACID. But imagine each player’s User
object that holds their win/loss totals is a separate record in our database. Just as in our bank transfer transaction, we need to ensure the win increment and loss increment both succeed so we don’t “lose” a win or a loss.
Here’s how we do win/loss accounting in Fast5:
export async function recordGameStats(db: DatabaseWriter, game: any) {
var user1 = await db.get(game.user1);
var user2 = await db.get(game.user2);
if (game.winner === 1) {
// user 1 wins, user 2 loses
user1.wins += 1;
user2.losses += 1;
} else if (game.winner === 2) {
// user 2 wins, user 1 loses
user2.wins += 1;
user1.losses += 1;
} else if (game.winner === 3) {
// tie!
user1.ties += 1;
user2.ties += 1;
} else {
throw 'unknown winner code';
}
// write out both user records with updated win/loss/tie info
db.patch(user1._id, user1);
// what happens if we crash here? Need atomicity!
db.patch(user2._id, user2);
}
In short, this transaction calls db.get
to retrieve both users’ records, adjusts each win/loss/draw counts according to the outcome of the game, and then updates both records with the new tallies using db.patch
. And without atomicity, we’d “lose” a win or a loss if we crashed after only updating one record.
But since this is Convex, this never happens!
ACID keeps things basic
If you reflect back on the code in this article, there’s nothing special about it. In fact, we’ve talked a lot about these subtle and crucial concepts, but then the actual code was pretty unexciting and obvious.
And that’s precisely why ACID is so key! When the underlying system supports ACID semantics, the developer is free to write code the intuitive way and ignore the complexities of concurrency and failure.
So keep taking a regular dose of ACID!
Footnotes
-
While we’re not spending much time on them, ACID’s Consistency amounts to more or less “follow your Atomicity + Isolation rules uniformly in every context”, and Durability means “don’t lose committed data.” ↩
-
Convex provides serializable isolation, which is the strictest and safest isolation level. Other databases often provide weaker isolation levels by default (sometimes called “read committed”, “read uncommited”, “repeatable read”, etc). Check carefully if you want to be absolutely sure your application is correct! ↩ ↩2
Convex is the sync platform with everything you need to build your full-stack project. Cloud functions, a database, file storage, scheduling, search, and realtime updates fit together seamlessly.