Convex: The Software-Defined Database
Imagine you're a database best practices guru.
You know every feature and you've got everything locked down. You did the ENABLE ROW LEVEL SECURITY dance. You’ve used REFERENCES for confident foreign key constraints, your indexes were CREATEed UNIQUE. Hell, you’ve even used some CHECK constraints to make sure field values are mostly sane!
Then a teammate pulls a DevOops, executes the wrong sudo command, and summarily wipes out your entire database, protections and all. Or someone commits a malicious change to your codebase and streams your entire userbase to an IRC server somewhere. Oops! Did you forget to ENABLE DONT LEAK PII?
Safety in a world of everything-is-code
In software systems, risks come from everywhere. And for most of the Internet’s history, we’ve had various ways to armor against each and every one of them. Permission systems in *nix. ACLs in LDAP. IAM in AWS. And of course databases. Traditional databases encode a varied, somewhat useful, but still far too limited set of basic checks for your data and its safe access.
The inconsistent ways to specify rules for each of these systems are a product of them being developed by and for a particular group of people with particular technical norms. Groups such as DBAs, sysadmins, cloud architects, SREs, and network admins.
But as time has gone on, as software has eaten the world, one by one those specialized rule sets have been replaced by code. Code with the same expressive power and management practices as the software we build. Instead of sometimes clicking things in an AWS UI, sometimes sending commands to a network switch over a serial port, and yet at other times running ALTER TABLE on a Postgres server somewhere, let’s make everything pull requests. Change management, auditing, rollbacks, and tests are all achieved with familiar software-esque methods.
So we have software-defined networking, software-defined infrastructure, and I’m sure lots of other software-defined stuff I don’t know about.
In that spirit, consider Convex the first software-defined database.
Underpowered overpowered Convex
Out of the box, Convex doesn’t have row-level security, uniqueness constraints, or much of anything. And yet teams using Convex have had these features in their projects since even before Convex was 1.0. What's more, they’ve benefited from the same level of transactional integrity they’d have with industrial-grade, decades-old, built-in solutions like those in Postgres.
This is possible because when you’re a Convex developer, your code is in the database.
In essence, Convex is a new kind of “virtual machine” for persistent transactional (OLTP) computing. Convex query and mutation functions are its machine code. If you’re a Convex user, you’re a database engineer, and you’ve been writing database plugins all along.
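To make "your code is the constraint" concrete, here is a minimal sketch of a uniqueness rule written as ordinary TypeScript. It is not the Convex API: UsersTable, findByEmail, and createUser are hypothetical in-memory stand-ins, and the sketch assumes the check and the insert run inside a single transaction, which is the guarantee the article attributes to mutation functions.

```typescript
// Hypothetical in-memory stand-in for a transactional table; not the Convex API.
type UserRow = { id: number; email: string; name: string };

class UsersTable {
  private rows: UserRow[] = [];
  private nextId = 1;

  // Look up a row by email, the way an index lookup would.
  findByEmail(email: string): UserRow | undefined {
    return this.rows.find((row) => row.email === email);
  }

  // Append a new row and return it with its generated id.
  insert(fields: Omit<UserRow, "id">): UserRow {
    const row: UserRow = { id: this.nextId++, ...fields };
    this.rows.push(row);
    return row;
  }
}

// The "uniqueness constraint" is plain code: check, then write.
// Assumption: both steps execute in one transaction, so no other
// writer can slip in between the check and the insert.
function createUser(table: UsersTable, email: string, name: string): UserRow {
  const normalized = email.trim().toLowerCase();
  if (table.findByEmail(normalized) !== undefined) {
    throw new Error(`email already registered: ${normalized}`);
  }
  return table.insert({ email: normalized, name });
}
```

Because the rule is a function, it can do things a UNIQUE index cannot easily express, such as normalizing the email before comparing it.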
For example, let’s say you want to ensure every database access goes through an authorization check. You want to encapsulate access to your tables and utilize a modular, consistent authorization scheme. No problem–just write a function that wraps your Convex tables and exposes methods that always authenticate before passing through table data. Encapsulation, abstraction, and modularity using regular old code. It’s a software-defined database.
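A rough sketch of that wrapping idea, under stated assumptions: the types and the withReadRule helper below are hypothetical and use an in-memory Map in place of a real table, so this illustrates the pattern and is not Convex's own interface.

```typescript
// Hypothetical shapes for illustration; not the Convex API.
type User = { id: string; role: "admin" | "member" };
type Message = { id: string; authorId: string; body: string };

// The only read surface application code is given.
interface TableReader<T> {
  get(id: string): T | null;
  list(): T[];
}

// A rule decides whether `user` may see `doc`.
type ReadRule<T> = (user: User, doc: T) => boolean;

// Wrap raw rows so every read passes through the rule first.
function withReadRule<T>(
  rows: Map<string, T>,
  user: User,
  rule: ReadRule<T>,
): TableReader<T> {
  return {
    get(id) {
      const doc = rows.get(id);
      // Hide both missing and forbidden rows behind the same null.
      return doc !== undefined && rule(user, doc) ? doc : null;
    },
    list() {
      return Array.from(rows.values()).filter((doc) => rule(user, doc));
    },
  };
}

// Example rule: admins see everything, members see only their own messages.
const ownOrAdmin: ReadRule<Message> = (user, doc) =>
  user.role === "admin" || doc.authorId === user.id;
```

Handing callers only the TableReader, never the raw Map, is what keeps the check from being skipped: the rule is applied by construction rather than by convention.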
Examining the rock and the hard place
Are we just being lazy? Are we making you build our database for us? Surely, these tried-and-true built-in security/integrity features of traditional databases have been good enough for every growing software company thus far.
Actually… they haven’t. It might surprise you to learn that most sites with scale don’t use any of this stuff. They don’t use foreign key constraints. They don’t rely on in-database row-based permission systems. They often don’t even use the database's actual type system!
Why not?
- Performance. There’s nothing magical here. The database incurs extra reads and writes to provide these capabilities. Sometimes that makes sense to do, but at other times it doesn’t. Opting into these tradeoffs in a granular way is easier to manage in the application’s code.
- Sharding/scaling. Many relational database management system capabilities were designed for simpler times when the database primary could imagine it was a single machine. The first time you have to introduce horizontal scaling, you need to kiss a lot of this stuff goodbye!
- The semantics are too limited. For example, when it comes to valid field values or authorization policies, we want to express requirements so specific we might as well use a full programming language. So, yeah…
- Consolidation of all the rules. Since you end up encoding some of your rules in the application due to the aforementioned limitations, eventually you’d rather encode all of it there so you don’t go crazy reasoning about the cumulative behavior.
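As an illustration of the "too specific for built-in semantics" point in the list above, here is a small sketch of a validation rule that depends on data outside the row being written. Everything in it (Booking, Plan, MAX_HOURS, validateBooking) is a hypothetical example and not taken from any real schema.

```typescript
// Hypothetical shapes for illustration only.
type Plan = "free" | "pro";
type Booking = { roomId: string; start: Date; end: Date; attendees: number };

// Longest booking each plan may make, in hours (made-up limits).
const MAX_HOURS: Record<Plan, number> = { free: 1, pro: 8 };

const MS_PER_HOUR = 60 * 60 * 1000;

// Returns a list of problems; an empty list means the booking is valid.
// `plan` and `roomCapacity` come from other records, which is the part
// a per-row check has trouble expressing.
function validateBooking(
  booking: Booking,
  plan: Plan,
  roomCapacity: number,
): string[] {
  const problems: string[] = [];
  const hours = (booking.end.getTime() - booking.start.getTime()) / MS_PER_HOUR;
  if (hours <= 0) {
    problems.push("end must be after start");
  }
  if (hours > MAX_HOURS[plan]) {
    problems.push(`${plan} plan allows at most ${MAX_HOURS[plan]}h per booking`);
  }
  if (booking.attendees > roomCapacity) {
    problems.push("too many attendees for this room");
  }
  return problems;
}
```

Keeping every rule in one function like this is also what the last bullet is getting at: there is a single place to read when you want to know what a valid booking is.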
So as projects scale, more and more of these needs get pushed “up” a layer into the application. The bigger the system gets, the less it’s able to use the built-in database features. However, building these capabilities into traditional applications is problematic, too. You’re further from the database so everything is slower, and it’s much more difficult to maintain transactional consistency with the rest of your data.
So which should we pick? The expressive power of programming languages, or the robustness of enforcing these rules in the transactional core of the database?
By removing the boundary between the application and the database, Convex developers have a database just as expressive and powerful as code without the limitations of living in the “external” calling application.
Simple primitives, functional composition
Does this mean the Convex team thinks the platform is finished? Hell no!
But we think we’re almost done with the foundational layer of Convex, this new persistent transactional virtual machine and its machine code. We know these foundational building blocks are the part of Convex that our users cannot change and cannot opt out of. So we strive to keep that core small but elegant and composable. We want you to be able to easily augment Convex’s power in specific ways that accelerate your team and project using your tool of choice: code.
The good news is this isn’t just theoretical. One trailblazer of this software-defined database methodology has been Ian Macartney, an engineer on the Convex DevX team. Throughout 2023, Ian created modular solutions for authentication, rich argument validation, presence, rate limiting, sessions, background job management, migrations, relational modeling, etc.
As the year went on, we discovered that thousands of Convex developers were copying and pasting his code from those articles into all of their codebases. So Ian bundled up all the most useful stuff into an npm package called convex-helpers. Consider: in that repo (currently a mere ~2500 lines of TypeScript), one engineer added 25 years of PostgreSQL features to Convex.
How about a Prisma-style ORM, with cascading deletes, easy joins, and access control rules built in? Convex engineer Michal Srb made that too, in a week or two.
Naturally, we want to double down on this library-oriented, code-oriented functional composition approach to building out the rest of Convex’s higher-level capabilities. And that’s where Convex Components come in.
Introducing Convex Components
Convex Components are a way to formalize a library pattern that allows any Convex project to easily integrate sophisticated solutions to backend problems.
Some goals and properties of Convex Components:
- Make it dead simple to share implementations of ORMs, permission systems, workflow systems, and more.
- Allow these libraries to collaborate in the same transactional window with the application’s code for correctness and consistency.
- Give them their own namespaced and private tables so they can store persistent data as part of their workflows.
- Provide isolation so components can only read and write to your tables in ways you explicitly enable, and so resource usage can be traced appropriately.
- Eventually allow Convex projects to be implemented in other (non-JS/TS) programming languages.
- Speed up project compilation time by only type checking, recompiling, and re-bundling changed components instead of the entire project.
- Provide a path for larger companies to subdivide their internal work across teams.
We’ve been developing the component framework internally for several months. Soon, we’ll be releasing an early version of this framework to iterate on with the community and gather your feedback. We’ll also be releasing a series of Convex-authored components to take on some of the most urgent ergonomic needs.
So… one day will Convex have row-level security? Yes! There will be an open-source, MIT-licensed Convex Component implementing RLS. It will work just as well as any “built-in” solution from traditional databases. And if it doesn’t quite do what you need, fork it and make it your own. It's just TypeScript!
In fact, we’re looking forward to the day when most of the exciting new features on Convex are being written by Convex users instead of Convex employees. And with the software-defined database, that day may not be far away.