Ian Macartney
2 months ago

Throttling Requests by Single-Flighting


Building reactive applications can become an obsession. How fast can I sync state? How can I keep my app feeling responsive? How can I avoid wasting resources?

Let’s say you want a responsive UI and care about how fast your UI can update. A naive estimate would just add up the time to send the request, have it processed on the server, and receive a response. For actions like clicking buttons, this is a good approximation. However, it’s important to consider requests within the context of other requests. For something like capturing mouse movements or typing, requests could end up being created faster than they can be processed. If requests pile up in an outgoing queue on the client, or compete for server resources, they may appear much slower as the user waits for previous requests to be processed. Even if your request itself is fast, it can seem slow to the end user if it has to wait for hundreds of prior requests to complete.

In these cases, it can be useful to limit how many requests the client sends. There are many ways to do this: throttling, debouncing, and server-side rate limiting are the most common. Can we do better?

In this article we’ll be looking at an approach called “single flighting” or “singleflighting” and what an implementation looks like in React using Convex for the backend.

Throttle vs. Debounce vs. Rate limit vs. Singleflight

A quick aside on terminology.

Debouncing refers to waiting a specified amount of time before acting on a signal. My first experience with this was in handling a potentially noisy electrical signal. When you press a button, for instance, the voltage may “bounce” for a short period, like so:

Credit: Geoffrey Hunter https://blog.mbedded.ninja/electronics/circuit-design/debouncing/

You don’t want to send “on” and “off” signals in quick succession; you want to wait until the signal has settled and then send the settled value once. You “de-bounce” the signal. In software, this could look like waiting to send a search query until the user has stopped typing for some amount of time. Every time the user types another character, the debounce timer resets. The benefit is avoiding intermediate requests that wouldn’t be used, but the cost is waiting for the debounce period to elapse. If the user keeps typing, they may never see any results!
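As a rough illustration of the search example, here is a minimal debounce sketch. The `debounce` helper and the 300 ms wait are assumptions made for this example; they are not part of the article's hooks.

```typescript
// Hypothetical helper for illustration only.
// Returns a wrapper that calls `fn` once the wrapper has not been
// called for `waitMs` milliseconds, using the most recent arguments.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    // Every call cancels the pending one, so a burst of calls
    // results in a single invocation after the burst ends.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}

// Usage (assumed names): run the search only after typing pauses.
// const debouncedSearch = debounce((q: string) => runSearch(q), 300);
// input.addEventListener("input", () => debouncedSearch(input.value));
```

Note how a user who never pauses for `waitMs` never triggers `fn`, which is the drawback described above.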

Throttling is the act of spacing out requests that a client sends. In the example with the user continuously typing a search query above, a search could be executed on the first character, and every x seconds after that. So the user could start seeing results from their in-progress query as they continue typing. Under the hood, every time a request is sent, the next request will be held until some time has elapsed. If many keystrokes are issued during that time, only a single request will be sent when the time elapses, with the latest query. This limits the maximum rate at which a single client sends requests.
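The behavior described above (send the first call right away, then at most one call per interval with the latest arguments) could be sketched as follows. The `throttle` helper is a hypothetical name for this example, not something taken from the article's code.

```typescript
// Hypothetical helper for illustration only.
// Calls `fn` immediately, then at most once per `intervalMs`,
// always with the most recent arguments seen during the wait.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: A | undefined;

  const startCooldown = (): void => {
    timer = setTimeout(() => {
      timer = undefined;
      if (pending !== undefined) {
        const args = pending;
        pending = undefined;
        fn(...args);
        // The held request was just sent, so start a new cooldown.
        startCooldown();
      }
    }, intervalMs);
  };

  return (...args: A) => {
    if (timer === undefined) {
      fn(...args); // nothing sent recently: send right away
      startCooldown();
    } else {
      pending = args; // hold only the latest arguments
    }
  };
}
```

Unlike the debounce case, a user who types continuously still sees a request go out every `intervalMs`.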

Rate limiting is more commonly referenced on the server side. Rather than clients proactively limiting themselves, this is a way for the server to push back on clients, telling them they’re requesting too much. This is referred to as “back pressure”: the server pushing back on clients when too much is being demanded of it. It can help keep a backend system from being overloaded due to spikes in traffic, though it then relies on clients to handle the rejected request and retry later on. This is a good idea for a reliable system, but should be exceptional, not the only way of limiting client requests.
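For a sense of what server-side rate limiting can look like, here is a sketch of a token bucket, one common technique. This is an illustrative assumption, not how Convex or any particular server implements it; `makeTokenBucket` and its parameters are made up for this example.

```typescript
// Hypothetical helper for illustration only.
// A bucket holds up to `capacity` tokens and refills at `refillPerSecond`.
// Each allowed request spends one token; with no tokens left, the server
// would reject the request (for example with an HTTP 429 response).
function makeTokenBucket(
  capacity: number,
  refillPerSecond: number,
  startMs: number = Date.now()
): (nowMs?: number) => boolean {
  let tokens = capacity;
  let lastRefillMs = startMs;

  return (nowMs: number = Date.now()): boolean => {
    // Add the tokens earned since the last check, capped at capacity.
    const elapsedSeconds = Math.max(0, nowMs - lastRefillMs) / 1000;
    tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
    lastRefillMs = nowMs;
    if (tokens >= 1) {
      tokens -= 1;
      return true; // request allowed
    }
    return false; // request should be rejected; the client retries later
  };
}

// Usage: one bucket per client, consulted on every incoming request.
// const allow = makeTokenBucket(10, 5);
// if (!allow()) { /* reject and let the client back off */ }
```

The timestamp is passed in as an optional parameter so the behavior can be checked without waiting on a real clock.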

Single flighting is the concept of limiting requests by only ever having one request “in flight.” This is similar to throttling, in that it limits requests from the client. However, the frequency of requests isn’t specified up front, but is a function of how fast the network is and how fast the request is processed on the server. This second factor also gives it some natural back pressure from the server, which is incredibly valuable. If the server is getting overloaded, that client won’t be sending more requests while it’s waiting for its outstanding one. It also allows for gradual performance degradation if many clients are executing requests in parallel. Rather than the server becoming overwhelmed and failing some subset of requests, the frequency of requests will decrease in each client as the server hits a bottleneck.

The main downside is that your request frequency might be slower than theoretically possible, due to time spent on network transit. Waiting until a response comes back before sending another request is slower than optimistically firing off requests continuously. Conversely, if your requests return quickly, you might fire off more requests than necessary and waste CPU and network resources.
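Stripped of any framework, the idea can be sketched in a few lines. This is a simplified, fire-and-forget illustration with a hypothetical `singleFlight` helper: it does not report results back to callers, and it ignores failures.

```typescript
// Hypothetical helper for illustration only.
// Runs `fn` with at most one call in flight. Calls made while a request
// is outstanding overwrite each other; when the request finishes, `fn`
// runs again with the most recent arguments, if there were any.
function singleFlight<A extends unknown[]>(
  fn: (...args: A) => Promise<unknown>
): (...args: A) => void {
  let inFlight = false;
  let upNext: A | undefined;

  const run = async (args: A): Promise<void> => {
    inFlight = true;
    let next: A | undefined = args;
    while (next !== undefined) {
      try {
        await fn(...next);
      } catch {
        // Sketch-level choice: ignore the failure and move on.
      }
      next = upNext;
      upNext = undefined;
    }
    inFlight = false;
  };

  return (...args: A) => {
    if (inFlight) {
      upNext = args; // only the latest arguments survive
    } else {
      void run(args);
    }
  };
}
```

The request rate here is set entirely by how long `fn` takes to settle: a slow server automatically receives fewer calls.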

As with most things, there are benefits to each, and weighing these strategies is part of the job of the application developer. For this article, we are going to be using single flighting. This plays well with Convex, since the Convex client executes mutations serially. With serial execution, debouncing and throttling both risk piling up requests if they aren’t processed as fast as they’re created. Single flighting helps us avoid this, providing a consistently responsive user experience.

Implementation: useEffect loop with useLatestValue

One way to achieve single flighting requests is to sit in an infinite loop, waiting for a new value to send and then waiting on the request. This leverages a hook we wrote: useLatestValue. This provides two functions: one to update some value, and another that you can await for the latest value, blocking until there’s a newer value than what you’ve already received. It is conceptually similar to a Promise where you can keep calling resolve with newer values to overwrite the value returned when awaited. Before we talk about how it’s implemented, let’s look at an example of how it might be used:

type Pos = {x: number, y: number};

const updatePresence = useMutation("updatePresence");
const [nextPosition, setPosition] = useLatestValue<Pos>();

useEffect(() => {
  let run = true;
  (async () => {
    while (run) {
      const position = await nextPosition();
      await updatePresence(presenceId, position);
    }
  })();
  return () => { run = false; };
}, [nextPosition, updatePresence]);

return <div onPointerMove={(e) => setPosition({
  x: e.clientX,
  y: e.clientY,
})}>...</div>

This sends position updates whenever a new value is available. The nextPosition() promise will resolve when the value is updated. If one or more setPosition calls happen before awaiting nextPosition, it will immediately return the value from the latest call when it is eventually awaited.

The useLatestValue hook helpers are in a working project here. You can use it as is, but for those who are curious how it works, see below.

useLatestValue details...

useLatestValue uses a Promise as a signal that a new value is available. The result of a Promise can’t be updated once it’s resolved, so the value is stored separately. When a value is retrieved, the Promise is awaited, the signal is reset, and the latest value is returned. Updating the value just involves storing the new value and resolving the Promise, relying on the behavior that subsequent calls to resolve are no-ops once the Promise has already been resolved.

import { useCallback, useMemo, useRef } from "react";

export default function useLatestValue<T>() {
  const initial = useMemo(() => {
    const [promise, resolve] = makeSignal();
    // We won't access data until it has been updated.
    return { data: undefined as T, promise, resolve };
  }, []);
  const ref = useRef(initial);
  const nextValue = useCallback(async () => {
    await ref.current.promise;
    const [promise, resolve] = makeSignal();
    ref.current.promise = promise;
    ref.current.resolve = resolve;
    return ref.current.data;
  }, [ref]);

  const updateValue = useCallback(
    (data: T) => {
      ref.current.data = data;
      ref.current.resolve();
    },
    [ref]
  );

  return [nextValue, updateValue] as const;
}

const makeSignal = () => {
  let resolve: () => void;
  const promise = new Promise<void>((r) => (resolve = r));
  return [promise, resolve!] as const;
};

Implementation: useSingleFlight callback

Another model that avoids the scary infinite loop is using a helper we wrote: useSingleFlight, which will run a given async function at most once at a time. If no calls are in progress, it will call the function immediately. Otherwise, when the current call finishes, it will call the function again, using the most recent arguments.

const updatePresence = useMutation("updatePresence");
const tryUpdate = useSingleFlight(updatePresence);
return <div onPointerMove={(e) => tryUpdate({
  x: e.clientX,
  y: e.clientY,
})}>...</div>

While it isn’t used here, it’s worth mentioning that tryUpdate will always return a Promise. If the call is never executed (because a newer call replaced it), the promise will never resolve or reject. If the call is executed, its result will be passed through. So you could write code like:

console.log('trying to update');
const result = await tryUpdate(pos);
console.log('updated: ' + result);

which would log 'trying to update' for all event callbacks, and only log 'updated: ...' for requests that were actually sent to the server. And if the call failed, it would throw the exception during the await.

The useSingleFlight hook helper is in a working project here. You can use it as is, but for those who are curious how it works, see below.

useSingleFlight details...

useSingleFlight keeps track of whether a request is in flight, using a useRef hook to store state. If there is a request in flight, it returns a promise, where it has extracted the resolve and reject functions to fulfill later if it’s still the latest attempt. It updates the state’s upNext to keep track of the arguments and promise functions. If there isn’t a request in flight, it calls the function immediately, and also kicks off an async function to check for upNext once the request finishes. It will keep executing the follow-up requests until it finishes without upNext being updated.

import { useCallback, useRef } from "react";

export default function useSingleFlight<
  F extends (...args: any[]) => Promise<any>
>(fn: F) {
  const flightStatus = useRef({
    inFlight: false,
    upNext: null as null | { resolve: any; reject: any; args: Parameters<F> },
  });

  return useCallback(
    (...args: Parameters<F>): ReturnType<F> => {
      if (flightStatus.current.inFlight) {
        return new Promise((resolve, reject) => {
          flightStatus.current.upNext = { resolve, reject, args };
        }) as ReturnType<F>;
      }
      flightStatus.current.inFlight = true;
      const firstReq = fn(...args) as ReturnType<F>;
      void (async () => {
        try {
          await firstReq;
        } catch {
          // If it failed, we naively just move on to the next request.
          // The caller still sees the failure through the returned firstReq.
        }
        while (flightStatus.current.upNext) {
          const cur = flightStatus.current.upNext;
          flightStatus.current.upNext = null;
          await fn(...cur.args)
            .then(cur.resolve)
            .catch(cur.reject);
        }
        flightStatus.current.inFlight = false;
      })();
      return firstReq;
    },
    [fn]
  );
}

Delta challenges

While the approach may seem simple, there are some nuances worth thinking through. If you are limiting how many requests you’ll be sending, that often means some requests will never be executed, or some results won’t get reported.

Each call to tryUpdate above will either:

  1. execute updatePresence immediately,
  2. execute updatePresence after some time has elapsed, or
  3. never execute updatePresence.

In this case, we ignore any intermediate mouse positions. This seems fine for a use case where we’re just sharing cursor location. However, if we were reporting deltas, such as a series of movements rather than absolute positions, then missing intermediate values would be bad news!

Let us know in our Discord if you want examples on how to handle those cases.
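As one possible starting point, deltas can be accumulated on the client and drained only when a request is actually sent, so that skipped calls don't lose data. The sketch below assumes the server can apply a combined delta in place of each individual one; `makeDeltaAccumulator` and the `Delta` type are hypothetical names for this example.

```typescript
// Hypothetical helper for illustration only.
type Delta = { dx: number; dy: number };

// Collects movement deltas between requests. `add` is called on every
// event; `take` is called when a request is about to be sent, and
// returns the sum of everything added since the previous `take`.
function makeDeltaAccumulator() {
  let pending: Delta | null = null;
  return {
    add(delta: Delta): void {
      pending = pending
        ? { dx: pending.dx + delta.dx, dy: pending.dy + delta.dy }
        : { ...delta };
    },
    take(): Delta | null {
      const result = pending;
      pending = null;
      return result;
    },
  };
}
```

With this approach, the event handler would call `add` and then trigger the single-flighted function, and that function would call `take` to decide what to send. If a request fails after `take`, the drained delta would need to be added back, which this sketch does not handle.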

Optimistic updates

One thing to keep in mind with our Optimistic Updates API is that optimistic updates will only be run as often as your single-flighted function (e.g. your mutation). If you want to update local state faster than the mutations are being executed, you’ll need to manage that state separately. For example:

const myMutation = useMutation("myMutation");
const withLocalStore = useCallback((data) => {
  setLocalState(data);
  return myMutation(data);
}, [setLocalState, myMutation]);
const tryUpdate = useSingleFlight(withLocalStore);
...

Next Steps

We’ve looked at two ways of implementing single-flighting requests, which is a great way of preventing request pile-up and keeping UIs responsive. To see an implementation of this, check out our post on implementing Presence in Convex (coming soon!). To get the code for our hooks, check it out here.