Problems with Replication Lag

0. Overview

  • Being able to tolerate node failures is just one reason for wanting replication; other reasons are scalability (processing more requests) and latency (placing replicas geographically closer to users). In the read-scaling architecture, where the workload consists mostly of reads and only a small percentage of writes (a common pattern on the web), we can increase the capacity for serving read-only requests simply by adding more followers
  • However, this approach only realistically works with asynchronous replication—a fully synchronous configuration would be very unreliable. Unfortunately, if an application reads from an asynchronous follower, it may see outdated information if the follower has fallen behind. This is called replication lag: the delay between a write happening on the leader and being reflected on a follower
  • This inconsistency is just a temporary state: the followers will eventually catch up and become consistent with the leader. This effect is known as eventual consistency. The term “eventually” is deliberately vague: in general, there is no limit to how far a replica can fall behind

1. Reading Your Own Writes

  • Many applications let the user submit some data and then view what they have submitted (such as a comment). With asynchronous replication, there is a problem: if the user views the data shortly after making a write, the new data may not yet have reached the replica

read-from-stale-replica

  • To prevent this anomaly, we need read-after-write consistency, also known as read-your-writes consistency. This is a guarantee that if the user reloads the page, they will always see any updates they submitted themselves. It makes no promises about other users: other users’ updates may not be visible until some later time. However, it reassures the user that their own input has been saved correctly
  • There are various possible techniques to implement read-after-write consistency in a system with leader-based replication:
    • When reading something that the user may have modified, read it from the leader; otherwise, read it from a follower. This requires that you have some way of knowing whether something might have been modified, without actually querying it. For example, user profile information is normally only editable by the owner. Thus, a simple rule is: always read the user’s own profile from the leader, and any other users’ profiles from a follower
    • If most things in the application are potentially editable by the user, other criteria may be used to decide whether to read from the leader. For example, you could track the time of the last update and, for one minute after the last update, make all reads from the leader. You could also monitor the replication lag on followers and prevent queries on any follower that is more than one minute behind the leader
    • The client can remember the timestamp of its most recent write—then the system can ensure that the replica serving any reads for that user reflects updates at least until that timestamp. If a replica is not sufficiently up to date, either the read can be handled by another replica or the query can wait until the replica has caught up. The timestamp could be a logical timestamp (something that indicates ordering of writes, such as the log sequence number) or the actual system clock (in which case clock synchronization becomes critical)
    • If your replicas are distributed across multiple datacenters (for geographical proximity to users or for availability), there is additional complexity. Any request that needs to be served by the leader must be routed to the datacenter that contains the leader
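The timestamp-tracking technique above can be sketched in a few lines. This is a hypothetical, minimal sketch (the `Replica` and `ReadYourWritesRouter` names are invented for illustration), assuming the client remembers the log sequence number (LSN) of its most recent write and passes it with each read:

```python
class Replica:
    """A toy replica that tracks the log sequence number (LSN) it has applied."""

    def __init__(self, name, applied_lsn=0):
        self.name = name
        self.applied_lsn = applied_lsn
        self.data = {}


class ReadYourWritesRouter:
    """Serves a read from a follower only if that follower has already
    applied the client's most recent write; otherwise falls back to the
    leader, which by definition has all writes."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def read(self, key, min_lsn):
        # min_lsn is the LSN of the client's last write, remembered client-side.
        for replica in self.followers:
            if replica.applied_lsn >= min_lsn:
                return replica.data.get(key)
        # No follower is sufficiently caught up: read from the leader.
        return self.leader.data.get(key)
```

Instead of falling back to the leader, a real system could also make the query wait until some follower catches up to `min_lsn`, as the notes describe.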
  • Another complication arises when the same user is accessing your service from multiple devices, for example a desktop web browser and a mobile app. In this case you may want to provide cross-device read-after-write consistency: if the user enters some information on one device and then views it on another device, they should see the information they just entered
  • In this case, there are some additional issues to consider:
    • Approaches that require remembering the timestamp of the user’s last update become more difficult, because the code running on one device doesn’t know what updates have happened on the other device. This metadata will need to be centralized
    • If your replicas are distributed across different datacenters, there is no guarantee that connections from different devices will be routed to the same datacenter (e.g., the user’s desktop computer uses the home broadband connection while their mobile device uses the cellular data network). If your approach requires reading from the leader, you may first need to route requests from all of a user’s devices to the same datacenter

2. Monotonic Reads

  • Another anomaly that can occur when reading from asynchronous followers is that it’s possible for a user to see things moving backward in time. This can happen if a user makes several reads from different replicas

read-backward

  • To prevent this anomaly, we need the monotonic reads guarantee. It’s a lesser guarantee than strong consistency, but a stronger guarantee than eventual consistency. Monotonic reads means only that if one user makes several reads in sequence, they will not read older data after having previously read newer data
  • One way of achieving monotonic reads is to make sure that each user always makes their reads from the same replica. For example, the replica can be chosen based on a hash of the user ID, rather than randomly. However, if that replica fails, the user’s queries will need to be rerouted to another replica
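The replica-pinning approach above can be sketched as follows (the function name is hypothetical). Note that a naive modulo hash reroutes many users whenever the replica list changes, which is why real systems often prefer consistent hashing; this sketch only shows the basic idea:

```python
import hashlib


def pick_replica(user_id, replicas):
    """Monotonic-reads sketch: route each user to a fixed replica chosen by
    hashing their user ID, so successive reads never go 'backward in time'
    by hitting a less caught-up replica.

    `replicas` is the current list of healthy replica names. If a replica
    fails, it is removed from the list and its users are rerouted, at which
    point monotonicity for those users is briefly at risk.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    index = int(digest, 16) % len(replicas)
    return replicas[index]
```

Because the choice depends only on the user ID and the replica list, repeated reads by the same user land on the same replica.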

3. Consistent Prefix Reads

  • Another anomaly concerns violation of causality. In this example, a third person is listening to the conversation through followers. To the observer, it looks as though Mrs. Cake is answering the question before Mr. Poons has even asked it

observe-conversation

  • To prevent this anomaly, we need the consistent prefix reads guarantee. This guarantee says that if a sequence of writes happens in a certain order, then anyone reading those writes will see them appear in the same order
  • This is a particular problem in partitioned (sharded) databases. If the database always applies writes in the same order, reads always see a consistent prefix, so this anomaly cannot happen. However, in many distributed databases, different partitions operate independently, so there is no global ordering of writes
  • One solution is to make sure that any writes that are causally related to each other are written to the same partition—but in some applications that cannot be done efficiently. There are also algorithms that explicitly keep track of causal dependencies
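The same-partition approach can be sketched as follows, assuming a hypothetical chat application where a conversation ID identifies causally related writes (all names here are invented for illustration):

```python
import hashlib

NUM_PARTITIONS = 8


def partition_for(conversation_id):
    """Consistent-prefix sketch: all messages in the same conversation
    (causally related writes) hash to one partition, which applies them
    in arrival order, so readers of that partition see the question
    before the answer."""
    digest = hashlib.sha256(conversation_id.encode()).digest()
    return digest[0] % NUM_PARTITIONS


class Partition:
    def __init__(self):
        self.log = []  # writes applied in arrival order

    def append(self, message):
        self.log.append(message)


partitions = [Partition() for _ in range(NUM_PARTITIONS)]


def send(conversation_id, message):
    partitions[partition_for(conversation_id)].append(message)
```

Writes to different conversations may still land on different partitions with no global ordering between them, which is acceptable because they are not causally related.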

4. Solutions for Replication Lag

  • If replication lag causes a bad experience for users, it’s important to design the system to provide a stronger guarantee, such as read-after-write. Pretending that replication is synchronous when in fact it is asynchronous is a recipe for problems down the line
  • There are ways in which an application can provide a stronger guarantee than the underlying database—for example, by performing certain kinds of reads on the leader. However, dealing with these issues in application code is complex and easy to get wrong. It would be better if application developers didn’t have to worry about subtle replication issues and could just trust their databases to “do the right thing.” This is why transactions exist: they are a way for a database to provide stronger guarantees so that the application can be simpler
  • Single-node transactions have existed for a long time. However, in the move to distributed (replicated and partitioned) databases, many systems have abandoned them, claiming that transactions are too expensive in terms of performance and availability, and asserting that eventual consistency is inevitable in a scalable system. There is some truth in that statement, but it is overly simplistic, and we will develop a more nuanced view

References

  • Designing Data-Intensive Applications