In event-sourced systems, state snapshots are used to alleviate the cost of computing state from event streams. Snapshots are essential for keeping processing overhead and latency in check when working with long-lived and/or high-traffic models.
The Actyx Pond ships with reasonable defaults for creating and retaining snapshots. However, in certain cases, snapshots may grow too large. This post outlines how to segment state and compress snapshots to avoid this.
The state of any given entity in an event-sourced system (a Fish in the Pond, in our case) at any point in time is defined by the stream of events relevant to this entity up to this time. The state is computed by applying these events one by one in chronological order. This means that the larger the number of events to apply, the more computational resources are required to reach the resulting state.
To prevent having to apply all relevant events each time we want to look at the state, we employ snapshots. A snapshot is the persisted result of computing the state for a given point in time. Now, when we look at the state, we don't have to apply all events but only those that happened after the time the snapshot was taken.
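To make the difference concrete, here is a minimal sketch of both approaches. The counter entity, its event and snapshot types and the function names are made up for illustration and are not part of the Pond API.

```typescript
// Hypothetical event, state and snapshot types for a simple counter entity.
type CounterEvent = { timestamp: number; delta: number }
type CounterState = { total: number }
type CounterSnapshot = { takenAt: number; state: CounterState }

const initialCounterState: CounterState = { total: 0 }

// Applying a single event yields the next state.
const applyCounterEvent = (state: CounterState, event: CounterEvent): CounterState => ({
  total: state.total + event.delta,
})

// Full replay: fold over all events in chronological order.
const replayAll = (events: CounterEvent[]): CounterState =>
  events.reduce(applyCounterEvent, initialCounterState)

// Snapshot-based replay: start from the snapshot and apply only later events.
const replayFromSnapshot = (snapshot: CounterSnapshot, events: CounterEvent[]): CounterState =>
  events
    .filter((e) => e.timestamp > snapshot.takenAt)
    .reduce(applyCounterEvent, snapshot.state)

const counterEvents: CounterEvent[] = [
  { timestamp: 1, delta: 5 },
  { timestamp: 2, delta: -2 },
  { timestamp: 3, delta: 10 },
]

// Snapshot taken after the first two events.
const counterSnapshot: CounterSnapshot = {
  takenAt: 2,
  state: replayAll(counterEvents.slice(0, 2)),
}

console.log(replayAll(counterEvents)) // { total: 13 }
console.log(replayFromSnapshot(counterSnapshot, counterEvents)) // { total: 13 }
```

Both calls arrive at the same state, but the second one only has to apply a single event.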
The Actyx Pond transparently manages snapshot creation, persistence and application for you. About every 1000 events, a snapshot is persisted, if the base event is older than one hour. Additionally, the Pond retains snapshots from the past to aid with longer time travel distances.
If an event leads to the state being completely replaced, you can let the Pond know by returning true from the fish's isReset function. This prevents the Pond from unnecessarily going back further in time to compute the state. You can find an example in Semantic Snapshots.
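The idea behind such a reset check can be sketched without the Pond. The event type, the function names and the replay logic below are assumptions made for illustration; the Pond's own isReset signature and replay strategy may differ.

```typescript
// Hypothetical event type: a "set" event replaces the state entirely,
// an "add" event only modifies it.
type LevelEvent = { type: 'set'; value: number } | { type: 'add'; value: number }

// Returns true for events that fully determine the state on their own.
const isResetEvent = (event: LevelEvent): boolean => event.type === 'set'

const applyLevelEvent = (level: number, event: LevelEvent): number =>
  event.type === 'set' ? event.value : level + event.value

// Replay that looks backwards only as far as the latest reset event.
const replayWithReset = (events: LevelEvent[]): { level: number; applied: number } => {
  let start = 0
  for (let i = events.length - 1; i >= 0; i--) {
    if (isResetEvent(events[i])) {
      start = i
      break
    }
  }
  const relevant = events.slice(start)
  return { level: relevant.reduce(applyLevelEvent, 0), applied: relevant.length }
}

const levelEvents: LevelEvent[] = [
  { type: 'add', value: 3 },
  { type: 'add', value: 4 },
  { type: 'set', value: 100 },
  { type: 'add', value: 1 },
]

console.log(replayWithReset(levelEvents)) // { level: 101, applied: 2 }
```

Everything before the set event is irrelevant for the resulting state, so only two of the four events need to be applied.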
So, while the Pond already takes care of a lot of things for you, there still are cases in which you have to or want to influence the default behavior.
One case that requires special care is a snapshot whose size exceeds 128MB. If that happens, the Pond will let you know by throwing the message "Cxn error: Max payload size exceeded" at you. It is then up to you to review your state management, implement mitigation measures and increase the FishId's version field afterwards.
While it is uncommon for fish to grow that large, there are cases in which it might be required. In any case, you should consider the state's estimated size over time in your designs so as not to be caught off guard.
In development, you can easily review the sizes of existing snapshots by hooking into the deserializeState function and logging the size of the state it receives. Just don't leave this enabled in production. State deserialization happens a lot.
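A hook along these lines could look like the following sketch. The state type, the helper names jsonSize and withSizeLogging and the way the result is passed to the fish are assumptions for illustration; the size is measured in characters of the JSON representation, which is only an approximation of the persisted size.

```typescript
// Hypothetical state type; replace with your fish's state.
type MachineState = { readings: number[] }

// Approximate size of a value once serialized to JSON (UTF-16 code units,
// which equals the byte count for pure ASCII content).
const jsonSize = (value: unknown): number => (JSON.stringify(value) ?? '').length

// Wraps a deserializer so that it reports the snapshot size before delegating.
const withSizeLogging = <S>(
  label: string,
  deserialize: (jsonState: unknown) => S,
  log: (message: string) => void = console.log,
) => (jsonState: unknown): S => {
  log(`${label}: snapshot size ~${jsonSize(jsonState)} characters`)
  return deserialize(jsonState)
}

// Development-only usage; assumed to be passed as the fish's deserializeState.
const deserializeMachineState = withSizeLogging(
  'MachineFish',
  (jsonState) => jsonState as MachineState,
)

const restoredMachineState = deserializeMachineState({ readings: [1, 2, 3] })
console.log(restoredMachineState.readings.length) // 3
```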
When designing your system, you'll want to model one physical object, process or concept from your problem domain as one fish. This helps you reason and talk about your business domain without having to mentally map additional abstractions. Oftentimes, this quite naturally leads to reasonably sized fish states. With the next version of Actyx, we're moving to the concept of local twins, which communicates this 1:1 relationship more explicitly.
Two scenarios that tend to lead to large fish states are a) time series data and b) exports of aggregated data to external systems like databases for analytics, especially if the target systems are unavailable periodically.
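For time series data, one way to keep individual states small is to segment them by time. The sketch below assigns each reading to a segment per sensor and UTC day; the reading type, the naming scheme and the choice of one day per segment are assumptions for illustration. Each segment name could then serve as the name of a separate fish.

```typescript
// Hypothetical sensor reading; timestamps are milliseconds since the epoch.
type Reading = { sensorId: string; timestamp: number; value: number }

// Derives the name of the segment a reading belongs to: one per sensor and UTC day.
const segmentName = (reading: Reading): string =>
  `${reading.sensorId}/${new Date(reading.timestamp).toISOString().slice(0, 10)}`

// Groups readings by segment, so each segment's state only holds one day of data.
const segmentReadings = (readings: Reading[]): Map<string, Reading[]> => {
  const result = new Map<string, Reading[]>()
  for (const reading of readings) {
    const key = segmentName(reading)
    const bucket = result.get(key) ?? []
    bucket.push(reading)
    result.set(key, bucket)
  }
  return result
}

const sensorReadings: Reading[] = [
  { sensorId: 'temp-1', timestamp: Date.UTC(2021, 0, 1, 8), value: 20.5 },
  { sensorId: 'temp-1', timestamp: Date.UTC(2021, 0, 1, 20), value: 21.0 },
  { sensorId: 'temp-1', timestamp: Date.UTC(2021, 0, 2, 8), value: 19.8 },
]

const readingSegments = segmentReadings(sensorReadings)
console.log(Array.from(readingSegments.keys())) // [ 'temp-1/2021-01-01', 'temp-1/2021-01-02' ]
```

With this scheme, a segment stops growing once its day is over, so its snapshot size stays bounded by one day's worth of readings.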
The other pattern that can lead to largish fish states, exporting to external systems, is a common one. If data from events maps more or less directly to rows in database relations in a 1:1 fashion, and if the database is available most of the time, there should be no issues in terms of state size. But if the state you're looking to export is computed from a larger number of different event types over a longer period of time, it may be necessary to keep more data around to figure out which parts of the database to update. This challenge and its solution patterns are discussed in more detail in Real-time dashboards and reports made efficient and resilient.
In this case, compressing the fish state's snapshots helps to avoid running into the 128MB snapshot size limit.
The Pond documentation mentions the possibility of compressing snapshots. Let's walk through implementing it together.
First, we need a suitable compression library. Our own Benjamin Sieffert recommends Pako, so we'll stick to that for now. However, there are others as well. If you do decide to evaluate them, it would be great if you could share the results.
The following sample explores how to use Pako in isolation and how much it compresses some sample data. To generate a reasonable amount of random data, we use the popular faker library. We'll compress and decompress a string and an array of objects, look at the compression ratio and make sure the roundtrip does not mess with our data.
To install the required packages, run npm install pako faker and npm install @types/pako @types/faker --save-dev for the corresponding type definitions.
This should give us something akin to the following results. We can see that our data is compressed by a factor of roughly 3.5. The achievable compression ratio obviously depends on your input data, so I encourage you to run the example on sample data from your application.
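A roundtrip of this kind can be sketched as follows, under a few assumptions. It uses Node's built-in zlib module as a stand-in for Pako so that it runs without extra packages, and it generates deterministic sample data instead of using faker. The helper names compressJson and decompressJson are made up for this illustration.

```typescript
import { deflateSync, inflateSync } from 'zlib'

// Deterministic, repetitive sample data (hypothetical; stands in for faker output).
const sampleRecords = Array.from({ length: 500 }, (_, i) => ({
  id: i,
  machine: `machine-${i % 10}`,
  mode: i % 3 === 0 ? 'idle' : 'running',
}))

// Serialize to JSON, then compress the UTF-8 bytes.
const compressJson = (value: unknown): Buffer =>
  deflateSync(Buffer.from(JSON.stringify(value), 'utf8'))

// Decompress and parse back into a value.
const decompressJson = (data: Buffer): unknown =>
  JSON.parse(inflateSync(data).toString('utf8'))

const rawBytes = Buffer.byteLength(JSON.stringify(sampleRecords), 'utf8')
const compressedRecords = compressJson(sampleRecords)
const roundtripped = decompressJson(compressedRecords)

console.log(`raw: ${rawBytes} bytes, compressed: ${compressedRecords.length} bytes`)
console.log(`ratio: ${(rawBytes / compressedRecords.length).toFixed(1)}`)
console.log(`roundtrip intact: ${JSON.stringify(roundtripped) === JSON.stringify(sampleRecords)}`)
```

The ratio this prints depends heavily on how repetitive the data is, so it should not be expected to match the figure quoted above.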
Now that we know how to use the compression library and what to expect from it, let's integrate it into our fish.
As a test scenario, we'll emit an event with the current datetime every few milliseconds and subscribe to it once with and once without compressing the snapshots. After we keep that running for a few hours, we compare the snapshot sizes as described above.
BoringFish just stores all events it receives. We'll keep deserializeState from above to track the state's size.
In contrast, the CompressingFish implements compression using Pako by implementing deserializeState() in the fish and toJSON() in the state.
toJSON() will return the compressed data, which might be counter-intuitive. You can think of toJSON() as "serialize".
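The interplay of toJSON() and deserializeState() can be sketched in isolation as shown below. This is not the original fish: Node's built-in zlib module stands in for Pako, and the state class, the helper names pack and unpack and the base64 encoding of the compressed bytes are assumptions made for illustration.

```typescript
import { deflateSync, inflateSync } from 'zlib'

// Compression helpers: JSON -> zlib -> base64 string, and back.
const pack = (value: unknown): string =>
  deflateSync(Buffer.from(JSON.stringify(value), 'utf8')).toString('base64')
const unpack = (packed: string): unknown =>
  JSON.parse(inflateSync(Buffer.from(packed, 'base64')).toString('utf8'))

// Hypothetical state of the compressing fish: a list of received timestamps.
class CompressingState {
  constructor(readonly items: string[]) {}

  // Called by JSON.stringify; returns the compressed form instead of the raw items.
  toJSON(): string {
    return pack(this.items)
  }
}

// Counterpart to toJSON: turns the persisted snapshot back into a state object.
const deserializeCompressingState = (jsonState: unknown): CompressingState =>
  new CompressingState(unpack(jsonState as string) as string[])

// Event handler that returns a new instance of the toJSON-carrying class on every update.
const onTimestamp = (state: CompressingState, timestamp: string): CompressingState =>
  new CompressingState([...state.items, timestamp])

const fishState = onTimestamp(new CompressingState([]), '2021-01-01T00:00:00.000Z')
const persisted = JSON.stringify(fishState) // what a snapshot store would write
const restoredState = deserializeCompressingState(JSON.parse(persisted))
console.log(restoredState.items) // [ '2021-01-01T00:00:00.000Z' ]
```

Note that every state returned from the event handler has to be an instance of the class carrying toJSON(), otherwise the snapshot would silently be written uncompressed.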
When we keep this running for some time, we should see that ...
- ... both fish have the same number of items in their state
- ... the size of the compressed snapshot should be significantly smaller than the uncompressed one (well, duh!)
And indeed, the logs confirm both assumptions.
Now that we got it working, let's look at the code we've produced. Wrangling toJSON into our state in multiple locations is pretty ugly and could get out of hand quickly with growing numbers of fish. We mixed up our business code (the state) with technical concerns (serialization). Let's see whether we can do better. Wouldn't it be nice to have a way to make existing fish compress their state without us having to modify them?
To do so, we can implement a wrapper for existing fish, providing the functions required for (de-)compression as decoration. This requires the following parts:
- A generic wrapper for State types adding the toJSON function
- A function accepting a fish and returning the decorated one
The decoration consists of the toJSON and deserializeState functions we discussed above. Additionally, we need to allow the system to discriminate between compressed and raw snapshots. Every other part is just delegated to the original fish.
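A minimal sketch of this decoration is shown below. It does not use the Pond's actual Fish type: SimpleFish is a reduced, hypothetical stand-in, the names CompressedStateOf and withCompression are made up, and Node's built-in zlib module again stands in for Pako.

```typescript
import { deflateSync, inflateSync } from 'zlib'

// Minimal, hypothetical stand-in for a fish definition; not the Pond's actual type.
type SimpleFish<S, E> = {
  initialState: S
  onEvent: (state: S, event: E) => S
  deserializeState?: (jsonState: unknown) => S
}

// Marker shape that lets us tell compressed snapshots apart from raw ones.
type CompressedSnapshot = { compressed: true; data: string }
const isCompressedSnapshot = (json: unknown): json is CompressedSnapshot =>
  typeof json === 'object' && json !== null &&
  (json as CompressedSnapshot).compressed === true &&
  typeof (json as CompressedSnapshot).data === 'string'

// Generic state wrapper that adds toJSON without touching the wrapped state type.
class CompressedStateOf<S> {
  constructor(readonly inner: S) {}
  toJSON(): CompressedSnapshot {
    const bytes = deflateSync(Buffer.from(JSON.stringify(this.inner), 'utf8'))
    return { compressed: true, data: bytes.toString('base64') }
  }
}

// Accepts a fish and returns a decorated one whose snapshots are compressed.
const withCompression = <S, E>(fish: SimpleFish<S, E>): SimpleFish<CompressedStateOf<S>, E> => {
  const deserializeInner = fish.deserializeState ?? ((json: unknown) => json as S)
  return {
    initialState: new CompressedStateOf(fish.initialState),
    // Business logic is delegated to the original fish.
    onEvent: (state, event) => new CompressedStateOf(fish.onEvent(state.inner, event)),
    deserializeState: (jsonState) => {
      const raw = isCompressedSnapshot(jsonState)
        ? JSON.parse(inflateSync(Buffer.from(jsonState.data, 'base64')).toString('utf8'))
        : jsonState // raw snapshot, e.g. written before compression was introduced
      return new CompressedStateOf(deserializeInner(raw))
    },
  }
}

// Usage with an unmodified fish that collects strings.
const plainFish: SimpleFish<string[], string> = {
  initialState: [],
  onEvent: (state, event) => [...state, event],
}
const wrappedFish = withCompression(plainFish)
const wrappedState = wrappedFish.onEvent(wrappedFish.initialState, 'first')
const snapshotJson = JSON.parse(JSON.stringify(wrappedState))
console.log(wrappedFish.deserializeState?.(snapshotJson).inner) // [ 'first' ]
```

The original fish stays untouched. The marker field is what allows the decorated deserializeState to accept both compressed and raw snapshots.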
Note that, if you observe the original fish alongside the wrapped instance, you will still get uncompressed snapshots as well.
Kudos to Alex for coming up with this pattern.
To see it in action, we just add a wrapped fish to our Pond from the test scenario.
When looking at the logs now, we see that the size of the wrapped fish's state corresponds to the one we manually added compression to.
We looked at some ways to reason about the size of fish states and how to influence the way snapshots are persisted.
If you assume you'll be running into the 128MB snapshot size limitation, your first impulse should be to validate whether a state this large really is required. Check whether it is possible to segment the state and/or clean it up as part of the segmentation. Besides the size limitation, carrying around a lot of large state snapshots can have a negative impact on the system's performance. Also verify that, if you're pushing state to an external system, it is not unavailable for longer periods of time.
In case this does not mitigate the issue, you can use the compression wrapper to compress the snapshots.
You can use the code above to create similar scenarios using your own data to validate it. Also, do not hesitate to get in touch. We're always curious to learn how you're using Actyx, what works for you and where your pain points are.
Credits: pufferfish photo by Brian Yurasits