# State Management

A Data Application manages two distinct types of state: **OnChainState** and **CalculatedState**, each serving unique purposes in the metagraph architecture.

#### OnChain State <a href="#onchain-state" id="onchain-state"></a>

OnChainState contains all the information intended to be permanently stored on the blockchain. This state represents the immutable record of all updates that have been validated and accepted by the network.

It typically includes:

* A history of all data updates
* Transaction records
* Any data that requires blockchain-level immutability and auditability

OnChainState is replicated across all nodes in the network and becomes part of the chain's immutable record via inclusion in a snapshot. Keep it compact and limited to essential information, because it adds to storage requirements and snapshot fees.

#### Calculated State <a href="#calculated-state" id="calculated-state"></a>

CalculatedState can be thought of as a metagraph's working memory, containing essential aggregated information derived from the complete chain of OnChainState. It is not stored on chain itself, but can be reconstructed by traversing the network's chain of snapshots and applying the `combine` function to them.

CalculatedState typically:

* Provides optimized data structures for querying
* Contains aggregated or processed information
* Stores derived data that can be reconstructed from OnChainState if needed

CalculatedState is maintained by each node independently and can be regenerated from the OnChainState if necessary. This makes it ideal for storing derived data, indexes, or sensitive information.
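The reconstruction described above can be pictured as a left fold over the chain of snapshots. The following is a minimal sketch in plain Scala, not the framework's implementation: `Replay`, `rebuild`, and the vote-tally example are hypothetical names, and each snapshot is reduced to a plain list of updates.

```scala
object Replay {
  // Replays per-snapshot update batches, oldest first, through a combine-like step.
  // `step` plays the role of combine; its shape here is an assumption for illustration.
  def rebuild[S, U](initial: S, snapshots: List[List[U]])(step: (S, List[U]) => S): S =
    snapshots.foldLeft(initial)(step)
}

object ReplayExample {
  // Two snapshots worth of on-chain vote records (hypothetical data).
  val snapshots: List[List[String]] = List(List("alice", "bob"), List("alice"))

  // Derived view: votes per name, rebuilt purely from the on-chain records.
  val tally: Map[String, Int] =
    Replay.rebuild(Map.empty[String, Int], snapshots) { (acc, votes) =>
      votes.foldLeft(acc)((m, v) => m.updated(v, m.getOrElse(v, 0) + 1))
    }
}
```

Because the derived view depends only on the initial value and the ordered updates, any node replaying the same snapshots arrives at the same result.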

### Creating State Classes <a href="#creating-state-classes" id="creating-state-classes"></a>

Each state described above represents functionality from the Data Application. To create these states, you need to implement custom traits provided by the Data Application:

* The OnChainState must extend the `DataOnChainState` trait
* The CalculatedState must extend the `DataCalculatedState` trait

Both traits, `DataOnChainState` and `DataCalculatedState`, can be found in the tessellation repository.

Here's a simple example of state definitions:

```
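import derevo.circe.magnolia.{decoder, encoder}
import derevo.derive

// DataOnChainState and DataCalculatedState come from tessellation; PollUpdate and Poll are application-defined.
@derive(decoder, encoder)
case class VoteStateOnChain(updates: List[PollUpdate]) extends DataOnChainState

@derive(decoder, encoder)
case class VoteCalculatedState(polls: Map[String, Poll]) extends DataCalculatedState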
```

### Updating State <a href="#updating-state" id="updating-state"></a>

The DataAPI includes several lifecycle functions crucial for the proper functioning of the metagraph.

You can review all these functions in the [Lifecycle Functions](/metagraph-development/metagraph-framework/data/lifecycle-functions.md) section.

In this section, we'll focus on two of them: `combine` and `setCalculatedState`.

#### combine <a href="#combine" id="combine"></a>

`combine` is the central function for updating state. It processes incoming requests/updates by either adding to or overwriting the existing states. Here is the function's signature:

```
override def combine(
  currentState: DataState[OnChainState, CalculatedState],
  updates: List[Signed[Update]]
): IO[DataState[OnChainState, CalculatedState]]
```

The `combine` function is invoked after the requests have been validated at both layers (L0 and L1) using the `validateUpdate` and `validateData` functions.

The `combine` function receives the `currentState` and the `updates`:

* `currentState`: As the name indicates, this is the state of your metagraph as of the last update received.
* `updates`: This is the list of incoming updates. It may be empty if no updates were submitted for the current snapshot.

The output of this function is also a state, reflecting the new state of the metagraph post-update. Therefore, it's crucial to ensure that the function returns the correct updated state.

Returning to the `water and energy usage` example, you can review the implementation of the combine function [here](https://github.com/Constellation-Labs/metagraph-examples/blob/main/examples/water-and-energy-usage/modules/shared_data/src/main/scala/com/my/water_and_energy_usage/shared_data/combiners/Combiners.scala). In this implementation, the function retrieves the current value of water or energy and then increments it based on the amount specified in the incoming request for the `CalculatedState`, while also using the current updates as the `OnChainState`.
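The behaviour described above can be sketched with simplified stand-in types. This is not the code from the linked example: `EnergyUpdate`, `EnergyOnChain`, `EnergyCalculated`, `UsageState`, and `CombineSketch` are hypothetical names, and the real function works with `DataState`, `Signed[Update]`, and an effect type rather than plain values.

```scala
// Simplified stand-ins for illustration only (not the tessellation types).
final case class EnergyUpdate(address: String, amount: Long)
final case class EnergyOnChain(updates: List[EnergyUpdate])
final case class EnergyCalculated(totals: Map[String, Long])
final case class UsageState(onChain: EnergyOnChain, calculated: EnergyCalculated)

object CombineSketch {
  def combine(current: UsageState, updates: List[EnergyUpdate]): UsageState = {
    // CalculatedState: add each incoming amount to the running total per address.
    val newTotals = updates.foldLeft(current.calculated.totals) { (acc, u) =>
      acc.updated(u.address, acc.getOrElse(u.address, 0L) + u.amount)
    }
    // OnChainState: only the updates handled in this round are recorded.
    UsageState(EnergyOnChain(updates), EnergyCalculated(newTotals))
  }
}
```

With an empty `updates` list, this sketch leaves the totals unchanged and produces an empty on-chain record for that round.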

#### setCalculatedState <a href="#setcalculatedstate" id="setcalculatedstate"></a>

After `combine` has run and the snapshot has been accepted through consensus, we obtain the `majority snapshot`. This becomes the official snapshot for the metagraph. At this point, we invoke the `setCalculatedState` function to update the `CalculatedState`.

This state is typically stored `in memory`, although user preferences may dictate alternative storage methods. You can explore the implementation of storing the `CalculatedState` in memory by checking the [CalculatedState.scala](https://github.com/Constellation-Labs/metagraph-examples/blob/main/examples/water-and-energy-usage/modules/shared_data/src/main/scala/com/my/water_and_energy_usage/shared_data/calculated_state/CalculatedState.scala) and [CalculatedStateService.scala](https://github.com/Constellation-Labs/metagraph-examples/blob/main/examples/water-and-energy-usage/modules/shared_data/src/main/scala/com/my/water_and_energy_usage/shared_data/calculated_state/CalculatedStateService.scala) classes, where we have detailed examples.
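As a rough illustration of in-memory storage, the sketch below keeps the latest state together with its snapshot ordinal in an `AtomicReference`. It is an assumption-laden simplification, not the linked `CalculatedStateService`: `Stored` and `InMemoryCalculatedState` are hypothetical names, the ordinal is a plain `Long`, and the methods return plain values rather than effects.

```scala
import java.util.concurrent.atomic.AtomicReference

// The calculated state together with the snapshot ordinal it corresponds to.
final case class Stored[S](ordinal: Long, state: S)

final class InMemoryCalculatedState[S](initial: S) {
  // Thread-safe holder for the most recent state; starts at ordinal 0 (an assumption).
  private val ref = new AtomicReference(Stored(0L, initial))

  def get: Stored[S] = ref.get()

  // Overwrites the stored state with the one computed for the accepted snapshot.
  def set(ordinal: Long, state: S): Boolean = {
    ref.set(Stored(ordinal, state))
    true
  }
}
```

Swapping this holder for a database-backed one would change only where `get` and `set` read and write, which is the flexibility the paragraph above refers to.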

In the sections below, we will discuss `serializers` used to serialize the states.

### Serializers/Deserializers <a href="#serializersdeserializers" id="serializersdeserializers"></a>

We also utilize other lifecycle functions for `serialize/deserialize` processes, each designed specifically for different types of states.

For the `OnChainState`, we use the following functions:

```
def serializeState(
  state: OnChainState
): F[Array[Byte]]

def deserializeState(
  bytes: Array[Byte]
): F[Either[Throwable, OnChainState]]
```

For the `CalculatedState` we have:

```
def serializeCalculatedState(
  state: CalculatedState
): F[Array[Byte]]

def deserializeCalculatedState(
  bytes: Array[Byte]
): F[Either[Throwable, CalculatedState]]
```

The `OnChainState` serializer is employed during the snapshot production phase, prior to consensus, when nodes propose snapshots to become the official one. Once the official snapshot is selected, based on the majority, the `CalculatedState` serializer is used to serialize this state and store the `CalculatedState` on disk.

The deserialization functions are invoked when constructing states from the `snapshots/calculatedStates` stored on disk. For instance, when restarting a metagraph, it's necessary to retrieve the state prior to the restart from the stored information on disk.
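To make the round trip concrete, here is a small sketch of a serializer/deserializer pair for a `Map[String, Long]` state. It uses a hand-rolled `key=value` text encoding purely to stay dependency-free; a real application would more likely use a JSON codec. `TotalsCodec` is a hypothetical name, and the effect wrapper `F[...]` from the signatures above is omitted.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import scala.util.Try

object TotalsCodec {
  // Encodes entries as sorted "key=value" lines so equal states yield identical bytes.
  // Assumption: keys contain neither '=' nor newline characters.
  def serialize(totals: Map[String, Long]): Array[Byte] =
    totals.toList.sortBy(_._1).map { case (k, v) => s"$k=$v" }.mkString("\n").getBytes(UTF_8)

  // Returns Left with the underlying error when the bytes are not in the expected format.
  def deserialize(bytes: Array[Byte]): Either[Throwable, Map[String, Long]] =
    Try {
      new String(bytes, UTF_8).split("\n").filter(_.nonEmpty).map { line =>
        val Array(k, v) = line.split("=", 2)
        k -> v.toLong
      }.toMap
    }.toEither
}
```

Returning `Either[Throwable, ...]` mirrors the shape of the deserialization signatures above: malformed bytes surface as a `Left` instead of an exception.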

In the following section, we will provide a detailed explanation about disk storage.

### Disk Storage <a href="#disk-storage" id="disk-storage"></a>

When operating a Metagraph on layer 0 (ml0), a directory named `data` is created. This directory is organized into the following subfolders:

* `incremental_snapshot`: Contains the Metagraph snapshots.
* `snapshot_info`: Stores information about the snapshots, including internal states like balances.
* `calculated_state`: Holds the Metagraph calculated state.

Focusing on the `calculated_state`, within this folder, files are named after the snapshot ordinal. These files contain the CalculatedState corresponding to that ordinal. We employ a logarithmic cutoff strategy to manage the storage of these states.

This folder is crucial when restarting the Metagraph. It functions as a `checkpoint`: instead of rerunning the entire chain to rebuild the `CalculatedState`, we utilize the files in the `calculated_state` directory. This method allows us to rebuild the state more efficiently, saving significant time.
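The checkpoint idea can be sketched as two small helpers: pick the newest stored ordinal at or before the target, then replay only the ordinals after it. This is an illustration, not the framework's restart logic: `Checkpoints` and its methods are hypothetical, file names are assumed to be plain ordinals, and the logarithmic cutoff strategy is not modelled.

```scala
object Checkpoints {
  // File names are assumed to be plain snapshot ordinals, e.g. "128"; anything else is ignored.
  def latestAtOrBefore(fileNames: List[String], target: Long): Option[Long] =
    fileNames.flatMap(_.toLongOption).filter(_ <= target).maxOption

  // Ordinals that still need to be replayed through combine after loading a checkpoint.
  def toReplay(checkpoint: Long, target: Long): List[Long] =
    ((checkpoint + 1) to target).toList
}
```

When no checkpoint is found, the caller would fall back to replaying the whole chain.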

### Data Privacy <a href="#data-privacy" id="data-privacy"></a>

As previously mentioned, the `CalculatedState` serves a crucial role by allowing the storage of any type of information discreetly, without exposing it to the public. This functionality is particularly useful for safeguarding sensitive data. When you use the `CalculatedState`, you can access your information whenever necessary, but it is never recorded on the blockchain. This offers an added layer of security, ensuring that sensitive data is not accessible or visible on the decentralized ledger.

By leveraging `CalculatedState`, organizations can manage proprietary or confidential information such as personal user data, trade secrets, or financial details securely within the metagraph architecture. The integrity and privacy of this data are maintained, as it is stored in a secure compartment separated from the public blockchain.

### Scalability <a href="#scalability" id="scalability"></a>

Metagraphs face a constraint on snapshot size: snapshots must not exceed 500 KB. Snapshots that surpass this threshold are rejected, which can significantly limit the amount of information that can be recorded on the blockchain.

This is where the CalculatedState becomes particularly valuable. Because it is not stored in snapshots, it can hold data that would not fit within the snapshot size limit. Moreover, CalculatedState offers flexibility in terms of storage preferences, enabling users to choose how and where their data is stored.

This functionality not only alleviates the burden of blockchain size limitations but also enhances data management strategies. By utilizing CalculatedState, organizations can efficiently manage larger datasets, secure sensitive information off-chain, and optimize their blockchain resources for critical transactional data.
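One way to keep this limit in view during development is a rough size guard on the serialized on-chain state. The sketch below is hypothetical: `SnapshotBudget` is not part of the framework, the limit is read as 500 * 1024 bytes (an assumption about the unit), and a real snapshot contains more than the on-chain state, so this check is only a lower bound.

```scala
object SnapshotBudget {
  // Assumption: the snapshot size limit is interpreted here as 500 * 1024 bytes.
  val MaxSnapshotBytes: Int = 500 * 1024

  // True when the serialized on-chain state alone stays within the assumed limit.
  def fits(serializedOnChain: Array[Byte]): Boolean =
    serializedOnChain.length <= MaxSnapshotBytes
}
```

If the check fails, moving bulky or derived data into the CalculatedState is the approach this section suggests.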
