To consume non-backwards-compatible major version upgrades of the Canton software the Global Synchronizer uses, you need to follow the procedure for synchronizer (i.e., domain) upgrades with downtime described below.
Overview
This overview is trimmed to what is most relevant for the operators of regular (non-super) validators. For a more comprehensive overview, please refer to the documentation for SV operators.

- Canton releases containing breaking changes become available, and Splice releases compatible with these Canton releases become available.
- SVs agree and eventually confirm via an on-ledger vote on which specific date and time the network downtime necessary for the upgrade will start. Information about the downtime window is communicated to validators.
- At the start of the downtime window, the SVs automatically pause all traffic on the operating version of the global synchronizer.
- Shortly after traffic on the global synchronizer has been paused (there is a short delay to ensure that all components have synced up to the final state of the existing synchronizer), the validator node software automatically exports so-called migration dumps. In a Kubernetes deployment, this dump is saved to an attached Kubernetes volume. In Docker-Compose deployments, the dump is saved to a Docker volume. See validator-upgrades-dumps.
- Validator operators verify that their nodes have caught up to the now-paused global synchronizer. See validator-upgrades-catching-up.
- All SVs and validators previously using the now-paused global synchronizer create full backups of their nodes (both for disaster recovery and for supporting audit requirements). See validator-backups.
- Validators wait until the SVs signal that the migration has been successful.
- All validators upgrade their deployments. See validator-upgrades-deploying.
- Upon (re-)initialization, the validator backend automatically consumes the migration dump and initializes the new validator participant based on the contents of this dump. App databases are preserved.
Note
This process creates a new synchronizer instance. Because synchronizer traffic balances <traffic> are tracked per synchronizer instance, all validator traffic balances start at zero on this new instance. The remaining traffic on the old synchronizer instance cannot be used anymore once that instance is shut down, effectively resulting in a loss of that balance.

Validator operators are thus strongly encouraged to purchase traffic on a pay-as-you-go basis in small enough increments that the cost of the remaining traffic balance lost due to a synchronizer upgrade with downtime is acceptable and easily amortized across the activity of the validator node on the old synchronizer instance.

Technical Details
The following section assumes that the validator node connected to the original synchronizer has already been deployed.

State and Migration IDs
Synchronizer upgrades with downtime effectively clone the state of the existing synchronizer, with some finer points:

- All identities are reused, which includes all party IDs and participant identities. This is realized through exporting and importing a migration dump.
- Active ledger state is preserved. This is realized through exporting and importing a migration dump.
- Historical app state in the validator app (such as transaction
history) is preserved. This is realized through persisting and
reusing the (PostgreSQL) database of the validator app. The
transaction history exposed by the participant, however, is not
preserved and the participant will only serve history going forward.
See also
validator-upgrades-apps.
- The migration ID is 0 during the initial bootstrapping of a network and incremented after each synchronizer upgrade with downtime.
- The validator app is aware of the migration ID and uses it for ensuring the consistency of its internal stores and avoiding connections to nodes on the “wrong” synchronizer.
- The validator Canton participant is not directly aware of the migration ID. As part of validator-upgrades-deploying, you will configure the participant to use a fresh (empty) database. The validator app will initialize the participant from a clean slate based on the migration ID configured in the validator app. A fresh participant is needed in order to upgrade across non-backwards-compatible changes to the Canton software. (A sketch of how the migration ID surfaces in a deployment follows this list.)
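For orientation, a minimal sketch of checking which migration ID your deployment is currently configured with. The release name and namespace (both "validator" here) are assumptions about your Helm deployment; the participant database naming pattern matches the participant_<MIGRATION_ID> convention used later in this guide:

```
# Illustrative sketch only: inspect the migration-related values of your
# deployed validator release (release name "validator" and namespace
# "validator" are assumptions about your deployment).
helm get values validator -n validator | grep -A 2 migration

# The participant database name is derived from the migration ID:
# participant_0 for migration ID 0, participant_1 after the first
# upgrade with downtime, and so on (see the deployment steps below).
```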
Implications for Apps and Integrations
This guide focuses on the steps necessary for upgrading the validator node itself. Additional considerations may apply for ensuring that custom applications and integrations remain functional and consistent across major upgrades. As a consequence of validator-upgrades-state,
additional considerations may include the following:
- A major upgrade only preserves the active contracts but not the update history inside the participant. In particular, you will not be able to get transactions from before the major upgrade on the update service on the Ledger API of the newly deployed validator node.
- Participant offsets on the upgraded validator node start from 0 again.
- The update history will include special import transactions for the contracts imported from the old synchronizer. They all have record time 0001-01-01T00:00:00.000000Z, and represent the creation of the imported contracts. (A sketch of filtering them out follows this list.)
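If an application consumes the update history, it may want to separate these import transactions from regular ones. A minimal sketch, assuming you have exported updates to a JSON array in which each entry carries a record_time field; the file name and field name are assumptions about your own export format, not a fixed API of the validator node:

```
# Hypothetical sketch: split an exported updates.json into import
# transactions (record time at the epoch minimum, per the list above)
# and regular transactions.
jq '[.[] | select(.record_time == "0001-01-01T00:00:00.000000Z")]' updates.json > imports.json
jq '[.[] | select(.record_time != "0001-01-01T00:00:00.000000Z")]' updates.json > regular.json
```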
Migration Dumps
Migration dumps contain identity and transaction data from the validator participant. The migration dump is automatically created once a scheduled synchronizer upgrade begins and the existing synchronizer has been paused. When redeploying the validator app as part of the migration process (see validator-upgrades-deploying), the validator app will
automatically consume the migration dump and initialize the participant
based on the contents of this dump.
For Kubernetes deployments that use the official Helm charts and follow the Helm-based deployment documentation, a persistent Kubernetes volume is attached to the validator-app pod and configured as the target storage location for migration dumps. Similarly, for Docker-Compose deployments, a Docker volume is created, mounted to the validator-app container, and configured as the target storage location for migration dumps.
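To see where the dump will land in your own deployment, you can inspect the storage directly. A sketch under the assumption that your Kubernetes namespace is "validator"; the PVC name (domain-migration-validator-pvc) and compose volume name (domain-upgrade-dump) are taken from the troubleshooting section of this guide:

```
# Kubernetes: confirm the persistent volume claim used for migration dumps
# (namespace "validator" is an assumption about your deployment).
kubectl get pvc domain-migration-validator-pvc -n validator

# Docker Compose: confirm the volume exists. Compose may prefix volume
# names with the project name; "docker volume ls" helps locate it.
docker volume inspect domain-upgrade-dump
```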
Catching Up Before the Migration
In order for the migration to the new synchronizer to be safe and successful, it is important that the validator node is fully caught up on the existing synchronizer before proceeding to validator-upgrades-deploying.
- To ensure that the validator participant has caught up and the migration dump has been created as expected, operators can check the logs of the validator-app pod for Wrote domain migration dump messages. If you can't find it, you can alternatively go inside the container (kubectl exec -it <validator-pod> -- bash) and check whether it contains the migration dump with the expected date (ls -lha /domain-upgrade-dump).
- To ensure that the validator app has caught up, operators can check the logs of the validator-app pod for the message Ingested transaction. If the latest such message is 10 or more minutes old, the validator app has very likely (with a large safety margin) caught up to the state on the participant, and hence to the state of the existing (paused) synchronizer. (A sketch of these checks follows this list.)
- Note that the sequencers of the existing (old) synchronizer will be kept available by SVs for a limited time after the migration to the new synchronizer has been completed. Once they are shut down, the validator will not be able to catch up anymore. You should therefore ensure that your node is caught up and migrated to the new synchronizer in a timely manner after the migration.
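A minimal sketch of these checks for a Helm-based deployment; the namespace "validator" and the pod selection are assumptions about your deployment, while the log messages and the /domain-upgrade-dump path are matched verbatim from the list above:

```
# Namespace "validator" is an assumption; substitute your own pod name.
VALIDATOR_POD=$(kubectl get pods -n validator -o name | grep validator-app | head -n 1)

# 1. Confirm the migration dump was written.
kubectl logs -n validator "$VALIDATOR_POD" | grep "Wrote domain migration dump"

# Alternatively, check for the dump file itself and its timestamp.
kubectl exec -n validator -it "$VALIDATOR_POD" -- ls -lha /domain-upgrade-dump

# 2. Confirm ingestion has settled: the last "Ingested transaction"
#    message should be 10 or more minutes old.
kubectl logs -n validator "$VALIDATOR_POD" | grep "Ingested transaction" | tail -n 1
```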
Deploying the Validator App and Participant
Deploying the Validator App and Participant (Kubernetes)
This section refers to validators that have been deployed in Kubernetes using the instructions in k8s_validator.
Once you have confirmed that your validator is caught up, as explained above,
confirm that a migration dump has been created by searching the logs of
the validator-app pod for Wrote domain migration dump messages.
Repeat the steps described in helm-validator-install for installing
the validator app and participant, substituting the migration ID
(MIGRATION_ID) with the target migration ID after the upgrade
(typically the existing synchronizer’s migration ID + 1).
While doing so, please note the following:
- Please make sure to pick the correct (incremented) MIGRATION_ID when following the steps. Notably, by consistently following through on updating the MIGRATION_ID, you should (re-)deploy your participant so that it uses a fresh (empty) database. (In case your database setup requires you to create databases manually, for example because you want to limit the permissions of the database user used by the participant deployment, please ensure that you have created the new database as per the updated .persistence.databaseName value on the participant chart.)
- Please modify the file splice-node/examples/sv-helm/standalone-validator-values.yaml so that migration.migrating is set to true. This will ensure that the validator app consumes the migration dump and initializes the participant based on the contents of this dump. (A sketch follows this list.)
- You do not need to redeploy the postgres release. The updated Canton participant will use a new database on the PostgreSQL instance, whereas the validator app will reuse the existing state (see validator-upgrades-state).
- Use helm upgrade in place of helm install for the validator chart.
- Please make sure that Helm chart deployments are upgraded to the expected Helm chart version; during an actual upgrade this version will be different from the one on your existing deployment.
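For illustration, a hedged sketch of what these steps might look like on the command line. The release name, namespace, and chart reference are assumptions about your deployment; migration.migrating and the values file path are the names given in the list above:

```
# Illustrative sketch only; release name, namespace, and chart reference
# are assumptions about your deployment.
#
# In standalone-validator-values.yaml, set (values key from the list above):
#   migration:
#     migrating: true
#
# Then upgrade, with the incremented MIGRATION_ID applied throughout:
helm upgrade validator <validator-chart-reference> \
  -n validator \
  -f splice-node/examples/sv-helm/standalone-validator-values.yaml \
  --version <target-chart-version>
```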
See validator_health for pointers on determining the status of your
validator after the migration. In case of issues, check your logs for
warnings and errors and consult validator-migration-troubleshooting
below.
Once you have confirmed that the migration has been successful:
- Change the migration.migrating value on the validator chart back to false (keep the incremented MIGRATION_ID!) and perform another helm upgrade.
- The old participant database (usually participant_<OLD_MIGRATION_ID>) is no longer used and can be pruned. We recommend retaining it (or a current backup thereof) for at least another week after the migration, in case the synchronizer migration needs to be rolled back due to an unexpected major issue. (A sketch of pruning it follows this list.)
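A hedged sketch of pruning the old database once the retention window has passed; the connection details are assumptions about your PostgreSQL setup, while the participant_<OLD_MIGRATION_ID> naming pattern comes from the list above (shown here for old migration ID 0):

```
# Only after the retention window has passed and you are sure no rollback
# is needed. Host and user are assumptions about your setup.
pg_dump -h <postgres-host> -U <postgres-user> participant_0 > participant_0_final_backup.sql
psql -h <postgres-host> -U <postgres-user> -d postgres -c 'DROP DATABASE participant_0;'
```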
Deploying the Validator App and Participant (Docker-Compose)
This section refers to validators that have been deployed in Docker-Compose using the instructions in compose_validator.
Once you have confirmed that your validator is caught up and that a migration dump has been created, as explained above, proceed as follows:
- Stop the validator using ./stop.sh.
- In case of an actual version upgrade (not just a test migration), upgrade your validator to the target version by updating the bundle and adjusting the IMAGE_TAG as you would during a minor upgrade.
- Restart the validator, while updating the migration ID in the -m <migration ID> argument, and also including -M to instruct the validator to perform the actual migration to the new migration ID. (A sketch follows this list.)
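A hedged sketch of the restart, assuming the compose bundle's start script accepts the -m and -M flags described above and that the previous migration ID was 0; any other arguments from your usual start command still apply:

```
# Illustrative sketch; ./start.sh and its other arguments are assumptions
# about your compose bundle. -m and -M are the flags described above.
./stop.sh

# ...update the bundle and IMAGE_TAG in case of an actual version upgrade...

# Restart with the incremented migration ID (here: 0 -> 1) and -M to
# perform the actual migration.
./start.sh -m 1 -M <your-usual-start-arguments>
```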
See validator_health for pointers on determining the status of your
validator after the migration. In case of issues, check your logs for
warnings and errors and consult validator-migration-troubleshooting
below.
Once you have confirmed that the migration has been successful:
- Restart the validator app once more, keeping the -m <migration ID> but omitting the -M. The -M is required only for the first startup after the migration, to instruct the validator to perform the actual migration.
- The old participant database (participant_<OLD_MIGRATION_ID>) is no longer used and can be pruned. We recommend retaining it (or a current backup thereof) for at least another week after the migration, in case the synchronizer migration needs to be rolled back due to an unexpected major issue.
Troubleshooting
Common errors
If any of the steps above fail, double check the following:

- The expected versions were deployed, both before the migration and after the migration.
- In case you don't see the Wrote domain migration dump message in the logs of the validator app despite confirming that you are on the expected version before the migration, your validator might already have taken a dump at an earlier time. You can inspect the contents of the domain-migration-validator-pvc PVC (Helm) or the domain-upgrade-dump volume (compose). In case a domain_migration_dump.json exists there and you are unsure about the circumstances of its creation, it is recommended to remove it and restart the validator app (on the older version and migration ID) to trigger the creation of a fresh dump. (A sketch follows this list.)
- The Canton participant (re-)deployed as part of the upgrade uses a fresh (empty) database. By correctly setting the migration ID while following the deployment steps above, this should be the case.
- The correct (incremented) MIGRATION_ID has been set after the upgrade.
- If you get an error like Migration ID was incremented (to 1) but no migration dump for restoring from was specified., you are missing the migrating: true flag (for Helm) or the -M argument (for Docker-Compose).
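A hedged sketch of inspecting and removing a stale dump in a Helm deployment; the namespace and pod name are assumptions, while the /domain-upgrade-dump mount path and the domain_migration_dump.json file name come from this guide:

```
# Namespace "validator" and the pod name are assumptions about your deployment.
kubectl exec -n validator -it <validator-pod> -- ls -lha /domain-upgrade-dump

# If a stale domain_migration_dump.json is present and its origin is unclear,
# remove it and restart the validator app on the older version and migration
# ID so that a fresh dump is written.
kubectl exec -n validator -it <validator-pod> -- rm /domain-upgrade-dump/domain_migration_dump.json
```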
Cleaning up the validator app database in the event of a failed upgrade
In rare occasions where the upgrade is not successful but the validator app manages to start ingesting from the new migration ID, the app's database might contain data for the failed migration ID that should be removed. To check whether any such data has been stored, query your validator app's database for rows tagged with the failed migration ID (a hedged sketch follows below). If such data exists, restore from a backup (see validator-backups) for the validator app and drop the database of the failed migration ID for the participant.
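The exact query from the upstream documentation is not reproduced here; as an illustration only, a sketch under the assumption that the validator app's tables carry a migration_id column. The table name, column name, and connection details are all hypothetical, not the validator app's actual schema:

```
# Hypothetical sketch: table and column names are assumptions, not the
# validator app's actual schema. The idea is to look for any rows tagged
# with the failed migration ID.
psql -h <postgres-host> -U <validator-app-user> -d <validator-app-db> \
  -c 'SELECT count(*) FROM <validator_app_table> WHERE migration_id = <FAILED_MIGRATION_ID>;'
```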