Consensus in Exonum
Generally, a consensus algorithm is a process of obtaining an agreed result by a group of participants. In Exonum the consensus algorithm is used to agree on the list of transactions in blocks added to the blockchain. The other goal of the algorithm is to ensure that the results of the transaction execution are interpreted in the same way by all nodes in the blockchain network.
The consensus algorithm in Exonum uses some ideas from the algorithm proposed in Tendermint, but has several distinguishing characteristics compared to it and other consensus algorithms for blockchains.
The Exonum consensus algorithm assumes that the consensus participants can be identified. Thus, the algorithm is suited to permissioned blockchains, which Exonum is oriented towards, rather than permissionless ones.
Not all the nodes in the blockchain network may be actively involved in the consensus algorithm. Rather, there is a special role for active consensus participants – validators or validator nodes. For example, in a consortium blockchain validators could be controlled by the companies participating in the consortium.
The consensus algorithm must operate in the presence of faults, i.e., when participants in the network may behave abnormally. The Exonum consensus algorithm assumes the worst; it operates under the assumption that any individual node or even a group of nodes in the blockchain network can crash or can be compromised by a resourceful adversary (say, a hacker or a corrupt administrator). This threat model is known in computer science as Byzantine faults; correspondingly, the Exonum consensus algorithm is Byzantine fault tolerant (BFT).
From the computer science perspective, the Exonum consensus algorithm makes the usual assumptions:
- Validator nodes are assumed to be partially synchronous, i.e., their computational performance does not differ much
- The network is partially synchronous, too. That is, all messages are delivered in finite time which, however, is unknown in advance
- Each validator has access to a local stopwatch to determine time intervals. On the other hand, there is no global synchronized time in the system
- Validators can be identified using public-key cryptography; correspondingly, the communication among validators is authenticated
The same assumptions are used in PBFT (the most well-known BFT consensus) and its successors.
The process of reaching consensus on the next block (at the blockchain height H)
consists of several rounds, numbered from 1. The first round starts once the
validator commits the block at height H - 1. The onsets of rounds are determined
by a fixed timetable: rounds start at regular intervals. As there is no
global time, rounds may start at different times for different validators.
When round R begins, the previous rounds are not considered completed.
That is, being in round R means that the validator can process messages
related to any round with a number no greater than R.
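The round timetable can be illustrated with a minimal sketch. It assumes that every round lasts the same `round_timeout` and that time is measured by the validator's local stopwatch from the moment it committed the previous block; the function names and the parameter are hypothetical and are not taken from the Exonum code base.

```rust
use std::time::Duration;

/// Round number (1-based) a validator is in, given the time elapsed on its
/// local stopwatch since it committed the block at height H - 1.
/// Assumption: all rounds have the same length `round_timeout`.
fn current_round(elapsed: Duration, round_timeout: Duration) -> u32 {
    assert!(!round_timeout.is_zero(), "round timeout must be positive");
    (elapsed.as_nanos() / round_timeout.as_nanos()) as u32 + 1
}

/// Earlier rounds are never closed, so in round `current` a validator still
/// processes messages that belong to any round from 1 up to `current`.
fn can_process(message_round: u32, current: u32) -> bool {
    message_round >= 1 && message_round <= current
}
```

With a 3-second timeout, a validator that committed the previous block 7.5 seconds ago is in round 3 and still handles messages from rounds 1 and 2.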
The current state of a validator can be described as a tuple (H, R). The
R part may differ among validators, but the height
H is usually the same.
If a specific validator is lagging (e.g., during its initial synchronization,
or if it was switched off for some time), its height may be lower.
In this case, the validator can request missing blocks from
other validators and full nodes in order to quickly synchronize with the rest
of the network.
To put it very simply, rounds proceed as follows:
- Each round has a leader node. The round leader offers a proposal for the next block and broadcasts it across the network. The logic of selecting the leader node is described in a separate algorithm
- Validators may vote for the proposal by broadcasting a prevote message. A prevote means that the validator has been able to parse the proposal and has all transactions specified in it
- After a validator has collected prevotes from a supermajority of other validators, it applies transactions specified in the prevoted proposal, and broadcasts a precommit message. This message contains the result of the proposal execution in the form of a new state hash. The precommit expresses that the sender is ready to commit the corresponding proposed block to the blockchain, but needs to see what the other validators have to say on the matter just to be sure
- Finally, if a validator has collected a supermajority of precommits with the same state hash for the same proposal, the proposed block is committed to the blockchain.
In the following description, +2/3 means more than two thirds of the validators, and -1/3 means less than one third.
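The two thresholds can be written down with integer arithmetic. The helper names below are hypothetical; the sketch only restates the definitions of +2/3 and -1/3 given above.

```rust
/// True if `votes` is more than two thirds of `validators` ("+2/3").
/// Integer form of votes / validators > 2 / 3, without rounding errors.
fn is_supermajority(votes: usize, validators: usize) -> bool {
    3 * votes > 2 * validators
}

/// Largest count that is still less than one third of `validators` ("-1/3").
fn max_byzantine(validators: usize) -> usize {
    validators.saturating_sub(1) / 3
}
```

For example, with 4 validators a supermajority is 3 votes, and at most 1 validator may be Byzantine.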
The algorithm above is overly simplified:
- A validator may receive messages in any order because of network delays. For example, a validator may receive a prevote or precommit for a block proposal that the validator doesn’t know
- There can be validators that do not follow the consensus algorithm. Validators may be offline, or they can be corrupted by an adversary. To formalize this assumption, it’s assumed that -1/3 of the validators may be acting arbitrarily at any moment in time. Such validators are called Byzantine in computer science; all other validators are honest
The 3-phase consensus (proposals, prevotes and precommits) described above is there to make the consensus algorithm operational under these conditions. More precisely, the algorithm is required to maintain safety and liveness:
- Safety means that once a single honest validator has committed a block, no other honest validator will ever commit any other block at the same height
- Liveness means that honest validators keep committing blocks from time to time
Byzantine validators may send different messages to different validators. To maintain safety under these conditions, the Exonum consensus algorithm uses the concept of locks.
A validator that has collected +2/3 prevotes for some block proposal locks on that proposal. A locked validator does not vote for any proposal other than the one on which it is locked. When a new round starts, a locked validator immediately sends a prevote indicating that it’s locked on a certain proposal. Other validators may request the prevotes that led to the lock from a locked validator if they do not have them locally (these prevotes are known as proof of lock).
For example, suppose validator A gets prevotes from validators B and C, but B and C do not get prevotes from each other because of connection problems. Then validators B and C can request each other’s prevotes from validator A.
Locks can be changed: if A has locked on a proposal and, during the next round, all other validators lock on the next proposal, A will eventually update its lock.
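The locking rules can be sketched as a small state machine. The types and method names are hypothetical, and the rule that a lock moves only to a proposal that gathered +2/3 prevotes in a later round is an assumption made for this illustration, not a statement about the actual implementation.

```rust
/// Hypothetical identifier of a block proposal (for example, its hash).
type ProposalId = u64;

#[derive(Debug, Default)]
struct Validator {
    /// Round in which the lock was taken and the locked proposal, if any.
    lock: Option<(u32, ProposalId)>,
}

impl Validator {
    /// Proposal this validator prevotes for when `proposed` is on the table.
    /// A locked validator votes only for the proposal it is locked on.
    fn prevote_target(&self, proposed: ProposalId) -> ProposalId {
        match self.lock {
            Some((_, locked)) => locked,
            None => proposed,
        }
    }

    /// Called with the number of prevotes seen for `proposal` in `round`.
    /// Assumption: the lock is taken or moved only on +2/3 prevotes, and
    /// only if they come from a later round than the current lock.
    fn on_prevotes(&mut self, round: u32, proposal: ProposalId, prevotes: usize, validators: usize) {
        let has_supermajority = 3 * prevotes > 2 * validators;
        let is_newer = self.lock.map_or(true, |(locked_round, _)| round > locked_round);
        if has_supermajority && is_newer {
            self.lock = Some((round, proposal));
        }
    }
}
```

In a network of 4 validators, a validator that sees 3 prevotes for proposal 7 in round 1 locks on it and keeps prevoting for 7, until it sees 3 prevotes for another proposal in a later round.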
As consensus messages may be lost or come out of order, the Exonum consensus uses the requests mechanism to obtain unknown information from the other validators. A request is sent by a validator to its peer if the peer has information of interest, which is unknown to the validator, and which has been discovered during the previous communication with the peer.
A request is sent if the node receives a consensus message from a height greater than the local height. The peer is supposed to respond with a message that contains transactions in an accepted block, together with a proof that the block is indeed accepted (i.e., precommits of +2/3 validators).
There are requests for all consensus messages: proposals, prevotes, and precommits. As consensus messages are authenticated with digital signatures, they can be sent directly in response to requests.
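The block request from the example above can be sketched as follows. The function names are hypothetical. The sketch assumes that `local_height` is the height of the block the node is currently trying to commit, that validators are identified by indices `0..validators`, and that signature verification of each precommit has already been done elsewhere.

```rust
use std::collections::HashSet;

/// Height of the block to request from a peer, if the peer's message shows
/// that it is ahead of this node. Assumption: the node asks for the block
/// at its own current height and catches up one block at a time.
fn block_to_request(local_height: u64, message_height: u64) -> Option<u64> {
    if message_height > local_height {
        Some(local_height)
    } else {
        None
    }
}

/// Checks the proof attached to a received block: precommits must come from
/// more than two thirds of the validators, each validator counted once.
fn is_block_proven(precommit_senders: &[u16], validators: usize) -> bool {
    let distinct: HashSet<u16> = precommit_senders
        .iter()
        .copied()
        .filter(|&id| usize::from(id) < validators) // ignore unknown senders
        .collect();
    3 * distinct.len() > 2 * validators
}
```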
Node States Overview
The order of states in the proposed algorithm is as follows:
Commit -> (Round)+ -> Commit -> ...
On the timeline, these states look like this (for one of the validator nodes):
Commit: |  H  |                       | H+1 |                        ...
Round1: |     | R1                    |     | R1                     ...
Round2: |     |    R2                 |     |    R2                  ...
Round3: |     |       R3              |     |       R3               ...
Round4: |     |          R4           |     |          R4            ...
...
------------------------------------------------------------------> Time
Note that rounds have a fixed start time but they do not have a definite end
time (they end when the next block is received). This differs from the common
behavior of partially synchronous consensus algorithms, in which rounds have
a definite conclusion (i.e., messages generated during round R
must be processed only during round R).
See source code for more technical details on consensus messages.
The consensus algorithm makes use of several types of messages. All messages are authenticated with the help of public-key digital signatures, so that the sender of the message is unambiguously known and cannot be forged. Furthermore, the use of digital signatures (instead of, say, HMACs) ensures that messages can be freely retransmitted across the network. Moreover, this can be done by load balancers that have no idea whatsoever as to the content of messages.
Propose message is a set of transactions proposed by the round leader
for inclusion into the next block.
Instead of whole transactions,
Propose messages include only transaction hashes.
A validator that received a
Propose message can request missing transactions
from its peers.
If all validators behave correctly,
Propose is sent only by the leader node
of the round.
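Because a Propose carries only hashes, the receiver has to work out which transactions it still lacks. A minimal sketch, assuming a hypothetical `Propose` layout and a local pool keyed by 32-byte transaction hashes (neither is taken from the Exonum source):

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a transaction hash.
type TxHash = [u8; 32];

/// Hypothetical layout of a proposal: only hashes, no transaction bodies.
struct Propose {
    height: u64,
    round: u32,
    tx_hashes: Vec<TxHash>,
}

/// Hashes listed in the proposal that are absent from the local pool,
/// in proposal order. These are the transactions to request from peers.
fn missing_transactions(propose: &Propose, pool: &HashSet<TxHash>) -> Vec<TxHash> {
    propose
        .tx_hashes
        .iter()
        .filter(|hash| !pool.contains(*hash))
        .copied()
        .collect()
}
```

An empty result means the validator already has every transaction named in the proposal.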
Prevote is a vote for a Propose message.
Prevote indicates that a validator
has a correctly formed
Propose and all the transactions specified in it.
Prevote is broadcast to all validators.
Precommit is a message expressing readiness to include a certain proposal
as the next block into the blockchain.
Precommit is broadcast to all validators.
Status is an information message about the current height. It is sent
periodically, with the period set by a
global configuration parameter.
Block message contains a block (in the blockchain sense) and a set of
Precommit messages that allowed that block to be accepted.
Block messages are sent upon request.
There are request messages for transactions,
Propose and Prevote messages, and
blocks. The generation and processing rules for these messages are fairly obvious.
RequestPropose message is generated if a validator receives a consensus
message (Prevote or Precommit) that refers to a
Propose message which
is unknown to the validator. A receiver of a
RequestPropose message sends
Propose in response.
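The rule for generating RequestPropose can be sketched as follows. All type and field names (`validator`, `propose_hash`, `to`) are hypothetical, chosen only for this illustration, and a proposal is identified by a plain integer instead of a cryptographic hash.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the hash that identifies a Propose message.
type ProposeHash = u64;

/// Consensus messages that refer to a proposal (fields are illustrative).
enum ConsensusMessage {
    Prevote { validator: u16, propose_hash: ProposeHash },
    Precommit { validator: u16, propose_hash: ProposeHash },
}

/// Request addressed to the peer that mentioned an unknown proposal.
#[derive(Debug, PartialEq)]
struct RequestPropose {
    to: u16,
    propose_hash: ProposeHash,
}

/// Returns a request if `message` refers to a proposal that is not among
/// the locally `known` ones; the sender evidently has that proposal.
fn request_for(message: &ConsensusMessage, known: &HashSet<ProposeHash>) -> Option<RequestPropose> {
    let (sender, propose_hash) = match *message {
        ConsensusMessage::Prevote { validator, propose_hash }
        | ConsensusMessage::Precommit { validator, propose_hash } => (validator, propose_hash),
    };
    if known.contains(&propose_hash) {
        None
    } else {
        Some(RequestPropose { to: sender, propose_hash })
    }
}
```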
In comparison with other BFT algorithms, the consensus algorithm in Exonum has the following distinctive features.
Rounds have a fixed start time but they do not have a definite end time (a round ends only when the next block is received). This helps decrease delays when the network connection among validators is unstable.
Assume that consensus messages from a certain round need to be processed within the round. If the state of the network deteriorates, the network might not manage to accept the proposal before the end of the round. Then in the next round, the entire process of nominating a proposal and voting for it must begin again. The timeout of the next round should be increased so that the block can be accepted within the new round timeout even with poor network connectivity. The need to redo work that has already been done, together with the increased timeout, would lead to additional delays in accepting the block proposal.
In contrast to the case discussed in the previous paragraph, the absence of a fixed round end in Exonum allows a proposal to be accepted with the minimum necessary delay.
Propose messages include only transaction hashes. (Transactions are included
into Block messages.) Furthermore, transaction execution is delayed;
transactions are applied only at the moment when a node locks on a Propose.
Delayed transaction processing reduces the negative impact of malicious nodes on the system throughput and latency. Indeed, it splits transaction processing among the stages of the algorithm:
- On the prevote stage validators only ensure that the list of transactions
included in a proposal is correct (a validator checks that all transactions in
Propose are already stored by this node. Correctness of a transaction is verified when the transaction is received; nodes don't store incorrect transactions.)
- On the precommit stage validators apply transactions to the current blockchain state
- On the commit stage validators ensure that they achieved the same state after applying the transactions in the proposal
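The commit-stage check can be illustrated with a short tally of precommits. The `Precommit` layout and the integer stand-ins for hashes are hypothetical; the sketch only shows that a block becomes committable when +2/3 validators report the same state hash for the same proposal.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-ins for cryptographic hashes.
type ProposeHash = u64;
type StateHash = u64;

/// Illustrative precommit: who voted, for which proposal, with which result.
struct Precommit {
    validator: u16,
    propose_hash: ProposeHash,
    state_hash: StateHash,
}

/// Returns the (proposal, state hash) pair backed by more than two thirds
/// of the validators, if there is one. Each validator is counted once per pair.
fn committable(precommits: &[Precommit], validators: usize) -> Option<(ProposeHash, StateHash)> {
    let mut voters: HashMap<(ProposeHash, StateHash), HashSet<u16>> = HashMap::new();
    for p in precommits {
        voters
            .entry((p.propose_hash, p.state_hash))
            .or_default()
            .insert(p.validator);
    }
    voters
        .into_iter()
        .find(|(_, senders)| 3 * senders.len() > 2 * validators)
        .map(|(key, _)| key)
}
```

If one of three precommitting validators out of four reports a different state hash, no pair reaches +2/3 and nothing is committed.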
If a Byzantine validator sends out proposals with a different transaction order
to different validators, the validators do not need to spend time checking
the order and applying transactions on the prevote stage.
A different transaction order will be detected when comparing
the proposal hash in the prevote messages from other validators with
the proposal hash in the proposal message.
Thus, the split of work helps reduce the negative impact of Byzantine nodes on the overall system performance.
The requests algorithm allows a validator to restore any consensus information from the other validators. This has a positive effect on system liveness.