Introduction
We suggest that readers familiarize themselves with the main consensus flow before reading this post.
As we already know, blocks at each height are created from block proposals. Each validator generates its proposals independently. The network selects one of the proposals for height H
and creates a new block from it.
We want to create an algorithm with the following properties:
- There is some order among the different block proposals. We need such an order to choose one of the generated proposals deterministically.
- This order should change at each height H. Even if a malefactor takes control of the nodes that have the maximum priority right now, they will lose that priority at the next blockchain height.
The RoundRobin algorithm satisfies these requirements. Here is a short description:
- We define some order over the nodes for each height.
- The time between block acceptances is divided into rounds.
- In the first round, every node waits for the block_proposal from node #1.
- If no correct proposal is received, the whole network moves on to round #2 and waits for the block_proposal from node #2, and so on.
- After round #N, node #1 receives priority back.
At height H + 1 the node order is shifted by 1: for example, we have priorities 1, 2, ..., N - 1, N at height 12345 and priorities 2, 3, ..., N, 1 at height 12346.
We call the node with the maximum priority the round leader.
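To make the basic rotation concrete, here is a minimal sketch in Python (an illustration only, using 0-based node numbering; round_robin_leader is a hypothetical helper, not a function from the assets-blockchain codebase):

def round_robin_leader(height, round_number, n_validators):
    # The priority order shifts by one at each height, and every unsuccessful
    # round passes priority to the next node in that order.
    return (height + round_number) % n_validators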
Algorithm enhancement
Desired properties
That was the general idea. Now let us consider what characteristics we want from this algorithm.
1. (required) The algorithm should be common for all nodes; that is, all nodes that have the actual assets-blockchain state should obtain an identical node order after executing Round Robin.
2. (required) The algorithm should guarantee that, in the near future, a block generated by an honest validator will be accepted. This feature is called censorship resistance: there should be no Byzantine strategy after which only Byzantine blocks would be accepted. Note that with the usual RoundRobin we will usually accept blocks from node #1, as it creates a block proposal earlier than the others; this remark will help us further.
3. (required) In particular, the algorithm should depend only on factors that are not under the influence of any single node (or any predefined set of nodes). For example, a correct algorithm should not depend on the prevblockhash, because such a hash is directly defined by the leader node of the previous height. That is, the previous leader could generate a block hash that gives higher priority to the malicious nodes at the next block height.
4. (desired) The order of nodes should differ from height to height. Due to this property, Byzantine nodes would not follow one after another in a series but would be randomly interspersed with honest nodes.
5. (desired) The algorithm should not give preference to any nodes (or artificially decrease the priority of other nodes). Otherwise, some nodes become more valuable targets for malefactors.
6. (desired) The Round Robin order should be calculated strictly after the previous block is accepted, not earlier. Otherwise, the malefactor gets a lot of time to develop its strategy for some future height.
Actual modifications
Let us make some modifications to the basic RoundRobin
algorithm.
Also, let there be F Byzantine validators and N = 3 * F + 1 validators in total, and let H be the current height of the assets-blockchain.
- Every node whose block proposal is accepted by the network (that is, every node that becomes the author of the newly accepted block) moves into the locked state for F blocks. During the next F blocks it has no right to create new block proposals. In all other activities the node behaves as usual: it votes for new blocks, signs messages, etc. This exclusion provides censorship resistance for our algorithm (desired property 2): even if the malefactor takes control over F nodes, they will not author every future block. Once every Byzantine node has moved into the locked state, the next block author is guaranteed to be an honest node.
- Remove the authors of the block proposals accepted at the previous F heights. That is, Round Robin at each height includes M = N - F = 2 * F + 1 validators. Let us re-enumerate them as 0, 1, ..., M - 1 according to their base numbers.
- To satisfy desired property 4, we shuffle this list pseudo-randomly. The shuffling must be deterministic, but the position of each node should be uniformly distributed over the list. To do so, we take a permutation over these M validators. The permutation number is calculated as T = Hash(H) mod M!. This provides a uniform distribution over node orders, so Byzantine validators are randomly interspersed within the current height H.
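To get a sense of the numbers involved (using the parameters of the experiments below, N = 16 and F = 5, purely for illustration): M = 11 validators remain allowed at each height, and T is taken modulo M! = 39,916,800 possible orders. A minimal sketch of this calculation, mirroring the Hash(H) mod M! formula above:

from hashlib import sha256
from math import factorial

N, F = 16, 5
M = N - F          # validators allowed to propose at this height
H = 12346          # example height
T = int(sha256(H.to_bytes(4, byteorder='big')).hexdigest(), base=16) % factorial(M)
# T lies in [0, 39916800) and selects the validator order for height H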
Experiments
We ran the experiments with the following versions of Python and packages:
from sys import version
from math import factorial
from hashlib import sha256
import numpy as np
from matplotlib import pyplot as pp
from matplotlib.colors import LinearSegmentedColormap, hex2color
import seaborn
import scipy
%pylab inline
%config InlineBackend.figure_format = 'retina'
pylab.rcParams['figure.figsize'] = (6.0, 4.0)
Populating the interactive namespace from numpy and matplotlib
print('python version: ', version)
print('numpy version: ', np.__version__)
print('scipy version: ', scipy.__version__)
python version: 3.5.2 |Anaconda custom (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
numpy version: 1.11.3
scipy version: 0.19.0
Let's run some experiments for the first 10,000,000 blocks.
Quick nodes
As we remember, usually the block proposal from the leader of round #1 is accepted. That is, if every node is quick enough, the distribution of nodes over position #1 (after shuffling) should be uniform.
First, let's define some useful functions:
def kth_permutation(lst, k):
    # Return the k-th permutation of the given list,
    # decoding k digit by digit in the factorial number system.
    lst = lst[:]
    n = len(lst)
    res = []
    bign = factorial(n)
    for ic in range(n - 1, -1, -1):
        bign = bign // (ic + 1)      # bign becomes ic!
        elem_pos = k // bign         # which of the remaining elements goes next
        res.append(lst.pop(elem_pos))
        k = k % bign
    return res
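# Illustrative check (not part of the original notebook): kth_permutation enumerates
# the permutations of the input list in lexicographic order, one per value of k.
for k in range(factorial(3)):
    print(k, kth_permutation([0, 1, 2], k))
# 0 [0, 1, 2]
# 1 [0, 2, 1]
# 2 [1, 0, 2]
# 3 [1, 2, 0]
# 4 [2, 0, 1]
# 5 [2, 1, 0]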
def to_cmap(colors):
step = 1 / (len(colors) - 1)
cmap_dict = {'red': [],
'green': [],
'blue': []}
pos = 0.0
for ic, color_hex in enumerate(colors):
color = hex2color(color_hex)
if ic == len(colors) - 1:
pos = 1.0
cmap_dict['red'].append((pos, color[0], color[0]))
cmap_dict['green'].append((pos, color[1], color[1]))
cmap_dict['blue'].append((pos, color[2], color[2]))
pos += step
return LinearSegmentedColormap('exonum',segmentdata=cmap_dict)
EXONUM_CMAP = to_cmap(['#5c6700', '#bccc29', '#f5ff9a'])
EXONUM_COLOR = '#d0e01b'
np.random.seed(514230)
We will work with 16 nodes, the first 5 of them being Byzantine.
N = 16
F = 5
n_blocks = 10000000
Let us look at the distribution of nodes over positions in the permutation when all nodes are quick enough and the block_proposal from node #1 is always accepted:
# magic only here
def get_permutation_number(h, m_fac):
    # T = Hash(H) mod M! from the algorithm description
    hashed_height = int(sha256(h.to_bytes(4, byteorder='big')).hexdigest(), base=16)
    k = hashed_height % m_fac
    return k
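# Illustrative sanity check (not part of the original notebook): permutation numbers
# for consecutive heights come from a hash of the height, so they jump around the
# whole range [0, m_fac) instead of changing gradually.
sample_m_fac = factorial(N - F)
for sample_h in (12345, 12346, 12347):
    print(sample_h, get_permutation_number(sample_h, sample_m_fac))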
def generate_blocks(n_blocks, N, F, get_block_author_from_shuffle):
    mat_counters = np.zeros(shape=(N, N - F), dtype=np.int32)
    block_authors = np.zeros(shape=(n_blocks,), dtype=np.int32)
    # The list of authors of the previous F blocks.
    # Actually it is a queue: we append to the end and pop elements from the head
    locked_validators = []
    # Since we have (N - F) validators for each block, (N - F)! permutations are possible
    m_fac = factorial(N - F)
    for h in range(n_blocks):
        allowed_validators = [_ for _ in range(N) if _ not in locked_validators]
        k = get_permutation_number(h, m_fac)
        shuffled_validators = kth_permutation(allowed_validators, k)
        for ic, new in enumerate(shuffled_validators[:N-F]):
            mat_counters[new, ic] += 1
        new_author = get_block_author_from_shuffle(shuffled_validators)
        block_authors[h] = new_author
        locked_validators.append(new_author)        # lock the new author
        locked_validators = locked_validators[-F:]  # keep only the authors of the last F blocks locked
    return mat_counters, block_authors
# We assume the block proposal from the first node in the shuffled order is always accepted
mat_counters, block_authors = generate_blocks(n_blocks, N, F, lambda shuffled: shuffled[0])
How are the validators' positions distributed after shuffling?
def plot_heatmap(mat_counters):
ax = pp.axes()
seaborn.heatmap(mat_counters, ax=ax, cmap=EXONUM_CMAP)
ax.set_xlabel('Position after shuffling')
ax.set_ylabel('Validator id')
pp.show()
plot_heatmap(mat_counters)
The node positions look randomly distributed. Let's check whether there are any outliers where validator X falls into position P too often or too rarely:
def plot_frequencies_hist(mat_counters):
ax = pp.axes()
pp.hist(mat_counters.reshape(-1), bins=20, color=EXONUM_COLOR)
ax.set_xlabel('Count of matches: node vs position')
ax.set_ylabel('Frequency')
ax.set_title('How often a node falls into position')
print('frequencies mean: {0}, frequencies std: {1:.2f}'.format(
int(np.mean(mat_counters)),
np.std(mat_counters)))
plot_frequencies_hist(mat_counters)
frequencies mean: 625000, frequencies std: 835.97
Frequencies are centered around 625,000 (that is, 10,000,000 / 16: in the long run each of the 16 validators is equally likely to occupy any given position) and the deviation is low. It looks like there are no significant outliers.
Off-topic. Is it distributed normally?
Looks like it isn't, let's check it:
p_value = scipy.stats.shapiro(mat_counters.reshape(-1))[1]
print('p-value: {0:.4f}'.format(p_value))
p-value: 0.7377
Each value of mat_counters is the sum of a series of 10,000,000 atomic variables, where the atomic random variable equals 1 if node X falls into position Y at the given height, and 0 otherwise.
According to the Central Limit Theorem, when independent random variables are added, their sum tends toward a normal distribution even if the original variables themselves are not normally distributed.
However, our problem is that the atomic variables are not independent: at each height H, exactly one entry in each column of mat_counters is incremented.
Now, how often does each node fall into the very first position (and become a block author)?
def block_authors_hist(block_authors):
authors_values, authors_counts = np.unique(block_authors, return_counts = True)
ax = pp.axes()
pp.bar(authors_values, authors_counts, color=EXONUM_COLOR)
ax.set_xlabel('Validator id')
ax.set_ylabel('frequency')
ax.set_title('How often the validator becomes a block author')
block_authors_hist(block_authors)
Indeed, the distribution is uniform, as we expected.
Now, imagine that, unlike the other nodes, validators 0-4 are Byzantine. How long should we wait until a block from an honest validator is accepted?
honest_blocks_percent = np.sum(np.array(block_authors) >= F) / n_blocks
print('honest blocks: {0:.2f}%, Byzantine blocks: {1:.2f}%'.format(honest_blocks_percent * 100,
(1 - honest_blocks_percent) * 100))
honest blocks: 68.76%, Byzantine blocks: 31.24%
def calculate_time_to_wait(block_authors):
    # Throw out the last X blocks because there are no "honest" blocks after them
    time_to_wait = np.zeros(shape=(n_blocks,), dtype=np.int32)
    block_with_honest_blocks_later = -1
    for h in range(n_blocks):
        for ic in range(1, n_blocks - h):
            if block_authors[h + ic] >= F:  # this block author is honest
                time_to_wait[h] = ic
                block_with_honest_blocks_later = h
                break
    print('Threw out the last {0} blocks'.format(n_blocks - block_with_honest_blocks_later - 1))
    print('time to wait for the first 10 blocks: ', time_to_wait[:10])
    return time_to_wait[:block_with_honest_blocks_later]
def plot_time_to_wait_hist(time_to_wait):
wait_values, wait_counts = np.unique(time_to_wait, return_counts=True)
ax = pp.axes()
ind=wait_values
width=0.8
pp.bar(ind, wait_counts, width, color=EXONUM_COLOR)
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(wait_values)
ax.set_title('How long should we wait until honest block is accepted')
ax.set_xlabel('Blocks to wait')
ax.set_ylabel('Frequency')
pp.show()
print('first ten authors: ', block_authors[:10])
time_to_wait = calculate_time_to_wait(block_authors)
plot_time_to_wait_hist(time_to_wait)
first ten authors: [ 0 1 2 3 4 6 14 10 5 1]
Threw out the last 2 blocks
time to wait for the first 10 blocks: [5 4 3 2 1 1 1 1 2 1]
As we see, the next block usually comes from an honest validator. The worst case is when the current block is honest and is followed by F Byzantine blocks: after those F blocks every Byzantine validator is locked, so the following block must be authored by an honest node. Hence we wait at most F + 1 blocks for an honest one.
Honest nodes are slow; Byzantine nodes are fast
Now imagine that Byzantine nodes somehow generate block proposals much faster than honest nodes, and the honest nodes are quite slow: it takes an honest node 5 rounds to generate a block proposal, while Byzantine nodes send their proposals immediately.
def block_author_slow_honest_nodes(shuffled):
    honest_slow = 5  # how many rounds it takes an honest node to generate a block proposal
    # If the first rounds all belong to honest nodes, the slow honest leader at position 0
    # gets its proposal accepted; otherwise the first Byzantine node in the order becomes the author
    for ic, validator in enumerate(shuffled):
        if validator < F:
            return validator
        if ic >= honest_slow:
            return shuffled[0]
mat_counters, block_authors = generate_blocks(n_blocks, N, F, block_author_slow_honest_nodes)
plot_heatmap(mat_counters)
Interestingly, the Byzantine nodes appear in the list of allowed proposers more rarely than the honest nodes. This happens because the Byzantine nodes become block authors much more often, so they usually sit in "prison" :)
plot_frequencies_hist(mat_counters)
frequencies mean: 625000, frequencies std: 225896.63
honest_blocks_percent = np.sum(np.array(block_authors) >= F) / n_blocks
print('honest blocks: {0:.2f}%, Byzantine blocks: {1:.2f}%'.format(honest_blocks_percent * 100,
(1 - honest_blocks_percent) * 100))
honest blocks: 31.89%, Byzantine blocks: 68.11%
But despite being so slow, the honest nodes still generate a significant share of the blocks.
block_authors_hist(block_authors)
Byzantine nodes create many more blocks than honest nodes. But we still have enough correct blocks.
print('first ten authors: ', block_authors[:10])
time_to_wait = calculate_time_to_wait(block_authors)
plot_time_to_wait_hist(time_to_wait)
first ten authors: [ 0 1 2 3 4 6 14 10 2 1]
Threw out the last 4 blocks
time to wait for the first 10 blocks: [5 4 3 2 1 1 1 5 4 3]
And we still need to wait at most F + 1 blocks for an honest block.
Conclusion
We use the basic RoundRobin algorithm for choosing the round leader with two modifications.
- Only M = N - F validators can propose blocks at a new height. If a validator was the author of one of the last F blocks, it cannot generate block proposals for some time.
- At each height, we shuffle the list of allowed validators. That is, every allowed node has the same chance to turn up at the beginning of the priority queue or at the end.
These modifications give us the following features:
- The algorithm is common for all nodes.
- The algorithm guarantees that among the next F + 1 blocks there will be a block generated by an honest node (i.e. censorship resistance).
- Node priorities do not depend on Byzantine node behavior.
- Byzantine nodes do not follow one after the other in a series but are randomly interspersed with honest nodes.
We still want to bring the following properties to the leader selection procedure. We will enhance the algorithm further.
- The algorithm should not give preference to any nodes (or artificially decrease priority for other nodes).
- The Round Robin order should be calculated strictly after the previous block is accepted, not earlier.