Prime: Byzantine Replication Under Attack
Prime is a Byzantine fault-tolerant replication system whose goal is to provide a meaningful level of performance even after some of the replication servers have been compromised. Like previous Byzantine fault-tolerant replication protocols, Prime meets Safety (consistency of the correct replicas) and Liveness (the eventual execution of each update) as long as no more than f out of 3f+1 replicas are compromised and the network is sufficiently stable. Unlike previous protocols, Prime is also designed to meet a stronger performance guarantee, which we call Bounded-Delay. Bounded-Delay limits the amount of performance degradation that can be caused by malicious servers. Intuitively, Prime forces any leader that remains in power to meet a threshold level of performance, where the threshold is a function of the message delays between the correct servers in the system, which cannot be arbitrarily increased by the malicious servers.
Prime supports proactive recovery, diversity, and state transfer. Prime servers can be periodically rejuvenated to clean the system from potentially undetected intrusions. The MultiCompiler described here can be used to obfuscate the code layout of Prime servers in order to increase the resiliency of the system. The MultiCompiler uses a 64-bit random number to generate different variants of an application. A different version of a Prime server can be generated after each rejuvenation. In this way, if an adversary attacks all the servers in parallel, the probability to defeat more than f servers is low. After rejuvenation, a Prime server also validates the contents of the state on the disk with the help of other correct replicas and recovers a clean copy of the state if necessary. Subsequently, the rejuvenated replica collects all the client updates necessary to catch up and resume the execution. The state and update transfer protocols are guaranteed to meet Safety because they are coordinated by a quorum of correct replicas.
Prime can be configured to make use of Spines, an overlay network developed at Johns Hopkins (see http://spines.org). This can be useful for testing wide-area topologies and placing bandwidth and latency constraints on the links between servers.
A version of Prime suitable for evaluating the performance of the protocol in both fault-free
and under-attack executions can be downloaded here.
The code was written in C and runs on Linux. Prime was tested with version 3.5 of the MultiCompiler, which is
included in the Prime software package. Please refer to the MultiCompiler website
for further releases.