The Distributed Systems and Networks (DSN) Lab is a research lab in the Johns Hopkins University (JHU) Computer Science Department. We strive to invent and develop technologies with both academic and real-world impacts. We create practical, provably correct technological solutions to real problems and implement those solutions in publicly available software.
Our research focuses on dependable infrastructure: making the computerized networked infrastructure our society relies upon resilient, performant, and secure. Our current work includes:
- Resilient Systems: Critical applications are migrating to IP networks for cost-effectiveness and scalability, but this transition exposes our society's infrastructure to malicious cyber attacks. We create the necessary tools to enable essential systems to function correctly even while parts of them are compromised. These tools include the first intrusion-tolerant network and the first intrusion-tolerant replication engine that maintains correct, consistent data with guaranteed performance under attack.
- Real-Time Reliable Internet Services: New applications with low-latency and high-reliability requirements, such as live TV transport and remote robotic surgery, are challenging to support on the native Internet. We create overlay networks that push intelligence to the middle of the network to enable these demanding applications to run effectively over the Internet at a global scale.
- Communication and Coordination for Modern Data Centers: Today's cloud applications have a variety of communication and coordination needs, both within a single data center and among geographically dispersed data centers. We create messaging and coordination systems that guarantee the strong semantics and high performance required by today's cloud applications.
More detailed information about our current research is available here.
Our research has resulted in practical open-source software systems available here. The primary systems involved in our current research are:
- Spines: Spines is a framework for deploying software overlay routers. Our current research includes implementing intrusion-tolerant messaging protocols in Spines and investigating techniques to support applications with extremely low-latency and high-reliability requirements on a continental or even global scale.
- Prime: Prime is a replication engine that provides performance guarantees under attack. Our current research integrates proactive recovery and dynamic diversity into Prime to create a highly resilient system that can survive an unbounded number of compromises over the lifetime of the system, as long as the number of simultaneous compromises does not exceed a certain threshold (an assumption supported by the use of dynamic diversity).
- Spread: Spread is a widely used group communication toolkit that provides reliable messaging as well as total ordering and delivery guarantees, with strong, well-defined semantics in the presence of process failures and network partitions. Our recent research includes developing a new total ordering protocol that improves both the throughput and latency of Spread's message-delivery services.