Over the past decade, the hardware and software used to build computer networks have become increasingly open and programmable. This shift was driven by two factors: the growing cost and complexity of running traditional networks efficiently at scale, and the emergence of two technological paradigms, Software-Defined Networking (SDN) and Network Function Virtualization (NFV). SDN decouples the network control plane (e.g., policy decisions) from the data plane (e.g., packet forwarding), while NFV delegates network processing to the compute infrastructure.
While these paradigms generated immense interest in industry, adoption has not been straightforward. The lag is often due to technical challenges that were glossed over when the paradigms were first proposed. My research identifies these challenges and develops novel, practical systems techniques to solve them. I deeply explore emerging problems, build systems and frameworks to validate and communicate ideas, and actively engage with industry both to find new problems and to push my ideas toward adoption. My future research centers on moving high-performance networking from a niche to general applications. To that end, I am designing new interfaces through which applications use the network. I envision that this will help shape the development, and ease the adoption, of hardware accelerators.
Enabling performance guarantees in NFV.
NFV aims to move specialized network functions (i.e., services) from dedicated, purpose-built hardware boxes to software running on commodity servers. It is made possible by recent performance leaps in software packet processing and brings unprecedented flexibility for managing and deploying network services at scale. However, the move to software makes it harder to offer performance guarantees: the performance of a software packet processor can be up to 50% worse when it is co-located with other functions on neighboring cores of the same multicore processor than when it runs in isolation.
The basic building block of my solution, ResQ, is a technique that provides strong performance isolation among consolidated network functions. It accurately controls the effects of contention on the last-level processor cache using modern in-hardware QoS mechanisms (Intel Cache Allocation Technology), and on I/O through careful buffer sizing. Free from the need to predict the effects of resource sharing, ResQ places network functions and allocates resources in a multi-tenant cluster such that performance objectives are enforced accurately (no violations) and efficiently (up to 2.3x lower resource usage compared to prior work). My results were used to promote Intel Cache Monitoring and Allocation Technologies and motivated further research on future in-hardware QoS mechanisms. Moreover, the profiling framework I developed for this work was adopted for internal NFV experimentation.
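To make the placement idea concrete, the sketch below shows a greedy first-fit-decreasing placement of network functions onto servers under hard per-function core and cache-way reservations. The function names, way counts, and server capacities are hypothetical illustrations, not ResQ's actual algorithm or data:

```python
# Illustrative greedy placement under per-NF core and cache-way reservations.
# All names and numbers are made up; ResQ's real placement is more involved.

def place(nfs, servers):
    """nfs: list of (name, cores, cache_ways).
    servers: list of dicts with remaining "cores" and "ways" (mutated).
    Returns {nf_name: server_index}, or None if some NF cannot be placed."""
    placement = {}
    # Place the most demanding functions first (first-fit decreasing).
    for name, cores, ways in sorted(nfs, key=lambda nf: -(nf[1] + nf[2])):
        for i, srv in enumerate(servers):
            if srv["cores"] >= cores and srv["ways"] >= ways:
                srv["cores"] -= cores   # reserve cores for this NF
                srv["ways"] -= ways     # reserve dedicated LLC ways
                placement[name] = i
                break
        else:
            return None  # infeasible: reject the request or add capacity
    return placement

servers = [{"cores": 8, "ways": 11}, {"cores": 8, "ways": 11}]
nfs = [("firewall", 2, 4), ("nat", 2, 3), ("ids", 4, 6), ("lb", 2, 2)]
print(place(nfs, servers))
```

Because each function's cache ways are reserved rather than shared, admission control reduces to this kind of capacity check; no model of contention between co-located functions is needed.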
Bringing high-performance networking to networked applications.
The dominant application-network interface (socket API) requires applications to carefully handle communication details to optimize performance. By contrast, canonical shared memory – which hides away such details – has not been widely adopted due to performance and scalability challenges. Now is the right time to revisit data-centric network APIs: vast experience in distributed and parallel programming models helps us better understand the application requirements, and advances in high-performance packet processing allow us to hide communication details without compromising performance. My solution, Tasvir, is an interface that exposes single-writer versioned shared regions that are atomically updated through a user-controlled bulk synchronization process.
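A minimal in-process sketch of the interface semantics follows: the single writer mutates a staging copy, and readers only ever observe the last atomically published version. The class and method names are illustrative, not Tasvir's actual API, and the real system operates over shared memory rather than Python objects:

```python
import copy

class Region:
    """Sketch of a single-writer versioned shared region: writes accumulate
    in a staging copy and become visible to readers only when the writer
    runs the bulk synchronization step."""
    def __init__(self, initial):
        self._staging = initial
        self._published = (0, copy.deepcopy(initial))  # (version, snapshot)

    def write(self, key, value):
        self._staging[key] = value        # writer-side mutation, not yet visible

    def sync(self):
        version, _ = self._published      # user-controlled bulk synchronization
        self._published = (version + 1, copy.deepcopy(self._staging))

    def read(self):
        return self._published            # readers get a consistent snapshot

r = Region({"x": 0})
r.write("x", 1)
assert r.read() == (0, {"x": 0})   # update invisible before sync
r.sync()
assert r.read() == (1, {"x": 1})   # atomically visible after sync
```

The key property the sketch captures is that readers never see a half-applied update: visibility is deferred to explicit, user-controlled synchronization points.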
Distributed machine learning is an example where such an interface is useful. Existing distributed ML runtimes directly or indirectly build on legacy interfaces (e.g., sockets, verbs, RPC). Using the above-mentioned interface helps sidestep the performance deficiencies of legacy frameworks with an implementation that is not wildly different from native shared memory. I showed that a CPU-based SGD implementation based on this interface matches the performance of a corresponding native shared-memory implementation. I plan to show that this interface can also efficiently abstract away accelerator communication and simplify scaling machine learning models in various environments.
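The pattern such an SGD implementation follows can be illustrated with a toy synchronous data-parallel loop: the model lives in one logical region, each worker computes a gradient on its shard, and the gradients are applied in a single bulk-update step per iteration. Everything here (the model, data, and loop structure) is a made-up illustration of the pattern, not my actual implementation:

```python
import random

# Toy synchronous data-parallel SGD: fit y = w*x with true slope 3.0.
# Each "worker" owns one shard; gradients are combined in one bulk step,
# mirroring the region-based bulk-synchronization pattern.

def grad(w, batch):
    # gradient of mean squared error for the model y = w*x on one shard
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

random.seed(0)
data = [(x, 3.0 * x) for x in (random.uniform(-1, 1) for _ in range(64))]
shards = [data[i::4] for i in range(4)]   # one equal shard per worker

w, lr = 0.0, 0.1
for step in range(200):
    g = sum(grad(w, shard) for shard in shards) / len(shards)  # per-worker compute
    w -= lr * g                                                # bulk parameter update
print(round(w, 3))
```

Because each iteration touches the shared parameters exactly once, in bulk, the communication layer underneath (native shared memory, or a distributed region abstraction) is interchangeable without restructuring the algorithm.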
Rethinking state management in networks.
Networking is riddled with ad-hoc solutions to similar problems. State distribution is the most prominent example: it is among the most nuanced components, yet network protocols address it individually. For instance, each routing protocol devises its own message exchange to carry precisely the information required for routing, and each distributed SDN controller implements its own internal state replication protocol. The algorithms that operate on the state are, by contrast, conceptually far simpler. Can we build a general abstraction for state distribution? The challenge lies in catering to the diverse and stringent performance requirements of network services. My preliminary results suggest that an abstraction that lets a network service trade update-visibility latency for distribution overhead is promising: it is performant and generally applicable to a wide range of network services, including distributed control and data planes, controller-datapath synchronization, and telemetry.
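The latency-versus-overhead knob can be sketched as a distribution channel that batches state updates and flushes either when a batch fills or when the oldest pending update exceeds a visibility deadline; a larger deadline means fewer, larger messages at the cost of staler remote views. The class, parameters, and scenario below are illustrative assumptions, not an actual system API:

```python
class UpdateChannel:
    """Sketch of a state-distribution channel with a tunable visibility knob.
    Updates are batched and flushed when the batch fills (max_batch) or when
    the oldest pending update's age reaches max_latency (seconds)."""
    def __init__(self, send, max_batch=32, max_latency=0.01):
        self.send, self.max_batch, self.max_latency = send, max_batch, max_latency
        self.batch, self.oldest = [], None

    def update(self, key, value, now):
        if not self.batch:
            self.oldest = now          # start the visibility clock
        self.batch.append((key, value))
        self.maybe_flush(now)

    def maybe_flush(self, now):
        full = len(self.batch) >= self.max_batch
        stale = self.batch and now - self.oldest >= self.max_latency
        if full or stale:
            self.send(self.batch)      # one message carries the whole batch
            self.batch, self.oldest = [], None

msgs = []
ch = UpdateChannel(msgs.append, max_batch=3, max_latency=1.0)
ch.update("r1", "up", now=0.0)
ch.update("r2", "up", now=0.1)      # still batched: under both thresholds
ch.update("r3", "down", now=0.2)    # batch full -> one message, three updates
print(msgs)
```

A latency-sensitive service (e.g., a routing control plane) would set a small deadline; a telemetry pipeline could set a large one and cut message overhead, all behind the same interface.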
SDN controller performance and scalability.
In the early days of SDN, there were serious doubts about its applicability to large networks. Pragmatic design choices in OpenFlow (the first instantiation of SDN) made controller performance critically important to overall network performance, scalability, and availability, yet all early controllers fell short of the demands of data center networks by orders of magnitude. My work addressed this early challenge. I showed that (a) poor controller performance was due to a mismatch between controller design and the workloads, and (b) OpenFlow control planes are amenable to distribution using well-known techniques. OpenFlow control workloads were I/O heavy (a high volume of tiny requests), yet the controllers handled I/O poorly. I showed that I/O batching and pipelining of compute and I/O improve the single-core throughput and latency of NOX (the original SDN controller) by 10x and 5x, respectively. The resulting controller, NOX-MT, also added parallelism to event handling which, with careful event grouping, scaled near-linearly with the number of cores. HyperFlow addresses distribution: it transparently synchronizes state across controller instances through event replication to ensure low latency and high availability. These projects were the basis of my work on the final release of NOX. The techniques I developed to improve NOX were subsequently adopted by several SDN controllers as a means of improving controller performance.
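The core of the batching technique can be sketched as an event loop that drains many pending requests per iteration and emits replies in bulk, amortizing per-I/O overhead over the whole batch. The queue stands in for a controller socket, and all names are illustrative rather than NOX-MT's actual code:

```python
from collections import deque

BATCH = 64  # illustrative batch size

def event_loop(pending, handle):
    """Drain up to BATCH pending requests per iteration, dispatch them
    together, and collect replies for a single bulk write. In a real
    controller, each iteration would correspond to one vectored read
    from the socket and one vectored write of the replies."""
    replies = []
    while pending:
        batch = [pending.popleft() for _ in range(min(BATCH, len(pending)))]
        replies.extend(handle(ev) for ev in batch)  # compute on the whole batch
    return replies

pending = deque(range(200))
out = event_loop(pending, lambda ev: ("flow_mod", ev))
print(len(out))
```

Handling 200 requests here costs roughly four loop iterations instead of 200, which is exactly the per-request overhead amortization that batching buys; pipelining then overlaps the compute step of one batch with the I/O of the next.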
OpenFlow was a significant step forward for implementing new network services, yet scaling remained ad-hoc and challenging. This was mainly due to a pragmatic compromise: OpenFlow had a single, fixed model of the data plane. Over time, it became evident that this model failed to provide the promised flexibility or hardware simplicity. It lacked the generality to support evolving data-plane functions, and it was not simple either: to support a reasonably broad set of existing data-plane functions, OpenFlow hardware was far more complex than necessary for simple forwarding. In Fabric, we propose an alternative SDN architecture that enables a plurality of data-plane functions without compromising on hardware simplicity. Fabric distinguishes between the network core and edge: the core is composed of forwarding elements that collectively provide packet transport (the traditional network function), whereas complex, semantically-rich services are implemented at the edge. This separation allows network functions to evolve independently from forwarding and helps avoid common OpenFlow performance and scalability pitfalls. Nor does it impede generality: as a case in point, I designed and implemented a purely edge-based variant of Information-Centric Networking, idICN, which shows that, contrary to what ICN proposals claimed, ICN could be realized on existing networks without a fork-lift upgrade.