P4 Workshop 2016 has ended
Arrillaga Alumni Center: 326 Galvez Street, Stanford, CA 94305
Sponsored by: Netronome, AT&T, Cisco, Hewlett Packard Enterprise & Barefoot Networks

VMware, Mukesh Hira & Aditi Ghag
Demo: In-band Network Telemetry - Its Application to Provide Complete End-to-end Network Visibility in Virtualized Datacenter Environments (Poster)
Datacenter networks have evolved over the years in both scale and complexity of the end-to-end topology. Large datacenter networks are being deployed today using multi-tier Clos topologies with multiple paths between end-points to scale bandwidth. Network virtualization further allows for creation of virtual L2 and L3 topologies on top of a shared physical topology.
Monitoring and troubleshooting large networks with complex end-to-end topologies spanning physical switches and network virtualization layers poses several challenges. For example, consider a scenario where a particular application experiences high latency occasionally. The transient nature of the problem can make it very difficult if not impossible to identify the network element and the interfering flows that cause the occasional high latency.
We present a framework called “In-band Network Telemetry” (INT) that enables collection of end-to-end real-time state directly in the datapath. A source end-point embeds instructions in packets listing the types of network state to be collected from the network elements. Each network element inserts the requested network state in the packet as it traverses the network. P4 provides a natural way to express the kind of packet header parsing and modifications required for INT.
To highlight the value and feasibility of the framework, we present an end-to-end working prototype with Linux KVM hypervisors interconnected by behavioral models of INT-capable switches. In our prototype, VMware NSX provides network virtualization to VMs hosted in the hypervisors. Real-time network information from the INT-capable switches is exported to the VMware NSX management plane, greatly simplifying end-to-end network monitoring and troubleshooting.
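The INT mechanism described above (a source embeds instructions, each hop appends the requested state) can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the prototype's code: the instruction names, the Packet and Switch classes, and the per-switch values are all hypothetical, and the real INT header encoding is not modeled.

```python
from dataclasses import dataclass, field

# Hypothetical instruction names; the real INT header uses its own encoding.
SWITCH_ID = "switch_id"
HOP_LATENCY = "hop_latency_us"
QUEUE_DEPTH = "queue_depth"


@dataclass
class Packet:
    payload: bytes
    int_instructions: tuple = ()                       # set by the source end-point
    int_metadata: list = field(default_factory=list)   # one entry appended per hop


@dataclass
class Switch:
    switch_id: int
    hop_latency_us: int
    queue_depth: int

    def forward(self, pkt: Packet) -> Packet:
        # Insert only the state that the packet's instructions ask for.
        if pkt.int_instructions:
            state = {
                SWITCH_ID: self.switch_id,
                HOP_LATENCY: self.hop_latency_us,
                QUEUE_DEPTH: self.queue_depth,
            }
            pkt.int_metadata.append({k: state[k] for k in pkt.int_instructions})
        return pkt


def send(path, payload, instructions):
    """Source end-point: embed the instructions, then traverse the path."""
    pkt = Packet(payload, tuple(instructions))
    for switch in path:
        pkt = switch.forward(pkt)
    return pkt


# Made-up values: the middle switch is the congested one.
path = [Switch(1, 5, 2), Switch(2, 120, 40), Switch(3, 4, 1)]
pkt = send(path, b"data", [SWITCH_ID, HOP_LATENCY])
slowest = max(pkt.int_metadata, key=lambda hop: hop[HOP_LATENCY])
```

The sink can then pick out the hop that contributed the most latency from the packet itself, which is the kind of per-packet visibility the abstract argues is hard to get for transient problems.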
University of Utah, David Hancock
Demo: A Hypervisor to Magnify the Power of P4 - Demo of Core Component (Poster)
The imminent proliferation of P4-capable switches is expected to yield diverse feature sets and portability concerns.  HyPer4 is proposed as a way of providing a hypervisor for P4 programs, presenting a common interface for features not covered by the language specification, and allowing operators to deploy, modify, and automate dynamic compositions of P4 programs within a single P4-capable switch.  HyPer4 implementation efforts have produced a working prototype for the core component of HyPer4: a P4 program capable of emulating other P4 programs.  This demonstration shows: 1) how HyPer4 permits dynamic program loading and modification on a P4-capable switch even when this feature is not native to the switch; 2) how HyPer4 permits multiple P4 programs to run in parallel, each program acting on a distinct subset of received packets; and 3) how HyPer4 permits chaining of P4 programs within the switch.  This demonstration is expected to stimulate discussion regarding system architecture, interesting use cases, and feasibility on various switch targets.

USC, Yuliang Li
Demo: FlowRadar - A Better NetFlow for Data Centers (Poster)
NetFlow has been a widely used monitoring tool with a variety of applications. NetFlow maintains an active working set of flows in a hash table that supports flow insertion, collision resolution, and flow removal. This is hard to implement in the merchant silicon of data center switches, which have limited per-packet processing time. Therefore, many NetFlow implementations and other monitoring solutions have to sample or select a subset of packets to monitor. In this paper, we observe the need to monitor all the flows without sampling in short time scales. Thus, we design FlowRadar, a new way to maintain flows and their counters that scales to a large number of flows with small memory and bandwidth overhead. The key idea of FlowRadar is to encode per-flow counters with a small memory and constant insertion time at switches, and then to leverage the computing power at the remote collector to perform network-wide decoding and analysis of the flow counters. Our evaluation shows that the memory usage of FlowRadar is close to traditional NetFlow with perfect hashing. With FlowRadar, operators can get better views into their networks as demonstrated by two new monitoring applications we build on top of FlowRadar.
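The split described above (constant-time encoding at the switch, decoding at the collector) can be sketched in Python with an invertible-lookup-table style structure. This is an illustration in the spirit of the abstract, not the authors' implementation: the class name, the table sizes, the use of a Python set in place of a compact flow filter, and the integer flow IDs are all assumptions.

```python
import hashlib


class EncodedFlowSet:
    def __init__(self, cells_per_hash=64, num_hashes=3):
        self.k = num_hashes
        self.m = cells_per_hash
        n = self.k * self.m
        self.flow_xor = [0] * n      # XOR of the flow IDs mapped to each cell
        self.flow_count = [0] * n    # number of flows mapped to each cell
        self.packet_count = [0] * n  # packets of all flows mapped to each cell
        self.seen = set()            # assumption: stands in for a compact flow filter

    def _cells(self, flow_id):
        # One cell per hash function, each in its own sub-table so cells are distinct.
        cells = []
        for i in range(self.k):
            digest = hashlib.blake2b(
                flow_id.to_bytes(8, "big"), digest_size=8, salt=bytes([i])
            ).digest()
            cells.append(i * self.m + int.from_bytes(digest, "big") % self.m)
        return cells

    def add_packet(self, flow_id):
        # Switch side: a fixed number of cell updates per packet, no collision handling.
        new_flow = flow_id not in self.seen
        self.seen.add(flow_id)
        for c in self._cells(flow_id):
            if new_flow:
                self.flow_xor[c] ^= flow_id
                self.flow_count[c] += 1
            self.packet_count[c] += 1

    def decode(self):
        # Collector side: repeatedly peel cells that hold exactly one flow.
        flow_xor = list(self.flow_xor)
        flow_count = list(self.flow_count)
        packet_count = list(self.packet_count)
        decoded = {}
        progress = True
        while progress:
            progress = False
            for c in range(len(flow_count)):
                if flow_count[c] == 1:
                    flow_id, packets = flow_xor[c], packet_count[c]
                    decoded[flow_id] = packets
                    for d in self._cells(flow_id):
                        flow_xor[d] ^= flow_id
                        flow_count[d] -= 1
                        packet_count[d] -= packets
                    progress = True
        return decoded


efs = EncodedFlowSet()
truth = {0xA1: 3, 0xB2: 1, 0xC3: 5}  # hypothetical flow IDs and packet counts
for flow_id, packets in truth.items():
    for _ in range(packets):
        efs.add_packet(flow_id)
decoded = efs.decode()
```

Decoding can stall when too many flows share cells, so a table like this has to be sized for the expected number of flows. The abstract's network-wide decoding across switches is not modeled here.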
UMKC, Sejun Song & ETRI, Taesang Choi
Demo: P4 In-band Network Telemetry Use Cases - Seeing Trees and Leaves to Learn About Forests (Poster)
In managing a large-scale data center, real-time root cause analysis of system events is a difficult task, as it requires vast measurement and processing resources. Therefore, a staging approach has been used traditionally where it starts from a coarse system-level measurement, then chases after a root cause, in retrospect, with a targeted flow or packet-level analysis. However, the approach may overlook many important observations, and incur extra control latency. The recently proposed P4’s Inband Network Telemetry (“INT”) capability is promising for network management, especially because it enables packet-level data collection from the data plane.
In this demonstration, we will first discuss our experiences with extending INT to infuse system-level information into packet-level data without imposing additional control plane overhead. We choose a snapshot of system-level parameters such as CPU utilization and temperature and associate them with packet-level data such as delay and queue occupancy. For instance, CPU utilization for interrupts generally indicates packet forwarding. Although express forwarding does not interrupt the CPU, it still indicates potential wrong policies and attacks. We find meaningful correlations between CPU utilization and packet-level delay when they are infused. We will then explain our approach of applying an online machine learning scheme that enhances controllability over an ever-increasing network scale and traffic volume. The approach sheds light on creating various policies for devising dynamic network resource provisioning and adaptive failure predictions.

Stanford University, Stephen Ibanez, Lavanya Jose & Xilinx, Gordon Brebner

Demo: PERC Congestion Control Algorithm Written in P4 Compiled to NetFPGA SUME (Poster)
This demonstration will feature an instance of the PERC congestion control algorithm compiled from P4 onto a NetFPGA SUME board. The design will utilize the new P4 to FPGA design flow provided by Xilinx SDNet. PERC is a next generation congestion control algorithm designed to use explicit information about the network to converge to steady sending rates as quickly as possible. At high link speeds, traditional congestion control schemes like TCP and RCP converge too slowly to the optimal sending rates because they must react to congestion signals from the network. In contrast, PERC calculates the sending rates proactively and as a result is able to decrease the convergence time by a factor of 7 compared to reactive schemes like RCP. This fast convergence reduces tail flow completion time significantly in high speed networks. The demonstration will target the NetFPGA SUME platform because it is a line-rate, flexible, and open platform that is widely accessible for academic/industry research. The design process will examine the expressiveness of the current P4 language and its ability to describe stateful operations in a switch. 

Princeton University, Muhammad Shahbaz & Stanford University, Sean Choi

Demo: PISCES: A Programmable, Protocol-Independent Software Switch (Poster)
Virtualized data-centers use software hypervisor switches to steer packets to and from virtual machines (VMs). The switch frequently needs upgrading and customization---to support new protocol headers or encapsulations for tunneling or overlays, to improve measurement and debugging features, and even to add middlebox-like functions. Software switches are typically based on a large body of code, including kernel code. Changing the switch is a formidable undertaking requiring domain mastery of network protocol design and developing, testing, and maintaining a large, complex code-base. In this talk, we argue that changing how a software switch forwards packets should not require intimate knowledge of its implementation. Instead, it should be possible to specify how packets are processed and forwarded in a high-level domain-specific language (DSL) such as P4, then compiled down to run on the underlying software switch. We present PISCES, a software switch that is not hard-wired to specific protocols, which eases adding new features. We also show how the compiler can analyze the high-level specification to optimize forwarding performance. Our evaluation shows that PISCES performs comparably to Open vSwitch, a hardwired hypervisor switch, and that PISCES programs are about 40 times shorter than equivalent Open vSwitch programs.

Open Networking Lab, Carmelo Cascone

Demo: ONOS P4 Southbound Provider (Poster)
In this demonstration we show how the ONOS SDN platform can handle the programming and control of a P4 programmable data plane. We integrated support for the Behavioral Model v2 (BMv2) software switch in the ONOS southbound. This allows applications built on top of ONOS to program and control P4-capable switches using its high-level, control protocol-independent northbound APIs. We present a use case in which a network operator who wants to reduce congestion can pick and choose among various ONOS applications, each one bringing in a different P4 program that implements a specific path load-balancing scheme. In this case, we show how logically-centralized control can aid in the process of swapping the forwarding configuration on switches, in a way that avoids or minimizes network disruption.

Netronome, Johann Tonsing & Edwin Peer

Demo: P4-based VNF and Micro-VNF Chaining for Servers with SmartNICs (Poster)
Commodity servers equipped with Smart Network Interface Cards (SmartNICs) are being used as platforms for Network Functions Virtualisation (NFV) applications. The network traffic processing required by a specific use case is frequently expressed by forming a chain of Virtual Network Functions (VNFs). This demonstration illustrates that VNFs in the chain can be hosted on the server CPU or on the SmartNIC. It furthermore illustrates that VNFs can be decomposed into components called Micro-VNFs, with the components again being hosted on the server CPU and/or the SmartNIC. A P4 program (compiled to native code running on the SmartNIC) defines the overall semantics of the datapath within a SmartNIC equipped server and expresses how VNFs and Micro-VNFs should be composed within this platform. Mechanisms like tunnels and service headers (again programmed using P4) are employed to establish the VNF service chain across multiple network nodes.

Microsoft Azure, Lihua Yuan & Guohan Yu

Demo: Enabling Rapid Innovation in the Network Using SONiC and P4 (Poster)
Software for Open Networking in the Cloud (SONiC) is a collection of software networking components that can be used for building an open sourced network switch on a Linux distribution. SONiC works with the Switch Abstraction Interface (SAI) via which it can talk to various switching ASICs giving users access to rapid innovation in the network switching space.
P4 is a high level programming language for the networking domain. It can be used to define or describe the packet processing functions of the data plane of a network switch or any such forwarding device.
This demonstration will highlight the architecture and benefits of SONiC. It will also showcase how SONiC can use a P4 data plane for new feature development, testing and validation. A P4 program called switch.p4 has already been connected to SONiC via the SAI APIs. The talk will also cover the Packet Test Framework (PTF), which is used for checking compliance with the SAI specification.

Huawei & Xilinx, Yunsong Lu

Demo: Elastic Virtual Switch (EVS): Multi-Protocol Virtual Switching Accelerated by Xilinx FPGA with P4 Programmability (Poster)
Huawei’s Elastic Virtual Switch, EVS, has powered Huawei’s Cloud, SDN, and NFV solutions with its high-performance software implementation and a novel switching acceleration architecture, under which various hardware can be leveraged to accelerate virtual switching.
By enabling P4 on Xilinx FPGA, EVS brings a complete virtual switching solution which converges both the flexibility of P4 language and the high performance of hardware acceleration.

HKUST, Li Chen & Prof. Kai Chen

Demo: Cutting Tail Latency in the Dark with Cloudburst (Poster)
In this demonstration, we showcase a new way to cut long tail latency in short flow delivery: using coding techniques to exploit the multipath environment in data center networks. By proactively sending erasure coded symbols generated from messages on multiple paths in parallel and employing aggressive dropping to filter out delayed packets, we can achieve consistently low message delivery time without hurting other flows. The demo presents a prototype implementation, Cloudburst, with P4-powered acceleration mechanisms: 1) true-zero queueing latency by aggressive dropping; 2) per-packet dropping decisions based on experienced latency; 3) reverse path cancellation.

HKUST, Wei Bai & Prof. Kai Chen

Demo: Enabling ECN over Generic Packet Scheduling (Poster)
Explicit Congestion Notification (ECN) is crucial for production datacenters, but current queue-length based ECN/RED implementations cannot support multiple queues with generic packet schedulers, leading to either degraded network performance or violated scheduling policies. To address this challenge, we present TCN, a simple yet very effective sojourn-time based ECN solution for datacenters. Using packet sojourn-time, as opposed to queue-length, as the congestion signal, TCN is suitable for arbitrary schedulers with traffic dynamics. TCN can be easily implemented on switching chips using P4. Through extensive testbed experiments and large-scale simulations, we show TCN can strictly preserve scheduling policies while providing desirable network performance.
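A minimal Python sketch of sojourn-time based marking, as opposed to queue-length based marking, is shown below. It is an illustration only and not the TCN implementation: the Port class, the 100-microsecond threshold, and the strict-priority scheduler are hypothetical. The point it tries to show is that the marking decision depends only on how long a packet waited, so the scheduler can be swapped without touching the marking logic.

```python
from collections import deque


class Port:
    """Egress port with several queues and sojourn-time ECN marking."""

    def __init__(self, num_queues, threshold_us, pick_queue):
        self.queues = [deque() for _ in range(num_queues)]
        self.threshold_us = threshold_us  # hypothetical marking threshold
        self.pick_queue = pick_queue      # any scheduler: list of queues -> index

    def enqueue(self, queue_id, packet, now_us):
        # Remember the arrival time alongside the packet.
        self.queues[queue_id].append((packet, now_us))

    def dequeue(self, now_us):
        queue_id = self.pick_queue(self.queues)
        packet, enqueued_us = self.queues[queue_id].popleft()
        sojourn_us = now_us - enqueued_us
        # Mark on time spent in the switch, not on any queue's length.
        ce_marked = sojourn_us > self.threshold_us
        return packet, ce_marked


def strict_priority(queues):
    # Lowest index is the highest priority; assumes some queue is non-empty.
    return next(i for i, q in enumerate(queues) if q)


port = Port(num_queues=2, threshold_us=100, pick_queue=strict_priority)
port.enqueue(1, "low-prio", now_us=0)
port.enqueue(0, "high-prio", now_us=90)
first = port.dequeue(now_us=100)   # high-prio packet waited 10 us
second = port.dequeue(now_us=250)  # low-prio packet waited 250 us
```

Here the high-priority packet leaves unmarked and the low-priority packet, which waited longer than the threshold, leaves marked, without the marker knowing which scheduler was in use.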

Hewlett Packard Enterprise & Barefoot Networks, Srikrishna Gopu

Demo: P4 and OpenSwitch (Poster)
OpenSwitch is an open source Network Operating System that can run on Ethernet switches. Currently it can run on a variety of physical switches. There is a need for OpenSwitch to run on a programmable forwarding plane so that new features can be developed and validated, helping in the evolution of OpenSwitch. In this demo we will show OpenSwitch running on a P4 data plane program in a Docker container. As an example of things that can be done, we will instantiate two of these containers and show BGP peering between the two.

Forward Networks, Andreas Voellmy

Demo: Magellan - Mapping Algorithmic Policies to P4 Dataplanes (Poster)
The Algorithmic Policies (AP) SDN programming model provides a simple, familiar and expressive model for programming the behavior of both individual packet processing devices as well as network-wide packet processing behavior. The algorithmic policy model is simple for programmers, because it eliminates elements that are unnecessary in describing the input-output behavior, such as tables, matches, and actions. It is familiar, because it largely provides the same constructs of ordinary programming languages. It is expressive because it provides constructs such as complex data structures and loops.
On the other hand, effectively utilizing P4-accessible resources from algorithmic policies can be extremely challenging, because APs are expressed in a general-purpose programming language with arbitrary complex control structures and the control structures of APs can be completely oblivious to the existence of multi-tables. Hence, it is not clear at all whether one can effectively program multi-table pipelines from such APs.
This demonstration will present the architecture of Magellan, a compiler and runtime system that effectively compiles algorithmic policies to P4 pipelines and corresponding runtime controllers for the generated pipelines. The demo will present system architecture as well as algorithms for compiling APs to efficient pipeline designs and proactive runtime controllers.

Facebook & Barefoot Networks, Jithin Thomas

Demo: Using INT to Build a Real-time Network Monitoring System @ Scale (Poster)
Inband Network Telemetry (“INT”) is a framework designed to allow the collection and reporting of network state, by the data plane, without requiring intervention or work by the control plane (http://p4.org/p4/inband-network-telemetry/). In the INT architectural model, packets contain header fields that are interpreted as “telemetry instructions” by network devices. These instructions tell an INT-capable device what state to collect and write into the packet as it transits the network. SwitchID, hop latency and queue occupancy are some of the per-packet metadata that could be collected using INT. Connection Path and Latency Tracking (PLT) is a novel network monitoring application that leverages INT in a scalable manner to gain real-time visibility into a network's behavior. PLT uses INT to track the path and latency encountered by every connection and uses deduplication (from within the data plane) to do this in a scalable and efficient manner. Each time a new connection is detected or a change is detected in the path/latency of an existing connection, an "INT report" is generated and sent to a remote distributed monitoring engine. The reports enable the monitoring engine to detect a variety of anomalies in the network in real time (e.g., connection/switch congestion, unused switches, flow imbalance, etc.). They also facilitate other interesting use cases such as network behavior verification, faithful reconstruction of traffic patterns and network characterization.
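The deduplication rule described above (report only on a new connection or a path/latency change) can be sketched in Python as follows. This is a hedged illustration, not the PLT data-plane logic: the PathLatencyTracker class, the connection key format, and the 100-microsecond latency bucket used to decide what counts as a latency change are assumptions, since the abstract does not say how changes are detected.

```python
class PathLatencyTracker:
    def __init__(self, latency_bucket_us=100):
        # Assumption: latencies in the same bucket are treated as unchanged.
        self.latency_bucket_us = latency_bucket_us
        self.last_seen = {}  # connection -> (path, latency bucket)

    def observe(self, connection, switch_ids, hop_latencies_us):
        """Return an INT report dict, or None when the packet is a duplicate."""
        total_us = sum(hop_latencies_us)
        summary = (tuple(switch_ids), total_us // self.latency_bucket_us)
        if self.last_seen.get(connection) == summary:
            return None  # same path and latency bucket: suppress the report
        self.last_seen[connection] = summary
        return {
            "connection": connection,
            "path": list(switch_ids),
            "latency_us": total_us,
        }


tracker = PathLatencyTracker()
conn = ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")  # hypothetical 5-tuple
reports = [
    tracker.observe(conn, [1, 2, 3], [5, 10, 5]),    # new connection
    tracker.observe(conn, [1, 2, 3], [5, 12, 6]),    # same path, same bucket
    tracker.observe(conn, [1, 4, 3], [5, 10, 5]),    # path changed
    tracker.observe(conn, [1, 4, 3], [5, 400, 5]),   # latency jumped
]
sent = [r for r in reports if r is not None]
```

Only three of the four packets produce a report, which is how per-connection tracking can stay far below per-packet reporting volume.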

Cornell University, Han Wang

Demo: P4FPGA, Open-source P4 Backend for FPGA (Poster)
An emerging trend in software-defined networking is the programmable, protocol-independent data plane, which can be configured using a high-level language like P4. Unfortunately, there is no open-source framework available to validate data planes designed in P4 in programmable hardware. We present P4FPGA, an open-source P4 backend, which generates data-plane pipelines on FPGA platforms from a P4 program. Our framework supports both NetFPGA-SUME (Xilinx) and DE5-Net (Altera) platforms. We will demonstrate how a P4 data plane pipeline and control channel are implemented and generated in P4FPGA from a P4 program. Further, we will demonstrate a P4 application: a data plane consensus protocol (NetPaxos) running on top of the P4FPGA framework. P4FPGA bridges the implementation gap from a high-level declarative data-plane language to low-level hardware implementation on FPGAs. As a result, P4FPGA is well-suited for experimenting with and prototyping customized router, switch, and NIC data planes.

Concordia University, Samar Abdi

Demo: Early Validation of P4 Programs on Target Models with PFPSim (Poster)
In this demonstration, we show the integration of P4 applications with simulation models of target packet-processing platforms. We introduce PFPSim, a Programmable Forwarding Plane Simulator, for simulating and debugging packet processing applications on programmable targets. We define a Forwarding Architecture Description (FAD) language to specify the target hardware/software architecture at a high abstraction level. The behavioral description of various target modules is specified in C++. A significant part of the behavioral description, including parsing and match-table application, comes from the P4 behavioral model library (bmv2). An executable model of the P4 program, mapped to the target, is automatically generated from the FAD and behavioral specifications in PFPSim. P4 programmers can evaluate their application on different potential targets before hardware availability using PFPSim. Conversely, forwarding plane designers can use PFPSim to optimize their platform for given P4 applications. We demonstrate the evaluation of sample P4 applications on models of many-core NPUs and reconfigurable match-table pipelines as examples of our methodology.

CESNET, Pavel Benacek
Demo: P4-to-FPGA: Generating High Speed Networking Devices (Poster)
The talk introduces the process of automated transformation of a P4 program to VHDL code suitable for FPGAs. We will describe the transformation process, the generated architecture, and the results of this ongoing research. The main idea of our approach is based on mapping directed acyclic graphs to a linear pipeline of processing elements. The output VHDL module is highly configurable in terms of area, latency and throughput trade-offs. Our main focus so far has been on the effective implementation of parsing and deparsing FPGA modules working at 100 Gbps and beyond. Both hardware architectures are inspired by previous hand-written code and combine fixed infrastructure code templates with modules generated from scratch. We were able to automate some of the manual optimizations to achieve low resource usage without sacrificing generality. We are ready to show a short demo of our P4-HLIR based generator of VHDL code.
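One plausible way to map an acyclic graph onto a linear pipeline is to place nodes in a topological order, so that every element appears after the elements it depends on. The Python sketch below illustrates that idea with the standard graphlib module (Python 3.9 or later). It is an assumption about the general technique, not a description of the authors' generator, and the parse graph is a made-up example.

```python
from graphlib import TopologicalSorter

# Hypothetical parse graph: each header maps to the headers that may precede it.
parse_graph = {
    "ethernet": set(),
    "vlan": {"ethernet"},
    "ipv4": {"ethernet", "vlan"},
    "ipv6": {"ethernet", "vlan"},
    "tcp": {"ipv4", "ipv6"},
    "udp": {"ipv4", "ipv6"},
}

# A topological order gives one processing element per header, in pipeline order.
pipeline = list(TopologicalSorter(parse_graph).static_order())
stage_of = {header: stage for stage, header in enumerate(pipeline)}
```

In such a layout a packet passes through every stage in order, and a stage whose header is absent from the packet would simply pass the packet through unchanged.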

Barefoot Networks, Pierce McEntagart
Demo: FBOSS Support for P4 Data Planes (Poster)
The Facebook Open Switching System (FBOSS) is a software stack for control and management of network switches. Facebook open sourced a part of FBOSS called the Agent Daemon. This is a key piece of FBOSS which controls and configures the forwarding ASIC in a network switch. The talk will show the work we have done to port the FBOSS Agent Daemon onto a P4 data plane via the Switch Abstraction Interface (SAI) APIs.
In this demonstration, we will show FBOSS Agent Daemon running on a P4 data plane program.

AT&T, Tom Tofigh & Netronome, Nic Viljoen

Demo: Dynamic Analytics for Programmable NICs Utilizing P4 - Identification and Custom Tagging of Elastic Telecoms Traffic (Poster)
In conjunction with the talk by AT&T and as a follow-up to the talk at the ONS conference, this demonstration will showcase the use of P4 on an intelligent server adapter to enable identification and custom tagging for the rerouting of elastic traffic within a telecoms data center for virtualized compute nodes. This identification is done using real-time dynamic measurements of flows at the Network Interface Card (NIC). Real-time dynamic measurement of flows at the NIC is critical for cloud-centric service models and service automation. Enabling applications such as security, root cause analysis, big data analytics, and traffic engineering to subscribe to P4 interfaces to adjust the depth and complexity of flow monitoring could enable a new wave of sophisticated features and opportunities. The purpose of the demo is to illustrate how these types of applications would benefit from a P4 framework by using P4 interfaces for advanced and customized flow measurements.