Artifact for the paper "In-Network Leaderless Replication for Distributed Data Stores" (VLDB 2022).
NetLR performs data replication entirely inside the programmable
switch. This repository ships the switch data plane (netlr.p4) and
two matching switch-side controllers — a Python 2 controller for SDE
9.2.0 (the VLDB '22 version) and a Python 3 controller for SDE 9.7.0
(recommended).
Client / server applications are not part of this artifact. See Client / server applications below for what a replacement implementation needs to do.
.
├── netlr.p4 # P4 data plane (Tofino1)
├── controller.py # Python 3 controller (SDE 9.7.0, recommended)
├── controller_SDE9.2.0.py # Python 2 controller (SDE 9.2.0, VLDB '22 version)
├── README.md
└── LICENSE
- Quick start
- Hardware dependencies
- Software dependencies
- Installation
- Experiment workflow
- Client / server applications
- Citation
For experienced operators who just want the end-to-end flow:
- Install. Place netlr.p4 and controller.py in the SDE directory; configure cluster IPs / switch ports inside controller.py.
- Compile. cmake ${SDE}/p4studio -DP4_NAME=netlr -DP4_PATH=${SDE}/netlr.p4 ..., then make && make install.
- Three terminals on the switch control plane. run_switchd.sh -p netlr → bring up the data-plane ports via run_bfshell.sh → python3 controller.py.
- Drive traffic. Run your own client / server application against the switch pipeline (see Client / server applications).
The rest of this document expands each step.
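For operators who prefer to script the compile step, the following is a minimal sketch that assembles the cmake, make, and make install invocations from the SDE paths. The flags mirror the command in the Installation section; the function name `build_compile_commands` and the fallback path are illustrative assumptions, not part of the artifact.

```python
import os
import shlex


def build_compile_commands(sde, sde_install=None, p4_name="netlr"):
    """Return argument lists for cmake, make, and make install.

    `sde` is the SDE root; `sde_install` defaults to <sde>/install.
    Nothing is executed here; the caller decides how to run the commands.
    """
    sde_install = sde_install or os.path.join(sde, "install")
    cmake = [
        "cmake",
        os.path.join(sde, "p4studio"),
        f"-DCMAKE_INSTALL_PREFIX={sde_install}",
        f"-DCMAKE_MODULE_PATH={os.path.join(sde, 'cmake')}",
        f"-DP4_NAME={p4_name}",
        f"-DP4_PATH={os.path.join(sde, p4_name + '.p4')}",
    ]
    return [cmake, ["make"], ["make", "install"]]


if __name__ == "__main__":
    # The fallback path is only an example; set SDE for a real run.
    sde_root = os.environ.get("SDE", "/home/admin/bf-sde-9.7.0")
    for cmd in build_compile_commands(sde_root, os.environ.get("SDE_INSTALL")):
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try on a machine without the SDE; passing each list to `subprocess.run(cmd, check=True)` would execute them.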
- At least 3 nodes (1 client + 2 servers) for end-to-end experiments.
- A programmable switch with an Intel Tofino1 ASIC.
Switch — paper-era setup (VLDB '22):
- Ubuntu 20.04 LTS with Linux kernel 5.4
- Python 2.7
- Intel P4 Studio SDE 9.2.0 and BSP 9.2.0
Switch — refreshed setup (validated 2024-01-28):
- Ubuntu 20.04 LTS with Linux kernel 5.4
- Python 3.8.10
- Intel P4 Studio SDE 9.7.0 and BSP 9.7.0
The Python 3 controller (controller.py) is recommended; the Python
2 version (controller_SDE9.2.0.py) is kept for reproducing the
original VLDB '22 setup.
- Place netlr.p4 and controller.py (or controller_SDE9.2.0.py) in the SDE directory on the switch control plane.
- Configure cluster information in the controller: IP addresses, switch port mapping, and related constants live near the top of the file.
- Compile netlr.p4 against the SDE:

      cmake ${SDE}/p4studio \
          -DCMAKE_INSTALL_PREFIX=${SDE_INSTALL} \
          -DCMAKE_MODULE_PATH=${SDE}/cmake \
          -DP4_NAME=netlr \
          -DP4_PATH=${SDE}/netlr.p4
      make
      make install

  ${SDE} and ${SDE_INSTALL} are the usual Intel paths (e.g. SDE=/home/admin/bf-sde-9.7.0, SDE_INSTALL=${SDE}/install).

  Compilation emits a handful of warnings: --Wwarn=unused on update_lseq_table, get_valid_replica_table, and their register actions (update_lseq, get_valid_replica), and --Wwarn=uninitialized_out_param on ig_md in SwitchIngressParser and on return_value in several register-action apply methods. These are expected and benign; the unused ones reflect dead-code paths kept around for legacy compatibility.

  Expected compilation output:
    -- P4_LANG: p4-16
    P4C: /home/tofino/bf-sde-9.7.0/install/bin/bf-p4c
    P4C-GEN_BRFT-CONF: /home/tofino/bf-sde-9.7.0/install/bin/p4c-gen-bfrt-conf
    P4C-MANIFEST-CONFIG: /home/tofino/bf-sde-9.7.0/install/bin/p4c-manifest-config
    -- P4_NAME: netlr
    -- P4_PATH: /home/tofino/bf-sde-9.7.0/netlr.p4
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/tofino/bf-sde-9.7.0
    [ 0%] Built target bf-p4c
    [ 0%] Built target driver
    [100%] Generating netlr/tofino/bf-rt.json
    /home/tofino/bf-sde-9.7.0/netlr.p4(186): [--Wwarn=unused] warning: Table update_lseq_table is not used; removing
    table update_lseq_table{
          ^^^^^^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(313): [--Wwarn=unused] warning: Table get_valid_replica_table is not used; removing
    table get_valid_replica_table{
          ^^^^^^^^^^^^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(177): [--Wwarn=unused] warning: update_lseq: unused instance
    RegisterAction<bit<32>, _, bit<32>>(lseq) update_lseq = {
                                              ^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(303): [--Wwarn=unused] warning: get_valid_replica: unused instance
    RegisterAction<bit<8>, _, bit<8>>(num_valid_replica) get_valid_replica = {
                                                         ^^^^^^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(122): [--Wwarn=uninitialized_out_param] warning: out parameter 'ig_md' may be uninitialized when 'SwitchIngressParser' terminates
    out metadata_t ig_md,
                   ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(119)
    parser SwitchIngressParser(
           ^^^^^^^^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(178): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
                                                    ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(178)
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
         ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(244): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
                                                    ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(244)
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
         ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(322): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<8> reg_value, out bit<8> return_value) {
                                                  ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(322)
    void apply(inout bit<8> reg_value, out bit<8> return_value) {
         ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(342): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
                                                    ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(342)
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
         ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(467): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
                                                    ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(467)
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
         ^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(487): [--Wwarn=uninitialized_out_param] warning: out parameter 'return_value' may be uninitialized when 'apply' terminates
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
                                                    ^^^^^^^^^^^^
    /home/tofino/bf-sde-9.7.0/netlr.p4(487)
    void apply(inout bit<32> reg_value, out bit<32> return_value) {
         ^^^^^
    [100%] Built target netlr-tofino
    [100%] Built target netlr
    [ 0%] Built target bf-p4c
    [ 0%] Built target driver
    [100%] Built target netlr-tofino
    [100%] Built target netlr
    Install the project...
    -- Install configuration: "RelWithDebInfo"
    -- Up-to-date: /home/tofino/bf-sde-9.7.0/install/share/p4/targets/tofino
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/p4/targets/tofino/netlr.conf
    -- Up-to-date: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr
    -- Up-to-date: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/pipe
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/pipe/tofino.bin
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/pipe/context.json
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/events.json
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/source.json
    -- Installing: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/bf-rt.json
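Since the build is expected to emit only `unused` and `uninitialized_out_param` warnings, a saved build log can be scanned for anything outside those two categories. The following is a rough sketch; the helper name `unexpected_warnings` and the regular expression are assumptions based on the warning format shown in the output above, not a tool shipped with the SDE.

```python
import re

# Warning categories this README describes as expected for netlr.p4.
EXPECTED_CATEGORIES = {"unused", "uninitialized_out_param"}

# Matches e.g. "[--Wwarn=unused] warning: Table update_lseq_table is not used"
WARNING_RE = re.compile(r"\[--Wwarn=([A-Za-z0-9_]+)\] warning: (.*)")


def unexpected_warnings(build_log):
    """Return (category, message) pairs for warnings outside the expected set."""
    found = []
    for line in build_log.splitlines():
        match = WARNING_RE.search(line)
        if match and match.group(1) not in EXPECTED_CATEGORIES:
            found.append((match.group(1), match.group(2).strip()))
    return found
```

An empty result means the log contains only the warning categories listed above; anything else deserves a closer look before running experiments.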
Open three terminals on the switch control plane.
- Terminal 1: start the P4 program.

      run_switchd.sh -p netlr

  Expected output:

      Using SDE /home/tofino/bf-sde-9.7.0
      Using SDE_INSTALL /home/tofino/bf-sde-9.7.0/install
      Setting up DMA Memory Pool
      Using TARGET_CONFIG_FILE /home/tofino/bf-sde-9.7.0/install/share/p4/targets/tofino/netlr.conf
      Using PATH /home/tofino/bf-sde-9.7.0/install/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/tofino/bf-sde-9.7.0/install/bin
      Using LD_LIBRARY_PATH /usr/local/lib:/home/tofino/bf-sde-9.7.0/install/lib::/home/tofino/bf-sde-9.7.0/install/lib
      bf_sysfs_fname /sys/class/bf/bf0/device/dev_add
      Install dir: /home/tofino/bf-sde-9.7.0/install (0x55a86a068bd0)
      bf_switchd: system services initialized
      bf_switchd: loading conf_file /home/tofino/bf-sde-9.7.0/install/share/p4/targets/tofino/netlr.conf...
      bf_switchd: processing device configuration...
      Configuration for dev_id 0
        Family : tofino
        pci_sysfs_str : /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
        pci_domain : 0
        pci_bus : 5
        pci_fn : 0
        pci_dev : 0
        pci_int_mode : 1
        sbus_master_fw: /home/tofino/bf-sde-9.7.0/install/
        pcie_fw : /home/tofino/bf-sde-9.7.0/install/
        serdes_fw : /home/tofino/bf-sde-9.7.0/install/
        sds_fw_path : /home/tofino/bf-sde-9.7.0/install/share/tofino_sds_fw/avago/firmware
        microp_fw_path:
      bf_switchd: processing P4 configuration...
      P4 profile for dev_id 0
      num P4 programs 1
        p4_name: netlr
        p4_pipeline_name: pipe
          libpd:
          libpdthrift:
          context: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/pipe/context.json
          config: /home/tofino/bf-sde-9.7.0/install/share/tofinopd/netlr/pipe/tofino.bin
        Pipes in scope [0 1 2 3 ]
        diag:
        accton diag:
        Agent[0]: /home/tofino/bf-sde-9.7.0/install/lib/libpltfm_mgr.so
        non_default_port_ppgs: 0
        SAI default initialize: 1
      bf_switchd: library /home/tofino/bf-sde-9.7.0/install/lib/libpltfm_mgr.so loaded
      bf_switchd: agent[0] initialized
      Health monitor started
      Operational mode set to ASIC
      Initialized the device types using platforms infra API
      ASIC detected at PCI /sys/class/bf/bf0/device
      ASIC pci device id is 16
      Starting PD-API RPC server on port 9090
      bf_switchd: drivers initialized
      Setting core_pll_ctrl0=cd44cbfe
      /
      bf_switchd: dev_id 0 initialized
      bf_switchd: initialized 1 devices
      Adding Thrift service for bf-platforms to server
      bf_switchd: thrift initialized for agent : 0
      bf_switchd: spawning cli server thread
      bf_switchd: spawning driver shell
      bf_switchd: server started - listening on port 9999
      bfruntime gRPC server started on 0.0.0.0:50052

      ********************************************
      *      WARNING: Authorised Access Only     *
      ********************************************

      bfshell>

- Terminal 2: configure data-plane ports with run_bfshell.sh, then ucli and pm:
  - port-add <port>/- 100G NONE followed by port-enb <port>/- for each data-plane port (replace 100G with whatever speed your testbed runs at).
  - Disable autoneg: an-set -/- 2.

  Familiarity with Intel Tofino is assumed; refer to the switch manual or Intel's documentation for additional flags.

- Terminal 3: run the controller.

      python3 controller.py

  Expected output:

      Subscribe attempt #1
      Subscribe response received 0
      Binding with p4_name netlr
      Binding with p4_name netlr successful!!
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      0.0671393871307373
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Received netlr on GetForwarding on client 0, device 0
      Port monitoring..
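With many data-plane ports, typing the Terminal 2 commands by hand gets tedious. The sketch below generates the bfshell input lines in the order described above; the function name `port_setup_commands` and the port numbers in the usage line are hypothetical, and the third `port-add` argument is treated as a FEC setting on the assumption that this is what `NONE` selects in the pm shell.

```python
def port_setup_commands(ports, speed="100G", fec="NONE"):
    """Return bfshell input lines that add and enable each data-plane port.

    `ports` holds front-panel port numbers; `speed` and `fec` are passed
    through to port-add unchanged (the README uses 100G and NONE).
    """
    lines = ["ucli", "pm"]
    for port in ports:
        lines.append(f"port-add {port}/- {speed} {fec}")
        lines.append(f"port-enb {port}/-")
    # Disable auto-negotiation on all ports, as in the step above.
    lines.append("an-set -/- 2")
    return lines


if __name__ == "__main__":
    # Ports 1, 2, and 3 are placeholders; use your testbed's port numbers.
    print("\n".join(port_setup_commands([1, 2, 3])))
```

The printed lines can be pasted into the bfshell prompt; the sketch does not talk to the switch itself.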
The client- and server-side application code is not included in this artifact: the original implementation depended on libraries and toolchains that we can no longer build cleanly on our current testbeds. The core NetLR mechanism lives entirely in the switch, so a replacement application only needs to send and receive packets — no protocol logic.
- Use a raw-socket transport. Anything that can craft and ship full Ethernet / UDP frames works: AF_PACKET, AF_XDP, DPDK, etc. The VLDB '22 implementation used the pypacker library on top of AF_PACKET.
- Do not rely on the switch rewriting L2 / L3 headers. When NetLR propagates a write to the replicas it uses multicasting and does not update destination IP / MAC, so the application is responsible for crafting these addresses correctly.
- Implement timeout-based retransmission for writes. On a write, the NetLR switch overwrites a hash slot whenever the incoming write's sequence number is larger than the one already stored. When that happens the previously-stored write can no longer be committed by its client and is observed as a packet loss — the application must time out and retransmit to make forward progress.
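The retransmission requirement in the last item can be sketched as a small retry loop. This is an illustration under assumptions, not the original client: `send` and `recv_ack` are hypothetical callables supplied by the application, the client is assumed to wait for some reply that signals a committed write, and the timeout and attempt limits are placeholder values. Whether a retransmitted write should carry a fresh sequence number is not specified here and is left to the application.

```python
import time


def send_write_with_retry(send, recv_ack, payload, timeout_s=0.01, max_attempts=5):
    """Send a write and retransmit until a reply arrives or attempts run out.

    send(payload) transmits one frame. recv_ack(deadline) returns True if a
    matching reply arrives before the time.monotonic() deadline, else False.
    Returns the number of transmissions used.
    """
    for attempt in range(1, max_attempts + 1):
        send(payload)
        if recv_ack(time.monotonic() + timeout_s):
            return attempt
    raise TimeoutError(f"write not acknowledged after {max_attempts} attempts")
```

Keeping the transport behind two callables lets the same loop run over AF_PACKET, AF_XDP, or DPDK bindings, and makes it easy to exercise with a fake transport.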
These requirements are straightforward to implement against any modern packet-IO library; the time-consuming part of the artifact (the in-switch replication protocol) is fully reproducible from the published P4 and controller code.
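As a starting point for the raw-socket requirement, the sketch below builds a complete Ethernet / IPv4 / UDP frame with only the Python standard library and sends it through AF_PACKET. It is not the original pypacker-based client. All addresses and ports are chosen by the caller, the payload is treated as opaque bytes (the NetLR header layout must be taken from netlr.p4), and the UDP checksum is left at zero, which IPv4 permits.

```python
import socket
import struct


def ipv4_checksum(header):
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_udp_frame(src_mac, dst_mac, src_ip, dst_ip, sport, dport, payload):
    """Return a full Ethernet/IPv4/UDP frame with explicit L2/L3 addresses."""
    eth = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800)
    udp_len = 8 + len(payload)
    udp = struct.pack("!HHHH", sport, dport, udp_len, 0)  # UDP checksum 0: not computed
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + udp_len,       # version/IHL, TOS, total length
        0, 0,                        # identification, flags/fragment offset
        64, socket.IPPROTO_UDP, 0,   # TTL, protocol, checksum placeholder
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
    )
    ip = ip[:10] + struct.pack("!H", ipv4_checksum(ip)) + ip[12:]
    return eth + ip + udp + payload


def send_frame(ifname, frame):
    """Send one frame on a Linux interface (requires CAP_NET_RAW or root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(frame)
```

Because the switch does not rewrite destination IP / MAC when it multicasts a write, the addresses passed to `build_udp_frame` have to be ones the receiving hosts will accept; how to pick them depends on the testbed and the controller configuration.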
Please cite this work if you refer to or use any part of this artifact:
@article{netlr,
author = {Kim, Gyuyeong and Lee, Wonjun},
title = {In-Network Leaderless Replication for Distributed Data Stores},
journal = {Proc. VLDB Endow.},
volume = {15},
number = {7},
pages = {1337--1349},
year = {2022},
month = mar,
publisher = {VLDB Endowment},
issn = {2150-8097}
}