Bracciale, L., Swamy, T., Shahbaz, M., Loreti, P., Salsano, S., & Elbakoury, H. (2022). The case for native multi-node in-network machine learning. In NativeNI 2022 - Proceedings of the 1st International Workshop on Native Network Intelligence, part of CoNEXT 2022 (pp. 8-13). Association for Computing Machinery. https://doi.org/10.1145/3565009.3569524
The case for native multi-node in-network machine learning
Bracciale L.; Swamy T.; Shahbaz M.; Loreti P.; Salsano S.; Elbakoury H.
2022
Abstract
It is now possible to run per-packet Machine Learning (ML) inference tasks in the data plane at line rate using dedicated hardware in programmable network switches. We refer to this approach as per-packet ML. Existing work in this area focuses on a single-node setup, where incoming packets are processed by the switch pipeline to extract features at different levels of granularity: packet-level, flow-level, and cross-flow-level, while also considering device-level features. The extracted features are then processed by an ML inference fabric inside the same switch. In this position paper, we propose to extend and enhance this model from a single node to a collection of nodes (including switches and servers). In fact, there are several scenarios where it is impossible for a single node to perform both feature processing (e.g., due to a lack of, or limited access to, data) and the ML inference operations. In a multi-node setup, a node can extract ML features and encode them in packets as metadata, which are then processed by another node (e.g., a switch) to execute native inference tasks. We make a case for a standard model of extracting, encoding, and forwarding features between nodes to carry out distributed, native ML inference inside networks; discuss the applicability and versatility of the proposed model; and illustrate the various open research issues and design implications.
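To make the feature-exchange idea concrete, the following Python sketch illustrates, purely as an illustrative assumption and not the authors' implementation, how an upstream node might encode a few flow-level features into a fixed-size metadata header prepended to a packet, and how a downstream node could decode that header and feed a stand-in inference step. The header layout, field names, and the threshold rule are all hypothetical.

import struct

# Hypothetical 16-byte metadata header (assumed layout, not from the paper):
# version, feature count, reserved, then three flow-level features
# (packet count, mean packet size, mean inter-arrival time in microseconds).
FMT = "!BBHIIf"
HDR_LEN = struct.calcsize(FMT)  # 16 bytes

def encode_features(pkt_count: int, mean_size: int, mean_iat_us: float,
                    payload: bytes) -> bytes:
    """Upstream node: prepend the feature header to the original packet bytes."""
    hdr = struct.pack(FMT, 1, 3, 0, pkt_count, mean_size, mean_iat_us)
    return hdr + payload

def decode_features(frame: bytes):
    """Downstream node: split a frame into (features, original packet)."""
    _ver, _nfeat, _rsvd, pkt_count, mean_size, mean_iat_us = struct.unpack(
        FMT, frame[:HDR_LEN])
    return (pkt_count, mean_size, mean_iat_us), frame[HDR_LEN:]

def infer(features) -> str:
    """Stand-in for the downstream node's native inference step, e.g. a
    shallow decision tree that a switch could realize as match-action rules."""
    pkt_count, mean_size, mean_iat_us = features
    if mean_iat_us < 100.0 and mean_size < 128:
        return "suspect"  # e.g., flag a possible scanning/flooding flow
    return "benign"

if __name__ == "__main__":
    frame = encode_features(42, 96, 50.0, b"\x45\x00...")  # placeholder payload
    feats, pkt = decode_features(frame)
    print(infer(feats))  # -> "suspect"

In a real deployment the encoding would more plausibly live in a standardized in-band telemetry header (e.g., an INT-like option) and the inference step in the switch's match-action pipeline; the sketch only shows the division of labor between the extracting node and the inferring node that the paper argues for.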