Monday, August 22, 2011

Soft RoCE, an alternative to Soft iWarp

Introduction

The Soft RoCE distribution is now available as a specially patched OFED-1.5.2 distribution, known as OFED-1.5.2-rxe. Users familiar with the installation and configuration of OFED software will find it easy to use. It is supported by System Fabric Works. Please refer to the official Soft RoCE website for further details.

Features: 

Provide InfiniBand-like performance and efficiency on ubiquitous Ethernet infrastructure.
  • Utilize the same transport and network layers from the IB stack and swap the link layer for Ethernet.
  • Implement IB verbs over Ethernet.
  • Not quite IB strength, but it’s getting close.
  • As of OFED 1.5.1, code written for OFED RDMA automagically works with RoCE.

Performance:
(from “Implementation & Comparison of RDMA over Ethernet”)


  • RoCE is capable of providing near-InfiniBand QDR performance for:
    • Latency-critical applications at message sizes from 128B to 8KB
    • Bandwidth-intensive applications for messages <1KB.
  • Soft RoCE is comparable to hardware RoCE at message sizes above 65KB.
  • Soft RoCE can improve performance where RoCE-enabled hardware is unavailable.

Installation 

The Soft RoCE distribution contains the entire OFED-1.5.2 distribution, with the addition of the Soft RoCE code.

Download link: http://www.systemfabricworks.com/downloads/roce

Installation of the OFED-1.5.2-rxe distribution works exactly the same as a “standard” OFED distribution. Installation can be accomplished interactively, via the “install.pl” program, or automatically via “install.pl -c ofed.conf”. The required new components are “librxe” and “ofa-kernel”. The latter is not new, but in the “rxe” version of the OFED distribution it includes the rxe/Soft RoCE kernel module.
 

Usage

After installing OFED-1.5.2-rxe, you can use the rxe_cfg command to configure Soft-RoCE. A few of the most useful commands are listed here.
# rxe_cfg -h
Usage:
rxe_cfg [options] start|stop|status|persistent|devinfo
rxe_cfg debug on|off| (Must be compiled in for this to work)
rxe_cfg crc enable|disable
rxe_cfg mtu [rxe0]  (set ethernet mtu for one or all rxe transports)
rxe_cfg [-n] add eth0
rxe_cfg [-n] remove rxe1|eth2
Options:
 -n: do not make the configuration action persistent
 -v: print additional debug output
 -l: in status display, show only interfaces with link up
 -h: print this usage information
 -p 0x8916: (start command only) - use specified (non-default) eth_proto_id
1. Enable the Soft-RoCE module
rxe_cfg start
# rxe_cfg start

Name  Link  Driver   Speed   MTU   IPv4_addr        S-RoCE  RMTU
eth0  yes   bnx2             1500  198.124.220.136
eth1  yes   bnx2             1500
eth2  yes   iw_nes           9000  198.124.220.196
eth3  yes   mlx4_en  10GigE  1500  192.168.2.3
rxe eth_proto_id: 0x8915

2. Disable the Soft-RoCE module
rxe_cfg stop

3. Add an Ethernet interface to the Soft-RoCE module
rxe_cfg add [ethx]

4. Remove an Ethernet interface from the Soft-RoCE module
rxe_cfg remove [ethx|rxex] 
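
The steps above can be strung together in a small script. This is only a sketch: enable_soft_roce and run are hypothetical helper names (not part of rxe_cfg), eth3 is just the 10GigE interface from the sample status table, and the script prints the commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch: enable Soft-RoCE on one Ethernet interface.
# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# enable_soft_roce IFACE: hypothetical wrapper around the rxe_cfg steps.
enable_soft_roce() {
    iface="$1"
    run rxe_cfg start          # step 1: enable the Soft-RoCE module
    run rxe_cfg add "$iface"   # step 3: add the Ethernet interface
    run rxe_cfg status         # show the resulting interface table
}

enable_soft_roce eth3
```

Running it with DRY_RUN=1 first is a cheap way to check the command sequence before touching the network configuration.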
 
Tuning for performance:

1) MTU size
The Soft-RoCE interface supports only four MTU sizes: 512, 1024, 2048 and 4096. To maximize performance, choose 4096.
Commands: ifconfig [ethx] mtu 9000 // enable jumbo frames on the underlying Ethernet interface.
rxe_cfg mtu [rxex] 4096 // set the maximum MTU on the corresponding rxe interface.


Note: you also need to enable jumbo frame support on your switch.
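
To see why the jumbo frame setting matters, here is a rough sketch of how one might pick the largest Soft-RoCE MTU that still fits inside a given Ethernet MTU. The function name pick_rxe_mtu is hypothetical, and HEADER_ROOM is an assumed allowance for the RoCE headers carried in each frame, not a figure taken from the rxe documentation.

```shell
#!/bin/sh
# Sketch: choose the largest supported Soft-RoCE MTU for an Ethernet MTU.
# HEADER_ROOM is an ASSUMED per-packet header allowance, not a documented
# rxe value; adjust it if you know the real overhead.
HEADER_ROOM="${HEADER_ROOM:-128}"

# pick_rxe_mtu ETH_MTU: print the largest of 4096/2048/1024/512 whose
# size plus HEADER_ROOM fits in ETH_MTU; return 1 if none fits.
pick_rxe_mtu() {
    eth_mtu="$1"
    for m in 4096 2048 1024 512; do
        if [ $(( $m + $HEADER_ROOM )) -le "$eth_mtu" ]; then
            echo "$m"
            return 0
        fi
    done
    return 1
}

pick_rxe_mtu 9000   # jumbo frames
pick_rxe_mtu 1500   # standard Ethernet MTU
```

Under this assumption, a 9000-byte Ethernet MTU leaves room for the 4096 setting, while a standard 1500-byte MTU only leaves room for 1024.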

2) CRC checking
To maximize performance, disable CRC checking.
Commands: rxe_cfg crc disable

3) Ethernet tx queue length
Also, set a large txqueuelen value on the underlying Ethernet interface.
Commands: ifconfig [ethx] txqueuelen 10000
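
The three tuning steps can be collected into one script. Again this is a sketch under assumptions: tune_soft_roce and run are hypothetical helper names, eth3 and rxe0 are example interface names, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: apply the MTU, CRC and tx queue tuning in one go.
# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# tune_soft_roce ETH_IFACE RXE_IFACE: hypothetical wrapper for the steps.
tune_soft_roce() {
    eth="$1"
    rxe="$2"
    run ifconfig "$eth" mtu 9000          # 1) jumbo frames on Ethernet
    run rxe_cfg mtu "$rxe" 4096           #    largest Soft-RoCE MTU
    run rxe_cfg crc disable               # 2) turn off CRC checking
    run ifconfig "$eth" txqueuelen 10000  # 3) longer tx queue
}

tune_soft_roce eth3 rxe0
```

Remember that the jumbo frame setting also has to be enabled on the switch, which this script cannot do.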