Introduction
The Soft RoCE distribution is available now as a specially patched OFED-1.5.2 distribution, known as OFED-1.5.2-rxe. Users familiar with installing and configuring OFED software will find it easy to use. It is supported by System Fabric Works. Please refer to the official Soft RoCE website for further details.
Features:
Provide Infiniband-like performance and efficiency to ubiquitous Ethernet infrastructure.
- Utilize the same transport and network layers from the IB stack and swap the link layer for Ethernet.
- Implement IB verbs over Ethernet.
- Not quite IB strength, but it’s getting close.
- As of OFED 1.5.1, code written for OFED RDMA auto-magically works with RoCE.
(from "Implementation & Comparison of RDMA over Ethernet")
- RoCE is capable of providing near-Infiniband QDR performance for:
  - Latency-critical applications at message sizes from 128B to 8KB
  - Bandwidth-intensive applications for messages <1KB.
- Soft RoCE is comparable to hardware RoCE at message sizes above 65KB.
- Soft RoCE can improve performance where RoCE-enabled hardware is unavailable.
Installation
The Soft RoCE distribution contains the entire OFED-1.5.2 distribution, with the addition of the Soft RoCE code.
Download link : http://www.systemfabricworks.com/downloads/roce
Installation of the OFED-1.5.2-rxe distribution works exactly the same as a "standard" OFED distribution. Installation can be done interactively, via the "install.pl" program, or automatically via "install.pl -c ofed.conf". The required new components are "librxe" and "ofa-kernel". The latter is not new, but in our "rxe" version of the OFED distribution it includes the rxe/Soft RoCE kernel module.
Usage
After installing OFED-1.5.2-rxe, you can use the rxe_cfg command to configure Soft-RoCE. Here I list the most useful commands.
# rxe_cfg -h
Usage:
  rxe_cfg [options] start|stop|status|persistent|devinfo
  rxe_cfg debug on|off| (Must be compiled in for this to work)
  rxe_cfg crc enable|disable
  rxe_cfg mtu [rxe0] (set ethernet mtu for one or all rxe transports)
  rxe_cfg [-n] add eth0
  rxe_cfg [-n] remove rxe1|eth2
Options:
  -n: do not make the configuration action persistent
  -v: print additional debug output
  -l: in status display, show only interfaces with link up
  -h: print this usage information
  -p 0x8916: (start command only) - use specified (non-default) eth_proto_id
1. Enable the Soft-RoCE module
rxe_cfg start
# rxe_cfg start
Name  Link  Driver   Speed   MTU   IPv4_addr        S-RoCE  RMTU
eth0  yes   bnx2             1500  198.124.220.136
eth1  yes   bnx2             1500
eth2  yes   iw_nes           9000  198.124.220.196
eth3  yes   mlx4_en  10GigE  1500  192.168.2.3
rxe eth_proto_id: 0x8915
2. Disable the Soft-RoCE module
rxe_cfg stop
3. Add an Ethernet interface to the Soft-RoCE module
rxe_cfg add [ethx]
4. Remove an Ethernet interface from the Soft-RoCE module
rxe_cfg remove [ethx|rxex]
Tuning for performance:
1) MTU size
The Soft-RoCE interface supports only four MTU sizes: 512, 1024, 2048 and 4096. To maximize performance, choose 4096.
Commands: ifconfig [ethx] mtu 9000 // enable jumbo frames on the underlying Ethernet interface.
rxe_cfg mtu [rxex] 4096 // set the maximum MTU on the corresponding rxe interface.
Note: you also need to enable jumbo frame support on your switch.
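The choice of rxe MTU can be sketched as picking the largest of the four supported sizes that still fits inside the Ethernet MTU. This is only an illustration: the function name pick_rxe_mtu and the 128-byte header allowance are assumptions made for this example, not values taken from the rxe driver.

```shell
#!/bin/sh
# Sketch: pick the largest Soft-RoCE MTU (512, 1024, 2048 or 4096) that fits
# in a given Ethernet MTU.
# HDR_ALLOWANCE is an ASSUMED allowance for protocol headers, not a value
# taken from the rxe driver; adjust it for your setup.
HDR_ALLOWANCE=${HDR_ALLOWANCE:-128}

pick_rxe_mtu() {
    eth_mtu=$1
    best=""
    for size in 512 1024 2048 4096; do
        # Keep the largest size whose payload plus allowance still fits.
        if [ $((size + HDR_ALLOWANCE)) -le "$eth_mtu" ]; then
            best=$size
        fi
    done
    if [ -z "$best" ]; then
        echo "Ethernet MTU $eth_mtu is too small for any rxe MTU" >&2
        return 1
    fi
    echo "$best"
}

# Example: a standard 1500-byte frame versus a 9000-byte jumbo frame.
pick_rxe_mtu 1500
pick_rxe_mtu 9000
```

With the assumed allowance, a 1500-byte Ethernet MTU leaves room only for 1024, which is why jumbo frames are needed before 4096 can be used.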
2) CRC checking
To maximize performance, disable CRC checking.
Commands: rxe_cfg crc disable
3) Ethernet tx queue length
Also, set the txqueuelen parameter of the underlying Ethernet interface to a large value.
Commands: ifconfig [ethx] txqueuelen 10000
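The three tuning steps above can be collected into one small script. The sketch below only prints the commands, so they can be reviewed before being run as root. The function name print_tuning_cmds and the interface names eth3 and rxe0 are placeholders chosen for this example.

```shell
#!/bin/sh
# Sketch: print the tuning commands from this section for one Ethernet
# interface and its rxe interface. Nothing is executed here; the output is
# meant to be reviewed first.
print_tuning_cmds() {
    eth_if=$1   # underlying Ethernet interface (placeholder, e.g. eth3)
    rxe_if=$2   # corresponding rxe interface (placeholder, e.g. rxe0)
    echo "ifconfig $eth_if mtu 9000"          # 1) jumbo frames on Ethernet
    echo "rxe_cfg mtu $rxe_if 4096"           # 1) largest rxe MTU
    echo "rxe_cfg crc disable"                # 2) turn off CRC checking
    echo "ifconfig $eth_if txqueuelen 10000"  # 3) longer tx queue
}

print_tuning_cmds eth3 rxe0
```

Once the printed commands look right for your interfaces, they could be applied by piping the output to a root shell.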
Is there an equivalent perftest for Soft RoCE? These tests look for an IB device and quit.
In C code you can use:
"ibv_get_device_list" to get the device list
"ibv_query_device" to get the device info
Or in shell, if you installed the utility tools (see post on soft-iwarp):
ibv_devinfo
ibv_devices
These tools will give you info on the IB devices present.
Then you just have to wrap that with shell commands (grep, sed, etc.) to test for whatever device you are looking for.
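The shell approach can be sketched as follows. It assumes that ibv_devices prints two header lines followed by one line per device, with the device name in the first column, and that Soft RoCE devices are named rxe0, rxe1, and so on. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch: detect a Soft RoCE device by filtering ibv_devices output.
# ASSUMPTIONS: two header lines, then one device per line with the name in
# column one; Soft RoCE devices are named rxe<N>.

# Read ibv_devices-style output on stdin and print only the device names.
list_rdma_devices() {
    awk 'NR > 2 && NF > 0 { print $1 }'
}

# Succeed if the output on stdin lists at least one rxe device.
has_rxe_device() {
    list_rdma_devices | grep '^rxe[0-9]' >/dev/null
}

# Only query the system if the tool is actually installed.
if command -v ibv_devices >/dev/null 2>&1; then
    if ibv_devices | has_rxe_device; then
        echo "Soft RoCE device found"
    else
        echo "no Soft RoCE device found"
    fi
fi
```

Because both functions read from stdin, the same filter can be tried on saved output from another machine.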
thanks.
Hi
Correct me if I am wrong, but this means that we could use PCs available on the market with ordinary NIC cards to create an OFED cluster, correct?
Yes, you can use SoftIwarp or SoftRoCE to create an OFED cluster. However, be warned that the performance will be much lower than the HW version.
It is a nice low-cost alternative solution for dev/testing.
hi,
how about the performance of soft RoCE over 1 gigabit ethernet? does it give better performance than 1GbE?
Hi mny, I am not sure I understand your question. Do you mean softIwarp over 1 GbE vs softRoCE over 1 GbE?
softRoCE tends to provide better performance as there are fewer software layers to go through. However, SoftIwarp is making good progress in terms of performance. Also, with SoftIwarp you are not limited to your local LAN and can route packets over the internet, since it runs over TCP/IP.
hi,
thanks for replying.....
i am a student of M.Tech & i have chosen this topic for my thesis.
my idea is to first separately measure the performance of 1 GbE ethernet & softRoCE using the OMB & IMB benchmarks.
then i am going to compare the OMB & IMB results to see which one gives the better performance.
i have a few doubts....
is it possible to do this???
or is there another meaning of this line: "performance evaluation of Soft RoCE over 1 gigabit ethernet"?
if it is possible, then does soft RoCE perform well over ethernet in terms of bandwidth, latency....
i am a bit confused... i am new to this area. plz help me & guide me on whether i have chosen the right path????
will wait for ur reply...
Yes, it is possible. You will be running the MPI program/benchmark and comparing standard TCP sockets over 1GbE vs RDMA with softRoCE over 1GbE.
Note that I would suggest running a pure network benchmark first, in order to compare the performance and identify potential limitations early.
hi Benoit,
thanks for replying.... if u can provide me some knowledge about soft RoCE, then it will be very helpful for me.