Saturday, October 17, 2020

CryptMPI: A Fast Encrypted MPI Library

As more #HPC applications move to cloud infrastructure, protecting sensitive HPC data in such an environment becomes critical.

But HPC solutions tend to fall short when it comes to security, because security features are often perceived as detrimental to application performance.

For example, encrypted communication has long been seen as incurring significant overhead when you are aiming for microsecond latency.

The authors of the CryptMPI paper demonstrate that you can ensure the privacy and integrity of sensitive data with minimal performance degradation using an enhanced MPI library.

I hope to see this kind of feature integrated as standard in a future version of MPI.



Wednesday, July 08, 2020

Software RDMA revisited : setting up SoftiWARP on Ubuntu 20.04

Almost ten years ago I wrote about installing SoftiWARP on Ubuntu 10.04. Today I will be revisiting the process. First, what is SoftiWARP? SoftiWARP is a software-based iWARP stack that runs at reasonable performance levels and fits seamlessly into the OFA RDMA environment. It is a software RDMA device that attaches to active network interfaces to enable RDMA programming. Anyone starting with RDMA programming might not have RDMA-enabled hardware at hand, so SoftiWARP is a very useful tool for setting up an RDMA environment to code and experiment with.

To install SoftiWARP you have to go through 4 stages: setting up the environment, installing SoftiWARP, configuring SoftiWARP, and testing.

Setting up RDMA environment

Before you start, you should prepare the environment for building a kernel module and userspace libraries.

Basic build environment

sudo apt-get install build-essential libelf-dev cmake

Installing userspace libraries and tools

sudo apt-get install libibverbs1 libibverbs-dev librdmacm1 \
librdmacm-dev rdmacm-utils ibverbs-utils

Insert common RDMA kernel modules

sudo modprobe ib_core
sudo modprobe rdma_ucm

Check if everything is correctly installed : 

sudo lsmod | grep rdma 

You should see something like this : 

rdma_ucm               28672  0
ib_uverbs             126976  1 rdma_ucm
rdma_cm                61440  1 rdma_ucm
iw_cm                  49152  1 rdma_cm
ib_cm                  57344  1 rdma_cm
ib_core               311296  5 rdma_cm,iw_cm,rdma_ucm,ib_uverbs,ib_cm
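The same check can be scripted. The sketch below is only an illustration: missing_modules is a hypothetical helper (not part of any RDMA package) that reads a file in /proc/modules format and prints each expected module that is not listed.

```shell
# Hypothetical helper: print every expected module that is NOT loaded.
# $1 is a file in /proc/modules format, the remaining arguments are module names.
missing_modules() {
    list="$1"
    shift
    for m in "$@"; do
        # The module name is the first whitespace-separated field of each line.
        awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$list" || echo "$m"
    done
}
```

Calling missing_modules /proc/modules ib_core rdma_ucm rdma_cm prints nothing when all three modules are loaded.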

Now install the build dependencies for the userspace libraries :

sudo apt-get install build-essential cmake gcc libudev-dev libnl-3-dev \
libnl-route-3-dev ninja-build pkg-config valgrind

Installing SoftiWARP

Ten years ago you had to clone the SoftiWARP source code and build it yourself. Now you are lucky: it has been included by default in the Linux kernel since version 5.3!

You just have to type : 

sudo modprobe siw

Verify that it works :

sudo lsmod | grep siw

You should see :

siw                   188416  0
ib_core               311296  6 rdma_cm,iw_cm,rdma_ucm,ib_uverbs,siw,ib_cm
libcrc32c              16384  3 nf_conntrack,nf_nat,siw

Moreover, you should check whether an InfiniBand device node is present :

ls /dev/infiniband 

Result : 


You also need the file /etc/udev/rules.d/90-ib.rules, containing the entries below :

 ####  /etc/udev/rules.d/90-ib.rules  ####
 KERNEL=="umad*", NAME="infiniband/%k"
 KERNEL=="issm*", NAME="infiniband/%k"
 KERNEL=="ucm*", NAME="infiniband/%k", MODE="0666"
 KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0666"
 KERNEL=="uat", NAME="infiniband/%k", MODE="0666"
 KERNEL=="ucma", NAME="infiniband/%k", MODE="0666"
 KERNEL=="rdma_cm", NAME="infiniband/%k", MODE="0666"

If it doesn't exist you need to create it.

I also suggest loading the modules at boot by adding them to the /etc/modules file.

You now need to reboot your system.
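As a minimal sketch of that step: add_boot_module below is a hypothetical helper, not an existing tool. It appends a module name to a modules file only if the name is not already listed, so it is safe to run more than once. The default path is /etc/modules, which needs root to write.

```shell
# Hypothetical helper: append a module name to a modules file unless it is
# already listed. The file defaults to /etc/modules (root needed to write it).
add_boot_module() {
    module="$1"
    file="${2:-/etc/modules}"
    # -x matches whole lines, -F treats the name as a fixed string, -q is quiet.
    if ! grep -qxF "$module" "$file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$file"
    fi
}
```

Run as root, something like add_boot_module siw (and the same for ib_core and rdma_ucm) would register the modules used in this post.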

Userspace library

Normally, recent libraries support SoftiWARP out of the box. If you want to compile your own version, follow the steps below. However, do this at your own risk... I recommend sticking with the standard libraries.

Optional : build the SIW userland libraries

All the userspace libraries are in a single repository. You just have to clone the repo and build all the shared libraries. If you want, you can build only libsiw, but it is easier to build everything at once.

git clone
cd ./softiwarp-user-for-linux-rdma/

Now we have to set up $LD_LIBRARY_PATH so that the built libraries can be found :

cd ./softiwarp-user-for-linux-rdma/build/lib/

Or you can add the line to your .bashrc profile :
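The export line itself is not reproduced here. A minimal sketch of what it could look like, assuming the shared libraries end up in the build/lib directory named in the cd command above (SIW_LIB_DIR is a variable name chosen for this example):

```shell
# Assumption: the shared libraries were built into this directory
# (taken from the cd command above); adjust it to your checkout.
SIW_LIB_DIR="$PWD/softiwarp-user-for-linux-rdma/build/lib"

# Prepend the directory and keep any existing value. The ${VAR:+...} form
# avoids a trailing ':' when LD_LIBRARY_PATH was empty, since an empty
# entry is treated as the current directory.
export LD_LIBRARY_PATH="$SIW_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

In .bashrc, replace $PWD with the absolute path of the checkout, since the shell may start in any directory.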

End of optional section

Set up the SIW interface :

Now we will set up the loopback and a standard Ethernet interface as RDMA devices :

sudo rdma link add <NAME OF SIW DEVICE > type siw netdev <NAME OF THE INTERFACE>

In my case :

sudo rdma link add siw0 type siw netdev enp0s31f6
sudo rdma link add siw_loop type siw netdev lo
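Interface names differ from machine to machine. The sketch below prints, without running anything, one candidate command per network interface found under /sys/class/net. siw_link_cmd is a hypothetical helper, and the siw_<interface> device name is just a naming choice made for this example.

```shell
# Hypothetical helper: print (do not run) the 'rdma link add' command
# for one interface. The siw_<interface> device name is arbitrary.
siw_link_cmd() {
    printf 'rdma link add siw_%s type siw netdev %s\n' "$1" "$1"
}

# Print one candidate command per network interface known to the kernel.
for path in /sys/class/net/*; do
    [ -e "$path" ] || continue
    siw_link_cmd "$(basename "$path")"
done
```

Review the output, then run the lines you want with sudo. Afterwards, rdma link show should list the new devices.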

You can check that the two devices have been correctly set up using the ibv_devices and ibv_devinfo commands.

Result of ibv_devices :
    device              node GUID
    ------           ----------------
    siw0             507b9ddd7a170000
    siw_loop         0000000000000000

Result of ibv_devinfo :

hca_id: siw0
 transport:   iWARP (1)
 fw_ver:    0.0.0
 node_guid:   507b:9ddd:7a17:0000
 sys_image_guid:   507b:9ddd:7a17:0000
 vendor_id:   0x626d74
 vendor_part_id:   0
 hw_ver:    0x0
 phys_port_cnt:   1
  port: 1
   state:   PORT_ACTIVE (4)
   max_mtu:  1024 (3)
   active_mtu:  invalid MTU (0)
   sm_lid:   0
   port_lid:  0
   port_lmc:  0x00
   link_layer:  Ethernet
hca_id: siw_loop
 transport:   iWARP (1)
 fw_ver:    0.0.0
 node_guid:   0000:0000:0000:0000
 sys_image_guid:   0000:0000:0000:0000
 vendor_id:   0x626d74
 vendor_part_id:   0
 hw_ver:    0x0
 phys_port_cnt:   1
  port: 1
   state:   PORT_ACTIVE (4)
   max_mtu:  4096 (5)
   active_mtu:  invalid MTU (0)
   sm_lid:   0
   port_lid:  0
   port_lmc:  0x00
   link_layer:  Ethernet

Testing with RPING: 

Now we simply test the setup with rping : 

In one shell : 
rping -s -a <serverIP> 

In the other :

rping -c -a <serverIP> -v 

And you should see the rping working successfully! 

You are now all set to use RDMA without the need for expensive hardware. 

Thursday, July 02, 2020

[Links of the Day] 02/07/2020 : Database query optimization, Deep Learning Anomaly detection survey, Large scale packet capture system

  • event-reduce : accelerates query results after a write. Basically, it caches previous query results and calculates the new result from the past result and the recent write event, instead of re-running the query. The authors observe up to 12 times faster display of new query results after a write occurs.
  • Deep Learning for Anomaly Detection: A Survey : comprehensive survey of anomaly detection techniques out there. 
  • Moloch : Large scale, open-source, indexed packet capture and search.

Tuesday, June 30, 2020

Data is the new oil fueling Machine learning adoption but Businesses are discovering #AI is no silver bullet

Data is the new oil. However, unlike oil, as data scarcity becomes less of a problem, processing costs are skyrocketing. The business world is waking up to the fact that, while computing keeps getting cheaper, the cost of training machine learning models is growing faster than compute costs are dropping.

Moreover, businesses are finding it challenging to adopt #ai, and the numbers reported by The Economist show how often #machinelearning projects fail in the real business world :
  • Seven out of ten said their #ai projects had generated little impact so far.
  • Two-fifths of those with “significant investments” in ai had yet to report any benefits at all.
Companies are finding that #machinelearning is not the promised silver bullet. Non-tech companies are discovering what tech companies had to learn the hard way: that they are no Google, Facebook, ...

To successfully deploy an AI/ML/DL project you need: a vast amount of data, skilled employees, solid engineering practices, access to infrastructure and, last but not least, a clear understanding of the business problem.

I have the, probably false, hope that corporations will abandon silver-bullet thinking, but I would settle for avoiding another #ai winter cycle.

[Links of the Day] 30/06/2020 : Homomorphic encryption for Machine Learning, Neural Network on Silicon, Python Graph visualization library

  • PySyft : Python framework for homomorphic encryption for machine learning. It allows you to train models on encrypted data without decrypting it. It's 40x slower than the normal method, but this means you don't have to deal with the new EU regulation on AI.
  • Neural Networks on Silicon : a collection of papers and works on the topic of neural networks on silicon.
  • PyGraphistry : Python visual graph analytics library to extract, transform, and load big graphs into Graphistry.

Thursday, June 25, 2020

[Links of the Day] 25/06/2020 : Architecture decision record, Database stress test, Rust Network Function Framework

  • Architecture decision record : Methods and tools for capturing software design choices that address a functional or non-functional requirement that is architecturally significant. [template]
  • pstress : Percona database concurrency and crash recovery testing tool.
  • capsule : A framework for network function development. If you want to do fast packet processing in a memory-safe programming language (Rust), this is for you.

Tuesday, June 23, 2020

The rise of Domain Specific Accelerators

Two recent articles indicate a pick-up in the adoption of domain-specific accelerators. With the end of Moore's Law, domain-specific hardware remains one of the few paths to continued increases in the performance and efficiency of computing hardware.

For a long time, the adoption of domain-specific accelerators was limited by economic factors. Historically, the small feature sizes, small batch sizes, and high cost of fab time (for ASICs) translated into a prohibitive per-unit cost.
However, economic factors have shifted :

  • Move toward standardised open-source tooling
  • More flexible licensing models
  • RISC-V architecture coming of age and maturing rapidly
  • Fab costs dropping
  • Wide availability of FPGAs (AWS F1)
  • Rise of co-designed high-level programming languages, reducing the learning curve and design cycle
  • Power/performance wall of general-purpose compute units

We are about to see a dramatic shift toward heterogeneous compute infrastructure over the next couple of years.