Thursday, November 06, 2014

On the emergence of a hardware-level API for disaggregated datacenter resources


The technologies enabling the modular disaggregated data-center concept are reaching a maturation point, as demonstrated by the latest technology showcases: Rack Scale Architecture (RSA) from Intel or, to a lesser extent, FusionCube / FusionSphere from Huawei. The need for such technologies arises from the fact that current cloud and data-center technology does not, and cannot, fulfill all the demands of cloud users, for multiple reasons. On one hand, as the number of cores and the amount of memory on servers continue to increase (over the next few years, we expect servers with hundreds of cores and terabytes of memory to be commonly used), leasing an entire server may be too much for many customers' needs, with resources going to waste. On the other hand, with the emergence of a broad class of high-end applications for analytics, data mining, etc., the amount of memory and compute power available on a single server may be insufficient.

Moreover, leasing cloud infrastructure resources in a fixed combination of CPU, memory, etc. is only efficient when the customer's load requirements are both known in advance and constant over time. As neither of these conditions is met for the majority of customers, the ability to dynamically mix and match different amounts of compute, memory, and I/O resources is the natural evolutionary step after hyper-converged solutions.

The objective here is to address the gaps that prevent us from going beyond the boundaries of the traditional server, effectively breaking the barrier of sourcing resources from a single physical machine. In other words, we will be able to provision compute, memory, and I/O resources across multiple hosts within the same rack, and consume them dynamically in varying quantities at run time instead of in fixed bundles. This will enable a fluid transformation of current cloud infrastructures, built around fixed-size commodity physical nodes, into a very large pool of resources that can be tapped into and consumed without the classical server limitation.
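To make the contrast with fixed bundles concrete, here is a toy sketch of a rack-wide pool from which workloads draw arbitrary quantities of cores and memory rather than leasing whole servers. All names and figures are illustrative; this is a thought model, not any real API.

```python
# Toy model of a rack-wide resource pool: workloads draw arbitrary
# amounts of cores and memory instead of leasing whole servers.
# All names and numbers are illustrative, not a real API.

class RackPool:
    def __init__(self, cores, memory_gb):
        self.free_cores = cores
        self.free_memory_gb = memory_gb
        self.leases = {}

    def allocate(self, name, cores, memory_gb):
        """Carve out an arbitrary mix of resources, if available."""
        if cores > self.free_cores or memory_gb > self.free_memory_gb:
            raise RuntimeError("pool exhausted")
        self.free_cores -= cores
        self.free_memory_gb -= memory_gb
        self.leases[name] = (cores, memory_gb)

    def release(self, name):
        cores, memory_gb = self.leases.pop(name)
        self.free_cores += cores
        self.free_memory_gb += memory_gb

# A rack aggregating, say, 10 hosts of 48 cores / 512 GB each.
pool = RackPool(cores=480, memory_gb=5120)
pool.allocate("analytics-job", cores=200, memory_gb=3000)  # bigger than any single host
pool.allocate("small-web-app", cores=2, memory_gb=4)       # far smaller than a host
print(pool.free_cores, pool.free_memory_gb)                # 278 2116
```

Note that the first lease could never fit on one of the underlying hosts, while the second would waste most of one: exactly the two mismatches described above.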


Intel has been advertising its RSA stack for a while, and it is finally becoming a reality. However, the really interesting part is not the technology: to a certain extent, a lot of the technology enabling resource pooling already exists. We demonstrated that cloud memory pooling was already feasible in the Hecatonchire Project, and vendors such as TidalScale or ScaleMP already offer compute aggregation. However, the latter two solutions are monolithic and lack the flexibility needed for the cloud consumption model, and as a result they are confined to a niche market.

What can really kick the disaggregated model into top gear is that Intel has now teamed up with a couple of vendors and has already created a draft hardware API specification called Redfish. Such an API can be leveraged by higher levels of the stack, allowing more intelligent, flexible, and predictable control over how, where, and when workloads (VMs, containers, standard processes/threads) get scheduled onto that hardware. In a certain way, this would enable the likes of Mesos or Kubernetes to deliver scheduling that takes every hardware aspect into account.
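As an illustration, Redfish exposes hardware inventory as JSON over HTTP(S). The sketch below parses a payload shaped like the Redfish ComputerSystem resource; the field names follow the published schema, but the values are made up and the HTTP call itself is stubbed out with a literal string.

```python
import json

# Illustrative Redfish-style payload for one compute node. In reality this
# JSON would come from an HTTP GET on a URI such as /redfish/v1/Systems/1;
# field names follow the Redfish ComputerSystem schema, values are invented.
payload = """{
  "Id": "node-7",
  "PowerState": "On",
  "ProcessorSummary": {"Count": 2},
  "MemorySummary": {"TotalSystemMemoryGiB": 256}
}"""

system = json.loads(payload)

# A rack-level scheduler could fold inventories like this into its placement
# decisions instead of treating each server as an opaque black box.
cores = system["ProcessorSummary"]["Count"]
mem_gib = system["MemorySummary"]["TotalSystemMemoryGiB"]
print(system["Id"], cores, "sockets,", mem_gib, "GiB")
```

The point is less the payload itself than the fact that it is plain JSON over a standard transport, which is what makes it consumable by a generic scheduler rather than a vendor tool.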

This brings some interesting capabilities to existing cloud technologies: cores and memory can be dynamically reallocated across workloads, which arguably greatly reduces the need for load balancing via live migration. You would dynamically re-allocate the resources underneath (cores, memory) rather than moving the whole system, making the process more robust and less error-prone.

On the container side, it would solve a lot of the security headaches the community is now facing. Rather than going the physical -> virtual -> container route, you could simply run physical -> container with a fine-grained per-core allocation using RSA / Redfish. Effectively, you would carve fine-grained subscriptions out of your system in order to get maximal separation and performance guarantees. One could use this to separate critical applications while guaranteeing performance and isolation, something we can already do today with Jailhouse, at the cost of under-subscribing the system.
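On a stock Linux host today, the closest software building block for this kind of per-core carve-out is CPU pinning. The sketch below (Linux-only, using the standard `os.sched_setaffinity` call) restricts the current process to a subset of its cores; it is a coarse stand-in for the hardware-level allocation described above, which cgroup cpusets and Jailhouse cells take further.

```python
import os

# Linux-only sketch: pin the current process to a subset of the cores it is
# allowed to run on. This is a coarse software analogue of the per-core
# subscription discussed above, not the hardware-level mechanism itself.
allowed = sorted(os.sched_getaffinity(0))  # cores we currently own
subset = set(allowed[:1])                  # claim just the first one

os.sched_setaffinity(0, subset)            # "subscribe" to that core only
print(os.sched_getaffinity(0))             # now restricted to the subset
```

The under-subscription trade-off mentioned above shows up here too: any core placed in such an exclusive subset is no longer available to the rest of the workload mix.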

If Intel is successful in disseminating its hardware API (or in having the other vendors standardize around it), it would allow the technology to leap forward, as its biggest enemy is the difficulty of porting management APIs from one fabric, compute, I/O, or storage model to another.