
Wednesday, September 14, 2016

[Links of the day] 14/09/2016 : Ethics in AI, Survey of fully homomorphic encryption, RDMA over Ethernet at scale at Microsoft

  • Ethical Preference-Based Decision Support Systems : as AI and other autonomous agents become more ubiquitous in the human environment, their decisions will have a greater impact on our daily lives. Trust will need to be built, and to achieve that these systems will need to be perceived as acting in a moral and ethical way.
  • A brief survey of Fully Homomorphic Encryption, computing on encrypted data : fully homomorphic encryption allows you to manipulate encrypted data without decrypting it. This is great for databases and other systems, as it allows a service to modify and update information without needing to know its content, effectively partitioning operation from knowledge. However, this comes at a cost (though it is going down). We might finally end up with the security pipe dream where data is encrypted immediately and manipulated only in that form until it is finally consumed.
  • RDMA over Commodity Ethernet at Scale : it is interesting to see RDMA slowly permeate hyperscale datacenters. It is even more interesting that Microsoft decided to go for the RoCE variant rather than InfiniBand. It makes sense: they invested heavily in scaling Ethernet for their cloud infrastructure, and RoCE allows a lot of reuse while collocating normal and RDMA traffic on a single underlying fabric.
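As a toy illustration of computing on encrypted data: fully homomorphic schemes support arbitrary computation, but even textbook RSA (without padding) is multiplicatively homomorphic, which is enough to show the principle of a server multiplying values it cannot read. A minimal sketch with deliberately tiny, insecure parameters:

```python
# Textbook RSA is multiplicatively homomorphic:
# Enc(a) * Enc(b) mod n decrypts to a * b.
# Tiny primes for illustration only -- completely insecure.
p, q = 61, 53
n = p * q                 # public modulus (3233)
phi = (p - 1) * (q - 1)
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent (Python 3.8+ modular inverse)

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

# The "server" multiplies ciphertexts without ever decrypting them.
a, b = 7, 6
c = (enc(a) * enc(b)) % n
print(dec(c))             # -> 42
```

Fully homomorphic schemes additionally support addition, and therefore arbitrary circuits, at the performance cost the survey discusses.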


Thursday, May 05, 2016

[Links of the day] 05/05/2016 : OVH Kinetic, Go best practices and 9front


  • 9front : an excellent book on Plan 9 and 9front; the first chapter is a must-read for anybody interested in distributed systems and operating systems.
  • Kinetic : OVH starts deploying Ethernet-connected drives in beta.
  • Go best practices : well, the title says it all.



Friday, April 29, 2016

[Links of the day] 29/04/2016 : End of numerical error, 504 Eth Drive Ceph Cluster, Modern Storage architecture

  • End of Numerical Error : a really cool concept for encoding floating-point numbers. It seems really promising: faster, lighter, and it significantly reduces error rates. Unums should probably be the default encoding of the future [Julia implementation]. PS: the non-associative property of floats is really scary.
  • Ceph cluster with 504 Ethernet drives : well, it's starting to happen, and it might quickly take over all those pesky storage clusters out there.
  • Modern Storage Architectures : Intel Developer Forum slide deck looking at the future of storage class memory and its impact on storage architecture.
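The non-associativity of floats mentioned above is easy to demonstrate: regrouping the same three additions changes the result, which is exactly the class of surprise unums aim to reduce.

```python
# IEEE 754 addition is not associative: rounding after each
# operation means the grouping of operands changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left)           # -> 0.6000000000000001
print(right)          # -> 0.6
print(left == right)  # -> False
```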

Tuesday, February 16, 2016

[Links of the day] 16/02/2016: VM density, VLDB15, Cassandra MQ, Eth roadmap

  • VM density : a look at VM density and utilization profiles in the PernixData cloud. It seems customers prefer 16-core dual-socket virtual systems while hogging as much memory as possible.
  • VLDB 2015 : full VLDB 2015 program with papers attached. Notable papers: Gobblin by the LinkedIn crowd and Coordination Avoidance in Database Systems from Berkeley.
  • Cassieq : a distributed queue built on top of Cassandra.

Bonus: Ethernet roadmap


Tuesday, September 08, 2015

Links of the day 08/09/2015 : Kinetic Ethernet storage drives, Silicon Valley Show and Time Maps

  • Silicon Valley : if you can't wait for the new season, here is the hilarious script written by the founder of Firefox.
  • Kinetic Open Storage : now a collaborative project under the Linux Foundation for Ethernet drives. I'm really excited about this project.
  • Time Maps : a technique for visualizing many events across multiple timescales in a single image.
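The core trick of time maps is to plot each event at (time since the previous event, time until the next event), so bursts and lulls at different timescales separate visually. A minimal sketch of the coordinate computation, with made-up timestamps:

```python
# Compute time-map coordinates: each interior event becomes a point
# (gap before the event, gap after the event). Timestamps are
# illustrative, in seconds.
events = [0, 10, 12, 13, 50, 51, 90]

points = [
    (events[i] - events[i - 1], events[i + 1] - events[i])
    for i in range(1, len(events) - 1)
]
print(points)  # -> [(10, 2), (2, 1), (1, 37), (37, 1), (1, 39)]
```

Plotted on log-log axes, points near the diagonal correspond to evenly spaced events, while points far off the diagonal mark the start or end of a burst.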

Monday, June 22, 2015

Links of the day 22 - 05 - 2015

Today's links 22/05/2015: Steal key via radio, Turing x86 mov, Beyond #Nash equilibrium, #Ethernet jungle




Tuesday, May 05, 2015

The rise of micro storage services


Current and emergent storage solutions are composed of sophisticated building blocks: dedicated fabrics, RAID controllers, layered caches, object storage, etc. There is a feeling that storage is going against the current evolution of the overall industry, where complex services are composed of small, independent processes and services, each organized around individual capabilities.

What surprises me is that most storage innovation focuses on very high-level solutions that try to pack as many features as possible into a single package. Presently, insufficient effort is being made to build storage systems from small, independent, low-maintenance and low-cost components. In short, what is lacking are nimble, independent modules that can be rearranged to deliver the optimal solution for a customer's needs, without requiring a new storage architecture to be rolled out every time: a "jack of all trades" without, or with limited, "master of none" drawbacks; or, put another way, modules that extend or mimic what is happening in the container and microservices space.

Ethernet Connected Drives

Despite this, things could change rapidly, as an enabler (or precursor, depending on how you look at it) of this alternative approach is currently emerging and, surprisingly, coming from the hard drive vendors: Ethernet Connected Drives [slides][Q&A]. This type of storage technology is going to enable the next generation of hyperscale cloud storage solutions: massive scale-out potential with better simplicity and maintainability, not to mention lower TCO.

Ethernet Connected Drives are a step in the right direction, as they reduce capital and operating costs by cutting down:
  • the software stack (file system, volume manager, RAID system);
  • the corresponding server infrastructure;
  • connectivity costs and complexity;
  • granularity, which enables more variable costs per application (e.g. cold storage, archiving, etc.).
Currently, two vendors offer this solution: Seagate with Kinetic and HGST with the Open Ethernet Drive. In fact, we are already seeing some rather interesting applications of the technology. Seagate released a port of the Sheepdog project onto its Kinetic product [Kinetic-sheepdog], enabling a distributed object storage system for volume and container services that doesn't require dedicated servers. There is also a proof of concept, presented at HEPiX, of HGST drives running Ceph or dCache. While these solutions don't fit all scenarios, both demonstrate the versatility of the technology and its scalability potential (not to mention the cost savings).

What these technologies enable is basically the transformation of the appliances that house masses of HDDs into switches, eliminating the need for a block or file head server now that there is straight IP connectivity to the drive, which makes them ideal for object-based backends.
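In that model, an application speaks key-value operations straight to the drive. Here is a minimal sketch of what an object backend over such drives could look like; `EthDrive` is a hypothetical in-memory stand-in, not the real Kinetic wire protocol (which speaks protobuf over TCP), and all names here are illustrative:

```python
# Sketch of the key-value interface an Ethernet-connected drive exposes.
# EthDrive is a hypothetical in-memory stand-in for illustration only.
class EthDrive:
    def __init__(self, address: str):
        self.address = address      # the drive's IP endpoint in a real deployment
        self._store = {}            # stand-in for the on-disk key-value data

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        self._store.pop(key, None)

# An object-store backend then just maps object names onto drives.
drives = [EthDrive(f"10.0.0.{i}") for i in range(4)]

def drive_for(key: bytes) -> EthDrive:
    return drives[hash(key) % len(drives)]  # naive placement, no replication

drive_for(b"photo-1").put(b"photo-1", b"...jpeg bytes...")
print(drive_for(b"photo-1").get(b"photo-1"))
```

A real backend would of course add replication, failure handling and consistent hashing on top, but the point stands: no file system, volume manager or RAID layer sits between the application and the drive.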

Emergence of fabric-connected hardware storage

What we should see over the next couple of years is the emergence of a new form of storage appliance acting as a fabric facilitator for a large number of compute- and network-enabled storage devices. To a certain extent it would be similar to HP's Moonshot, except with far greater density.

Rather than just focusing on Ethernet, it would be easy to see PCIe, Intel photonics, InfiniBand or more exotic fabrics being used. Obviously Ethernet remains the preferred solution due to its ubiquity in the datacenter. However, we should not underestimate the need for a rack-scale approach, which would deliver greater benefits if designed correctly.
While HGST's Open Ethernet solution is a good step towards the nimble storage device, the drive enclosure form factor is still quite big, and I wouldn't be surprised if we see a couple of start-ups coming out of stealth mode in the next couple of months with fabric-connected (most likely PCIe) flash. This would be the equivalent of the Ethernet-connected drive, interconnected using a switch + backplane fabric as shown in the crudely drawn diagram below.







Is it all about hardware?

No, indeed quite the opposite. That said, new hardware has a greater chance of penetrating the storage ecosystem than the server market. This is probably where ARM has its best chance of establishing a beachhead within the hyperscale datacenter, as the microserver path seems to have failed.
What this implies is that it is often easier to deliver and sell a new hardware or appliance solution in the storage ecosystem than a pure software one. Software solutions tend to take a lot longer to get accepted, but when they break through, they quickly take over and replace the hardware solutions. Look at object storage solutions such as Ceph, or other hyper-converged solutions: they are a major threat to the likes of NetApp and EMC.
To get back to software as a solution, I would predict that history repeats itself to varying degrees of success or failure. As in the microserver story, while hardware micro storage solutions are rising, we will also see the emergence of software solutions that deliver more nimble storage features than before.

In conclusion, I feel that we are going to see the emergence of many options for massive scale-out, using different variants of the same concept: take the complex storage system and break it down into its bare essential components; expose each single element as its own storage service; and then build the overall offering dynamically from the ground up. Rather than leveraging complex pooled storage services, we would have dynamically deployed storage applications for specific demands, composed of a suite of small services, each running in its own process and communicating with lightweight mechanisms. These services are built around business capabilities and are independently deployable by fully automated deployment machinery. There is a minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies. This is just the opposite of current offerings, where monolithic storage applications (or appliances) are scaled by replicating across servers.

This type of architecture would enable a true, on-demand dynamic tiered storage solution. To reuse a current buzzword, this would be a “lambda storage architecture”.
But that is better left for another day's post, which would look into such an architecture and the lifecycle management entities associated with it.





Thursday, April 23, 2015

Links of the day 23 - 04 - 2015

Today's links 23/04/2015: HPC MQ, Ethernet Connected Drives, MongoDB consistency issue, Memory Errors
  • CoralMQ : full-fledged, ultra-low-latency, high-reliability, software-based middleware for developing distributed systems based on asynchronous messages. Developed for high-speed trading and FX applications. It's always interesting to see that a lot of high-frequency trading apps are written in Java, while the (often flawed) perception is that it should be avoided for HPC solutions.
  • Ethernet Connected Drives : I really think this type of storage is going to be the next generation of jack-of-all-trades cloud storage. Massive scale-out potential with better simplicity and maintainability (not to mention lower TCO) [slides][Q&A]
  • MongoDB stale reads : well, not all good news if you really want consistent transactions. MongoDB can return old / stale read values, and the fact that Stripe heavily relies on it is even more puzzling. This all screams "run" to me.
  • Memory Errors in Modern Systems : with more memory in systems (and exascale HPC), more errors will occur, and in the current state of things they will probably be silently ignored. The risk is to what extent these errors will damage the system, and how to detect and mitigate them.


Wednesday, February 11, 2015

Links of the day 11 - 02 - 2015

Today's links 11/02/2015: Test and optimization articles, scaling product team, PCIe vs Eth , Distributed Sys fallacies
  • 100 Must-Read Articles on Testing and Optimization : data-driven, big data, A/B testing, etc. The best articles from 2014.
  • Scaling a product team : lessons learned from Intercom on how they scaled a product-building team, and the nitty-gritty involved in getting valuable product out the door as fast as possible.
  • Eight Fallacies of Distributed Computing : very good tech talk with real-life encounters with the fallacies.
  • PCIe vs Ethernet : with the rise of Intel's silicon photonics (SiPh), optical PCIe (OPCIe) and other PCIe fabrics, is it time to fragment your datacenter and use a fast PCIe fabric within the rack and Ethernet across racks? Time will tell; as you already know, the best technology doesn't always win.