Monday, November 28, 2016

[Links of the Day] 28/11/2016 : Earth Computing network fabric, CS video courses, Go app tracing

  • Earth Computing Network Fabric : event-based protocol that specifically targets the datacenter, as it eliminates the need for heartbeats and timeouts. The protocol relies on recoverable atomic tokens to deliver deterministic, in-order communication. To some extent they are proposing to move back to lattice systems, where each server is a node within the network and acts as a router for messages. This eliminates the need for switches and looks really neat. However, adoption might be difficult due to the ubiquity of Ethernet hardware and the need to change the underlying communication protocol. Last but not least, I do not really know how to efficiently secure and trust messages on such a network. [slides]
  • Computer Science video courses : Extensive collection of links to CS courses, ranging from introductory to expert level and covering pretty much the full scope of CS subjects (DB, distributed systems, etc.).
  • Appdash : Application tracing system for Go, based on Google's Dapper.

Friday, November 25, 2016

[Links of the Day] 25/11/2016 : CD/CI maturity model , Deep Learning lip reading, Microservices make

  • Continuous Delivery Maturity Model : a look at the different levels of maturity for continuous integration, delivery, build, etc. in software development.
  • LipNet : deep learning for full sentence lip reading. One step closer to a fully fledged HAL.
  • Dmake : tool to manage micro-service based applications. It lets you easily build, run, test and deploy an entire application or one of its micro-services.


Wednesday, November 23, 2016

[Links of the Day] 23/11/2016 : AMD Exascale vision, Hardware Resiliency myths and truths, MIT EmTech

  • Resiliency for Reliability – Myths and Truths : this slide deck provides an overview of resiliency issues and how Intel tackles them for hardware faults, from fans down to soft errors (e.g. neutron beams ... yes, this can £%£ your system). The authors present the two types of approach: reactive and proactive handling of errors.
  • AMD's Exascale computing vision : It's all about 3D stacked chips with future interconnects. The interesting bit is the ROCm platform with P2P multi-GPU and P2P with RDMA. Slowly we are removing the need to have a full server to deploy GPUs, one step closer to a fully modular system with each resource pooled and optimized in its own enclosure. It's a lot easier to design the power supply, cooling system, etc. when you do not have to deal with heterogeneous hardware with different power and cooling profiles (CPU, memory, disk, etc. in the same enclosure).
  • MIT EmTech 16 : This year MIT EmTech is all about AI & machine learning ... reaching maximum hype in the domain

Monday, November 21, 2016

[Links of the Day] 21/11/2016 : Erasure Code for Big data and cluster cache, Emerging Interconnect Tech

  • Erasure Coding for Big-data Systems : Technical report presenting the state of the art of erasure codes and how they are used in practice.
  • EC-Cache : the authors present an interesting solution where they use erasure codes to compensate for the limitations of selective replication. The solution provides load-balanced, low-latency cluster caching and improves resilience against failures thanks to the inherent redundancy of the code.
  • Emerging Interconnect Technologies : really cool overview of current and future interconnects, especially on-chip and chip-to-chip interconnects. This looks at the future of communication when chips will be stacked to keep Moore's law going.
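
To make the erasure coding idea above concrete, here is a minimal sketch of the simplest possible code: k data chunks plus one XOR parity chunk, which tolerates the loss of any single chunk. This is an illustration only; it is not the scheme used by EC-Cache or described in the report (production systems typically use Reed-Solomon style codes that survive several losses), and the function names encode and recover are made up for this example.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal chunks (zero padded) and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(chunks, missing):
    """Rebuild the chunk at index `missing` by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != missing]
    return reduce(xor_bytes, survivors)


# Hypothetical usage: lose one data chunk, then rebuild it from the rest.
stripes = encode(b"erasure coding demo", k=4)
lost = stripes[2]
stripes[2] = None  # simulate a failed node
rebuilt = recover(stripes, missing=2)
```

The storage overhead here is 1/k instead of the 100% cost of keeping a full replica, which is the trade-off that makes erasure codes attractive for caches and big-data storage.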

Friday, November 18, 2016

[Links of the Day] 18/11/2016 : Extreme Scale OS, GPU Stream Benchmark, Neural Net that produce neural net

  • Neural Architecture Search with Reinforcement Learning : Neural nets that produce neural nets. The cool thing is that the authors are able to beat human-designed models for text processing and deliver equivalent performance for image processing models. Who needs humans anymore...
  • Extreme-Scale Operating Systems : multi-OS research project at Intel aiming to be the node OS for HPC machines. Intel is trying to deliver a polymorphic OS that can quickly adapt to new software and hardware without the need for the specialized solutions that commonly exist on high-end HPC systems. To some extent it looks like the Jailhouse system, where the HW is physically partitioned: a few cores are dedicated to management, while the rest are partitioned and run a lightweight kernel (LWK) + application. Note that I really resent Intel for always trying to rename things that are commonly used. LWKs are unikernels, dammit. Anyway, it's Jailhouse + unikernel for HPC.
  • GPU-STREAM : STREAM benchmark for GPUs, a much needed benchmark to understand and quantify memory transfer rates to and from global device memory on GPUs.

Wednesday, November 16, 2016

[Links of the Day] 16/11/2016 : Open Whisper, Security via BPF + XDP, Why cloud fails

  • X3DH : stands for Extended Triple Diffie-Hellman key agreement protocol. It allows asynchronous secure communication.
  • Cilium : BPF is starting to emerge as the dominant tool for any network functionality out there. Cilium leverages BPF and XDP to enforce dynamic security rules via eBPF. [github]
  • Why Cloud Fails : Excellent review by Murat of the paper analyzing various cloud failures. It turns out that the vast majority are config-related and upgrade-related. Human failure only represents a very small portion of the overall numbers. [paper]

Monday, November 14, 2016

[Links of the day] 14/11/2016 : Time series features extraction, GCHQ data analysis platform and Alibaba's Kafka contender : RocketMQ

  • TsFresh : Automatic extraction of relevant features from time series. This is really useful for anybody dealing with metrics. It handles data cleaning and feature extraction in one sweep.
  • Stroom : GCHQ's data processing, storage and analysis platform.
  • RocketMQ : Alibaba's MQ, proposed as an Apache project. It tries to solve some of the limitations of Kafka while providing better performance.
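
For a feel of what feature extraction from a metric series means, here is a small hand-rolled sketch using only the Python standard library. It does not use the TsFresh API (which computes a far larger set of features and also filters them for relevance); the function name extract_features and the chosen features are assumptions made purely for illustration.

```python
import statistics


def extract_features(series):
    """Compute a handful of summary features for a list of numeric samples."""
    n = len(series)
    mean = statistics.fmean(series)
    variance = statistics.pvariance(series, mu=mean)

    # A peak is a sample strictly larger than both of its neighbours.
    peaks = sum(
        1 for i in range(1, n - 1)
        if series[i - 1] < series[i] > series[i + 1]
    )

    # Lag-1 autocorrelation; defined as 0.0 for a constant series.
    if variance > 0:
        autocorr = sum(
            (series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)
        ) / (n * variance)
    else:
        autocorr = 0.0

    return {
        "mean": mean,
        "std": variance ** 0.5,
        "min": min(series),
        "max": max(series),
        "peaks": peaks,
        "lag1_autocorr": autocorr,
    }


# Hypothetical metric samples, e.g. request latency per minute.
features = extract_features([1, 3, 2, 5, 4, 6, 1])
```

Each series collapses into one fixed-length row of numbers, which is the shape most classifiers and anomaly detectors expect as input.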


Friday, November 11, 2016

[Links of the Day] 11/11/2016 : Anonymous Trustless Bitcoin, Zap golang log lib, Intel RSA controller

  • ZeroCash : Trustless Bitcoin tumbling. The authors propose a pooled approach to anonymise transactions in Bitcoin. However, they go a step further than just pooling: they propose a system where participants can anonymously check resources in and out of a global pool, effectively creating an anonymous cooperative resource sharing infrastructure. [github] [Paper]
  • Zap : Fast, structured, leveled logging in Go. When you start to reach Uber-scale or other hyperscale microservice architectures, every aspect counts, and logs are everywhere. This library provides high performance structured logging for Go.
  • Scalable software controller : This controller basically allows hardware resources (compute, memory, storage, network) to be allocated on the fly based on the demands of the deployment tool (OpenStack, k8s, Mesos, etc.).
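
As a rough illustration of what structured, leveled logging looks like (the idea behind Zap, not Zap itself, which is a Go library), here is a minimal sketch built on Python's standard logging module. The JsonFormatter class, the make_logger helper and the "fields" key are names invented for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object instead of free-form text."""

    def format(self, record):
        payload = {"level": record.levelname, "msg": record.getMessage()}
        # Structured key/value pairs passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(stream, level=logging.INFO):
    """Build a logger that writes JSON lines at or above `level` to `stream`."""
    logger = logging.getLogger("structured-demo")
    logger.handlers.clear()
    logger.setLevel(level)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("request served", extra={"fields": {"status": 200, "latency_ms": 12}})
log.debug("cache probe")  # below INFO, so it is filtered out
```

Because every line is machine-parseable and carries typed fields, log pipelines can index and query it without regex scraping. Zap's performance focus comes on top of that, which this sketch makes no attempt to reproduce.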

Wednesday, November 09, 2016

[Links of the Day] 09/11/2016 : Deep Neural Net Threats, Scaling Uber, Tcp over Sound

  • Assessing Threat of Adversarial Examples on Deep Neural Networks : machine learning is the next frontier for hackers, and because of its inherent opacity it requires special capabilities to secure systems that rely on this underlying technology. This paper shows that for text-driven classification, adversarial examples are more an academic curiosity than a real threat. However, we need to see if this applies to other types of classification.
  • Lessons learned about scaling Uber : Many talks are about scaling, but most companies and startups would love to have those problems. Often it's not about scaling, it's about having the right product-market fit. Then you can enjoy the roller coaster of scaling problems.
  • Quiet : TCP over sound. This is really cool: it lets you pass data through speakers on Android devices.

Monday, November 07, 2016

[Links of the Day] 07/11/2016 : Baidu Open Source Repo(s), Wan Replicated DB

  • Baidu : Baidu's open source code on GitHub. It looks like it replicates a lot of the services and features that other hyperscale systems use. Raft seems to be the default underlying consensus protocol for all applications. A lot of nice goodies in there, especially:
    • BFS : Baidu file system that provides the underlying persistence for Baidu's real-time applications. It's a distributed, multi-datacenter file system that uses Raft for metadata coherence and a shared-nothing approach for linear scalability.
    • Tera : Distributed database 
    • Galaxy : mesos / kubernetes equivalent. 
    • Paddle : Distributed machine learning 
    • iNexus : Distributed K/V store. Looks similar to Consul, and it also uses Raft as the underlying consensus protocol.
  • Bedrock : WAN-replicated distributed data(base). Designed to use SSDs, plus other nice features.

Friday, November 04, 2016

[Links of the Day] 04/11/2016 : high performance reliable message passing framework, speculative paxos, Awesome Falsehood

  • Aeron : Aeron is an efficient reliable UDP unicast, UDP multicast, and IPC message transport. The key word here is reliable. Under the hood is a brokered architecture built from the top down with performance in mind. They offer Java and C++ clients.
  • specpaxos : Interesting concept, but it relies on multicast. SDN might help solve the inherent multicast drawbacks by creating topologies, distribution trees, etc. ahead of time. But practically, how often do you see multicast deployed and enabled in modern datacenters?
  • Awesome Falsehood : curated list of awesome falsehoods programmers believe in.

Wednesday, November 02, 2016

[Links of the Day] 02/11/2016 : Unik fast easy unikernel builder, Noms decentralized DB and dev books

  • Noms : decentralized database using Git principles. There are some nice features in there, such as content addressing (no duplicates), append-only storage and, last but not least, decentralization. This means you can fork / merge, disconnect, etc. for seconds, hours or years, like Git. However, I am not sure yet how they handle merges and conflict resolution...
  • Programming books : Good list of books for developers.
  • Unik : Tool by EMC to compile unikernels directly rather than going the binary route. It's nice to see an increased effort to facilitate unikernel adoption. Previously I talked about the effort. This is a slightly different approach, as it allows building unikernels in almost any language using a tool chain as intuitive as Docker. If this trend continues we might see a decline in container adoption with a move to unikernels. But it's not for the short term, as the optimisation cycle of container technology has not fully kicked in yet. [video] [slides]
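
The content addressing and append-only properties mentioned for Noms above can be sketched in a few lines. This is a toy model under stated assumptions, not the Noms API or storage format: the ContentStore class, its methods and the use of JSON + SHA-256 are all choices made for this illustration.

```python
import hashlib
import json


class ContentStore:
    """Toy append-only store where every value is keyed by the hash of its content."""

    def __init__(self):
        self._chunks = {}

    def put(self, value):
        """Store a JSON-serialisable value and return its content address."""
        blob = json.dumps(value, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(blob).hexdigest()
        # Identical content maps to the same key, so duplicates cost nothing.
        self._chunks.setdefault(key, blob)
        return key

    def get(self, key):
        return json.loads(self._chunks[key])

    def commit(self, value, parent=None):
        """Record a new version pointing at its parent, Git style."""
        return self.put({"parent": parent, "value": self.put(value)})

    def __len__(self):
        return len(self._chunks)


store = ContentStore()
first = store.put({"user": "alice", "score": 1})
again = store.put({"score": 1, "user": "alice"})  # same content, same address
root = store.commit({"user": "alice", "score": 1})
child = store.commit({"user": "alice", "score": 2}, parent=root)
```

Since nothing is ever overwritten, two disconnected replicas can keep committing independently and later exchange only the chunks the other side lacks. How conflicting histories get merged is a separate question that this sketch does not address.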