Tuesday, August 08, 2017

[Links of the Day] 08/08/2017 : Python image drawing & animations, Breaking x86 ISA, Correlation between NYT and Stock markets

  • pywonderland : A collection of python scripts for drawing beautiful figures or animating interesting algorithms in mathematics.
  • Breaking the x86 ISA : Black Hat conference presentation on the sandsifter tool for detecting hardware bugs or undocumented instructions in modern processors. Sadly, Intel's microcode binaries are known to be encrypted and secured with an RSA2048-SHA256 signature, which makes any discovery a little bit useless. Unless it's a bug, in which case it opens up a whole new world of possibilities. [Slides] [Github]
  • Correlations and Flow of Information between The New York Times and Stock Markets : As everybody knows, there is a correlation between information and the state of the market. Nothing new, but it confirms to some extent that The New York Times in particular has a specific influence on Wall Street (and vice versa).

Thursday, August 03, 2017

[Links of the Day] 03/08/2017 : NVMe over TCP , Lineage mapping of cryptocurrency , Perceptions of probability

  • NVMe Over TCP : interesting kernel module by Solarflare that allows people to use NVMe over TCP. It will be really interesting to see what kind of performance you can get out of such a setup. Even if performance is significantly decreased (but still higher than other storage solutions), the economic gain versus a costly NVMe solution would make this worth it. It could also accelerate the arrival of a new type of high-performance, low-cost storage application by lowering the barrier to entry.
  • Map of Coins : Impressive lineage mapping of cryptocurrencies. What is more concerning is the number of dead Bitcoin child cryptocurrencies; there are a lot of pump and dump schemes going on.
  • Perceptions : this really cool graphic shows how humans perceive probability and how fuzzy that perception can be. It can explain why a certain type of person might take more or less risk based on the same information, due to a different interpretation of the content.

Tuesday, August 01, 2017

[Links of the Day] 01/08/2017 : OS image generation, K/V optimized for SSD, Command line cloud deployment tool

  • mkosi : Tool for generating OS images. This is nice if you need to create and maintain low-level OS images with EFI support and a GPT-based partition table. It is invaluable if you do bare metal or PXE-based deployment. [github]
  • WiscKey : Key/value store optimised for SSDs. It especially aims at avoiding the write amplification of traditional LSM-based K/V systems.
  • Arc : plaintext manifest for provisioning and deploying cloud infrastructure. It's quite a nice and elegant interface for such a tedious task. If you have used Chef, Puppet, CloudFormation or Terraform, you know how verbose this can get.
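
To make the WiscKey idea a bit more concrete: as the paper's title says, it separates keys from values, so that only small keys and pointers go through the LSM tree while the values sit in an append-only log. The sketch below is a toy illustration of that separation, not WiscKey's implementation. The class name `ValueLogStore` is hypothetical, a plain dict stands in for the LSM tree, and an in-memory buffer stands in for the log file on SSD.

```python
import io


class ValueLogStore:
    """Toy key/value separation: values in an append-only log, keys in an index."""

    def __init__(self):
        # Append-only value log (assumption: in-memory stand-in for a file on SSD).
        self.vlog = io.BytesIO()
        # key -> (offset, length). Assumption: a dict stands in for the LSM tree.
        self.index = {}

    def put(self, key, value):
        # Values are only ever appended, so they are never rewritten by
        # index compaction; only the small (offset, length) pointer moves.
        offset = self.vlog.seek(0, io.SEEK_END)
        self.vlog.write(value)
        self.index[key] = (offset, len(value))

    def get(self, key):
        offset, length = self.index[key]
        self.vlog.seek(offset)
        return self.vlog.read(length)


store = ValueLogStore()
store.put(b"a", b"hello")
store.put(b"b", b"world!")
store.put(b"a", b"HELLO")  # the overwrite appends; the old value becomes garbage
```

A real system also needs garbage collection of stale values in the log and crash recovery, both of which are left out here.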

Thursday, July 27, 2017

[Links of the Day] 27/07/2017 : Russian Information Warfare, Public Cloud Economics, Impact of slowness in ultra fast networks

  • Russian Information Warfare Handbook : In-depth description of the Russian information war machine and how it affects the rest of the world. There are some suggestions on how to fight it; however, I feel that the internet and the current social network model are heavily biased in favour of propaganda exploitation. Moreover, with the advance of machine learning, it becomes easy to identify the key nodes in the network to attack in order to achieve maximum disinformation reach.
  • Usage Patterns and the Economics of the Public Cloud : paper looking at current cloud economics and cloud customers. It seems that most customers have steady or mildly varying usage, with the obvious outliers (probably academia or other heavy HPC / machine learning jobs).
  • To slow or not? : the authors look at the impact of slowdowns (hardware errors, software bugs, or simply the laws of physics) in next-gen networks on data centres and applications once we start to reach single-digit microsecond latency. There is a wide range of implications, from technical to legal as well as ethical. Sadly, most of them tend to stay unanswered until a problem or conflict arises.

Tuesday, July 25, 2017

[Links of the Day] 25/07/2017 : Universal Scalability Law Model , k8s + cloud native apps, python distributed execution engine

  • usl4j : Implementation of the Universal Scalability Law model. This is really cool because it allows you to model (and predict) when your system's performance will start to degrade as it scales. It allows you to take real measurements from a live system and continuously build models. [github]
  • draft : a tool for developers to create cloud-native applications on Kubernetes.
  • ray : distributed execution engine written in Python. Useful if you want to execute and schedule tasks across a cluster of nodes.
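
For the Universal Scalability Law mentioned above, here is a minimal sketch of the formula itself, X(N) = λN / (1 + σ(N − 1) + κN(N − 1)), where σ models contention and κ models coherency cost. This is not usl4j's API; the function names and the coefficient values are made up for illustration.

```python
import math


def usl_throughput(n, lam, sigma, kappa):
    """Throughput predicted by the Universal Scalability Law at concurrency n."""
    return lam * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_concurrency(sigma, kappa):
    """Concurrency at which predicted throughput peaks (requires kappa > 0)."""
    return math.sqrt((1 - sigma) / kappa)


# Hypothetical coefficients; in practice they are fitted to measurements.
LAM, SIGMA, KAPPA = 1000.0, 0.05, 0.0005

peak = usl_peak_concurrency(SIGMA, KAPPA)
for n in (1, 10, 43, 100):
    print(n, round(usl_throughput(n, LAM, SIGMA, KAPPA)))
```

Past the peak, adding more concurrency lowers throughput, which is the degradation point the model is meant to predict.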

Friday, July 21, 2017

[Links of the Day] 21/07/2017 : Tofu HPC interconnect challenges, Satellite Quantum key distribution, Go K/V store

  • Evolution and challenges of Tofu Interconnect : Deep learning and heterogeneous hardware are putting a strain on HPC interconnects. They need to adapt to new application communication models as well as new hardware, while retaining best-of-breed capability for "traditional" HPC applications.
  • Satellite-to-ground quantum key distribution : this is groundbreaking work from Chinese scientists. They demonstrate that they are able to solve the main hurdle behind planet-wide quantum communication by leveraging satellite-to-ground quantum crypto key distribution. And they are deploying a proof of concept!
  • Badger : nice key/value store written in Go. It's based on the WiscKey paper and is heavily optimised for SSDs. It's 3.5 times faster than RocksDB. [github]

Tuesday, July 18, 2017

[Links of the Day] 18/07/2017 : Banking API, how cooperation strategies evolve, Distributing file system images

  • Teller : API for your bank account, which already supports a couple of UK banks. This is nice; however, I wonder how banks will react. Also, the EU is forcing banks to open their APIs, but the UK is leaving the EU, and there is a chance that the UK banking system will seize the opportunity to create its own independent banking API, creating more barriers to entry for fintech startups. Anyway, this is rather cool, though I am still disappointed that most banks do not expose an API to access your data.
  • How Cooperation Evolves : the authors look at evolution as a thermodynamic process and show how cooperation strategies evolve and how they can be manipulated.
  • Casync : A tool for distributing file system images. Really cool if you have to update images often and want a solution that is cheap in both traffic and storage.
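
A rough sketch of why chunk-based distribution, as used by tools like casync, is cheap on traffic and storage: if images are split into chunks identified by their hash, an update only needs the chunks the receiver does not already have. As a simplifying assumption, the code below uses fixed-size chunks, whereas casync itself uses content-defined (variable-size) chunking. The function names are hypothetical.

```python
import hashlib


def chunk_digests(data, chunk_size=4):
    """Split data into fixed-size chunks and return their SHA-256 digests."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_fetch(old_image, new_image, chunk_size=4):
    """Digests of the chunks in new_image that are absent from old_image."""
    have = set(chunk_digests(old_image, chunk_size))
    wanted = chunk_digests(new_image, chunk_size)
    # dict.fromkeys keeps order while dropping repeated digests.
    return [d for d in dict.fromkeys(wanted) if d not in have]


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBBBXXXXDDDD"  # only the third chunk changed
missing = chunks_to_fetch(old, new)
print(len(missing), "of", len(chunk_digests(new)), "chunks need to be downloaded")
```

Fixed-size chunks break down when bytes are inserted in the middle of an image, since every later chunk shifts; that is the problem content-defined chunking addresses.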

Thursday, July 13, 2017

[Links of the Day] 13/07/2017 : Jack of all trades Deep learning Model, Pay with Group Selfie, Support Vector Machine

  • One Model To Learn Them All : the authors propose a model that is good enough for most needs, a kind of jack of all trades, master of none model for deep learning. This can be really practical for experimenting and probing datasets for potential use, or if you do not have the time to create the ideal model. However, there is always the risk of ending up with a sub-performing solution.
  • Pay-with-a-Selfie : interesting payment model where the authors propose to use group selfies to extend the split banknote metaphor for executing financial transactions.
  • Introduction to Support Vector Machines : if you want to learn about the SVM classification systems.

Tuesday, July 11, 2017

[Links of the Day] 11/07/2017 : Chip Hall of Fame, AMD Software optimization guide, Pocket negotiator

Thursday, July 06, 2017

[Links of the Day] 06/07/2017 : venture capital investment framework, Aftershock in complex systems, ISC workshop

  • Picking Winners : the authors propose a framework for venture capital investment. However, I feel that there is a major flaw in the approach, as it is designed to correlate previous success with future investment success. This assumes a high degree of repeatability with clearly identifiable elements. The model can suffer from the purple cow effect, where the higher growth opportunities are located in underserved segments. As a result, by following the model you might quite often suffer from diminishing returns.
  • Aftershocks in a complex system : the authors look at the behaviour of a complex system (the currency market) after a catastrophic event. They discovered that such systems often follow a pattern similar to the one found in earthquake aftershocks. By extension, after an initial catastrophic event, most systems suffer a series of gradually diminishing aftershocks spaced out in time.
  • ISC Workshop 2017 : all about containers and optimisation for high-performance workloads.

Tuesday, July 04, 2017

[Links of the Day] 04/07/2017 : Classic Papers, Operating Model Canvas, Origami anything

  • Classic Papers : a collection of highly-cited papers in their area of research that have stood the test of time. For each area, we list the ten most-cited articles that were published ten years earlier.
  • Operating Model Canvas : a set of tools and models that help align operations and organisation with strategy.
  • Origami anything : New algorithm generates practical paper-folding patterns to produce any 3-D structure.

Thursday, June 29, 2017

[Links of the Day] 29/06/2017 : BeeGFS distributed FS, Virtual memory in Big memory systems, PdfX

  • An Introduction to BeeGFS : Fraunhofer's distributed parallel file system for HPC systems. Mainly a competitor of Lustre, I would say. This has potential, and they are making a foray into the business side of storage. Let's see how well they fare. [website]
  • Preserving the Virtual Memory Abstraction : the author's work aims at maintaining the virtual memory abstraction across a variety of hardware implementations. [thesis]
  • PDFx : really cool tool that allows you to extract all the references and metadata and download them!!

Tuesday, June 27, 2017

[Links of the Day] 27/06/2017 : Blockchain trust & authentication for IoT, K8s patterns, Ripple cryptocurrency Network analysis

  • Kubernetes Production Patterns and anti-patterns : a lot of common sense; actually, many of the patterns and anti-patterns can be applied to other environments. Still a good refresher.
  • Blockchain based trust & authentication for decentralized sensor networks : using blockchain to solve the trust issue in a swarm of IoT devices on a network. The critical missing bit is the power requirement for running all the crypto operations.
  • Large-Scale Analysis of the Ripple Cryptocurrency Network : an overview of the paper analysing the Ripple p2p blockchain-based money transaction network. It turns out that it suffers from the same issue as "old school" p2p networks: take out the highly connected nodes and you can bring down or split the network. Nothing new, but still a good read and a reminder that small networks tend to be resilient to attack, but that their resiliency diminishes with increased reliance on a small number of highly connected nodes.
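
The hub problem described for the Ripple network is easy to reproduce on a toy graph. The sketch below counts connected components before and after removing a node. The topology is entirely made up and has nothing to do with the real Ripple network; it only illustrates why losing a highly connected node splits a network while losing a leaf does not.

```python
from collections import deque


def connected_components(adjacency, removed=frozenset()):
    """Count connected components, ignoring any nodes listed in `removed`."""
    seen = set()
    components = 0
    for start in adjacency:
        if start in removed or start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in removed and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return components


# Hypothetical topology: one hub, five leaves, and a single leaf-to-leaf link.
toy = {
    "hub": {"a", "b", "c", "d", "e"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub"},
    "e": {"hub"},
}

intact = connected_components(toy)
without_leaf = connected_components(toy, removed={"c"})
without_hub = connected_components(toy, removed={"hub"})
print(intact, without_leaf, without_hub)
```

Removing the leaf leaves the network in one piece, while removing the hub fragments it into several islands.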

Thursday, June 22, 2017

[Links of the Day] 22/06/2017 : Modern Web Dev spell-book, .Net microservices framework, Optimizing Rust

  • Spellbook of Modern Web Dev : like an awesome list for JavaScript development, but with more thought and structure. A must-read for front-end JavaScript devs.
  • Microdot : I'm not a .NET guy, but here is a framework for easily creating .NET microservices with Orleans.
  • Rust Optimization : a lot of this can be applied to any language, but there are some nuggets of information specific to Rust in this document.

Tuesday, June 20, 2017

[Links of the Day] 20/06/2017 : Statistics lectures notes, HA transactions, Capability Models for Manycore Memory Systems

  • Lectures on Statistics : 2003 lecture notes on statistics that pretty much cover all the basics of what you need to know about stats.
  • Highly Available Transactions : the authors look at the state of database transaction systems and, like any good scientists, conclude that there is more research to be done. More seriously, highly available transactions and systems need to be perfected, and new semantics for hybrid systems need to be developed in order to ensure the availability of transactions.
  • Capability Models for Manycore Memory Systems : Programming for manycore systems is hard; optimising for them is even harder. In this paper, the authors try to design models that help programmers deliver efficient software for this type of hardware architecture. [slides]

Friday, June 16, 2017

Maxims of Maximally Effective Startup Developers

  1. Test, then Deploy
  2. A coding Developer outranks an Architect who doesn't know what's going on
  3. If the food is good enough, the devs will stop complaining about the incoming workload
  4. Only you can prevent prod failure. 
  5. If testing wasn’t your last resort, you failed to resort to enough of it.
  6. The longer everything goes according to sprint planning, the bigger the impending disaster. 
  7. The world is richer when you turn competitors into partners, but that's not the same as you being richer.
  8. Give a developer a task, he will code for a day. Take his software away and tell him he's lucky just to be paid, and he'll figure out how to code another one for you to take tomorrow. 
  9. "Deploy and Forget" is fine, provided you never actually forget.
  10. Don't be afraid to be the first to resort to rollback.
  11. The competitor of my competitor is my competitor's competitor. No more. No less.
  12. There is no 'over testing'. There is only 'Continuous Integration' and 'I need to spawn more Jenkins slaves'
  13. Just because a feature is easy for you, it can still be hard for your clients.
  14. There is a difference between a spare feature and extra [feature].
  15. Not all good news is competitor action. 
  16. “Do you have a backup?” means “I can’t fix this.” 
  17. The size of a developer's startup stock options is inversely proportional to the likelihood of the startup surviving to collect them
  18. Don’t try to save money by conserving lines of code.
  19. Don't expect the competition to cooperate in the creation of your dream startup
  20. If it ain't broke, it hasn't been deployed to prod yet.
  21. The dev team you've got is never the dev team you want.
  22. The product management guideline you've got is never the guideline you want
  23. The best way to win a one-on-one architecture design is to be the third to arrive.
  24. It's only too many features if only the devs use them
  25. Don't bring big VMs into small servers
  26. Management knows how to do it by knowing who can code it 
  27. Failure is not an option - it is mandatory. The option is whether or not to let failure be the last thing you test.

Borrowed from the Schlock Mercenary comic

Thursday, June 15, 2017

[Links of the Day] 15/06/2017 : Corporate Wargaming, ScatterText , Forecasting and BigData

  • Scattertext : Nice and easy-to-use tool for finding distinguishing terms in corpora and presenting them in an attractive way via an interactive scatter plot.
  • Forecasting in the light of Big Data : the authors look at how forecasting is changing as new tools and data collection capabilities emerge. As is often the case, the authors suggest that the best approach would be to combine models and quantitative analysis in order to obtain the best forecasting strategy. However, I am afraid that this would require a new framework, and also training data scientists to leverage these two opposite methodologies correctly.
  • Competitive Wargaming and Simulations for Business Forecasting & Analytics : slide deck providing good insight into how wargaming can help shape the decision process and strategy of a company. However, like any tool, it is as much about the preparation and how you leverage the outcome as about the game itself.

Wednesday, June 14, 2017

[Links of the Day] 14/06/2017 : Formally proven HTTPS replacement, Kisrhombille geometry, public key cryptosystem using Mersenne Numbers

  • Everest: Towards a Verified, Drop-in Replacement of HTTPS : the authors (a team from Microsoft, MIT and INRIA) propose a complete, verified replacement of TLS and other components of HTTPS. Entirely written in F* for provability, Everest is then compiled into a low-level language. This is a highly praiseworthy solution. However, there is still a large portion of the dev world that does not completely embrace or understand formal verification. Until the big corporations (Google, Microsoft, AWS, etc.) start pushing such libraries, adoption will remain marginal.
  • Public-Key Cryptosystem via Mersenne Numbers : an interesting new approach to building a public key cryptosystem, for crypto buffs only.
  • Kisrhombille geometry : tessellation of the plane using rhombic faces divided at a centre point into four triangles. While Voronoi tessellation still tends to have my preference, this type of tessellation has high potential and, like all of them, it is really pretty :)

Tuesday, June 13, 2017

[Links of the Day] 13/06/2017 : Next Gen Fabric Comparison, Marginal Revolution Books, AWS awless CLI

  • Marginal Revolution Books : A browsable database of books discussed on Marginal Revolution, sorted by the month of their posting.
  • CCIX, GEN-Z, OpenCAPI : Overview and comparison of the different next-generation fabrics. This presentation shows the key differences and use cases associated with each fabric. CCIX mainly focuses on low-latency main memory expansion with its hardware cache coherence. It enables accelerators, network devices, main processors, etc. to work on the same coherent view of a dataset. Gen-Z is really all about hardware/component disaggregation. I tend to prefer this approach, as I feel it will be the next step in HW efficiency. OpenCAPI is a little bit like CCIX, but with the hope of extending beyond to rack level in the future.
  • awless : fast, powerful and easy-to-use command line interface (CLI) to manage Amazon Web Services. This is a nicer version of the AWS CLI. Also, it is WAY more human-friendly. Check it out.

Thursday, June 08, 2017

[Links of the Day] 08/06/2017 : Machine Learning Tuning DBMS, Direct SSD to GPU SQL and Large Graph DB processing

  • Tuning DBMS with Machine Learning : From the people behind Peloton, who demonstrate a way to automatically tune a DB using machine learning. This is rather interesting; however, there is a key element missing in the approach: cost. Your DB system can become highly optimised, but your AWS cost can skyrocket too. What you need is a system that automatically tunes the perf and cost tradeoff to maximise ROI. Sometimes being a little bit slower can save $$.
  • MOSAIC : A more heterogeneous approach: a graph processing engine that exploits all the hardware resources available in a standard Xeon host processor, Xeon Phi coprocessors, NVMe, and a fast interconnect. Because fast processing of your Facebook social network for fast advertisement targeting is worth it :) [slides]
  • PG-Strom : Bypassing the CPU for SQL operations by allowing direct SSD-to-GPU communication for PostgreSQL processing. We are slowly entering the age of heterogeneous computing systems, where CPU cores get relegated to highly generic tasks. [slides]

Tuesday, June 06, 2017

[Links of the Day] 06/06/2017 : Secure Machine Learning, Quantum secured blockchain and Survey of Machine Learning in Hardware

  • DeepSecure : a framework that enables scalable execution of state-of-the-art Deep Learning models in a privacy-preserving setting. The authors propose a system that lets the data owner and the model owner keep their information segregated while working together, without data leaking between the two parties.
  • Quantum-secured blockchain : In this paper, the authors propose a quantum blockchain architecture specifically designed to address the weakness, against quantum computers, of the crypto algorithms currently used in Bitcoin and other blockchain frameworks. However, it seems that they conveniently ignore newer cryptographic solutions that are "quantum resistant".
  • Survey of Neuromorphic Computing and Neural Networks in Hardware : heterogeneous hardware solutions are becoming the norm, as classic CPUs are not able to handle the bandwidth and processing power required. Seriously, how can an Intel or AMD CPU process 1 Tb/s of bandwidth? Anyway, as machine learning is reaching peak hype, the hardware built to accelerate it is getting more mainstream and diverse. This paper provides a good overview of the various techniques and hardware used in the field. Moreover, it references an exhaustive collection of papers from the field.

Thursday, June 01, 2017

[Links of the Day] 01/06/2017 : Istio microservice mesh , encrypted p2p network, formally proven system still vulnerable to bugs

  • Istio : this is a fantastic project. It allows efficient delivery of a microservice infrastructure without tying the developer to a language-specific framework. It relies on a data plane using the Envoy proxy for managing and mediating all communication, as well as a control plane for managing and enforcing proxy policies.
  • CJDNS : encrypted IPv6 network using public-key cryptography for address allocation and a distributed hash table for routing. This provides near-zero-configuration networking, and prevents many of the security and scalability issues that plague existing networks. [github]
  • An Empirical Study on the Correctness of Formally Verified Distributed Systems : Spoiler alert, even formally verified projects can fall prey to bugs. And it seems that these bugs can seriously affect real systems in the wild. Well, it looks like the human in the loop is the weakest link in the production of robust systems after all.

Tuesday, May 30, 2017

[Links of the day] - 30/05/2017 : Next Gen File system / Machine learning Infrastructure & stealth denial of service in public cloud

  • TFS : Some interesting concepts behind TFS, a next-generation file system. The design goals are ambitious and well thought out. Machine learning for caching can make sense, but I would still prefer to see how it copes in real life over a long period of time, especially when there are historical access patterns that span days to months.
  • Stanford DAWN Project : While I know some cool tech will come out of it, public cloud providers will probably shape the future of machine learning infrastructure at a faster pace and with shorter iterations.
  • Bolt : this is really cool, the authors of the paper are pretty much pushing the noisy neighbour to 11. They propose a technique that offers stealth denial of service on public cloud infrastructure. 

Tuesday, May 23, 2017

[Links of the Day] - 23/05/2017 : Intel Manufacturing Forecast, Topological Quantum Computing and Serverless Conf Video

  • Intel Manufacturing Conf : set of slide decks giving a peek into Intel's manufacturing process and the upcoming wave of 10 nm chips. It seems Intel is currently keeping up with Moore's law, not only by reducing the transistor size but also by increasing transistor density.
  • Introduction to Topological Quantum Computation : introduces the concept of quantum computing with anyons, which allow a more resilient quantum computing system.
  • Serverless Conf video : almost all videos are now available. Check out the Serverless at Nordstrom video. This is, in my view, the best of the bunch. It's an actual practical talk by the dev who implemented it, without marketing spin.

Tuesday, May 16, 2017

[Links of the Day] 16/05/2017 : Exascale Project, Storage as Stream , ServerlessConf

Thursday, May 11, 2017

[Links of the Day] 11/05/2017 : Google - Push on green , Tensor flow in Datacenter , TCP congestion protocol

  • In-Datacenter Performance Analysis of a Tensor Processing Unit : How custom deep learning hardware behaves in a real datacenter, and the implications and gains associated with the use of such a custom solution.
  • Push on Green : Great article on Google's rollout policy and process. A lot of common sense, plus some less common but equally important advice. This is a great read for anybody involved in software delivery, especially if you are aiming at an efficient CI/CD system.
  • BBR : Google's congestion control algorithm for maximising bandwidth usage. It's a new TCP scheduling algorithm that fights bufferbloat at the TCP level. Since the majority of internet traffic is TCP, wide adoption would bring a big improvement. Note that TCP scheduling only affects outgoing packets.
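
To give a feel for how BBR reasons about a connection: its name stands for Bottleneck Bandwidth and Round-trip propagation time, and it models the path from the highest recently observed delivery rate and the lowest recently observed RTT. Their product, the bandwidth-delay product (BDP), is roughly how much data can be in flight without building a queue in the bottleneck buffer. The sketch below only shows that arithmetic; the function name and sample values are made up, and none of BBR's actual state machine (pacing gains, probing phases, filter windows) is reproduced.

```python
def bandwidth_delay_product(delivery_rates_bps, rtts_s):
    """Estimate the BDP in bytes from recent delivery-rate and RTT samples."""
    btl_bw = max(delivery_rates_bps)  # bottleneck bandwidth estimate, bits/s
    rt_prop = min(rtts_s)             # propagation delay estimate, seconds
    return btl_bw * rt_prop / 8       # bits in flight -> bytes


# Hypothetical samples from a single connection.
rates = [80e6, 100e6, 95e6]  # bits per second
rtts = [0.040, 0.025, 0.030]  # seconds; the larger ones include queueing delay

bdp = bandwidth_delay_product(rates, rtts)
print(f"keep roughly {bdp:.0f} bytes in flight")
```

Sending much more than the BDP only fills buffers along the path, which is the bufferbloat the bullet above refers to.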

Tuesday, May 09, 2017

[Links of the Day] 09/05/2017 : Serverless Design Patterns, Paper Reading How-to, HPE's The Machine

  • Serverless Design Patterns : Basic serverless patterns, nothing fancy but useful to keep in mind.
  • How to Read a Paper : the short version is quite simple: Abstract -> conclusion ->figure, then start iterating through the paper, one layer at a time. 
  • Billion node graph inference: iterative processing on The Machine : I still believe that The Machine is vaporware. It has been promised for years, yet it is still at the stage of a nice plastic prototype, and everybody developing or designing for it uses a simulator/emulator. But hey, it's nice to see what you can theoretically achieve on "The Machine", even if it's a Superdome X in reality.

Friday, April 28, 2017

[Links of the Day] 28/04/2017 : Bitcoin Antbleed , Social Networks Rumors, HPC & AI trends

  • Increasing the Flow of Rumors in Social Networks by Spreading Groups : It looks like rumours flow more easily in social networks when groups are fragmented. To a certain extent, this mimics real life: by isolating and fragmenting groups, it becomes easier to spread gossip, because it is difficult for an individual in each group to check the validity of the information within their neighbourhood.
  • HPC & AI Technology Trends : Dr Eng Lim Goh of HPE talks about the trends in HPC and AI.
  • Antbleed : Apparently BITMAIN, the ASIC provider behind up to 70% of bitcoin miners, embedded a backdoor that can remotely disable or compromise its hardware. The funny aspect is that it could potentially allow the company to take control of more than 51% of the bitcoin mining network, and hence allow it to rewrite the whole blockchain. The 51% threshold has always been considered a theoretical threat that was not attainable in real circumstances. Well, guess what, it's not theoretical anymore.

Wednesday, April 26, 2017

[Links of the Day] 26/04/2017 : Aphyr Scala Day, Sia blockchain file storage , How brains are built

  • Aphyr Scala Day 17 : Aphyr breaks databases for a living and then talks about it :)
  • Sia : a blockchain-based marketplace for file storage. The really attractive thing is the cost comparison of Sia versus public cloud systems: it is between ten and a hundred times cheaper than S3 or other similar solutions. I would be curious to see the performance, though.
  • How brains are built: High-level overview of principles of computational neuroscience.

Monday, April 24, 2017

[Links of the Day] 24/04/2017 : Data Center Perf Index, Physical Limits of Computing And DNS as Code

  • Data Center Performance Index : Performance index aiming at providing a reliable idea of the performance and efficiency of a datacenter. It primarily focuses on availability, efficiency and environmental impact. This effort is led by Dean Nelson, Uber's Head of Compute.
  • Physical Limits of Computing : A look at the limitations of compute from a physicist's point of view. It seems that some limitations are fundamental and require a new and different approach in order to create compute devices that get around them.
  • DnsControl : system built to manage DNS systems. To some extent, it looks like a Terraform for DNS, where you can plug in multiple DNS backend providers. This allows you to deploy and distribute your DNS infra in an agnostic way across multiple cloud providers. [github]

Friday, April 21, 2017

[Links of the Day] 21/04/2017 : HPC 2017 trends, Docker cheat-sheet, Incident response best practice

  • Current Trends in High-Performance Computing and Challenges : Jack Dongarra's annual HPC review. It's amazing how the Chinese have progressed. They literally took over the Top 500 in less than ten years, and now they dominate using homegrown chips and network fabric. [slides]
  • Docker cheatsheet : 'nough said.
  • Increment - On-call : New magazine providing articles on how to scale companies. Each edition focuses on a different topic. For the inaugural issue, they focus on industry best practices around on-call and incident response.

Wednesday, April 19, 2017

[Links of the Day] 19/04/2017 : AMD ROCm GPU open platform, Weak Memory Models concurrency report, SSH server for distributed infrastructure

  • ROCm : this slide deck gives an overview of the AMD ROCm open platform for GPU computing. They are really pushing to become the open source standard for the GPU industry, battling against NVIDIA's supremacy in the domain. It looks like they are making really good progress, and I would be curious to see how this progresses when combined with their Ryzen CPUs.
  • Concurrency with Weak Memory Models : this is a really good report on the state of memory models in hardware and software. It provides a wide-spectrum overview of hardware and software concurrency models and approaches, as well as future directions in the domain.
  • Teleport 2 : a modern SSH server designed for teams managing distributed infrastructure. [github]

Monday, April 17, 2017

[Links of the Day] 17/04/2017 : Pedis Redis Clone, Serverless framework, Deep learning best practices

  • Pedis : Redis-compatible NoSQL datastore using the Seastar framework. It's interesting to see that on the single-thread benchmark Redis and Pedis are on par, while Redis gets smoked on the 8-thread benchmark. On a side note, the author should probably have chosen another name for the project.
  • serverless : the Serverless Framework, for building serverless architectures using AWS Lambda, Azure Functions, Google Cloud Functions [github]
  • Best Practices for Applying Deep Learning to Novel Applications : this is pretty much a must-read for machine learning experts using deep learning. The report decomposes a deep learning project into phases and provides best practices for each phase.

Friday, April 14, 2017

[Links of the Day] 14/04/2017 : OpenFabric Workshop , Docker's Containerd , Category Theory

  • OpenFabrics Workshop 2017 : Some interesting talks this year at the OpenFabrics workshop:
    • uRDMA : Userspace RDMA using DPDK. This opens up a number of possibilities, especially for object storage solutions. [Video , Slides, github]
    • Crail : Using the uRDMA above to deliver accelerated storage solutions for Apache big data projects [Slides, github]
    • Remote Persistent Memory: I think this is the next killer app for RDMA, if Intel doesn't jump onto it and deliver a DPDK-like solution. [Video, Slides]
    • On Demand paging: slowly the tech is crawling its way towards upstream acceptance. While on-demand paging introduces a certain performance cost, it also allows greater flexibility in consuming RDMA. One of the interesting aspects that nobody has mentioned yet is how this feature could be used with persistent memory. I think that there is some good potential for p2p NVM storage solutions. [Video, Slides]
  • Containerd : Containerd moves to GitHub; Docker's "industry standard" container runtime is also reaching its v0.2.x release. [github]
  • Category Theory : If you are into functional programming and Haskell, this is a must-read book for you.

Wednesday, April 12, 2017

[Links of the Day] 12/04/2017: Linux Perf tools, libp2p and Contagion of Information in Social Media

  • Perf Tools : miscellaneous collection of in-development and unsupported performance analysis tools for Linux ftrace and perf_events.
  • Contagion of Information in Social Media : The authors look at how information spreads on social media (Twitter). They model contagion behaviour in the hope of creating effective defences against "fake news" and other propaganda. However, to some extent the research can also be used to optimise the spread of such malicious information.
  • libp2p : really cool network stack (used by IPFS) that tackles a lot of the nitty-gritty details of p2p applications. It should allow devs to focus on the actual value of their p2p apps rather than the underlying technical problems of p2p itself. [github]

Monday, April 10, 2017

[Links of the Day] 10/04/2017 : Loopy , Distributed execution engine and Yet another distributed ledger algo

  • Loopy : Fantastic tool for explaining and describing complex system interactions. It's easy to use and makes it even easier to get the message across [github]
  • Ray : experimental distributed execution engine replicating code across multiple workers. Written in Python, it leverages an object store and distributed task execution to achieve parallelism. I really wonder if it wouldn't have been better to code Ray using AWS Lambda and S3.
  • Algorand : yet another distributed ledger. The approach proposes to eliminate the segregation of actors in the ledger system. No more miners and users, everybody is an equal participant. Moreover, it relies on a new form of Byzantine agreement (I need a TLA+ proof to really feel comfortable with that) and a cryptographic selection algorithm for selecting the leader (verifier) of the ledger process execution. This is a rather interesting paper and I will try to produce a short summary of it if I find the time.

Friday, April 07, 2017

[Links of the Day] 07/04/2017 : TensorFlow Example, Systems in the microseconds era and blockchain distributed direct democracy

  • Naked Tensor : bare-bones Google TensorFlow example.
  • Attack of the Killer Microseconds : Hardware (especially storage) is entering the microsecond or sub-microsecond era. This has far and wide-ranging implications, and system designers need to rethink the existing stack, which was designed for the millisecond era. It looks like we are entering an era where the software stack is now the bottleneck.
  • Cicada : Distributed secure proof-of-work blockchain combined with a privacy-guaranteed ID system. The creators of Cicada aim at enabling distributed direct democracy and a decentralised application platform. These are worthy goals; however, the creators forgot that direct democracy tends to fail, as the majority of the population is not interested or knowledgeable enough in the problems they will be asked to vote on.

Wednesday, April 05, 2017

[Links of the Day] 05/04/2017 : Large network resilience, Distributed Systems, Machine Learning & Bayesian reasoning

Monday, April 03, 2017

[Links of the Day] 03/04/2017 : Conway's Game of life Clock, Human-Bot social interaction, SQL time series DB

  • Digital clock in Conway's Game of Life : I can't even start to comprehend how you can design this. But this is beyond cool.
  • Online Human-Bot Interactions: Detection, Estimation, and Characterization : An analysis of bots on social networks (Twitter). I think we need a reverse Turing test, where a robot can detect when it is talking to a human.... A reverse captcha to weed out those pesky meat-bags and stop them meddling in our robotic overlords' affairs.
  • Timescale : SQL-compatible time series database. Another competitor for InfluxDB. Let's just say that the clustering feature will make or break it, as InfluxDB has some serious issues there [github]

Friday, March 10, 2017

[Links of the Day] 10/03/2017 : User-space SysFS, Key Value consensus Algo, Cost efficient Big Data Serverless Framework

  • ProcStat : Userspace equivalent of kernel SysFS. Really cool project by my friend Sasha. It makes it really easy to expose the internal counters and state of a process via FUSE
  • Bizur : Key-value consensus algorithm using a nice solution where consensus is achieved on the keys themselves rather than relying on a globally distributed log. The great aspect is that recovery and failure management are greatly simplified and streamlined. However, it implies that the progress and consensus on each key are independent of each other. As a result, you cannot rely on serialisation of state between keys, which can be limiting if, for example, you expect the state of key A to be changed after the state of key B.
  • PyWren : Framework that lets you use serverless functions for cheap large-scale data analysis. [github]
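
To make the Bizur trade-off above more concrete, here is a minimal Python sketch. It is not Bizur's actual protocol: the `Replica` class, its version numbers and the sample updates are all hypothetical, and there is no leader election or quorum logic. It only illustrates the consequence described above: when ordering is tracked per key, two replicas can observe updates to key A and key B in different orders and still converge on the same final state.

```python
class Replica:
    """Hypothetical replica that orders writes per key only (no global log)."""

    def __init__(self):
        self.store = {}  # key -> (version, value)

    def apply(self, key, version, value):
        """Accept a write only if it is newer than what we hold for that key."""
        current_version, _ = self.store.get(key, (0, None))
        if version > current_version:
            self.store[key] = (version, value)

    def read(self, key):
        """Return the current value for key, or None if it was never written."""
        return self.store.get(key, (0, None))[1]


# Two independent updates, one per key (hypothetical data).
update_a = ("A", 1, "a1")
update_b = ("B", 1, "b1")

r1, r2 = Replica(), Replica()

# Each replica receives the first update in a different order.
r1.apply(*update_a)
r2.apply(*update_b)
# Intermediate states differ: r1 has seen A but not B, r2 the opposite.
intermediate = (r1.read("A"), r1.read("B"), r2.read("A"), r2.read("B"))

# Deliver the remaining update to each replica.
r1.apply(*update_b)
r2.apply(*update_a)
converged = r1.store == r2.store
```

Both replicas end up identical, but nothing guaranteed that either of them saw A change before B. An application that needs that guarantee has to build it on top, for example by packing both values under a single key.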

Wednesday, March 08, 2017

[Links of the Day] 08/03/2017 : Intel blockchain, Fast17 conference and papers, AWS cloud formation devops tool

After a small hiatus, here is the return of the links of the day.
  • Sawtooth Lake: Intel's distributed ledger system. It uses an interesting security mechanism to deliver secure consensus. Sadly it relies on Intel's proprietary hardware encryption modules to deliver this feature.
  • Fast17: the Usenix File and Storage Technologies conference happened last month. There were a couple of interesting papers but one piqued my interest: Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions. The authors look at the impact of single file system faults on Redis, ZooKeeper, Cassandra, Kafka, RethinkDB, MongoDB, LogCabin, and CockroachDB. It turns out most systems are not able to handle these types of faults very well. It seems that a single-node persistency layer error can have an adverse ripple effect, as distributed systems seem to have put way too much trust in the reliability of this layer. Sadly they lack tools for recovering from errors or corruption emerging from file systems.
  • Stacker : Remind101's tool for creating and updating AWS CloudFormation stacks. Looks like an interesting alternative to Terraform.

Tuesday, January 24, 2017

[Links of the Day] 24/01/2017 : Numpy cheat sheet, Innovations patterns, Persistent memory summit

  • Numpy Cheat Sheet: all you need for data analysis in python with NumPy 
  • Mathematical Model of innovation patterns: Vittorio Loreto at the Sapienza University of Rome in Italy et al. created the first mathematical model that accurately reproduces the patterns that innovations follow.
  • Persistent Memory Summit: SNIA NVM summit 2017. Finally, with the introduction of Intel 3D XPoint, we start to see more HW NVM solutions out there, and with them software that uses them. Some really interesting talks:
    • Nova file system demonstrates the benefit of NVM-optimised storage solutions.
    • SAP Hana on NVM: interesting to see that they still require redundant copies as they fear data corruption on NVDIMMs. I wonder when we will start to see ECC NVDIMMs on the market?
    • New interconnect: lots of hot new interconnects are battling for domination of the heterogeneous compute ecosystem. And this involves offering persistent-memory-specific solutions: Gen-Z provides PM pooling, OpenCAPI accelerates PM access and CCIX shares PM

Wednesday, January 18, 2017

[Links of the Day] 18/01/2017 : Multi-tenant K/V cache, Http Tunnel , Google Infrastructure security

  • Memshare: Multi-tenant in-memory key-value store; the authors specifically target the web caching use case. It is interesting to see that they are using a log-structured design to maximise memory usage and hit ratio. However, the really novel approach is that it allows each application to define its own eviction policy.
  • Chisel: Interesting tunnel approach over HTTP, to some extent similar to coding but with a different approach. I really like that it provides something more akin to a crowbar for firewall bypass, with out-of-the-box encryption. Also, it seems to be a lot faster than other tunnels out there.
  • Google Infrastructure Security Design: Google's approach to security is really interesting. While it makes great use of hardware security features, it also leverages a more software-defined security approach, allowing them to have multiple lines of defence stacked between each communicating component while eliminating a lot of the restrictions that often sclerose highly secure infrastructure.
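
The per-application eviction policy mentioned for Memshare above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Memshare's implementation: `TenantCache`, `evict_lru` and `evict_largest` are hypothetical names, capacity is counted in entries rather than bytes, each tenant gets a fixed quota, and there is no log-structured memory.

```python
from collections import OrderedDict


def evict_lru(entries):
    """Hypothetical policy: evict the least recently used key (front of the dict)."""
    return next(iter(entries))


def evict_largest(entries):
    """Hypothetical policy: evict the key holding the largest value."""
    return max(entries, key=lambda k: len(entries[k]))


class TenantCache:
    """One tenant's slice of a shared cache, with its own eviction policy."""

    def __init__(self, capacity, policy):
        self.capacity = capacity      # maximum number of entries for this tenant
        self.policy = policy          # callable: entries -> key to evict
        self.entries = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            # The tenant's own policy decides which entry to drop.
            del self.entries[self.policy(self.entries)]
        self.entries[key] = value


# Two tenants sharing the same code path but evicting differently.
web = TenantCache(capacity=2, policy=evict_lru)
blobs = TenantCache(capacity=2, policy=evict_largest)
```

The point of the design is that the cache only asks the policy which key to drop, so a tenant can swap in whatever heuristic matches its workload without touching the store itself.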

Wednesday, January 11, 2017

Ancillary business opportunities from the emergence of autonomous ride share car services

The list of self-driving vehicles, and of companies starting to offer services using these vehicles, is ever growing. I recently started to be interested in the ancillary challenges brought by the deployment of a fleet of autonomous vehicles and the business opportunities that emerge.

There are two interesting business areas with a certain potential: cleaning services and real estate.

Support service ecosystem:

Support services are the main area of expansion created by companies like Uber and Google making a foray into the autonomous ride share business model. They will most likely outsource these operations to third parties, as these businesses have a low profit margin and tend to be hard to automate (as in requiring manual labour).
I have three daughters under 5 and, let's face it, my car is a mess. It takes less than 2 rides to transform a clean, spotless car interior into the equivalent of the aftermath of D-Day on Omaha Beach. And it is the same for taxi / Uber drivers: the current best practice recommendation is to have cleaning implements and a throw-up bag in the car at all times in order to maintain a high standard and rating. Not to mention the cleaning fee if things go really bad.

Now if you have an autonomous car, you will need to have it cleaned often, as it will be providing rides 24/7.
Repair and maintenance requirements are obviously another area that will need to be developed. For example, in Uber's current model, cleaning, repairing and refuelling are the responsibility of the owner of the car. However, when shifting to autonomous rides, Uber will need to either provide this service internally or outsource it.
Refuelling and recharging might be less of a problem, as there is a clearer way of automating the process.

Real estate issue:

Another side effect is that cleaning, recharging and repair operations require real estate. You cannot deliver these services in the middle of the street. And this is another problem that corporations will have to solve. To some extent Google and Uber are trying to get around this issue by deploying their solution first in confined areas like college campuses, military bases or corporate office parks, as the owners of these private spaces will be able to provide space for free in order to benefit from the service. However, as they expand outside, this will become more problematic. Moreover, they might want to have buffer zones where the fleet of vehicles is at rest in off-peak periods.
Being able to efficiently deliver the logistics for support services while maximising resource efficiency will literally make or break the business model of autonomous rideshare. One possibility would be for these companies to contract individuals, Uber style, to offer their driveways and cleaning/refuelling services. Companies would be able to use the cleanliness rating given by the customer to evaluate the service quality of the individuals. This would partially solve the real estate issue until town planners start to accommodate this new mode of transport. It would also allow them to cheaply grow a widely distributed network of service points, enabling just-in-time servicing and hence maximising car usage efficiency.

To some extent the new business model deployed by the likes of Uber, coupled with the commoditization of services, creates new business opportunities for ancillary support services. Unsurprisingly, these services can copy or adapt the same business model to scale while keeping costs down. However, it might be a little bit too early for these to blossom, as we haven’t reached peak Uber fad and fleets of self-driving cars are a couple of years away. I would probably keep an eye instead on the less glamorous but potentially more lucrative self-driving truck business.

Monday, January 09, 2017

[Links of the Day] 09/01/2017 : Incident Response process, Plain english Legal guide to start a startup, 33C3 videos

  • Incident Response : PagerDuty incident response documentation. This is a very thorough and well-documented process for handling incidents before, during and after their occurrence. Probably not one size fits all, but it can be easily adapted to any company's needs.
  • Plain English legal guide on how to start a business : nice guide to the legal aspects of how to start a startup and the various options and pitfalls associated with it. A little bit too US-centric for my taste but still has some good insights.
  • 33C3 : Chaos Communication Congress videos are now available. Chris Hager gave a great overview of the different talks. As always, a lot of diversity and challenging presentations.

Friday, January 06, 2017

[Links of the Day] 06/01/2017 : Systems we love videos, Distributed programming book and all the code from NIPS16 papers

  • Systems we love : Videos of the recent Systems We Love conference. Check out the lessons from the cell one.
  • Distributed programming book : yet another open source book on distributed systems.
  • Code of NIPS16 : all code from the machine learning NIPS16 conference. Finally we start to see some traction to make code available alongside papers. I think all top conferences should make it mandatory for the code to be available on the publication date. We don't care if it's not production-ready or nice. We just want to see how these wonderful ideas translate into real code.

Wednesday, January 04, 2017

[Links of the Day] 04/01/2017 : ProxySQL, Smudge Golang lib, Hypervisor costs comparisons

  • ProxySQL : SQL proxy for MySQL (or any fork like Percona and MariaDB) [Github]
  • Smudge : Go library providing group member discovery, status dissemination and failure detection using the SWIM epidemic protocol. This is really cool as it provides a building block for an equivalent to Consul but with a very low footprint, both resource and network wise.
  • Hypervisor costs : interesting comparison of hypervisor solution costs. The surprising number is that, contrary to popular belief, VMware is not the most expensive; Hyper-V is, on an apples-to-apples comparison. However, I am not sure that this holds true when you are talking about a complete solution.
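
For readers unfamiliar with the SWIM protocol that Smudge builds on, here is a rough Python sketch of its failure-detection step: a direct ping, followed by indirect pings through up to `k` other members if the direct one gets no answer. It is a toy simulation, not Smudge's Go API: `Member`, `blocked`, `ack` and `probe` are hypothetical names, and real SWIM adds timeouts, suspicion periods and gossip-based dissemination that are left out here.

```python
import random


class Member:
    """Hypothetical cluster member used to simulate crashes and broken links."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.blocked = set()  # names of peers whose pings never reach us

    def ack(self, sender):
        """Return True if this member answers a ping coming from sender."""
        return self.alive and sender.name not in self.blocked


def probe(prober, target, members, k=2, rng=random):
    """One SWIM-style probe round: direct ping, then up to k indirect pings."""
    if target.ack(prober):
        return "alive"
    helpers = [m for m in members
               if m is not prober and m is not target and m.alive]
    for helper in rng.sample(helpers, min(k, len(helpers))):
        # The helper must hear the prober, and the target must answer the helper.
        if helper.ack(prober) and target.ack(helper):
            return "alive"
    return "suspect"


a, b, c, d = nodes = [Member(n) for n in "abcd"]
b.blocked.add("a")                  # the a -> b link is broken, b itself is fine
status_b = probe(a, b, nodes)       # b is still reached through c or d
c.alive = False                     # c has really crashed
status_c = probe(a, c, nodes)       # nobody can reach c
```

The indirect step is what keeps a single flaky link from marking a healthy node as failed, which is part of why the protocol stays cheap on the network while remaining reasonably accurate.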

Monday, January 02, 2017

[Links of the day] 02/01/2017 : Cloud Cron, KSM hypervisor, Technology landscape radar

  • Cloud Cron : cool tool for executing cron jobs in the cloud
  • KSM : while the name might be confusing (KSM also stands for kernel shared memory), KSM is a neat small hypervisor that supports a lot of hardware features.
  • Technology Radar : ThoughtWorks-maintained technology landscape. Really useful for spotting which up-and-coming technologies are being baked by different startups, and what stage the different technologies are at.