Thursday, October 19, 2017

[Links of the Day] 19/10/2017 : Distroless minimal container images, Strangeloop and CppCon 2017

  • Distroless : a toolkit for creating container images which contain only your application and its runtime dependencies. No package managers, shells, or any other programs. This is really awesome, as you always end up with a lot of clutter in your layers. Too often you see apt-get update ; apt-get upgrade in a Dockerfile. Or you could simply move to Golang and enjoy a FROM scratch environment! [talk] [video]
  • Strangeloop : a very good review of the excellent Strange Loop conference. Every time I watch some of these talks, I tell myself I should get into functional programming... Then life (and three kids under 6) takes over. [slides] [videos]
  • CppCon : another dev conference, this time for C++. Some interesting talks, especially the IncludeOS one, which provides C++ microkernel functionality by simply including a single header #include ! A lot of in-depth and technical talks; give it a look if you are a C++ dev. [slides] [videos]





Tuesday, October 17, 2017

[Links of the Day] 17/10/2017: Corporate Taxes & wages, Storage with macromolecule, 25 years of MPI


  • Do Higher Corporate Taxes Reduce Wages? : interesting paper where the authors estimate that 40% of the corporate tax burden is passed on to workers, and that most of the tax variation is directly imposed on the workforce.
  • Macromolecules as Storage Media : the authors suggest that you can achieve a petabyte per cubic centimetre. Stability and durability are not yet fully addressed compared to non-organic media. However, I doubt this will be a major concern, as it could be an extremely viable solution for short-term transport of digital data.
  • MPI symposium : MPI is 25 years old and still improving. The venerable HPC message passing interface is still widely used and underpins a lot of critical non-HPC infrastructure such as stock markets. A must read is Jack Dongarra's presentation on the evolution of MPI.

Thursday, October 12, 2017

[Links of the Day] 12/10/2017 : bitcoin resource list, time series DB seminar, Microservices debugger

  • Bitcoin resource list : extensive list of bitcoin resources ranging from basic introductions, history, and tutorials to in-depth tech materials.
  • Time Series Database Lecture : 2017 Carnegie Mellon University lectures. This is quite good, as this series of lectures not only offers high-quality theoretical knowledge in the field but also includes invited talks from key commercial and open-source players in the field (influxdb, timescale, etc.).
  • Squash : microservices debugger, because now you can't rely on your monolith debugging skills and tool set anymore ( ^_^).


Tuesday, October 10, 2017

[Links of the Day] 10/10/2017 : Machine Learning Hardware acceleration , Homomorphic encryption

  • Tutorial on Hardware Architectures for Deep Neural Networks : How to leverage hardware for accelerating machine learning processes. 
  • A Survey on Homomorphic Encryption Schemes : this paper presents a thorough survey of the state of homomorphic encryption schemes. Homomorphic encryption allows manipulation of encrypted data without the need to decrypt it. Once hardware is fast enough to deal with the complexity of the operations, this will allow a truly secure distributed multitenant database: no operation on the hosting side will require decrypting the data to clear text, and everything sensitive can be done securely on the client side.
  • Efficient Methods and Hardware for Deep Learning : Stanford lecture where guest lecturer Song Han presents algorithms and specialized hardware (FPGA, GPU, ASIC, etc.) that can be used to accelerate training and inference of deep learning workloads. [video]
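
To make the "compute on encrypted data" idea from the homomorphic encryption survey a bit more concrete, here is a toy sketch. It uses textbook (unpadded) RSA, which happens to be multiplicatively homomorphic: multiplying two ciphertexts gives a ciphertext of the product. The tiny primes and the function names are my own illustrative assumptions, not something taken from the survey, and unpadded RSA is neither secure nor fully homomorphic.

```python
# Toy illustration only: textbook RSA with tiny, insecure parameters.
# All names and values here are hypothetical choices for the example.
P, Q = 61, 53
N = P * Q                   # public modulus (3233)
PHI = (P - 1) * (Q - 1)     # Euler totient of N (3120)
E = 17                      # public exponent, coprime with PHI
D = pow(E, -1, PHI)         # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer smaller than N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(ciphertext, D, N)


def multiply_encrypted(c1: int, c2: int) -> int:
    """What the untrusted host does: multiply ciphertexts, never decrypting."""
    return (c1 * c2) % N


if __name__ == "__main__":
    c_a, c_b = encrypt(6), encrypt(7)      # client encrypts its inputs
    c_prod = multiply_encrypted(c_a, c_b)  # host works on ciphertexts only
    print(decrypt(c_prod))                 # client decrypts the product
```

The host never sees 6, 7 or their product in clear text; only the client, holding D, can read the result. The schemes covered in the survey extend this idea to richer sets of operations.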



Friday, October 06, 2017

[Links of the Day] 06/10/2017 : HPC routing topology on dependency graph, Arm network stack development, Multicore graph processing

  • Routing on the Channel Dependency Graph : the author aims to provide a toolset for topology calculation for HPC networks.
  • Arm for network stack developers : Arm is slowly trying to move up the stack into the data centre world. For that, it needs to address one of its main limitations: IO. This slide deck describes the current efforts to tackle network stack limitations (and support RDMA), as well as providing pointers to where Arm or devs can push the Linux stack further, as Linux is pretty much the only viable software stack for such hardware infrastructure in the data centre.
  • Multicore graph processing : this presents a very good overview of the multicore graph processing problem, the solution landscape and where it's heading. Graph problems are essential to solve in the domain of social network modelling as well as for item recommendation or search and website ranking. [slides]



Wednesday, October 04, 2017

[Links of the Day] 04/10/2017 : Tensor Flow Extended , 5d Torus , Machine learning influencing hardware designs

  • TFX : Google paper describing its TensorFlow-based platform for production-scale deployment. It is impressive how they created a platform that delivers robust TensorFlow-based learners and support for continuous training and serving with production-level reliability. [video]
  • 5d torus : when your 3d torus fabric is not enough :) . The authors demonstrate how, using 6-port NICs, they are able to achieve such a feat. Its domain of application is still limited to HPC environments, as only tailored applications can leverage such a network topology. However, there might be potential in the machine learning domain. There is only a future for such an approach if they are able to demonstrate significant acceleration for popular problems (ML) and make it fungible for cloud-like deployment, i.e. you need to be able to easily partition a set of nodes connected by such a network and share it in a multitenancy model. [slides] [thesis]
  • Machine Learning and the Implications for Computer System Design : Jeff Dean's talk at the Hot Chips 2017 conference. Jeff really shows the current influence of ML on present and future hardware architecture.





Thursday, September 14, 2017

[Links of the Day] 14/09/2017 : Automating Turing test, Deep Learning Survey, Unix History


  • Toward Automatic Turing Test : when software is used to detect whether it's software or a real person talking, it feels like it should be submitted to the totally-not-robots subreddit. The problem is way more complex and useful than it seems. By automating the procedure you could do fast prototyping and testing of models with limited human input, accelerating the research and reducing costs.
  • Survey of Deep Reinforcement Learning : the authors cover central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic.
  • Unix - History and Timeline : The history of Linux's grandfather OS. Surprisingly enough, the latest version 3 spec and ISO/IEC spec came out in 2003, which is only 14 years ago ( I feel old now.... )

Tuesday, September 12, 2017

[Links of the Day] 12/09/2017 : OpenFaaS serverless framework, Papers I like, Docker comparison tool

  • FaaS : Functions as a Service (OpenFaaS) is a serverless framework using Docker & Kubernetes. What I really like about this approach is that it simply relies on STDIN and STDOUT as the way of passing the event trigger to, and the output from, the serverless function. It allows great flexibility and opens up functionality that you wouldn't have with Lambda, for example, as Lambda constrains you behind the REST + API gateway model.
  • Papers I like : start of a really cool series ( 5 parts so far) of interesting fundamental papers. Must check out!
  • lstags : a practical little tool that allows you to compare local Docker images with the ones in a repository.
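
The STDIN/STDOUT contract mentioned in the FaaS entry is simple enough to sketch. Below is a minimal, hypothetical function in that style: it reads the whole event payload from standard input and writes its response to standard output. The word-counting behaviour and the names handle and main are assumptions made for illustration; this is not code from the OpenFaaS project.

```python
import sys


def handle(request_body: str) -> str:
    """Hypothetical function logic: report how many words the payload has."""
    words = request_body.split()
    return f"{len(words)} words\n"


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read the event from STDIN and write the result to STDOUT."""
    stdout.write(handle(stdin.read()))


if __name__ == "__main__":
    main()
```

Saved as a file (say handler.py, a name chosen for this example), it can be tried locally with a shell pipe such as echo "hello serverless world" | python handler.py. Any program in any language that behaves this way fits the same model, which is where the flexibility comes from.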


Thursday, September 07, 2017

[Links of the Day] 07/09/2017 : Platform for Partial Differential Equations solver, High perf serverless event & data processing, K8s serverless framework

  • FEniCS Project : a platform for solving partial differential equations. It's really cool as it allows you to rapidly solve and test PDE models using a minimal amount of code.
  • nuclio : High-Performance Serverless event and data processing framework
  • kubeless : Kubernetes Native Serverless Framework, I have a feeling that Mesos lost the container orchestrator war ...


Tuesday, September 05, 2017

[Links of the Day] 05/09/2017 : Patent Surviving Alice, Hot Cloud 17 conference papers

  • 7 Post-Alice Patent Cases That Survived 101 Rejections : the Alice US Supreme Court decision started a slaughter in the US patent office regarding IT-related patents: 8400 applications dropped and 60k+ rejected, while courts invalidated the vast majority of patents in litigation. However, it seems that there is a way to survive the onslaught, and it's quite simple. You just need your patent to satisfy the following criteria: novelty, enablement, non-obviousness, and last but not least usefulness. It seems like a no-brainer, but it seems that the USPTO allowed itself to be flooded by subpar applications that gamed the system. Not to mention that the agency also gained financially from such practices to some extent.
  • Hot cloud 17 : the HotCloud conference just finished; here is a selection of interesting papers:
    • JavaScript for extending low-latency in-memory key-value stores : Adrian Colyer takes a look at an in-memory JavaScript engine using RAMCloud. The RAMCloud project is entering the use case phase of the research project, eyeing commercialisation. Sadly, most solutions put forward are extremely niche. The risk is for great people to be stuck in a zombie startup if they try to run with it. Taken separately, the tech that came out of the RAMCloud project is amazing. However, the solution as a whole doesn't really have a great killer app or any potential beyond some niche market. [paper]
    • Towards Index-based Global Trading in Cloud Spot Markets : the authors propose to use an index-based prediction model rather than a per-spot-instance one in order to obtain greater reliability at lower cost.
    • DAL: A Locality-Optimizing Distributed Shared Memory System : a different take on the whole in-memory K/V system; the authors aggressively move the data to the computation rather than offering remote access. This allows great data reuse. We used something similar in Hecatonchire. However, there is a certain risk when you have a high level of churn or serial data access: local caching of data then generates a high level of eviction, effectively doubling the bandwidth usage.
    • Leader or Majority: Why have one when you can have both? : Raft is a great consensus protocol (and an easier one to understand). However, the over-reliance on the leader is the main bottleneck for scalability of operations. The authors (from cockroachdb) propose quorum-based read operations that alleviate the load on the leader while still retaining strong consistency. This allows them to improve write performance by 4x and increase throughput by 60%, which is quite impressive.
    • DCCast: Efficient Point to Multipoint Transfers Across Datacenters : the authors propose an efficient multipoint data transfer protocol allowing greater efficiency and bandwidth usage.


Tuesday, August 08, 2017

[Links of the Day] 08/08/2017 : Python image drawing & animations, Breaking x86 ISA, Correlation between NYT and Stock markets

  • pywonderland : A collection of python scripts for drawing beautiful figures or animating interesting algorithms in mathematics.
  • Breaking the x86 ISA : Black Hat conf presentation on the sandsifter tool for detecting hardware bugs or undocumented instructions in modern processors. Sadly, it's known that Intel's microcode binaries are encrypted and secured with an RSA2048-SHA256 signature, which makes any discovery a little bit useless. Unless it's a bug, and then it opens up a whole new world of possibilities. [Slides] [Github]
  • Correlations and Flow of Information between The New York Times and Stock Markets : Well, as everybody knows, there is a correlation between information and the market state. Nothing new, but it confirms to some extent that the New York Times especially has a specific influence on Wall Street (and vice versa).



Thursday, August 03, 2017

[Links of the Day] 03/08/2017 : NVMe over TCP , Lineage mapping of cryptocurrency , Perceptions of probability


  • NVMe Over TCP : interesting kernel module by Solarflare that allows people to use NVMe over TCP. It will be really interesting to see what type of performance you can get out of such a setup. Even if performance is significantly decreased (but still higher than other storage solutions), the economic gain vs costly NVMe solutions would make this worth it. Also, it can accelerate the arrival of a new type of high-performance, low-cost storage application by lowering the barrier to entry.
  • Map of Coins : Impressive lineage mapping of cryptocurrencies. But what is more concerning is the number of dead Bitcoin child cryptocurrencies; there are a lot of pump and dump schemes going on.
  • Perceptions : this really cool graphic shows how humans perceive probability and how fuzzy it can be. This can explain why a certain type of person might take more or less risk based on certain information, due to a different interpretation of the content.




Tuesday, August 01, 2017

[Links of the Day] 01/08/2017 : Os Image generation, K/V optimized for SSD, Command line cloud deployment tool


  • mkosi : Tool for generating OS images. This is nice if you need to create and maintain low-level OS images with EFI support and a GPT-based partition table. It is invaluable if you do bare metal / PXE-based deployment. [github]
  • WiscKey : Key/value store optimised for SSD systems. It aims especially at avoiding the write amplification of traditional LSM-based K/V systems.
  • Arc : plaintext manifest for provisioning and deploying a cloud infrastructure. It's quite a nice and elegant interface for such a tedious task. If you have used Chef, Puppet, CloudFormation or Terraform you know how verbose this can get.
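
As a rough illustration of how WiscKey attacks write amplification: the paper's core idea is to separate keys from values, so the LSM tree only holds keys plus small pointers while the values sit in an append-only log. The sketch below is a toy in-memory model of that idea under my own assumptions: a plain dict stands in for the LSM tree, a bytearray for the on-SSD value log, and the class and method names are hypothetical. Garbage collection of stale values is not modelled.

```python
class ToyKVSeparatedStore:
    """Toy model of key/value separation (not the WiscKey implementation)."""

    def __init__(self) -> None:
        # Append-only value log; a stand-in for a log file on the SSD.
        self.value_log = bytearray()
        # key -> (offset, length); a stand-in for the much smaller LSM tree.
        self.index = {}

    def put(self, key: str, value: bytes) -> None:
        """Append the value to the log and index only its location."""
        offset = len(self.value_log)
        self.value_log += value
        self.index[key] = (offset, len(value))

    def get(self, key: str):
        """Look up the pointer, then read the value from the log."""
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.value_log[offset:offset + length])

    def delete(self, key: str) -> None:
        """Drop the pointer; the stale value stays in the log until a GC pass."""
        self.index.pop(key, None)


if __name__ == "__main__":
    store = ToyKVSeparatedStore()
    store.put("user:1", b"alice")
    store.put("user:1", b"alice-v2")   # overwrite appends, never rewrites
    print(store.get("user:1"))
```

Because only the small index entries would ever be re-sorted and compacted, large values are written once instead of being rewritten at every compaction level.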




Thursday, July 27, 2017

[Links of the Day] 27/07/2017 : Russian Information Warfare, Public Cloud Economics, Impact of slowness in ultra fast networks

  • Russian Information Warfare Handbook : In-depth description of the Russian information war machine and how it affects the rest of the world. There are some suggestions on how to fight it; however, I feel that the internet and the current social network model are heavily biased in favour of propaganda exploitation. Moreover, with the advance of machine learning, it becomes easy to identify key nodes in the network to attack in order to achieve maximum disinformation reach.
  • Usage Patterns and the Economics of the Public Cloud : paper looking at current cloud economics and cloud customers. It seems that most customers have steady or mildly varying usage, with the obvious outliers (probably academia or other heavy HPC / machine learning jobs).
  • To slow or not? : the authors look at the impact of slowdowns (hardware errors, software bugs, or simply the laws of physics) in next-gen networks on data centres and applications when we start to reach single-digit microsecond latency. There is a wide range of implications, from technical to legal as well as ethical. Sadly, most of them tend to stay unanswered until a problem or conflict arises.




Tuesday, July 25, 2017

[Links of the Day] 25/07/2017 : Universal Scalability Law Model , k8s + cloud native apps, python distributed execution engine

  • usl4j : Implementation of the Universal Scalability Law model. This is really cool because it allows you to model (and predict) when your system's performance will start to degrade as it scales. It allows you to take real measurements from a live system and continuously build models. [github]
  • draft : a tool for developers to create cloud-native applications on Kubernetes.
  • ray : distributed execution engine written in Python. Useful if you want to execute and schedule tasks across a cluster of nodes.
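
For readers who have not met the Universal Scalability Law mentioned in the usl4j entry, the model itself fits in a few lines. Throughput at concurrency N is X(N) = λN / (1 + σ(N − 1) + κN(N − 1)), where λ is the throughput of a single worker, σ the contention penalty and κ the coherency (crosstalk) penalty. The sketch below only evaluates the formula with made-up coefficients; it is not the usl4j API, and fitting the coefficients from real measurements (which is what usl4j does) is left out.

```python
import math


def usl_throughput(n: float, lam: float, sigma: float, kappa: float) -> float:
    """Predicted throughput at concurrency n under the USL."""
    return lam * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_concurrency(sigma: float, kappa: float) -> float:
    """Concurrency at which throughput peaks: sqrt((1 - sigma) / kappa)."""
    return math.sqrt((1 - sigma) / kappa)


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to show the shape of the curve.
    lam, sigma, kappa = 100.0, 0.05, 0.001
    for n in (1, 8, 16, 31, 64, 100):
        print(n, round(usl_throughput(n, lam, sigma, kappa), 1))
    print("peak at about", round(peak_concurrency(sigma, kappa), 1), "workers")
```

Past the peak, adding workers makes total throughput go down, which is exactly the degradation point the model is meant to predict.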

Friday, July 21, 2017

[Links of the Day] 21/07/2017 : Tofu HPC interconnect challenges, Satellite Quantum key distribution, Go K/V store

  • Evolution and challenges of Tofu Interconnect : Deep learning and heterogeneous hardware are putting a strain on HPC interconnects. They need to adapt to new application communication models as well as new hardware while retaining best-of-breed capability for "traditional" HPC applications.
  • Satellite-to-ground quantum key distribution : this is groundbreaking work from Chinese scientists, who demonstrate that they are able to solve the main hurdle behind planet-wide quantum communication by leveraging satellite-to-ground quantum crypto key distribution. And they are deploying a proof of concept!
  • Badger : nice key/value store written in Go. It's based on the WiscKey paper and is heavily optimised for SSD. It's 3.5 times faster than RocksDB. [github]


Tuesday, July 18, 2017

[Links of the Day] 18/07/2017 : Banking API, how Cooperation strategy evolve, Distributing file system images

  • Teller : API for your bank account, already supporting a couple of UK banks. This is nice; however, I wonder how banks will react. Also, the EU is forcing banks to open their APIs, but the UK is leaving the EU, and there is a chance that the UK banking system will try to seize the chance to create its own independent banking API, creating more barriers to entry for fintech startups. Anyway, this is rather cool; however, I am still disappointed that most banks do not expose an API to access your data.
  • How Cooperation Evolves : the authors looked at evolution as a thermodynamic process and found how cooperation strategies evolve and how they can be manipulated.
  • Casync : A tool for distributing file system images. Really cool if you have to update images often and want a solution that is cheap in terms of traffic and storage.


Thursday, July 13, 2017

[Links of the Day] 13/07/2017 : Jack of all trade Deep learning Model, Pay with Group Selfie, Support Vector Machine

  • One Model To Learn Them All : the authors propose a model that is good enough for most needs, kind of a jack of all trades/master of none model for deep learning. This can be really practical for experimenting and probing datasets for potential use, or if you do not have the time to create the ideal model. However, there is always the risk of ending up with a sub-performing solution.
  • Pay-with-a-Selfie : interesting payment model where the authors propose to use group selfies to extend the split banknote metaphor for executing financial transactions.
  • Introduction to Support Vector Machines : if you want to learn about the SVM classification systems.

Tuesday, July 11, 2017

[Links of the Day] 11/07/2017 : Chip Hall of Fame, AMD Software optimization guide, Pocket negotiator


Thursday, July 06, 2017

[Links of the Day] 06/07/2017 : venture capital investment framework, Aftershock in complex systems, ISC workshop

  • Picking Winners : the authors propose a framework for venture capital investment. However, I feel that there is a major flaw in the approach, as it is designed to correlate previous success with future investment success. This assumes a high degree of repeatability with clearly identifiable elements. The model can suffer from the purple cow effect, where higher growth opportunities are located in underserved segments. As a result, by following the model you might quite often suffer from diminishing returns.
  • Aftershocks in a complex system : the authors look at the behaviour of a complex system (the currency market) after a catastrophic event. They discovered that such systems often follow a pattern similar to the one seen in earthquake aftershocks. By extension, after an initial catastrophic event, most systems suffer a series of gradually diminishing aftershocks spaced out in time.
  • ISC Workshop 2017 : all about containers and optimisation for high-performance workloads.


Tuesday, July 04, 2017

[Links of the Day] 04/07/2017 : Classic Papers, Operating Model Canvas, Origami anything

  • Classic Papers : a collection of highly-cited papers in their area of research that have stood the test of time. For each area, we list the ten most-cited articles that were published ten years earlier.
  • Operating Model Canvas : a set of tools and models to help align operations and organisation with strategy.
  • Origami anything : New algorithm generates practical paper-folding patterns to produce any 3-D structure.






Thursday, June 29, 2017

[Links of the Day] 29/06/2017 : BeeGFS distributed FS, Virtual memory in Big memory systems, PdfX

  • An Introduction to BeeGFS : Fraunhofer's distributed parallel file system for HPC systems. Mainly a competitor to Lustre, I would say. This has potential, and they are making a foray into the business side of storage. Let's see how well they fare. [website]
  • Preserving the Virtual Memory Abstraction : the author's work aims at maintaining the virtual memory abstraction across a variety of hardware implementations. [thesis]
  • PDFx : really cool tool that allows you to extract all the references and metadata from a PDF and download them!!


Tuesday, June 27, 2017

[Links of the Day] 27/06/2017 : Blockchain trust & authentication for IoT, K8s patterns, Ripple cryptocurrency Network analysis

  • Kubernetes Production Patterns and anti-patterns : a lot of common sense; actually, a lot of the patterns and anti-patterns can be applied to other environments. But still a good refresher.
  • Blockchain based trust & authentication for decentralized sensor networks : using blockchain to solve the trust issue in a swarm of IoT devices on a network. The critical bit missing is the power requirement for running all the crypto operations.
  • Large-Scale Analysis of the Ripple Cryptocurrency Network : an overview of the paper analysing the Ripple p2p blockchain-based money transaction network. Turns out it suffers from the same issue as "old school" p2p networks: take out the highly connected nodes and you can bring down / split the network. Nothing new, but still a good read and a reminder that small networks tend to be resilient to attack, but their resiliency diminishes with increased reliance on a small number of highly connected nodes.
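
The "take out the highly connected nodes" point from the Ripple entry is easy to reproduce on a toy graph. The sketch below builds a purely hypothetical hub-and-spoke network (it is not Ripple data, and the function names are mine) and counts connected components after removing either a leaf or the hub.

```python
from collections import deque


def connected_components(adjacency: dict, removed=frozenset()) -> int:
    """Count connected components, ignoring the nodes listed in removed."""
    seen = set(removed)
    count = 0
    for start in adjacency:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:                      # breadth-first search from start
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return count


def star_network(leaves: int) -> dict:
    """Hypothetical topology: one hub connected to every leaf node."""
    adjacency = {"hub": set()}
    for i in range(leaves):
        leaf = f"leaf{i}"
        adjacency["hub"].add(leaf)
        adjacency[leaf] = {"hub"}
    return adjacency


if __name__ == "__main__":
    net = star_network(5)
    print(connected_components(net))                     # intact network
    print(connected_components(net, removed={"leaf0"}))  # lose a random leaf
    print(connected_components(net, removed={"hub"}))    # lose the hub
```

Losing a leaf leaves the network in one piece, while losing the hub shatters it into isolated nodes; real networks sit somewhere between these two extremes.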


Thursday, June 22, 2017

[Links of the Day] 22/06/2017 : Modern Web Dev spell-book, .Net microservices framework, Optimizing Rust

  • Spellbook of Modern Web Dev : like an awesome list for JavaScript development but with more thought and structure. Must read for front end JavaScript devs.
  • Microdot : not a .NET guy, but here is a framework for easily creating .NET microservices with Orleans.
  • Rust Optimization : a lot of this can be applied to any language, but there are some nuggets of information specific to Rust in this document.

Tuesday, June 20, 2017

[Links of the Day] 20/06/2017 : Statistics lectures notes, HA transactions, Capability Models for Manycore Memory Systems


  • Lectures on Statistics : 2003 lecture notes on statistics; they pretty much cover all the basics of what you need to know about stats.
  • Highly Available Transactions : the authors look at the state of database transaction systems and, well, like any good scientists, their conclusion is that there is more research to be done. But more seriously, highly available transactions and systems need to be perfected, and new semantics with hybrid systems need to be developed in order to ensure the availability of transactions.
  • Capability Models for Manycore Memory Systems : Programming for manycore systems is hard; optimising them is even harder. In this paper, the authors try to design models that help programmers deliver efficient software for this type of hardware architecture. [slides]


Friday, June 16, 2017

Maxims of Maximally Effective Startup Developer


  1. Test, then Deploy
  2. A coding Developer outranks an Architect who doesn't know what's going on
  3. If the food is good enough, the devs will stop complaining about the incoming workload
  4. Only you can prevent prod failure. 
  5. If testing wasn’t your last resort, you failed to resort to enough of it.
  6. The longer everything goes according to sprint planning, the bigger the impending disaster. 
  7. The world is richer when you turn competitors into partners, but that's not the same as you being richer.
  8. Give a developer a task, he will code for a day. Take his software away and tell him he's lucky just to be paid, and he'll figure out how to code another one for you to take tomorrow. 
  9. "Deploy and Forget" is fine, provided you never actually forget.
  10. Don't be afraid to be the first to resort to rollback.
  11. The competitor of my competitor is my competitor's competitor. No more. No less.
  12. There is no 'over testing.' There is only 'Continuous Integration' and 'I need to spawn more Jenkins slaves.'
  13. Just because a feature is easy for you, it can still be hard for your clients.
  14. There is a difference between a spare feature and extra [feature].
  15. Not all good news is competitor action. 
  16. “Do you have a backup?” means “I can’t fix this.” 
  17. The size of a developer's startup stock options is inversely proportional to the likelihood of the startup surviving long enough to collect them
  18. Don’t try to save money by conserving lines of code.
  19. Don't expect the competition to cooperate in the creation of your dream startup
  20. If it ain't broke, it hasn't been deployed to prod yet.
  21. The dev team you've got is never the dev team you want.
  22. The product management guideline you've got is never the guideline you want
  23. The best way to win a one-on-one architecture design is to be the third to arrive.
  24. It's only too many features if only the devs use them
  25. Don't bring big VMs into small servers
  26. Management knows how to do it by knowing who can code it 
  27. Failure is not an option - it is mandatory. The option is whether or not to let failure be the last thing you test.



Borrowed from the Schlock Mercenary comic

Thursday, June 15, 2017

[Links of the Day] 15/06/2017 : Corporate Wargaming, ScatterText , Forecasting and BigData

  • Scattertext : Nice and easy-to-use tool for finding independent terms in corpora and presenting them in an attractive way via an interactive scatter plot.
  • Forecasting in the light of Big Data : the authors look at how forecasting is changing as new tools and data collection capabilities emerge. And, as often, the authors suggest that the best approach would be to combine models and quantitative analysis in order to obtain the best forecasting strategy. However, I am afraid that this would require a new framework and also training data scientists to leverage these two opposite methodologies correctly.
  • Competitive Wargaming and Simulations for Business Forecasting & Analytics : slide deck providing good insight into how wargaming can help shape the decision process and strategy of a company. However, like any tool, it's as much about the preparation and how you leverage the outcome as about the game itself.


Wednesday, June 14, 2017

[Links of the Day] 14/06/2017 : Formally proven HTTPS replacement, Kisrhombille geometry, public key cryptosystem using Mersenne Numbers

  • Everest: Towards a Verified, Drop-in Replacement of HTTPS : the authors (a team from Microsoft, MIT, INRIA) propose a complete, verified replacement of TLS and other components of HTTPS. Entirely written in F* for provability, Everest is then compiled into a low-level language. This is a highly praiseworthy solution. However, there is still a great portion of the dev world that does not completely embrace or understand formal verification. And until the big corporations (Google, Microsoft, AWS, etc.) start pushing such libraries, adoption will remain marginal.
  • Public-Key Cryptosystem via Mersenne Numbers : an interesting new approach to delivering a new public key cryptosystem, for crypto buffs only.
  • Kisrhombille geometry : tessellation of the plane using rhombic faces divided at a centre point into four triangles. While Voronoi tessellation still tends to have my preference, this type of tessellation has high potential and, like all of them, it is really pretty :)



Tuesday, June 13, 2017

[Links of the Day] 13/06/2017 : Next Gen Fabric Comparison, Marginal Revolution Books, AWS awless CLI

  • Marginal Revolution Books : A browsable database of books discussed on Marginal Revolution, sorted by the month of their posting.
  • CCIX, GEN-Z, OpenCAPI : Overview and comparison of the different next-generation fabrics. This presentation shows the key differences and use cases associated with each fabric. CCIX mainly focuses on low-latency main memory expansion with its hardware cache coherence. It enables accelerators, network devices, main processors, etc. to work on the same coherent view of a dataset. Gen-Z is really all about hardware/component disaggregation. I tend to prefer this approach, as I feel this will be the next step in HW efficiency. OpenCAPI is a little bit like CCIX, but with the hope of extending beyond it to rack level in the future.
  • awless : fast, powerful and easy-to-use command line interface (CLI) to manage Amazon Web Services. This is a nicer version of the AWS CLI. Also, it is WAY more human-friendly. Check it out.

Thursday, June 08, 2017

[Links of the Day] 08/06/2017 : Machine Learning Tuning DBMS, Direct SSD to GPU SQL and Large Graph DB processing

  • Tuning DBMS with Machine Learning : From the people behind Peloton, who demonstrate a way to automatically tune a DB using machine learning. This is rather interesting; however, there is a key element missing in the approach: cost. Your DB system can become highly optimised but your AWS costs can skyrocket too. What you need is a system that automatically tunes the perf & cost tradeoff to maximise ROI. Sometimes being a little bit slower can save $$.
  • MOSAIC : A more heterogeneous approach: a graph processing engine that exploits all the hardware resources available in a standard Xeon host processor, Xeon Phi coprocessors, NVMe, and a fast interconnect. Because fast processing of your Facebook social network for fast advertisement targeting is worth it :) [slides]
  • PG-Strom : Bypassing the CPU for SQL operations by allowing direct SSD-to-GPU communication for PostgreSQL processing. We are slowly entering the age of heterogeneous computing systems where the core CPU gets relegated to highly generic tasks. [slides]


Tuesday, June 06, 2017

[Links of the Day] 06/06/2017 : Secure Machine Learning, Quantum secured blockchain and Survey of Machine Learning in Hardware

  • DeepSecure : a framework that enables scalable execution of state-of-the-art Deep Learning models in a privacy-preserving setting. The authors propose a system that enables the data owner and the model owner to maintain segregation of information while allowing them to work together without data leaking between the two parties.
  • Quantum-secured blockchain : The authors propose in this paper a quantum blockchain architecture specifically designed to solve the post-quantum computer cryptographic weakness of currently used crypto algorithms in Bitcoin and other blockchain frameworks. However, it seems that they conveniently ignore newer cryptographic solutions that are "quantum resistant".
  • Survey of Neuromorphic Computing and Neural Networks in Hardware : heterogeneous hardware solutions are becoming the norm as classic CPUs are not able to handle the bandwidth and processing power required. Seriously, how can an Intel or AMD CPU process 1 Tb/s of bandwidth... Anyway, as machine learning is reaching peak hype, the hardware that comes to accelerate it is getting more mainstream and diverse. This paper provides a good overview of the various techniques and hardware used in the field. Moreover, it references an exhaustive collection of papers in the field.



Thursday, June 01, 2017

[Links of the Day] 01/06/2017 : Istio microservice mesh , encrypted p2p network, formally proven system still vulnerable to bugs

  • Istio : this is a fantastic project; it allows efficient delivery of microservice infrastructure without tying the developer to a language-specific framework. It relies on a data plane, using the Envoy proxy to manage and mediate all communication, as well as a control plane for managing and enforcing proxy policies.
  • CJDNS : encrypted IPv6 network using public-key cryptography for address allocation and a distributed hash table for routing. This provides near-zero-configuration networking, and prevents many of the security and scalability issues that plague existing networks. [github]
  • An Empirical Study on the Correctness of Formally Verified Distributed Systems : Spoiler alert, even formally verified projects can fall prey to bugs. And it seems that these bugs can seriously affect real systems in the wild. Well, it looks like the human in the loop is the weakest link in the production of robust systems after all.
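The "public-key cryptography for address allocation" part of the CJDNS item above can be sketched in a few lines. As far as I understand the cjdns whitepaper, a node's IPv6 address is the first 16 bytes of the SHA-512 of the SHA-512 of its public key, and only addresses starting with the byte 0xfc are accepted. Treat this as an illustration under that assumption: the helper names are hypothetical, and the counter-based "keys" are stand-ins, not real Curve25519 public keys.

```python
import hashlib
import ipaddress


def address_from_public_key(public_key: bytes) -> ipaddress.IPv6Address:
    """Derive an IPv6 address as the first 16 bytes of SHA-512(SHA-512(key))."""
    inner = hashlib.sha512(public_key).digest()
    outer = hashlib.sha512(inner).digest()
    return ipaddress.IPv6Address(outer[:16])


def is_valid_address(address: ipaddress.IPv6Address) -> bool:
    """Only addresses in fc00::/8 (first byte 0xfc) are considered usable."""
    return address.packed[0] == 0xFC


def find_key(start: int = 0):
    """Try stand-in keys until one hashes to a usable address.

    A real node would generate fresh key pairs instead of counting;
    about one candidate in 256 is expected to pass the prefix check.
    """
    counter = start
    while True:
        candidate = counter.to_bytes(32, "big")  # stand-in for a public key
        address = address_from_public_key(candidate)
        if is_valid_address(address):
            return candidate, address
        counter += 1


if __name__ == "__main__":
    key, address = find_key()
    print(key.hex(), address)
```

Because the address is a hash of the key, nobody has to hand out addresses, and a peer can check that an address really belongs to the key it is talking to.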

Tuesday, May 30, 2017

[Links of the day] - 30/05/2017 : Next Gen File system / Machine learning Infrastructure & stealth denial of service in public cloud

  • TFS: Some interesting concepts behind TFS, a next-generation file system. The design goals are ambitious & well thought out. Machine learning for caching can make sense, but I would still prefer to see how it copes in real life over a long period of time, especially when there are historical access patterns that span days to months.
  • Stanford DAWN Project : While I know some cool tech will come out of it, public cloud providers will probably shape the future of machine learning infrastructure at a faster pace & with shorter iterations.
  • Bolt : this is really cool, the authors of the paper are pretty much pushing the noisy neighbour to 11. They propose a technique that offers stealth denial of service on public cloud infrastructure. 


Tuesday, May 23, 2017

[Links of the Day] - 23/05/2017 : Intel Manufacturing Forecast, Topological Quantum Computing and Serverless Conf Video

  • Intel Manufacturing Conf : a set of slide decks giving a peek into Intel's manufacturing process and the upcoming wave of 10 nm chips. It seems Intel is currently keeping up with Moore's law, not only by reducing the transistor size but also by increasing transistor density.
  • Introduction to Topological Quantum Computation : introduces the concept of quantum computing with anyons, which allow a more resilient quantum computing system.
  • Serverless Conf video : almost all videos are now available. Check out the Serverless at Nordstrom video. This is, in my view, the best of the bunch. It's an actual practical talk by the dev who implemented it, without marketing spin.


Tuesday, May 16, 2017

[Links of the Day] 16/05/2017 : Exascale Project, Storage as Stream , ServerlessConf



Thursday, May 11, 2017

[Links of the Day] 11/05/2017 : Google - Push on green , Tensor flow in Datacenter , TCP congestion protocol

  • In-Datacenter Performance Analysis of a Tensor Processing Unit : How custom deep learning hardware behaves in a real datacenter, and the implications and gains associated with the use of such a custom solution.
  • Push on Green : Great article on Google's rollout policy and process. A lot of common sense, and also some less common but equally important advice. This is a great read for anybody involved in software delivery, especially if you are aiming at having an efficient CI/CD system.
  • BBR : Google's congestion control algorithm for maximising bandwidth usage. It's a new algorithm to fight buffer-bloat at the TCP level. Since the majority of internet traffic is TCP, wide adoption would bring a big improvement. Note that it only affects outgoing packets.
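For the BBR item above, the core idea as I understand it is that BBR models the path with two numbers, the bottleneck bandwidth and the round-trip propagation time, and tries to keep roughly one bandwidth-delay product (BDP) of data in flight instead of filling buffers until packets drop. Below is a minimal sketch of that estimate, not the real implementation: the function name and the sample values are made up, and the real algorithm uses sliding time windows, pacing gains and probing phases that are left out here.

```python
def bandwidth_delay_product(delivery_rates_bps, rtts_s):
    """Estimate the BDP in bytes from recent samples.

    delivery_rates_bps: recent delivery-rate samples, in bits per second
    rtts_s: recent round-trip time samples, in seconds
    """
    if not delivery_rates_bps or not rtts_s:
        raise ValueError("need at least one sample of each kind")
    bottleneck_bw = max(delivery_rates_bps)   # best rate seen ~ bottleneck bandwidth
    min_rtt = min(rtts_s)                     # smallest RTT seen ~ propagation delay
    return bottleneck_bw * min_rtt / 8        # bits to bytes


if __name__ == "__main__":
    # Made-up samples: about 100 Mb/s bottleneck, 40 ms base RTT,
    # plus one inflated RTT caused by queueing.
    rates = [80e6, 100e6, 95e6]
    rtts = [0.045, 0.040, 0.120]
    print(bandwidth_delay_product(rates, rtts))
```

Taking the minimum RTT rather than the average is what keeps a bloated queue (the 120 ms sample) from inflating the estimate.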

Tuesday, May 09, 2017

[Links of the Day] 09/05/2017 : Serverless Design Patterns, Paper Reading How-to, HPE's The Machine

  • Serverless Design Patterns : Basic serverless patterns, nothing fancy but useful to keep in mind.
  • How to Read a Paper : the short version is quite simple: abstract -> conclusion -> figures, then start iterating through the paper, one layer at a time.
  • Billion node graph inference: iterative processing on The Machine : I still believe that The Machine is vaporware. It has been promised for years. It is still at the stage of a nice plastic prototype, and everybody developing or designing for it uses a simulator/emulator. But hey, it's nice to see what you can theoretically achieve on "The Machine" even if it's a Superdome X in reality.

Friday, April 28, 2017

[Links of the Day] 28/04/2017 : Bitcoin Antbleed , Social Networks Rumors, HPC & AI trends

  • Increasing the Flow of Rumors in Social Networks by Spreading Groups : It looks like rumours flow more easily in social networks when groups are fragmented. To a certain extent, this mimics real life: once groups are isolated and fragmented, it becomes easier to spread gossip because individuals in each group have a hard time checking the validity of the information within their neighbourhood.
  • HPC & AI Technology Trends : Dr Eng Lim Goh of HPE talks about the trends in HPC and AI.
  • Antbleed : Apparently, BITMAIN, the ASIC system provider for up to 70% of bitcoin miners, embedded a backdoor that can remotely disable or compromise its hardware. The funny aspect is that it can potentially allow the company to pass 51% control of the bitcoin mining network, and hence allow it to rewrite the blockchain. The 51% threshold has always been considered a theoretical threat that was not attainable in real circumstances. Well, guess what, it's not theoretical anymore.
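Why the 51% threshold in the Antbleed item above matters can be shown with the simplified catch-up model from the Bitcoin whitepaper: an attacker holding a share q of the hash power who is z blocks behind the honest chain eventually catches up with probability (q/p)^z, where p = 1 - q, and with certainty once q >= p. The sketch below only encodes that formula; the function name is mine, and the model ignores network latency, difficulty adjustment and everything else that happens in real life.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Probability that an attacker ever catches up from `blocks_behind` blocks.

    attacker_share: fraction of total hash power held by the attacker (0..1)
    blocks_behind: how many blocks the attacker's chain currently lags
    """
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    if blocks_behind < 0:
        raise ValueError("blocks_behind must be non-negative")
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with half the hash power or more, catching up is certain
    return (q / p) ** blocks_behind


if __name__ == "__main__":
    for share in (0.10, 0.30, 0.45, 0.51):
        print(share, catch_up_probability(share, blocks_behind=6))
```

Below half of the hash power, the odds shrink exponentially with every confirmation; at or above it, waiting for more confirmations no longer helps.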

Wednesday, April 26, 2017

[Links of the Day] 26/04/2017 : Aphyr Scala Day, Sia blockchain file storage , How brains are built

  • Aphyr Scala Day 17 : Aphyr breaks databases for a living and then talks about it :)
  • Sia : a blockchain-based marketplace for file storage. The really attractive thing is the cost comparison of Sia vs public cloud systems: it comes out at between a tenth and a hundredth of the price of S3 or other similar solutions. I would be curious to see the performance though.
  • How brains are built: High-level overview of principles of computational neuroscience.




Monday, April 24, 2017

[Links of the Day] 24/04/2017 : Data Center Perf Index, Physical Limits of Computing And DNS as Code


  • Data Center Performance Index : A performance index aiming at providing a reliable idea of the performance and efficiency of a datacenter. It primarily focuses on availability, efficiency and environmental impact. This effort is led by Dean Nelson, Uber's head of computing.
  • Physical Limits of Computing : A look at the limitations of compute from a physicist's point of view. It seems that some limitations are fundamental and require a new and different approach in order to create compute devices that get around them.
  • DnsControl : a system built to manage DNS systems. To some extent, it looks like a Terraform for DNS where you can plug in multiple DNS backend providers. This allows you to deploy and distribute your DNS infra in an agnostic way across multiple cloud providers. [github]

Friday, April 21, 2017

[Links of the Day] 21/04/2017 : HPC 2017 trends, Docker cheat-sheet, Incident response best practice

  • Current Trends in High-Performance Computing and Challenges : Jack Dongarra's annual HPC review. It's amazing how the Chinese have progressed. They literally took over the Top 500 in less than ten years. And now they dominate using homegrown chips and network fabric. [slides]
  • Docker cheatsheet : 'nough said.
  • Increment - On-call : New magazine providing articles on how to scale companies. Each edition focuses on a different topic. For the inaugural issue, they focus on industry best practices around on-call and incident response.


Wednesday, April 19, 2017

[Links of the Day] 19/04/2017 : AMD ROCm GPU open platform, Weak Memory Models concurrency report, SSH server for distributed infrastructure

  • ROCm : this slide deck gives an overview of the AMD ROCm open platform for GPU computing. They are really pushing to become the open source standard for the GPU industry, battling against NVIDIA's supremacy in the domain. It looks like they are making really good progress, and I would be curious to see how this progresses when combined with their Ryzen CPUs.
  • Concurrency with Weak Memory Models : this is a really good report on the state of memory models in hardware and software. It provides a wide-spectrum overview of hardware and software concurrency models and approaches, as well as the future direction of the domain.
  • Teleport 2 : a modern SSH server designed for teams managing distributed infrastructure. [github]


Monday, April 17, 2017

[Links of the Day] 17/04/2017 : Pedis Redis Clone, Serverless framework, Deep learning best practices

  • Pedis : Redis-compatible NoSQL datastore using the Seastar framework. It's interesting to see that Redis and Pedis are on par in the single-thread benchmark, while Redis gets smoked in the 8-thread benchmark. On a side note, the author should probably have chosen another name for the project.
  • serverless : the Serverless Framework, for building serverless architectures on AWS Lambda, Azure Functions and Google Cloud Functions. [github]
  • Best Practices for Applying Deep Learning to Novel Applications : this is pretty much a must-read for machine learning experts using deep learning. This report decomposes a deep learning project into phases and provides best practices for each phase.


Friday, April 14, 2017

[Links of the Day] 14/04/2017 : OpenFabric Workshop , Docker's Containerd , Category Theory

  • OpenFabrics Workshop 2017 : Some interesting talks this year at the OpenFabrics conference:
    • uRDMA : Userspace RDMA using DPDK. This opens up a number of possibilities, especially for object storage solutions. [Video , Slides, github]
    • Crail : Uses uRDMA (above) to deliver accelerated storage solutions for Apache big data projects. [Slides, github]
    • Remote Persistent Memory: I think this is the next killer app for RDMA, provided Intel doesn't jump onto it and deliver a DPDK-like solution. [Video, Slides]
    • On Demand paging: slowly the tech is crawling its way up to upstream acceptance. While on-demand paging introduces a certain performance cost, it also allows greater flexibility in consuming RDMA. One interesting aspect that nobody has mentioned yet is how this feature could be used with persistent memory. I think there is good potential for p2p NVM storage solutions. [Video, Slides]
  • Containerd : Containerd moves to GitHub; Docker's "industry standard" container runtime is also reaching its v0.2.x release. [github]
  • Category Theory : If you are into functional programming and Haskell, this book is a must-read.

Wednesday, April 12, 2017

[Links of the Day] 12/04/2017: Linux Perf tools, libp2p and Contagion of Information in Social Media

  • Perf Tools : miscellaneous collection of in-development and unsupported performance analysis tools for Linux ftrace and perf_events.
  • Contagion of Information in Social Media : The authors look at how information spreads on social media (Twitter). They model contagion behaviour in the hope of creating effective defences against "fake news" and other propaganda. However, to some extent the research can also be used to optimise the spread of such malicious information.
  • libp2p : really cool network stack (used by IPFS) that tackles a lot of the nitty-gritty details of p2p applications. It should allow devs to focus on the actual value of their p2p apps rather than the underlying technical problems of p2p itself. [github]



Monday, April 10, 2017

[Links of the Day] 10/04/2017 : Loopy , Distributed execution engine and Yet another distributed ledger algo

  • Loopy : Fantastic tool for explaining and describing complex system interactions. It's easy to use and even easier to get the message across. [github]
  • Ray : experimental distributed execution engine replicating code across multiple workers. Written in Python, it leverages an object store and distributed task execution to achieve parallelism. I really wonder if it wouldn't have been better to code Ray using AWS Lambda and S3.
  • Algorand : yet another distributed ledger. The approach proposes to eliminate the segregation of actors in the ledger system: no more miners and users, everybody is an equal participant. Moreover, it relies on a new form of Byzantine agreement (I need a TLA+ proof to really feel comfortable with that) and a cryptographic selection algorithm for choosing the leader (verifier) of the ledger process execution. This is a rather interesting paper and I will try to produce a short summary of it if I find the time.


Friday, April 07, 2017

[Links of the Day] 07/04/2017 : TensorFlow Example, Systems in the microseconds era and blockchain distributed direct democracy


  • Naked Tensor : bare-bones Google TensorFlow example.
  • Attack of the Killer Microseconds : Hardware (especially storage) is entering the microsecond or sub-microsecond era. This has far- and wide-ranging implications, and system designers need to rethink the existing stack that was designed for the millisecond era. It looks like we are entering an era where the software stack is now the bottleneck.
  • Cicada : Distributed secure proof-of-work blockchain combined with a privacy-guaranteed ID system. The creators of Cicada aim at enabling distributed direct democracy and a decentralised application platform. These are worthy goals; however, the creators forgot that direct democracy tends to fail, as the majority of the population is not interested in, or knowledgeable enough about, the problems they will be asked to vote on.



Wednesday, April 05, 2017

[Links of the Day] 05/04/2017 : Large network resilience, Distributed Systems, Machine Learning & Bayesian reasoning





Monday, April 03, 2017

[Links of the Day] 03/04/2017 : Conway's Game of life Clock, Human-Bot social interaction, SQL time series DB


  • Digital clock in Conway's Game of Life : I can't even start to comprehend how you can design this. But this is beyond cool.
  • Online Human-Bot Interactions: Detection, Estimation, and Characterization : An analysis of bots on social networks (Twitter). I think we need a reverse Turing test, where a robot can detect when it is talking to a human.... A reverse captcha to stop those pesky meat-bags from meddling in our robotic overlords' affairs.
  • Timescale : SQL-compatible time series database. Another competitor for InfluxDB. Let's just say that the clustering feature will make or break it, as InfluxDB has some serious issues there. [github]
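What makes the Game of Life clock above so mind-bending is how little there is to build on. The whole rule set fits in one function: a live cell with two or three live neighbours survives, a dead cell with exactly three live neighbours is born, and everything else dies. Here is a minimal sketch on an unbounded grid; the `step` name and the set-of-coordinates representation are my own choices, not anything taken from the clock design.

```python
from collections import Counter


def step(live_cells):
    """Compute the next generation from a set of live (x, y) cells."""
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


if __name__ == "__main__":
    blinker = {(0, 1), (1, 1), (2, 1)}   # horizontal bar of three cells
    print(sorted(step(blinker)))          # flips to a vertical bar
    print(step(step(blinker)) == blinker)
```

The blinker is the simplest oscillator: it flips between a horizontal and a vertical bar every generation. The clock is "just" a few thousand such interacting pieces.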