Tuesday, March 31, 2020

[Links of the Day] 31/03/2020 : Quantum Computing course, Risk quantifying library, Google time windowed availability metric

  • Meaningful Availability : in this paper, Google folks propose a different interpretation of availability through a new metric called "windowed user-uptime". The metric measures user-perceived uptime and computes availability over many windows in order to distinguish transient outages from long periods of unavailability.
  • riskquant :  Netflix Python library for quantifying risk 
  • Quantum Computation Course : approachable quantum computation course. 
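
To make the "windowed user-uptime" idea from the first link a bit more concrete, here is a minimal sketch in Python. It is my own simplified reading, not the paper's implementation: I assume time is already split into fixed buckets (say, one per minute), that each bucket carries a count of "up" and "down" user-minutes, and that the number reported for a window size is the availability of the worst window of that size. The function name and the sample data are hypothetical.

```python
from typing import Sequence


def worst_window_availability(
    up: Sequence[float], down: Sequence[float], window: int
) -> float:
    """Lowest availability over all contiguous windows of `window` buckets.

    up[i] / down[i] are the user-minutes of uptime / downtime in bucket i.
    Windows with no user activity at all are skipped.
    """
    if len(up) != len(down):
        raise ValueError("up and down must have the same length")
    if window <= 0 or window > len(up):
        raise ValueError("window must be between 1 and the number of buckets")

    worst = 1.0
    for start in range(len(up) - window + 1):
        u = sum(up[start:start + window])
        d = sum(down[start:start + window])
        if u + d > 0:
            worst = min(worst, u / (u + d))
    return worst


# Hypothetical data: six buckets, one of them a complete outage.
up_minutes = [10, 10, 0, 10, 10, 10]
down_minutes = [0, 0, 10, 0, 0, 0]

for size in (1, 2, 6):
    print(size, worst_window_availability(up_minutes, down_minutes, size))
```

A short outage drags the small windows down to zero but barely dents the large ones, while a long outage hurts every window size. That spread is what lets you tell a transient blip from a sustained problem.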

Thursday, March 26, 2020

[Links of the Day] 26/03/2020 : Golang distributed in memory key value store, Datasciences github repo trove, Developer Road-maps

  • olric : Distributed cache and in-memory key/value data store. This can be embedded as a go library.
  • Pilsung Kang : a lot of really cool git repositories for machine learning and data science (lectures, notes, code, etc.) by Pilsung Kang of the School of Industrial Management Engineering, Korea University.
  • Developer Roadmaps : step-by-step guides and paths to learn different tools or technologies. Check out the DevOps one; you probably need two lifetimes to cover everything.

Tuesday, March 24, 2020

[Links of the Day] 24/03/2020 : Machine learning visualisation, debugging and project template

  • hiplot : Facebook lightweight interactive visualization toolkit, quite useful for discovering correlations and patterns in high dimensional data.
  • manifold : machine learning visual debugging tooling by Uber [github]
  • cookie cutter Data Science : love cookie-cutter, and this is a great one for Machine learning projects

Thursday, March 19, 2020

[Links of the Day] 19/03/2020 : Directed Acyclic Graph structure estimation, Groovy Linter, AI hierarchy of needs

  • DAGs with NO TEARS : NIPS 2018 paper that demonstrates a novel way to estimate the structure of directed acyclic graphs. Bonus points for the code on GitHub! [arxiv]
  • groovyfmt : I wish I had known about this one a long time ago. All those Jenkinsfile errors and debugging sessions I could have avoided. Well, let's add it to my default list of linters to run with every job.
  • AI hierarchy of needs : neat representation of what is needed to deliver an AI project and how much effort and information is required as you progress through the hierarchy.

Tuesday, March 17, 2020

[Links of the Day] 17/03/2020 : Machine Learning research Guide, Engineering Strategy, Contrastive Self Supervised Learning Techniques

Thursday, March 12, 2020

[Links of the Day] 12/03/2020 : Neuromodulation in Deep Neural Networks, AWS landingzones as code, 100 days of Machine learning

  • Introducing Neuromodulation in Deep Neural Networks to Learn Adaptive Behaviours : the authors of this paper propose to leverage cellular neuromodulation, the biological mechanism that dynamically controls intrinsic properties of neurons and their response to external stimuli in a context-dependent manner, in order to build a new deep neural network architecture that is specifically designed to learn adaptive behaviours. They demonstrate that their solution is able to adapt to changes in the environment and provides more flexibility during the lifespan of the model.
  • AwsOrganizationFormation : alternative to AWS Landing Zones. The advantage of this solution is that you can manage your organisation as code. In the long run, this will simplify your life when it comes to updating and maintaining those resources.
  • 100 Days of ML : this repository captures the journey of Hithesh to learn machine learning in 100 days.

Tuesday, March 10, 2020

[Links of the Day] 10/03/2020 : NLP models platform for elasticsearch, Encrypted Tensor flow framework, Reformer transformer machine learning model

  • nboost : scalable, search-API-boosting platform for deploying transformer models to improve the relevance of search results on different platforms (e.g. Elasticsearch)
  • tf-encrypted : encrypted TensorFlow. This allows you to work on an encrypted dataset for generating models. It's a privacy-preserving machine learning framework [github]
  • reformer : most transformers are limited to a short number of tokens (512, maybe more). Google folks came up with a new architecture called Reformer that leverages locality-sensitive hashing to blast past this limitation and allow handling context windows of up to 1 million words, all on a single accelerator and using only 16GB of memory. [arxiv]

Friday, March 06, 2020

Google Cloud GKE control plane price introduction: the tragedy of the commons or bait and switch?

Recently, GCP announced that on June 6, 2020, Google Kubernetes Engine (GKE) clusters will start accruing a management fee. The fee is $0.10 per cluster per hour, which amounts to roughly $73/month. Everybody using GKE hit the roof, as it was seen as yet another flakiness episode of the chocolate factory. A lot of existing customers see themselves trapped in a bait-and-switch tactic by Google; however, the story is a little more complex than it appears.
It seems that Google made a series of mistakes in its rush to attract enterprise customers with its Kubernetes offering.
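
As a back-of-the-envelope check of the fee quoted above, here is a small Python sketch. The function name and the 20-cluster scenario are my own illustration; I assume an average month of 730 hours and ignore any free tier or discount Google may apply.

```python
HOURLY_FEE_USD = 0.10            # per cluster per hour, as announced
HOURS_PER_MONTH = 24 * 365 / 12  # average month, i.e. 730 hours


def monthly_control_plane_fee(clusters: int) -> float:
    """Approximate monthly management fee in USD for `clusters` GKE clusters."""
    return round(clusters * HOURLY_FEE_USD * HOURS_PER_MONTH, 2)


print(monthly_control_plane_fee(1))   # a single cluster
print(monthly_control_plane_fee(20))  # hypothetical: one cluster per app, 20 apps
```

One cluster comes out at about $73 a month. A shop running 20 small single-app clusters is suddenly looking at roughly $1,460 a month for control planes that used to be free, which explains the noise.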

The first mistake was to use the free control plane as a "loss leader." GKE provides the manager node and cluster management so that its customers don't have to. In exchange, Google sells more compute, storage, network, and app services.
The side effect is that charging or not charging for the control plane leads to two very different Kubernetes architectures. Small, single-application clusters are simpler to set up and operate. With the free control plane, customers embraced this approach as they didn't need to architect their cloud infrastructure in a multi-tenant fashion.
Moreover, as per Google's docs, those decisions made at the start are very much set in stone. Customers cannot change their cluster from a regional cluster to a single-zone cluster, for example. So Google has customers who built their stacks around the free control plane, and GCP is now turning the screws by adding a cost for it, while those customers cannot change their cluster type to optimise their spend. Hence the feeling of entrapment among a lot of existing clients at the moment.

The second mistake is a key metric they missed when announcing the free control plane: the majority of Kubernetes deployments tend to map a single application to a single cluster. So it would have been normal to assume that most of their potential customers would start with small single-app cluster deployments, as they didn't face the natural inertia brought by the cost of running the control plane.

Third, as per their own metrics, they discovered that customers will use and abuse those free resources. It's the tragedy of the commons, where all those empty clusters cost Google money.
Obviously, Google hoped that their customers would apply their best practices and deploy multi-tenant clusters. But multi-tenant clusters are harder to manage, deploy and maintain.
And no amount of "best practice" documentation will solve this. It is not that simple, and not every company is a hyper-scale corporation like Netflix et al. Engineering is about balancing cost and benefit. Often the best practice is to have many clusters, for a variety of reasons such as: "Create one cluster per project to reduce the risk of project-level configurations". And companies are OK with the waste as long as their software and deployment practices can treat any hosted Kubernetes service as essentially the same. Often corporations accept waste as part of the inherent cost of not rearchitecting their processes and culture. It's often more efficient to simplify the infrastructure, albeit at extra cost, than to re-architect the company's IT structure to embrace the latest best practice. It's a simple cost/risk/ROI analysis. They are even more OK with waste when Google foots part of the bill with its free control plane.

In the end, the folks at Google Cloud are caught between a rock and a hard place. They fell into the trap of the tragedy of the commons, hoping that all their customers would run their operations like Google, and are now trying to recoup those extra $$ by introducing a cost for the control plane. By doing so, Google did the equivalent of adding a tax on running Kubernetes clusters on GCP. This is perceived as an ex-Oracle way of thinking: "what can we do to meet growth objectives," "how can we tax the people who we own".
To some extent, this is Google's equivalent of a carbon or petrol tax. Customers now need to rethink their strategy, i.e. adopt public transport (multi-tenant clusters) or move to electric (Cloud Run). Some might move away from GCP completely because of the perceived lack of stability of the offering, both in terms of services and pricing.

Thursday, March 05, 2020

[Links of the Day] 05/03/2020 : Linux AI tuning, easy AutoML , lightweight container development environment

  • OpenEuler : Huawei's Linux distribution. An interesting side project is A-Tune, which relies on AI to identify the workload that runs on the OS and tries to tune it to optimise its performance.
  • AutoGluon : AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on deep learning and real-world applications spanning image, text, or tabular data. [github]
  • k3c : kubernetes but lightweight and easy to use for container development

Tuesday, March 03, 2020

[Links of the Day] 03/03/2020 : Embedded linux build toolchain, Cyber security body of knowledge, AWS API change tracker

  • BuildRoot : a tool to generate embedded Linux systems through cross-compilation.
  • CyBOK v1.0 : aims to codify the foundational and generally recognised knowledge on cyber security. In the same fashion as SWEBOK, CyBOK is meant to be a guide to the body of knowledge; the knowledge that it codifies already exists in literature such as textbooks, academic research articles, technical reports, white papers, and standards. [website]
  • AWS API Change : feel overwhelmed by the pace of change of the AWS API stack? Have no idea why your code doesn't work anymore? Want to use the latest, shiniest AWS feature? You need this: the web page tracks all the API changes in the AWS stack.