Friday, March 06, 2020

Google Cloud GKE control plane price introduction: the tragedy of the commons or bait and switch?

Recently when GCP announced that On June 6, 2020, Google Kubernetes Engine (GKE) clusters will start accruing a management fee. The fee is $0.10 per cluster per hour which amount to roughly $73/month. Everybody using GKE hit the roof as it was seen as yet another flakiness episode of the chocolate factory. A lot of existing customers are seeing themselves trapped into a bait and switch tactic by Google however the story is a little bit more complex than what it appears.
It seems that Google made a series of mistake in its rush to try to attract enterprise customer with it’s Kubernetes offering.



The first mistake is to have used the free control plane as a "loss leader." GKE provides the manager node, and cluster management so that it’s customers don't have to. And in exchange, you sell more compute, storage, network, and app services. 
The side effect of not charging for the control plane and charging for the control plane leads to two very different Kubernetes architectures. Small, single application clusters are simpler to set up and operated.  With the free control plane, customer embraced this approach as they didn’t need to architect their cloud infrastructure in a multi-tenant fashion.
Moreover, as per google docs, those decisions made at the start are very much set in stone. Customer cannot change their cluster from a regional cluster to a single zone cluster for example. So Google has customers who built their stacks taking into account Google’s free control plane, and GCP is turning the screws in by adding a cost for it — but they cannot change the type of their cluster to optimise their spend, since, per your docs, those decisions are set in stone. Hence the entrapment feeling that a lot of existing clients feel at the moment.

The second key metric they missed when announcing the free control plane is that the majority of Kubernetes deployments tend to have a single application to cluster mapping. So it would have been normal to assume that most of their potential customers would have started with small single app cluster deployment as they didn’t have the natural inertia brought by the cost of running the control plane. 

Third, as per their own metrics, they discovered that customers will use and abuse those free resources. It’s the tragedy of the commons where all those empty clusters cost Google money
Obviously, Google hoped that their customers would have applied their best practices and deployed multi-tenant cluster. Multitenant clusters are harder to manage, deploy and maintain.
And no amount of "best practice" documentation will solve this. However, it is not as simple and not every company is a hyper-scale corporation like Netflix and al.  Engineering is about balancing cost and benefit. Often the best practice to have many clusters for a variety of reasons such as: "Create one cluster per project to reduce the risk of project-level configurations". And company are ok with the waste as long as their software & deployment practices can treat any hosted Kubernetes service as essentially the same. Often corporation accepts waste as part of the inherent cost of not rearchitecting their process and culture. It's often more efficient to simplify the infra complexity albeit extra cost than trying to re-architect the company IT structure to embrace the latest best practice. It's a simple cost/risk/ROI analysis. They are even more ok with waste when Google fitted part of the bill with their free control plane. 




In the end, the folks at Google cloud fell between a rock and a hard place. They fell to the trap of the tragedy of the common hoping that all their customer will run their operations like google and now are trying to recoup those extra $$ by introducing a cost for the control plane. By doing so Google did the equivalent of adding a tax to running Kubernetes clusters on GCP. This is perceived as an ex-Oracle way of thinking, "what can we do to meet growth objectives," "how can we tax the people who we own".
To some extent, this is the equivalent by Google of a carbon or petrol tax. Customers need now to rethink their strategy, I.e. adopt public transport ( multi-tenant cluster ) or move to electric (cloud run). Some might move away completely from GCP because of the perceived lack of stability of the offering both in term of services and pricing.

No comments :

Post a Comment