Monday, November 30, 2015

Bitcoin double “double trouble”

Two of the main hurdles for large-scale Bitcoin adoption as an everyday payment method are:
  1. Any payment for a physical good at a brick-and-mortar shop requires the equivalent of a double foreign exchange transaction by all the parties involved.
  2. This type of payment is also exposed to double spending attacks, where the buyer fools the seller into thinking that the transaction occurred while it gets invalidated at a later date, but not before the buyer has left with the goods.
These problems can potentially generate a significant overhead (as in fees) due to the risk associated with each operation. These uncertainties are major roadblocks to the democratisation of Bitcoin as an alternative currency for common transactions.

Double fee problem :


So what is the double fee issue? It is the need for both parties involved to convert the currency used for paying / getting paid into another one, due to the lack of widespread Bitcoin use coupled with the need to pay or get paid in the official currency of the country of residence. To put it simply, it is as if Alice is selling the goods while living in the US (dollar), but is getting paid for them in euros, by Bob in China (yuan).
Both will be exposed to this problem unless the consumer and the end-to-end product value chain (supply and demand chain) rely only on Bitcoin for all their monetary transactions. Relying only on Bitcoin also implies that taxes, salaries, etc. are paid in Bitcoin. In extenso, this implies a blanket replacement of the “traditional” currency for any operation associated with the sale. Without these conditions fulfilled, all the parties involved expose themselves to the double fee problem.
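To make the overhead concrete, here is a minimal sketch of the two conversion legs in Python (the 1% fee per leg and the sticker price are illustrative assumptions, not quoted market rates):

    # Minimal sketch of the "double FX fee": one conversion fee on Bob's side
    # (yuan -> BTC) and one on Alice's side (BTC -> dollars).
    # The 1% fee per leg is an illustrative assumption, not a real market rate.

    BUY_FEE = 0.01   # Bob's assumed exchange fee when buying BTC with yuan
    SELL_FEE = 0.01  # Alice's assumed exchange fee when selling BTC for dollars

    price_btc = 0.25  # sticker price of the goods, in BTC

    bob_cost_btc = price_btc / (1 - BUY_FEE)    # BTC-equivalent Bob must fund to cover the price
    alice_net_btc = price_btc * (1 - SELL_FEE)  # BTC-equivalent Alice actually keeps

    overhead = (bob_cost_btc - alice_net_btc) / price_btc
    print(f"effective round-trip fee: {overhead:.1%}")  # ~2.0% lost to the two conversions

Roughly speaking, the fees of the two legs add up, so every purchase silently loses about the sum of both exchange fees.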
I will now explore some of the sub-problems associated with the double “FX” fee problem: price conversion tracking and long-term fluctuation risk.

Price Conversion tracking:

Alice buys her goods in dollars, and sells them in dollars or Bitcoin. To avoid losing out on the conversion due to the fluctuating Bitcoin exchange rate, Alice needs to update the goods' Bitcoin prices daily (at least). While this is not a major hurdle for online shops, it is quite a tedious process for brick-and-mortar ones. Obviously, the conversion could be done on the spot at the till; however, this rules out putting dual-price stickers on the items.
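As a minimal sketch of that repricing chore (the exchange rate and the catalogue below are stand-in values; a real shop would pull the rate from whatever market feed it uses):

    # Daily repricing sketch: the dollar prices stay fixed and the BTC price tags
    # are recomputed from the day's exchange rate. The rate and the catalogue
    # are stand-in values for illustration.

    usd_per_btc = 380.0  # today's rate (placeholder)

    catalogue_usd = {"coffee": 3.50, "sandwich": 6.00, "book": 24.99}

    def btc_price_tags(catalogue, rate):
        """Recompute every BTC price tag from the fixed USD price."""
        return {item: round(usd / rate, 6) for item, usd in catalogue.items()}

    print(btc_price_tags(catalogue_usd, usd_per_btc))
    # Needs to run at least once a day (or on the spot at the till) to track the rate.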

Long term fluctuation risk :

After Bob pays Alice in Bitcoin, Alice will not convert the payment immediately, in order to avoid paying a large amount of exchange fees for each transaction. Ideally, Alice wants to batch the conversion of her Bitcoins in order to reduce the transaction overhead. However, the lengthier the retention, the higher the risk of loss due to a drop in the Bitcoin price. Naturally, if the price goes up, Alice will have made a profit on the exchange rate, but this is equivalent to trading on the foreign exchange market. This can be perceived as a significant risk by merchants, banks and accountants. Moreover, it is not really clear how this needs to be treated under accounting rules and how tax needs to be paid.
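A back-of-the-envelope sketch of Alice's trade-off, assuming a made-up fixed fee per conversion and daily revenue figure:

    # Batching trade-off: fewer BTC -> USD conversions save on fixed fees, but a
    # longer holding period leaves more revenue exposed to a drop in the BTC price.
    # The fee and revenue figures are illustrative assumptions.

    FIXED_FEE = 5.0        # assumed fixed cost per conversion, in $
    DAILY_REVENUE = 200.0  # Bitcoin-denominated sales per day, valued in $

    def fees_and_exposure(batch_days):
        fee_cost = (30 / batch_days) * FIXED_FEE       # conversion fees per 30-day month
        avg_exposure = DAILY_REVENUE * batch_days / 2  # average $ amount held in BTC
        return fee_cost, avg_exposure

    for days in (1, 7, 30):
        fees, exposure = fees_and_exposure(days)
        print(f"convert every {days:>2} days: ${fees:6.2f}/month in fees, "
              f"~${exposure:7.2f} exposed to the exchange rate")

The fixed fees shrink as the batches grow, while the amount sitting exposed to the exchange rate grows linearly; where the sweet spot lies depends entirely on the fee schedule and on how much volatility Alice can stomach.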
One can advocate that Alice does not have to convert her Bitcoins and can keep using them within a self-contained market ecosystem. However, this is rather impractical, or near impossible, to achieve, as Alice still needs to pay taxes and salaries in the local currency. It is only possible if the Bitcoin transactions are marginal compared to the overall amount of transactions the business does.
On Bob's side, the risk is more limited, as it is almost the same as that of a tourist buying currency in advance of a trip.

Next, we will look at another consequence of the use of Bitcoin in everyday transactions: fraud via double spending attacks.

Transaction delay and double spending :


The average confirmation time of a Bitcoin transaction is ~7 minutes. As a result, it is clear that brick-and-mortar vendors, such as vending machines and take-away stores, cannot rely on transaction confirmation when accepting Bitcoin payments. We cannot really see the cashier asking the client to wait 7 minutes to get the transaction acknowledgement. It is even worse for fee-less transactions: you would be lucky to get one confirmed within 24 hours, and it can take much longer. To address that, Bitcoin encourages vendors to accept fast Bitcoin payments with zero confirmations, as soon as the vendor receives a transaction from the network transferring the correct amount of BTC to one of its addresses.
However, this opens up the whole process to potential fraud via double spending techniques. In our case, Bob needs to trick Alice into accepting a Bitcoin transaction that Alice will not be able to redeem subsequently. Basically, Bob issues a transaction toward one of his own addresses spending the same Bitcoins used for the payment of Alice's goods. If both transactions are sent at the same time, and Bob's transaction is acknowledged before Alice's (using a higher fee, or other tricks), Alice's transaction will be cancelled. By this time, of course, in a fast transaction, Alice has already released the goods.
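A toy simulation illustrates the race (this is not real Bitcoin code; a single fee-ordered mempool stands in for the miners' behaviour):

    # Toy model of a zero-confirmation double spend: one coin, two conflicting
    # transactions, and a simplified miner policy that keeps the highest-fee
    # spend of each coin. A sketch of the race, not actual Bitcoin logic.

    from dataclasses import dataclass

    @dataclass
    class Tx:
        utxo: str        # the coin being spent
        recipient: str
        fee: float       # fee in BTC, drives miner priority

    def mine(mempool):
        """Keep one transaction per coin, preferring the highest fee."""
        confirmed = {}
        for tx in sorted(mempool, key=lambda t: t.fee, reverse=True):
            confirmed.setdefault(tx.utxo, tx)  # first (highest-fee) spend of the coin wins
        return confirmed

    to_alice = Tx(utxo="coin-1", recipient="Alice", fee=0.0001)  # payment at the till
    to_self  = Tx(utxo="coin-1", recipient="Bob",   fee=0.0005)  # Bob's conflicting self-spend

    winner = mine([to_alice, to_self])["coin-1"]
    print(winner.recipient)  # -> "Bob": Alice's zero-confirmation payment never confirms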

This begs the question: how do Bitcoin debit card providers handle this issue? And what insurance and accounting solutions do they have in place to prevent abuse and protect against malicious operations? This is a huge risk as well as a trial by fire for these companies. They not only risk the future of their company but also the trust in Bitcoin as a trade currency. Maybe insurance companies might be able to offer a partial solution here, as it might not be possible to completely eliminate such attacks. However, this would make Bitcoin impractical for small transactions due to the excessive fees associated with insurance premiums.


To conclude, Bitcoin has a really bright future as the overlay virtual currency on top of the blockchain technology. However, its potential reach within the everyday financial transaction system might be greatly overestimated. There are a lot of complex problems to be solved in order to establish trust as well as a cost-efficient system for brick-and-mortar shops. And like anything happening in the real world, the spread of new technology is often slow: just look how long it took to get NFC payment out. And don't get me started on the chip and pin system, which is not even widely used in the USA yet.





Links of the day 26/11/2015: Flat corporation structure, Openstack appliance, Supply chain Bigdata

  • Predictive analytics and Supply chain : how some techniques can be used to optimise your supply chain management, making the process more accurate and reliable, at a reduced cost.
  • Openstack Appliance by Mirantis + FusionStorm : looks like Mirantis is moving away from being a pure integrator and into turnkey Openstack solutions. What would be interesting is if they move next into the hyper-convergence market.
  • Flat will kill you : a debatable point of view; however, the fact that people will naturally cluster into groups implies that flat structures are a myth. Pure flat structures may not exist unless there is a true rule system for breaking deadlocks and preventing (too much) power grabbing.


Wednesday, November 25, 2015

Links of the day 25/11/2015: Micron NVDIMM, Ceph load balancer, Programming with Promise

  • Micron Persistent Memory & NVDIMM : Micron announced the production of an 8GB DDR4 NVDIMM, similar to the Diablo Technologies and Viking NVDIMMs. However, for some reason Micron decided to externalize its super-capacitor to an external module, whereas the other vendors integrated it on the stick itself. The trade-off is that you can fit more silicon on a stick; however, it obviously restricts the hardware it can be deployed onto. [slides]
  • Mantle : a programmable metadata load balancer for the Ceph file system. It allows the user to dynamically inject information into the Ceph load balancer in order to optimize data placement, and hence the performance of the overall system.
  • How do Promises Work? : excellent coverage of the promise technology: how it works, when to use it (or not), and how to avoid some of its pitfalls.

Tuesday, November 24, 2015

Links of the day 24/11/2015 : Product prioritization using the Kano Model and Timers


Monday, November 23, 2015

Canonical's land and grab strategy to capture the private cloud market.

Canonical recently released their Autopilot product. Autopilot fits within the bring-your-own-hardware (ByoHW) market segment for private cloud. This product allows you to deploy and manage a full Openstack cloud. Canonical makes heavy use of JuJu, MaaS and other in-house (but open sourced) software. You can get a glimpse of how this product works in this blog post, which also shows the complexity associated with the deployment and orchestration of an Openstack cloud. However, what is even more interesting is their pricing strategy and offering, as it gives an insight into how they plan to "land & grab" the private cloud market.

Land :

For anything under 10 servers, it's basically free (but with no support). Note that you need at least 5 servers for a deployment and 6 for HA, so you have a little bit of wiggle room there, but not much. With this strategy, Canonical specifically targets the biggest user segment of Openstack. You have to remember that most deployments (78%) are less than 100 nodes, with 36% less than 10. Moreover, according to the survey, most of these small deployments are DIY and use off-the-shelf Openstack solutions (source or distro). We can deduce that there is very little sales and profit potential in this segment, as these users rarely have the budget. Most of the time they rely on in-house expertise and community support.
However, it is quite smart to try to hook them by providing an easy-to-deploy, robust and maintainable solution. As the user base is large, there is significant market potential for some of them to upgrade as they reach a certain size, their needs change, and the benefits of the paid solution arise. Obviously, the objective is to use the land and grab approach with the lowest barrier to entry possible (as in free) in order to "land" the biggest Openstack user segment out there.

Grab:

If we now look at the pricing model, we notice that they offer three types of payment plan: per VM/hour ($0.003/hour), per server/year ($1,000/year), or per availability zone. If you run less than the equivalent of 39 VMs full time for a year per server (see chart below), you are better off going for the per-VM pay-as-you-go model. This might be a smart choice if you tend to run big VMs, such as those for big data applications. However, if you run a horde of small VMs, you might quickly pay way more than with the per-server pricing plan, especially if you cram 1000+ VMs per rack.
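The 39-VM threshold follows directly from the two quoted rates; here is a quick sanity check (hours per year rounded to 8,760):

    # Break-even between the per-VM/hour and per-server/year plans,
    # using the rates quoted above.

    VM_HOUR_RATE = 0.003       # $ per VM per hour
    SERVER_YEAR_RATE = 1000.0  # $ per server per year
    HOURS_PER_YEAR = 24 * 365

    per_vm_year = VM_HOUR_RATE * HOURS_PER_YEAR  # ~$26.28 for one full-time VM
    break_even = SERVER_YEAR_RATE / per_vm_year  # ~38.05 VMs per server

    print(f"one full-time VM for a year: ${per_vm_year:.2f}")
    print(f"break-even density: {break_even:.1f} VMs per server")
    # Below ~38 full-time VMs per server, pay as you go wins;
    # at 39 and above, the per-server plan is cheaper.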

The nice thing about this tiered approach is that it allows you to test the water with the pay-as-you-go model and then switch to a more “classic” model if you discover that you will cross the 39-VMs-per-server ratio. Naturally, heavy users will prefer the per-server model, but the dynamic one will be more easily adopted by a crowd already accustomed to the AWS cloud pricing model.
Moreover, even without a highly dynamic instance count, the per-VM/hour model should remain the preferred payment option for the majority of small to medium Openstack deployments. Based on the survey, the vast majority (74%) of deployments have less than 1000 VMs. If we factor in that 78% of deployments have less than 100 nodes, we can safely assume that the average number of VMs per node is well below 39 (even 1000 VMs spread over just 26 servers would fall under the threshold). As you can see, this pricing model is a very smart way of targeting the Openstack ecosystem by reducing the friction of transitioning from the free to the paying model.

Retain

Obviously, when customers cross the 75-server line they might want to switch to the availability zone pricing plan. However, based on the Openstack statistics, there are very few deployments with more than 100 nodes out there. As a result, this pricing plan is probably there to show a natural upgrade path in pricing and support for potential customers. The objective is to demonstrate that they are able to service the complete spectrum of Openstack customers. While I do not know yet how well Canonical fares on the integration/consulting side against Mirantis or the other integrators out there, I suspect that they want to show a full end-to-end pricing model and solution in order to be taken seriously by the market. Moreover, it helps reassure customers that Canonical can accompany them as their cloud needs grow.

Links of the day 23/11/2015 : Machine learning book, Spotify monitoring, graph

  • Heroic : how Spotify monitors its infrastructure. TL;DR: Cassandra for time series persistence, Kafka for transport, and Elasticsearch on top for presentation. Nice feature: you can federate clusters across zones or data-centers, making the overall system more robust.
  • Understanding Machine Learning - From Theory to Algorithms : an ebook introducing machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms.
  • Plotly.js : want nice graphs? Get plotly.js, the core technology and JavaScript graphing library behind Plotly's products (MIT license).

Friday, November 20, 2015

Links of the day 20/11/2015: Open vSwitch conf, Postgres HA, Survey of Bigdata Systems

  • Yoke : a Postgres redundancy/auto-failover solution that provides a high-availability PostgreSQL cluster. Note that solutions like this one do not try to achieve linearizable semantics; they perform failure detection and fail over to some slave, with heuristics about which slave to pick. While the behavior of such systems is acceptable in many production situations (it is up to the product requirements to decide whether it is OK that a given set of failure modes produces data loss and/or inconsistencies), the important bit is that the semantics are very well documented, so that users know what trade-offs they are making.
  • A survey on big data systems : a detailed analysis and benchmark of the big data value chain, classifying systems into four sequential modules, namely data generation, data acquisition, data storage, and data analytics.
  • Open Virtual Switch conference : all the slide decks and summaries of the talks.

Thursday, November 19, 2015

Links of the day 19/11/2015: #ARM rack scale infrastructure fabric, Canonical ByoHW #Cloud software and Next Gen #Bigdata Storage

  • X-Tend : Applied Micro gets into the rack scale computing market with its ARM servers and now the fabric to enable and support disaggregation of resources. What is interesting is that ARM processors, being more nimble in a certain sense, might be more suitable for this paradigm than the big Intel processors. And if you throw hybrid and heterogeneous processors into the mix, they might end up being an even better fit, as you can more easily tailor the resources to your needs, along with the power consumption that comes with them.
  • Autopilot : Canonical's private ByoHW cloud solution (more about that in another post).
  • Pyro : a spatial-temporal big data storage system for high resolution geometry queries and dynamic hotspots.


Wednesday, November 18, 2015

Links of the day 18/11/2015: DHT comparison, decentralized control plane, NVMe for mobile

  • Xenon : a set of software components and a service-oriented design pattern delivering a decentralized control plane framework, by VMware.
  • DHT comparison : a paper analyzing the scale and performance of the 4 main DHTs under churn.
  • PCIe/NVMe for mobile : while the argument might make sense, the slide deck carefully avoids the elephant in the room: what is the power / performance cost?


Tuesday, November 17, 2015

Links of the day 17/11/2015 : Memory system design, multi-agent systems, NVMe over Fabrics


  • Rethinking Memory System Design for Data-Intensive Computing : a presentation on the future of memory and the challenges accompanying it.
  • Multi Agents Systems : provides the foundations via presentations of distributed problem solving, non-cooperative game theory, multi-agent communication and learning, social choice, mechanism design, auctions, coalitional game theory, and logical theories of knowledge, belief, and other aspects of rational agency.
  • NVM Express Fabrics Protocol and Architecture : explores the NVMe over Fabrics specification, and how it enables NVMe to be used across RDMA fabrics (e.g., Ethernet or InfiniBand™ with RDMA, Fibre Channel, etc.) and to connect to other NVMe storage devices.


Monday, November 16, 2015

Links of the day 16/11/2015: Named Data networking, VM scheduler, Asymmetric multi-core processors.

  • Named Data : an effort to transform networking the same way object storage transformed, well, storage: by naming the data content for efficient transport rather than solely focusing on the point-to-point aspect. Really cool concept.
  • BtrPlace : a virtual machine scheduler. While really nice, it suffers from a lack of integration with existing systems. However, from looking at the code, it would be rather simple to integrate with Openstack or other cloud / virtualization systems.
  • Survey on Asymmetric Multicore Processors : an extensive review of the literature on heterogeneous processor systems.

Thursday, November 12, 2015

Links of the day 12/11/2015: Netflix architecture, Google ML explained and Backblaze storage pods

  • 360 view of Netflix architecture : most of it has already been published, but it is really nice to have all the different parts in one place and, more importantly, the links to all the open source goodness they leverage or contribute.
  • TensorFlow explained : Jeff Dean's Google talk, Large-Scale Deep Learning for Intelligent Computer Systems, explains TensorFlow (just open sourced).
  • Backblaze Storage pods : what hardware it takes to deliver an S3-like service at $0.005/GB/month.

Wednesday, November 11, 2015

Links of the day 11/11/2015 : Networking for storage, Google ML and Platform design toolkit

  • Low Latency Networking for Storage : an overview of Intel's networking stack and libraries for storage, from the excellent Tech Field Day events.
  • TensorFlow : Google open-sources "some" of its machine learning tools.
  • Platform Design Toolkit : an interesting toolkit and canvas focusing on how to create a platform to access and leverage the potential of an ecosystem, which is increasingly recognized as the most powerful way to achieve market success.

Tuesday, November 10, 2015

Links of the day 10/11/2015: #containers, #Unikernel & #virtualization power consumption, security, and streamlining

  • Qubes with Rumprun : an interesting application of unikernels to deliver greater security at speed. The goal is to reduce the possible attack surface while maximizing performance.
  • Power Consumption of Virtualization Technologies : an empirical investigation that confirms a lot of what we knew already: virtualization's power consumption is higher than containers'.
  • Vagga : a containerisation tool without daemons. A slimmed-down, streamlined take on containers: no daemons or other processes needed to support it. Looks like a solid contender to Docker and Vagrant.

Monday, November 09, 2015

Links of the day 09/11/2015 : workload resource management and ACID in-memory DB


  • Quantiles Threshold : how to determine the best threshold for your alarms using quantiles. Interesting; however, the following paper tends to argue that this approach is only fine until the system gets close to maximum capacity.
  • Percentile-Based Approach to Forecasting Workload Growth : this paper explains that when resource utilization gets closer to the workload-carrying capacity of the resource, the upper percentiles level off (a phenomenon colloquially known as flat-topping or clipping), leading to under-prediction of future workload and potentially to undersized resources. The authors analyse the problem and propose a new approach that can be used for making useful forecasts of workload when the historical data for the forecast are collected from a resource approaching saturation.
  • MemDB : an ACID-compliant distributed in-memory database. The performance figures are interesting: it delivers ACID transactions at 25k/s per shard (essentially per core). While not fantastically fast, a "simple" 12-core server could potentially run ~300k transactions/s, and you quickly reach 1M+ transactions/s with a small setup (see the quick arithmetic below). However, I would still question the need for a pure in-memory ACID database without persistence. I would be curious to see if anybody has a use case for it.
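A quick sanity check on those throughput numbers (the 25k/s per-shard rate is the claim above; the core counts are illustrative):

    # MemDB throughput arithmetic: 25k ACID tx/s per shard, one shard per core.
    # The core counts below are illustrative server/cluster sizes.

    PER_SHARD_TPS = 25_000

    for cores in (12, 48):
        print(f"{cores:>2} cores -> ~{cores * PER_SHARD_TPS:,} tx/s")
    # 12 cores -> ~300,000 tx/s; 48 cores (e.g. four such servers) -> ~1,200,000 tx/s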

Friday, November 06, 2015

Links of the day 06/11/2015: More Intel rack scale resources

A lot of new stuff released by Intel around their rack scale effort:

  • Resources : probably your main point of entry for the RSA project: documentation, API and reference architecture.
  • Github : everything you need to get started (as long as you have Intel HW).
  • How it works : a very good and short overview of how Intel intends to orchestrate bare metal hardware in order to offer disaggregated pools of resources. The objective is to have these transparently and just-in-time provisioned for cloud or any other workload. Ultimately, you will be able to avoid the age-old issue of server sizing and simply manage each type of resource independently within pools: adding, removing and upgrading them without the need to re-architect your whole data-center.

Thursday, November 05, 2015

Links of the day 05/11/2015: SNIA conf goodness, Parallella, worst-case distributed system design

  • Worst-Case Distributed Systems Design : a nice blog post on the advantages of designing distributed systems to handle worst-case scenarios gracefully in order to improve average-case behavior.
  • Parallella : an 18-core credit-card-sized computer; a nice playground for parallel programming (and hence publications).
  • 2015 Storage Developer Conference Presentations | SNIA : the slide decks of the conference; some gems in there:
    • RDMA + PMEM : coupling pmem with RDMA in order to deliver remote persistent memory.
    • NVDIMM cookbook : a very good overview of what you can do with NVDIMMs and their use cases.
    • Hashing algo for storage : a very good overview of the main hashing techniques for K/V stores (not just for storage), their trade-offs, etc.
    • Pro & Cons of Erasure Code vs Replication vs Raid : as always, it depends, but here is the exec summary: RAID is reaching its limits, erasure coding is the preferred option for large scale, but replication is required if you want certain types of performance. Finally, everything will really depend on the software-defined storage system running it.

Wednesday, November 04, 2015

Links of the day 04/11/2015: Intel ISA-L, Linux Kernel userland page fault handling, Evolution of CI at Stratoscale

  • ISA-L : a brief introduction to the Intel Intelligent Storage Acceleration Library (ISA-L). Some nice features for erasure coding in there. [intel 01 website]
  • Evolution of CI at Stratoscale : how the development team develops, tests, deploys and operates its product. How do we get tens of developers to work productively at a high velocity while maintaining system cohesion and quality? How can we tame the inherent complexity of such a distributed system? How do we continuously integrate, test, deploy and operate it? How do we take devops to the limit? And just how tasty is our own dog food?
  • Userfaultfd : nice to see the code for userland page fault resolution making it into the upstream release.


Tuesday, November 03, 2015

Links of the day 03/11/2015 : Automation memory lapse danger, disaggregating disk and complex data-sets visualisation


  • Automation memory lapse danger : forgetting how to do things after automating everything is a major risk. Relearning, and making the same mistakes again, when you need to upgrade, change, or worse, fix the system after a major breakdown becomes a major challenge.
  • Dis-aggregating disk : well, it is all about trade-offs, but the classic storage paradigm is slowly moving away from the current models. However, it will take some time until all the different pieces fall into place, enabling seamless disaggregation. In the meantime, the idea is slowly percolating throughout the community.
  • Mirador : a tool for visual exploration of complex datasets.

Monday, November 02, 2015

Links of the day 2/11/2015 : #openstack summit videos , #fail @scale , data-flow computers

  • Tokyo Openstack summit 2015 videos : live from the big tent. What is really interesting is that companies are starting to talk about scale and the HA challenges they are facing. Surprise: not everybody deploys cattle, and there are still a lot of pets out there.
  • Fail at scale : go big, fail big... But most products cannot even fail at "small"...
  • Dataflow computers : the history and future of this alternative computational model. While the concept seems alien, a lot of its principles are already used in systems such as Storm or Samza.