Friday, May 29, 2015

Links of the day 29 - 05 - 2015

Today's links 29/05/2015: 2015 Internet Trends, Intel Purley, Short Novel

  • Internet Trends 2015 : A very good overview of global internet trends.
  • The Counselor : short novel on ethics, AI, and stuff, worth a read.
  • Purley : a peek at Intel's next-gen platform. What I really like is the RDMA and DPDK embedded within the package, especially for standard Ethernet (the photonic fabric is quite interesting but probably pricey). Combined with the NVM memory architecture, this will open up some new options in distributed systems.

Wednesday, May 27, 2015

Links of the day 27 - 05 - 2015

Today's links 27/05/2015: L3 #datacenter #networking, #Linux futex #bug, Distributed system simultaneity problem
  • Calico : pure L3 approach to data center networking. It uses BGP route management rather than an encapsulating overlay network, and thus avoids NAT and port proxies, doesn't require a fancy virtual switch setup, and supports IPv6. Looks like a viable, and probably more scalable, alternative to the virtual switch approach.
  • Futex bug : The Linux futex_wait call has been broken for about a year (in upstream since 3.14, around Jan 2014), and has just recently been fixed (in upstream 3.18, around October 2014). More importantly, this breakage seems to have been backported into major distros (e.g. into RHEL 6.6 and its cousins, released in October 2014), and the fix for it has only recently been backported (e.g. RHEL 6.6.z and cousins have the fix).
  • There is No Now : explores the problem of simultaneity in distributed systems.

Tuesday, May 26, 2015

Links of the day 26 - 05 - 2015

Today's links 26/05/2015: dynamic organisational model, hardware memory deduplication, Elliptic curve #cryptography
  • Hybrid Dynamic Model : modern approach to structuring an organisation's portfolio that allows for multiple ways of working to co-exist from innovation to utility work and for that work to change which processes, cultures and practices it uses over time.
  • Thesis on fine-grained memory deduplication : cache-line-level hardware solution for memory deduplication. The company created after the PhD was subsequently acquired by Intel.
  • Elliptic Curve : I always really liked elliptic curve crypto, especially what you can do with it beyond classic encryption, such as identity-based encryption. It enables simpler and more versatile cryptography applications that are really hard to deliver with classic schemes.

Monday, May 25, 2015

Links of the day 25 - 05 - 2015

Today's links 25/05/2015 : PIP for #GO, #RumpKernel stack and #Bitcoin discussion

Friday, May 22, 2015

Links of the day 22 - 05 - 2015

Today's links 22/05/2015:  #IaaS #cloud magic quadrant, Qboot lightweight firmware, #openstack state
  • Cloud Magic quadrant : Gartner magic quadrant is out and the result is without any real surprise...
  • Qboot : nice effort from Paolo Bonzini (QEMU lead) and amazing technical skills (the first and last commits are both less than 24h old!), but it feels like a knee-jerk reaction to the Intel Clear Linux project. Qboot delivers a minimal x86 firmware that runs on QEMU and, together with a slimmed-down QEMU configuration, boots a virtual machine in 40 milliseconds on an Ivy Bridge Core i7 processor. However, I think that while all these solutions are technically great, they miss the point; I'll try to detail this in a future blog post. Almost forgot: the GitHub repo is here.
  • State of the stack : OpenStack in all its glory, 2015 version. What really worries me is the last lines of the last slide: "OpenStack is not specific code or APIs, it's: Community, common values, and common governance." It should be about product, stability and robustness. I can see the big tent syndrome written all over it, aka too many clowns in there.

Thursday, May 21, 2015

Using financial tools to manage technical Debt

Technical debt is a term commonly used in the IT sector. This term, first introduced by Ward Cunningham, describes the necessary work that is deferred during the planning or execution of a software project. However, while the term debt comes from the financial vocabulary, quite often the analogy stops there. I find this rather disconcerting, as we can borrow more than the terminology: we can borrow the mechanisms associated with the term in order to prevent, address and classify such liabilities. Some have already started to push the analogy closer to the economic concepts. Let's first have a look at the two main types of technical debt.

External vs. Internal Debt :

We need to understand that developer groups maintain two types of debt, external and internal. The external debt represents the total amount of deferred work that was contracted or required by a third party. The creditor can be a customer, another group within the company, a product manager expecting a feature, etc.
On the other side, we have the internal debt, which is only visible within the group and only impacts the work executed internally. Like the submerged part of an iceberg, the internal debt is invisible to the outside world, tends to accumulate faster, and is sometimes the reason for the creation of external debt. Moreover, the internal debt often needs to be resolved in order to chip away at the external one. This dependency significantly increases the cost of paying the external one.
To maintain the analogy, we could describe internal debt as credit card debt, payday loans, etc.: easy to acquire and accumulate, and it snowballs easily if we don't pay attention. The external debt, on the other side, is more like contracting a mortgage for a house: any failure to repay it can lead you into deep trouble. Internal debt tends to have a high interest rate, as the longer you wait, the higher the chance that you pile up more debt that you need to repay, while the external one tends to have a lower one.
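The interest-rate comparison above can be made concrete with a small numerical sketch. It treats deferred work as a principal that compounds once per sprint. The unit (person-days), the per-sprint rates and the function name are illustrative assumptions, not measured values.

```python
def debt_after(principal_days: float, rate_per_sprint: float, sprints: int) -> float:
    """Compound an amount of deferred work (in person-days) over several sprints."""
    return principal_days * (1 + rate_per_sprint) ** sprints


# Hypothetical rates: internal debt behaves like a credit card,
# external debt like a mortgage.
INTERNAL_RATE = 0.20
EXTERNAL_RATE = 0.05

if __name__ == "__main__":
    for sprints in (1, 3, 6):
        internal = debt_after(10, INTERNAL_RATE, sprints)
        external = debt_after(10, EXTERNAL_RATE, sprints)
        print(f"after {sprints} sprints: internal={internal:.1f}d external={external:.1f}d")
```

Starting from the same 10 person-days, the gap between the two curves widens every sprint, which is the argument for paying attention to the internal debt early.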
To follow the analogy, we can classify the debt as follows :

Frequency of acquisition
Interest Rate
low - medium

How do teams accumulate debt :

It is not rare to see a team, project or company trapped in the IT equivalent of the borrow-and-spend cycle. Let's have a look at the technical debt acquisition spiral.

Project inception:

The Technical Debt (TD) slippery slope can start right at the project conception phase. Often, part of the requirements, statement of work, and/or analysis is missing. This situation can quickly create an initial credit buildup, since doing the work correctly at that stage is often unfeasible due to the lack of real understanding of the requirements. For many people, TD becomes the only choice. TD generation immediately puts the project in a bad position, as you are accumulating debt at a time when you probably do not have enough "velocity" to make even a single payment towards reducing the debt balance. However, often you do not have a choice unless you are part of an organisation big enough to weather the initial cost.

Daily project routine :

The daily development effort soon comes into play to help you cover the internal TD cost. However, this internal TD is often accruing interest and putting you deeper in debt. Moreover, the high interest rate of the internal TD makes it even harder to catch up on the debt you already contracted at the inception stage of the project.

Real requirements settle in :

As in most projects, at some point the actual requirements come in, forcing you to re-evaluate the technological choices as well as the established schedule. Sometimes, you need to acquire a certain type of technology, skill set or even workforce to cope with this mutation. With every acquisition comes a certain amount of inertia in fulfilling it, further increasing the debt burden.

Customer comes in :

Here, external debt starts to form as milestones and deadlines have to be met. Quite often, teams cut corners to meet them, and the hidden internal debt accumulates in order to avoid defaulting on the external one. At this stage, the project risks spiraling out of control unless we establish a clear way of dealing with the increasing cost of debt.

Preventing technical debt accumulation :

How do we resolve the debt crisis that every group faces at some point? Like any financial issue, you need to have a plan.

Preemption via process management :

One of the most common ways to prevent debt buildup is to use a process management method. Agile approaches like Kanban or Scrum are in vogue at the moment. They try to enable just-in-time delivery by minimizing the development cycle in order to allow as much agility as possible. Effectively, this limits the amount of debt that can be accumulated before taking action and enables greater visibility.
However, do not dismiss other approaches such as waterfall, TOGAF, ITIL, Six Sigma, etc. Each alternative has its own merits and flaws, and should be used depending on the circumstances. Agile methods are best used when the project inception phase generated a lot of debt (in terms of unknown requirements, for example) and created the need for an on-the-fly adaptation model to deal with changing requirements.
More industrialized development projects with defined, well understood and highly repeatable efforts are more effectively managed by a highly structured method. There, an agile approach would create too much waste and potential dispersion, which results in technical debt generation.

Remember, one size doesn't fit all, and moreover you need to adapt your process as the project matures (or changes). For example, with a structured approach in a rapidly changing environment, you quickly end up overrunning your allocated resources due to the change control associated with structured methods, i.e. you're applying a method designed to reduce deviation to an activity that is constantly deviating.

Snowballing:

This classic debt reduction strategy simply targets the technical debt that can be paid off the fastest, and you work your way up the debt pile. This requires a clear understanding of the debt and a provision of resources to deal with it. However, if you end up strapped for resources, this is not always possible. One of the great things about this approach is that it creates the perception of momentum, which helps reinforce the team's commitment to dealing with this predicament.
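As a minimal sketch of the snowball strategy described above: sort the known debt items by estimated payoff cost and clear them cheapest-first until the capacity reserved for debt runs out. The `DebtItem` type, the field names and the person-day unit are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DebtItem:
    name: str
    cost_days: float  # estimated effort needed to pay the item off


def snowball_plan(items: List[DebtItem], capacity_days: float) -> Tuple[List[DebtItem], float]:
    """Select debt items cheapest-first until the reserved capacity is exhausted.

    Returns the selected items and the capacity left over.
    """
    plan: List[DebtItem] = []
    remaining = capacity_days
    for item in sorted(items, key=lambda d: d.cost_days):
        if item.cost_days > remaining:
            break  # every later item costs at least as much, so stop here
        plan.append(item)
        remaining -= item.cost_days
    return plan, remaining


if __name__ == "__main__":
    backlog = [
        DebtItem("rewrite build scripts", 5),
        DebtItem("remove dead feature flag", 1),
        DebtItem("add missing unit tests", 3),
    ]
    plan, left = snowball_plan(backlog, capacity_days=5)
    print([item.name for item in plan], left)
```

Clearing several small items per sprint is what produces the perception of momentum; the large items stay on the pile until enough capacity is provisioned for them.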

Too Big to Fail :

This option might be riskier, but if you are desperate, you might want to try it and hope for a not too painful bailout. Simply put, you are trying to externalize the risk by following the old adage: "If you owe a bank thousands, you have a problem; owe a bank millions, the bank has a problem". The objective is to rely on investment by piling up external debt in order to subsidize your budget. As a result, the overall exposure of the company forces it to invest more into the project in order to avoid significant repercussions at a larger scale.

On the downside, you have to be careful not to hit the debt wall. This can happen when the project's external debt outweighs the benefit of pumping more resources into it. The consequences are that the project manager and all the members of the team may lose significant reputation, which can also negatively impact their careers or positions within the company.

When it's already too late:

Well, when you are in too deep with technical debt, there are various options. But you need a plan to mitigate the personal and corporate impact.
First you need to acknowledge the issue. Then you might consider notifying the creditor(s): the product manager, for example, for internal debt, or the customer for the external one. However, I would suggest considering the various options at your disposal:
  • Negotiating a technical debt adjustment with your creditor by stretching some form of debt over a longer period of time. However, this can only be applied to external debt due to its nature.
  • Alternatively, you could try to consolidate your internal debt by exposing it to the external parties, effectively refinancing your technical debt by securing a lower interest rate on the entire debt load. Typically, this would take the form of dedicating a whole sprint to addressing the internal debt after demonstrating its value to the customer.
  • Finally, you could seek debt relief with partial or total forgiveness. This could take the form of re-scoping the requirements, realigning the statement of work, etc.

Beyond hope :

Most of the time, this part is completely ignored in technical debt talks or papers. It is kind of a taboo subject, but it is still essential to be aware of. Sometimes, you might want to consider defaulting on your debt altogether. By claiming insolvency, you are declaring that you are unable to meet the technical debt obligations. Insolvency occurs when the amount of work needed to clear the technical debt is higher than the net worth of the project itself. You should therefore keep an eye on the debt accumulation rate, and stop the project or deliver it / pass it on before it reaches that stage.
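The insolvency condition above can be written down as a simple check. The sketch below assumes, purely for illustration, that debt and project worth are both expressed in person-days and that debt grows by a constant amount per sprint; the function names are hypothetical.

```python
import math
from typing import Optional


def is_insolvent(debt_days: float, net_worth_days: float) -> bool:
    """Insolvency: the work needed to clear the debt exceeds the project's worth."""
    return debt_days > net_worth_days


def sprints_until_insolvency(
    debt_days: float, accrual_per_sprint: float, net_worth_days: float
) -> Optional[int]:
    """Number of sprints before the debt exceeds the project's worth.

    Assumes a constant accrual per sprint. Returns 0 if the project is
    already insolvent and None if the debt is not growing.
    """
    if is_insolvent(debt_days, net_worth_days):
        return 0
    if accrual_per_sprint <= 0:
        return None
    headroom = net_worth_days - debt_days
    return math.floor(headroom / accrual_per_sprint) + 1


if __name__ == "__main__":
    print(sprints_until_insolvency(debt_days=50, accrual_per_sprint=10, net_worth_days=100))
```

Tracking this number sprint after sprint gives an early signal for when to stop the project, deliver it or pass it on.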

Wednesday, May 20, 2015

Links of the day 20 - 05 - 2015

Today's links 20/05/2015: #BigData anomaly detection, Machine learning, Secure #Containers

  • Generic and Scalable Framework for Automated Time-series Anomaly Detection : Yahoo's time-series anomaly detection framework.
  • A Few Useful Things to Know about Machine Learning : very good overview of machine learning. It summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions.
  • Clear Containers : Intel's effort to improve the security of containers by using the VT-x technology. [LWN]


Monday, May 18, 2015

Links of the day 18 - 05 - 2015

Today's links 18/05/2015: System Performance management, Grows the next agile ?, Async task queue, application sharding
  • Shark : System Performance Management in Lua
  • Grows method : Some of the agile manifesto founders got disgruntled with the agile approach and its shortcomings, and propose a new one that tries to address its limitations and enhance it. Worth a read, but if you don't have time, let me just give you this advice: grow your own method that best suits the mix of personalities within your team. Imposing an approach from the top down leads to a shipwreck most of the time. However, you are not precluded from seeding it with the right tools. I think that's what this manifesto is all about: creating the right environment for the creation of an effective project delivery process.
  • Machinery : asynchronous task queue/job queue based on distributed message passing (using Go + RabbitMQ).
  • Ringpop : Node.js library providing scalable, fault-tolerant application-layer sharding from Uber. [talk about it here]

Thursday, May 14, 2015

Links of the day 14 - 05 - 2015

Today's links 14/05/2015 : #Openstack #container, HTTP/2, PCIe switch, #google Heracles resources management
  • magnum : OpenStack container project in all its glory... more tools and knobs (but sadly no production-ready distribution without heavy work)
  • HTTP/2 : how architecting websites will change with the coming HTTP/2 era
  • Avago PEX9700 : the first PCIe switch product released after the acquisition of PLX by Avago. Some nice features: host-to-host comms, NIC mode DMA, etc.
  • Heracles : descendant of Google's Pegasus project, moving from power management to full-blown resource orchestration.

Wednesday, May 13, 2015

Links of the day 13 - 05 - 2015

Today's links 13/05/2015: lisp diff, hierarchical Erasure code, real time #bigdata requirements, 30PB #Ceph deployment
  • ydiff : a structural comparison tool for Lisp
  • Hierarchical Erasure Coding : cover two different approaches to erasure coding – a flat erasure code across JBOD, and a hierarchical code with an inner code and an outer code.
  • 8 system-level requirements : really good paper on the requirements for scalable, robust real-time data processing (and why using a DB or rule engine has certain limitations).
  • Ceph ~30PB Test Report : benchmark at CERN. The interesting bit is that it shows that Ceph still has quite a bit of runway on the performance side, not to mention manageability, in order to really scale. This would match the OpenStack conf report that most deployments of such tech tend to be small clusters.

Monday, May 11, 2015

Links of the day 11 - 05 - 2015

Today's links 11/05/2015: Openstack reality check, Programming Framework, Open vSwitch design, Real time sessions handling
  • Openstack reality check : I agree with a lot of stuff in there. I would even add that if the OpenStack community doesn't correct its course quickly, it might end up like the Linux desktop...
  • 5 Billion Sessions/Day : how Twitter uses Kafka and stream processing for real-time session handling at scale
  • Unison : another new programming framework for functional programming, with the IDE being the centerpiece.
  • Design and Implementation of Open vSwitch : if you ever wondered about the design and internals of Open vSwitch

Wednesday, May 06, 2015

Links of the day 06 - 05 - 2015

Today's links 06/05/2015: Hyperloglog sandwich, Pony stuff, Durable MQ performance, Intel E7

  • HyperLogSandwich : A probabilistic data structure for frequency/k-occurrence cardinality estimation of multisets.
  • Pony : object-oriented, actor-model, capabilities-secure, high performance programming language.
  • Evaluating persistent, replicated message queues : nice benchmarking of the performance of various popular message queues.
  • Intel Core E7 : “Haswell-EX” Xeon E7 v3 processors for big iron with big memory :)

Tuesday, May 05, 2015

The rise of micro storage services

Current and emergent storage solutions are composed of sophisticated building blocks: dedicated fabric, RAID controller, layered cache, object storage etc. There is the feeling that storage is going against the current evolution of the overall industry where complex services are composed of small, independent processes and services, each organized around individual capabilities.

What surprised me is that most of the storage innovation trends focus on very high level solutions that try to encompass as many features as possible in a single package. Presently, insufficient effort is being made to build storage systems based on small, independent, low-care and indeed low-cost components. In short, what is lacking are nimble, independent modules that can be rearranged to deliver the optimal solution based on the needs of the customer, without the requirement to roll out a new storage architecture every time: a "jack of all trades" with few or none of the "master of none" drawbacks or, put another way, modules that extend or mimic what is happening in the container and microservices space.

Ethernet Connected Drives

Despite this, all could change rapidly, as an enabler (or precursor, depending on how you look at it) of this alternative solution is currently emerging and, surprisingly, it is coming from the hard drive vendors: Ethernet Connected Drives [slides][Q&A]. This type of storage technology is going to enable the next generation of hyperscale cloud storage solutions, with massive scale-out potential, better simplicity and maintainability, not to mention lower TCO.

Ethernet Connected Drives are a step in the right direction as they allow a reduction in capital and operating costs by reducing:
  • the software stack (file system, volume manager, RAID system);
  • the corresponding server infrastructure, connectivity costs and complexity;
  • granularity, which lets costs vary by application (e.g. cold storage, archiving, etc.).
Currently, there are two vendors offering this solution: Seagate with Kinetic and HGST with the Open Ethernet Drive. In fact, we are already seeing some rather interesting applications of the technology. Seagate released a port of the Sheepdog project onto its Kinetic product [Kinect-sheepdog], thereby enabling the delivery of a distributed object storage system for volume and container services that doesn't require dedicated servers. There is also a proof of concept, presented at HEPiX, of HGST drives running Ceph or dCache. While these solutions don't fit all scenarios, both demonstrate the versatility of the technology and its scalability potential (not to mention the cost savings).

What these technologies enable is basically the transformation of the appliances that house masses of HDDs into switches, thereby eliminating the need for a block or file header, as there is now straight IP connectivity to the drive, making these ideal for object-based backends.
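To illustrate why direct IP connectivity suits object backends, here is a minimal sketch in which a client places objects straight onto drives addressed by IP, with no storage server in between. Everything in it is an assumption made for illustration: `EthernetDrive` is an in-memory stand-in, not the Kinetic or Open Ethernet Drive API, and rendezvous hashing is just one possible placement scheme, not something these products prescribe.

```python
import hashlib
from typing import Dict, List, Optional


class EthernetDrive:
    """In-memory stand-in for a drive reachable at its own IP address (hypothetical)."""

    def __init__(self, address: str) -> None:
        self.address = address
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._objects[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)


def _score(address: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (drive, key) pair."""
    digest = hashlib.sha256(f"{address}/{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_drives(drives: List[EthernetDrive], key: str, replicas: int = 2) -> List[EthernetDrive]:
    """Rendezvous hashing: rank drives by their score for this key, keep the top ones."""
    ranked = sorted(drives, key=lambda d: _score(d.address, key), reverse=True)
    return ranked[:replicas]


def put_object(drives: List[EthernetDrive], key: str, value: bytes, replicas: int = 2) -> None:
    for drive in pick_drives(drives, key, replicas):
        drive.put(key, value)


def get_object(drives: List[EthernetDrive], key: str, replicas: int = 2) -> Optional[bytes]:
    for drive in pick_drives(drives, key, replicas):
        value = drive.get(key)
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    shelf = [EthernetDrive(f"10.0.0.{i}") for i in range(1, 5)]
    put_object(shelf, "photo.jpg", b"...")
    print(get_object(shelf, "photo.jpg"))
```

Because every client computes the same ranking from the key alone, no central metadata server is needed to locate an object, and the enclosure only has to switch packets.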

Emergence of fabric connected Hardware storage:

What we should see over the next couple of years is the emergence of a new form of storage appliance acting as a fabric facilitator for a large number of compute- and network-enabled storage devices. To a certain extent, it would be similar to HP's Moonshot, except with a far greater density.

Rather than just focusing on Ethernet, it would be easy to see PCIe, Intel photonics, InfiniBand or more exotic fabrics being used. Obviously, Ethernet still remains the preferred solution due to its ubiquity in the datacenter. However, we should not underestimate the need for a rack-scale approach, which would deliver greater benefit if designed correctly.
While the HGST Open Ethernet solution is one good step towards the nimble storage device, the drive enclosure form factor is still quite big, and I wouldn't be surprised if we see a couple of start-ups coming out of stealth mode in the next couple of months with fabric-connected (PCIe most likely) flash. This would be the equivalent of the Ethernet connected drive, interconnected using a switch + backplane fabric, as shown in the crudely designed diagram below.

Is it all about hardware?

No, indeed quite the opposite. That said, there is a greater chance of penetration of new hardware in the storage ecosystem as compared to the server market. This is probably where ARM has a better chance of establishing a beachhead within the hyperscale datacenter, as the microserver path seems to have failed.
What this implies is that it is often easier to deliver and sell a new hardware or appliance solution in the storage ecosystem than a pure software one. Software solutions tend to take a lot longer to get accepted, but when they pierce through, they quickly take over and replace the hardware solution. Look at object storage solutions such as Ceph, or other hyper-converged solutions. They are a major threat to the likes of NetApp and EMC.
To get back to software as a solution, I would predict that history repeats itself, with varying degrees of success or failure. Indeed, as with the microserver story, we see hardware micro storage solutions rising while, at the same time, we see the emergence of software solutions that will deliver more nimble storage features than before.

In conclusion, I feel that we are going to see the emergence of many options for massive scale-out, using different variants of the same concept: take the complex storage system and break it down to its bare essential components; expose each single element as its own storage service; and then build the overall offer dynamically from the ground up. Rather than leveraging complex pooled storage services, we would have dynamically deployed storage applications for specific demands, composed of a suite of small services, each running in its own process and communicating with lightweight mechanisms. These services are built around business capabilities and are independently deployable by fully automated deployment machinery. There is a minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies. This is just the opposite of current offers, where there are a lot of monolithic storage applications (or appliances) that are then scaled by replicating across servers.

This type of architecture would enable a true, on-demand dynamic tiered storage solution. To reuse a current buzzword, this would be a "lambda storage architecture".
But this is better left for another day's post that would look into such architecture and lifecycle management entities associated with it.

Links of the day 05 - 05 - 2015

Today's links 05/05/2015: First Aid Git, Intel cache allocation tech, FIDO automatic security analysis.
  • First Aid Git : searchable collection of the most frequently asked git questions.
  • Intel Cache Allocation Technology : provides a way for the software (OS/VMM) to restrict cache allocation to a defined 'subset' of cache, which may overlap with other 'subsets'. When multi-threaded applications run concurrently, they compete for shared resources including L3 cache. At times, this L3 cache resource contention may result in inefficient space utilization. For example, a higher priority thread may end up with less L3 cache, or a cache-sensitive app may not get optimal cache occupancy, thereby degrading performance. The CAT kernel patch provides a framework for sharing L3 cache so that users can allocate the resource according to set requirements.
  • Fido : system for automatically analyzing security events and responding to security incidents.
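For the Cache Allocation Technology item above, the possibly overlapping 'subsets' can be pictured as bitmasks over the cache ways, where each set bit grants access to one way. The sketch below only models that bookkeeping in plain Python and does not talk to any hardware or kernel interface. The 20-way cache and the two masks are made-up values, and the contiguity check reflects the constraint that a mask must be one unbroken run of set bits.

```python
def is_contiguous(mask: int) -> bool:
    """True if mask is non-zero and its set bits form one unbroken run."""
    if mask == 0:
        return False
    lowest_bit_index = (mask & -mask).bit_length() - 1
    shifted = mask >> lowest_bit_index  # drop trailing zeros
    return shifted & (shifted + 1) == 0


def way_count(mask: int) -> int:
    """Number of cache ways granted by a mask."""
    return bin(mask).count("1")


def shared_ways(mask_a: int, mask_b: int) -> int:
    """Number of cache ways that two subsets have in common."""
    return way_count(mask_a & mask_b)


# Hypothetical 20-way L3 cache split between two classes of work.
HIGH_PRIORITY = 0xFFF00  # ways 8..19
BATCH = 0x003FF          # ways 0..9

if __name__ == "__main__":
    for name, mask in (("high priority", HIGH_PRIORITY), ("batch", BATCH)):
        print(name, way_count(mask), "ways, contiguous:", is_contiguous(mask))
    print("shared ways:", shared_ways(HIGH_PRIORITY, BATCH))
```

Shrinking the overlap trades utilization for isolation: fully disjoint masks protect the high priority thread from contention, while shared ways let idle capacity be reused.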