Reflections Of The Void: model


Tuesday, April 21, 2020

[Links of the Day] 21/04/2020 : Machine Learning for Relational Query processing, Augmenting Language model with latent knowledge retriever, Computer Vision Recipes

Extending Relational Query Processing with ML Inference : the authors present advanced cross-optimizations between ML and DB operators in Raven. They demonstrate significant performance improvements: up to 5.5x from the native integration of ML in SQL Server, and up to 24x from cross-optimizations.
REALM: Retrieval-Augmented Language Model Pre-Training : the authors propose to leverage Retrieval-Augmented Language Model pre-training for the challenging task of Open-domain Question Answering. By augmenting their model with a latent knowledge retriever, they are able to beat current SOTA models while limiting the growth in model size.
Computer Vision Recipes : Microsoft is releasing a lot of really good content, and this time it's for computer vision. In this repository, you will find best practices, code samples, and documentation for Computer Vision.

Tuesday, July 25, 2017

[Links of the Day] 25/07/2017 : Universal Scalability Law Model, k8s + cloud native apps, Python distributed execution engine

usl4j : an implementation of the Universal Scalability Law model. This is really cool because it allows you to model (and predict) when your system's performance will start to degrade as it scales. It allows you to take real measurements from a live system and continuously build models. [github]
draft : a tool for developers to create cloud-native applications on Kubernetes.
ray : a distributed execution engine written in Python. Useful if you want to execute and schedule tasks across a cluster of nodes.
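To make the Universal Scalability Law item above more concrete, here is a minimal Python sketch of the model itself (not of usl4j's API). The law predicts throughput at concurrency N as X(N) = λN / (1 + σ(N−1) + κN(N−1)), which peaks at N = sqrt((1−σ)/κ). The coefficient values below are hypothetical; in practice they would be fitted from measurements of a live system.

```python
import math

def usl_throughput(n, lam, sigma, kappa):
    """Throughput predicted by the Universal Scalability Law at concurrency n.

    lam   : throughput of a single worker (no contention)
    sigma : contention (serialisation) coefficient
    kappa : coherency (crosstalk) coefficient
    """
    return lam * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

def usl_peak_concurrency(sigma, kappa):
    """Concurrency at which the predicted throughput is highest."""
    return math.sqrt((1 - sigma) / kappa)

# Hypothetical coefficients, chosen only for illustration.
LAM, SIGMA, KAPPA = 1000.0, 0.05, 0.0005

peak = usl_peak_concurrency(SIGMA, KAPPA)
print(f"throughput peaks around N = {peak:.1f}")
for n in (1, 20, round(peak), 100):
    print(n, round(usl_throughput(n, LAM, SIGMA, KAPPA)))
```

Past the peak, adding more workers lowers the predicted throughput, which is the degradation point the model is meant to reveal.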

Thursday, July 13, 2017

[Links of the Day] 13/07/2017 : Jack of all trades Deep learning Model, Pay with Group Selfie, Support Vector Machine

One Model To Learn Them All : the authors propose a model that is good enough for most needs, a kind of jack of all trades/master of none model for deep learning. This can be really practical for experimenting and probing a dataset for potential uses, or if you do not have the time to create the ideal model. However, there is always the risk of ending up with an underperforming solution.
Pay-with-a-Selfie : an interesting payment model where the authors propose to use a group selfie to extend the split bank note metaphor for executing financial transactions.
Introduction to Support Vector Machines : if you want to learn about SVM classification systems.

Thursday, July 06, 2017

[Links of the Day] 06/07/2017 : venture capital investment framework, Aftershock in complex systems, ISC workshop

Picking Winners : the authors propose a framework for venture capital investment. However, I feel that there is a major flaw in the approach, as it is designed to correlate previous success with future investment success. This assumes a high degree of repeatability with clearly identifiable elements. The model can suffer from the purple cow effect, where higher growth opportunities are located in underserved segments. As a result, by following the model you might quite often suffer from diminishing returns.
Aftershocks in a complex system : the authors look at the behaviour of a complex system (the currency market) after a catastrophic event. They discovered that such systems often follow a pattern similar to the one seen in earthquake aftershocks. In short, after an initial catastrophic event, most systems suffer a series of gradually diminishing aftershocks spaced out in time.
ISC Workshop 2017 : all about containers and optimisation for high-performance workloads.

Wednesday, April 19, 2017

[Links of the Day] 19/04/2017 : AMD ROCm GPU open platform, Weak Memory Models concurrency report, SSH server for distributed infrastructure

ROCm : this slide deck gives an overview of the AMD ROCm open platform for GPU computing. They are really pushing to become the open source standard for the GPU industry, battling against NVIDIA's supremacy in the domain. It looks like they are making really good progress, and I would be curious to see how this plays out when combined with their Ryzen CPUs.
Concurrency with Weak Memory Models : this is a really good report on the state of memory models in hardware and software. It provides a broad overview of hardware and software concurrency models and approaches, as well as the future direction of the domain.
Teleport 2 : a modern SSH server designed for teams managing distributed infrastructure. [github]

Tuesday, January 24, 2017

[Links of the Day] 24/01/2017 : NumPy cheat sheet, Innovation patterns, Persistent memory summit

NumPy Cheat Sheet: all you need for data analysis in Python with NumPy.
Mathematical Model of innovation patterns: Vittorio Loreto at the Sapienza University of Rome in Italy et al. created the first mathematical model that accurately reproduces the patterns that innovations follow.
Persistent Memory Summit: SNIA NVM summit 2017. Finally, with the introduction of Intel 3D XPoint, we are starting to see more hardware NVM solutions out there, and with them software that uses it. Some really interesting talks:

NOVA file system: demonstrates the benefit of NVM-optimised storage solutions.
SAP HANA on NVM: interesting to see that they still require redundant copies as they fear data corruption on NVDIMMs. I wonder when we will start to see ECC NVDIMMs on the market?
New interconnects: lots of hot new interconnects are battling for domination of the heterogeneous compute ecosystem. Part of that battle is offering persistent-memory-specific solutions: Gen-Z provides PM pooling, OpenCAPI accelerates PM access, and CCIX shares PM.

Monday, December 19, 2016

[Links of the Day] 19/12/2016 : Cloud storage consistency models, heterogeneous memory management and atomic consistency for storage class memory

Consistency Models for Cloud Storage Services : a must-read for anybody relying on any form of cloud storage. It is imperative to understand the consistency model of these services in order to avoid bad surprises. Sadly, a lot of cloud storage services out there lack official documentation on the subject, or are really fuzzy and lack proof.
Soft2LM : heterogeneous memory management. Basically, it optimises memory allocation and migration between tiers in order to minimise power consumption while maximising performance.
Free atomic consistency in storage class memory with software based write-aside persistence : an interesting article on a software stack that aims to deliver atomic consistency for SCM in a write-aside scenario. I am not sure how often the write-aside pattern is used, though.

Thursday, September 01, 2016

[Links of the Day] 01/09/2016 : Cloud reference model, Scaling with Threads and economics of response time

Economic Value of Rapid Response Time : classic 1989 paper demonstrating that lower software response time yields significant economic benefits.
ClouNS : A Cloud-native Application Reference Model for Enterprise Architects. The authors propose a reference model for cloud-native applications that relies only on a small subset of well standardized IaaS services. The reference model can be used for codifying cloud technologies. It can guide technology identification, classification, adoption, research and development processes for cloud-native application and for vendor lock-in aware enterprise architecture engineering methodologies.
Scaling to Thousands of Threads : excellent blog post looking at the misconception that thread-based systems are inherently flawed when it comes to availability.

Tuesday, February 09, 2016

[Links of the day] 09/02/2016: Run openstack in containers, Flocon network security conf, Agent motion book

Kolla : run OpenStack in containers for ease of management and operational flexibility.
Flocon 2016 presentation : Network Situational Awareness for large-scale network flow analytics.
Space & motion of communicating agents : CS thesis looking at leveraging bigraphs to model the space and motion of communicating agents. This is essential because, for the past decades, systems have mainly focused on getting the internal communication of entities right. But with the upcoming punctuated equilibrium of IoT, external communication systems in a non-static world will become critical.

Saturday, February 06, 2016

Guesstimating Private Cloud TCO

I decided to try out the fantastic Guesstimate. Guesstimate is a spreadsheet for things that aren't certain. I recently started to gather information on private cloud TCO and I decided to see if I could quickly sketch out a probabilistic private cloud TCO model with it.

First, a little bit of literature review. On one side, we have the public cloud, with open pricing that makes cost calculation and comparison straightforward. On the other, we have the fog of war of the private cloud. There are very few reports or data sources that are not behind a paywall; here is the result of a (very) quick search for serious sources of private cloud TCO data:

Red Hat Private Cloud TCO model : seems interesting, but the lack of a public methodology makes it difficult to assess.
EU report from the e-InfraNet FP7 project : “Cloud Computing Economics: An evidence-based approach for Research Applications”. Their analysis shows that private clouds operated by public agencies run at a fraction of the cost of AWS (25% of the cost in certain cases).
Helix Nebula – The Science Cloud report : again, in this case it shows that the CERN OpenStack-based cloud costs a fraction of what AWS does (up to 25x cheaper).
The Magellan Report : DOE clouds were found to be 2–13x less expensive than typical commercial offerings.

While all the numbers and comparisons are highly interesting, I needed to find something usable for building a probabilistic TCO model. Finally, I found that many reports and online sources reference the analysis of the cloud cost breakdown by Amazon's James Hamilton. His model boils down to the following:

Servers: 57%
Networking equipment: 8%
Power distribution and cooling: 18%
Power: 13%
Other infrastructure: 4%
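As a quick illustration of how the breakdown above can be used, the following Python sketch applies those percentages to a total budget. The shares come from the list above; the budget figure and the function name `split_cost` are hypothetical and only there for the example.

```python
# Cost shares from the breakdown listed above (they sum to 100%).
HAMILTON_SHARES = {
    "servers": 0.57,
    "networking_equipment": 0.08,
    "power_distribution_and_cooling": 0.18,
    "power": 0.13,
    "other_infrastructure": 0.04,
}

def split_cost(total):
    """Split a total cost across categories using the shares above."""
    return {name: total * share for name, share in HAMILTON_SHARES.items()}

# Hypothetical total monthly cost, in arbitrary currency units.
breakdown = split_cost(1_000_000)
for name, cost in breakdown.items():
    print(f"{name:32s} {cost:>12,.0f}")
```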

For my own model I decided to use the following categories :

Cost of money : The interest that could have been earned on the amount invested. Yep, money is not free.
Software : Not everything is open source, and even then you still need to deploy it.
Hardware : Servers, SSDs, HDDs, etc.
Network : Connectivity, switches, cables, routers, etc.
People/Support : How much you spend on maintaining your environment and dealing with customer needs.
Aircon/power : Powering and cooling your cloud.
Building/Racking : Well, it’s not like you are running everything in the cloud :)

Now that we have a model, let's plug it into Guesstimate. You can see the result of the exercise in the picture below. Note: DO NOT pay attention to the numbers; they are completely made up, as I couldn’t get access to the latest IDC datacenter cost report.

Private Cloud TCO Guesstimate

What I ended up with is a really neat model where you can plug in your own numbers and come up with a quick estimate of your private cloud TCO. It will take more work to produce a more accurate model, but it's good enough for basic estimation. In the meantime, you can find the source here. If you extend or change it, don’t forget to give me a shout. I am interested to see what other private cloud TCO models are out there.
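For readers who prefer code to a spreadsheet, the same idea can be sketched as a small Monte Carlo simulation in Python. This is not the Guesstimate model itself: the cost ranges below are hypothetical placeholders (just as made up as the numbers in the picture), and drawing each category from a uniform distribution is a simplifying assumption on my part.

```python
import random

# Hypothetical yearly cost ranges (low, high) in arbitrary currency units.
# Placeholder values for illustration only, not real datacenter costs.
COST_RANGES = {
    "cost_of_money": (20_000, 60_000),
    "software": (50_000, 150_000),
    "hardware": (300_000, 600_000),
    "network": (40_000, 100_000),
    "people_support": (200_000, 500_000),
    "aircon_power": (80_000, 200_000),
    "building_racking": (30_000, 90_000),
}

def simulate_tco(ranges, runs=10_000, seed=42):
    """Return sorted total-cost samples, drawing each category uniformly."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [
        sum(rng.uniform(low, high) for low, high in ranges.values())
        for _ in range(runs)
    ]
    return sorted(totals)

def percentile(sorted_samples, fraction):
    """Simple percentile lookup on samples that are already sorted."""
    index = min(len(sorted_samples) - 1, int(fraction * len(sorted_samples)))
    return sorted_samples[index]

samples = simulate_tco(COST_RANGES)
p5, median, p95 = (percentile(samples, f) for f in (0.05, 0.5, 0.95))
print(f"TCO 5th pct: {p5:,.0f}  median: {median:,.0f}  95th pct: {p95:,.0f}")
```

The output is a range rather than a single number, which is the whole point of a probabilistic TCO model: swap in your own ranges per category and read off the spread.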

Update 07/02/2016: +Carlo Daffara pointed me to his extremely detailed CAPEX / OPEX spreadsheet model. I'm going to see if I can translate it to guesstimate as a more advanced exercise.

Wednesday, September 09, 2015

No, "you weren't ahead of time", you just were riding the wrong diffusion curve

“Launched ahead of their time” - a claim a lot of startups (and indeed more established companies) use to explain their product failure. In some rare cases, a product is truly ahead of its time: there is no market for it at all, and no supporting component within the supply chain to make it commercially and economically viable. But in most cases, these claims boil down to a lack of traction for their offering.

In this blog post, I will focus on the “prematurely interrupted” hockey stick growth curve that some companies experience and the misunderstanding surrounding it. It looks and feels like exponential growth, but the ride terminates far earlier than the market research predicted. Incomprehension, surprise and denial are common when sales flat-line, because customer feedback was great. As a consequence, companies use the “ahead of their time” excuse to explain their failure. However, the truth is that the market for the product they built simply dried up.

Often these companies misunderstand the diffusion of innovations curve presented below. With successive groups of consumers adopting the new technology (shown in blue), its market share (yellow) will eventually reach saturation level. The usual, mistaken interpretation is that technology adoption implies the same product being consumed across every consumer group.

In this graph, each phase of adoption is represented by a different customer group that requires a tailored product in order to adopt it. While the concept, and to a certain extent the technology, is similar across consumer groups, the actual product may vary drastically in shape and form. As a result, the technology, product, and consumption model evolve with each phase at a different pace. In the graphic below, I have overlaid the actual diffusion curve of each subgroup on top of the diffusion of innovations curve to make this clearer. Note that this concept is derived from Wardley’s mapping technique, which ties diffusion and evolution together within a single map.
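The overlay of per-group diffusion curves described above can be sketched numerically. In this minimal Python sketch, each customer group gets its own logistic adoption curve and the overall market curve is their sum. The segment sizes follow the classic Rogers split (2.5%, 13.5%, 34%, 34%, 16%); the midpoints and growth rates are hypothetical values picked only to stagger the waves.

```python
import math

def logistic(t, size, midpoint, rate):
    """Cumulative adoption of one customer group at time t."""
    return size / (1 + math.exp(-rate * (t - midpoint)))

# (name, share of total market in %, midpoint, growth rate)
# Shares follow the Rogers adopter categories; midpoints and rates are
# hypothetical, chosen only to space the sub-curves out in time.
SEGMENTS = [
    ("innovators", 2.5, 2.0, 1.5),
    ("early adopters", 13.5, 5.0, 1.2),
    ("early majority", 34.0, 9.0, 1.0),
    ("late majority", 34.0, 13.0, 1.0),
    ("laggards", 16.0, 17.0, 1.0),
]

def segment_adoption(t, name):
    """Adoption (in % of total market) within a single named segment."""
    for seg_name, size, midpoint, rate in SEGMENTS:
        if seg_name == name:
            return logistic(t, size, midpoint, rate)
    raise KeyError(name)

def total_adoption(t):
    """Overall market share: the sum of all per-segment curves."""
    return sum(logistic(t, size, mid, rate) for _, size, mid, rate in SEGMENTS)

for t in (2, 6, 10, 14, 18, 40):
    print(t, round(segment_adoption(t, "early adopters"), 1), round(total_adoption(t), 1))
```

A product that only fits the early adopters follows the second column: it looks like a hockey stick at first, then flattens at that segment's size while the overall market keeps growing.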

As you can see, each customer type represents an independent sub-market with its own characteristics and inertia. It can be extremely easy to become trapped within a customer sub-ecosystem. Companies often validate their products within such a subspace and show impressive stats along a number of dimensions, such as high engagement, viral coefficient, or long-term retention. However, what matters is understanding how big the customer market you validated your product in really is, and asking whether it belongs to a bigger ecosystem. Without this information, a company can quickly end up trapped in a local maximum. As a result, companies get boxed into one line of design thinking, making tiny incremental improvements but never looking beyond that one solution. They become addicted to the positive reinforcement created by their customer feedback, which prevents them from exploring innovative solutions along different lines of thinking. That's how you end up with HipChat vs Slack. The only difference between the two is the packaging of the technology, and it allows one to thrive along a bigger diffusion curve while the other seems stuck.

As mentioned, the technology evolves over time and with each diffusion wave, quite often from genesis to custom-built, to product, and finally to utility. However, there are many chasms to cross, as a multitude of competing versions are created, evolve, and die. Crossing from one stage to another requires understanding not only the technological requirements of the new consumption model for the diffusion curve, but also the economic imperative associated with it, as shown in the graphic below. The reality is that the market fabric is a fractal tissue made of a multitude of diffusion curves. You have the actual technology evolution, as shown in the graph below; for each of these curves you have similar sub-curves representing the various adoption rates. These sub-curves are in turn subdivided and overlapped with smaller ones created by each company's products and services competing within the space.

This overall complex fabric makes it difficult to determine the correct strategy to apply. Identifying the current state of the ecosystem, its direction, and when to adapt is a daunting task with a multitude of variables to take into consideration (which I might take a stab at in a future post). The lucky or the visionary who spot the trend early enough may attempt to sell early or pivot their strategy. Pivoting is a rather difficult operation to execute correctly, let alone at the right time. Pivot too early or too late and you can lose the momentum of the current diffusion wave while the next one has not picked up yet. In this case, your capacity to wait it out depends entirely on your burn rate. Many companies fail at that stage simply because of bad timing.

To conclude, when a product, company or startup claims to have failed in its endeavours because it was “ahead of its time”, this is often a misconception. In reality, and unfortunately in the majority of cases, they simply did not understand the ecosystem they were evolving in and got stuck in a local maximum. For some, it turned into a kiss of death; for others, into a curse of zombification.
