A blog about life, Engineering, Business, Research, and everything else (especially everything else)
Wednesday, March 30, 2016
[Links of the day] 30/03/2016 : Supercomputer facilities and ops, gpu compiler, escher etch a sketch
- GPUcc : open-source GPGPU compiler by Google, built on LLVM [slides] [code]
- Trinity Facilities and Operations : what it takes to plan and operate a supercomputer such as Trinity at LANL
- Escher Sketch : ever wanted to make a tessellation-based wallpaper or just a pretty picture? Now you can!
Labels: compiler, escher, gpu, links of the day, operations, supercomputer
Tuesday, March 29, 2016
[Links of the day] 29/03/2016: cache strategy and write avoiding algorithms + Stats intro ebook
- Write-Avoiding Algorithms : when you have to deal with the CAP theorem, sometimes the best strategy is to avoid confrontation. In this case, avoid the operations that trigger consistency transactions. This paper looks into algorithms that try to minimise write operations, and with them the distributed coherence operations they cause, and into the associated benefits.
- FairRide : paper looking into whether a shared cache can deliver isolation guarantee, strategy-proofness and Pareto efficiency all at once. It turns out it is actually not possible, but you can get close enough.
- Intro Stat with Randomization and Simulation : free statistics intro ebook
Labels: cache, CAP, consistency, links of the day, statistics
Monday, March 28, 2016
[Links of the day] 28/03/2016: Hierarchy of engagement, Latency measurement, TAO consistency at Facebook
- The Hierarchy of Engagement : another excellent Greylock Partners slide deck, on how to leverage the Hierarchy of Engagement to fuel the growth of your company. The proposed hierarchy model has three levels: 1) Growing engaged users, 2) Retaining users, and 3) Self-perpetuating.
- Measuring and Understanding Consistency at Facebook : summary of the paper on TAO, Facebook's highly consistent DB. The interesting thing is that they have a hierarchical consistency model, with synchronous cache consistency and an asynchronous cache and DB/storage invalidation model.
- How NOT to Measure Latency : in-depth overview of latency and response time characterization, including proven methodologies for measuring, reporting, and investigating latencies, and an overview of some common pitfalls encountered (far too often) in the field
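As a small illustration of one reporting pitfall in latency measurement, here is a minimal Python sketch of why an average alone can hide tail latency. The sample values and the `percentile` helper are made up for illustration and are not taken from the talk.

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]

# Hypothetical response times in milliseconds: 98 fast requests, 2 slow ones.
latencies_ms = [10] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
max_ms = max(latencies_ms)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms, max={max_ms} ms")
```

The mean lands somewhere that no request actually experienced, while the median looks perfectly healthy; only the high percentiles and the maximum expose the slow requests.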
Labels: business, business model, consistency, db, Distributed systems, engagement, facebook, growth, latency, links of the day
Saturday, March 26, 2016
PureStorage brings us one step closer to micro storage architecture
PureStorage just released its FlashBlade product. It is a fabric-connected object storage solution: a modular system composed of a large number of blades, each made of:
- 8TB to 52TB raw NAND storage capacity : a lot, but it still takes less than half the real estate on each blade.
- NV-RAM + supercapacitor write buffer : when your NAND is still too slow, you want a persistent buffer of NVRAM to handle the bursts.
- ARM CPU + FPGA : to deal with the “low level” operations such as erasure coding, etc.
- 8-core Xeon system on chip : for moving the computation to where the data is located; it handles pretty much all the high-level operations such as NFS, S3, object storage, etc.
- 40 Gbit ethernet : that's where the data gets out.
- PCIe fabric networking : in-chassis solution linking compute and storage cards via a proprietary protocol. What's interesting is that the system is self-contained, and scaling out to other boxes goes through the 10 Gb/s connectivity and not a proprietary fabric link. This implies that it doesn't need an exotic solution once you go past the box boundaries. This is great as it makes it easy (and cheap) to scale; however, I wonder what the implications are in terms of performance once you start crossing those boundaries.
What is interesting is that, when you look at the PureStorage solution, they decided to integrate the high-level compute aspect of storage directly with the low-level one in a single blade. They ended up with a hybrid solution combining ARM and FPGA for the low-level aspects, such as deduplication and erasure coding, and the Xeon for the object storage and file system layers.
One can assume that the decision behind such an architecture was driven by customer requirements, which tend toward a high-performance jack-of-all-trades solution. I can picture the product manager arguing for supporting every scale-out storage protocol popular at the moment. However, Jack always ends up master of none, and to compensate PureStorage had to pump up the compute capabilities.
While this seems like a good choice, it is also counterproductive in terms of watts per GB, and a lot of real estate ends up wasted or duplicated. Don't get me wrong, what Pure achieved with the FlashBlade is impressive, but I can't stop thinking that they should have taken it a step further.
This type of high-performance, high-cost and high-power architecture is a step in the right direction toward a micro storage architecture that delivers low cost, low power, high performance and scalability. Now it is all about trimming down the system while maintaining scalability, by dividing the blade system into a much larger number of smaller nodes, literally offering the flash equivalent of HGST's ethernet-connected drives.
However, this might also imply that you won't be able to offer support for every single storage solution out there (NFS, S3, block, etc.) without having to rely on either client-side processing or a frontend. This should be achievable while maintaining excellent performance; the key will lie in the details of the core storage API employed.
Labels: arm, bigdata storage, flash, fpga, micro storage, nand, nvram, purestorage, storage
Friday, March 25, 2016
[Links of the day] 25/03/2016: Scheduling with queuing theory, LLVM Assembler Framework, Erasure Coding at Azure
- Efficient Queue Management for Cluster Scheduling : MS researchers look into introducing queue management techniques, such as appropriate queue sizing, prioritization of task execution via queue reordering, starvation freedom, and careful placement of tasks to queues, for big data cluster scheduling.
- Keystone : an Indiegogo project to refactor LLVM into a multi-architecture, multi-platform, open source assembler framework.
- Erasure Coding in Windows Azure Storage : MS Azure uses Local Reconstruction Codes (LRC) for its storage. LRC greatly reduces the number of erasure coding fragments needed for reconstruction when data is lost or offline. The key benefit is a drastic reduction in the I/O and bandwidth required for repairs while keeping the storage overhead low.
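To make the local reconstruction idea from the erasure coding item concrete, here is a minimal Python sketch. It only models the local parities, using plain XOR over small groups; the global parities of a real LRC need finite-field arithmetic and are left out. The fragment count, group size and function names are illustrative assumptions, not the Azure implementation.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_all(fragments):
    """XOR a non-empty list of equal-length fragments together."""
    return reduce(xor_bytes, fragments)

def encode_local_parities(data_fragments, group_size):
    """One XOR parity per local group of `group_size` data fragments."""
    groups = [data_fragments[i:i + group_size]
              for i in range(0, len(data_fragments), group_size)]
    return [xor_all(group) for group in groups]

def reconstruct(data_fragments, local_parities, lost_index, group_size):
    """Rebuild one lost fragment by reading only its own local group."""
    group = lost_index // group_size
    start = group * group_size
    survivors = [data_fragments[i]
                 for i in range(start, start + group_size)
                 if i != lost_index]
    reads = survivors + [local_parities[group]]
    return xor_all(reads), len(reads)

# Illustrative layout: 12 data fragments split into 2 local groups of 6.
data = [bytes([i] * 4) for i in range(12)]
parities = encode_local_parities(data, group_size=6)

# Pretend fragment 7 is offline and rebuild it from its local group.
recovered, fragments_read = reconstruct(data, parities, lost_index=7, group_size=6)
print(recovered == data[7], fragments_read)
```

With this layout a single lost fragment is rebuilt from 6 reads (5 surviving group members plus the local parity), whereas a code with no local groups would have to read as many fragments as there are data fragments, 12 here.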
Labels: assembler, azure, bigdata, cluster, erasure code, links of the day, llvm, Queuing Theory, scheduling, storage
Thursday, March 24, 2016
[Links of the day] 24/03/2016: Testing distributed systems, SDN OS, HW/SW for Storage Class Memory
- Technologies for Testing Distributed Systems : testing distributed systems is hard, and unit testing does not really cut it when it comes to Byzantine faults.
- ONOS : Open Network Operating System (ONOS) is a software defined networking (SDN) OS
- WrAP : Hardware and Software Support for Atomic Persistence in Storage Class Memory
Labels: Distributed systems, hardware, links of the day, nvm, os, sdn, software defined, testing
Wednesday, March 23, 2016
[Links of the day] 23/03/2016: containers patterns, delta CRDTs, probabilistic DB
- Container Patterns : WiP but promising documentation of container patterns. Check the v1.0 branch.
- Efficient State-based CRDTs by Delta-Mutation : instead of shipping the full state of a CRDT, the authors propose to use delta-based messages in order to reduce storage and network overhead.
- BlinkDB : allows users to trade off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars. Really cool, we are starting to see probabilistic programming emerge everywhere. We just have to get used to the idea that, as in real life, computer programs can be more efficient when not everything is certain.
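The delta-mutation idea from the CRDT item above can be sketched with a grow-only counter. This is a minimal Python illustration under simplifying assumptions (reliable delivery, no delta buffering or causal tracking), with a hypothetical `DeltaGCounter` class; it is not the paper's implementation.

```python
class DeltaGCounter:
    """Grow-only counter whose mutator returns a small delta to ship."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> highest count seen for that replica

    def increment(self, amount=1):
        """Apply the increment locally and return only the changed entry."""
        new_count = self.counts.get(self.replica_id, 0) + amount
        self.counts[self.replica_id] = new_count
        return {self.replica_id: new_count}

    def merge(self, state_or_delta):
        """Join by per-replica maximum; works for deltas and full states."""
        for replica, count in state_or_delta.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    def value(self):
        return sum(self.counts.values())

a = DeltaGCounter("a")
b = DeltaGCounter("b")

deltas_from_a = [a.increment() for _ in range(3)]
delta_from_b = b.increment(2)

# Each replica receives only the other's deltas, never its full state.
for delta in deltas_from_a:
    b.merge(delta)
a.merge(delta_from_b)

# Delivering a delta twice is harmless because merge is idempotent.
a.merge(delta_from_b)

print(a.value(), b.value())
```

Each delta here carries a single entry regardless of how many replicas exist, while a full-state message grows with the number of replicas, which is where the storage and network savings come from.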
Labels: container, crdt, database, links of the day, probabilistic