Thursday, January 18, 2018

[Links of the Day] - 18/01/2018 : Stellar cryptocurrency consensus protocol, Optimizing Linux servers for high throughput and low latency, performance impact of Meltdown patches on HPC filesystems

  • Stellar Consensus Protocol : from Ripple fork to full-blown rewrite. Stellar looks like an impressive protocol addressing many of the shortcomings and risks of Ripple. Also, the authors seem to be smart enough to avoid jumping too fast onto the smart contract aspect, as it is a really tough nut to crack. With all the mayhem surrounding cryptocurrency, the Stellar approach seems rather measured. Worth keeping an eye on.
  • Optimizing web servers for high throughput and low latency : very good post on how to optimise your Linux system. A lot of it has already been described many times, but it is never a bad thing to repeat it.
  • The performance impact of Meltdown patches on HPC FS (Lustre) : no surprise here, IO-intensive applications are the ones most heavily impacted. However, I wasn't expecting a 40% performance penalty, and up to 45% for large folders.




Tuesday, January 16, 2018

[Links of the Day] 16/01/2018 : planetary scale DB - AntidoteDB, Benchmarks for Machine Learning and the hardware running the algorithms

  • AntidoteDB : large-scale (planet-scale) distributed DB system, competing with the likes of CockroachDB or Spanner. The core differentiator is that the architecture relies heavily on CRDTs for its core functionality. It is a spin-off from the SyncFree EU research project. Sadly, like a lot of EU or research-driven startup spin-offs, the documentation and website lack polish. The architecture reference link is broken and a lot of stuff seems to be work in progress. Come on, guys! If you want to build a community and a product, you really need to pick up the pace. This project has great potential, don't let it go to waste.
  • Machine Learning Benchmarks - Hardware Provider : a very good survey of machine learning benchmarks across the current cloud providers. What is even more useful is that you get a cost overview of running ML applications, which is often a big unknown at the moment.
  • DeepMind Control Suite : benchmark suite for machine learning algorithms using a set of continuous control tasks with a standardised structure and interpretable rewards
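
For readers who have not met CRDTs (the building block mentioned in the AntidoteDB entry above), here is a minimal sketch of the simplest one, a grow-only counter. This is not AntidoteDB's API: the class name GCounter and the replica names are made up for illustration. The point is only that replicas can accept writes independently and still converge once they exchange state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged with element-wise max."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # Each replica only ever bumps its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter value is the sum of all per-replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two hypothetical replicas accept writes independently, then exchange state.
dublin = GCounter("dublin")
tokyo = GCounter("tokyo")
dublin.increment(3)
tokyo.increment(2)
dublin.merge(tokyo)
tokyo.merge(dublin)
tokyo.merge(dublin)  # merging the same state again changes nothing
print(dublin.value(), tokyo.value())  # prints "5 5"
```

Real systems layer richer types (sets, maps, registers) and causal delivery on top of this idea, but the merge-must-converge rule is the same.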


Thursday, January 11, 2018

[Links of the Day] 11/01/2018 : Two machine learning conferences, NIPS 2017 & Robot Learning CoRL 2017, CS paper: ML detectors can still be fooled too easily

  • NIPS : This conference is considered one of the biggest events in the ML/DNN research community. Here are two sets of notes from the conference, by Olga Liakhovich and by David Abel. These are two fairly long articles, but worth a read. Looks like fairness and bias are among the big topics of the moment. Also, I like how ML is compared to alchemy. The current approach is extremely fragile, tailor-made and not fully understood. Too often machine learning tools are treated as black boxes where you shove in data at one end and get a result at the other.
  • Conference on Robot Learning (CoRL) : robotics and machine learning are converging at an aggressive pace. It is rather impressive how all these different aspects of computer science are clicking together, and how each small improvement in each domain leads to an overall jump in robotic capability.
  • Adversarial Examples that Fool Detectors : last but not least, common machine learning classifiers are still way too fragile and can be easily fooled. With the boom in the use of ML techniques everywhere, this can quickly become a problem in the near future.




Tuesday, January 09, 2018

[Links of the day] 09/01/2018 : Learned index structures, 2 papers on human behavior : herding and stubbornness in jury deliberation, overconfidence is universal?

  • The Case for Learned Index Structures : as performance progression for single-core CPUs slows down (not to mention Spectre and Meltdown slowing down existing ones), applications move to a distributed model to scale. As a result, databases and distributed systems are forced to become more data-aware to achieve efficiency and performance. This is a very nice paper demonstrating that data structures often contain components that are learnable, and that machine learning systems can help optimise those data structures.
  • Evidence of Herding and Stubbornness in Jury Deliberations : humans do not rely on logic for important decisions and try to coerce fellow humans to fit their opinion... While this is widely known, we now have a good hint that this even happens in the judicial system of trial by jury. That, or too many people saw Twelve Angry Men.
  • Overconfidence Is Universal? : interesting paper trying to understand how to identify overconfidence, and whether this behaviour is more prevalent in certain populations or genders.
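
To give a feel for what "learnable components" means in the learned index entry above, here is a toy sketch: a straight line fitted to a sorted array predicts where a key should sit, and a bounded binary search corrects the prediction. This is my own simplified illustration, not the model hierarchy from the paper; the LearnedIndex class and its methods are hypothetical names.

```python
import bisect
from typing import List


class LearnedIndex:
    """Toy learned index: a least-squares line maps key -> approximate position."""

    def __init__(self, keys: List[int]) -> None:
        if not keys:
            raise ValueError("need at least one key")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(p - self._predict(k)) for p, k in enumerate(self.keys))

    def _predict(self, key: int) -> int:
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key: int) -> int:
        """Return a position of key in the sorted array, or -1 if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Only the small window around the prediction is searched.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return -1


idx = LearnedIndex([3, 8, 15, 16, 23, 42, 57, 91])
print(idx.lookup(23), idx.lookup(50))
```

The better the model fits the key distribution, the smaller max_err gets and the less searching is left to do, which is the whole trade the paper explores.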




Thursday, November 16, 2017

[Links of the Day] 16/11/2017 : Sparse and dense array database, Rhythm of memory, Routing over blockchain

  • TileDB : manages massive dense and sparse multi-dimensional array data simply. This is a really good project, as there is often no real support for this kind of data in existing databases.
  • Rhythm of memory : the brain is a complex organ, and we have barely scratched the surface. Scientists discovered that part of the memory processing in the brain is segregated into different subcomponents that process information in parallel and at different speeds. This gives a glimpse of how the brain works and how it is able to store and quickly access so much data at various levels of granularity.
  • IPvPub : I really think there is something behind this concept. While I tend to be wary of the current trend of sprinkling blockchain everywhere, using this technology for large-scale address resolution and routing can solve so many problems. It could reduce reliance on the DNS system in the age of Lambda. What I really want to see is this integrated with a Lambda framework for simple exposure of service endpoints.


Tuesday, November 14, 2017

[Links of the Day] 14/11/2017: Wallaroo elastic data processing, Unified clock for real-time synchronization, Golang text search engine


  • Wallaroo : this is a really impressive alternative to Spark/Storm. The creators really focused on ease of use and deployment, which can be a huge barrier to entry with the Spark, Storm and Hadoop stacks. [github]
  • Unified Clock : must-read blog post on how Riot Games uses a unified clock mechanism to synchronise players in the League of Legends multiplayer game.
  • Riot : Golang text search engine with some really good perf: indexing 1M blog posts (500M of data) finishes in 28 seconds, with 1.65 ms search response time and 19K search QPS.


Thursday, November 02, 2017

[Links of the Day] 02/11/2017 : Probabilistic programming Library, Better than Word2Vec , Serverless conf

  • Edward : Turing-complete deep probabilistic programming software library. Probabilistic computing is going to be at the forefront of the next big AI improvement wave. [software]
  • Better than Word2Vec : first, word2vec is great, and there are already so many pre-built libraries out there that it should be your number one go-to approach. Then, if you want to develop a custom word embedding library, as the blog post explains, SVD might be a better approach.
  • ServerlessConf NYC 2017 : a good summary of the 2017 New York serverless conference.