Tuesday, October 10, 2017

[Links of the Day] 10/10/2017 : Machine Learning Hardware acceleration , Homomorphic encryption

  • Tutorial on Hardware Architectures for Deep Neural Networks : How to leverage hardware for accelerating machine learning processes. 
  • A Survey on Homomorphic Encryption Schemes : this paper presents a thorough survey of the state of homomorphic encryption schemes. Homomorphic encryption allows manipulation of encrypted data without the need to decrypt it. Once hardware is fast enough to deal with the complexity of the operations, this will allow a truly secure distributed multitenant database, as no operation on the hosting side will require decrypting the data to clear text and everything that needs clear text can be done securely on the client side. 
  • Efficient Methods and Hardware for Deep Learning : Stanford lecture where guest lecturer Song Han presents algorithms and specialized hardware (FPGA, GPU, ASIC, etc.) that can be used to accelerate training and inference of deep learning workloads. [video]
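
To make the "compute on encrypted data" idea from the homomorphic encryption item concrete, here is a minimal sketch of an additively homomorphic scheme (a toy Paillier cryptosystem). The tiny hard-coded primes, the constant names and the function names are assumptions chosen for illustration; this is insecure and is not taken from the survey. The host only ever multiplies ciphertexts, while the client keeps the private values and decrypts the sum.

```python
import math
import random

# Toy key material built from tiny hard-coded primes (insecure, illustration only).
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1  # common simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1), private
MU = pow(LAM, -1, N)  # modular inverse of LAM mod N (needs Python 3.8+), private


def encrypt(m):
    """Client side: encrypt an integer 0 <= m < N with fresh randomness."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Client side: recover the plaintext using the private values LAM and MU."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N) * MU % N


def add_encrypted(c1, c2):
    """Host side: add the hidden plaintexts by multiplying the ciphertexts."""
    return (c1 * c2) % N_SQ


# The host sums two values it cannot read; only the client can decrypt the total.
total = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(total))  # 1545
```

The sum has to stay below N for the result to be exact, which is one reason real deployments use far larger keys and why the cost of the operations matters so much.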



Friday, October 06, 2017

[Links of the Day] 06/10/2017 : HPC routing topology on dependency graph, Arm network stack development, Multicore graph processing

  • Routing on the Channel Dependency Graph : the aim is to provide a toolset for topology calculation in HPC networks. 
  • Arm for network stack developers : Arm is slowly trying to move up the stack into the data centre world. For that, it needs to address one of its main limitations: IO. This slide deck describes the current effort to tackle network stack limitations ( and support RDMA ), and provides pointers to where ARM or devs can push the Linux stack further, as Linux is pretty much the only viable software stack for such hardware infrastructure in the datacenter. 
  • Multicore graph processing : this presents a very good overview of the multicore graph processing problem, the solution landscape and where it's heading. Graph problems are essential to solve in the domain of social network modelling as well as for item recommendation, search and website ranking. [slides]



Wednesday, October 04, 2017

[Links of the Day] 04/10/2017 : TensorFlow Extended , 5d Torus , Machine learning influencing hardware designs

  • TFX : Google paper describing its TensorFlow-based platform for production-scale deployment. It is impressive how they created a platform that delivers robust TensorFlow-based learners and support for continuous training and serving with production-level reliability. [video]
  • 5d torus : when your 3d torus fabric is not enough :) . The authors demonstrate how they are able to achieve such a feat using 6-port NICs. Its domain of application is still limited to HPC environments, as only tailored applications can leverage such a network topology. However, there might be potential in the machine learning domain. There is only a future for such an approach if they are able to demonstrate significant acceleration for popular problems ( ML ) and make it fungible for cloud-like deployment, i.e. you need to be able to easily partition a set of nodes connected by such a network and share it in a multitenancy model. [slides] [thesis]
  • Machine Learning and the Implications for Computer System Design : Jeff Dean's talk at the Hot Chips 2017 conference. Jeff really shows the current influence of ML on present and future hardware architecture. 





Thursday, September 14, 2017

[Links of the Day] 14/09/2017 : Automating Turing test, Deep Learning Survey, Unix History


  • Toward Automatic Turing Test : when software is used to detect if it's software or a real person talking, it feels like it should be submitted to the Totally Not Robots subreddit. The problem is way more complex and useful than it seems. By automating the procedure you could do fast prototyping and testing of models with limited human input, accelerating the research and reducing costs. 
  • Survey of Deep Reinforcement Learning : covers the central algorithms in deep reinforcement learning, including the deep Q-network, trust region policy optimisation, and asynchronous advantage actor-critic.
  • Unix - History and Timeline : the history of Linux's grandfather OS. Surprisingly enough, the latest version 3 spec and ISO/IEC spec came out in 2003, which is only 14 years ago ( I feel old now.... )

Tuesday, September 12, 2017

[Links of the Day] 12/09/2017 : OpenFaaS serverless framework, Papers I like, Docker comparison tool

  • FaaS : Functions as a Service (OpenFaaS) is a serverless framework using Docker & Kubernetes. What I really like about this approach is that it simply relies on STDIN and STDOUT as the way of passing the event trigger to the serverless function and getting its output back. It allows great flexibility and opens up functionality that you wouldn't have by using Lambda, for example, as Lambda constrains you behind the REST + API gateway model.
  • Papers I like : start of a really cool series ( 5 parts so far) of interesting fundamental papers. Must check out!
  • lstags : a practical little tool that allows you to compare local Docker images with the ones in a repository.
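
The STDIN/STDOUT contract described in the FaaS item can be sketched in a few lines. This is a minimal illustration of the idea, not OpenFaaS's actual template code: the `handle` and `run` functions and the word-counting logic are hypothetical, and the platform's piping is simulated here with in-memory streams.

```python
import io


def handle(event_text):
    """Hypothetical handler: turn the event payload into a response string."""
    words = event_text.split()
    return "received {} word(s)".format(len(words))


def run(stdin, stdout):
    """Read the whole event from stdin and write the function output to stdout."""
    payload = stdin.read()
    stdout.write(handle(payload))


# Simulate the platform piping an event in and capturing what comes out.
fake_in = io.StringIO("hello serverless world")
fake_out = io.StringIO()
run(fake_in, fake_out)
print(fake_out.getvalue())  # received 3 word(s)
```

In a real deployment the same `run` would be called with `sys.stdin` and `sys.stdout`, which is why any program that can read and write text streams can act as a function, whatever language it is written in.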


Thursday, September 07, 2017

[Links of the Day] 07/09/2017 : Platform for Partial Differential Equations solver, High perf serverless event & data processing, K8s serverless framework

  • FEniCS Project : a platform for solving partial differential equations. It's really cool as it allows you to rapidly solve and test PDE models using a minimum amount of code. 
  • nuclio : High-Performance Serverless event and data processing framework
  • kubeless : Kubernetes Native Serverless Framework, I have a feeling that Mesos lost the container orchestrator war ...


Tuesday, September 05, 2017

[Links of the Day] 05/09/2017 : Patent Surviving Alice, Hot Cloud 17 conference papers

  • 7 Post-Alice Patent Cases That Survived 101 Rejections : the Alice US Supreme Court decision started a slaughter in the US patent office regarding IT-related patents: 8400 applications dropped and 60k+ rejected, while courts invalidated the vast majority of the patents challenged in litigation. However, it seems that there is a way to survive the onslaught, and it's quite simple. You just need your patent to satisfy the following criteria: novelty, enablement, non-obviousness, and last but not least usefulness. It seems like a no-brainer, but the USPTO apparently allowed itself to be flooded by subpar applications that gamed the system. Not to mention that the agency also gained financially from such practice to some extent. 
  • Hot cloud 17 : the Hot Cloud conference just finished, here is a selection of interesting papers
    • JavaScript for extending low-latency in-memory key-value stores : Adrian Colyer takes a look at an in-memory JavaScript engine using RAMCloud. The RAMCloud project is entering the use case phase of the research project, eyeing commercialisation. Sadly, most solutions put forward are extremely niche. The risk is for great people to be stuck in a zombie startup if they try to run with it. Taken separately, the tech that came out of the RAMCloud project is amazing. However, the solution as a whole doesn't really have a great killer app or any potential beyond some niche market. [paper]
    • Towards Index-based Global Trading in Cloud Spot Markets : the authors propose to use an index-based prediction model rather than a per-spot-instance one in order to obtain greater reliability at lower cost.
    • DAL: A Locality-Optimizing Distributed Shared Memory System : a different take on the whole in-memory K/V system: the authors aggressively move the data to the computation rather than offering remote access. This allows great data reuse. We used something similar in hecatonchire. However, there is a certain risk when you have a high level of churn or serial data access: local caching of data then generates a high level of eviction, effectively doubling the bandwidth usage. 
    • Leader or Majority: Why have one when you can have both? : Raft is a great consensus protocol ( and easier to understand ). However, the over-reliance on the leader is the main bottleneck for scalability of operations. The authors ( from cockroachdb ) propose quorum-based read operations that alleviate the load on the leader while still retaining strong consistency. This allows them to improve write perf by 4x and increase throughput by 60%, which is quite impressive. 
    • DCCast: Efficient Point to Multipoint Transfers Across Datacenters : the authors propose an efficient multipoint data transfer protocol allowing greater efficiency and bandwidth usage.
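
The intuition behind the quorum reads mentioned in the Raft item can be sketched as follows. This is a generic majority-read illustration, not the protocol from the paper: the `Replica` class, the `quorum_read` function and the sample cluster are hypothetical, and the sketch assumes every version held by a replica is already committed, which a real Raft-like system cannot take for granted. The point is only that any majority of replicas overlaps with the majority that acknowledged the latest write, so a read can skip the leader and still see it.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Replica:
    name: str
    version: int  # index of the latest write this replica has applied
    value: str


def quorum_read(replicas, contacted):
    """Read from a subset of replicas; require a majority, return the newest value."""
    majority = len(replicas) // 2 + 1
    if len(contacted) < majority:
        raise ValueError(
            "need at least {} replicas, got {}".format(majority, len(contacted))
        )
    newest = max(contacted, key=lambda r: r.version)
    return newest.value


# Version 7 was acknowledged by a majority (leader, f1, f3); f2 and f4 lag behind.
cluster = [
    Replica("leader", 7, "v7"),
    Replica("f1", 7, "v7"),
    Replica("f2", 6, "v6"),
    Replica("f3", 7, "v7"),
    Replica("f4", 5, "v5"),
]

# Every possible 3-replica read, including those that avoid the leader, sees v7.
results = {quorum_read(cluster, list(group)) for group in combinations(cluster, 3)}
print(results)  # {'v7'}
```

Reads that avoid the leader are what take load off it; the price is contacting several replicas per read instead of one.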