Reflections Of The Void: papers

Friday, June 19, 2020

With enough data and/or fine tuning, simpler models are as good as more complex models

This is an age-old issue that seems to repeat itself in every field. There are a couple of recent papers published criticising the race to beat SOTA.

This recent paper demonstrates that older and simpler models perform as well as newer models, as long as they get enough data to train on.

This has interesting implications for production systems. If you already have a good enough model, throwing more data at it can get you close to SOTA results.
This means you don't have to build a new model from scratch to keep up with SOTA in your production system. You just need to collect more data as the system runs and retrain your model once in a while.
Also, less complex models tend to have shorter inference times in production, which is another crucial factor affected by model complexity.

In another recent paper, the authors look at metric learning papers from the past four years and demonstrate that the claimed performance gains over older methods (often more than double) are mainly due to a lack of tuning of the baselines.

Most of the time, the authors of the SOTA-beating algorithm show two evaluations. In one, they fine-tune their algorithm on the test set and compare it against the previous SOTA algorithm with its off-the-shelf tuning.

"Our results show that when hyperparameters are properly tuned via cross-validation, most methods perform similarly to one another"

"...this brings into question the results of other cutting edge papers not covered in our experiments. It also raises doubts about the value of the hand-wavy theoretical explanations in metric learning papers."

This happens time and time again across industry and academia: performance benchmarks of CPUs (Intel vs AMD), GPUs (Nvidia vs ATI), network, storage, etc.
This can be due to a lack of knowledge, time, integrity, etc.
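To make the cross-validation point from the quote above concrete, here is a minimal sketch of tuning a hyperparameter on held-out folds only, so the final test set is never touched during tuning. The toy 1-D k-nearest-neighbour classifier, the synthetic data and every function name below are illustrative assumptions, not the setup used in the paper.

```python
import random
from statistics import mean

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points (ties go to 0)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def cv_accuracy(data, k_neighbors, n_folds=5):
    """Mean held-out accuracy of k-NN across the folds."""
    scores = []
    for fold in k_fold_indices(len(data), n_folds):
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        correct = sum(knn_predict(train, x, k_neighbors) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)

def tune(data, candidates):
    """Pick the candidate with the best cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(data, k))

# Synthetic, hypothetical data: label is 1 when x > 0.5, with 20% label noise.
random.seed(0)
data = []
for _ in range(200):
    x = random.random()
    label = 1 if x > 0.5 else 0
    if random.random() < 0.2:
        label = 1 - label
    data.append((x, label))

best_k = tune(data, candidates=[1, 5, 15])
```

A fair comparison would run the same `tune` step, with the same budget of candidates, for both the old and the new method before reporting numbers on a separate test set.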

To conclude, be careful: the latest shiny model might not be the best one for your production system. If you spend enough time and data on older models, you might achieve the same performance at a lower inference cost.
Obviously, this assumes that you already follow best practices when it comes to model monitoring in production :)

Tuesday, December 11, 2018

[Links of the Day] 11/12/2018 : Papers : Recognising disguised faces and deceiving NeuralNet with visual illusions, and the dry history of liquid computers

Recognizing Disguised Faces in the Wild : you can hide, but not from the almighty computer. It is way more efficient to actually deceive in this case (see below).
Convolutional Neural Networks Deceived by Visual Illusions : as is already known, a neural net can be easily deceived. I think we are going to quickly enter a sort of arms race, probably like the golden age of virus/anti-virus software, where we will see ever more complex recognition and anti-recognition tools in the wild.
The dry history of liquid computers : it's not all about silicon. You can make logic gates using fluids!

Thursday, September 20, 2018

[Links of the Day] 20/09/2018 : Arxiv paper viewer, Artificial intelligent atomic force microscope, What they don't teach you running a business by yourself

Arxiv Vanity : if you are like me and read a lot of papers from Arxiv, this website will save you a ton of time. It renders academic papers from Arxiv as web pages so you don't have to download or decipher the PDF. It makes life so much easier if you are on mobile and can't wait to read the latest paper on kitten deep learning recognition.
Artificial Intelligent Atomic Force Microscope Enabled by Machine Learning : the authors demonstrate how you can use artificial intelligence with an atomic force microscope for pattern recognition and feature identification.
Things they don’t teach you running a business by yourself : great short post on the different aspects of running a small business by yourself. If you want to start your own business, I would also advise reading "Start Small, Stay Small" by Rob Walling and Mike Taber. It was an eye-opener. You don't have to go big with your business. Instead, you can run ten simultaneous businesses, diligently managing and tracking your time to run each one as efficiently as possible. It doesn't matter if one falters. This approach allows you to create a comfortable cushion and increase the chance of a higher payoff.

by dahlig

Thursday, June 07, 2018

[Links of the Day] 07/06/2018 : Quantum algo for beginners, Dynamic branch prediction and Running Python in Go

Quantum Algorithm Implementations for Beginners : this paper presents many of the basic algorithms used for quantum computation. It's a good start if you want to check out what a quantum computer can do and test it on a real one!
A Survey of Techniques for Dynamic Branch Prediction : with all the Spectre and Meltdown attacks, this paper is a good refresher on what dynamic branch prediction is, how it works and why we need these techniques.
Cgo and Python : when you want to run Python in your Go, and you realise that the Python threading model is still a pain in the $"!$*(^!& .

Thursday, May 17, 2018

[Links of the Day] 17/05/2018 : Edge Computing and the Red Wedding problem, Vector Embedding utility, Scalability efficiency

Towards a Solution to the Red Wedding Problem : interesting look at how to handle a massive read spike while still being able to update the content (a write spike) at the same time. The authors propose to leverage edge computing to spread and limit the impact of a write-heavy spike in such a network.
Magnitude : this is a really cool project for those out there dabbling with NLP and vector embeddings. This package delivers a fast, efficient universal vector embedding utility.
Scalability! But at what COST? : the authors of this paper introduce the concept of measuring the scalability of a solution by the hardware configuration required before the platform outperforms a competent single-threaded implementation. As is often the case, most systems and companies do not need a monstrous cluster to satisfy their needs. But it's always more glamorous to say: "we used a cluster" rather than: "I upgraded the RAM so the model can fit in memory".
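The COST idea from the last link can be sketched in a few lines: given the runtime of a competent single-threaded baseline and the measured runtimes of a scalable system at several core counts, find the smallest configuration that actually beats the baseline. The function name and all the timings below are hypothetical, made up for illustration, and are not numbers from the paper.

```python
def cost(single_thread_seconds, scaling_runs):
    """Smallest core count whose runtime beats the single-threaded baseline.

    scaling_runs maps core count -> measured runtime in seconds.
    Returns None when no measured configuration wins (unbounded COST).
    """
    winners = [cores for cores, seconds in scaling_runs.items()
               if seconds < single_thread_seconds]
    return min(winners) if winners else None

# Hypothetical measurements for an imaginary distributed system.
baseline_seconds = 300.0
cluster_runs = {1: 1200.0, 16: 420.0, 64: 250.0, 128: 180.0}

system_cost = cost(baseline_seconds, cluster_runs)  # 64 cores in this made-up case
```

In this made-up case the system scales nicely from 1 to 128 cores, yet needs 64 cores just to match a laptop-style baseline, which is exactly the kind of overhead the metric is meant to expose.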

Tuesday, July 04, 2017

[Links of the Day] 04/07/2017 : Classic Papers, Operating Model Canvas, Origami anything

Classic Papers : a collection of highly-cited papers in their area of research that have stood the test of time. For each area, we list the ten most-cited articles that were published ten years earlier.
Operating Model Canvas : a set of tools and models to help align operations and organisation with strategy.
Origami anything : New algorithm generates practical paper-folding patterns to produce any 3-D structure.

Thursday, August 04, 2016

[Links of the day] 04/08/2016 : Industrial Organisation Reading List, SQL migration, Quiescent States

Industrial Organization reading list : extensive fall reading list on industrial organisation, covering sectors (finance, health care, others), incentives, production, organization and competition.
gh-ost : triggerless online schema migration solution for MySQL. It is testable and provides pausability, dynamic control/reconfiguration, auditing, and many operational perks. [github]
Using Quiescent States to Reclaim Memory : an explanation of how to use quiescent states to implement lock-free algorithms. Similar to RCU in userspace.
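The quiescent-state idea from the last link can be illustrated with a small single-threaded simulation: a retired object may only be reclaimed once every registered thread has announced a quiescent state after the object was retired. This is only a sketch of the bookkeeping, with hypothetical names (`QsbrDomain`, `retire`, `quiescent`); it is not lock-free, it does not reflect the linked article's actual code, and in Python "freeing" is simulated by moving the object to a list.

```python
class QsbrDomain:
    """Toy bookkeeping for quiescent-state-based reclamation."""

    def __init__(self):
        self.global_epoch = 0
        self.thread_epochs = {}  # thread id -> epoch seen at its last quiescent state
        self.retired = []        # (epoch at retirement, object) pairs awaiting reclaim
        self.freed = []          # stand-in for memory actually released

    def register(self, tid):
        """A thread joins; it is assumed quiescent at the current epoch."""
        self.thread_epochs[tid] = self.global_epoch

    def retire(self, obj):
        """Writer has unlinked obj; readers may still hold references to it."""
        self.global_epoch += 1
        self.retired.append((self.global_epoch, obj))

    def quiescent(self, tid):
        """Thread tid declares it currently holds no references."""
        self.thread_epochs[tid] = self.global_epoch
        self._reclaim()

    def _reclaim(self):
        """Free objects retired at or before the oldest epoch any thread has seen."""
        safe = min(self.thread_epochs.values(), default=self.global_epoch)
        pending = []
        for epoch, obj in self.retired:
            if epoch <= safe:
                self.freed.append(obj)
            else:
                pending.append((epoch, obj))
        self.retired = pending

domain = QsbrDomain()
domain.register("reader-a")
domain.register("reader-b")
domain.retire("node1")
domain.quiescent("reader-a")       # reader-b has not checked in yet
freed_after_one = list(domain.freed)
domain.quiescent("reader-b")       # now every thread has passed a quiescent state
freed_after_all = list(domain.freed)
```

The trade-off is that readers pay almost nothing on the fast path, but a thread that never reports a quiescent state delays reclamation for everyone.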

Monday, June 13, 2016

[Links of the day] 13/06/2016 : Geometric Algorithms, Logical Clocks are easy & distributed system testing

Geometric Algorithm : Princeton algorithms and data structures lecture. Provides an excellent overview of the geometric algorithms out there and why they are important.
Why Logical Clocks are Easy : well, not really, but still. ACM Queue article on the ever-recurring issue of time in distributed systems.
Distributed Systems Testing : when you cannot find a good article on a subject, ask on Twitter. The result is a good list of distributed system testing papers.
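For readers new to the logical clocks mentioned above, a vector clock is one common way to track causality without trusting wall-clock time. The sketch below is a generic textbook version using plain dictionaries; the function names are illustrative and it is not taken from the linked article.

```python
def tick(clock, node):
    """Return a copy of clock with node's counter incremented (a local event)."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def merge(a, b):
    """Component-wise maximum, applied when a message is received."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def happened_before(a, b):
    """True if a causally precedes b: a <= b everywhere and a < b somewhere."""
    nodes = a.keys() | b.keys()
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    strictly_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and strictly_less

def concurrent(a, b):
    """True if neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

a1 = tick({}, "A")               # event on node A
b1 = tick({}, "B")               # independent event on node B
b2 = tick(merge(b1, a1), "B")    # B receives A's message, then records an event
```

Here `a1` and `b1` are concurrent, while `a1` happened before `b2` because B merged A's clock before ticking.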

Tuesday, May 17, 2016

[Links of the day] 17/05/2016 : CMU DB lectures, Seminal AI papers, Storage noisy neighbors

Database Systems Lectures : Carnegie Mellon University lectures on database systems. They give a really good overview of the state of the art of database systems.
Intelligence without representation & Intelligence Without Reason : seminal 1991 papers by Rodney A. Brooks from the MIT artificial intelligence lab. In these, the author argues that intelligent behavior can be generated without explicit manipulable internal representations, and also without explicit reasoning systems.
Noisy Neighbor analysis : a look at the effect of deploying heavy workloads onto modern storage systems and the collateral effect on overall performance for all the participants in the cluster.

Monday, May 16, 2016

[Links of the day] 16/05/2016 : HPC fabric routing, Software Architecture, 1977 cloud paper

Transitively Deadlock-Free Routing Algorithms : interesting routing solution for the BULL (now Atos) BXI fabric for HPC systems. It is a 4-level rearrangeable non-blocking fat-tree which supports up to 64 800 nodes, 11 160 switches and 194 400 inter-switch links. The problem: this represents 50GB of routing tables, and errors occur (often). Obviously, recomputing the routing tables for each fault is not an option. The authors propose a process using offline/online recomputation with a non-blocking routing table update. Interestingly enough, the proposed solution looks a lot like the Linux kernel's online patch update system. [slides]
Architect's Clue Bucket : big slide deck by Ruth Malan looking at software architecture and how to use clues to deliver great products. The author looks at the types of clue (design principles, heuristics, tips, hints...), how to organise them (mapping the clue landscape) and, finally, where and how to look for clues.
1977 Cloud : insightful paper describing what would be today's modern cloud solution. Sometimes I think that the 70s were caught in a time warp caused by hardware lag. Software tech advanced way faster than hardware tech, and we just spent the past 30-40 years waiting for it to catch up. Sadly, we forgot (and reinvented the wheel many times) while waiting.

Friday, May 13, 2016

[Links of the day] 13/05/2016 : NVMesh , NVM file system

nvmesh : pure software product using a shared-nothing architecture that leverages NVMe SSDs, SR-IOV and RDMA. The performance is interesting: 4M read and 2.8M write 4k IOPS, 16GB/s throughput and super low latency, with 90µs/25µs for read and write from client to server. What is really interesting is the dual mode of operation: shared nothing with direct storage access for really fast access, or a centralized mode which offers more redundancy and serviceability features at the cost of lower (but still fast) performance. [video]
Fine-grained Metadata Journaling on NVM : the authors propose to move away from the limitations of block-based journaling to a fine-grained approach more suitable for NVM storage. They propose an inode-based transaction and journaling approach, each inode representing 256 bytes. The solution seems cache friendly, however it raises the question: why do we need to go through the CPU? With DAX and similar systems it should be more efficient to bypass it completely. [slides]
Fast and Failure-Consistent Updates of Application Data in Non-Volatile Main Memory File System : being crash consistent is the number 1 requirement for any storage solution. Current file systems optimized for NVM don't seem to be good enough. The authors propose an alternative file system specifically tailored for consistency and high performance by moving away from FS-level consistency and targeting application-level consistency. Naturally, this puts a greater burden on the application layer. Then again, researchers really need to move away from the classical FS solution and deliver a new paradigm. [slides]
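As a quick sanity check on the nvmesh figures quoted above, the read IOPS and the throughput number line up if "4k" is taken to mean 4096-byte operations. That block size is an assumption (the source does not spell it out), and the helper name is made up for this sketch.

```python
def throughput_gb_per_s(iops, block_bytes=4096):
    """Throughput in decimal GB/s for a given IOPS rate and block size."""
    return iops * block_bytes / 1e9

# Assumes 4k means 4096 bytes per operation.
read_gbps = throughput_gb_per_s(4_000_000)    # 4M read IOPS
write_gbps = throughput_gb_per_s(2_800_000)   # 2.8M write IOPS
```

Under that assumption the 4M read IOPS work out to roughly 16.4 GB/s, which is consistent with the 16GB/s quoted, so the throughput figure is most likely the read-side number.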

Monday, February 01, 2016

[Links of the day] 01/02/2016: Linux internals, best paper and Spotify goes SDN

Best Papers : Best Paper Awards in Computer Science (since 1996)
Linux Internals : very good ebooks on the internals of the Linux kernel, from boot to memory.
SDN Internet Router [part2] : a very impressive demonstration of the implications of SDN technology for companies. It allowed Spotify to replace routers that would have cost $0.5M each with a couple of SDN switches.

Thursday, January 21, 2016

[Links of the day] 21/01/2016 : best effort distributed K/V store, Robotics SLAM, ICCV15

OneCache : a best-effort, replicated KV store accessible via the memcached protocol
Real-Time SLAM : Deep learning is just one small part of the solution for enabling SLAM
ICCV 2015: Twenty one hottest research papers using Deep Learning tools applied to creative tasks

Monday, December 21, 2015

Links of the day 21/12/2015 : super malloc, ISMM15 , Distributed system reading list

SuperMalloc : implementation of malloc(3) originally designed for X86 Hardware Transactional Memory (HTM). It turns out that the same design decisions also make it fast even without HTM. [github]
ISMM15 : ACM SIGPLAN International Symposium on Memory Management
Distributed Systems Seminar's reading list : list of papers used by the excellent Murat Demirbas for his distributed systems seminar. A must-read for anybody in the field.

Monday, September 07, 2015

Links of the day 07/09/2015 : #bitcoin, computer science paper, and micro datacenter

Papers We Love : a lot of good talks, mainly introduction level:

Bitcoin : overview of the bitcoin Peer-to-Peer Electronic Cash System for those living under a rock for the past 2 years.
Propositions as Types : Michael Bernstein talks about Philip Wadler’s paper Propositions as Types, which starts out with the following sentence: "Powerful insights arise from linking two fields of study previously thought separate." And just keeps on going from there. In less than 9 full pages, Wadler assembles an exuberant, hilarious take on the deep, meaningful connections between mathematics, philosophy, and computer science.

Micro Datacenter : Microsoft makes the argument that, for the future of mobile computing, edge computation and the dotting of the physical landscape with small datacenters will be the only way to solve the latency issue. I agree to a certain extent; however, as usual, for that to happen many other technologies need to bridge some chasm in terms of maturity. Surprisingly, AOL experimented with that idea ~3 years ago.

Monday, December 22, 2014

Links of the day 22 - 12 - 2014

Today's links 19/12/2014: #azure postmortem, #machinelearning and programmer resources list, #robot knowledge engine

Root Cause Analysis : Nov 18 Azure Storage Service Interruption detailed postmortem.
10 Technical Papers Every Programmer Should Read (At Least Twice) : some good papers, basically the minimum list that any CS student should have read by the end of their university cycle.
200 machine learning and data science resources : looks like everybody loves to make lists when the year comes to an end.
Robot brain : Knowledge Engine For Robots

Thursday, August 24, 2006

Scientific publications , self advertisement and academic decay

I decided to post my publications on my blog for a couple of reasons:
it's hard to find the information you need in research, even with tools like CiteSeer or Google Scholar, so making publications easily accessible and, most of all, findable is not easy. Also, it's always good for the ego to see one's publications referenced somewhere.

But when I think about it, the current academic system is kind of perverted: teachers, professors and researchers get funded, or get a pay rise or promotion, based on their amount of publications. YES, amount, not quality. That leads to an enormous number of papers with a different title and layout and a few paragraphs changed, published for the sake of publishing, which inevitably buries interesting papers under a mountain of cloned papers.

Why not create an online repository, maybe using some ideas from Wikipedia, where people submit their papers online and anybody can review them and give an opinion? It would come with a rating system that conferences would use to accept papers (I was thinking of a Digg-like system). Such a system is not that easy to implement: we need to avoid fraud, rating manipulation, etc.
But it would at least deter duplicates and other rehashing of the same information over and over.

So what do we need for such a system :

A Wikipedia-style way of creating topics, subtopics and general descriptions
A rating system for papers
A comment system for papers
A way for conferences to use the system to review submitted papers and publish them
An efficient search function
Multilingual support
An obligation to provide the tools and protocols needed to verify the claims (how many publications claim something, and when you ask for a piece of their code they reply that the research is secret or some other *^£!^&* excuse), as well as the datasets used
A benchmark library and data samples for experiments

So far that's all. I will try to complete the list and come up with a decent description of the overall system later.
