Today's links 21/01/2015: the cost of #bigdata scalability, and what to do when you are stuck in the middle, too big for small data and too small for big data: #mediumdata
  • Scalability! But at what COST? : Big data systems may scale well, but often only because they introduce a lot of overhead. Rather than making your computation go faster, the systems introduce substantial overheads which can require large compute clusters just to bring under control. In many cases, you’d be better off running the same computation on your laptop. This is a well-known side effect in HPC: for certain types of problems, high parallelism creates more overhead than it saves in processing time. The only real need for such a cluster arises when you cannot buy a single machine able to hold the necessary amount of data. I guess people often prefer to buy a lot of small, cheap servers and say "look, I'm running a cluster" rather than one single beefed-up machine that would solve the problem faster.
  • Medium Data : when your big data is too small to warrant a cluster but too big to fit within a single machine. I think the authors of this article and of the one above should collaborate on a follow-up. Also, one size fits all does not exist. News at ten.

