Reflections Of The Void: amd

Sunday, November 22, 2020

HPC ecosystem - SC20

This article is a quick summary of SC20 trends and the current state of the HPC ecosystem from a tech and market perspective.

Technology-wise there are three main competing HPC architectures:

Commodity (e.g. Intel)
Commodity + accelerator (e.g. GPUs)
Lightweight cores (e.g. IBM BG, Xeon Phi, TaihuLight, ARM)

Commodity systems represent the bulk of the systems out there. However, commodity + accelerator systems are ramping up their presence aggressively. Nvidia dominates this market segment with 142 systems out of 149, with Intel scooping 4 with its Phi solution. Lightweight-core systems are a minority with only four systems. But with the new A64FX and a renewed appetite for custom chips, this might change rapidly.
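To put the accelerator numbers in proportion, here is a minimal sketch that turns the counts quoted above into shares. The variable names are illustrative only, and the "other" bucket is simply whatever is left of the 149 systems after Nvidia and Intel Phi; the text does not say what those systems use.

```python
# Counts quoted in the paragraph above (accelerated systems only).
accelerated_total = 149
nvidia_systems = 142
intel_phi_systems = 4


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


nvidia_share = share(nvidia_systems, accelerated_total)
intel_phi_share = share(intel_phi_systems, accelerated_total)
# Remainder not attributed to either vendor in the text.
other_systems = accelerated_total - nvidia_systems - intel_phi_systems

print(f"Nvidia:    {nvidia_share:.1f}% of accelerated systems")
print(f"Intel Phi: {intel_phi_share:.1f}% of accelerated systems")
print(f"Other:     {other_systems} systems")
```

In other words, roughly 19 out of every 20 accelerated systems in that count carry Nvidia hardware.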

Intel is still dominating the ecosystem with 92% of the share, followed by AMD with 4%. However, this might change rapidly with AMD EHP technology ramping up. Another aspect is that AMD technology tends to be more open-source friendly, which can make it more attractive in the long term. Not to mention that their GPUs are also starting to become highly competitive in the AI space.

In terms of market size, the HPC market was $39.0 billion in 2019, up 8.2% from $36.1 billion in 2018. Predictions show growth to $55.0 billion in 2024. Most of the growth was led by government spending, after six years of growth led by industry. The number of systems in industry vs public is now roughly equally divided, with ~50% each.
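As a rough sanity check on those figures, the sketch below computes the year-over-year growth and the compound annual growth rate (CAGR) implied by the 2024 forecast. It only uses the rounded dollar amounts quoted above, so the 2018 to 2019 result comes out near 8.0% rather than the quoted 8.2%; my guess is that the source worked from unrounded numbers. The function and variable names are illustrative.

```python
# Rounded market sizes quoted above, in billions of USD.
market_2018 = 36.1
market_2019 = 39.0
forecast_2024 = 55.0


def growth_rate(start: float, end: float, years: int = 1) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


yoy_2019 = growth_rate(market_2018, market_2019)
cagr_2019_2024 = growth_rate(market_2019, forecast_2024, years=5)

print(f"2018 -> 2019 growth:       {yoy_2019:.1%}")
print(f"2019 -> 2024 implied CAGR: {cagr_2019_2024:.1%}")
```

The implied CAGR of about 7% per year means the forecast assumes growth slightly slower than the 2019 pace.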

One notable change is the double-digit growth of the cloud HPC market. Cloud grew 17.8% to $1.4 billion; however, this might only be the tip of the iceberg, as many companies might be using HPC-like systems in the cloud without labelling them as HPC. Cloud solutions are heavily displacing the low-end HPC segment. Entry-level and mid-range server classes have the slowest growth in years, as consumers prefer to buy HPC-as-a-service solutions and reduce their CAPEX.

AI is still heavily influencing the HPC infrastructure market, as it represents a considerable opportunity for HPC solution vendors. Hyperscale AI infrastructure by itself is about $8 billion. It seems that, for the moment, the futures of AI and HPC are closely intertwined.

Sources: Intersect360 research - Pre-SC20 Market Update & Jack Dongarra - An overview of HPC

Sunday, November 01, 2020

ARM ecosystem disintegration and the rise of RISC-V

The #ARM acquisition by #Nvidia is making people uneasy.

And the early signs of the unravelling of the #ARM ecosystem are starting to appear: the ThunderX3 general-purpose ARM CPU has been cancelled.

One would ask why a vendor should spend $$ to build a better product and grow its customer base if, to do so, it has to use Nvidia's IP and compete directly against the IP owner.
Combine this with how hard it is to build a viable general-purpose #ARM alternative to #Intel / #AMD, given that #ARM vendors are effectively competing on cost with much lower volumes, and we start to understand why Marvell decided to shift toward the much trendier IPU/DPU/SmartNIC market.

On the other hand, I think we will see an acceleration of RISC-V adoption, eating away at the traditional #ARM market share. This will be driven by the large-scale edge deployment of #riscv chips that combine a RISC-V core with an #NPU (neural processing unit). These chips can be churned out incredibly cheaply, for less than $10, and they will become ubiquitous very rapidly.

It might take 10-15 years, but ultimately this will seal the fate of the ARM franchise.

Tuesday, July 11, 2017

[Links of the Day] 11/07/2017 : Chip Hall of Fame, AMD Software optimization guide, Pocket negotiator

Pocket Negotiator : a really cool piece of negotiation software that helps people prepare for a negotiation or supports them during the actual negotiation process.
Software Optimization Guide for AMD Family 17h Processors : AMD's optimisation guide for its latest processor family.
Chip Hall of Fame : The stories of the greatest and most influential microchips in history—and the people who built them

Wednesday, April 19, 2017

[Links of the Day] 19/04/2017 : AMD ROCm GPU open platform, Weak Memory Models concurrency report, SSH server for distributed infrastructure

ROCm : this slide deck gives an overview of the AMD ROCm open platform for GPU computing. They are really pushing to become the open-source standard for the GPU industry, battling against NVIDIA's supremacy in the domain. It looks like they are making really good progress, and I would be curious to see how this progresses when combined with their Ryzen CPUs.
Concurrency with Weak Memory Models : this is a really good report on the state of memory models in hardware and software. It provides a wide-spectrum overview of hardware and software concurrency models and approaches, as well as the future direction of the domain.
Teleport 2 : a modern SSH server designed for teams managing distributed infrastructure. [github]

Wednesday, November 23, 2016

[Links of the Day] 23/11/2016 : AMD Exascale vision, Hardware Resiliency myths and truths, MIT EmTech

Resiliency for Reliability – Myths and Truths : this slide deck provides an overview of the resiliency issue and how Intel tackles it for hardware faults, from fans down to soft errors (e.g. neutron beams ... yes, this can £%£ your system). The authors present the two types of approach: reactive and proactive handling of errors.
AMD's Exascale computing vision : It's all about 3D-stacked chips with future interconnects. The interesting bit is the ROCm platform with P2P multi-GPU and P2P with RDMA. Slowly we are removing the need to have a full server to deploy GPUs, one step closer to a fully modular system with each resource pooled and optimized in its own enclosure. It's a lot easier to design power supplies, cooling systems, etc. when you do not have to deal with heterogeneous hardware with different power and cooling profiles (CPU, memory, disk, etc. in the same enclosure).
MIT EmTech 16 : This year's MIT EmTech is all about AI & machine learning ... reaching maximum hype in the domain.