Quoting Forbes here:
"With 400,000 programmable processor cores, 18 GB of memory, and an on-chip fabric capable of 25 petabits, the WSE comprises 1.2 trillion transistors in 46,225 mm2 of silicon real estate (for contrast, it is 56x larger than the largest GPU for AI, which is 815mm2)"
"On top of these engineering innovations, the company develop new programmable Sparse Linear Algebra Cores (SLAC) optimized for AI processing. The SLAC skips any function that multiplies by zero, which can significantly speed the multiplication of matrices in the deep learning process while reducing power. The company also reduced the memory stack by eliminating cache and putting large amounts of high-speed memory (18 GB of SRAM) close to the processing cores. All this is connected by what the company calls the Swarm communication fabric, a 2D mesh fabric with 25 petabits of bandwidth that is designed to fit between the processor cores and tiles, including what would normally be die cut area on the wafer."
"Because of its design, the Cerebras WSE platform has advantages in latency, bandwidth, processing efficiency, and size. According to Cerebras, the WSE is 56.7 times larger than the largest GPU, has 3,000 times more on-die memory, has 10,000 times more memory bandwidth, and fits into 1/50th of the space of a traditional data center configuration with thousands of server nodes. The company has not discussed the availability of the platform or estimated cost."
https://www.cerebras.net/
Much more info on the site. Of course, I'm not affiliated in any capacity with Cerebras, Forbes, or any other company mentioned in the article.