Description

Large Language Models are redefining what’s possible in AI, but distributed training across thousands of traditional accelerators is massively complex and has always suffered diminishing returns as more compute is added. Always? Not anymore. In this talk, Natalia Vassilieva of Cerebras Systems will present a cluster of 16 Cerebras CS-2 nodes that achieves near-perfect linear scaling across more cores than the world’s most powerful supercomputer. Better still, the programming model is radically simple: the code for 16 nodes is exactly the same as the code for a single node. A new era of easy access to extreme-scale AI has begun.