State-of-the-art deep learning, foundation models, and AI for science are advancing quickly, defined by ever-larger models and datasets and by highly detailed sparse models. GPU-based infrastructure struggles to manage these workloads effectively without complex, costly overhead and compromises in performance and accuracy.
DataScale’s dataflow architecture, combined with large on-chip and system memory, enables organizations to run workloads that GPUs cannot practically handle, including:
Training and running large models consumes massive processing power across a long succession of steps; however, that processing power is often radically underutilized because time is spent moving data to and from memory in preparation for the next step.
SambaNova’s unique processors, called Reconfigurable Dataflow Units (RDUs), fuse processing steps together so that data stays on the RDU and avoids excess memory access. This increases performance and reduces the cost and power requirements of neural networks.
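The benefit of fusing processing steps can be sketched in plain Python (a conceptual illustration only, not SambaNova's API or hardware behavior): an unfused pipeline materializes an intermediate buffer after every step, while a fused version combines the steps into one pass so each value stays in a local variable rather than round-tripping through memory.

```python
# Conceptual sketch of operator fusion (illustrative; not SambaNova's API).

# Unfused: each step writes a full intermediate buffer, standing in for
# the memory traffic between processing steps on conventional hardware.
def unfused(xs):
    scaled = [x * 2.0 for x in xs]          # intermediate buffer 1
    shifted = [s + 1.0 for s in scaled]     # intermediate buffer 2
    return [max(v, 0.0) for v in shifted]   # final result (a ReLU-like step)

# Fused: the three steps run as one pass, so each value stays "on chip"
# (in a local variable) instead of being written out and read back.
def fused(xs):
    return [max(x * 2.0 + 1.0, 0.0) for x in xs]

print(fused([-1.0, 0.5]))  # same result as unfused, without the buffers
```

Both functions compute the same result; the fused form simply eliminates the intermediate storage, which is the effect RDUs achieve at the hardware level.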
Training and deploying deep learning and foundation models requires massive parallelization across hundreds, or even thousands, of GPUs. This process is costly and complex, and it can degrade accuracy.
DataScale’s large on-chip and system memory enables it to handle the largest models and data. The SambaFlow software manages scaling across any number of devices or configurations. Seamlessly scale up to 48 racks of DataScale systems with consistent rack-to-rack bandwidth and latency.
This enables organizations to train and deploy the largest models without the cost, complexity, and overhead that GPUs require.
A complete software stack for SambaNova DataScale®, SambaFlow™ fully integrates with popular standard frameworks such as PyTorch. SambaFlow provides an open, flexible, and easy-to-use development interface.
SambaFlow automatically extracts, optimizes, and executes the optimal dataflow graph from your models. This enables you to achieve out-of-the-box performance, accuracy, scale, and ease of use. With SambaFlow, you can maximize productivity by focusing your development efforts within the framework, without ever again worrying about low-level tuning.
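To make the idea of a dataflow graph concrete, here is a toy tracer in plain Python (a hypothetical sketch; it does not reflect SambaFlow's actual interface or compiler): arithmetic on symbolic nodes records the computation as a graph, which a compiler can then optimize and map onto hardware.

```python
# Toy dataflow-graph tracer (illustrative only; not SambaFlow's API).
class Node:
    """A symbolic value; arithmetic on Nodes records the dataflow graph."""
    def __init__(self, op, inputs=()):
        self.op, self.inputs = op, inputs

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))

def ops_of(node):
    """Walk the recorded graph and list operations in dependency order."""
    out = []
    for inp in node.inputs:
        out += ops_of(inp)
    out.append(node.op)
    return out

# "Running" the model builds the graph instead of computing values.
x = Node("input")
w = Node("weight")
y = x * w + x  # a tiny model: multiply by a weight, add a residual

print(ops_of(y))  # ['input', 'weight', 'mul', 'input', 'add']
```

A real dataflow compiler works on the same principle at far greater scale: once the whole computation is captured as a graph, steps can be fused, scheduled, and placed before any data moves.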
The state of the art in AI is constantly being redefined, introducing changing requirements as new breakthroughs emerge. Legacy systems such as GPUs require data science and engineering teams to use proprietary coding languages to optimize these new models. On top of that, adequately training and deploying even the same model requires different forms of scalability. Managing a mixed portfolio of hardware to cover the full usage spectrum can be burdensome, let alone managing the frequent need to add or switch to new infrastructure to keep up with new requirements.
Now organizations can achieve ROI faster, substantially reduce risk, and scale more cost-effectively than is possible with any other AI infrastructure offering.