Microsoft’s updated DeepSpeed can train trillion-parameter AI models with fewer GPUs

Microsoft today released an updated version of its DeepSpeed library that introduces a new approach to training AI models containing trillions of parameters, the variables internal to the model that inform its predictions. The company claims the technique, dubbed 3D parallelism, adapts to the varying needs of workload requirements to power extremely large models while balancing scaling efficiency.

Read More