Make writing sparse code easy.
Sparsity Use Case
Sparse tensors arise in problems in science, engineering, machine learning, and data analytics. Programs that operate on such tensors can exploit sparsity to reduce both storage requirements and computation time. In recent years, the rapid growth of large deep neural networks has made more efficient approaches to sparse matrix computation essential. The MLIR Sparsifier is an initiative to extend Google's compiler stack for sparse deep learning workloads across various frameworks (JAX, PyTorch) and targets (mobile/server CPU, GPU, and TPU).
Sparsity as a Property
The MLIR Sparsifier treats sparsity as a property of tensors, not a tedious user implementation task. Programmers only need to annotate sparse tensor types; the compiler then generates sparse code automatically from a sparsity-agnostic definition of the computation. With sparse tensor types as first-class citizens, any operation can be made sparse simply by annotating the tensor types of its operands. Compiler transformations lower the operation to imperative constructs and sparse storage formats that store and iterate over only the nonzero elements, completely abstracting these complexities away from users.
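As a sketch of what this annotation looks like, the MLIR sparse_tensor dialect lets a programmer attach a CSR encoding to an otherwise ordinary matrix multiplication (the shapes and names below are illustrative, not from the original text):

```mlir
// CSR: rows stored densely, nonzero columns compressed per row.
#CSR = #sparse_tensor.encoding<{
  map = (i, j) -> (i : dense, j : compressed)
}>

// The computation itself is sparsity-agnostic; only the type of %A
// marks it as sparse. The compiler generates the sparse loops.
func.func @matmul(%A: tensor<16x32xf64, #CSR>,
                  %B: tensor<32x8xf64>,
                  %C: tensor<16x8xf64>) -> tensor<16x8xf64> {
  %0 = linalg.matmul
         ins(%A, %B : tensor<16x32xf64, #CSR>, tensor<32x8xf64>)
         outs(%C : tensor<16x8xf64>) -> tensor<16x8xf64>
  return %0 : tensor<16x8xf64>
}
```

From this input alone, the sparsification passes generate storage and loop code that visits only stored nonzeros; no hand-written sparse kernel is involved.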
Google’s MLIR Sparsifier is built on top of MLIR, which provides modern design paradigms and extensibility. The ability to progressively lower dialects closer to the target hardware during compilation, together with an intuitive transformation mechanism, has made MLIR a popular compiler infrastructure for domain-specific languages that need to bridge large semantic gaps, such as compiling for machine learning.
The MLIR Sparsifier not only enables non-expert programmers to generate sparse code quickly but also enables expert programmers to explore the full space of possible sparse implementations. Additionally, the MLIR Sparsifier is not restricted to a specific type of sparsity or hardware. The long-term goal of this initiative is to address different types of sparsity (weight, activation, or semantic) on different devices (CPU, GPU, or TPU).
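To illustrate how experts can explore the space of sparse implementations, the same sparsity-agnostic computation can be retargeted to a different storage format just by swapping the encoding attribute. A few common formats expressed as sparse_tensor level maps (syntax follows the MLIR sparse_tensor dialect; the attribute names are illustrative):

```mlir
// Compressed Sparse Row: dense row level, compressed columns per row.
#CSR = #sparse_tensor.encoding<{
  map = (i, j) -> (i : dense, j : compressed)
}>

// Compressed Sparse Column: the same tensor, traversed column-major.
#CSC = #sparse_tensor.encoding<{
  map = (i, j) -> (j : dense, i : compressed)
}>

// Coordinate format: a compressed list of (i, j) coordinate pairs.
#COO = #sparse_tensor.encoding<{
  map = (i, j) -> (i : compressed(nonunique), j : singleton)
}>
```

Because the computation stays unchanged, comparing these variants is a matter of editing one attribute and recompiling rather than rewriting kernels by hand.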
Long-term Goal
The MLIR Sparsifier team has developed the vision and a pilot implementation of the sparsification technology. The MLIR Sparsifier will be a multi-year initiative that further strengthens Google's leadership in ML and compiler technology. With feedback from our valued customers and partners, including hardware vendors, and with support from the ML framework layer, we aim to identify key workloads and requirements and roll out the sparse ML compiler into production for CPUs, GPUs, and ultimately TPUs.