Today, NVIDIA is announcing the availability of cuTENSOR version 1.3.0. The software is available now as a free download for members of the NVIDIA Developer Program.
What’s New
- Support for up to 40-dimensional tensors
- Support for 64-bit strides
- Support for BFloat16 element-wise operations
- Improved performance for direct tensor contractions
- Bug fixes
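To see why 64-bit strides matter, consider a densely packed tensor in which the first mode is contiguous and each later mode's stride is the product of the preceding extents. The sketch below is plain Python arithmetic, not cuTENSOR API; the helper name `dense_strides`, the packed layout, and the example extents are illustrative assumptions.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit stride could hold


def dense_strides(extents):
    """Element strides for a packed tensor whose first mode is contiguous."""
    strides = []
    step = 1
    for extent in extents:
        strides.append(step)
        step *= extent
    return strides


# Hypothetical 4-mode tensor with 2**33 elements (16 GiB at 2 bytes per element).
extents = [4096, 1024, 1024, 2]
strides = dense_strides(extents)

print(strides)                   # stride of each mode, in elements
print(strides[-1] > INT32_MAX)   # does the last stride overflow 32 bits?
```

Here the last mode has a stride of 2^32 elements, which cannot be represented in a signed 32-bit integer, so a tensor of this shape needs 64-bit strides to be described at all.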
See the cuTENSOR Release Notes for more information.
About cuTENSOR
cuTENSOR is a high-performance CUDA library for tensor primitives; its key features are:
- Extensive mixed-precision support:
  - FP64 inputs with FP32 compute.
  - FP32 inputs with FP16, BF16, or TF32 compute.
  - Complex-times-real operations.
- Conjugate (without transpose) support.
- Support for up to 40-dimensional tensors.
- Arbitrary data layouts.
- Trivially serializable data structures.
- Main computational routines:
  - Element-wise tensor operations:
    - Support for various activation functions.
    - Arbitrary tensor permutations.
    - Conversion between different data types.
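As a rough illustration of what "arbitrary data layouts" and "arbitrary tensor permutations" mean, the following CPU-only Python sketch stores a tensor as a flat list addressed through strides and copies it into a permuted mode order. It is a conceptual model only, not cuTENSOR code: the helpers `packed_strides`, `offset`, and `permute` are hypothetical names, and the packed first-mode-contiguous layout is an assumption made for the example.

```python
from itertools import product


def packed_strides(extents):
    """Element strides for a packed tensor whose first mode is contiguous."""
    strides, step = [], 1
    for extent in extents:
        strides.append(step)
        step *= extent
    return strides


def offset(index, strides):
    """Linear position of a multi-index under the given strides."""
    return sum(i * s for i, s in zip(index, strides))


def permute(data, extents, perm):
    """Copy `data` so that output mode k is input mode perm[k]."""
    in_strides = packed_strides(extents)
    out_extents = [extents[p] for p in perm]
    out_strides = packed_strides(out_extents)
    out = [None] * len(data)
    for out_index in product(*(range(e) for e in out_extents)):
        # Map the output multi-index back to the matching input multi-index.
        in_index = [0] * len(extents)
        for k, p in enumerate(perm):
            in_index[p] = out_index[k]
        out[offset(out_index, out_strides)] = data[offset(in_index, in_strides)]
    return out, out_extents


# A 2x3 tensor A[i, j] = i + 2*j, then B[j, i] = A[i, j].
a = [0, 1, 2, 3, 4, 5]
b, b_extents = permute(a, [2, 3], [1, 0])
print(b, b_extents)  # [0, 2, 4, 1, 3, 5] [3, 2]
```

A GPU library performs this kind of data movement in parallel and may fuse it with type conversion or an element-wise function; the sketch only shows the index mapping involved.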
Learn more:
- GTC 2021: S31754 Recent Developments in NVIDIA Math Libraries
- GTC 2021: S31286 A Deep Dive into the Latest HPC Software
- GTC 2021: CWES1098 Tensor Core-Accelerated Math Libraries for Dense and Sparse Linear Algebra in AI and HPC
- cuTENSOR Product Documentation