Abstract: Structured sparsity has been proposed as an efficient way to prune the complexity of Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. Accelerating ...
MIT researchers have designed silicon structures that can perform calculations in an electronic device using excess heat instead of electricity. These tiny structures could someday enable more ...
We took this version of HeCBench and are modifying it to build the CUDA and OMP codes to gather their roofline performance data. So far we have a large portion of the CUDA and OMP codes building ...
Abstract: Sparse-sparse matrix multiplication (SpGEMM) is a well-studied problem on CPUs, GPUs, accelerators (e.g. FPGAs), and distributed systems. The main computational bottleneck in SpGEMM is the ...
Quantum-inspired adaptive tiling for high-performance matrix multiplication. Uses WKB tunneling physics with the golden ratio to derive optimal tile sizes from real-time CPU state. 15%+ gains on ...