PPoPP 2018
Sat 24 - Wed 28 February 2018 Vösendorf / Wien, Austria
Sat 24 Feb 2018 10:30 - 11:00 at Europa 5 - WPMVP 2018 Session 2

System tracking is an old problem that has been heavily optimized over the years. However, in High Energy Physics, many small systems are tracked in real time using Kalman filtering, and no existing implementation satisfies those constraints. In this paper, we present a code generator used to speed up the Cholesky factorization and the Kalman filter for small matrices. The generator is easy to use and produces portable, heavily optimized code. We focus on current SIMD architectures (SSE, AVX, AVX512, Neon, SVE, Altivec and VSX). Our Cholesky factorization outperforms existing libraries: it is 3x to 10x faster than the MKL. The Kalman filter is also faster than existing implementations and achieves 4e9 iter/s on a 2x24C Intel Xeon.
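
To make the idea of factoring many small matrices at once more concrete, here is a minimal sketch in plain Python. It is not the generator described in the abstract and it is not vectorized; it only illustrates one common way to organize such a batch, assuming a structure-of-arrays layout in which each list holds the same lower-triangle entry for every matrix, so that the loop body applies identical arithmetic to every matrix (the shape a SIMD unit can process several at a time). The function name `cholesky_batch_3x3` and its argument names are hypothetical.

```python
import math


def cholesky_batch_3x3(a00, a10, a11, a20, a21, a22):
    """Factor a batch of symmetric positive-definite 3x3 matrices A = L * L^T.

    Each argument is a list with one entry per matrix (structure of arrays);
    only the lower triangle of A is needed. Returns the six lower-triangle
    entries of L in the same layout. Assumption: every input matrix is
    positive definite, so no square root receives a negative value.
    """
    n = len(a00)
    l00 = [0.0] * n
    l10 = [0.0] * n
    l11 = [0.0] * n
    l20 = [0.0] * n
    l21 = [0.0] * n
    l22 = [0.0] * n

    # The 3x3 factorization is fully unrolled: no loop over rows or columns,
    # only a loop over the batch, with the same operations for every matrix.
    for i in range(n):
        # First column of L.
        l00[i] = math.sqrt(a00[i])
        l10[i] = a10[i] / l00[i]
        l20[i] = a20[i] / l00[i]
        # Second column of L.
        l11[i] = math.sqrt(a11[i] - l10[i] * l10[i])
        l21[i] = (a21[i] - l20[i] * l10[i]) / l11[i]
        # Third column of L.
        l22[i] = math.sqrt(a22[i] - l20[i] * l20[i] - l21[i] * l21[i])

    return l00, l10, l11, l20, l21, l22
```

For example, passing the lower triangle of the matrix with rows (4, 2, 2), (2, 5, 3), (2, 3, 6) yields the factor with rows (2, 0, 0), (1, 2, 0), (1, 1, 2). In a compiled language the loop over `i` is the part that would map onto SIMD lanes; that mapping is an assumption of this sketch rather than a description of the presented code generator.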

CERN SIMD slides (CERN-SIMD.pdf), 1.43 MiB