TTT-Discover optimizes GPU kernels 2x faster than human experts — by training during inference
via arxiv.org
Short excerpt below. Read at the original source.
Researchers from Stanford, Nvidia, and Together AI have developed a new technique that can discover new solutions to very complex problems. For example, they managed to optimize a critical GPU kernel to run 2x faster than the previous state-of-the-art written by human experts. Their technique, called “Test-Time Training to Discover” (TTT-Discover), challenges the current paradigm […]