TTT-Discover makes a GPU kernel run 2x faster than the best human-expert version by training during inference

via arxiv.org

Short excerpt below. Read at the original source.

Researchers from Stanford, Nvidia, and Together AI have developed a technique for discovering new solutions to very complex problems. For example, they optimized a critical GPU kernel to run 2x faster than the previous state of the art written by human experts. Their technique, called “Test-Time Training to Discover” (TTT-Discover), challenges the current paradigm […]
