Google’s Gemma 4 AI models get 3x speed boost by predicting future tokens
via arstechnica.com
Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI. Google’s take on edge AI may already be getting faster with the release of Multi-Token Prediction (MTP) drafters for Gemma. Google says these experimental models use a form of speculative decoding to take a […]
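For readers unfamiliar with the term, speculative decoding in general pairs a small, fast drafter with a larger target model: the drafter guesses several tokens ahead, and the target checks those guesses in a single pass, keeping the ones it agrees with. The toy sketch below illustrates that general idea in its greedy form only. It is not Google's MTP implementation, and every name in it (`target_next`, `draft_next`, `speculative_decode`, the parameter `k`) is a hypothetical stand-in, with simple arithmetic rules in place of real neural networks.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to the next token (greedy).
Model = Callable[[List[int]], int]


def target_next(prefix: List[int]) -> int:
    """Hypothetical large model: an arithmetic rule stands in for a real network."""
    return (prefix[-1] * 3 + len(prefix)) % 11


def draft_next(prefix: List[int]) -> int:
    """Hypothetical small drafter: matches the target except when len(prefix) % 4 == 0."""
    guess = target_next(prefix)
    return (guess + 1) % 11 if len(prefix) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns the tokens and the number of verification passes."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(seq) < goal:
        # 1. The drafter proposes k tokens, each conditioned on its own earlier guesses.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))

        # 2. The target verifies the proposal. A real system scores all k positions
        #    in one batched forward pass; this toy calls the target per position
        #    but counts the whole round as a single pass.
        passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong guess, drop the rest
                break
            accepted.append(tok)
        else:
            # Every guess matched, so the same pass also yields one extra token.
            accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq[:goal], passes


if __name__ == "__main__":
    baseline = greedy_decode(target_next, [1], 12)
    tokens, passes = speculative_decode(target_next, draft_next, [1], 12)
    print("same output as plain greedy decoding:", tokens == baseline)
    print("target verification passes:", passes, "for 12 new tokens")
```

Because every token that is kept equals the target's own greedy choice, the output matches plain greedy decoding with the target alone; any saving comes from needing fewer target passes, and it depends on how often the drafter's guesses are right. The agreement rate in this toy is made up and says nothing about the speedup Gemma's drafters achieve.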