Show HN: RunAnywhere – Faster AI Inference on Apple Silicon
via github.com
Hi HN, we’re Sanchit and Shubham (YC W26). We built a fast inference engine for Apple Silicon. LLMs, speech-to-text, text-to-speech – MetalRT beats llama.cpp, Apple’s MLX, Ollama, and sherpa-onnx on every modality we tested. Custom Metal shaders, no framework overhead. Also, we’ve open-sourced RCLI, the fastest end-to-end voice AI pipeline on Apple Silicon. Mic to […]