Hi HN, we're Sanchit and Shubham (YC W26). We built MetalRT, a fast inference engine for Apple Silicon. Across LLMs, speech-to-text, and text-to-speech, MetalRT beats llama.cpp, Apple's MLX, Ollama, and sherpa-onnx on every modality we tested. It uses custom Metal shaders, with no framework overhead.<p>Also, we've o...
Stories by sanchitmonga22
Demo of a LOCAL browser agent (powered by WebGPU with Liquid LFM & Alibaba Qwen models), running as a Chrome extension, opening the All-In Podcast on YouTube.<p>Source: <a href="https://github.com/RunanywhereAI/on-device-browser-agent" rel="nofollow">https://github.com/...