Stories by cheese
Jan-nano-128k: A 4B Model with a Super-Long Context Window (Still Outperforms 671B [in MCP])
(huggingface.co)
[Reddit post the title was taken from, with more context](https://www.reddit.com/r/LocalLLaMA/comments/1ljyo2p/jannano128k_a_4b_model_with_a_superlong_context/)