Run LLMs locally in your terminal. Supports custom .pt GPT-2 and LLaMA decoder checkpoints (automatically handling compact key naming, tied embeddings, and split-halves RoPE) as well as GGUF models, with a live dashboard showing throughput, VRAM usage, and the chat session.
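
For readers unfamiliar with the term, "split-halves RoPE" refers to the rotary position embedding layout that pairs element i of a head vector with element i + d/2, as opposed to the interleaved layout that pairs adjacent elements. The following is a minimal pure-Python sketch of the two layouts, not this project's actual code; the function names `rope_split_halves` and `rope_interleaved` and the default `base=10000.0` are assumptions chosen for illustration.

```python
import math


def rope_split_halves(x, pos, base=10000.0):
    """Rotate one head vector, pairing x[i] with x[i + d/2].

    Hypothetical helper for illustration; `base` is an assumed default.
    """
    d = len(x)
    if d % 2:
        raise ValueError("head dimension must be even")
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        # Pair i rotates by an angle that shrinks as i grows.
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


def rope_interleaved(x, pos, base=10000.0):
    """Same rotation, but pairing adjacent elements x[2i] and x[2i + 1]."""
    d = len(x)
    if d % 2:
        raise ValueError("head dimension must be even")
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out


if __name__ == "__main__":
    vec = [1.0, 2.0, 3.0, 4.0]
    print(rope_split_halves(vec, pos=3))
    print(rope_interleaved(vec, pos=3))
```

The two layouts apply the same rotations and differ only in which elements form a pair, so a checkpoint trained with one layout generally needs either the matching rotation at inference time or a corresponding permutation of its query and key projection weights.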