Lmonade LLM inference engine: main crate with CLI and re-exports
LLM architectures and serving components for the Lmonade inference engine
HTTP API server with OpenAI-compatible endpoints for the Lmonade LLM inference engine
Actor-based runtime for LLM inference orchestration and resource management