# CLI
The `llmrouter` command is installed via the `llmrouter` package and is registered in `pyproject.toml`.
## Summary
The CLI is the primary interface for training, inference, and chat. It is implemented in:
- [`llmrouter/cli/router_main.py`](https://github.com/ulab-uiuc/LLMRouter/blob/main/llmrouter/cli/router_main.py) (entrypoint)
- [`llmrouter/cli/router_train.py`](https://github.com/ulab-uiuc/LLMRouter/blob/main/llmrouter/cli/router_train.py) (training)
- [`llmrouter/cli/router_inference.py`](https://github.com/ulab-uiuc/LLMRouter/blob/main/llmrouter/cli/router_inference.py) (inference)
**Dependencies:** The `chat` command requires `gradio` to be installed.
## Global options

- `llmrouter --version` prints the CLI version.
- `llmrouter <subcommand> --help` prints flags for a subcommand.
## Environment variables

### API_KEYS (only for real API calls)

`llmrouter infer` (without `--route-only`) calls an OpenAI-compatible endpoint via LiteLLM and requires `API_KEYS`.
It supports either a single key or a JSON list of keys (used round-robin).
**Single key**
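For example (the key value is a placeholder):

```bash
export API_KEYS="sk-your-key"  # single key, placeholder value
```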
**Multiple keys (JSON list)**
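For example (keys are placeholders; they are used round-robin):

```bash
export API_KEYS='["sk-key-1", "sk-key-2"]'  # JSON list of placeholder keys
```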
**Tip:** If you only want routing decisions (no API calls, no API keys), always add `--route-only`.
**`router_r1` is different:** `router_r1` does not use `API_KEYS`. It requires `hparam.api_base` and `hparam.api_key` in its YAML config, and it does not support `--route-only`.
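A minimal sketch of the relevant YAML keys (endpoint and key values are placeholders):

```yaml
hparam:
  api_base: "https://your-endpoint.example.com/v1"  # placeholder endpoint
  api_key: "sk-your-key"                            # placeholder key
```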
### LLMROUTER_PLUGINS (optional)

If you use custom routers, set `LLMROUTER_PLUGINS` to point to plugin directories (colon-separated). See Plugin system.
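For example (the directory paths are placeholders):

```bash
export LLMROUTER_PLUGINS="/opt/llmrouter/plugins:$HOME/my_plugins"
```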
## train
Train a router model.
### Signature
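A sketch assembled from the parameters below (brackets mark optional flags):

```
llmrouter train --router <name> --config <path> [--device cuda|cpu|auto] [--quiet]
```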
### Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| `--router` | str | Router method name | required |
| `--config` | path | YAML config file | required |
| `--device` | str | Training device (`cuda`, `cpu`, or `auto`) | `auto` |
| `--quiet` | flag | Suppress verbose output | false |
### Example
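Using the same `knnrouter` config path as the `infer` example below (path illustrative):

```bash
llmrouter train --router knnrouter --config configs/model_config_test/knnrouter.yaml
```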
## infer
Run inference or routing.
### Signature

```
llmrouter infer --router <name> --config <path> (--query <text> | --input <file>) [--load_model_path <path>] [--route-only] [--output <file>] [--output-format json|jsonl] [--temp <float>] [--max-tokens <int>] [--verbose]
```
**Input/output schemas:** For `--input` formats and output fields, see Data formats.
**API endpoint resolution:** Inference expects an endpoint from either:

- `llm_data[model_name].api_endpoint` (per model), or
- `api_endpoint` in the YAML config

See Config reference.
### Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| `--router` | str | Router method name | required |
| `--config` | path | YAML config file | required |
| `--query` | str | Single query string | required (mutually exclusive with `--input`) |
| `--input` | path | Input file (`.txt`, `.json`, `.jsonl`) | required (mutually exclusive with `--query`) |
| `--load_model_path` | path | Override `model_path.load_model_path` | none |
| `--route-only` | flag | Skip API call and only return routing | false |
| `--output` | path | Output file for batch inference | none |
| `--output-format` | str | Output format (`json` or `jsonl`) | `json` |
| `--temp` | float | Generation temperature | 0.8 |
| `--max-tokens` | int | Max tokens for generation | 1024 |
| `--verbose` | flag | Print verbose output | false |
### Example

```bash
llmrouter infer --router knnrouter --config configs/model_config_test/knnrouter.yaml --query "What is AI?"
```
### Batch example (JSONL)

```bash
llmrouter infer --router knnrouter --config configs/model_config_test/knnrouter.yaml --input queries.jsonl --output results.jsonl --output-format jsonl --route-only
```
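A minimal sketch of `queries.jsonl`, one record per line (the `query` field name is an assumption here; see Data formats for the authoritative schema):

```json
{"query": "What is AI?"}
{"query": "Explain retrieval-augmented generation briefly."}
```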
## chat
Launch the interactive chat UI.
### Signature

```
llmrouter chat --router <name> --config <path> [--load_model_path <path>] [--temp <float>] [--host <host>] [--port <port>] [--mode full_context|current_only|retrieval] [--top_k <int>] [--share]
```
### Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| `--router` | str | Router method name | required |
| `--config` | path | YAML config file | required |
| `--load_model_path` | path | Override `model_path.load_model_path` | none |
| `--temp` | float | Generation temperature | 0.8 |
| `--host` | str | Host to bind the server | all interfaces |
| `--port` | int | Server port | 8001 |
| `--mode` | str | Query mode (`full_context`, `current_only`, `retrieval`) | `current_only` |
| `--top_k` | int | Retrieval count when `mode=retrieval` | 3 |
| `--share` | flag | Create a public share link | false |
### Example
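Using the same illustrative `knnrouter` config as above:

```bash
llmrouter chat --router knnrouter --config configs/model_config_test/knnrouter.yaml --port 8001
```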
## list-routers
List all available routers (including discovered plugins).
### Signature
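Invocation (no flags beyond the global options are documented here):

```bash
llmrouter list-routers
```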
### Notes
- The output lists routers available for inference and routers available for training as two separate lists.
- If a router is missing, check optional dependencies and plugin discovery paths.
## version
Show CLI version information.
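Invocation:

```bash
llmrouter version
```

See also the global `--version` flag under Global options.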