Parameter | Description | Default | Environment Variable |
---|---|---|---|
model | Model name running on vLLM server | "Qwen/Qwen2.5-32B-Instruct" | - |
vllm_base_url | vLLM server URL | "http://localhost:8000/v1" | VLLM_BASE_URL |
api_key | API key (dummy for local) | "vllm-api-key" | VLLM_API_KEY |
temperature | Sampling temperature | 0.1 | - |
max_tokens | Maximum tokens to generate | 2000 | - |
max_model_len
The parameters for the `vllm` config are present in the Master List of All Params in Config.
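
The defaults and environment variables in the table above can be sketched as a small config builder. This is only an illustration: `build_vllm_config` is a hypothetical helper, not part of any library, and the precedence it uses (explicit override, then environment variable, then default) is an assumption that the table does not confirm.

```python
import os

# Defaults copied from the parameter table above.
DEFAULTS = {
    "model": "Qwen/Qwen2.5-32B-Instruct",
    "vllm_base_url": "http://localhost:8000/v1",
    "api_key": "vllm-api-key",
    "temperature": 0.1,
    "max_tokens": 2000,
}

# Parameters that the table pairs with an environment variable.
ENV_VARS = {
    "vllm_base_url": "VLLM_BASE_URL",
    "api_key": "VLLM_API_KEY",
}


def build_vllm_config(overrides=None, env=None):
    """Hypothetical helper: merge defaults, environment variables, and overrides.

    Assumed precedence (not confirmed by the source):
    explicit override > environment variable > default.
    """
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    # Apply environment variables only where they are set and non-empty.
    for param, var in ENV_VARS.items():
        if env.get(var):
            config[param] = env[var]
    # Explicit overrides win over both defaults and environment values.
    config.update(overrides or {})
    return config


if __name__ == "__main__":
    print(build_vllm_config({"temperature": 0.0}))
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment; the resulting dictionary would then be handed to whatever client consumes the `vllm` config.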