llmqt/README.md
Jaroslav Benes a45ced89de Initial commit: llmqt LLM Query Tester
Single-file Python CLI to batch-test multiple LLM models with predefined
queries. Supports YAML/JSON config, reasoning detection (<think> tags and
reasoning_content field), per-query token/speed stats, and graceful API
error handling. Install with `pip install -e .` to get the `llmqt` command.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 12:25:34 +02:00


# llmqt — LLM Query Tester
Batch-test multiple LLM models against a set of queries. Results are saved as Markdown files, one per model for each config file, each containing per-query stats and a summary table.
## Install
```bash
pip install -e .
```
This installs the `llmqt` command into your PATH.
## Setup
Export your API credentials:
```bash
export OPENAI_API_KEY=your_key_here
export OPENAI_API_BASE=https://your-endpoint/v1 # optional, for custom/local endpoints
```
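For scripts that wrap `llmqt`, the two variables could be read as in the sketch below. The helper name `read_credentials` and the fallback to the public OpenAI endpoint are assumptions for illustration, not necessarily how `llmqt` resolves them internally.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumed fallback: the public OpenAI endpoint. llmqt's real default may differ.
DEFAULT_API_BASE = "https://api.openai.com/v1"


def read_credentials(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (api_key, api_base) from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # OPENAI_API_BASE is optional; strip a trailing slash for consistent joining.
    api_base = env.get("OPENAI_API_BASE", DEFAULT_API_BASE).rstrip("/")
    return api_key, api_base
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.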
## Usage
```bash
llmqt <system_prompt.md> <config1.yaml> [config2.yaml ...]
```
Examples:
```bash
llmqt prompt.md test1.yaml
llmqt prompt.md test1.yaml test2.yaml test3.json
```
Outputs are written to `./<config_stem>/<model_name>.md` in the current working directory.
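The path rule above can be sketched in a few lines of Python. `output_path` is a hypothetical helper, not a function exported by `llmqt`, and it assumes the config stem is the file name without its directory or extension.

```python
from pathlib import Path


def output_path(config_file: str, model_name: str, cwd: Path = Path(".")) -> Path:
    """Build <cwd>/<config_stem>/<model_name>.md (hypothetical helper)."""
    stem = Path(config_file).stem  # "configs/test1.yaml" -> "test1"
    return cwd / stem / f"{model_name}.md"


print(output_path("test1.yaml", "gpt-4o-mini").as_posix())  # test1/gpt-4o-mini.md
```

The sketch does not sanitize model names; how `llmqt` treats names containing `/` is not covered here.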
## Config file format
YAML (`.yaml` / `.yml`) and JSON (`.json`) are both supported.
```yaml
models:
  - gpt-4o-mini
  - gpt-4o
queries:
  - "What is the capital of France?"
  - "Explain TCP vs UDP."
  - "Write a Python prime-checker function."
```
See [example_test.yaml](example_test.yaml) and [example_system_prompt.md](example_system_prompt.md).
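A minimal sketch of loading either format by file extension follows. It assumes PyYAML (`yaml.safe_load`) for YAML files and that `models` and `queries` are the only required keys; `load_config` is a hypothetical name, and the real loader may validate differently.

```python
import json
from pathlib import Path


def load_config(path: str) -> dict:
    """Load a test config, choosing the parser by file extension."""
    p = Path(path)
    text = p.read_text(encoding="utf-8")
    suffix = p.suffix.lower()
    if suffix in (".yaml", ".yml"):
        import yaml  # PyYAML; imported lazily so JSON-only use needs no extra package
        data = yaml.safe_load(text)
    elif suffix == ".json":
        data = json.loads(text)
    else:
        raise ValueError(f"unsupported config extension: {suffix!r}")
    # Check the two keys shown in the example config.
    for key in ("models", "queries"):
        if not isinstance(data, dict) or not isinstance(data.get(key), list):
            raise ValueError(f"{path}: '{key}' must be a list")
    return data
```

A JSON config carries the same two keys, for example `{"models": ["gpt-4o"], "queries": ["Explain TCP vs UDP."]}`.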
## Output format
For `llmqt prompt.md test1.yaml` with models `gpt-4o-mini` and `gpt-4o`:
```
test1/
  gpt-4o-mini.md
  gpt-4o.md
```
Each file contains:
- A **statistics table** (elapsed time, prompt/completion tokens, and tok/s per query, plus totals)
- For each query: the query text, per-query stats, an optional **Reasoning** section (if the model returns chain-of-thought), and the **Response**
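The per-query numbers and the totals row could be computed as in the sketch below. It assumes tok/s means completion tokens divided by wall-clock elapsed time, which this README does not spell out; `QueryStats` and `totals` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryStats:
    elapsed_s: float
    prompt_tokens: int
    completion_tokens: int

    @property
    def tok_per_s(self) -> float:
        # Assumed definition: generated tokens per second of elapsed time.
        return self.completion_tokens / self.elapsed_s if self.elapsed_s > 0 else 0.0


def totals(rows: List[QueryStats]) -> QueryStats:
    """Sum each column; the total tok/s then follows from the summed values."""
    return QueryStats(
        elapsed_s=sum(r.elapsed_s for r in rows),
        prompt_tokens=sum(r.prompt_tokens for r in rows),
        completion_tokens=sum(r.completion_tokens for r in rows),
    )
```

Deriving the total speed from summed tokens and summed time weights long responses more heavily than averaging the per-query rates would.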
### Reasoning detection
Reasoning content is extracted automatically from:
- The `reasoning_content` field on the message (DeepSeek API style)
- `<think>...</think>` tags in the response content (DeepSeek R1 / QwQ open-source style)
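Both sources can be handled with a short function, sketched below on a plain message dict. The name `split_reasoning`, the choice to check `reasoning_content` first, and the joining of multiple `<think>` blocks are assumptions, not confirmed `llmqt` behavior.

```python
import re
from typing import Optional, Tuple

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(message: dict) -> Tuple[Optional[str], str]:
    """Return (reasoning, response) from a chat-completion message dict."""
    content = message.get("content") or ""
    # 1. Dedicated field (DeepSeek API style); assumed to take precedence.
    reasoning = message.get("reasoning_content")
    if reasoning:
        return reasoning.strip(), content.strip()
    # 2. Inline <think>...</think> tags embedded in the response content.
    blocks = THINK_RE.findall(content)
    if blocks:
        joined = "\n\n".join(block.strip() for block in blocks)
        return joined, THINK_RE.sub("", content).strip()
    # No reasoning found: the whole content is the response.
    return None, content.strip()
```

For a message whose content is `<think>2+2</think>` followed by `4`, this returns `("2+2", "4")`.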
## Execution order
```
for each config file:
    for each model:
        for each query → POST to API, wait for response
        write <config_stem>/<model>.md in CWD
```
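The loop order above can be sketched in Python with the API call and the file writer passed in as callables. `run_all`, `ask`, and `write` are hypothetical names, and the sketch assumes each model's file is written once, after that model's last query.

```python
from typing import Callable, Dict, List


def run_all(
    configs: Dict[str, dict],
    ask: Callable[[str, str], str],
    write: Callable[[str, str, List[str]], None],
) -> None:
    """Run every query for every model of every config, sequentially."""
    for config_stem, config in configs.items():
        for model in config["models"]:
            answers: List[str] = []
            for query in config["queries"]:
                # Blocking call: the next query starts only after this returns.
                answers.append(ask(model, query))
            # One output file per model, written after its last query.
            write(config_stem, model, answers)
```

Because the calls are sequential, total run time grows with the number of configs times models times queries.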