CLI Reference
mcpbr provides a command-line interface for running evaluations and managing configurations.
mcpbr --help
mcpbr run --help
Commands Overview
| Command | Description |
| mcpbr run | Run benchmark evaluation with configured MCP server |
| mcpbr init | Generate an example configuration file |
| mcpbr config | Manage configuration templates |
| mcpbr models | List supported models for evaluation |
| mcpbr providers | List available model providers |
| mcpbr harnesses | List available agent harnesses |
| mcpbr benchmarks | List available benchmarks |
| mcpbr cleanup | Remove orphaned mcpbr Docker resources |
mcpbr run
mcpbr run -c CONFIG [OPTIONS]
| Option | Short | Description |
| --config PATH | -c | Path to YAML configuration file (required) |
| --model TEXT | -m | Override model from config |
| --benchmark TEXT | -b | Override benchmark from config |
| --sample INTEGER | -n | Override sample size from config |
| --mcp-only | -M | Run only MCP evaluation (skip baseline) |
| --baseline-only | -B | Run only baseline evaluation (skip MCP) |
| --output PATH | -o | Path to save JSON results |
| --report PATH | -r | Path to save Markdown report |
| --output-yaml PATH | -y | Path to save YAML results |
| --verbose | -v | Verbose output (-vv for detailed) |
| --task TEXT | -t | Run specific task(s) by instance_id |
| --filter-difficulty | | Filter by difficulty (repeatable) |
| --filter-category | | Filter by category (repeatable) |
Examples
# Full evaluation with verbose output
mcpbr run -c config.yaml -v
# MCP-only with specific tasks
mcpbr run -c config.yaml -M -t django__django-11099
# Override model and sample size
mcpbr run -c config.yaml -m opus -n 50
# Save all output formats
mcpbr run -c config.yaml -o results.json -y results.yaml -r report.md
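The -B and -M options can also be combined across two invocations to keep baseline and MCP results in separate files. The following is a minimal sketch, not part of mcpbr itself: the function name run_split, the output file names, and the MCPBR_BIN override (useful for substituting a stub while testing the script) are all assumptions made for illustration. Only the run subcommand and the -c, -B, -M, and -o options come from the table above.

```shell
# Hypothetical helper: run the baseline and MCP evaluations separately.
# MCPBR_BIN is an assumed override for the executable name; it defaults to "mcpbr".
run_split() {
  local config="$1" outdir="$2"
  local mcpbr_bin="${MCPBR_BIN:-mcpbr}"
  local status=0

  mkdir -p "$outdir" || return 1

  # Baseline only (-B), JSON results saved with -o.
  "$mcpbr_bin" run -c "$config" -B -o "$outdir/baseline.json" || status=$?

  # Stop on a fatal error (1) or a user interrupt (130). A baseline that
  # resolves nothing (2) or evaluates nothing (3) does not block the MCP run.
  case "$status" in
    1|130) return "$status" ;;
  esac

  # MCP only (-M), saved to a second file.
  "$mcpbr_bin" run -c "$config" -M -o "$outdir/mcp.json"
}
```

Calling run_split config.yaml results would then leave results/baseline.json and results/mcp.json side by side, assuming both runs complete.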
mcpbr init
mcpbr init [OPTIONS]
| Option | Short | Description |
| --output PATH | -o | Path to write config (default: mcpbr.yaml) |
| --template TEXT | -t | Template ID to use |
| --interactive | -i | Interactive template selection |
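As an illustration of the --output option, the sketch below generates a configuration only when the target file does not already exist. Whether mcpbr init overwrites an existing file is not stated here, so the guard avoids depending on that behavior. The function name init_if_missing and the MCPBR_BIN override are hypothetical and exist only for this example.

```shell
# Hypothetical helper: create a config with "mcpbr init" unless one exists.
# MCPBR_BIN is an assumed override for the executable name; it defaults to "mcpbr".
init_if_missing() {
  local config="${1:-mcpbr.yaml}"   # mcpbr.yaml is the documented default path
  local mcpbr_bin="${MCPBR_BIN:-mcpbr}"

  if [ -e "$config" ]; then
    echo "keeping existing $config" >&2
    return 0
  fi

  "$mcpbr_bin" init -o "$config"
}
```

A typical call would be init_if_missing my-config.yaml, followed by mcpbr run -c my-config.yaml.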
mcpbr cleanup
Remove orphaned mcpbr Docker resources (containers, volumes, networks).
# Preview what would be removed
mcpbr cleanup --dry-run
# Remove with confirmation
mcpbr cleanup
# Force remove all immediately
mcpbr cleanup -f
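In a non-interactive environment such as a CI job, the confirmation prompt would presumably block, so the two commands above can be chained: preview first, then force removal. This is a sketch under that assumption. The function name cleanup_orphans and the MCPBR_BIN override are hypothetical; only cleanup, --dry-run, and -f come from the examples above.

```shell
# Hypothetical helper for unattended cleanup.
# MCPBR_BIN is an assumed override for the executable name; it defaults to "mcpbr".
cleanup_orphans() {
  local mcpbr_bin="${MCPBR_BIN:-mcpbr}"

  # Log what would be removed; stop if the preview itself fails.
  "$mcpbr_bin" cleanup --dry-run || return

  # Remove without asking for confirmation.
  "$mcpbr_bin" cleanup -f
}
```

Keeping the dry run in the job log leaves a record of what was removed.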
Exit Codes
| Code | Meaning |
| 0 | Success — at least one task resolved |
| 1 | Fatal error — config invalid, Docker unavailable, API error |
| 2 | No resolutions — evaluation ran but 0% success |
| 3 | Nothing evaluated — all tasks cached/skipped |
| 130 | Interrupted by user (Ctrl+C) |
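Scripts that wrap mcpbr can branch on these codes. The sketch below maps a status to the meaning listed in the table and reports it after running an arbitrary command. The function names describe_exit_code and run_and_report are illustrative assumptions, not part of mcpbr.

```shell
# Translate an exit status into the meaning given in the exit-code table.
describe_exit_code() {
  case "$1" in
    0)   echo "success: at least one task resolved" ;;
    1)   echo "fatal error: config invalid, Docker unavailable, or API error" ;;
    2)   echo "no resolutions: evaluation ran but 0% success" ;;
    3)   echo "nothing evaluated: all tasks cached or skipped" ;;
    130) echo "interrupted by user (Ctrl+C)" ;;
    *)   echo "unknown exit code: $1" ;;
  esac
}

# Run the given command, print the meaning of its status on stderr,
# and pass the status through unchanged.
run_and_report() {
  local status=0
  "$@" || status=$?
  describe_exit_code "$status" >&2
  return "$status"
}
```

For example, run_and_report mcpbr run -c config.yaml prints the interpretation and still returns the original status, so a CI step can decide for itself whether code 2 or 3 counts as a failure.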
Environment Variables
| Variable | Required | Description |
| ANTHROPIC_API_KEY | Yes | Anthropic API key for Claude models |
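Because the key is required, a wrapper script may want to fail early with a clear message rather than rely on an API error during the run. A minimal sketch follows; the function name require_api_key is a hypothetical helper, and only the variable name ANTHROPIC_API_KEY comes from the table.

```shell
# Return non-zero with a message if ANTHROPIC_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; export it before running mcpbr" >&2
    return 1
  fi
}
```

A script could then call require_api_key && mcpbr run -c config.yaml so the evaluation starts only when the key is present.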
Created by Grey Newell