Configuration

mcpbr uses YAML configuration files to define your MCP server settings and evaluation parameters.

Quick Start

The fastest way to get started is to run one of the example configurations:

mcpbr run -c examples/quick-start/getting-started.yaml -v

Or generate a custom config:

mcpbr init

Full Configuration Example

# MCP Server Configuration
mcp_server:
  name: "mcpbr"
  command: "npx"
  args:
    - "-y"
    - "@modelcontextprotocol/server-filesystem"
    - "{workdir}"
  env: {}

# Provider and Harness
provider: "anthropic"
agent_harness: "claude-code"

# Model Configuration
model: "sonnet"

# Benchmark Selection
benchmark: "swe-bench-verified"
sample_size: 10

# Execution Parameters
timeout_seconds: 300
max_concurrent: 4
max_iterations: 10

# Docker Configuration
use_prebuilt_images: true

MCP Server Section

| Field | Type | Description |
| --- | --- | --- |
| name | string | Name to register the MCP server as (default: mcpbr) |
| command | string | Executable to run (e.g., npx, uvx, python) |
| args | list | Command arguments. Use {workdir} as placeholder |
| env | dict | Additional environment variables |
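
To make the {workdir} placeholder concrete, here is a minimal sketch of how the args list could be rendered before the server is launched. The function name `render_args` and the example path are hypothetical, and plain string replacement is an assumption about the behavior, not mcpbr's confirmed implementation.

```python
from typing import List


def render_args(args: List[str], workdir: str) -> List[str]:
    """Return a copy of args with every "{workdir}" placeholder replaced.

    Assumption: the placeholder is substituted literally. Plain string
    replacement is used instead of str.format so that other braces in an
    argument (for example inline JSON) are left untouched.
    """
    return [arg.replace("{workdir}", workdir) for arg in args]


# Hypothetical working directory, used only for illustration.
args = ["-y", "@modelcontextprotocol/server-filesystem", "{workdir}"]
print(render_args(args, "/tmp/task-repo"))
```

The original list is not modified, so the same configuration can be rendered once per task with a different directory.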

Environment Variables

mcp_server:
  command: "npx"
  args: ["-y", "@supermodeltools/mcp-server"]
  env:
    SUPERMODEL_API_KEY: "${SUPERMODEL_API_KEY}"
    LOG_LEVEL: "${LOG_LEVEL:-info}"
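
The `${VAR}` and `${VAR:-default}` forms above resemble shell parameter expansion. The sketch below shows one way such references could be resolved, assuming shell-like semantics where the default applies when the variable is unset or empty. The name `expand_env` and the choice to raise an error for a missing variable without a default are illustrative assumptions, not mcpbr's documented behavior.

```python
import re
from typing import Mapping

# Matches ${NAME} and ${NAME:-default}.
_REFERENCE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(value: str, environ: Mapping[str, str]) -> str:
    """Expand ${NAME} and ${NAME:-default} references in value."""

    def substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if default is not None:
            # Shell-like rule: fall back when the variable is unset or empty.
            return environ.get(name) or default
        if name not in environ:
            # Assumption: failing loudly is preferable to passing an empty key.
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _REFERENCE.sub(substitute, value)


# Hypothetical environment, used only for illustration.
environ = {"SUPERMODEL_API_KEY": "example-key"}
print(expand_env("${SUPERMODEL_API_KEY}", environ))
print(expand_env("${LOG_LEVEL:-info}", environ))
```

Passing the environment as a mapping rather than reading `os.environ` directly keeps the function easy to test.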

Benchmark Selection

| Field | Default | Description |
| --- | --- | --- |
| benchmark | swe-bench-verified | Benchmark to run |
| sample_size | null | Number of tasks (null = full dataset) |
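
The meaning of a null sample_size can be sketched as follows. The function `select_tasks` is hypothetical, and taking the first N tasks is an assumption made for illustration; the source does not say whether tasks are taken in order or sampled at random.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_tasks(tasks: Sequence[T], sample_size: Optional[int]) -> List[T]:
    """Return the tasks to evaluate for a given sample_size."""
    if sample_size is None:
        # null in the YAML file loads as None: use the full dataset.
        return list(tasks)
    if sample_size < 1:
        raise ValueError("sample_size must be a positive integer or null")
    # Assumption: take the first N tasks. Slicing past the end is safe,
    # so asking for more tasks than exist returns the whole dataset.
    return list(tasks[:sample_size])


# Placeholder task identifiers, used only for illustration.
tasks = ["task-a", "task-b", "task-c", "task-d"]
print(select_tasks(tasks, 2))
print(select_tasks(tasks, None))
```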

Execution Parameters

| Field | Default | Description |
| --- | --- | --- |
| timeout_seconds | 300 | Timeout per task in seconds |
| max_concurrent | 4 | Maximum parallel task evaluations |
| max_iterations | 10 | Maximum agent iterations per task |
| thinking_budget | null | Extended thinking token budget (1024-31999) |
| budget | null | Maximum budget in USD |
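
A configuration can be sanity-checked before a run. The sketch below mirrors the defaults in the table; only the 1024-31999 range for thinking_budget comes from the table, while the other checks (positive timeout, at least one worker, and so on) are assumptions. The names `ExecutionParams` and `validate` are hypothetical and are not part of mcpbr.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExecutionParams:
    # Defaults copied from the table above.
    timeout_seconds: int = 300
    max_concurrent: int = 4
    max_iterations: int = 10
    thinking_budget: Optional[int] = None
    budget: Optional[float] = None


def validate(params: ExecutionParams) -> List[str]:
    """Return a list of human-readable problems; empty means the values look sane."""
    problems: List[str] = []
    if params.timeout_seconds <= 0:
        problems.append("timeout_seconds must be positive")
    if params.max_concurrent < 1:
        problems.append("max_concurrent must be at least 1")
    if params.max_iterations < 1:
        problems.append("max_iterations must be at least 1")
    tb = params.thinking_budget
    if tb is not None and not (1024 <= tb <= 31999):
        problems.append("thinking_budget must be null or between 1024 and 31999")
    if params.budget is not None and params.budget <= 0:
        problems.append("budget must be null or a positive USD amount")
    return problems


print(validate(ExecutionParams()))
print(validate(ExecutionParams(thinking_budget=500)))
```

Collecting every problem instead of raising on the first one lets a user fix a config file in a single pass.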

Example Configurations

Filesystem Server

mcp_server:
  command: "npx"
  args: ["-y", "@modelcontextprotocol/server-filesystem", "{workdir}"]

Fast Iteration (Development)

mcp_server:
  command: "npx"
  args: ["-y", "@modelcontextprotocol/server-filesystem", "{workdir}"]

model: "haiku"
sample_size: 3
max_concurrent: 1
timeout_seconds: 180

Full Benchmark Run

mcp_server:
  command: "npx"
  args: ["-y", "@modelcontextprotocol/server-filesystem", "{workdir}"]

model: "sonnet"
sample_size: null
max_concurrent: 8
timeout_seconds: 600
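
To compare the two profiles above, a rough upper bound on wall-clock time can be computed from sample_size, max_concurrent, and timeout_seconds. This is a back-of-the-envelope sketch: it assumes every task runs to its timeout and ignores setup costs such as pulling Docker images. The function name is hypothetical, and the figure of 500 tasks assumes the full SWE-bench Verified dataset.

```python
import math


def worst_case_seconds(num_tasks: int, max_concurrent: int, timeout_seconds: int) -> int:
    """Upper bound on run time when tasks run in max_concurrent parallel slots."""
    if num_tasks < 0 or max_concurrent < 1 or timeout_seconds < 0:
        raise ValueError("expected num_tasks >= 0, max_concurrent >= 1, timeout_seconds >= 0")
    # Each slot handles at most ceil(num_tasks / max_concurrent) tasks in turn,
    # and no task may run longer than timeout_seconds.
    rounds = math.ceil(num_tasks / max_concurrent)
    return rounds * timeout_seconds


# Fast iteration profile: 3 tasks, 1 at a time, 180 s each.
print(worst_case_seconds(3, 1, 180))    # 540

# Full run profile: assumes 500 tasks, 8 at a time, 600 s each.
print(worst_case_seconds(500, 8, 600))  # 37800
```

Under these assumptions the development profile finishes within 9 minutes, while the full run could take up to 10.5 hours, which is why a small sample_size is useful while iterating on a server.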

Created by Grey Newell