
MXCP Quickstart Guide

This guide demonstrates MXCP's structured approach to building production-ready MCP servers. You'll learn not just how to create AI tools, but how to build them the right way - with proper data modeling, comprehensive testing, and enterprise-grade security.


Installation

First, install MXCP:

pip install mxcp

# For advanced features (optional)
pip install "mxcp[vault]"        # HashiCorp Vault integration
pip install "mxcp[onepassword]"  # 1Password integration
pip install "mxcp[all]"          # All optional features

Path 1: Hello World (2 minutes)

Perfect for understanding the basics with both SQL and Python examples:

1. Initialize a Project

Create a new project with hello world examples:

# Create a new directory and initialize MXCP
mkdir my-mxcp-project
cd my-mxcp-project
mxcp init --bootstrap

This creates an organized project structure:

my-mxcp-project/
├── mxcp-site.yml            # Project configuration
├── tools/                   # Tool definitions (MCP tools)
│   └── hello-world.yml      # Example tool definition
├── resources/               # Resource definitions (MCP resources)
├── prompts/                 # Prompt definitions (MCP prompts)
├── evals/                   # Evaluation definitions
├── python/                  # Python endpoints and shared code
├── plugins/                 # MXCP plugins for DuckDB
├── sql/                     # SQL implementations
│   └── hello-world.sql      # SQL implementation for tools
├── drift/                   # Drift detection snapshots
├── audit/                   # Audit logs
└── server_config.json       # Claude Desktop config (auto-generated)

🏗️ Organized by Design: MXCP enforces a structured approach where each endpoint type has its own directory:

  • tools/ - MCP tool definitions (.yml files that define callable functions)
  • resources/ - MCP resource definitions (.yml files that define data resources)
  • prompts/ - MCP prompt definitions (.yml files that define reusable prompts)
  • evals/ - Evaluation definitions for testing your endpoints
  • python/ - Python endpoints and shared code
  • plugins/ - MXCP plugins for DuckDB (User Defined Functions)
  • sql/ - SQL implementations for data queries
  • drift/ - Schema drift detection snapshots (auto-generated)
  • audit/ - Audit logs (auto-generated when enabled)

The --bootstrap flag provides:

  • ✅ Complete organized directory structure
  • ✅ Examples in both SQL and Python
  • ✅ Automatic server_config.json generation that handles virtualenvs
  • ✅ Clear, actionable next steps
  • ✅ Platform-specific Claude Desktop config paths

2. Explore the Generated Files

The bootstrap creates examples in both languages:

SQL Example (for data queries):

# tools/hello-world.yml
mxcp: 1
tool:
  name: "hello_world"
  description: "A simple hello world tool"
  enabled: true
  parameters:
    - name: "name"
      type: "string"
      description: "Name to greet"
      examples: ["World"]
  return:
    type: "string"
    description: "Greeting message"
  source:
    file: "../sql/hello-world.sql"

-- sql/hello-world.sql
SELECT 'Hello, ' || $name || '!' as greeting
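
You can try the tool directly from the CLI, using the same mxcp run syntax shown later in this guide:

mxcp run tool hello_world --param name=World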

Python Example (for complex logic):

# tools/calculate-fibonacci.yml
mxcp: 1
tool:
  name: "calculate_fibonacci"
  description: "Calculate Fibonacci number"
  language: python
  parameters:
    - name: "n"
      type: "integer"
      description: "Position in Fibonacci sequence"
      minimum: 0
      maximum: 100
  return:
    type: "object"
    properties:
      position: { type: "integer" }
      value: { type: "integer" }
      calculation_time: { type: "number" }
  source:
    file: "../python/math_tools.py"

# python/math_tools.py
import time
from mxcp.runtime import db, config

def calculate_fibonacci(n: int) -> dict:
    """Calculate the nth Fibonacci number"""
    start_time = time.time()

    if n <= 1:
        value = n
    else:
        a, b = 0, 1
        for _ in range(2, n + 1):
            a, b = b, a + b
        value = b

    # Optional: Store in database for caching
    db.execute(
        "INSERT OR REPLACE INTO fibonacci_cache (n, value) VALUES ($n, $value)",
        {"n": n, "value": value}
    )

    return {
        "position": n,
        "value": value,
        "calculation_time": time.time() - start_time
    }
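
Note that the optional caching call assumes a fibonacci_cache table already exists in your DuckDB database; the bootstrap does not create one. A minimal sketch of the DDL it would need:

-- Hypothetical cache table for the Fibonacci example
CREATE TABLE IF NOT EXISTS fibonacci_cache (
    n INTEGER PRIMARY KEY,
    value BIGINT
);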

3. Start the MCP Server

mxcp serve

The server starts in stdio mode, ready for LLM integration. If you used --bootstrap, the generated server_config.json is already configured correctly for your environment (virtualenv, poetry, or system-wide installation).
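
If you want to inspect the generated server_config.json, it follows the standard Claude Desktop format shown later in this guide. A typical sketch (the server name and paths here are illustrative; the command depends on whether mxcp lives in a virtualenv):

{
  "mcpServers": {
    "my-mxcp-project": {
      "command": "/path/to/venv/bin/mxcp",
      "args": ["serve", "--transport", "stdio"],
      "cwd": "/absolute/path/to/my-mxcp-project"
    }
  }
}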

Path 2: Real-World Data Pipeline (10 minutes)

Experience MXCP's production capabilities with the COVID-19 + dbt example:

1. Get the COVID Example

git clone https://github.com/raw-labs/mxcp.git
cd mxcp/examples/covid_owid

2. Understand the dbt Integration

This example showcases MXCP's killer feature: dbt-native data caching.

# dbt_project.yml - Standard dbt project
name: 'covid_owid'
version: '1'
profile: 'covid_owid'
model-paths: ["models"]
target-path: "target"

-- models/covid_data.sql - dbt model that creates covid_data table
{{ config(materialized='table') }}

select *
from read_csv_auto('https://github.com/owid/covid-19-data/raw/master/public/data/owid-covid-data.csv')

The magic: This dbt model fetches COVID data from the web and creates a covid_data table in DuckDB. MXCP endpoints then query this table directly using standard SQL.
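
For example, an endpoint's SQL implementation can select straight from the table. The query below is illustrative rather than part of the example project; the column names follow the OWID dataset:

-- Hypothetical endpoint SQL querying the dbt-built table
SELECT date, new_cases, new_deaths
FROM covid_data
WHERE location = $country
ORDER BY date;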

3. Run the Data Pipeline

# Install dbt dependencies
dbt deps

# Run dbt transformations (this caches the data locally)
dbt run

# Start MXCP server
mxcp serve

What just happened?

  1. dbt run fetched COVID data from OWID and created a covid_data table in DuckDB
  2. MXCP server exposes SQL query tools that can query this table directly
  3. LLMs can analyze months of COVID data instantly (no API calls!)

4. Connect to Claude Desktop

Add this to your Claude Desktop config (claude_desktop_config.json, typically found at ~/Library/Application Support/Claude/ on macOS or %APPDATA%\Claude\ on Windows):

{
  "mcpServers": {
    "covid": {
      "command": "mxcp",
      "args": ["serve", "--transport", "stdio"],
      "cwd": "/absolute/path/to/mxcp/examples/covid_owid"
    }
  }
}

5. Test the Integration

Ask Claude:

  • "Show me COVID cases in Germany vs France during 2021"
  • "What were the vaccination rates in the UK by month?"
  • "Compare hospitalization data between Italy and Spain"

The responses are instant because the data is cached locally!

Built-in SQL Tools: MXCP provides optional SQL query tools (execute_sql_query, list_tables, get_table_schema) that let Claude explore and query your data directly. These tools are disabled by default; see Option C under LLM Integration Options below for how to enable them in your configuration.

Path 3: Python-Powered Tools (5 minutes)

Build complex AI tools using Python's full ecosystem:

1. Create a Python Analysis Tool

# tools/sentiment-analyzer.yml
mxcp: 1
tool:
  name: sentiment_analyzer
  description: "Analyze text sentiment with ML"
  language: python
  parameters:
    - name: texts
      type: array
      items:
        type: string
      description: "Texts to analyze"
      maxItems: 100
  return:
    type: array
    items:
      type: object
      properties:
        text: { type: string }
        sentiment: { type: string, enum: ["positive", "negative", "neutral"] }
        confidence: { type: number, minimum: 0, maximum: 1 }
  source:
    file: "../python/ml_tools.py"

# python/ml_tools.py
from mxcp.runtime import db, config, on_init
import asyncio

# Simulate ML model loading (replace with real model)
model = None

@on_init
def load_model():
    """Load ML model on startup"""
    global model
    print("Loading sentiment model...")
    # model = load_your_model_here()
    model = {"loaded": True}  # Placeholder

async def sentiment_analyzer(texts: list[str]) -> list[dict]:
    """Analyze sentiment for multiple texts concurrently"""

    async def analyze_one(text: str) -> dict:
        # Simulate async API call or model inference
        await asyncio.sleep(0.1)

        # Simple rule-based sentiment (replace with real ML)
        sentiment = "positive" if "good" in text.lower() else \
                    "negative" if "bad" in text.lower() else \
                    "neutral"

        confidence = 0.95 if sentiment != "neutral" else 0.6

        # Optional: Store results for analytics
        db.execute(
            """
            INSERT INTO sentiment_history (text, sentiment, confidence, analyzed_at)
            VALUES ($text, $sentiment, $confidence, CURRENT_TIMESTAMP)
            """,
            {"text": text[:200], "sentiment": sentiment, "confidence": confidence}
        )

        return {
            "text": text,
            "sentiment": sentiment,
            "confidence": confidence
        }

    # Process all texts concurrently
    results = await asyncio.gather(*[analyze_one(text) for text in texts])
    return results

2. Create an API Integration Tool

# tools/weather-forecast.yml
mxcp: 1
tool:
  name: weather_forecast
  description: "Get weather forecast for a location"
  language: python
  parameters:
    - name: location
      type: string
      description: "City name or coordinates"
  return:
    type: object
    properties:
      location: { type: string }
      temperature: { type: number }
      conditions: { type: string }
      forecast:
        type: array
        items:
          type: object
          properties:
            day: { type: string }
            high: { type: number }
            low: { type: number }
  source:
    file: "../python/api_tools.py"

# python/api_tools.py
import json

import httpx
from mxcp.runtime import config, db

async def weather_forecast(location: str) -> dict:
    """Fetch weather forecast from external API"""

    # Get API key from secure config
    api_key = config.get_secret("weather_api_key")

    # Check cache first (DuckDB syntax: now() minus an interval)
    cached = db.execute(
        """
        SELECT forecast_data
        FROM weather_cache
        WHERE location = $location
          AND cached_at > now() - INTERVAL '1 hour'
        """,
        {"location": location}
    ).fetchone()

    if cached:
        # forecast_data is stored as a JSON string
        return json.loads(cached["forecast_data"])

    # Fetch from API
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.weather.com/forecast",
            params={"location": location, "key": api_key}
        )
        data = response.json()

    # Transform and cache the result
    result = {
        "location": location,
        "temperature": data["current"]["temp"],
        "conditions": data["current"]["conditions"],
        "forecast": [
            {
                "day": day["date"],
                "high": day["high"],
                "low": day["low"]
            }
            for day in data["forecast"][:5]
        ]
    }

    # Cache for future requests, serialized as JSON text
    db.execute(
        """
        INSERT OR REPLACE INTO weather_cache (location, forecast_data, cached_at)
        VALUES ($location, $data, CURRENT_TIMESTAMP)
        """,
        {"location": location, "data": json.dumps(result)}
    )

    return result
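
Like the Fibonacci example, this code assumes its cache table already exists. A minimal sketch, with forecast_data stored as JSON text:

-- Hypothetical cache table for the weather example
CREATE TABLE IF NOT EXISTS weather_cache (
    location TEXT PRIMARY KEY,
    forecast_data TEXT,    -- JSON-serialized forecast
    cached_at TIMESTAMP
);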

3. Combine SQL and Python

Create powerful tools that leverage both SQL for data and Python for logic:

# tools/customer-insights.yml
mxcp: 1
tool:
  name: customer_insights
  description: "Generate AI insights for customer behavior"
  language: python
  parameters:
    - name: customer_id
      type: string
  return:
    type: object
  source:
    file: "../python/insights.py"

# python/insights.py
from mxcp.runtime import db
import statistics

def customer_insights(customer_id: str) -> dict:
    """Combine SQL data analysis with Python ML insights"""

    # Use SQL for efficient data aggregation
    purchase_data = db.execute("""
        SELECT
            COUNT(*) as total_purchases,
            SUM(amount) as total_spent,
            AVG(amount) as avg_order_value,
            MAX(purchase_date) as last_purchase,
            STRING_AGG(DISTINCT category, ', ') as categories
        FROM purchases
        WHERE customer_id = $customer_id
    """, {"customer_id": customer_id}).fetchone()

    # Get purchase history for pattern analysis
    history = db.execute("""
        SELECT amount, purchase_date, category
        FROM purchases
        WHERE customer_id = $customer_id
        ORDER BY purchase_date DESC
        LIMIT 50
    """, {"customer_id": customer_id}).fetchall()

    # Use Python for complex analysis
    amounts = [p["amount"] for p in history]

    insights = {
        "customer_id": customer_id,
        **dict(purchase_data),
        "spending_trend": calculate_trend(amounts),
        "purchase_frequency": calculate_frequency(history),
        "predicted_ltv": predict_lifetime_value(purchase_data, history),
        "recommendations": generate_recommendations(history)
    }

    return insights

def calculate_trend(amounts):
    """Calculate spending trend"""
    if len(amounts) < 2:
        return "insufficient_data"

    recent = statistics.mean(amounts[:10])
    older = statistics.mean(amounts[10:20]) if len(amounts) >= 20 else amounts[-1]

    if recent > older * 1.1:
        return "increasing"
    elif recent < older * 0.9:
        return "decreasing"
    else:
        return "stable"

4. Test Your Python Tools

# Run individual tools
mxcp run tool sentiment_analyzer --param texts='["This is great!", "Bad experience"]'

# Test with mock data
mxcp test

# View execution logs
mxcp log --tool sentiment_analyzer

Path 4: Enterprise Features (15 minutes)

Experience MXCP's production-grade security and governance:

1. Policy Enforcement

Create a new endpoint with access control:

# tools/employee-data.yml
mxcp: 1
tool:
  name: employee_data
  description: "Query employee information"
  parameters:
    - name: employee_id
      type: string
      description: "Employee ID to query"
  return:
    type: object
    properties:
      name: { type: string }
      department: { type: string }
      salary: { type: number }
      ssn: { type: string, sensitive: true }
  source:
    code: |
      SELECT
        'John Doe' as name,
        'Engineering' as department,
        95000 as salary,
        '123-45-6789' as ssn

# Add enterprise-grade policies
policies:
  input:
    - condition: "!('hr.read' in user.permissions)"
      action: deny
      reason: "Missing HR read permission"
  output:
    - condition: "user.role != 'hr_manager'"
      action: filter_fields
      fields: ["salary", "ssn"]
      reason: "Sensitive data restricted to HR managers"

2. Enable Audit Logging

# mxcp-site.yml - Add audit configuration
profiles:
  production:
    audit:
      enabled: true
      path: audit-logs.jsonl

3. Test with User Context

# Test as regular user (will filter sensitive data)
mxcp run tool employee_data \
  --param employee_id=123 \
  --user-context '{"role": "user", "permissions": ["hr.read"]}'

# Test as HR manager (will see all data)
mxcp run tool employee_data \
  --param employee_id=123 \
  --user-context '{"role": "hr_manager", "permissions": ["hr.read", "pii.view"]}'

# View audit logs
mxcp log --since 10m

4. Authentication Setup

For production, enable OAuth authentication:

# mxcp-site.yml
profiles:
  production:
    auth:
      enabled: true
      provider: github
      client_id: your_github_client_id
      redirect_uri: http://localhost:8080/callback

Advanced Patterns

1. Multi-Source Data Pipeline

Step 1: dbt creates the tables

-- models/sales_analysis.sql (dbt model)
{{ config(materialized='table') }}

WITH daily_sales AS (
    SELECT * FROM {{ source('raw', 'sales_data') }}
),
customer_info AS (
    SELECT * FROM {{ ref('customers') }}  -- Another dbt model
),
external_data AS (
    SELECT * FROM 'https://api.example.com/market-data.json'
)
SELECT
    s.date,
    s.amount,
    c.segment,
    e.market_trend
FROM daily_sales s
JOIN customer_info c ON s.customer_id = c.id
JOIN external_data e ON s.date = e.date

Step 2: MXCP endpoint queries the table

# tools/sales-analysis.yml
tool:
  name: get_sales_analysis
  source:
    code: |
      SELECT * FROM sales_analysis  -- Table created by dbt
      WHERE date >= $start_date
2. Dynamic Caching Strategy

Step 1: dbt model combines live and historical data

-- models/live_dashboard.sql (dbt model)
{{ config(
    materialized='table',
    post_hook="PRAGMA optimize"
) }}

-- Cache recent data every hour, historical data daily
SELECT * FROM read_json('https://api.metrics.com/live-data')
WHERE timestamp >= current_timestamp - interval '24 hours'

UNION ALL

SELECT * FROM {{ ref('historical_metrics') }}
WHERE timestamp < current_timestamp - interval '24 hours'

Step 2: MXCP endpoint queries the combined table

# tools/dashboard.yml
tool:
  name: get_dashboard_metrics
  source:
    code: |
      SELECT * FROM live_dashboard  -- Table created by dbt
      WHERE metric_type = $metric_type
      ORDER BY timestamp DESC

3. Common Data Transformation Patterns

Reading and Aggregating Data

-- Aggregate data by time periods
SELECT
    DATE_TRUNC('month', created_at) as month,
    COUNT(*) as total_orders,
    SUM(amount) as total_revenue,
    AVG(amount) as avg_order_value
FROM orders
WHERE created_at >= $start_date
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month DESC;

-- Join multiple data sources
SELECT
    u.name,
    u.email,
    COUNT(o.id) as order_count,
    SUM(o.amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at >= $since_date
GROUP BY u.id, u.name, u.email
HAVING COUNT(o.id) > 0;

Reading from Various Sources

-- Read from CSV files
SELECT * FROM read_csv('data/sales-*.csv')
WHERE region = $region;

-- Read from JSON APIs
SELECT * FROM read_json_auto('https://api.example.com/data')
WHERE status = 'active';

-- Read from Parquet files
SELECT customer_id, SUM(amount) as total
FROM read_parquet('s3://bucket/transactions/*.parquet')
GROUP BY customer_id;

4. Type-Safe Parameter Validation

parameters:
  - name: date_range
    type: object
    properties:
      start_date:
        type: string
        format: date
        description: "Start date (YYYY-MM-DD)"
      end_date:
        type: string
        format: date
        description: "End date (YYYY-MM-DD)"
    required: ["start_date", "end_date"]

  - name: metrics
    type: array
    items:
      type: string
      enum: ["revenue", "users", "conversion"]
    description: "Metrics to include"
    minItems: 1
    maxItems: 5
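
On the command line, object and array parameters are passed as JSON strings, matching the --param syntax used earlier in this guide (the tool name below is hypothetical):

mxcp run tool my_metrics_tool \
  --param date_range='{"start_date": "2024-01-01", "end_date": "2024-03-31"}' \
  --param metrics='["revenue", "users"]'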

LLM Integration Options

Option A: Claude Desktop Integration

Best for interactive development and testing:

{
  "mcpServers": {
    "my-project": {
      "command": "mxcp",
      "args": ["serve", "--transport", "stdio"],
      "cwd": "/path/to/your/project"
    }
  }
}

Option B: HTTP API Mode

Perfect for web applications and custom integrations:

# Start HTTP server
mxcp serve --transport http --port 8080

# Test with curl
curl -X POST http://localhost:8080/tools/employee_data \
  -H "Content-Type: application/json" \
  -d '{"employee_id": "123"}'

Option C: Built-in SQL Tools (Optional)

MXCP provides optional SQL exploration tools that work with any MCP client:

Available Tools:

  • execute_sql_query - Run custom SQL queries
  • list_tables - See all available tables
  • get_table_schema - Inspect table structure

Example Usage:

# With mcp-cli
pip install mcp-cli
mcp-cli tools call list_tables
mcp-cli tools call execute_sql_query --sql "SELECT COUNT(*) FROM users"

# LLMs can use these directly
"Show me what tables are available, then count the users created this month"

Configuring SQL tools (they are disabled by default and must be enabled explicitly):

# mxcp-site.yml
sql_tools:
  enabled: true  # Default: false - must be explicitly enabled

Production Deployment

1. Environment Configuration

# mxcp-site.yml
profiles:
  development:
    database: dev.duckdb
    auth:
      enabled: false

  staging:
    database: staging.duckdb
    auth:
      enabled: true
      provider: github
    audit:
      enabled: true
      path: staging-audit.jsonl

  production:
    database: production.duckdb
    auth:
      enabled: true
      provider: atlassian
    audit:
      enabled: true
      path: /var/log/mxcp/audit.jsonl
    policies:
      strict_mode: true

2. Monitoring and Alerting

# Monitor error rates
mxcp log --since 1h --status error --export-duckdb errors.db

# Set up alerts for policy violations
mxcp log --policy deny --since 10m --export-csv violations.csv

# Track performance
mxcp log --since 1d | jq '.duration_ms' | awk '{sum+=$1; count++} END {print "Avg:", sum/count "ms"}'

3. Schema Drift Detection

# Create baseline snapshot
mxcp drift-snapshot --profile production

# Check for changes (run in CI/CD)
mxcp drift-check --profile production
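
To wire the drift check into CI, any runner that can install mxcp will do. A minimal GitHub Actions sketch (the workflow file, job name, and Python version are illustrative):

# .github/workflows/drift-check.yml (hypothetical)
name: drift-check
on: [pull_request]
jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install mxcp
      - run: mxcp drift-check --profile production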

Next Steps

Immediate Actions

  1. Validate Your Setup

    mxcp validate   # Check all endpoints
    mxcp test       # Run endpoint tests
    mxcp list       # Verify everything is loaded

    💡 Learn more about writing tests and ensuring quality in the Quality & Testing Guide

  2. Explore the CLI

    mxcp --help       # See all commands
    mxcp run --help   # Understand execution options
    mxcp log --help   # Learn about audit querying

Dive Deeper

  1. Type System - Master MXCP's type safety features
  2. Quality & Testing - Write comprehensive tests and ensure quality
  3. Policies - Implement fine-grained access control
  4. Authentication - Set up OAuth for your organization
  5. Plugins - Extend DuckDB with custom Python functions
  6. Drift Detection - Monitor changes across environments

Build Your Own

  1. Start Simple: Begin with basic SQL queries
  2. Add Types: Implement comprehensive type definitions
  3. Enable Security: Add authentication and policies
  4. Monitor: Set up audit logging and drift detection
  5. Scale: Move to production with proper profiles and monitoring

Troubleshooting

Common Issues

dbt models not found:

# Ensure dbt project is properly configured
dbt debug
dbt compile

Policy errors:

# Test with explicit user context
mxcp run tool my_tool --user-context '{"role": "admin"}'

Authentication issues:

# Check OAuth configuration
mxcp validate --profile production

Getting Help

  • Documentation: All features are documented in the docs/ directory
  • Examples: Check examples/ for real-world patterns
  • Community: Join our community for support and discussions
  • Issues: Report bugs and feature requests on GitHub

Why MXCP?

After completing this quickstart, you should understand MXCP's unique value:

  1. Flexible Implementation: Choose SQL for data queries, Python for complex logic, or both
  2. Enterprise Security: Policy enforcement, audit trails, OAuth authentication
  3. Production Ready: Type safety, monitoring, drift detection, comprehensive testing
  4. Developer Experience: Fast iteration, hot reload, comprehensive validation
  5. Scalability: From prototype to production without re-architecting

Key Architecture Patterns

SQL + dbt Workflow:

  1. dbt models (models/*.sql) → Create tables/views in DuckDB using dbt syntax
  2. MXCP tools (tools/*.yml) → Query the dbt tables directly using standard SQL
  3. SQL files (sql/*.sql) → Contain the actual SQL logic referenced by tools

Python Workflow:

  1. Python modules (python/*.py) → Implement complex logic, API calls, ML models
  2. MXCP tools (tools/*.yml) → Define interfaces with language: python
  3. Runtime services → Access database, config, and secrets via mxcp.runtime

Perfect separation: Choose the right tool for each job, with all endpoints getting enterprise features automatically.

Learn More