SQL Endpoints Tutorial
Related Topics: Type System (parameter types) | DuckDB Integration (extensions, features) | dbt Integration (data modeling) | SQL Reference (built-in functions)
SQL endpoints are ideal for data queries and aggregations. In this tutorial, you’ll learn to build increasingly complex SQL tools, from simple queries to sophisticated analytics.
Build SQL tools that:
- Query data with parameters
- Perform aggregations and joins
- Use DuckDB’s analytical features
- Read from various data sources
Prerequisites
Section titled “Prerequisites”- Completed the Hello World Tutorial
- Basic SQL knowledge
- A project directory with
mxcp init
Step 1: Simple Query with Parameters
Section titled “Step 1: Simple Query with Parameters”Create a tool that fetches user data by ID.
Create sample data (sql/setup.sql):
-- Run this manually to set up test dataCREATE TABLE IF NOT EXISTS users ( id INTEGER PRIMARY KEY, name VARCHAR NOT NULL, email VARCHAR, department VARCHAR, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
INSERT INTO users (id, name, email, department) VALUES (1, 'Alice Smith', 'alice@example.com', 'Engineering'), (2, 'Bob Johnson', 'bob@example.com', 'Sales'), (3, 'Carol Williams', 'carol@example.com', 'Engineering'), (4, 'David Brown', 'david@example.com', 'Marketing');Initialize the database:
mxcp query --file sql/setup.sqlTool definition (tools/get-user.yml):
mxcp: 1tool: name: get_user description: Get user information by ID parameters: - name: user_id type: integer description: User's unique identifier minimum: 1 examples: [1, 2, 3]
return: type: object properties: id: type: integer name: type: string email: type: string department: type: string
source: file: ../sql/get-user.sql
tests: - name: get_alice arguments: - key: user_id value: 1 result_contains: name: "Alice Smith"SQL implementation (sql/get-user.sql):
SELECT id, name, email, departmentFROM usersWHERE id = $user_idTest it:
mxcp run tool get_user --param user_id=1Step 2: Filtering with Multiple Parameters
Section titled “Step 2: Filtering with Multiple Parameters”Create a tool that searches users with filters.
Tool definition (tools/search-users.yml):
mxcp: 1tool: name: search_users description: Search users by department and name pattern parameters: - name: department type: string description: Department to filter by (Engineering, Sales, Marketing, HR) default: null
- name: name_pattern type: string description: Name pattern to search (case insensitive) default: "%"
- name: limit type: integer description: Maximum number of results minimum: 1 maximum: 100 default: 10
return: type: array items: type: object properties: id: type: integer name: type: string department: type: string
source: file: ../sql/search-users.sqlSQL implementation (sql/search-users.sql):
SELECT id, name, departmentFROM usersWHERE (department = $department OR $department IS NULL) AND name ILIKE $name_patternORDER BY nameLIMIT $limitTest different filters:
# All Engineering usersmxcp run tool search_users --param department=Engineering
# Users with 'a' in namemxcp run tool search_users --param name_pattern='%a%'
# Combined filtersmxcp run tool search_users --param department=Engineering --param name_pattern='%a%'Step 3: Aggregations
Section titled “Step 3: Aggregations”Create a tool that generates department statistics.
Tool definition (tools/department-stats.yml):
mxcp: 1tool: name: department_stats description: Get statistics by department parameters: - name: department type: string description: Specific department (optional, omit for all) default: null
return: type: array items: type: object properties: department: type: string user_count: type: integer emails_configured: type: integer
source: file: ../sql/department-stats.sqlSQL implementation (sql/department-stats.sql):
SELECT department, COUNT(*) AS user_count, COUNT(email) AS emails_configuredFROM usersWHERE department = $department OR $department IS NULLGROUP BY departmentORDER BY user_count DESCTest:
mxcp run tool department_statsmxcp run tool department_stats --param department=EngineeringStep 4: Joins and Complex Queries
Section titled “Step 4: Joins and Complex Queries”Add orders data and create analytics tools.
Add orders table (sql/setup-orders.sql):
CREATE TABLE IF NOT EXISTS orders ( id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id), amount DECIMAL(10,2), status VARCHAR, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
INSERT INTO orders (id, user_id, amount, status) VALUES (1, 1, 150.00, 'completed'), (2, 1, 75.50, 'completed'), (3, 2, 200.00, 'pending'), (4, 2, 50.00, 'completed'), (5, 3, 300.00, 'completed');Initialize:
mxcp query --file sql/setup-orders.sqlTool definition (tools/user-orders.yml):
mxcp: 1tool: name: user_orders_summary description: Get order summary for a user parameters: - name: user_id type: integer description: User ID minimum: 1
return: type: object properties: user_id: type: integer user_name: type: string total_orders: type: integer total_amount: type: number average_order: type: number completed_orders: type: integer
source: file: ../sql/user-orders.sqlSQL implementation (sql/user-orders.sql):
SELECT u.id AS user_id, u.name AS user_name, COUNT(o.id) AS total_orders, COALESCE(SUM(o.amount), 0) AS total_amount, COALESCE(AVG(o.amount), 0) AS average_order, COUNT(CASE WHEN o.status = 'completed' THEN 1 END) AS completed_ordersFROM users uLEFT JOIN orders o ON u.id = o.user_idWHERE u.id = $user_idGROUP BY u.id, u.nameTest:
mxcp run tool user_orders_summary --param user_id=1mxcp run tool user_orders_summary --param user_id=2Step 5: Date Handling
Section titled “Step 5: Date Handling”Create a tool with date parameters.
Tool definition (tools/orders-by-date.yml):
mxcp: 1tool: name: orders_by_date_range description: Get orders within a date range parameters: - name: start_date type: string format: date description: Start date (YYYY-MM-DD) examples: ["2024-01-01"]
- name: end_date type: string format: date description: End date (YYYY-MM-DD) examples: ["2024-12-31"]
- name: status type: string description: Filter by status (pending, completed, cancelled) examples: ["pending", "completed", "cancelled"] default: null
return: type: array items: type: object properties: order_id: type: integer user_name: type: string amount: type: number status: type: string created_at: type: string format: date-time
source: file: ../sql/orders-by-date.sqlSQL implementation (sql/orders-by-date.sql):
SELECT o.id AS order_id, u.name AS user_name, o.amount, o.status, o.created_atFROM orders oJOIN users u ON o.user_id = u.idWHERE o.created_at >= $start_date::DATE AND o.created_at < ($end_date::DATE + INTERVAL '1 day') AND (o.status = $status OR $status IS NULL)ORDER BY o.created_at DESCStep 6: Reading External Data
Section titled “Step 6: Reading External Data”DuckDB can read data directly from files and URLs.
Tool definition (tools/read-csv.yml):
mxcp: 1tool: name: analyze_csv description: Analyze a CSV file parameters: - name: file_path type: string description: Path to CSV file examples: ["data/sales.csv"]
- name: column type: string description: Column to aggregate examples: ["amount", "quantity"]
return: type: object properties: row_count: type: integer sum: type: number average: type: number min: type: number max: type: number
source: code: | SELECT COUNT(*) AS row_count, SUM(CAST(column_value AS DOUBLE)) AS sum, AVG(CAST(column_value AS DOUBLE)) AS average, MIN(CAST(column_value AS DOUBLE)) AS min, MAX(CAST(column_value AS DOUBLE)) AS max FROM ( SELECT $column AS column_value FROM read_csv_auto($file_path) )Reading from URLs (requires httpfs extension):
# In mxcp-site.ymlextensions: - httpfsSELECT *FROM read_csv_auto('https://example.com/data.csv')WHERE date >= $start_dateStep 7: Window Functions
Section titled “Step 7: Window Functions”Create analytics with window functions.
Tool definition (tools/user-ranking.yml):
mxcp: 1tool: name: user_spending_ranking description: Rank users by total spending parameters: - name: top_n type: integer description: Number of top users to return minimum: 1 maximum: 100 default: 10
return: type: array items: type: object properties: rank: type: integer user_name: type: string total_spent: type: number order_count: type: integer
source: file: ../sql/user-ranking.sqlSQL implementation (sql/user-ranking.sql):
WITH user_totals AS ( SELECT u.name AS user_name, COALESCE(SUM(o.amount), 0) AS total_spent, COUNT(o.id) AS order_count FROM users u LEFT JOIN orders o ON u.id = o.user_id AND o.status = 'completed' GROUP BY u.id, u.name)SELECT ROW_NUMBER() OVER (ORDER BY total_spent DESC) AS rank, user_name, total_spent, order_countFROM user_totalsORDER BY rankLIMIT $top_nVerification
Section titled “Verification”Run all validations:
# Validate all toolsmxcp validate
# Run all testsmxcp test
# List available toolsmxcp listBest Practices
Section titled “Best Practices”1. Use Named Parameters
Section titled “1. Use Named Parameters”Always use $param_name for parameters:
-- GoodWHERE id = $user_id
-- Avoid (not supported)WHERE id = ?2. Handle NULL Parameters
Section titled “2. Handle NULL Parameters”Use OR $param IS NULL for optional filters:
WHERE department = $department OR $department IS NULL3. Use COALESCE for Defaults
Section titled “3. Use COALESCE for Defaults”Handle missing data gracefully:
COALESCE(SUM(amount), 0) AS total4. Type Cast When Needed
Section titled “4. Type Cast When Needed”Ensure types match:
WHERE created_at >= $start_date::DATE5. Limit Results
Section titled “5. Limit Results”Always limit unbounded queries:
ORDER BY created_at DESCLIMIT $limitNext Steps
Section titled “Next Steps”- Python Endpoints Tutorial - Add Python logic
- dbt Integration - Use dbt for data modeling
- SQL Reference - Complete SQL features