# Databricks MCP Server

MCP server for Databricks integration, enabling LLM tools to interact with Databricks services.
A production-ready Model Context Protocol (MCP) server that exposes Databricks REST capabilities to MCP-compatible agents and tooling. Version 0.4.4 introduces structured responses, resource caching, retry-aware networking, and end-to-end resilience improvements.
Key capabilities:

- Tool calls return `CallToolResult` objects with a human-readable summary in `content` and machine-readable payloads in `structuredContent` that conform to the tool's `outputSchema`.
- Large artifacts are exposed as `resource_link` content blocks with URIs such as `resource://databricks/exports/{id}` (also reflected in `metadata` for convenience).
- Built on `mcp.server.FastMCP` with centralized JSON logging and concurrency guards for predictable stdio behaviour.

Key modules:

- `databricks_mcp/server/databricks_mcp_server.py` - FastMCP server with tool registration, progress handling, metrics, and resource caching.
- `databricks_mcp/core/utils.py` - HTTP utilities with correlation IDs, retries, and error mapping to `DatabricksAPIError`.
- `databricks_mcp/core/logging_utils.py` - JSON logging configuration for stderr/file outputs.
- `databricks_mcp/core/models.py` - Pydantic models (e.g., `ClusterConfig`) used by tool schemas.
- `tests/` - suites that mock Databricks APIs to validate orchestration, structured responses, and schema metadata without shell scripts.

For an in-depth tour of data flow and design decisions, see ARCHITECTURE.md.
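To make the result shape above concrete, here is a minimal sketch of what a consumer sees: a `content` list holding the text summary and any `resource_link` blocks, plus a `structuredContent` payload. The specific cluster IDs and export ID are invented for illustration.

```python
# Illustrative shape of a structured tool result (values are made up).
summary_block = {"type": "text", "text": "Found 2 clusters"}
link_block = {
    "type": "resource_link",
    # The export id "abc123" is hypothetical.
    "uri": "resource://databricks/exports/abc123",
    "name": "export-abc123",
}
structured_content = {"clusters": [{"cluster_id": "c1"}, {"cluster_id": "c2"}]}

result = {
    "content": [summary_block, link_block],
    "structuredContent": structured_content,
}

# Consumers read the human summary from content and the payload
# from structuredContent.
texts = [b["text"] for b in result["content"] if b["type"] == "text"]
print(texts[0])
```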
This project uses `uv` for dependency management and publishing.

Register the server with Cursor using the deeplink below - it resolves to `uvx databricks-mcp-server@latest` and picks up future updates automatically.
```
cursor://anysphere.cursor-deeplink/mcp/install?name=databricks-mcp&config=eyJjb21tYW5kIjoidXZ4IiwiYXJncyI6WyJkYXRhYnJpY2tzLW1jcC1zZXJ2ZXIiXSwiZW52Ijp7IkRBVEFCUklDS1NfSE9TVCI6IiR7REFUQUJSSUNLU19IT1NUfSIsIkRBVEFCUklDS1NfVE9LRU4iOiIke0RBVEFCUklDS1NfVE9LRU59IiwiREFUQUJSSUNLU19XQVJFSE9VU0VfSUQiOiIke0RBVEFCUklDS1NfV0FSRUhPVVNFX0lEfSJ9fQ==
```
```bash
# Clone and enter the repository
git clone https://github.com/markov-kernel/databricks-mcp.git
cd databricks-mcp

# Create an isolated environment (optional but recommended)
uv venv
source .venv/bin/activate  # Linux/Mac
# .\.venv\Scripts\activate  # Windows PowerShell

# Install package and development dependencies
uv pip install -e .
uv pip install -e ".[dev]"
```
Set the following environment variables (or populate `.env` from `.env.example`).
```bash
export DATABRICKS_HOST="https://your-workspace.databricks.com"
export DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX"
export DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345"  # optional default

export TOOL_TIMEOUT_SECONDS=300
export MAX_CONCURRENT_REQUESTS=8
export HTTP_TIMEOUT_SECONDS=60
export API_MAX_RETRIES=3
export API_RETRY_BACKOFF_SECONDS=0.5
```
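A sketch of how a client process might assemble this configuration from the environment, applying the defaults shown above. The helper name `load_databricks_env` is ours, not part of the package.

```python
import os

def load_databricks_env(env=os.environ):
    """Collect the documented variables, applying the defaults listed above."""
    required = ["DATABRICKS_HOST", "DATABRICKS_TOKEN"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {missing}")
    return {
        "host": env["DATABRICKS_HOST"].rstrip("/"),
        "token": env["DATABRICKS_TOKEN"],
        "warehouse_id": env.get("DATABRICKS_WAREHOUSE_ID"),  # optional default
        "tool_timeout": int(env.get("TOOL_TIMEOUT_SECONDS", "300")),
        "max_retries": int(env.get("API_MAX_RETRIES", "3")),
    }

# Example with only the required variables set; defaults fill the rest.
config = load_databricks_env({
    "DATABRICKS_HOST": "https://your-workspace.databricks.com/",
    "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXX",
})
print(config["tool_timeout"])
```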
```bash
uvx databricks-mcp-server@latest
```
Tip: append `--refresh` (e.g., `uvx databricks-mcp-server@latest --refresh`) to force `uv` to resolve the latest PyPI release after publishing.

Logs are emitted as JSON lines to stderr and persisted to `databricks_mcp.log` in the working directory.
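Because each log line is a standalone JSON object, the log file is easy to filter with a few lines of Python. The `level` and `message` field names below are assumptions for illustration; check the actual log output for the real keys.

```python
import json

# Two hypothetical JSON log lines; real field names may differ.
raw = "\n".join([
    '{"level": "INFO", "message": "server started"}',
    '{"level": "ERROR", "message": "request failed"}',
])

# Parse each line and keep only the error records.
errors = [
    rec for rec in (json.loads(line) for line in raw.splitlines())
    if rec["level"] == "ERROR"
]
print(errors[0]["message"])
```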
To adjust logging:
```bash
uvx databricks-mcp-server@latest -- --log-level DEBUG
```
Register the server and inject credentials via the CLI:
```bash
codex mcp add databricks \
  --env DATABRICKS_HOST="https://your-workspace.databricks.com" \
  --env DATABRICKS_TOKEN="dapi_XXXXXXXXXXXXXXXX" \
  --env DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345" \
  -- uvx databricks-mcp-server@latest

# Add --refresh immediately after a publish to invalidate the uv cache
```
Or edit `~/.codex/config.toml`:
```toml
[mcp_servers.databricks]
command = "uvx"
args = ["databricks-mcp-server@latest"]
env = { DATABRICKS_HOST = "https://your-workspace.databricks.com", DATABRICKS_TOKEN = "dapi_XXXXXXXXXXXXXXXX", DATABRICKS_WAREHOUSE_ID = "sql_warehouse_12345" }
startup_timeout_sec = 15
tool_timeout_sec = 300
```
Planning an HTTP deployment? Codex also supports `url = "https://…"` plus `bearer_token_env_var = "DATABRICKS_TOKEN"` or `codex mcp login` (with `experimental_use_rmcp_client = true`).
```json
{
  "mcpServers": {
    "databricks-mcp-local": {
      "command": "uvx",
      "args": ["databricks-mcp-server@latest"],
      "env": {
        "DATABRICKS_HOST": "https://your-workspace.databricks.com",
        "DATABRICKS_TOKEN": "dapiXXXXXXXXXXXXXXXX",
        "DATABRICKS_WAREHOUSE_ID": "sql_warehouse_12345",
        "RUNNING_VIA_CURSOR_MCP": "true"
      }
    }
  }
}
```
Restart Cursor after saving and invoke tools as `databricks-mcp-local:<tool>`.
```bash
claude mcp add databricks-mcp-local -s user \
  -e DATABRICKS_HOST="https://your-workspace.databricks.com" \
  -e DATABRICKS_TOKEN="dapiXXXXXXXXXXXXXXXX" \
  -e DATABRICKS_WAREHOUSE_ID="sql_warehouse_12345" \
  -- uvx databricks-mcp-server@latest
```
`structuredContent` carries machine-readable payloads. Large artifacts are returned as `resource_link` content blocks using URIs like `resource://databricks/exports/{id}` and can be fetched via the MCP resources API.
```python
result = await session.call_tool("list_clusters", {})
summary = next(
    (block.text for block in result.content if getattr(block, "type", "") == "text"),
    "",
)
clusters = (result.structuredContent or {}).get("clusters", [])
resource_links = [
    block for block in result.content
    if isinstance(block, dict) and block.get("type") == "resource_link"
]
```
Progress notifications follow MCP’s progress token mechanism; Codex surfaces these messages in the UI while a tool runs.
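On the client side, a display handler for those progress values might look like the sketch below. The function name and signature are ours for illustration, not an SDK interface; it only formats the `progress`/`total` pair a notification carries.

```python
def describe_progress(progress, total=None, message=None):
    """Render a progress notification for display (client-side sketch)."""
    if total:
        # With a known total we can show a percentage.
        pct = 100.0 * progress / total
        label = f"{pct:.0f}%"
    else:
        # Without a total, fall back to raw progress units.
        label = f"{progress} units"
    return f"[databricks-mcp] {message or 'working'}: {label}"

print(describe_progress(5, total=10, message="exporting notebook"))
```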
```python
result = await session.call_tool(
    "execute_sql", {"statement": "SELECT * FROM samples LIMIT 10"}
)
print(result.content[0].text)
rows = (result.structuredContent or {}).get("result", [])
```
```python
result = await session.call_tool("get_workspace_file_content", {
    "path": "/Users/[email protected]/report.ipynb",
    "format": "SOURCE",
})
resource_link = next(
    (block for block in result.content
     if isinstance(block, dict) and block.get("type") == "resource_link"),
    None,
)
if resource_link:
    contents = await session.read_resource(resource_link["uri"])
```
| Category | Tool | Description |
|---|---|---|
| Clusters | list_clusters, create_cluster, terminate_cluster, get_cluster, start_cluster, resize_cluster, restart_cluster | Manage interactive clusters |
| Jobs | list_jobs, create_job, delete_job, run_job, run_notebook, sync_repo_and_run_notebook, get_run_status, list_job_runs, cancel_run | Manage scheduled and ad-hoc jobs |
| Workspace | list_notebooks, export_notebook, import_notebook, delete_workspace_object, get_workspace_file_content, get_workspace_file_info | Inspect and manage workspace assets |
| DBFS | list_files, dbfs_put, dbfs_delete | Explore DBFS and manage files |
| SQL | execute_sql | Submit SQL statements with optional warehouse_id, catalog, schema_name |
| Libraries | install_library, uninstall_library, list_cluster_libraries | Manage cluster libraries |
| Repos | create_repo, update_repo, list_repos, pull_repo | Manage Databricks repos |
| Unity Catalog | list_catalogs, create_catalog, list_schemas, create_schema, list_tables, create_table, get_table_lineage | Unity Catalog operations |
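Since `execute_sql` takes optional `warehouse_id`, `catalog`, and `schema_name` parameters, a caller can assemble its argument payload like the sketch below. The helper name is ours; it simply drops unset optionals so only provided keys reach the tool.

```python
def build_execute_sql_args(statement, warehouse_id=None, catalog=None, schema_name=None):
    """Assemble an execute_sql argument dict, omitting unset optionals."""
    args = {"statement": statement}
    optional = {
        "warehouse_id": warehouse_id,
        "catalog": catalog,
        "schema_name": schema_name,
    }
    # Only include optional parameters that were actually supplied.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args

args = build_execute_sql_args("SELECT 1", warehouse_id="sql_warehouse_12345")
print(sorted(args))
```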
```bash
uv run black databricks_mcp tests
uv run pylint databricks_mcp tests
uv run pytest
uv build
uv publish --token "$PYPI_TOKEN"
```
```bash
uv run pytest
```
Pytest suites mock Databricks APIs, providing deterministic structured outputs and transcript tests.
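The mocking pattern can be sketched as follows. The handler and the injected API call are hypothetical stand-ins, not the package's real internals; the point is that an `AsyncMock` replaces the Databricks API so the test is deterministic and offline.

```python
import asyncio
from unittest.mock import AsyncMock

async def list_clusters(api_call):
    # Stand-in for a tool handler that delegates to an injected API call.
    payload = await api_call("GET", "/api/2.0/clusters/list")
    return {"clusters": payload.get("clusters", [])}

def test_list_clusters_returns_structured_payload():
    # AsyncMock stands in for the real Databricks HTTP client.
    fake_api = AsyncMock(return_value={"clusters": [{"cluster_id": "c1"}]})
    result = asyncio.run(list_clusters(fake_api))
    assert result["clusters"][0]["cluster_id"] == "c1"
    fake_api.assert_awaited_once()

test_list_clusters_returns_structured_payload()
print("ok")
```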
Ensure `PYPI_TOKEN` is available (via `.env` or the environment) before publishing:
```bash
uv build
uv publish --token "$PYPI_TOKEN"
```
Released under the MIT License. See LICENSE.