ClickZetta

Rating

Information

# Clickzetta MCP Server [![smithery badge](https://smithery.ai/badge/mcp_clickzetta_server)](https://smithery.ai/server/mcp_clickzetta_server) [![PyPI - Version](https://img.shields.io/pypi/dm/mcp-clickzetta-server?color&logo=pypi&logoColor=white&label=PyPI%20downloads)](https://pypi.org/project/mcp-clickzetta-server/) ## Overview A Model Context Protocol (MCP) server implementation that provides database interaction with [Clickzetta Lakehouse](https://www.yunqi.tech/documents). This server enables running SQL queries with tools and intereacting with a memo of data insights presented as a resource. ![image.gif](image/introduction.gif) ## Quick Start with MCP-ClickZetta-Server/Trae as your AI Data Engineer ### Download and install Trae [Download from trae.ai](https://www.trae.ai) and sign in to enable use AI. ### Get your ClickZetta Account [Get your ClickZetta Account](https://accounts.clickzetta.com/register) ### Pull MCP-ClickZetta-Server Docker Image \`\`\`SHELL docker pull czqiliang/mcp-clickzetta-server \`\`\` ### Add MCP server in Trae - In the AI chat window, click the Settings icon > MCP. - The MCP window will appear. - Click the + Add button. - You will enter the MCP Server Marketplace. - Click Manual Configuration. The Manual Configuration window will appear. Add a brand-new MCP Server by pasting the following JSON configuration into the input box, then click the Confirm button. The MCP Server will be added to the MCP list. \`\`\`JSON \{ "mcpServers": \{ "clickzetta-mcp-server": \{ "command": "docker", "args": [ "run", "-i", "--rm", "-e", "LOG_LEVEL=INFO", "-e", "CLICKZETTA_SERVICE", "-e", "CLICKZETTA_INSTANCE", "-e", "CLICKZETTA_WORKSPACE", "-e", "CLICKZETTA_SCHEMA", "-e", "CLICKZETTA_USERNAME", "-e", "CLICKZETTA_PASSWORD", "-e", "CLICKZETTA_VCLUSTER", "-e", "XINFERENCE_BASE_URL", "-e", "XINFERENCE_EMBEDDING_MODEL_512", "-e", "Similar_table_name", "-e", "Similar_embedding_column_name", "-e", "Similar_content_column_name", "-e", "Similar_partition_scope", "czqiliang/mcp-clickzetta-server:latest" ], "env": \{ "CLICKZETTA_SERVICE": "api.clickzetta.com", "CLICKZETTA_INSTANCE": "your clickzetta instance", "CLICKZETTA_WORKSPACE": "your clickzetta workspace" , "CLICKZETTA_SCHEMA": "your clickzetta schema", "CLICKZETTA_USERNAME": "your clickzetta usename", "CLICKZETTA_PASSWORD": "your clickzetta password", "CLICKZETTA_VCLUSTER": "your clickzetta vcluster", "Similar_table_name": "clickzegithub_event_issuesevent_embedding.github_event_issuesevent_embedding_512tta_table", "Similar_embedding_column_name": "issue_body_embedding", "Similar_content_column_name": "issue_body", "Similar_partition_scope": "partition_date >= '2024-01-01' and partition_date <= '2024-01-15'" \} \} \} \} \`\`\` - CLICKZETTA开头的env参数为必填 ![image.gif](image/trae_show.gif) ## Quick Start with MCP-ClickZetta-Server/Zettapark-MCP-Server/Claude Desktop as your AI Data Engineer ### Download and install Claude Desktop [Download from claude.ai](https://claude.ai/download) and sign in. ### Get your ClickZetta Account [Get your ClickZetta Account](https://accounts.clickzetta.com/register) ### Start your Jupyter lab #### Start by Docker Compose - create folder \`\`\`bash mkdir notebooks cd notebooks \`\`\` - Create config.json file and set your login information as below: \`\`\`json \{ "username": "your clickzetta lakehouse user name", "password": "your clickzetta lakehouse password", "service": "api.clickzetta.com", "instance": "your clickzetta lakehouse instance name", "workspace": "your clickzetta lakehouse workspac name", "schema": "your clickzetta lakehouse schema", "vcluster": "your clickzetta lakehouse vcluster name", "sdk_job_timeout": 60, "hints": \{ "sdk.job.timeout": 60, "query_tag": "test_zettapark_vector_ns227206", "cz.storage.parquet.vector.index.read.memory.cache": "true", "cz.storage.parquet.vector.index.read.local.cache": "false", "cz.sql.table.scan.push.down.filter": "true", "cz.sql.table.scan.enable.ensure.filter": "true", "cz.storage.always.prefetch.internal": "true", "cz.optimizer.generate.columns.always.valid": "true", "cz.sql.index.prewhere.enabled": "true", "cz.storage.parquet.enable.io.prefetch": "false" \} \} \`\`\` Notice: In this way, please set config.json file in notebooks folder. You should chage query_tag for your purpose and it is using to find query by tag. - Download docker-compose.yml and start docker compose Download docker compose file from [here](https://github.com/yunqiqiliang/jupyter-mcp-server/blob/main/jupyter_lab_server/docker-compose.yml) \`\`\`bash docker compose up -d \`\`\` Go to Docker desktop and check the docker compose is started. - Access Jupyter Lab Server http://localhost:8888/ Token: YOUR_SECURE_TOKEN Then create a new notebook, change name from Untitled.ipynb to notebook.ipynb . #### Or Start Local Server \`\`\`SHELL # Create a clean environment (Python 3.10 worked during debugging) conda create -n jupyter_mcp_env python=3.10 -y # Activate the environment conda activate jupyter_mcp_env # Use 'python -m pip' to ensure correct pip in the activated env python -m pip install jupyterlab ipykernel # Install the required v2.0.1 python -m pip install "jupyter_collaboration==2.0.1" # Uninstall potentially conflicting versions python -m pip uninstall -y pycrdt datalayer_pycrdt # Install the required version python -m pip install datalayer_pycrdt jupyter server extension enable jupyter_collaboration --py --sys-prefix # Start JupyterLab, please keep token as YOUR_SECURE_TOKEN jupyter lab --port 8888 --IdentityProvider.token YOUR_SECURE_TOKEN --ip 0.0.0.0 \`\`\` ### Add MCP server in your Claude Desktop - In Claude Desktop, go to Settings → Developer → Edit Config - Open claude_desktop_config.json and config MCP servers \`\`\`JSON \{ "mcpServers": \{ "jupyter": \{ "command": "docker", "args": [ "run", "-i", "--rm", "-e", "SERVER_URL", "-e", "TOKEN", "-e", "NOTEBOOK_PATH", "-e", "LOG_LEVEL=INFO", "czqiliang/jupyter-mcp-server:latest" ], "env": \{ "SERVER_URL": "http://host.docker.internal:8888", "TOKEN": "YOUR_SECURE_TOKEN", "NOTEBOOK_PATH": "notebook.ipynb" \} \}, "clickzetta-mcp-server": \{ "command": "docker", "args": [ "run", "-i", "--rm", "-e", "LOG_LEVEL=INFO", "-e", "CLICKZETTA_SERVICE", "-e", "CLICKZETTA_INSTANCE", "-e", "CLICKZETTA_WORKSPACE", "-e", "CLICKZETTA_SCHEMA", "-e", "CLICKZETTA_USERNAME", "-e", "CLICKZETTA_PASSWORD", "-e", "CLICKZETTA_VCLUSTER", "-e", "Similar_table_name", "-e", "Similar_embedding_column_name", "-e", "Similar_content_column_name", "-e", "Similar_partition_scope", "czqiliang/mcp-clickzetta-server:latest" ], "env": \{ "CLICKZETTA_SERVICE": "api.clickzetta.com", "CLICKZETTA_INSTANCE": "your clickzetta instance", "CLICKZETTA_WORKSPACE": "your clickzetta workspace" , "CLICKZETTA_SCHEMA": "your clickzetta schema", "CLICKZETTA_USERNAME": "your clickzetta usename", "CLICKZETTA_PASSWORD": "your clickzetta password", "CLICKZETTA_VCLUSTER": "your clickzetta vcluster", "Similar_table_name": "clickzegithub_event_issuesevent_embedding.github_event_issuesevent_embedding_512tta_table", "Similar_embedding_column_name": "issue_body_embedding", "Similar_content_column_name": "issue_body", "Similar_partition_scope": "partition_date >= '2024-01-01' and partition_date <= '2024-01-15'" \} \} \} \} \`\`\` You could get more detail information about Zettapark MCP Server from [here](https://github.com/yunqiqiliang/jupyter-mcp-server). ## Components ### Resources The server exposes a single dynamic resource: - \`memo://insights\`: A continuously updated data insights memo that aggregates discovered insights during analysis - Auto-updates as new insights are discovered via the append-insight tool ### Tools The server offers the following core tools: #### Query Tools ##### \`read_query\` - **Description**: Execute \`SELECT\` queries to read data from the database. - **Input**: - \`query\` (string): The \`SELECT\` SQL query to execute. - **Returns**: Query results as an array of objects. ##### \`write_query\` (requires \`--allow-write\` flag) - **Description**: Execute \`INSERT\`, \`UPDATE\`, or \`DELETE\` queries to modify data. - **Input**: - \`query\` (string): The SQL modification query. - **Returns**: \`\{ affected_rows: number \}\`, indicating the number of affected rows. ##### \`create_table\` (requires \`--allow-write\` flag) - **Description**: Create new tables in the database. - **Input**: - \`query\` (string): \`CREATE TABLE\` SQL statement. - **Returns**: Confirmation of table creation. ##### \`create_table_with_prompt\` (requires \`--allow-write\` flag) - **Description**: Create a new table by prompting the user for table name, columns, and their types. - **Input**: - \`table_name\` (string): The name of the table to create. - \`columns\` (string): The columns and their types in the format \`column1:type1,column2:type2\`. - **Returns**: Confirmation of table creation. #### Schema Tools ##### \`list_tables\` - **Description**: Get a list of all tables in the database. - **Input**: No input required. - **Returns**: An array of table names. ##### \`describe_table\` - **Description**: View column information for a specific table. - **Input**: - \`table_name\` (string): Name of the table to describe (can be fully qualified). - **Returns**: An array of column definitions with names and types. ##### \`show_object_list\` - **Description**: Get the list of specific object types in the current workspace, such as catalogs, schemas, tables, etc. - **Input**: - \`object_type\` (string): The type of the object to show. - **Returns**: A list of objects. ##### \`desc_object\` - **Description**: Get detailed information about a specific object, such as a catalog, schema, or table. - **Input**: - \`object_type\` (string): The type of the object. - \`object_name\` (string): The name of the object. - **Returns**: Detailed information about the object. #### Analysis Tools ##### \`append_insight\` - **Description**: Add new data insights to the memo resource. - **Input**: - \`insight\` (string): Data insight discovered from analysis. - **Returns**: Confirmation of insight addition. - **Triggers**: Updates the \`memo://insights\` resource. #### Data Import Tools ##### \`import_data_into_table_from_url\` - **Description**: Import data into a table from a URL (including file paths or HTTP/HTTPS URLs). If the destination table does not exist, it will be created automatically. - **Input**: - \`from_url\` (string): The data source URL. - \`dest_table\` (string): The table to import data into. - **Returns**: Confirmation of successful data import. ##### \`import_data_into_table_from_database\` - **Description**: Connect to a database, execute a query, and import the results into a Clickzetta table. Supports MySQL, PostgreSQL, SQLite, and other common database types. - **Input**: - \`db_type\` (string): The type of the database (e.g., \`mysql\`, \`postgresql\`, \`sqlite\`). - \`host\` (string): The hostname or IP address of the database server (not required for SQLite). - \`port\` (integer): The port number of the database server (not required for SQLite). - \`database\` (string): The name of the database to connect to (for SQLite, this is the file path to the database file). - \`username\` (string): The username for authentication (not required for SQLite). - \`password\` (string): The password for authentication (not required for SQLite). - \`source_table\` (string): The source table name. - \`dest_table\` (string): The destination table name. - **Returns**: Confirmation of successful data import. #### Similar Search Tools ##### \`vector_search\` - **Description**: Perform vector search on a table using a question and return the top 5 closest answers. - **Input**: - \`table_name\` (string): The table name. - \`content_column_name\` (string): The column storing content. - \`embedding_column_name\` (string): The column storing embeddings. - \`partition_scope\` (string): SQL code to define the partition scope as part of the \`WHERE\` condition. - \`question\` (string): The question to search. - **Returns**: Search results. ##### \`match_all\` - **Description**: Perform a search using the "match all" function on a table with a question and return the top 5 answers. - **Input**: - \`table_name\` (string): The table name. - \`content_column_name\` (string): The column storing content. - \`partition_scope\` (string): SQL code to define the partition scope as part of the \`WHERE\` condition. - \`question\` (string): The question to search. - **Returns**: Search results. #### Knowledge Search Tools ##### \`get_knowledge_about_how_to_do_something\` - **Description**: Provide guidance on how to perform specific tasks, such as analyzing slow queries, creating tables, or managing storage connections. - **Input**: - \`to_do_something\` (string): The task to perform. Supported tasks include: - \`analyze_slow_query\` - \`analyze_table_with_small_file\` - \`create_table_syntax\` - \`how_to_create_vcluster\` - \`how_to_create_index\` - \`how_to_alter_table_and_column\` - \`how_to_create_storage_connection\` - \`how_to_create_external_volume\` - **Returns**: Detailed guidance on the specified task. #### Usage Notes - Ensure the \`--allow-write\` flag is enabled when using tools that modify data (e.g., \`write_query\`, \`create_table\`). - Provide the correct input parameters for each tool as described above. ## Usage with Claude Desktop ### Installing as local MCP Server(This way has been tested and verified on MacOS) #### Clone this repository: \`\`\`bash git clone https://github.com/yunqiqiliang/mcp-clickzetta-server.git cd mcp-clickzetta-server \`\`\` #### Install the package: \`\`\`bash uv pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple/ \`\`\` #### Config credentials Create a .env file based on .env.example with your Clickzetta Lakehouse credentials: \`\`\` CLICKZETTA_USERNAME = "" CLICKZETTA_PASSWORD = "" CLICKZETTA_SERVICE = "api.clickzetta.com" CLICKZETTA_INSTANCE = "" CLICKZETTA_WORKSPACE = "" CLICKZETTA_SCHEMA = "" CLICKZETTA_VCLUSTER = "" Similar_table_name = "github_event_issuesevent_embedding.github_event_issuesevent_embedding_512" Similar_embedding_column_name = "issue_body_embedding" Similar_content_column_name = "issue_body" Similar_partition_scope = "partition_date >= '2024-01-01' and partition_date <= '2024-01-15'" \`\`\` ##### Usage ##### Running with uv After installing the package, you can run the server directly with: \`\`\`bash uv run mcp_clickzetta_server \`\`\` If this is the first time you are running the server, you could run the following command to acclerate the package installation: \`\`\`bash UV_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple/ uv run mcp_clickzetta_server \`\`\` This will start the stdio-based MCP server, which can be connected to Claude Desktop or any MCP client that supports stdio communication. You should see output similar to: \`\`\`bash uv run mcp_clickzetta_server --no-prefetch 2025-03-25 10:11:20,799 - mcp_clickzetta_server - INFO - Starting Clickzetta MCP Server 2025-03-25 10:11:20,799 - mcp_clickzetta_server - INFO - Allow write operations: False 2025-03-25 10:11:20,799 - mcp_clickzetta_server - INFO - Prefetch table descriptions: True 2025-03-25 10:11:20,799 - mcp_clickzetta_server - INFO - Excluded tools: [] 2025-03-25 10:11:20,799 - mcp_clickzetta_server - INFO - Prefetching table descriptions 2025-03-25 10:11:21,726 - clickzetta.zettapark.session - INFO - Zettapark Session information: "version" : 0.1.3, "python.version" : 3.12.2, "python.connector.version" : 0.8.89.0, "python.connector.session.id" : dd46bd27-920d-4760-94a6-6f994d31e63e, "os.name" : Darwin 2025-03-25 10:11:21,728 - clickzetta.connector.v0.client - INFO - clickzetta connector submitting job, id:2025032510112172821098301 2025-03-25 10:11:23,059 - clickzetta.connector.v0.client - INFO - clickzetta connector submitting job, id:2025032510112305897947697 2025-03-25 10:11:23,728 - mcp_clickzetta_server - INFO - Allowed tools: ['read_query', 'append_insight'] 2025-03-25 10:11:23,732 - mcp_clickzetta_server - INFO - Server running with stdio transport \`\`\` ##### Claude Desktop Integration ###### command:docker The MCP server (running in Docker) reads its configuration from environment variables passed via the MCP client configuration (e.g., \`claude_desktop_config.json\`). Key variables: \`\`\`json \{ "clickzetta-mcp-server": \{ "command": "docker", "args": [ "run", "-i", "--rm", "-e", "LOG_LEVEL=INFO", "-e", "CLICKZETTA_SERVICE", "-e", "CLICKZETTA_INSTANCE", "-e", "CLICKZETTA_WORKSPACE", "-e", "CLICKZETTA_SCHEMA", "-e", "CLICKZETTA_USERNAME", "-e", "CLICKZETTA_PASSWORD", "-e", "CLICKZETTA_VCLUSTER", "-e", "Similar_table_name", "-e", "Similar_embedding_column_name", "-e", "Similar_content_column_name", "-e", "Similar_partition_scope", "czqiliang/mcp-clickzetta-server:latest" ], "env": \{ "CLICKZETTA_SERVICE": "api.clickzetta.com", "CLICKZETTA_INSTANCE": "your clickzetta instance", "CLICKZETTA_WORKSPACE": "your clickzetta workspace" , "CLICKZETTA_SCHEMA": "your clickzetta schema", "CLICKZETTA_USERNAME": "your clickzetta usename", "CLICKZETTA_PASSWORD": "your clickzetta password", "CLICKZETTA_VCLUSTER": "your clickzetta vcluster", "Similar_table_name": "clickzegithub_event_issuesevent_embedding.github_event_issuesevent_embedding_512tta_table", "Similar_embedding_column_name": "issue_body_embedding", "Similar_content_column_name": "issue_body", "Similar_partition_scope": "partition_date >= '2024-01-01' and partition_date <= '2024-01-15'" \} \} \} \`\`\` ###### command:uv - In Claude Desktop, go to Settings → MCP Servers - Add a new server with the full path to your uv executable: \`\`\`json \{ "mcpServers": \{ "clickzetta-mcp-server" : \{ "command": "/Users/******/anaconda3/bin/uv", "args": [ "--directory", "/Users/******/Documents/GitHub/mcp-clickzetta-server", "run", "mcp_clickzetta_server" ] \} \} \} \`\`\` - You can find your uv path by running which uv in your terminal - Save the server configuration ![image.png](/image/claude_add_clickzetta_mcp_server.png) ##### Example Queries When using with Claude, you can ask questions like: - "Can you list all the schemas in my Clickzetta account?" - "List all views in the PUBLIC schema" - "Describe the structure of the CUSTOMER_ANALYTICS view in the SALES schema" - "Show me sample data from the REVENUE_BY_REGION view in the FINANCE schema" - "Run this SQL query: SELECT customer_id, SUM(order_total) as total_spend FROM SALES.ORDERS GROUP BY customer_id ORDER BY total_spend DESC LIMIT 10" - "Query the MARKETING database to find the top 5 performing campaigns by conversion rate" - "帮我从Clickzetta中读取数据，分析下在public这个schema下github_users表里每个公司的用户数。请用中文返回结果，并对结果进行数据可视化展现" - "帮我从Clickzetta中读取数据，分析下在public这个schema下github_event_issuesevent表里有多少条记录？" ##### Example Result - '帮我从Clickzetta数据源中读取数据，先分析基于public这个schema下github_users表里的数据可以做哪些分析？包括指标、统计、趋势、以及各种经典的用户分析模型比如用户价值分析、用户生命周期分析、用户segment等，然后根据这些分析内容生成一个分析报告的dashboard' ![image.gif](./image/result3_image.gif) - The result of "帮我从Clickzetta中读取数据，分析下在public这个schema下github_users表里每个公司的用户数。请用中文返回结果，并对结果进行数据可视化展现": ![image.png](./image/result1_image.png) - The result of "帮我从Clickzetta中读取数据，分析下在public这个schema下github_users表里每个位置的用户数。请用中文返回结果，并对结果进行数据可视化展现": ![image.png](./image/result2_image.png) #### Security Considerations This server: - Enforces read-only operations (only SELECT statements are allowed) - Automatically adds LIMIT clauses to prevent large result sets - Uses service account authentication for secure connections - Validates inputs to prevent SQL injection - ⚠️ Important: Keep your .env file secure and never commit it to version control. The .gitignore file is configured to exclude it. ### Installing via Smithery(This way is tobe tested and verified) To install Clickzetta Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@yunqiqiliang/mcp-clickzetta-server): \`\`\`bash npx -y @smithery/cli@latest install @yunqiqiliang/mcp-clickzetta-server --client claude --key ****** \`\`\` ### Installing via UVX(This way is tobe tested and verified) \`\`\`python # Add the server to your claude_desktop_config.json "mcpServers": \{ "clickzetta_pip": \{ "command": "uvx", "args": [ "mcp_clickzetta_server", "--service", "the_service", "--instance", "the_instance", "--vcluster", "the_vcluster", "--workspace", "the_workspace", "--schema", "the_schema", "--user", "the_user", "--password", "their_password", # Optionally: "--allow_write" (but not recommended) # Optionally: "--log_dir", "/absolute/path/to/logs" # Optionally: "--log_level", "DEBUG"/"INFO"/"WARNING"/"ERROR"/"CRITICAL" # Optionally: "--exclude_tools", "\{tool name\}", ["\{other tool name\}"] ] \} \} \`\`\`

Prompts

Reviews

Write Your Review

Detailed Ratings

ALL

Correctness

Helpfulness

Interesting

Upload Pictures and Videos

Name

Size

Type

Download

Last Modified

Upload Files

Community

Add Discussion

Upload Pictures and Videos

Chatbot close

Bot
Hi there
How can I help you today?

Send