
How to build your own MCP server with FastMCP

Evin Callahan
24 Sep 2025 - 04 Mins read
FastMCP is a Python library that makes it easy to connect your existing code or backend services to an LLM (Large Language Model, such as the models behind ChatGPT).
In this guide we'll walk through the initial setup and demonstrate some of the powerful use cases we have found to be helpful.
Setup
Install FastMCP using pip or your preferred package manager:
pip install fastmcp
# or
uv add fastmcp
# or
poetry add fastmcp
Creating a simple MCP
An MCP server exposes tools (functions) that LLMs can call. The @mcp.tool decorator registers a function as an available tool, and the LLM receives the function signature and description to determine when to use it.
# my_mcp.py
from fastmcp import FastMCP, Context
from enum import Enum
# Initialize the MCP server with a descriptive name
mcp = FastMCP("My Awesome MCP Server")
# Create HTTP app for serving the MCP server
http_app = mcp.http_app()
# Optional: organize tools with tags for better discoverability
class Tags(str, Enum):
    WEATHER = "weather"
    DATABASE = "database"
    SYSTEM = "system"
# Register a tool that the LLM can call
@mcp.tool(
    description="Get the current weather for a given city",
    tags={Tags.WEATHER.value},
)
async def get_weather(city: str) -> str:
    """
    Args:
        city: Name of the city to get weather for
    """
    # Your implementation here
    return f"Weather in {city}: Sunny, 72°F"
# Another example tool
@mcp.tool(description="Calculate the sum of two numbers")
async def add_numbers(a: int, b: int) -> int:
    return a + b
The code above sets up the MCP server and exposes an HTTP app (http_app) that can be served by any ASGI server.
It also registers two tools that the LLM can call whenever it decides they are useful:
- get_weather
- add_numbers
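To make the registration idea more concrete, here is a rough sketch in plain Python of what a decorator-based tool registry could look like. This is not FastMCP's actual implementation; the names TOOLS, tool, describe_tools, and call_tool are hypothetical and exist only for illustration.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable

# Hypothetical registry: tool name -> (description, function).
TOOLS: dict[str, tuple[str, Callable[..., Awaitable[Any]]]] = {}


def tool(description: str):
    """Hypothetical stand-in for a registering decorator such as @mcp.tool."""
    def register(fn):
        TOOLS[fn.__name__] = (description, fn)
        return fn  # the function itself is returned unchanged
    return register


@tool(description="Calculate the sum of two numbers")
async def add_numbers(a: int, b: int) -> int:
    return a + b


def describe_tools() -> list[dict[str, str]]:
    """Build a listing (name, description, signature) that a client could hand to the LLM."""
    return [
        {"name": name, "description": desc, "signature": str(inspect.signature(fn))}
        for name, (desc, fn) in TOOLS.items()
    ]


async def call_tool(name: str, **arguments: Any) -> Any:
    """Dispatch a call by name, as a server might after the LLM picks a tool."""
    _, fn = TOOLS[name]
    return await fn(**arguments)


if __name__ == "__main__":
    print(describe_tools())
    print(asyncio.run(call_tool("add_numbers", a=2, b=3)))
```

The point of the sketch is the split of responsibilities: the decorator only records metadata, the listing is what the model sees when deciding which tool to use, and the dispatch step runs the function with the arguments the model supplied.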
Testing the new MCP functions
FastMCP includes a built-in development server with a web inspector:
# Start the dev server
uv run fastmcp dev my_mcp.py
# Or with standard Python
python -m fastmcp dev my_mcp.py
This opens the "MCP Inspector" in your browser, an interactive tool where you can:
- Connect via Streamable HTTP to your new MCP server
- View all registered tools and their schemas
- Test tools individually with custom inputs
- See real-time request/response logs
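To connect from an LLM client, point it at the server's HTTP endpoint, for example http://localhost:8000/ (the exact path depends on the transport and FastMCP version; recent versions typically serve Streamable HTTP at /mcp and SSE at /sse).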
Running in production with Uvicorn
For production deployments, use Uvicorn to serve your MCP server:
# my_mcp.py
from fastmcp import FastMCP
mcp = FastMCP("My Awesome MCP Server")
# ... your tools here ...
http_app = mcp.http_app()
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(http_app, port=8000)
Run it directly:
python my_mcp.py
Or use Uvicorn's CLI for more control:
# Basic
uvicorn my_mcp:http_app --port 8000
# With auto-reload for development
uvicorn my_mcp:http_app --port 8000 --reload
# Production with multiple workers
uvicorn my_mcp:http_app --host 0.0.0.0 --port 8000 --workers 4
Adding middleware
Middleware runs on every tool call, enabling cross-cutting concerns like logging, authentication, rate limiting, or analytics. It intercepts requests before they reach your tools and can modify context or responses.
Learn more: FastMCP Middleware Documentation
from fastmcp.server.middleware import Middleware, MiddlewareContext
from typing import Any
import httpx
import time
class EventLoggingMiddleware(Middleware):
    """Log every tool call to an external analytics service"""
    async def on_call_tool(self, context: MiddlewareContext, call_next: Any) -> Any:
        fast_ctx = context.fastmcp_context
        if fast_ctx is None:
            return await call_next(context)
        # Extract request metadata
        request = fast_ctx.request_context.request
        # For tool calls, the tool name lives on the incoming message
        tool_name = context.message.name
        start_time = time.time()
        # Get user info from headers (if available)
        user_id = request.headers.get("x-user-id", "anonymous") if request else "unknown"
        try:
            # Execute the tool
            result = await call_next(context)
            duration = time.time() - start_time
            # Log successful execution
            async with httpx.AsyncClient() as client:
                await client.post(
                    "https://analytics.example.com/events",
                    json={
                        "event": "tool_call",
                        "tool": tool_name,
                        "user_id": user_id,
                        "duration_ms": duration * 1000,
                        "status": "success"
                    },
                    timeout=5.0
                )
            return result
        except Exception as e:
            # Log failures for any downstream failure with exception context,
            # and continue to raise the exception
            duration = time.time() - start_time
            async with httpx.AsyncClient() as client:
                await client.post(
                    "https://analytics.example.com/events",
                    json={
                        "event": "tool_call",
                        "tool": tool_name,
                        "user_id": user_id,
                        "duration_ms": duration * 1000,
                        "status": "error",
                        "error": str(e)
                    },
                    timeout=5.0
                )
            raise
# Add middleware to your MCP server
mcp.add_middleware(EventLoggingMiddleware())
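If the call_next pattern is new to you, the following plain-Python sketch shows one way such a chain can be wired together. It is a conceptual model only, not how FastMCP builds its middleware stack internally; RecordingMiddleware, build_chain, run_tool, and the dict-based context are all hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable

Handler = Callable[[dict], Awaitable[Any]]


class RecordingMiddleware:
    """Hypothetical middleware: notes the call before and after delegating."""

    def __init__(self, log: list[str]) -> None:
        self.log = log

    async def on_call_tool(self, context: dict, call_next: Handler) -> Any:
        self.log.append(f"before {context['tool']}")
        try:
            # Hand control to the next middleware, or to the tool itself
            return await call_next(context)
        finally:
            # Runs whether the tool succeeded or raised
            self.log.append(f"after {context['tool']}")


def build_chain(middlewares: list, handler: Handler) -> Handler:
    """Wrap handler so the first middleware in the list runs outermost."""
    chain = handler
    for mw in reversed(middlewares):
        # Default arguments bind the current mw and chain for this iteration
        def wrapped(context: dict, mw=mw, nxt=chain):
            return mw.on_call_tool(context, nxt)
        chain = wrapped
    return chain


async def run_tool(context: dict) -> str:
    """Stand-in for the real tool invocation."""
    return f"ran {context['tool']}"


if __name__ == "__main__":
    log: list[str] = []
    chain = build_chain([RecordingMiddleware(log)], run_tool)
    print(asyncio.run(chain({"tool": "get_weather"})))
    print(log)
```

Each middleware decides when (or whether) to call call_next, which is what makes it possible to time a call, short-circuit it, or react to an exception on the way back out.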
Using this in practice
Many LLM clients and providers now support MCP servers:
- Claude Desktop: Native MCP support via configuration file (if you have a Pro plan)
- LibreChat: Full MCP integration with UI for managing servers
- Cline (VSCode): Built-in MCP client for development workflows
- OpenAI ChatGPT: Via MCP proxy adapters in the OpenAI playground
For detailed setup instructions with specific providers, see our guide: Using LibreChat to host MCP chatbots
The key is exposing your FastMCP server via HTTP/SSE and configuring your LLM client to connect to it.
Helpful advice
Use the inspector during development
- The built-in inspector (fastmcp dev) is invaluable for testing tools before connecting to an LLM
- Verify tool schemas are correct and descriptions are clear enough for the LLM to understand when to use them
Keep tool names and descriptions precise
- LLMs rely on names and descriptions to choose which tool to call
- Be specific about parameters, return types, and use cases
- FastMCP passes the typed arguments of your function to the LLM as part of the tool definition
- It also handles the types of incoming arguments, provided you declare them with type hints in the function signature
- Bad: "Gets data" | Good: "Retrieves user profile data including name, email, and account status"
Use type hints
- FastMCP automatically generates schemas from Python type hints
- Proper typing improves LLM tool selection accuracy
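As a rough illustration of how type hints can turn into a schema, here is a simplified sketch using only the standard library. FastMCP's real schema generation is more capable than this; JSON_TYPES and schema_from_signature are hypothetical names, and the sketch only covers a handful of primitive types.

```python
import inspect
from typing import Any, get_type_hints

# Assumed mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(fn) -> dict[str, Any]:
    """Derive a JSON-Schema-like description of fn's parameters."""
    hints = get_type_hints(fn)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        json_type = JSON_TYPES.get(hints.get(name))
        # Unannotated or unsupported types are left unconstrained
        properties[name] = {"type": json_type} if json_type else {}
        # Parameters without a default must be supplied by the caller
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


async def add_numbers(a: int, b: int = 0) -> int:
    return a + b


if __name__ == "__main__":
    print(schema_from_signature(add_numbers))
```

An untyped parameter ends up with an empty schema entry in this sketch, which gives the model nothing to go on. That is the practical reason to annotate every tool argument.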
Handle errors gracefully
- Return meaningful error messages that the LLM can relay to users
- If the flow should change after a given error, tell the LLM what went wrong, why it happened, and what to do next, so it can act on that information
- Use middleware for consistent error handling across all tools
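Here is a small sketch of what a descriptive error might look like in practice. The tool_error helper, the get_user_profile function, and the PROFILES data are hypothetical; in a real server the function would be registered as a tool and backed by an actual data source.

```python
import asyncio

# Hypothetical in-memory data standing in for a real backend.
PROFILES = {"42": {"name": "Ada", "email": "ada@example.com"}}


def tool_error(what: str, why: str, next_step: str) -> str:
    """Format an error so the model can explain it and decide what to do next."""
    return f"ERROR: {what}. Cause: {why}. Suggested next step: {next_step}."


async def get_user_profile(user_id: str) -> str:
    """Retrieve a user profile, or a descriptive error message."""
    if not user_id.isdigit():
        return tool_error(
            "Invalid user_id",
            "user_id must contain only digits",
            "ask the user for their numeric account ID",
        )
    profile = PROFILES.get(user_id)
    if profile is None:
        return tool_error(
            f"No profile found for user_id {user_id}",
            "the ID does not match any account",
            "confirm the ID with the user before retrying",
        )
    return f"{profile['name']} <{profile['email']}>"


if __name__ == "__main__":
    print(asyncio.run(get_user_profile("abc")))
    print(asyncio.run(get_user_profile("42")))
```

Returning the message as a normal result, rather than raising, is a design choice in this sketch: it guarantees the model sees the full explanation. Raising an exception and formatting it in middleware is an equally reasonable alternative.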
Bonus: VSCode Debug Configuration
For debugging your MCP server in VSCode, add this configuration to your .vscode/launch.json:
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Run FastMCP via uvicorn",
      "module": "uvicorn",
      "args": ["my_mcp:http_app", "--port", "8000", "--log-level", "debug"],
      "cwd": "${workspaceFolder}",
      "request": "launch",
      "type": "debugpy"
    }
  ]
}
This allows you to:
- Set breakpoints in your tool functions
- Step through middleware execution
- Inspect context and request data in real-time
- View detailed logs

