MCP Server Platform

Extensible Model Context Protocol server with dynamic tool registry and Azure integration

Oct 2024 – Dec 2024 · Grid Dynamics
Figure: MCP Server architecture, showing the CSV-driven tool registry and Azure Blob Storage integration

Overview

The MCP (Model Context Protocol) Server Platform is a production service that exposes AI tools through a single REST API. Built to integrate with Azure AI services, it lets large language models and AI agents discover, execute, and manage tools through one interface.

The Problem

Organizations using Azure AI services needed a standardized way to:

  • Expose custom AI tools to language models without redeploying services
  • Handle large file uploads and downloads efficiently without memory bottlenecks
  • Provide async execution for long-running AI operations
  • Maintain a centralized registry of available tools with version control
  • Deploy and scale on Kubernetes with minimal operational overhead

Architecture & Design

Core Components

  • FastAPI REST API: High-performance async API server with automatic OpenAPI documentation and request validation using Pydantic models
  • CSV-Driven Tool Registry: Dynamic tool discovery system allowing administrators to add new tools by updating a CSV configuration file without code changes
  • Runtime Module Loading: Python importlib-based system that dynamically loads tool implementations from specified modules at startup
  • Claim Check Pattern: Efficient handling of large files by storing them in Azure Blob Storage and passing only references through the API
  • Azure Blob Storage Integration: Secure, scalable storage for tool inputs and outputs with automatic cleanup policies
  • Azure Key Vault: Centralized secret management for API keys, connection strings, and service credentials

Tool Registry System

The CSV-driven tool registry is a key innovation that enables non-developers to add new capabilities:

tool_name,module_path,description,version,enabled
image_classifier,tools.vision.classifier,Classify images using CLIP,1.0,true
text_summarizer,tools.nlp.summarizer,Summarize long documents,1.2,true
sentiment_analyzer,tools.nlp.sentiment,Analyze text sentiment,1.0,true

The server reads this CSV at startup, validates each entry, and dynamically imports the specified Python modules. Each tool must implement a standard interface ensuring consistent behavior.
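
The startup sequence described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual loader: the `ToolEntry` type, the `load_registry` function, and the assumption that every tool module exposes a callable named `run` are all hypothetical stand-ins for the "standard interface" mentioned above.

```python
import csv
import importlib
import io
from dataclasses import dataclass
from typing import Callable, Dict

REQUIRED_COLUMNS = {"tool_name", "module_path", "description", "version", "enabled"}


@dataclass(frozen=True)
class ToolEntry:
    name: str
    description: str
    version: str
    run: Callable[..., object]  # hypothetical entry point each tool module exposes


def load_registry(csv_text: str) -> Dict[str, ToolEntry]:
    """Parse registry CSV text, validate rows, and import each enabled tool."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"registry is missing columns: {sorted(missing)}")

    registry: Dict[str, ToolEntry] = {}
    for row in reader:
        if (row.get("enabled") or "").strip().lower() != "true":
            continue  # disabled tools are skipped and never imported
        name = (row.get("tool_name") or "").strip()
        module_path = (row.get("module_path") or "").strip()
        if not name or not module_path:
            raise ValueError(f"incomplete registry row: {row}")
        if name in registry:
            raise ValueError(f"duplicate tool name: {name}")
        module = importlib.import_module(module_path)
        run = getattr(module, "run", None)
        if not callable(run):
            raise TypeError(f"{module_path} does not define a callable run()")
        registry[name] = ToolEntry(
            name=name,
            description=(row.get("description") or "").strip(),
            version=(row.get("version") or "").strip(),
            run=run,
        )
    return registry
```

Failing fast on a malformed row or a module without the expected entry point means a bad CSV edit surfaces at startup rather than on the first request for that tool.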

Implementation Highlights

Async Request Handling

All tool executions are async-first so that no single call blocks the server:

  • Used asyncio for I/O-bound operations (API calls, file uploads)
  • Implemented thread pools for CPU-bound tasks (ML inference)
  • Added request timeout middleware preventing resource exhaustion
  • Built retry logic with exponential backoff for transient failures
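
A minimal sketch of how these pieces might fit together, using only the standard library. The names `TransientError`, `call_with_retry`, `cpu_heavy`, and `run_inference`, along with the default attempt count, delay, and timeout, are illustrative assumptions rather than the platform's actual code.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a throttled call)."""


async def call_with_retry(
    op: Callable[[], Awaitable[T]],
    *,
    attempts: int = 4,
    base_delay: float = 0.5,
    timeout: float = 30.0,
) -> T:
    """Run op with a per-attempt timeout, retrying transient failures."""
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(op(), timeout=timeout)
        except (TransientError, asyncio.TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # The base delay doubles on each attempt; random jitter keeps
            # many clients from retrying at the same instant.
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")


def cpu_heavy(n: int) -> int:
    """Stand-in for blocking work such as model inference."""
    return sum(i * i for i in range(n))


async def run_inference(n: int) -> int:
    # asyncio.to_thread runs the blocking call in the default thread pool,
    # so the event loop stays free to serve other requests.
    return await asyncio.to_thread(cpu_heavy, n)
```

Threads help most when the inference library releases the GIL during native computation; for pure-Python CPU work, a process pool is the usual alternative.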

Secure File Handling

The Claim Check pattern keeps large payloads out of the API request path, which improved both performance and security:

  • Large files are uploaded directly to Azure Blob Storage via SAS tokens
  • The API receives only a blob reference (URI), keeping payloads small
  • Tools fetch files from Blob Storage only when needed
  • Automatic cleanup of temporary files after configurable TTL (default 24 hours)
  • Virus scanning integration before file processing
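
The flow above can be illustrated with a small, self-contained sketch. To keep it runnable without cloud credentials, it uses an in-memory stand-in rather than the Azure SDK: `InMemoryBlobStore`, the `blob://uploads/...` reference format, and `word_count_tool` are all hypothetical, and SAS tokens and virus scanning are left out.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class InMemoryBlobStore:
    """Hypothetical stand-in for a blob container; not the Azure SDK."""

    ttl_seconds: float = 24 * 60 * 60  # mirrors the 24-hour default TTL
    _blobs: Dict[str, Tuple[bytes, float]] = field(default_factory=dict)

    def put(self, data: bytes, now: Optional[float] = None) -> str:
        """Store the payload and return a claim check (an opaque reference)."""
        ref = f"blob://uploads/{uuid.uuid4()}"
        self._blobs[ref] = (data, time.time() if now is None else now)
        return ref

    def get(self, ref: str) -> bytes:
        """Fetch the payload for a previously issued reference."""
        return self._blobs[ref][0]

    def purge_expired(self, now: Optional[float] = None) -> int:
        """Delete blobs older than the TTL and return how many were removed."""
        current = time.time() if now is None else now
        expired = [
            ref for ref, (_, stored_at) in self._blobs.items()
            if current - stored_at > self.ttl_seconds
        ]
        for ref in expired:
            del self._blobs[ref]
        return len(expired)


def word_count_tool(store: InMemoryBlobStore, request: dict) -> dict:
    """The tool receives only the reference and fetches bytes when needed."""
    text = store.get(request["input_ref"]).decode("utf-8")
    return {"words": len(text.split())}
```

The request that travels through the API is just `{"input_ref": "<reference>"}`, so its size stays constant no matter how large the uploaded file is.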

Kubernetes Deployment

Deployed to Azure Kubernetes Service (AKS) with production-grade configuration:

  • Horizontal Pod Autoscaler (HPA) based on CPU and memory metrics
  • Health check endpoints (liveness and readiness probes)
  • Secrets injected via Azure Key Vault CSI driver
  • Logging to Azure Monitor with structured JSON output
  • Blue-green deployments for zero-downtime updates
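
As an illustration of the structured JSON output mentioned above, here is a minimal sketch built on the standard `logging` module. The field names and the `extra={"context": ...}` convention are assumptions made for this example; shipping the output to Azure Monitor is not shown.

```python
import json
import logging
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # replace any handlers from earlier setup
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

A call such as `log.info("tool finished", extra={"context": {"tool": "text_summarizer"}})` then produces a single JSON line that a log collector can index by field.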

Exposed AI Tools

The platform currently exposes nine AI tools across three categories:

Vision Tools

  • Image classification (CLIP-based)
  • Object detection (YOLO)
  • OCR and document extraction

NLP Tools

  • Text summarization
  • Sentiment analysis
  • Named entity recognition

Utility Tools

  • Document conversion (PDF to text)
  • Language translation
  • Data validation and cleaning

Technical Stack

Python 3.11, FastAPI, Azure Blob Storage, Azure Key Vault, AKS (Kubernetes), Docker, Pydantic, asyncio

Performance & Scale

  • Handles 500+ concurrent tool executions per pod
  • Average API response time under 200ms for tool discovery
  • Successfully processed files up to 500 MB using the Claim Check pattern
  • Achieved 99.9% uptime across 3-month production period
  • Scales from 2 to 20 pods based on load (HPA configured)
  • Zero-downtime deployments during version updates
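
For context on the 2-to-20 pod range, the Horizontal Pod Autoscaler's documented core rule is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. The sketch below works through that arithmetic. The utilization figures in the usage note are made-up examples rather than measurements from this deployment, and the controller's tolerance band and stabilization behavior are omitted.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 2,   # lower bound taken from the 2-to-20 range above
    max_replicas: int = 20,  # upper bound taken from the 2-to-20 range above
) -> int:
    """Apply the basic HPA scaling rule, then clamp to the allowed range."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

With a hypothetical 60% CPU target, 4 pods averaging 90% would scale to `desired_replicas(4, 90, 60)`, which is 6; 10 pods averaging 150% would ask for 25 and be capped at 20.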

Key Learnings

  • Dynamic Configuration is Powerful: The CSV-driven registry allowed non-developers to add tools, significantly reducing deployment cycles.
  • Claim Check Pattern for Files: Moving large files out of the API request path improved response times by 10x and eliminated memory issues.
  • Observability from Day One: Structured logging and metrics collection made debugging production issues much easier than anticipated.
  • Kubernetes Complexity: While powerful, K8s required significant operational knowledge. Proper health checks and resource limits were critical.

Want to Learn More?

I'm happy to dive deeper into the technical architecture or discuss how similar patterns could be applied to your infrastructure.