Skip to content
WhySoGeek.
Software

Docker MCP Toolkit and Model Runner: 2026 Guide

Docker Desktop now runs local LLMs and sandboxed MCP servers from one app, with profile templates that bootstrap a whole agent toolset in a couple of clicks.

Sam Carter 8 min read
Cover image for Docker MCP Toolkit and Model Runner: 2026 Guide
Photo: nickgraywfu / flickr (BY-SA 2.0)

Running local AI used to mean juggling Ollama, a pile of Python, and a folder of half-configured MCP servers you were afraid to touch. Docker Desktop has quietly turned all of that into two features that live in the same app: Model Runner for local LLMs and the MCP Toolkit for the tools those models call.

Quick answer

Docker Model Runner runs large language models locally straight from Docker Desktop, while the MCP Toolkit runs Model Context Protocol servers in sandboxed containers that any MCP client can consume. In 2026, Docker added MCP profile template cards that bootstrap a whole collection of servers at once, Qwen3.5 model support, and a filterable Logs view. You enable both from the AI section of Docker Desktop.

Key takeaways

  • Model Runner pulls and runs local LLMs from Docker Desktop with no separate inference stack.
  • The MCP Toolkit runs MCP servers in sandboxed containers, described by Docker as "the npm of AI tools."
  • Profile template cards (added in 4.67, March 2026) bootstrap a predefined server collection in a couple of clicks.
  • Model Runner added Qwen3.5 support, registry mirrors, and vLLM Metal for Apple silicon.
  • A 2026 security fix patched CVE-2026-33990, an SSRF flaw in the Model Runner OCI client, so update.

Model Runner: local LLMs without the mess

Docker Model Runner lets you pull a model the same way you pull a container image and run inference against it locally. Models come from the Docker model catalog as OCI artifacts, so the tooling you already know handles versioning and distribution.

The 2026 releases expanded what it can run. Support for the Qwen3.5 family landed for local inference directly from Docker Desktop, registry mirrors let teams cache models internally, and vLLM Metal support improved throughput on Apple silicon. If you are choosing which model to run locally, our roundup of the best open-weight LLMs covers the tradeoffs in size and quality.

Because models run as artifacts your existing registry can host, Model Runner fits into corporate environments where pulling from public endpoints is restricted. That is a meaningful difference from tools that assume an open internet connection.

Rows of server hardware with blue status lights representing local model inference
Photo: jurvetson / flickr (BY 2.0)

MCP Toolkit: tools that just work

The Model Context Protocol lets an AI agent call external tools, read files, or hit APIs. The problem has always been running those MCP servers safely and consistently. The Docker MCP Toolkit solves it by running each server in a sandboxed container, isolated from your host, and exposing it to any MCP-compatible client.

Docker calls the accompanying MCP Catalog "the npm of AI tools": a centralized, sandboxed registry of server capabilities that clients like Claude Desktop, Cursor, and Continue.dev can consume. Instead of cloning a repo and hoping its dependencies do not conflict with your system, you pick a server from the catalog and it runs in its own container.

The sandboxing is the point. An MCP server that can read files or make network calls is exactly the kind of thing you want isolated, and containers give you that boundary for free. If you are thinking about the security model here, our piece on sandboxing AI agent code execution explains why that isolation matters.

Profile template cards

The standout 2026 addition arrived in Docker Desktop 4.67.0 on March 30: MCP profile template cards. A profile is a named bundle of MCP servers. Instead of adding servers one at a time and wiring up each configuration, a template card bootstraps a whole predefined collection at once, with an onboarding tour to walk you through it.

For anyone setting up an agent workflow, this collapses what used to be an afternoon of config into a couple of clicks. Pick a template that matches your use case, adjust what you need, and every server in the bundle is running and connected.

FeatureWhat it doesAdded
Model RunnerRuns local LLMs as OCI artifactsEarlier, expanded 2026
Qwen3.5 supportAdditional local model family2026
MCP ToolkitSandboxed MCP servers for any client2025, refined 2026
Profile template cardsBootstrap a server collection at once4.67, March 2026
Logs (Beta) filterScope logs to one Compose stack4.67, March 2026

Debugging with the new Logs view

Docker also shipped a filterable Logs view in beta. The 4.67 update added a filter that scopes log output to a specific Compose stack, which cuts a lot of noise when you are running several services locally. For MCP work, where a misbehaving server can be hard to spot in a flood of container output, being able to isolate one stack's logs is a practical quality-of-life win.

The security fix you should not skip

Docker Desktop 4.67.0 also patched CVE-2026-33990, a server-side request forgery vulnerability in Model Runner's OCI registry client. SSRF flaws let an attacker coax a service into making requests it should not, which is especially dangerous for teams running Docker on machines with access to sensitive internal networks. If you use Model Runner, updating to 4.67 or later is not optional.

What to do right now

  • Update Docker Desktop to 4.67.0 or later to get the CVE-2026-33990 fix.
  • Open the AI section in Docker Desktop and enable Model Runner.
  • Pull a model like Qwen3.5 and run a quick local inference test.
  • Add MCP servers through a profile template card rather than one at a time.
  • Point your MCP client (Claude Desktop, Cursor, or Continue.dev) at the toolkit.
  • Use the Logs filter to scope output to one Compose stack when debugging.

Frequently asked questions

Do I still need Ollama if I use Docker Model Runner?

Not necessarily. Model Runner covers the local inference role that Ollama fills, and it integrates with the Docker tooling and registries you already use. If your team is standardized on Docker, consolidating on Model Runner removes a separate dependency.

Are MCP servers in the toolkit safe to run?

Each server runs in a sandboxed container isolated from your host, which is a stronger boundary than running the same code directly on your machine. That said, an MCP server still has whatever permissions you grant it, so review what a server can access before wiring it into an agent.

What is a profile in the MCP Toolkit?

A profile is a named bundle of MCP servers you can enable together. Profile template cards, added in 4.67, let you bootstrap a predefined collection in one step instead of configuring each server manually.

Can I run this offline or behind a corporate firewall?

Yes, to a degree. Model Runner supports registry mirrors so teams can cache models internally, and because models are OCI artifacts, your existing private registry can host them. MCP servers that call external APIs will still need network access to those APIs.

Which clients can use the Docker MCP Toolkit?

Any MCP-compatible client, including Claude Desktop, Cursor, and Continue.dev. Docker exposes the toolkit through the standard protocol, so support depends on the client implementing MCP rather than on anything Docker-specific.

#docker#mcp#local-llm#ai-tools

Sources & further reading

Keep reading