Switching from mcp-ragdocs to Pathfinder

mcp-ragdocs is a solid open-source RAG MCP server. Pathfinder adds filesystem exploration, multi-source indexing, conversational sources, config-driven setup, and webhook auto-reindex on top of what ragdocs provides.

Why Switch

More than search. Pathfinder gives agents bash filesystem exploration (find, grep, cat, ls) alongside semantic search. Agents pick the right tool for each sub-task instead of relying solely on RAG.
Multi-source indexing. Index docs, code, Slack threads, Discord forums, and Notion pages in one server. mcp-ragdocs indexes URLs one at a time.
Config-driven. One pathfinder.yaml file defines all sources, tools, and behavior. No scattered environment variables to manage.
No Qdrant dependency. Pathfinder uses PostgreSQL + pgvector for vector storage, or PGlite for zero-infra local development. No separate vector database to run.
Webhook reindexing. Push to GitHub and your docs reindex automatically. No manual re-ingestion step.
Knowledge and FAQ tools. Conversational sources (Slack, Discord) are distilled into Q&A pairs that agents can query directly.

What You Gain

🔎

Search + Explore

Agents get both semantic search and bash filesystem tools. They pick the right approach for each sub-task instead of being limited to RAG only.

📄

Multi-Source

Docs, code, Slack, Discord, Notion — all indexed in one server. mcp-ragdocs indexes one URL at a time; Pathfinder indexes everything at once.

⚡

Zero-Infra Start

PGlite means no database to install for local dev. Bash-only mode needs no API keys at all. Add RAG when you're ready — it's just config.

🔄

Webhook Reindexing

Push to GitHub and docs reindex automatically. No manual add_documentation calls. A full reindex also runs nightly on a schedule.

💬

Knowledge Tools

Conversational sources are distilled into Q&A pairs. Agents query community knowledge directly, not just documentation.

📋

Config-Driven

One YAML file defines sources, tools, behavior, and webhooks. No scattered environment variables or manual setup steps.

Config Comparison

Here's how a typical mcp-ragdocs setup compares to the equivalent Pathfinder config:

mcp-ragdocs (.env + MCP config)

# Environment variables
QDRANT_URL=http://localhost:6333
OPENAI_API_KEY=sk-...
EMBEDDING_MODEL=text-embedding-ada-002
COLLECTION_NAME=my-docs

# MCP client config
{
  "mcpServers": {
    "ragdocs": {
      "command": "npx",
      "args": ["mcp-ragdocs"],
      "env": {
        "QDRANT_URL": "...",
        "OPENAI_API_KEY": "..."
      }
    }
  }
}

Pathfinder (pathfinder.yaml)

server:
  name: my-project

sources:
  - name: docs
    type: markdown
    repo: https://github.com/your-org/your-repo.git
    path: docs/

tools:
  - name: search-docs
    type: search
    source: docs
  - name: explore-docs
    type: bash
    sources: [docs]

Migration Walkthrough

Install Pathfinder

CLI:

$ npx @copilotkit/pathfinder init
$ npx @copilotkit/pathfinder serve

Docker:

$ docker pull ghcr.io/copilotkit/pathfinder

See the full setup guide for detailed instructions.

Create pathfinder.yaml

Translate what mcp-ragdocs was indexing into a Pathfinder source. If you were adding URLs that point to your docs repo, map them to a source backed by that git repo:

server:
  name: my-project

sources:
  - name: docs
    type: markdown
    repo: https://github.com/your-org/your-repo.git
    path: docs/
    file_patterns: ["**/*.md", "**/*.mdx"]

tools:
  # Semantic search — replaces ragdocs search_documentation
  - name: search-docs
    type: search
    source: docs
    default_limit: 5

  # Filesystem exploration — ragdocs doesn't have this
  - name: explore-docs
    type: bash
    sources: [docs]
    bash:
      session_state: true
      grep_strategy: hybrid
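
If you have a list of GitHub URLs that you previously fed to mcp-ragdocs, the mapping in this step can be scripted. The following is a minimal sketch, not part of Pathfinder: the helper name github_url_to_source is hypothetical, it assumes URLs of the form https://github.com/<org>/<repo>/(tree|blob)/<branch>/<path> with a branch name that contains no slashes, and it only emits the name, type, repo, and path keys shown in the config above.

```python
from urllib.parse import urlparse


def github_url_to_source(url: str, name: str = "docs") -> dict:
    """Map a GitHub tree/blob URL to a source entry shaped like the YAML above.

    Hypothetical helper: assumes github.com URLs and a slash-free branch name.
    """
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a GitHub URL: {url}")

    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"URL has no org/repo: {url}")

    org, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]

    # parts[2] is "tree" or "blob", parts[3] is the branch, the rest is the path.
    subpath = []
    if len(parts) > 4 and parts[2] in ("tree", "blob"):
        subpath = parts[4:]
        if parts[2] == "blob":
            # A blob URL points at a file; index its containing directory.
            subpath = subpath[:-1]

    source = {
        "name": name,
        "type": "markdown",
        "repo": f"https://github.com/{org}/{repo}.git",
    }
    if subpath:
        source["path"] = "/".join(subpath) + "/"
    return source


if __name__ == "__main__":
    print(github_url_to_source("https://github.com/your-org/your-repo/tree/main/docs"))
```

Several URLs from the same repo usually collapse into one source entry, so deduplicate the results by repo and path before writing them into pathfinder.yaml.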

Start serving

No Qdrant needed. Pathfinder uses PGlite by default for zero-infra local development:

$ npx @copilotkit/pathfinder serve

The first boot automatically indexes your sources. For production, set DATABASE_URL to a PostgreSQL instance with pgvector.

Update your MCP client config

Replace the mcp-ragdocs server entry with Pathfinder:

// Before (mcp-ragdocs)
{
  "mcpServers": {
    "ragdocs": {
      "command": "npx",
      "args": ["mcp-ragdocs"]
    }
  }
}

// After (Pathfinder)
{
  "mcpServers": {
    "docs": { "url": "http://localhost:3001/mcp" }
  }
}
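
If you maintain several client configs, the same swap can be applied programmatically. This is a minimal sketch under stated assumptions: the function name switch_to_pathfinder is hypothetical, the entry names ("ragdocs", "docs") and the URL are taken from the example above, and your client's config file may live at a different path or carry additional keys, which are passed through unchanged.

```python
import json


def switch_to_pathfinder(
    config: dict,
    old_name: str = "ragdocs",
    new_name: str = "docs",
    url: str = "http://localhost:3001/mcp",
) -> dict:
    """Return a copy of an MCP client config with the ragdocs entry replaced.

    Hypothetical helper: the input dict is not modified.
    """
    servers = dict(config.get("mcpServers", {}))
    servers.pop(old_name, None)          # drop the mcp-ragdocs entry if present
    servers[new_name] = {"url": url}     # point the client at Pathfinder
    return {**config, "mcpServers": servers}


before = {"mcpServers": {"ragdocs": {"command": "npx", "args": ["mcp-ragdocs"]}}}
after = switch_to_pathfinder(before)

if __name__ == "__main__":
    print(json.dumps(after, indent=2))
```

Other servers listed under mcpServers are left in place, so the function is safe to run on a config that registers more than one server.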

Add sources ragdocs couldn't do (optional)

Now that you're on Pathfinder, you can index conversational sources too:

sources:
  - name: docs
    type: markdown
    repo: https://github.com/your-org/your-repo.git
    path: docs/

  # Slack support threads
  - name: support
    type: slack
    channels: ["C0123456789"]

  # Discord community forums
  - name: community
    type: discord
    guild_id: "123456789"
    channels:
      - id: "111111111"
        type: forum

  # Source code
  - name: code
    type: code
    repo: https://github.com/your-org/your-repo.git
    path: src/

What's Different

Pathfinder is not a drop-in replacement. Here's what works differently:

No URL-based ingestion

mcp-ragdocs lets you add arbitrary URLs for indexing. Pathfinder indexes git repos and local files instead. For web content, use the html source type to index specific sites.

Different vector database

mcp-ragdocs uses Qdrant. Pathfinder uses PostgreSQL + pgvector. If you have an existing Qdrant deployment, you'll be moving to a different storage backend. Your embeddings will be regenerated on first index.

Managed embeddings pipeline

Pathfinder manages its own embeddings: you don't need a separate embedding service, and you don't configure embedding models per collection. Set the model once in config and Pathfinder handles the rest.
