PaperCortex/docs/setup.md
Rene Fichtmueller 2052d87ba1 feat: initial release — AI document intelligence for Paperless-ngx
PaperCortex adds semantic search, auto-classification, receipt extraction,
bank statement matching, and DATEV export to Paperless-ngx — powered
entirely by local AI through Ollama. Exposes everything as an MCP Server
for Claude Code and AI agent integration.

- MCP Server with 5 tools (search, classify, receipt, query, export)
- Local Ollama embeddings for semantic document search
- Receipt data extraction (vendor, amount, date, tax, line items)
- DATEV Buchungsstapel CSV export for German accounting
- Bank CSV transaction matching
- Paperless-ngx REST API client
- Docker deployment
- Zero cloud dependencies — 100% self-hosted
2026-03-26 06:28:48 +13:00

Setup Guide

Prerequisites

  • Node.js 20+ (or Docker)
  • Paperless-ngx instance with API access
  • Ollama with required models

Step 1: Install Ollama Models

# Required: LLM for classification and extraction
ollama pull qwen2.5:14b

# Required: Embedding model for semantic search
ollama pull nomic-embed-text

Verify Ollama is running:

curl http://localhost:11434/api/tags
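
A response from the endpoint shows that Ollama is running, but not that both models are installed. As a minimal sketch, the `/api/tags` response can be scanned for the two model names pulled above. The helper names `has_model` and `check_ollama_models` are illustrative and not part of PaperCortex; the sketch assumes `curl` is installed and falls back to the default address when `OLLAMA_URL` is unset.

```shell
# Sketch only: helper names are hypothetical, not part of PaperCortex.

# has_model MODEL
# Reads an /api/tags JSON response on stdin and succeeds if some model name
# starts with MODEL (so "nomic-embed-text" matches "nomic-embed-text:latest").
has_model() {
  tr -d ' \n' | grep -qF "\"name\":\"$1"
}

# check_ollama_models
# Fetches the tag list once and reports each required model.
check_ollama_models() {
  tags=$(curl -fsS "${OLLAMA_URL:-http://localhost:11434}/api/tags") || {
    echo "Ollama is not reachable" >&2
    return 1
  }
  status=0
  for model in qwen2.5:14b nomic-embed-text; do
    if printf '%s' "$tags" | has_model "$model"; then
      echo "ok: $model"
    else
      echo "missing: $model (run: ollama pull $model)" >&2
      status=1
    fi
  done
  return "$status"
}
```

After sourcing these functions, run `check_ollama_models`. The match is a simple prefix test on the text, not a full JSON parse, so it is only a quick sanity check.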

Step 2: Get Paperless-ngx API Token

  1. Open your Paperless-ngx web UI
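  2. Open your user profile (user menu > My Profile in recent Paperless-ngx versions; tokens can also be managed in the Django admin under Auth Token > Tokens)
  3. Generate a new API token, or copy the existing one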
  4. Copy the token for the next step
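
Before continuing, it can help to confirm that the token works. The following is a minimal sketch under a few assumptions: Paperless-ngx is reachable at the URL you pass in, `curl` is installed, and the helper names (`auth_header`, `explain_status`, `check_paperless_token`) are illustrative and not part of PaperCortex. It relies on the Paperless-ngx token scheme, where the token is sent as `Authorization: Token <token>`.

```shell
# Sketch only: helper names are hypothetical, not part of PaperCortex.

# auth_header TOKEN
# Prints the header Paperless-ngx expects for token authentication.
auth_header() {
  printf 'Authorization: Token %s' "$1"
}

# explain_status HTTP_CODE
# Maps the HTTP status of a test request to a short diagnosis.
explain_status() {
  case "$1" in
    200) echo "token accepted" ;;
    401|403) echo "token rejected or lacks permission" ;;
    000) echo "no HTTP response (server unreachable?)" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# check_paperless_token URL TOKEN
# Requests a single document and reports how the server answered.
check_paperless_token() {
  # curl prints 000 as the status when no HTTP response was received.
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "$(auth_header "$2")" "$1/api/documents/?page_size=1") || true
  explain_status "$code"
  [ "$code" = "200" ]
}
```

Example call: `check_paperless_token http://localhost:8000 "<your-api-token>"`.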

Step 3: Configure PaperCortex

git clone https://github.com/YOUR_USERNAME/PaperCortex.git
cd PaperCortex
cp .env.example .env

Edit .env with your values:

PAPERLESS_URL=http://localhost:8000
PAPERLESS_TOKEN=<your-api-token>
OLLAMA_URL=http://localhost:11434
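
A forgotten or placeholder value in `.env` is an easy mistake to make. As a rough sketch, the file can be checked for the three keys listed above before starting the service. The function name `missing_env_keys` is hypothetical, and the check assumes plain `KEY=value` lines without quoting or `export` prefixes.

```shell
# Sketch only: helper name is hypothetical, not part of PaperCortex.

# missing_env_keys FILE
# Prints every required key that is absent, empty, or still an
# angle-bracket placeholder in FILE. Succeeds only if nothing is printed.
missing_env_keys() {
  missing=0
  for key in PAPERLESS_URL PAPERLESS_TOKEN OLLAMA_URL; do
    # Take the value of the last KEY=value line, if any.
    value=$(sed -n "s/^$key=//p" "$1" | tail -n 1)
    case "$value" in
      ''|'<'*'>')
        echo "$key"
        missing=1
        ;;
    esac
  done
  [ "$missing" -eq 0 ]
}
```

Example call: `missing_env_keys .env || echo "fix the keys listed above"`.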

Step 4: Run

Option A: Docker

docker compose up -d

Option B: Manual

npm install
npm run build
npm start

Option C: Development

npm install
npm run dev

Step 5: Register MCP Server

Add to your Claude Code configuration (~/.claude.json):

{
  "mcpServers": {
    "papercortex": {
      "command": "node",
      "args": ["/absolute/path/to/PaperCortex/dist/mcp-server/index.js"],
      "env": {
        "PAPERLESS_URL": "http://localhost:8000",
        "PAPERLESS_TOKEN": "your-token",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
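
The path in "args" must be absolute and must point at an existing file. The sketch below prints that path for a given checkout and fails if the file is missing. It assumes that `npm run build` is what produces `dist/mcp-server/index.js`, as the path above suggests; the function name `mcp_entry_path` is hypothetical.

```shell
# Sketch only: helper name is hypothetical, not part of PaperCortex.

# mcp_entry_path REPO_DIR
# Prints the absolute path to the built MCP entry point, or fails if the
# file does not exist yet.
mcp_entry_path() {
  entry="$1/dist/mcp-server/index.js"
  if [ ! -f "$entry" ]; then
    echo "not found: $entry (run npm run build first)" >&2
    return 1
  fi
  # Resolve the directory to an absolute path without relying on realpath.
  dir=$(cd "$(dirname "$entry")" && pwd)
  printf '%s/index.js\n' "$dir"
}
```

Run `mcp_entry_path .` from the PaperCortex directory and paste the printed path into the configuration.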

Step 6: Populate Vector Store

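Semantic search only covers documents that have already been embedded. Bulk embedding of existing documents is not automated yet; this is planned for a future release. For now, the vector store is populated as documents are queried or classified, so search results may be incomplete at first.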

Troubleshooting

"Connection refused" to Paperless-ngx

  • Verify the URL in .env is reachable
  • Check that the API token is valid (an invalid token usually produces an HTTP 401 response rather than a refused connection)
  • Ensure Paperless-ngx is running

"Connection refused" to Ollama

  • Start the server with ollama serve if it is not already running
  • Check that OLLAMA_URL uses the correct port (default: 11434)
  • Verify models are pulled: ollama list
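
To tell a refused connection apart from a DNS problem or a timeout, the exit code of `curl` is a useful signal for both services. The sketch below maps the most common codes (6, 7, and 28) to a likely cause. The helper names `explain_curl_exit` and `probe` are hypothetical, and the 5 second limit is an arbitrary choice.

```shell
# Sketch only: helper names are hypothetical, not part of PaperCortex.

# explain_curl_exit CODE
# Translates common curl exit codes into a likely cause.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "host name could not be resolved" ;;
    7)  echo "connection refused or host down" ;;
    28) echo "timed out" ;;
    *)  echo "curl failed with exit code $1" ;;
  esac
}

# probe NAME URL
# Attempts a connection with a 5 second limit and prints the diagnosis.
# Any HTTP response, including 401, counts as reachable.
probe() {
  rc=0
  curl -s -o /dev/null --max-time 5 "$2" || rc=$?
  echo "$1: $(explain_curl_exit "$rc")"
  [ "$rc" -eq 0 ]
}
```

Example calls: `probe paperless http://localhost:8000/api/` and `probe ollama http://localhost:11434/api/tags`.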

Slow first query

  • The first embedding generation may take longer as Ollama loads the model into memory
  • Subsequent queries will be faster once the model is loaded
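
One way to avoid the delay is to send a throwaway embedding request right after startup, so the model is already in memory when the first real query arrives. This is a sketch only: it calls Ollama's `/api/embeddings` endpoint directly and makes no assumption about how PaperCortex itself requests embeddings. The helper names `embed_payload` and `warm_up_embeddings` are hypothetical.

```shell
# Sketch only: helper names are hypothetical, not part of PaperCortex.

# embed_payload MODEL TEXT
# Builds the JSON body for Ollama's /api/embeddings endpoint.
# TEXT must not contain double quotes or backslashes (no escaping is done).
embed_payload() {
  printf '{"model":"%s","prompt":"%s"}' "$1" "$2"
}

# warm_up_embeddings
# Sends one throwaway request so the embedding model is loaded in advance.
warm_up_embeddings() {
  curl -fsS -o /dev/null \
    -H 'Content-Type: application/json' \
    -d "$(embed_payload nomic-embed-text warm-up)" \
    "${OLLAMA_URL:-http://localhost:11434}/api/embeddings"
}
```

Ollama unloads models that have been idle for a while, so the warm-up may need to be repeated after long pauses.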