Rene Fichtmueller 2052d87ba1 feat: initial release — AI document intelligence for Paperless-ngx

PaperCortex adds semantic search, auto-classification, receipt extraction,
bank statement matching, and DATEV export to Paperless-ngx — powered
entirely by local AI through Ollama. Exposes everything as an MCP Server
for Claude Code and AI agent integration.

- MCP Server with 5 tools (search, classify, receipt, query, export)
- Local Ollama embeddings for semantic document search
- Receipt data extraction (vendor, amount, date, tax, line items)
- DATEV Buchungsstapel CSV export for German accounting
- Bank CSV transaction matching
- Paperless-ngx REST API client
- Docker deployment
- Zero cloud dependencies — 100% self-hosted

2026-03-26 06:28:48 +13:00

Setup Guide

Prerequisites

Node.js 20+ (or Docker)
Paperless-ngx instance with API access
Ollama with required models

Step 1: Install Ollama Models

# Required: LLM for classification and extraction
ollama pull qwen2.5:14b

# Required: Embedding model for semantic search
ollama pull nomic-embed-text

Verify Ollama is running:

curl http://localhost:11434/api/tags
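The same response can confirm that both models from this step are installed. The sketch below assumes the response lists each model under a "name" field, as current Ollama releases do. has_model is a hypothetical helper, not part of PaperCortex or Ollama.

```shell
# Hypothetical helper: succeed if a model name appears in an /api/tags response.
# $1 = model name (prefix match, so "nomic-embed-text" also matches
#      "nomic-embed-text:latest"), $2 = JSON body returned by /api/tags
has_model() {
  printf '%s' "$2" | grep -q "\"name\"[[:space:]]*:[[:space:]]*\"$1"
}

# Usage (requires a running Ollama instance):
#   tags=$(curl -s http://localhost:11434/api/tags)
#   for m in qwen2.5:14b nomic-embed-text; do
#     has_model "$m" "$tags" && echo "ok: $m" || echo "missing: $m"
#   done
```

Because the match is a prefix match, a differently tagged variant of the same model would also pass; tighten the pattern if an exact tag matters.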

Step 2: Get Paperless-ngx API Token

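Open your Paperless-ngx web UI
Open the API token setting (the original path is Settings > API; in recent Paperless-ngx releases the token is shown under My Profile in the user menu, and older releases expose it through the Django admin)
Generate a new API token
Copy the token for the next step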
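Before moving on, it can help to confirm that the token works. Paperless-ngx accepts API tokens in an Authorization: Token header. The sketch below assumes the instance runs at http://localhost:8000 and that the token has been exported as PAPERLESS_TOKEN in the current shell. explain_status is a hypothetical helper that only interprets the status code.

```shell
# Hypothetical helper: turn an HTTP status code into a short diagnosis.
explain_status() {
  case "$1" in
    200)     echo "token accepted" ;;
    401|403) echo "token rejected" ;;
    000)     echo "server unreachable" ;;  # curl prints 000 when no response arrives
    *)       echo "unexpected status $1" ;;
  esac
}

# Usage (assumed URL and variable name, adjust to your setup):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Token $PAPERLESS_TOKEN" \
#     http://localhost:8000/api/documents/)
#   explain_status "$code"
```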

Step 3: Configure PaperCortex

git clone https://github.com/YOUR_USERNAME/PaperCortex.git
cd PaperCortex
cp .env.example .env

Edit .env with your values:

PAPERLESS_URL=http://localhost:8000
PAPERLESS_TOKEN=<your-api-token>
OLLAMA_URL=http://localhost:11434
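A quick check that all three variables are present can catch typos before the first start. This is a minimal sketch: missing_env_keys is a hypothetical helper, and it only tests that each key has a non-empty value, so a placeholder left unchanged would still pass.

```shell
# Hypothetical helper: print every required key that is missing or empty.
# $1 = path to the env file
missing_env_keys() {
  for key in PAPERLESS_URL PAPERLESS_TOKEN OLLAMA_URL; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    grep -Eq "^${key}=.+" "$1" || echo "$key"
  done
}

# Usage: prints nothing when the file is complete.
#   missing_env_keys .env
```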

Step 4: Run

Option A: Docker (Recommended)

docker compose up -d

Option B: Manual

npm install
npm run build
npm start

Option C: Development

npm install
npm run dev

Step 5: Register MCP Server

Add to your Claude Code configuration (~/.claude.json):

{
  "mcpServers": {
    "papercortex": {
      "command": "node",
      "args": ["/absolute/path/to/PaperCortex/dist/mcp-server/index.js"],
      "env": {
        "PAPERLESS_URL": "http://localhost:8000",
        "PAPERLESS_TOKEN": "your-token",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
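If the server does not show up after registration, a common cause is a wrong path in "args". The sketch below checks that the path is absolute and that the file exists. It assumes the entry point is produced by the build step from Step 4; check_entrypoint is a hypothetical helper, not a PaperCortex or Claude Code command.

```shell
# Hypothetical helper: validate the path used in the "args" entry.
# $1 = path to the MCP server entry point
check_entrypoint() {
  case "$1" in
    /*) ;;  # absolute path, continue
    *)  echo "not an absolute path: $1"; return 1 ;;
  esac
  [ -f "$1" ] || { echo "file not found: $1 (has the project been built?)"; return 1; }
  echo "ok: $1"
}

# Usage:
#   check_entrypoint /absolute/path/to/PaperCortex/dist/mcp-server/index.js
```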

Step 6: Populate Vector Store

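Existing documents must be embedded before semantic search can find them. Automated population is planned for a future release. For now, the vector store fills incrementally as documents are queried or classified, so documents that have not yet been queried or classified are not in the vector store.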

Troubleshooting

"Connection refused" to Paperless-ngx

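Ensure Paperless-ngx is running
Verify the URL in .env is reachable from where PaperCortex runs (inside a Docker container, localhost refers to the container itself)
If the connection succeeds but requests are rejected, check that the API token is valid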

"Connection refused" to Ollama

Run ollama serve if Ollama is not already running
Check that OLLAMA_URL uses the right port (default: 11434)
Verify the models are pulled: ollama list

Slow first query

The first embedding request may be slow while Ollama loads the model into memory
Subsequent queries are faster for as long as the model stays loaded
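One way to avoid the delay on the first real query is to send a throwaway embedding request after Ollama starts. This is a sketch under assumptions: it uses Ollama's /api/embed endpoint directly, which may not be the call PaperCortex makes internally, and warmup_payload is a hypothetical helper.

```shell
# Hypothetical helper: build a minimal request body for Ollama's /api/embed.
# $1 = model name; the input text is a fixed, JSON-safe string.
warmup_payload() {
  printf '{"model":"%s","input":"warm-up"}' "$1"
}

# Usage (requires a running Ollama instance; the response is discarded):
#   curl -s http://localhost:11434/api/embed \
#     -d "$(warmup_payload nomic-embed-text)" > /dev/null
```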
