PaperCortex/docs/receipts.md
Rene Fichtmueller 2052d87ba1 feat: initial release — AI document intelligence for Paperless-ngx
PaperCortex adds semantic search, auto-classification, receipt extraction,
bank statement matching, and DATEV export to Paperless-ngx — powered
entirely by local AI through Ollama. Exposes everything as an MCP Server
for Claude Code and AI agent integration.

- MCP Server with 5 tools (search, classify, receipt, query, export)
- Local Ollama embeddings for semantic document search
- Receipt data extraction (vendor, amount, date, tax, line items)
- DATEV Buchungsstapel CSV export for German accounting
- Bank CSV transaction matching
- Paperless-ngx REST API client
- Docker deployment
- Zero cloud dependencies — 100% self-hosted
2026-03-26 06:28:48 +13:00

Receipt Workflow

Overview

PaperCortex provides a complete receipt-to-accounting pipeline:

  1. Scan -- Upload receipts to Paperless-ngx (scan, email, photo)
  2. Extract -- AI extracts structured data (vendor, date, amounts, line items)
  3. Match -- Reconcile against bank CSV exports
  4. Export -- Generate DATEV-compatible CSV for accounting software

Receipt Extraction

Via MCP Server (Claude Code)

Extract receipt data from document #1234

Via CLI

npm run receipt:extract -- --document-id 1234

Extracted Fields

| Field | Description | Example |
| --- | --- | --- |
| vendor | Company name | "IKEA Deutschland GmbH" |
| vendorAddress | Full address | "Am Wanderweg 1, 65719 Hofheim" |
| vendorTaxId | Tax ID / VAT number | "DE 129 341 800" |
| date | Receipt date | "2024-03-15" |
| currency | ISO 4217 code | "EUR" |
| subtotal | Amount before tax | 84.03 |
| taxRate | Tax percentage | 19 |
| taxAmount | Tax amount | 15.97 |
| totalAmount | Total including tax | 100.00 |
| paymentMethod | How it was paid | "card" |
| lineItems | Individual items | Array of items |
| category | Expense category | "office_supplies" |
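The fields above can be written down as a TypeScript type, which also makes it easy to sanity-check the extracted amounts. This is a minimal sketch, not PaperCortex's actual type definition: the `Receipt` and `LineItem` names, the choice of optional fields, the line-item shape, and the `checkAmounts` helper are all assumptions made for illustration.

```typescript
// Hypothetical line-item shape; the source does not specify it.
interface LineItem {
  description: string;
  quantity: number;
  unitPrice: number;
}

// Field names follow the "Extracted Fields" list; optionality is an assumption.
interface Receipt {
  vendor: string;
  vendorAddress?: string;
  vendorTaxId?: string;
  date: string;        // ISO 8601 date, e.g. "2024-03-15"
  currency: string;    // ISO 4217 code, e.g. "EUR"
  subtotal: number;    // amount before tax
  taxRate: number;     // percentage, e.g. 19
  taxAmount: number;
  totalAmount: number; // subtotal + taxAmount
  paymentMethod?: string;
  lineItems: LineItem[];
  category?: string;
}

// Work in whole cents so floating-point noise does not cause false alarms.
function toCents(amount: number): number {
  return Math.round(amount * 100);
}

// Returns a list of problems; an empty list means the amounts are consistent.
function checkAmounts(r: Receipt): string[] {
  const problems: string[] = [];
  if (toCents(r.subtotal) + toCents(r.taxAmount) !== toCents(r.totalAmount)) {
    problems.push("subtotal + taxAmount does not equal totalAmount");
  }
  const expectedTax = toCents((r.subtotal * r.taxRate) / 100);
  // Tolerate one cent of rounding difference.
  if (Math.abs(expectedTax - toCents(r.taxAmount)) > 1) {
    problems.push("taxAmount does not match subtotal * taxRate");
  }
  return problems;
}

// Values taken from the example column of the field list.
const exampleReceipt: Receipt = {
  vendor: "IKEA Deutschland GmbH",
  date: "2024-03-15",
  currency: "EUR",
  subtotal: 84.03,
  taxRate: 19,
  taxAmount: 15.97,
  totalAmount: 100.0,
  paymentMethod: "card",
  lineItems: [],
  category: "office_supplies",
};

const exampleProblems = checkAmounts(exampleReceipt);
```

A check like this is useful because AI extraction can misread a digit; comparing the three amounts catches many such errors before export.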

Bank Statement Matching

Match receipts against bank CSV exports to determine which bank transaction each receipt corresponds to.

Supported Bank Formats

  • Sparkasse (semicolon-separated, German format)
  • ING (semicolon-separated)
  • DKB (semicolon-separated)
  • Volksbank (semicolon-separated)
  • Generic CSV
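Most of the formats above are semicolon-separated and use German conventions: a decimal comma with dots as thousands separators, and dates written as DD.MM.YYYY. The sketch below shows one way to normalise such a line. The three-column layout (date; description; amount) and all function names are assumptions for illustration; real bank exports have more columns in bank-specific order.

```typescript
interface BankTransaction {
  bookingDate: string; // ISO 8601 date
  description: string;
  amount: number;      // negative for debits
}

// "1.234,56" -> 1234.56, "-100,00" -> -100
function parseGermanAmount(text: string): number {
  const normalized = text.trim().replace(/\./g, "").replace(",", ".");
  const value = Number(normalized);
  if (normalized === "" || Number.isNaN(value)) {
    throw new Error(`Not a German-format amount: "${text}"`);
  }
  return value;
}

// "15.03.2024" -> "2024-03-15"
function parseGermanDate(text: string): string {
  const m = /^(\d{2})\.(\d{2})\.(\d{4})$/.exec(text.trim());
  if (m === null) {
    throw new Error(`Not a DD.MM.YYYY date: "${text}"`);
  }
  return `${m[3]}-${m[2]}-${m[1]}`;
}

// Split on semicolons while keeping semicolons inside double-quoted fields.
function splitSemicolonLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === '"') {
      if (inQuotes && line.charAt(i + 1) === '"') {
        current += '"'; // doubled quote inside a quoted field
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (ch === ";" && !inQuotes) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Assumed column order: booking date; description; amount.
function parseBankLine(line: string): BankTransaction {
  const fields = splitSemicolonLine(line);
  if (fields.length < 3) {
    throw new Error(`Expected at least 3 columns, got ${fields.length}`);
  }
  const [dateText = "", description = "", amountText = ""] = fields;
  return {
    bookingDate: parseGermanDate(dateText),
    description: description.trim(),
    amount: parseGermanAmount(amountText),
  };
}

const exampleTransaction = parseBankLine(
  '15.03.2024;"IKEA Deutschland GmbH;Hofheim";-1.234,56'
);
```

Normalising dates to ISO 8601 and amounts to plain numbers up front means the matching step can compare bank rows and receipts directly.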

Matching Algorithm

  1. Amount match -- Exact amount, or a close amount within a tolerance of 1.00
  2. Date proximity -- Same day, within 3 days, or within 7 days
  3. Vendor name -- Partial match in transaction description

Each result includes a confidence score between 0.0 and 1.0 and the reasons for the match.
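To make the three criteria concrete, here is a minimal scoring sketch. The point weights (50/30 for amount, 30/20/10 for date, 20 for vendor), the rule of matching on the first word of the vendor name, and all type and function names are assumptions chosen for illustration; the source does not state how PaperCortex weights the criteria.

```typescript
interface ReceiptSummary {
  vendor: string;
  date: string;        // ISO 8601 date
  totalAmount: number;
}

interface BankEntry {
  bookingDate: string; // ISO 8601 date
  description: string;
  amount: number;      // negative for debits
}

interface MatchResult {
  confidence: number;  // 0.0 to 1.0
  reasons: string[];
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Date-only ISO strings parse as UTC midnight, so the result is a whole number of days.
function daysBetween(a: string, b: string): number {
  return Math.abs(Date.parse(a) - Date.parse(b)) / MS_PER_DAY;
}

function scoreMatch(receipt: ReceiptSummary, entry: BankEntry): MatchResult {
  const reasons: string[] = [];
  let points = 0; // integer points out of 100 (assumed weighting)

  // 1. Amount: debits are negative, so compare absolute values in cents.
  const diffCents = Math.abs(
    Math.round(Math.abs(entry.amount) * 100) - Math.round(receipt.totalAmount * 100)
  );
  if (diffCents === 0) {
    points += 50;
    reasons.push("exact amount");
  } else if (diffCents <= 100) {
    points += 30;
    reasons.push("amount within 1.00");
  }

  // 2. Date proximity: same day, within 3 days, or within 7 days.
  const days = daysBetween(receipt.date, entry.bookingDate);
  if (days === 0) {
    points += 30;
    reasons.push("same day");
  } else if (days <= 3) {
    points += 20;
    reasons.push("date within 3 days");
  } else if (days <= 7) {
    points += 10;
    reasons.push("date within 7 days");
  }

  // 3. Vendor: first word of the vendor name appears in the description.
  const firstWord = receipt.vendor.trim().toLowerCase().split(/\s+/)[0] ?? "";
  if (firstWord.length >= 3 && entry.description.toLowerCase().includes(firstWord)) {
    points += 20;
    reasons.push("vendor name in description");
  }

  return { confidence: points / 100, reasons };
}

const exampleMatch = scoreMatch(
  { vendor: "IKEA Deutschland GmbH", date: "2024-03-15", totalAmount: 100.0 },
  { bookingDate: "2024-03-17", description: "IKEA DT. GMBH HOFHEIM", amount: -100.0 }
);
// exampleMatch.confidence is 0.9: exact amount, date within 3 days, vendor in description
```

Returning the reasons alongside the score lets a reviewer see why a low-confidence pair was suggested instead of trusting a bare number.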

DATEV Export

Format

PaperCortex generates CSV in the DATEV Buchungsstapel (posting batch) format, which can be imported into:

  • DATEV Unternehmen Online
  • lexoffice
  • sevDesk
  • FastBill
  • Any DATEV-import-capable software
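As a rough illustration of what one posting line looks like, the sketch below formats a single receipt as a semicolon-separated row. It rests on several assumptions that should be checked against DATEV's format documentation: that the first 14 columns are Umsatz, Soll/Haben-Kennzeichen, WKZ Umsatz, Kurs, Basis-Umsatz, WKZ Basis-Umsatz, Konto, Gegenkonto, BU-Schlüssel, Belegdatum, Belegfeld 1, Belegfeld 2, Skonto, and Buchungstext; that amounts use a decimal comma; and that Belegdatum is written as DDMM. The `DatevPosting` type, the function names, and the contra account 1200 are hypothetical. A real import file also needs a header line and further columns, which are omitted here.

```typescript
interface DatevPosting {
  amount: number;        // positive gross amount
  currency: string;      // ISO 4217 code
  account: number;       // expense account, e.g. 4930
  contraAccount: number; // assumed bank account, e.g. 1200
  date: string;          // ISO 8601 date
  text: string;          // posting text, e.g. the vendor name
}

// 100 -> "100,00" (unsigned, decimal comma)
function formatDatevAmount(amount: number): string {
  return Math.abs(amount).toFixed(2).replace(".", ",");
}

// "2024-03-15" -> "1503" (day and month only)
function formatBelegdatum(isoDate: string): string {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (m === null) {
    throw new Error(`Not an ISO date: "${isoDate}"`);
  }
  return `${m[3]}${m[2]}`;
}

// Text fields are wrapped in double quotes; embedded quotes are doubled.
function quoteText(text: string): string {
  return `"${text.replace(/"/g, '""')}"`;
}

function toDatevRow(p: DatevPosting): string {
  const fields: string[] = [
    formatDatevAmount(p.amount), // Umsatz
    "S",                         // Soll/Haben-Kennzeichen (debit)
    p.currency,                  // WKZ Umsatz
    "",                          // Kurs
    "",                          // Basis-Umsatz
    "",                          // WKZ Basis-Umsatz
    String(p.account),           // Konto
    String(p.contraAccount),     // Gegenkonto
    "",                          // BU-Schlüssel
    formatBelegdatum(p.date),    // Belegdatum
    "",                          // Belegfeld 1
    "",                          // Belegfeld 2
    "",                          // Skonto
    quoteText(p.text),           // Buchungstext
  ];
  return fields.join(";");
}

const exampleRow = toDatevRow({
  amount: 100.0,
  currency: "EUR",
  account: 4930,
  contraAccount: 1200,
  date: "2024-03-15",
  text: "IKEA Deutschland GmbH",
});
```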

Account Mapping (SKR03)

| Category | Account | Description |
| --- | --- | --- |
| office_supplies | 4930 | Buerokosten |
| travel | 4660 | Reisekosten |
| food | 4650 | Bewirtungskosten |
| telephone | 4920 | Telefon |
| postage | 4910 | Porto |
| rent | 4210 | Miete |
| advertising | 4600 | Werbekosten |
| software | 4964 | Software |
| consulting | 4950 | Rechts- und Beratungskosten |
| default | 4900 | Sonstige Aufwendungen |
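The mapping above amounts to a lookup with a fallback: any category without its own account is booked to 4900. A minimal sketch, with `SKR03_ACCOUNTS`, `DEFAULT_ACCOUNT`, and `accountForCategory` as hypothetical names not taken from the PaperCortex source:

```typescript
// Category -> SKR03 account, copied from the mapping above.
const SKR03_ACCOUNTS = new Map<string, number>([
  ["office_supplies", 4930],
  ["travel", 4660],
  ["food", 4650],
  ["telephone", 4920],
  ["postage", 4910],
  ["rent", 4210],
  ["advertising", 4600],
  ["software", 4964],
  ["consulting", 4950],
]);

// "default" row of the mapping: Sonstige Aufwendungen.
const DEFAULT_ACCOUNT = 4900;

// Unknown or missing categories fall back to the default account.
function accountForCategory(category: string | undefined): number {
  if (category === undefined) {
    return DEFAULT_ACCOUNT;
  }
  return SKR03_ACCOUNTS.get(category) ?? DEFAULT_ACCOUNT;
}

const travelAccount = accountForCategory("travel");       // 4660
const unknownAccount = accountForCategory("pet_supplies"); // 4900
```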

Export via CLI

# Export all receipts from March 2024 as DATEV CSV
npm run receipt:export -- --format datev --year 2024 --month 03

Export via MCP Server

Export documents #100, #101, #102 as DATEV CSV