PaperCortex adds semantic search, auto-classification, receipt extraction, bank statement matching, and DATEV export to Paperless-ngx — powered entirely by local AI through Ollama. Exposes everything as an MCP Server for Claude Code and AI agent integration. - MCP Server with 5 tools (search, classify, receipt, query, export) - Local Ollama embeddings for semantic document search - Receipt data extraction (vendor, amount, date, tax, line items) - DATEV Buchungsstapel CSV export for German accounting - Bank CSV transaction matching - Paperless-ngx REST API client - Docker deployment - Zero cloud dependencies — 100% self-hosted
102 lines
2.6 KiB
Markdown
102 lines
2.6 KiB
Markdown
# Receipt Workflow
|
|
|
|
## Overview
|
|
|
|
PaperCortex provides a complete receipt-to-accounting pipeline:
|
|
|
|
1. **Scan** -- Upload receipts to Paperless-ngx (scan, email, photo)
|
|
2. **Extract** -- AI extracts structured data (vendor, date, amounts, line items)
|
|
3. **Match** -- Reconcile against bank CSV exports
|
|
4. **Export** -- Generate DATEV-compatible CSV for accounting software
|
|
|
|
## Receipt Extraction
|
|
|
|
### Via MCP Server (Claude Code)
|
|
|
|
```
|
|
Extract receipt data from document #1234
|
|
```
|
|
|
|
### Via CLI
|
|
|
|
```bash
|
|
npm run receipt:extract -- --document-id 1234
|
|
```
|
|
|
|
### Extracted Fields
|
|
|
|
| Field | Description | Example |
|
|
|---|---|---|
|
|
| vendor | Company name | "IKEA Deutschland GmbH" |
|
|
| vendorAddress | Full address | "Am Wanderweg 1, 65719 Hofheim" |
|
|
| vendorTaxId | Tax ID / VAT number | "DE 129 341 800" |
|
|
| date | Receipt date | "2024-03-15" |
|
|
| currency | ISO 4217 code | "EUR" |
|
|
| subtotal | Before tax | 84.03 |
|
|
| taxRate | Tax percentage | 19 |
|
|
| taxAmount | Tax amount | 15.97 |
|
|
| totalAmount | Total with tax | 100.00 |
|
|
| paymentMethod | How it was paid | "card" |
|
|
| lineItems | Individual items | Array of items |
|
|
| category | Expense category | "office_supplies" |
|
|
|
|
## Bank Statement Matching
|
|
|
|
Match receipts against bank CSV exports to verify which receipts correspond to which bank transactions.
|
|
|
|
### Supported Bank Formats
|
|
|
|
- Sparkasse (semicolon-separated, German format)
|
|
- ING (semicolon-separated)
|
|
- DKB (semicolon-separated)
|
|
- Volksbank (semicolon-separated)
|
|
- Generic CSV
|
|
|
|
### Matching Algorithm
|
|
|
|
1. **Amount match** -- Exact or close amount (within 1.00 tolerance)
|
|
2. **Date proximity** -- Same day, within 3 days, or within 7 days
|
|
3. **Vendor name** -- Partial match in transaction description
|
|
|
|
Results include a confidence score (0.0 - 1.0) and match reasons.
|
|
|
|
## DATEV Export
|
|
|
|
### Format
|
|
|
|
PaperCortex generates DATEV Buchungsstapel (posting batch) format CSV, compatible with:
|
|
|
|
- DATEV Unternehmen Online
|
|
- lexoffice
|
|
- sevDesk
|
|
- FastBill
|
|
- Any DATEV-import-capable software
|
|
|
|
### Account Mapping (SKR03)
|
|
|
|
| Category | Account | Description |
|
|
|---|---|---|
|
|
| office_supplies | 4930 | Buerokosten |
|
|
| travel | 4660 | Reisekosten |
|
|
| food | 4650 | Bewirtungskosten |
|
|
| telephone | 4920 | Telefon |
|
|
| postage | 4910 | Porto |
|
|
| rent | 4210 | Miete |
|
|
| advertising | 4600 | Werbekosten |
|
|
| software | 4964 | Software |
|
|
| consulting | 4950 | Rechts- und Beratungskosten |
|
|
| default | 4900 | Sonstige Aufwendungen |
|
|
|
|
### Export via CLI
|
|
|
|
```bash
|
|
# Export all receipts from March 2024 as DATEV CSV
|
|
npm run receipt:export -- --format datev --year 2024 --month 03
|
|
```
|
|
|
|
### Export via MCP Server
|
|
|
|
```
|
|
Export documents #100, #101, #102 as DATEV CSV
|
|
```
|