# Bookkeeper Module — Design & Change Log

All notable changes to the Booker module (`booker.lukaslabs.ai` / `/var/www/html/bookkeeper/`).

---

## Phase 1 — Core Infrastructure

### Schema (`bk_001_schema.sql`)
- Created 11 tables: `bk_accounts`, `bk_statement_batches`, `bk_uploaded_documents`, `bk_statements`, `bk_transactions`, `bk_categories`, `bk_transaction_overrides`, `bk_review_queue`, `bk_pl_reports`, `bk_jobs`, `bk_audit_events`
- `ALTER TABLE users ADD COLUMN is_bookkeeper TINYINT NOT NULL DEFAULT 0`
- `duplicate_key` on `bk_transactions` is a non-unique INDEX — duplicates are inserted but flagged, never silently dropped

### Helpers (`inc/helpers.php`)
- `bk_require_auth()` — checks session `uid` + `is_bookkeeper=1`; redirects unauthenticated users to `https://frontdeskai.lukaslabs.ai/portal/`
- `bk_audit()` — writes to `bk_audit_events`
- `bk_queue_job()` / `bk_claim_job()` — atomic job queue; `bk_claim_job` uses SELECT then UPDATE WHERE status IN ('queued','failed') — race-safe without transactions
- `bk_job_done()` / `bk_job_fail()` — job lifecycle
- `bk_normalize_merchant()` — strips long numbers, punctuation, lowercases, trims to 80 chars; used as duplicate key component
- `bk_duplicate_key()` — `sha256(account_id|date|abs(amount)|normalized_merchant)`
- `bk_is_transfer_candidate()` — keyword list: transfer, xfer, zelle, venmo, autopay, bill pay, credit card payment, etc.
- `bk_parse_date()` — **BUG-001 FIX**: 2-digit year formats (`m/d/y`, `n/j/y`) placed BEFORE 4-digit equivalents. `DateTime::createFromFormat('m/d/Y', '01/15/24')` returns year 0024 — guarded with 1990–2099 range check
- `bk_parse_amount()` — handles `$`, commas, parentheses for negatives

### CSV Parser (`cron/parse-csv.php`)
- Runs every minute via cron
- Header synonym map covers: date/post_date, description, amount, debit, credit, balance, type — with many column name variants
- **BUG-002 FIX**: `bk_resolve_amount()` takes `$credit_card_mode` param. Credit cards: positive = debit (charge), negative = credit. Banks: positive = credit (deposit), negative = debit
- **BUG-004 FIX**: After marking a document `done`, checks if all sibling docs in batch are done → sets `bk_statement_batches.status = 'ready'`
- Queues `classify_transactions` job after successful parse

### Upload API (`api/upload.php`)
- Loose MIME check (extension OR MIME type), 5 MB max
- **BUG-005 FIX**: Reads `start_balance` / `end_balance` from POST; stores in `bk_statements` for reconciliation
- Overlap detection: warns but does not block
- Creates batch → document → statement → queues `parse_csv` job

### Supersede API (`api/supersede.php`)
- POST: mark an old statement superseded by a new one
- Sets old statement `is_active=0, status='superseded', superseded_by_id=new`
- Bulk-sets old transactions: `from_superseded=1, is_excluded_from_pl=1`
- **BUG-006 FIX**: Supersession is a deliberate user action, not automatic — prevents double-counting in P&L

### Accounts API (`api/accounts.php`)
- POST: create account (name, account_type, institution_name, last_four, base_currency)
- DELETE: soft-deactivate account (verify ownership first)
- `account_type` allowlist: `['checking','savings','credit_card','other']`

### Job Status API (`api/job-status.php`)
- GET: poll job status, parse_status, failure_reason, statement_status, reconciliation_status

### Bug Fixes (Phase 1)
| # | Bug | Fix |
|---|-----|-----|
| 001 | `bk_parse_date` returned year 0024 for 2-digit year inputs | Reordered format array; added 1990–2099 guard |
| 002 | Credit card sign convention wrong | Added `$credit_card_mode` param queried from `bk_accounts.account_type` |
| 003 | Storage dir owned by azureuser, Apache (www-data) couldn't write | `chmod -R 777 /var/www/html/bookkeeper/storage/` |
| 004 | Batch never marked `ready` after parsing | Check remaining unprocessed docs after each doc completes |
| 005 | Reconciliation always `not_started` | Added opening/closing balance fields to upload form and API |
| 006 | Supersession detection-only — old statement stayed active | Built `api/supersede.php` as a deliberate user action |

---

## FrontDeskAI Portal Integration

### Admin panel (`/frontdeskai/admin/client.php`)
- Added `toggle_user_flag` POST handler — validated against allowlist `['is_bookkeeper']`
- Added **Platform Access** card between Feature Flags and Usage History
- Toggle uses existing `fd-toggle` CSS

### Client portal (`/frontdeskai/portal/index.php`)
- Added `$has_bookkeeper` — queried live from DB on every page load (not from session), so admin toggle takes effect immediately without re-login
- Added `bookkeeper` to `$valid_tabs`
- Bookkeeper tab: active button if `$has_bookkeeper`, locked/greyed-out with `🔒` if not
- `#tab-bookkeeper` pane: stats (batch count, review count), recent batches table, **Open Bookkeeper →** button linking to `https://booker.lukaslabs.ai/portal.php`
- Pre-existing bug fix: `$has_email` was only set inside `if ($client)` but used outside it — added to variable initialization block

---

## URL Architecture
| Infrastructure URL | Public-facing URL | Doc Root |
|---|---|---|
| `frontdeskai.lukaslabs.ai` | `lukas.lukaslabs.ai` | `/var/www/html/frontdeskai/` |
| `bookkeeper.lukaslabs.ai` | `booker.lukaslabs.ai` | `/var/www/html/bookkeeper/` |

- All public-facing links in portal use `https://booker.lukaslabs.ai/`
- `← Portal` back-link goes to `https://frontdeskai.lukaslabs.ai/portal/`
- Unauthenticated redirect in `bk_require_auth()` goes to `https://frontdeskai.lukaslabs.ai/portal/`
- Session sharing works because FD portal sets `session.cookie_domain = .lukaslabs.ai`

---

## Phase 2 — Classification & Review

### Categories seed (`bk_002_categories.sql`)
- 20 system categories seeded into `bk_categories` (client_id=NULL = available to all):
  - Income (3): Sales Revenue, Service Revenue, Other Income
  - COGS (3): Cost of Goods Sold, Materials & Supplies, Fulfillment & Shipping
  - Expense (14): Advertising & Marketing, Bank Fees, Education & Training, Equipment & Hardware, Insurance, Meals & Entertainment, Office Supplies, Payroll & Wages, Professional Services, Rent & Utilities, Software & Subscriptions, Taxes & Licenses, Travel, Miscellaneous

### Merchant Rules table
- `bk_merchant_rules (id, client_id, merchant_normalized, category_id, source_description, created_at, updated_at)`
- UNIQUE KEY on `(client_id, merchant_normalized)` — one rule per normalized merchant name per client
- Used by classify worker to skip GPT for known merchants
- Populated when user saves an override with "Remember for future imports" checked

### Classifier (`cron/classify-transactions.php`)
- Runs every minute via cron
- Priority 1: apply saved `bk_merchant_rules` — known merchants skip GPT, get `auto_classified` immediately, any pending review queue entry resolved
- Priority 2: batch remaining unclassified transactions (up to 30 per GPT call)
- GPT-4o called with structured JSON prompt; response_format=json_object; returns array of `{category, type, confidence, rationale, tax}`
- Confidence ≥ 0.75 + not duplicate/transfer → `auto_classified`
- Confidence < 0.75 → `needs_review`, added to `bk_review_queue` with reason `low_confidence`
- Fallback: unrecognized GPT category name → `Miscellaneous`
- Tax treatment derived from GPT response: `likely_business`, `partially_business`, `likely_personal`, `unknown`

### Review API (`api/review.php`)
**Single actions** (require `rq_id`):
- `approve` — accept AI suggestion, mark `reviewed`
- `override` — set new category; supports `apply_to_all` (resolves all matching merchant+reason in one call) and `save_rule` (saves to `bk_merchant_rules`)
- `confirm_duplicate` — set `is_excluded_from_pl=1`, mark `reviewed`
- `dismiss_duplicate` — clear `is_duplicate`, mark `reviewed`
- `confirm_transfer` — set `is_transfer=1, is_excluded_from_pl=1, transaction_type='transfer'`, mark `reviewed`
- `dismiss_transfer` — clear `is_transfer, is_excluded_from_pl`, mark `reviewed`

**Bulk actions** (keyed by `merchant_normalized` + `reason`):
- `bulk_confirm_duplicate`, `bulk_dismiss_duplicate`, `bulk_confirm_transfer`, `bulk_dismiss_transfer`
- Finds all pending queue items matching merchant+reason, applies action to all, returns `resolved_rq_ids` array

**Tax treatment on override** (`derive_tax_treatment()`):
- Meals & Entertainment / Travel → `partially_business`
- All other expense/cogs/income → `likely_business`
- Transfer → `unknown`

All actions log to `bk_transaction_overrides` (audit trail): `field_name`, `old_value`, `new_value`, `note`

### Transactions API (`api/transactions.php`)
- `action=update_note` — saves up to 500-char note on a transaction; verifies ownership

### Portal — Transactions Tab
- Default date range: last 30 days (From/To date pickers in filter bar)
- Filters: From, To, Account, Category, Status (needs_review / classified / reviewed / overridden), Type (debit/credit)
- "Clear filters ×" resets to 30-day default
- **Charts** (Chart.js 4.4 from CDN) above the filter bar:
  - Left: Horizontal bar — expenses by category, last 30 days
  - Right: Doughnut — tax deductibility breakdown (green=Business, amber=Partial, red=Personal, grey=Unclassified)
- **Notes column** — click any cell to edit inline; textarea with Save/Cancel; saves to DB without page reload
- Excluded transactions shown at 45% opacity with `excluded` pill
- Pagination: 50 per page
- Tax column uses `suggested_tax_treatment` (not `transaction_type`) — fixed after Phase 2

### Portal — Review Tab
- Items ordered by `description_normalized` so same merchant groups together
- Each card shows: reason badge, date, merchant, amount, account, AI suggestion + confidence + rationale
- `similar_pending` count shown per item (subquery on description_normalized + reason)
- **Apply to all** button: resolves all matching merchant transactions in one POST call
- Override panel: grouped category `<select>` (Income / COGS / Expense optgroups), "Apply to all N transactions" checkbox (pre-checked if similar exist), "Remember for future imports" checkbox (pre-checked)
- All actions trigger `location.reload()` on success — no stale card state

---

## Phase 2 — Reports & Storage

### Azure Blob Storage (`inc/azure-blob.php`)
- Account: `barkartcomedia` / Container: `book-reports` (private)
- `bk_blob_put(blobPath, content, contentType)` — uploads string content, returns URL
- `bk_blob_get(blobPath)` — fetches blob content as string, returns null if 404
- Signing: SharedKey HMAC-SHA256, API version 2020-04-08
- Blob paths: `{client_id}/{YYYY-MM}_{label-slug}.html`

### `bk_pl_reports` schema additions
- `blob_url VARCHAR(500)` — full Azure Blob URL (not publicly accessible)
- `blob_path VARCHAR(300)` — relative path within container for re-fetching

### Report Generator (`api/generate-report.php`)
- POST: `{ period_start, period_end, label }`
- Queries income, COGS, expenses, all transactions for period
- Calculates: gross profit, net income, estimated tax deductible (business expenses + 50% of partial)
- Generates self-contained HTML report (no external dependencies)
- Uploads to Azure Blob → stores record in `bk_pl_reports`
- Re-generating same period overwrites existing record
- Warns in report if unresolved review items exist in the period
- Logs to `bk_audit_events`

### Report Viewer (`api/view-report.php`)
- GET `?id=N` — auth-gated; fetches blob from Azure and streams as `text/html`
- Reports are never publicly accessible — always proxied through PHP auth check

### Report HTML contents
1. Header: label, period, generated datetime, transaction count
2. Warning banner (if unresolved items)
3. Summary cards: Total Income / Total Expenses / Net Income / Est. Tax Deductible
4. P&L table: Income by category → COGS → Gross Profit → Expenses by category (with tax treatment pills) → Net Income → Est. Deductible line
5. Page break → Full transaction list (all transactions, including excluded, with notes column)
6. Footer disclaimer

### Portal — Reports Tab
- Generate form: label, date range; label auto-fills from end date
- Saved reports table: label, period, income, expenses, net, generated date, View link
- View → opens report in new tab via `/api/view-report.php?id=N`

---

---

## Phase 3 — Export, Trend, Merchant Rules, Purge

### CSV Export (`api/export.php`)
- GET with same filter params as Transactions tab: `from`, `to`, `acct`, `cat`, `st`, `dir`
- Streams as `text/csv` with UTF-8 BOM (Excel-compatible)
- Columns: Date, Description, Merchant, Account, Direction, Amount (signed), Category, Tax Treatment, Status, Excluded from P&L, Notes
- **Export CSV** button in top-right of transactions table, preserves active filters

### Monthly P&L Trend Chart (Reports tab)
- Grouped bar chart (Chart.js) above the generate form
- Last 12 months of data: green bars = income, red bars = expenses
- Queries `bk_transactions` grouped by `DATE_FORMAT(transaction_date, '%Y-%m')`
- Excluded transactions (`is_excluded_from_pl=1`) omitted from both income and expense bars
- Only renders if there is data (no empty chart)

### Merchant Rules Management (Accounts tab)
- New **Merchant Rules** section below the accounts list
- Table: normalized merchant key, original description, assigned category (pill), date saved
- **Delete** button per rule — calls `api/merchant-rules.php` DELETE
- After deletion, future imports of that merchant go back through AI classification
- `api/merchant-rules.php` — DELETE endpoint, verifies client ownership before deleting

### DB Purge Cron (`cron/purge-transactions.php`)
- Runs 1st of each month at 3am: `0 3 1 * * php ...`
- Safety rules (all must pass before any row is deleted):
  1. Transaction must be older than 60 days
  2. A `bk_pl_reports` row with `status='final'` AND `blob_path IS NOT NULL` must fully cover the calendar month (`period_start <= month_start` AND `period_end >= month_end`)
  3. No partial-month purge — whole month or nothing
- Orphaned `bk_review_queue` rows (no matching transaction) are also cleaned up
- All purges logged to `/var/www/html/bookkeeper/storage/purge.log` and `bk_audit_events`

### Cron schedule summary
| Schedule | Script | Purpose |
|---|---|---|
| `* * * * *` | `cron/parse-csv.php` | Parse uploaded CSV files |
| `* * * * *` | `cron/classify-transactions.php` | AI classify new transactions |
| `0 3 1 * *` | `cron/purge-transactions.php` | Purge old transactions covered by reports |

---

## Planned / Future

- **Phase 3**: P&L by month chart, year-over-year comparison, CSV export
- **Merchant rule learning UI**: review and manage saved rules
- **DB purge**: delete `bk_transactions` older than 60 days for periods with a saved report in blob storage (report is the source of truth)
- **Lukas Bot support widget**: embed on bookkeeper portal as inline support expert (placeholder comment in portal.php)
- **Custom categories**: per-client category additions on top of system defaults
