
Paydirt

The Machine Data Insights Pipeline

Assess: CAT | Sanitize: Paydirt | Build: Data Refinery

A free, open-source tool that scrubs CUI, PII, PHI, and credentials from Splunk or other log data exports so they can be safely shared and mined for value. Drop a file, get a sanitized version - no install, no network calls, nothing leaves your machine.

Want the details? See the README on GitHub to learn more about Paydirt and how to set up your own custom scrubbing config.

[Screenshot: Paydirt comparison view, with original and scrubbed log output side by side and per-field highlighting]

There's Gold in That Data!®

What Paydirt redacts out of the box.

Network & identity

IPv4/IPv6 addresses, AWS ip- hostnames, emails, FQDNs, UNC paths, domain usernames, MAC addresses.

Credentials & tokens

PEM private keys, AWS keys, GitHub PATs, Slack & Stripe tokens, JWTs, Google API keys, Authorization headers, URL query-string secrets.

PII / PHI

SSNs (valid ranges), Luhn-validated credit cards, NPIs (45 CFR 162.406), US phone numbers, Windows user SIDs.
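To illustrate what "Luhn-validated" means for card numbers, here is a minimal sketch of the standard Luhn checksum. The function name `luhn_valid` and the 13 to 19 digit length bounds are assumptions made for this example; this is not taken from Paydirt's source.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum.

    Hypothetical helper for illustration; not Paydirt's actual code.
    """
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    # Assumed length bounds for payment card numbers.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_valid("1234 5678 9012 3456"))  # arbitrary 16-digit string
```

A 16-digit order ID that fails the checksum would be left alone, which is the point of validating before redacting.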

CUI markings (NIST SP 800-171)

Banner, portion, and legacy markings (CUI, FOUO, SBU, NOFORN…) plus ITAR / EAR / DD 254 flags - full-value redaction with metadata-only placeholders.

Whatever else you tell it to

Text substitution, JSON field targeting at any depth, cloud tag structures, and random replacement pools via a simple CSV config.
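As a rough picture of "JSON field targeting at any depth", the sketch below walks a parsed JSON value recursively and replaces the value of any key in a target set. The names `redact_fields` and `targets` and the `[REDACTED]` placeholder are hypothetical; Paydirt's real CSV config format is documented in its README and is not reproduced here.

```python
import json


def redact_fields(node, targets, placeholder="[REDACTED]"):
    """Return a copy of `node` with every targeted key's value replaced.

    Illustrative only: key matching here is exact and case-sensitive,
    which is an assumption, not a statement about Paydirt's behavior.
    """
    if isinstance(node, dict):
        return {
            key: placeholder if key in targets
            else redact_fields(value, targets, placeholder)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [redact_fields(item, targets, placeholder) for item in node]
    return node  # scalars pass through unchanged


event = json.loads(
    '{"user": {"name": "alice", "session": {"token": "abc123"}},'
    ' "items": [{"token": "xyz", "id": 1}]}'
)
print(json.dumps(redact_fields(event, {"token"})))
```

Because the walk recurses through both objects and arrays, a `token` key is caught whether it sits at the top level or several layers down.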

Validated, not naive

Validators run inside the matchers, so ordinary order IDs, tracking numbers, and timestamps aren't mistaken for SSNs, cards, or NPIs.
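One way to picture a validator running inside a matcher is a regex substitution whose replacement callback checks each candidate before redacting it. The sketch below applies the publicly documented SSN structure rules (area not 000, 666, or 900 to 999; group not 00; serial not 0000). The names `ssn_plausible` and `redact_ssns` and the `[SSN]` placeholder are assumptions for illustration, not Paydirt's implementation.

```python
import re

# Candidate pattern: three digits, two digits, four digits, hyphen-separated.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")


def ssn_plausible(area: str, group: str, serial: str) -> bool:
    """Reject digit groups that can never appear in an issued SSN."""
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def redact_ssns(text: str, placeholder: str = "[SSN]") -> str:
    """Redact only the candidates that pass the range validator."""
    def replace(match):
        if ssn_plausible(*match.groups()):
            return placeholder
        return match.group(0)  # leave look-alike IDs untouched
    return SSN_PATTERN.sub(replace, text)


line = "ssn=123-45-6789 order=000-12-3456 ref=987-65-4321"
print(redact_ssns(line))
```

The pattern alone would flag all three values; the validator narrows that to the one that could actually be an SSN.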

Built for regulated data.

CMMC: CUI marking detection per NIST SP 800-171.
HIPAA: PHI identifiers: SSN, payment cards, NPI, phone.
GDPR: Personal data including IPs and device identifiers.

Run it however your environment allows.

Browser tool

Paydirt.html

Download the file, double-click, drop logs on it. Pure HTML/CSS/JS, runs entirely offline. Ideal for locked-down or air-gapped environments where you can't install Python.

Python CLI

log_scrubber.py

A CLI and importable library for automation, batch processing, and pipeline integration. Python 3.9+, standard library only.

Both use the same configuration format and produce identical output for identical input - verified by an automated parity test.

Free & open source. Yours to keep.

Paydirt is the sanitize stage of the Machine Data Insights pipeline - assess with CAT, sanitize with Paydirt, build with Data Refinery.

Free & Open Source | Runs Offline | No Install | CMMC / HIPAA / GDPR aware