Commit Graph

67 Commits

Author SHA1 Message Date
Justin Bollinger f0bba73225 fix: auto-detect training device instead of defaulting to CUDA
The PassGPT training device menu now uses _detect_device() to default
to the best available device (CUDA > MPS > CPU) rather than always
defaulting to CUDA, which fails on systems without NVIDIA GPUs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:47:41 -05:00
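The device priority described in f0bba73225 (CUDA > MPS > CPU) could be sketched roughly as below. The body of `_detect_device()` is not shown in the log, so the function name `detect_device` and the CPU fallback when `torch` is missing are assumptions for illustration.

```python
def detect_device() -> str:
    """Return the best available device name: "cuda", then "mps", then "cpu"."""
    try:
        import torch  # optional dependency; may be absent without the ML extras
    except ImportError:
        return "cpu"  # assumption: no torch means CPU-only
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is missing on older PyTorch builds, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_device())
```

Using the result as the menu default means a system without an NVIDIA GPU is never offered CUDA first.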
Justin Bollinger c0d2cad2c1 fix: skip ML-dependent tests in CI and mock version in version check test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:52:16 -05:00
Justin Bollinger b6524cbdc4 feat: add training time estimates and device selection to PassGPT menu
Show estimated training times for CUDA/MPS/CPU before starting a
training run. Add device selection prompt with cuda as the default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:27:09 -05:00
Justin Bollinger fcfe6890f6 feat: add memory pre-checks and optimize PassGPT training for large wordlists
Training previously loaded entire wordlists into RAM and tokenized all at
once, causing OOM on large files like rockyou.txt. This adds memory
estimation, lazy dataset loading, and training optimizations.

- Add _get_available_memory_mb() for cross-platform RAM detection
- Add _estimate_training_memory_mb() to predict peak usage before loading
- Replace bulk tokenization with LazyPasswordDataset (file offset index + on-the-fly tokenization)
- Add --max-lines flag to limit training to first N lines
- Add --memory-limit flag to auto-tune --max-lines based on available RAM
- Enable gradient checkpointing and gradient accumulation (steps=4)
- Enable fp16 on CUDA devices

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:47:44 -05:00
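The "file offset index + on-the-fly" idea from fcfe6890f6 can be illustrated with a minimal sketch. This is not the project's `LazyPasswordDataset`; the class name `LazyLineDataset` and its behavior (skipping blank lines, counting only non-blank lines toward `max_lines`) are assumptions. A real training dataset would tokenize the returned string inside `__getitem__`.

```python
import os
import tempfile


class LazyLineDataset:
    """Index line start offsets once, then read single lines on demand."""

    def __init__(self, path, max_lines=None):
        self.path = path
        self.offsets = []
        with open(path, "rb") as fh:
            # Only byte offsets are kept in memory, never the lines themselves.
            while max_lines is None or len(self.offsets) < max_lines:
                pos = fh.tell()
                line = fh.readline()
                if not line:
                    break  # end of file
                if line.strip():  # skip blank lines
                    self.offsets.append(pos)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, idx):
        with open(self.path, "rb") as fh:
            fh.seek(self.offsets[idx])
            raw = fh.readline()
        return raw.rstrip(b"\r\n").decode("utf-8", errors="replace")


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"password1\nhunter2\n\nletmein\n")
ds = LazyLineDataset(tmp.name, max_lines=2)
first_two = [ds[i] for i in range(len(ds))]
os.unlink(tmp.name)
print(first_two)
```

Memory then grows with the number of indexed lines (one integer each) rather than with the size of the wordlist.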
Justin Bollinger 56aaa9b47d feat: add PassGPT model fine-tuning and training menu integration
Add ability to fine-tune PassGPT models on custom password wordlists.
Models save locally to ~/.hate_crack/passgpt/ with no data uploaded to
HuggingFace (push_to_hub=False, HF_HUB_DISABLE_TELEMETRY=1). The
PassGPT menu now shows available models (default + local fine-tuned)
and a training option. Adds datasets to [ml] deps and passgptTrainingList
config key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:51:06 -05:00
Justin Bollinger 4a7f0724d9 feat: add startup version check, fix PassGPT MPS/output issues, hide menu without ML deps
- Add optional startup version check against GitHub releases (check_for_updates config option)
- Add packaging dependency for version comparison
- Fix PassGPT OOM on MPS by capping batch size to 64 and setting memory watermark limits
- Fix PassGPT output having spaces between every character
- Hide PassGPT menu item (17) unless torch/transformers are installed
- Fix mypy errors in passgpt_generate.py with type: ignore comments
- Update README with version check docs, optional ML deps section, and PassGPT CLI options
- Add test_version_check.py with 8 tests covering update check behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:32:40 -05:00
Justin Bollinger 87535b9828 feat: add PassGPT attack (#17) - GPT-2 based ML password generator
Add PassGPT as attack mode 17, using a GPT-2 model trained on leaked
password datasets to generate candidate passwords. The generator pipes
candidates to hashcat via stdin, matching the existing OMEN pipe pattern.

- Add standalone generator module (python -m hate_crack.passgpt_generate)
- Add [ml] optional dependency group (torch, transformers)
- Add config keys: passgptModel, passgptMaxCandidates, passgptBatchSize
- Wire up menu entries in main.py, attacks.py, and hate_crack.py
- Auto-detect GPU (CUDA/MPS) with CPU fallback
- Add unit tests for pipe construction, handler, and ML deps check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 08:41:22 -05:00
Justin Bollinger 0991701024 feat: add OMEN attack as menu option 16
Add OMEN (Ordered Markov ENumerator) as a probability-ordered password
candidate generator. Trains n-gram models on leaked passwords via
createNG, then pipes candidates from enumNG into hashcat.

Also fix a pre-existing bug where ensure_binary() used quit(1) instead
of sys.exit(1) - quit() closes stdin before raising SystemExit, which
caused "ValueError: I/O operation on closed file" when any optional
binary check failed and the program continued to use input().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:01:58 -05:00
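The `quit(1)` versus `sys.exit(1)` fix in 0991701024 comes down to `sys.exit()` raising `SystemExit` without closing `sys.stdin` first. A minimal sketch follows, assuming a hypothetical helper `ensure_binary_exists` that only checks for a file; the real `ensure_binary()` signature and logic are not shown in the log.

```python
import os
import sys


def ensure_binary_exists(path: str) -> str:
    """Return `path` if it is an existing file, otherwise exit with status 1."""
    if not os.path.isfile(path):
        print(f"Error: required binary not found: {path}", file=sys.stderr)
        sys.exit(1)  # raises SystemExit; sys.stdin is left open
    return path


# A caller that treats the binary as optional can catch SystemExit and
# keep going; stdin is still usable for later input() calls.
try:
    ensure_binary_exists("/nonexistent/dir/enumNG")
except SystemExit as exc:
    exit_code = exc.code

print(exit_code)
```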
Justin Bollinger 97997daf15 feat: add computer account filtering for NetNTLM hash types (5500/5600)
Reuses existing _count_computer_accounts() and _filter_computer_accounts()
to optionally strip computer accounts before NetNTLM deduplication.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 13:23:36 -05:00
Justin Bollinger 4ae7a2b94e test: add E2E preprocessing flow tests for computer account filtering
Add TestE2EPreprocessingFlow class that simulates the exact main()
preprocessing logic (format detection, filtering, NT/LM extraction)
with realistic secretsdump.py output. Covers: filter accept/decline,
no computers, all computers, LM hash detection, domain\computer$
format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 13:10:38 -05:00
Justin Bollinger 73bb6cf596 fix: QA hardening for NTLM preprocessing
- Catch PermissionError/OSError in file operations (not just FileNotFoundError)
- Refactor _dedup_netntlm_by_username to two-pass streaming (memory safe)
- Handle CRLF line endings in filter and dedup functions
- Add KeyboardInterrupt handling with temp file cleanup during preprocessing
- Track .filtered/.dedup temp files for cleanup on interruption
- Add CRLF line ending tests for both filter and dedup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:59:09 -05:00
Justin Bollinger 26cd21af16 test: add comprehensive pwdump filter pipeline and edge case tests
Add TestWriteFieldSortedUnique (7 tests) and TestPwdumpFilterPipeline
(8 tests) covering the full filter -> extract NT/LM pipeline. Add
edge case tests for unicode, BOM, long lines, permissions, and
delimiter handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:53:20 -05:00
Justin Bollinger 53d42fbe96 fix: address QA review issues for NTLM preprocessing (#27, #28)
- Add type hints to _filter_computer_accounts and _dedup_netntlm_by_username
- Fix unclosed file handle when reading hash file for format detection
- Extract _count_computer_accounts helper to eliminate duplicate file reads
- Stream _filter_computer_accounts output instead of collecting in memory
- Only write .dedup file when duplicates actually exist
- Add tests for _count_computer_accounts, malformed lines, and no-file-on-zero-dupes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:46:48 -05:00
Justin Bollinger e417e3d928 feat: add computer account filtering and NetNTLM dedup
- Detect and optionally filter Windows computer accounts (username
  ending with $) from pwdump-format NTLM hash files (type 1000)
- Detect and optionally deduplicate NetNTLMv1/v2 hashes (types
  5500/5600) by username, keeping first occurrence
- Add 10 tests covering both features

Fixes #27
Fixes #28

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:43:22 -05:00
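The two features in e417e3d928 might look roughly like the sketch below. The function names are hypothetical, the case-insensitive username comparison is an assumption, and this version keeps a set of seen usernames in a single pass rather than the two-pass streaming approach introduced later in 73bb6cf596. The NetNTLM sample fields are made-up placeholders.

```python
def is_computer_account(pwdump_line: str) -> bool:
    """True when the username field (before the first colon) ends with '$'."""
    username = pwdump_line.split(":", 1)[0]
    return username.endswith("$")


def filter_computer_accounts(lines):
    """Yield pwdump lines whose username is not a computer account."""
    for line in lines:
        line = line.rstrip("\r\n")  # tolerate CRLF input
        if line and not is_computer_account(line):
            yield line


def dedup_by_username(lines):
    """Yield NetNTLM lines, keeping the first occurrence of each username."""
    seen = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        username = line.split(":", 1)[0].lower()  # assumption: case-insensitive
        if username not in seen:
            seen.add(username)
            yield line


pwdump = [
    "alice:1001:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::\r\n",
    "CORP\\WS01$:1002:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::\n",
]
kept = list(filter_computer_accounts(pwdump))

netntlm = [
    "alice::CORP:1122334455667788:aaaa:bbbb\n",
    "bob::CORP:1122334455667788:cccc:dddd\n",
    "alice::CORP:8877665544332211:eeee:ffff\n",
]
unique = list(dedup_by_username(netntlm))
print(len(kept), len(unique))
```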
Justin Bollinger 7a56c7f506 fix: resolve Hashview wordlist downloads to configured directory
download_wordlist() now resolves relative filenames against
get_hcat_wordlists_dir() instead of saving to cwd. Also ensures
the parent directory exists before writing.

Fixes #70

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 11:05:23 -05:00
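The path handling described in 7a56c7f506 could be sketched as follows. `resolve_download_path` is a hypothetical stand-in; the actual `download_wordlist()` and `get_hcat_wordlists_dir()` implementations are not shown in the log, so the directory is passed in as a plain argument here.

```python
from pathlib import Path
import tempfile


def resolve_download_path(filename: str, wordlists_dir: str) -> Path:
    """Resolve a relative filename against wordlists_dir; keep absolute paths."""
    target = Path(filename)
    if not target.is_absolute():
        target = Path(wordlists_dir) / target
    # Make sure the parent directory exists before anything is written.
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as base:
    out = resolve_download_path("custom/rockyou.txt", base)
    parent_exists = out.parent.is_dir()
    inside_base = Path(base) in out.parents

print(parent_exists, inside_base)
```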
Justin Bollinger 80857d03c6 fix: add missing rulesDirectory mock in ollama 404 retry test
The test built its own mock context instead of using the shared
ollama_globals helper, missing the rulesDirectory and hcatPotfilePath
patches. This caused FileNotFoundError on CI where /path/to/hashcat/rules
doesn't exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:25:48 -05:00
Justin Bollinger abe8f2ae73 fix: resolve CI test failures in ollama and hashview tests
- Mock rulesDirectory in ollama test fixture so hcatOllama doesn't
  fail with FileNotFoundError on CI where /path/to/hashcat/rules
  doesn't exist
- Mock potfile path in hashview auto-merge test so found file cleanup
  isn't blocked by missing ~/.hashcat directory
- Update pre-push hook to match CI env vars (HATE_CRACK_SKIP_INIT=1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:15:48 -05:00
Justin Bollinger f6a6e508ee fix: update ollama tests to match refactored target-only handler
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:01:33 -05:00
Justin Bollinger 164a17003c refactor: use cracked .out file as sole wordlist source for Ollama attack
Remove ollamaWordlist config key and all references. Wordlist mode now
requires the cracked hashes .out file to exist and extracts passwords
by splitting on the first colon. Detect Ollama refusal responses and
abort gracefully. Update tests accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 20:04:11 -05:00
Justin Bollinger 1035287d4e feat: send full wordlist to Ollama with configurable num_ctx
Remove 500-line wordlist cap and send the entire file to Ollama.
Add ollamaNumCtx config key (default 32768) to control the context
window size. Invert wordlist prompt to default-yes, remove unused
ollamaCandidateCount config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 19:33:23 -05:00
Justin Bollinger 88d786d9aa refactor: rename Markov LLM attack to Ollama attack and simplify interface
Rename markov_attack → ollama_attack and hcatMarkov → hcatOllama across
menu, attacks, and tests. Remove candidate count prompts and cracked-output
default wordlist logic. Rename config keys (markov* → ollama*) and drop
ollamaUrl. Fix Dockerfile.test to use granular build steps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 19:17:50 -05:00
Justin Bollinger 2cb54beecb fix: overhaul Hashview download flow and fix hashcat --show stderr pollution
- Merge download_left and download_found into single "Download Hashes" menu option
- Append found hash:clear pairs to potfile instead of running broken hashcat re-crack
- Append found hashes to left file so hashcat --show returns full results
- Clean up found_ temp files after merge
- Split found file on first colon (not last) to handle passwords containing colons
- Filter hashcat parse errors from --show stdout in _run_hashcat_show
- Add get_hcat_potfile_path() helper to api.py for potfile resolution
- Remove obsolete download_found_hashes API method and CLI subcommand
- Fix ollama tests to match current 4-arg hcatOllama signature and rule loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 19:11:51 -05:00
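The "split on first colon" point in 2cb54beecb can be shown with a short example. `split_found_line` is a hypothetical helper, and the sketch assumes a hash format that contains no colon itself.

```python
def split_found_line(line: str):
    """Split 'hash:plaintext' on the first colon only."""
    hash_part, sep, plain = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError("expected 'hash:plaintext'")
    return hash_part, plain


sample = "31d6cfe0d16ae931b73c59d7e0c089c0:pa:ss:word\n"
pair = split_found_line(sample)

# Splitting on the last colon instead would cut the password apart.
wrong = tuple(sample.rstrip("\n").rsplit(":", 1))

print(pair)
print(wrong)
```

With the first-colon split the password `pa:ss:word` survives intact; the last-colon split leaves `word` as the password and pushes `pa:ss` into the hash field.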
Justin Bollinger fe384641df feat: default Markov LLM wordlist to cracked hashes output
When the cracked hashes output file (.out) exists, use it as the default
wordlist for the LLM Markov attack instead of the generic markovWordlist
config. This makes the attack learn from already-cracked passwords for
the current engagement, falling back to config when no cracked output
exists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 16:01:44 -05:00
Justin Bollinger 371fca1228 feat: add LLM Markov Attack (menu option 15)
Add a new attack mode that uses a local LLM via Ollama to generate
password candidates, converts them into hashcat .hcstat2 Markov
statistics via hcstat2gen, and runs a Markov-enhanced mask attack.

Two generation sub-modes:
- Wordlist-based: feeds sample from an existing wordlist to the LLM
  as pattern context (config-selectable default with Y/N override)
- Target-based: prompts for company name, industry, and location
  for contextual password generation

Pipeline: Ollama API -> candidate file -> hcstat2gen -> LZMA compress
-> hashcat -a 3 --markov-hcstat2

Config additions: ollamaUrl, ollamaModel, markovCandidateCount,
markovWordlist. No new pip dependencies (uses stdlib urllib/lzma).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:13:39 -05:00
Justin Bollinger 55b7f0fc62 fix: separate hcatPath (hashcat dir) from hate_path (asset dir)
hcatPath now exclusively points to the hashcat install directory and is
auto-discovered from PATH when not configured. hate_path is resolved
from the package directory (installed) or repo root (development) with
no auto-discovery. Extracted vendor-assets/clean-vendor Makefile targets
to deduplicate the install logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 20:23:12 -05:00
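One plausible way to auto-discover the hashcat directory from PATH, as 55b7f0fc62 describes, is sketched below. The function name `discover_hashcat_dir` and the choice to follow symlinks with `os.path.realpath` are assumptions, not the project's code.

```python
import os
import shutil


def discover_hashcat_dir(configured: str = "") -> str:
    """Return the configured hashcat directory, else the directory of
    `hashcat` found on PATH, else an empty string."""
    if configured:
        return configured
    binary = shutil.which("hashcat")
    if binary is None:
        return ""
    # Assumption: resolve symlinks so a /usr/local/bin link maps to the real install.
    return os.path.dirname(os.path.realpath(binary))


print(discover_hashcat_dir("/opt/hashcat"))
```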
Justin Bollinger efc25c335a Apply ruff formatting fixes
- Fix line length and formatting in hate_crack/api.py
- Fix line wrapping and f-string formatting in hate_crack/attacks.py
- Apply code style improvements in hate_crack/main.py
- Format test files for consistency
- All changes applied via 'ruff check --fix'
2026-02-09 20:08:51 -05:00
Justin Bollinger 210a4006f5 removed extended expander 2026-02-06 19:37:39 -05:00
Justin Bollinger 59cbad7890 working loopback mode with tests 2026-02-06 15:06:25 -05:00
Justin Bollinger 40172c8df9 updated loopback logic and added testing 2026-02-06 14:37:37 -05:00
Justin Bollinger 4f51ce7379 updated loopback logic and added testing 2026-02-06 14:36:15 -05:00
Justin Bollinger 742529962a test: improve live test to skip gracefully when server unavailable
- Add server reachability check before attempting connection
- Test skips if Hashview server is not available on localhost:5000
- Prevents test failures when HASHVIEW_TEST_REAL env var is set but server isn't running
- Allows test to succeed when Docker/server is available and configured
2026-02-06 09:43:23 -05:00
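A server reachability check of the kind 742529962a mentions could be as small as the sketch below. `server_reachable` is a hypothetical helper; the test's actual check is not shown in the log.

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False
```

A live test could call `server_reachable("localhost", 5000)` first and skip itself when the result is `False`.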
Justin Bollinger 56dd272e75 test: update CLI menu test strings to match recent changes
The Hashview menu has been updated:
- Old: 'Available Customers'
- New: 'What would you like to do?'

Updated test to check for the current menu text that displays
when using the --hashview flag.
2026-02-06 09:39:07 -05:00
Justin Bollinger b52d856d7b download wordlists testing 2026-02-05 17:36:46 -05:00
Justin Bollinger b16c3a1d26 fixed wordlist download and unit testing for live api 2026-02-05 16:32:17 -05:00
Justin Bollinger c1d5658dd6 automatic hash_id detection via api for hashview hashes and updated tests. 2026-02-05 15:39:08 -05:00
Justin Bollinger 092b63f82c automatic hash_id detection via api for hashview hashes 2026-02-05 14:56:27 -05:00
Justin Bollinger 0c0690e2ef lots of refactoring around the menus and building out test cases 2026-02-05 13:52:06 -05:00
Justin Bollinger c049065924 Update hashview API flow and tests 2026-02-04 15:48:19 -05:00
Justin Bollinger 70334f0024 rule download fixes from hashmob 2026-02-03 19:44:58 -05:00
Justin Bollinger f33ca19107 expanded tests and hashview menu changes 2026-02-03 19:28:45 -05:00
Justin Bollinger ee375cfbd7 fixed menu pytest options that failed after updating them 2026-02-02 13:18:12 -05:00
Justin Bollinger 0431d10e26 fixed wordlist display 2026-02-02 12:39:24 -05:00
Justin Bollinger 23134be3a0 Revert ancestor asset lookup 2026-02-01 22:41:42 -05:00
Justin Bollinger 4402d3175b Fix uv tool asset lookup 2026-02-01 22:32:24 -05:00
Justin Bollinger e2775b1e53 Add regression tests for asset path separation
Tests now verify that hashcat-utils are loaded from hate_crack repo
even when hcatPath points to a different directory (like /opt/hashcat).

Why previous tests didn't catch this bug:
- config.json.example has hcatPath = "" (empty)
- Code has fallback: hcatPath = config.get('hcatPath', '') or hate_path
- So hcatPath accidentally defaulted to hate_path during testing
- This masked the bug where utilities incorrectly used hcatPath

Added tests that would have caught this:
- test_hashcat_utils_uses_hate_path_not_hcat_path
- test_config_with_explicit_hashcat_path
- test_readme_documents_correct_usage

Also added code comment documenting the fallback behavior.
2026-02-01 22:06:52 -05:00
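The masking effect explained in e2775b1e53 can be reproduced with the fallback expression quoted in that message. The wrapper `resolve_paths` and the example directories are hypothetical.

```python
def resolve_paths(config: dict, hate_path: str):
    """Apply the fallback: an empty or missing hcatPath becomes hate_path."""
    hcat_path = config.get("hcatPath", "") or hate_path
    return hcat_path, hate_path


# With the example config (hcatPath = ""), both paths collapse to the same
# directory, so code that wrongly uses hcatPath for utilities still works.
empty_cfg = resolve_paths({"hcatPath": ""}, "/repo/hate_crack")

# With an explicit hashcat directory the two paths differ, which is the
# case the regression tests need to exercise.
explicit_cfg = resolve_paths({"hcatPath": "/opt/hashcat"}, "/repo/hate_crack")

print(empty_cfg)
print(explicit_cfg)
```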
Justin Bollinger 52f8d5ee8b Fix asset path resolution: separate hashcat and hate_crack locations
BREAKING CHANGE: Corrected understanding of hcatPath configuration

- hcatPath should point to hashcat binary location (or omit if in PATH)
- hashcat-utils and princeprocessor are located in hate_crack repo
- Changed code to use hate_path for utilities instead of hcatPath
- Updated error messages to guide users correctly
- Updated README with correct configuration examples
- Asset discovery now properly uses HATE_CRACK_HOME environment variable

This fixes the issue where users had hcatPath pointing to hashcat
installation but the code was looking there for hashcat-utils.
2026-02-01 22:05:27 -05:00
Justin Bollinger 33a20d2540 Improve error handling for misconfigured hcatPath
- Add directory existence check in ensure_binary() before attempting make
- Provide clear error message when build directory doesn't exist
- Add troubleshooting section to README.md explaining common hcatPath mistakes
- Add tests for invalid hcatPath scenario and installed tool execution
- Helps users distinguish between hashcat path and hate_crack path

Fixes issue where users set hcatPath to hashcat installation directory
instead of hate_crack repository directory, causing confusing errors.
2026-02-01 22:02:35 -05:00
Justin Bollinger 8e6909d602 updated makefile 2026-02-01 20:46:01 -05:00
Justin Bollinger d429cede89 Merge draft PR #63 2026-01-31 23:40:09 -05:00
copilot-swe-agent[bot] ac7f809e33 Check return code and log stderr on cleanup failure
Co-authored-by: bandrel <3598052+bandrel@users.noreply.github.com>
2026-02-01 04:29:41 +00:00