hate_crack/tests
Justin Bollinger fcfe6890f6 feat: add memory pre-checks and optimize PassGPT training for large wordlists
Training previously loaded the entire wordlist into RAM and tokenized it
all at once, causing out-of-memory (OOM) failures on large files such as
rockyou.txt. This commit adds memory estimation, lazy dataset loading,
and training optimizations.
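
As a rough illustration of the memory pre-check idea, the sketch below counts lines by streaming the file and derives an estimate from that count. It is not the code from this commit: the function names, the fixed 512 MB base overhead, and the 8 bytes-per-line figure are all hypothetical values chosen for the example.

```python
# Hypothetical constants, for illustration only (not taken from the commit).
BASE_OVERHEAD_MB = 512   # assumed fixed cost: model, optimizer state, runtime
BYTES_PER_LINE = 8       # assumed cost of one entry in a file-offset index
MIB = 1024 * 1024


def count_lines(path, chunk_size=1 << 20):
    """Count newline bytes by reading fixed-size chunks.

    Only one chunk is held in memory at a time. A final line without a
    trailing newline is not counted.
    """
    total = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total


def estimate_memory_mb(num_lines, bytes_per_line=BYTES_PER_LINE,
                       base_mb=BASE_OVERHEAD_MB):
    """Estimate memory use in MB as a fixed base plus a per-line cost."""
    return base_mb + (num_lines * bytes_per_line) / MIB


def max_lines_for_limit(limit_mb, bytes_per_line=BYTES_PER_LINE,
                        base_mb=BASE_OVERHEAD_MB):
    """Invert the estimate: how many lines fit under limit_mb (0 if none)."""
    budget_bytes = (limit_mb - base_mb) * MIB
    if budget_bytes <= 0:
        return 0
    return int(budget_bytes // bytes_per_line)
```

Under these assumptions, a caller could compare `estimate_memory_mb(count_lines(path))` against the available RAM and fall back to `max_lines_for_limit(...)` when the estimate is too high.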

- Add _get_available_memory_mb() for cross-platform RAM detection
- Add _estimate_training_memory_mb() to predict peak usage before loading
- Replace bulk tokenization with LazyPasswordDataset (file offset index + on-the-fly tokenization)
- Add --max-lines flag to limit training to first N lines
- Add --memory-limit flag to auto-tune --max-lines based on available RAM
- Enable gradient checkpointing and gradient accumulation (4 accumulation steps)
- Enable fp16 on CUDA devices
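
The "file offset index + on-the-fly tokenization" approach from the list above can be sketched as follows. This is a minimal stand-in, not the LazyPasswordDataset from this commit: the class name `LazyLineDataset`, its constructor arguments, and the `tokenize` callable are assumptions made for the example. In a PyTorch training setup, a class like this would typically subclass `torch.utils.data.Dataset`; the sketch uses plain Python so that it runs without extra dependencies.

```python
class LazyLineDataset:
    """Hypothetical lazy dataset: stores byte offsets, not the lines themselves."""

    def __init__(self, path, tokenize, max_lines=None):
        self.path = path
        self.tokenize = tokenize  # assumed callable: str -> list of token ids
        self.offsets = []
        # One pass over the file records where each non-blank line starts.
        # Only integers are kept, so memory grows with line count, not file size.
        with open(path, "rb") as fh:
            while max_lines is None or len(self.offsets) < max_lines:
                pos = fh.tell()
                line = fh.readline()
                if not line:
                    break  # end of file
                if line.strip():
                    self.offsets.append(pos)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, idx):
        # Seek to the stored offset and tokenize just this one line.
        # Reopening the file per item is simple but slow; a real
        # implementation might cache the handle per worker.
        with open(self.path, "rb") as fh:
            fh.seek(self.offsets[idx])
            raw = fh.readline()
        text = raw.rstrip(b"\r\n").decode("utf-8", errors="replace")
        return self.tokenize(text)
```

Here `max_lines` plays the role that the `--max-lines` flag describes: indexing stops after the first N non-blank lines, so the rest of the file is never read.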

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:47:44 -05:00