mirror of
https://github.com/trustedsec/hate_crack.git
synced 2026-06-29 17:54:28 -07:00
fcfe6890f6
Training previously loaded entire wordlists into RAM and tokenized all at once, causing OOM on large files like rockyou.txt. This adds memory estimation, lazy dataset loading, and training optimizations.

- Add _get_available_memory_mb() for cross-platform RAM detection
- Add _estimate_training_memory_mb() to predict peak usage before loading
- Replace bulk tokenization with LazyPasswordDataset (file offset index + on-the-fly tokenization)
- Add --max-lines flag to limit training to first N lines
- Add --memory-limit flag to auto-tune --max-lines based on available RAM
- Enable gradient checkpointing and gradient accumulation (steps=4)
- Enable fp16 on CUDA devices

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
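
For illustration, the "file offset index + on-the-fly tokenization" idea can be sketched roughly as below. This is not the LazyPasswordDataset from this commit: the class name LazyLineDataset, its parameters, and the byte-value placeholder tokenizer are all assumptions made for the example. Only the byte offset of each line is held in memory, and a line is read and tokenized when it is indexed.

```python
class LazyLineDataset:
    """Hypothetical map-style dataset that keeps only line offsets in RAM."""

    def __init__(self, path, max_lines=None, max_length=16):
        self.path = path
        self.max_length = max_length
        self.offsets = []  # byte offset of each non-empty line
        lines_read = 0
        # Single pass over the file: record where each line starts,
        # stopping after max_lines lines when a limit is given.
        with open(path, "rb") as fh:
            while max_lines is None or lines_read < max_lines:
                pos = fh.tell()
                line = fh.readline()
                if not line:  # end of file
                    break
                lines_read += 1
                if line.strip():  # skip blank lines
                    self.offsets.append(pos)

    def __len__(self):
        return len(self.offsets)

    def __getitem__(self, idx):
        # Reopen, seek, and read one line on demand. Opening per item is
        # slow but keeps the sketch simple and free of shared file handles.
        with open(self.path, "rb") as fh:
            fh.seek(self.offsets[idx])
            raw = fh.readline().rstrip(b"\r\n")
        # Placeholder tokenizer (assumption): raw byte values, truncated.
        return list(raw[: self.max_length])
```

With this shape, memory use grows with the number of lines (one integer per line) rather than with the tokenized size of the whole wordlist. An object exposing `__len__` and `__getitem__` like this follows the map-style dataset protocol, so a real version would typically subclass the training framework's dataset class and call its tokenizer in `__getitem__`.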