mirror of
https://github.com/trustedsec/hate_crack.git
synced 2026-03-12 21:23:05 -07:00
feat: add PassGPT model fine-tuning and training menu integration
Add ability to fine-tune PassGPT models on custom password wordlists. Models save locally to ~/.hate_crack/passgpt/ with no data uploaded to HuggingFace (push_to_hub=False, HF_HUB_DISABLE_TELEMETRY=1). The PassGPT menu now shows available models (default + local fine-tuned) and a training option. Adds datasets to [ml] deps and passgptTrainingList config key. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
56	README.md
@@ -325,15 +325,18 @@ chmod +x .git/hooks/pre-push
 ### Optional Dependencies

-The optional `[ml]` group includes ML/AI features:
-- **torch** - PyTorch deep learning framework (for PassGPT attack)
+The optional `[ml]` group includes ML/AI features required for the PassGPT attack:
+- **torch** - PyTorch deep learning framework (for PassGPT attack and training)
 - **transformers** - HuggingFace transformers library (for GPT-2 models)
+- **datasets** - HuggingFace datasets library (for fine-tuning support)

 Install with:
 ```bash
 uv pip install -e ".[ml]"
 ```

 PassGPT (option 17) will be hidden from the menu if ML dependencies are not installed.

 ### Dev Dependencies

 The optional `[dev]` group includes:
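The README notes that option 17 disappears from the menu when the `[ml]` extras are missing. A minimal sketch of how such a guard could work; `has_ml_deps` and `visible_menu_options` are hypothetical helper names for illustration, not the repo's exact code:

```python
from __future__ import annotations

import importlib.util


def has_ml_deps() -> bool:
    """Return True when the optional [ml] extras appear importable.

    Checks module specs without importing them, so startup stays fast
    even when torch is installed.
    """
    return all(
        importlib.util.find_spec(mod) is not None
        for mod in ("torch", "transformers", "datasets")
    )


def visible_menu_options(base_options: dict[str, str], ml_ok: bool) -> dict[str, str]:
    """Drop ML-only entries (here just option 17) when deps are missing."""
    ml_only = {"17"}
    return {k: v for k, v in base_options.items() if ml_ok or k not in ml_only}
```

This keeps the menu definition static while the visibility decision happens at render time.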
@@ -721,7 +724,9 @@ Uses the Ordered Markov ENumerator (OMEN) to train a statistical password model
 * Model files are stored in `~/.hate_crack/omen/` for persistence across sessions

 #### PassGPT Attack
-Uses PassGPT, a GPT-2 based password generator trained on leaked password datasets, to generate candidate passwords. PassGPT produces higher-quality candidates than traditional Markov models by leveraging transformer-based language modeling.
+Uses PassGPT, a GPT-2 based password generator trained on leaked password datasets, to generate candidate passwords. PassGPT produces higher-quality candidates than traditional Markov models by leveraging transformer-based language modeling. You can use the default HuggingFace model or fine-tune a custom model on your own password wordlist.
+
+**Note:** This menu item is hidden unless ML dependencies are installed.

 **Requirements:** ML dependencies must be installed separately:
 ```bash
@@ -734,24 +739,60 @@ This installs PyTorch and HuggingFace Transformers. GPU acceleration (CUDA/MPS)
 - `passgptModel` - HuggingFace model name (default: `javirandor/passgpt-10characters`)
 - `passgptMaxCandidates` - Maximum candidates to generate (default: 1000000)
 - `passgptBatchSize` - Generation batch size (default: 1024)
+- `passgptTrainingList` - Default wordlist for fine-tuning (default: `rockyou.txt`)

 **Supported models:**
 - `javirandor/passgpt-10characters` - Trained on passwords up to 10 characters (default)
 - `javirandor/passgpt-16characters` - Trained on passwords up to 16 characters
 - Any compatible GPT-2 model on HuggingFace
+- Locally fine-tuned models (stored in `~/.hate_crack/passgpt/`)
+
+**Training a Custom Model:**
+When you select the PassGPT Attack (option 17), the menu presents:
+- List of available models (default HF model + any locally fine-tuned models)
+- Option (T) to train a new model on a custom wordlist
+- Fine-tuned models are automatically saved to `~/.hate_crack/passgpt/<name>/` for reuse
+
+To train a new model:
+1. Select option (T) from the model selection menu
+2. Choose a training wordlist (supports tab-complete file selection)
+3. Optionally specify a base model (defaults to configured `passgptModel`)
+4. Training will fine-tune the model on your wordlist and save it locally
+
+Fine-tuned models can be reused in future cracking sessions and appear in the model selection menu alongside the default models.

 **Apple Silicon (MPS) Performance Notes:**
 - Batch size is automatically capped at 64 to prevent memory errors on MPS devices
 - GPU memory watermark ratios are configured for stability (50% high, 30% low)
 - Specify `--device cpu` to force CPU generation if MPS has issues

 **Standalone usage:**

+Generate candidates:
 ```bash
 python -m hate_crack.passgpt_generate --num 1000 --model javirandor/passgpt-10characters
 ```

-Available command-line options:
+Fine-tune a custom model:
+```bash
+python -m hate_crack.passgpt_train --training-file wordlist.txt --output-dir ~/.hate_crack/passgpt/my_model
+```
+
+**Generator command-line options:**
 - `--num` - Number of candidates to generate (default: 1000000)
-- `--model` - HuggingFace model name (default: javirandor/passgpt-10characters)
+- `--model` - HuggingFace model name or local path (default: javirandor/passgpt-10characters)
 - `--batch-size` - Generation batch size (default: 1024)
 - `--max-length` - Max token length including special tokens (default: 12)
 - `--device` - Device: cuda, mps, or cpu (default: auto-detect)
+
+**Training command-line options:**
+- `--training-file` - Path to password wordlist for fine-tuning (required)
+- `--output-dir` - Directory to save the fine-tuned model (required)
+- `--base-model` - Base HuggingFace model to fine-tune (default: javirandor/passgpt-10characters)
+- `--epochs` - Number of training epochs (default: 3)
+- `--batch-size` - Training batch size (default: 8)
+- `--device` - Device: cuda, mps, or cpu (default: auto-detect)

 #### Download Rules from Hashmob.net
 Downloads the latest rule files from Hashmob.net's rule repository. These rules are curated and optimized for password cracking and can be used with the Quick Crack and Loopback Attack modes.
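The README documents the standalone generator flags above. A small sketch of assembling that invocation programmatically, e.g. to pipe candidates into hashcat from a wrapper script; `build_passgpt_cmd` is an illustrative helper, not part of hate_crack:

```python
from __future__ import annotations

import sys


def build_passgpt_cmd(num: int, model: str, batch_size: int = 1024,
                      device: str | None = None) -> list[str]:
    """Assemble the `python -m hate_crack.passgpt_generate` command line
    using the documented flag names. Candidates go to stdout, so the
    result can feed a subprocess pipe into hashcat."""
    cmd = [sys.executable, "-m", "hate_crack.passgpt_generate",
           "--num", str(num), "--model", model,
           "--batch-size", str(batch_size)]
    if device is not None:
        cmd += ["--device", device]
    return cmd
```

Keeping arguments as a list (rather than a shell string) avoids quoting issues with model paths under `~/.hate_crack/passgpt/`.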
@@ -789,8 +830,9 @@ Version 2.0+
 - Added automatic update checks on startup (check_for_updates config option)
 - Added `packaging` dependency for version comparison
 - Added PassGPT Attack (option 17) using GPT-2 based ML password generation
-- Added PassGPT configuration keys (passgptModel, passgptMaxCandidates, passgptBatchSize)
-- Added `[ml]` optional dependency group for PyTorch and Transformers
+- Added PassGPT fine-tuning capability for custom password models
+- Added PassGPT configuration keys (passgptModel, passgptMaxCandidates, passgptBatchSize, passgptTrainingList)
+- Added `[ml]` optional dependency group for PyTorch, Transformers, and Datasets
 - Added OMEN Attack (option 16) using statistical model-based password generation
 - Added OMEN configuration keys (omenTrainingList, omenMaxCandidates)
 - Added LLM Attack (option 15) using Ollama for AI-generated password candidates
@@ -30,5 +30,6 @@
   "passgptModel": "javirandor/passgpt-10characters",
   "passgptMaxCandidates": 1000000,
   "passgptBatchSize": 1024,
+  "passgptTrainingList": "rockyou.txt",
   "check_for_updates": true
 }
@@ -534,14 +534,62 @@ def passgpt_attack(ctx: Any) -> None:
         print("\n\tPassGPT requires ML dependencies. Install them with:")
         print('\t uv pip install -e ".[ml]"')
         return

+    # Build model choices: default HF model + any local fine-tuned models
+    default_model = ctx.passgptModel
+    models = [(default_model, f"{default_model} (default)")]
+
+    model_dir = ctx._passgpt_model_dir()
+    if os.path.isdir(model_dir):
+        for entry in sorted(os.listdir(model_dir)):
+            entry_path = os.path.join(model_dir, entry)
+            if os.path.isdir(entry_path) and os.path.isfile(
+                os.path.join(entry_path, "config.json")
+            ):
+                models.append((entry_path, f"{entry} (local)"))
+
+    print("\n\tSelect a model:")
+    for i, (_, label) in enumerate(models, 1):
+        print(f"\t ({i}) {label}")
+    print("\t (T) Train a new model")
+
+    choice = input("\n\tChoice: ").strip()
+
+    if choice.upper() == "T":
+        print("\n\tTrain a new PassGPT model")
+        training_file = ctx.select_file_with_autocomplete(
+            "Select training wordlist", base_dir=ctx.hcatWordlists
+        )
+        if not training_file:
+            print("\n\tNo training file selected. Aborting.")
+            return
+        if isinstance(training_file, list):
+            training_file = training_file[0]
+        base = input(f"\n\tBase model ({default_model}): ").strip()
+        if not base:
+            base = default_model
+        result = ctx.hcatPassGPTTrain(training_file, base)
+        if result is None:
+            print("\n\tTraining failed. Returning to menu.")
+            return
+        model_name = result
+    else:
+        try:
+            idx = int(choice) - 1
+            if 0 <= idx < len(models):
+                model_name = models[idx][0]
+            else:
+                print("\n\tInvalid selection.")
+                return
+        except ValueError:
+            print("\n\tInvalid selection.")
+            return
+
     max_candidates = input(
         f"\n\tMax candidates to generate ({ctx.passgptMaxCandidates}): "
     ).strip()
     if not max_candidates:
         max_candidates = str(ctx.passgptMaxCandidates)
-    model_name = input(f"\n\tModel name ({ctx.passgptModel}): ").strip()
-    if not model_name:
-        model_name = ctx.passgptModel
     ctx.hcatPassGPT(
         ctx.hcatHashType,
         ctx.hcatHashFile,
@@ -522,6 +522,15 @@ except KeyError as e:
         )
     )
 passgptBatchSize = int(default_config.get("passgptBatchSize", 1024))
+try:
+    passgptTrainingList = config_parser["passgptTrainingList"]
+except KeyError as e:
+    print(
+        "{0} is not defined in config.json using defaults from config.json.example".format(
+            e
+        )
+    )
+    passgptTrainingList = default_config.get("passgptTrainingList", "rockyou.txt")
 try:
     check_for_updates_enabled = config_parser["check_for_updates"]
 except KeyError as e:
@@ -673,6 +682,7 @@ hcatGoodMeasureBaseList = _normalize_wordlist_setting(
 )
 hcatPrinceBaseList = _normalize_wordlist_setting(hcatPrinceBaseList, wordlists_dir)
 omenTrainingList = _normalize_wordlist_setting(omenTrainingList, wordlists_dir)
+passgptTrainingList = _normalize_wordlist_setting(passgptTrainingList, wordlists_dir)
 if not SKIP_INIT:
     # Verify hashcat binary is available
     # hcatBin should be in PATH or be an absolute path (resolved from hcatPath + hcatBin if configured)
@@ -2278,6 +2288,55 @@ def hcatOmen(hcatHashType, hcatHashFile, max_candidates):
             enum_proc.kill()


+# PassGPT model directory - writable location for fine-tuned models.
+# Models are saved to ~/.hate_crack/passgpt/<model_name>/.
+def _passgpt_model_dir():
+    model_dir = os.path.join(os.path.expanduser("~"), ".hate_crack", "passgpt")
+    os.makedirs(model_dir, exist_ok=True)
+    return model_dir
+
+
+# PassGPT Attack - Fine-tune a model on a custom wordlist
+def hcatPassGPTTrain(training_file, base_model=None):
+    training_file = os.path.abspath(training_file)
+    if not os.path.isfile(training_file):
+        print(f"Error: Training file not found: {training_file}")
+        return None
+    if base_model is None:
+        base_model = passgptModel
+    # Derive output dir name from training file
+    basename = os.path.splitext(os.path.basename(training_file))[0]
+    # Sanitize: replace non-alphanumeric chars with underscores
+    sanitized = "".join(c if c.isalnum() or c in "-_" else "_" for c in basename)
+    output_dir = os.path.join(_passgpt_model_dir(), sanitized)
+    os.makedirs(output_dir, exist_ok=True)
+    cmd = [
+        sys.executable,
+        "-m",
+        "hate_crack.passgpt_train",
+        "--training-file",
+        training_file,
+        "--base-model",
+        base_model,
+        "--output-dir",
+        output_dir,
+    ]
+    print(f"[*] Running: {_format_cmd(cmd)}")
+    proc = subprocess.Popen(cmd)
+    try:
+        proc.wait()
+    except KeyboardInterrupt:
+        print("Killing PID {0}...".format(str(proc.pid)))
+        proc.kill()
+        return None
+    if proc.returncode == 0:
+        print(f"PassGPT model training complete. Model saved to: {output_dir}")
+        return output_dir
+    else:
+        print(f"PassGPT training failed with exit code {proc.returncode}")
+        return None
+
+
 # PassGPT Attack - Generate candidates with ML model and pipe to hashcat
 def hcatPassGPT(
     hcatHashType,
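`hcatPassGPTTrain` derives the output directory name from the wordlist filename: strip the extension, then replace anything outside `[A-Za-z0-9_-]` with underscores. That step, isolated as a standalone sketch (`model_dir_name` is an illustrative name, not a function in the repo):

```python
import os


def model_dir_name(training_file: str) -> str:
    """Mirror the name derivation in hcatPassGPTTrain: take the file's
    basename, drop the extension, and map every character that is not
    alphanumeric, '-', or '_' to '_'. The result is a filesystem-safe
    directory name under ~/.hate_crack/passgpt/."""
    basename = os.path.splitext(os.path.basename(training_file))[0]
    return "".join(c if c.isalnum() or c in "-_" else "_" for c in basename)
```

Because the sanitization is lossy, two wordlists whose names differ only in punctuation would map to the same model directory and overwrite each other.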
@@ -8,8 +8,12 @@ hashcat. Progress and diagnostic messages go to stderr.
 from __future__ import annotations

 import argparse
+import os
 import sys

+# Disable HuggingFace telemetry before any HF imports
+os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"
+

 _MPS_BATCH_SIZE_CAP = 64
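The generator caps the batch size on Apple Silicon (`_MPS_BATCH_SIZE_CAP = 64`, matching the README's MPS note). A sketch of that clamp; `effective_batch_size` is a hypothetical helper name used here only for illustration:

```python
_MPS_BATCH_SIZE_CAP = 64


def effective_batch_size(requested: int, device: str) -> int:
    """Clamp the generation batch size on MPS devices: large batches
    exhaust unified memory on Apple Silicon, so anything above the cap
    is reduced. Other devices keep the requested size."""
    if device == "mps":
        return min(requested, _MPS_BATCH_SIZE_CAP)
    return requested
```

With the documented default of 1024, an MPS run would therefore generate in batches of 64, while CUDA keeps the full 1024.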
174	hate_crack/passgpt_train.py (new file)
@@ -0,0 +1,174 @@
"""Fine-tune a PassGPT model on a custom password wordlist.
|
||||
|
||||
Invokable as ``python -m hate_crack.passgpt_train``. Progress and
|
||||
diagnostic messages go to stderr.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
|
||||
# Disable HuggingFace telemetry before any HF imports
|
||||
os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"
|
||||
|
||||
|
||||
def _detect_device() -> str:
|
||||
import torch
|
||||
|
||||
if torch.cuda.is_available():
|
||||
return "cuda"
|
||||
if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
|
||||
return "mps"
|
||||
return "cpu"
|
||||
|
||||
|
||||
def _configure_mps() -> None:
|
||||
"""Set MPS memory limits before torch is imported."""
|
||||
os.environ.setdefault("PYTORCH_MPS_HIGH_WATERMARK_RATIO", "0.5")
|
||||
os.environ.setdefault("PYTORCH_MPS_LOW_WATERMARK_RATIO", "0.3")
|
||||
|
||||
|
||||
def train(
|
||||
training_file: str,
|
||||
output_dir: str,
|
||||
base_model: str,
|
||||
epochs: int,
|
||||
batch_size: int,
|
||||
device: str | None,
|
||||
) -> None:
|
||||
if device == "mps" or device is None:
|
||||
_configure_mps()
|
||||
|
||||
import torch
|
||||
from transformers import ( # type: ignore[attr-defined]
|
||||
GPT2LMHeadModel,
|
||||
RobertaTokenizerFast,
|
||||
Trainer,
|
||||
TrainingArguments,
|
||||
)
|
||||
|
||||
if device is None:
|
||||
device = _detect_device()
|
||||
|
||||
print(f"[*] Loading base model {base_model} on {device}", file=sys.stderr)
|
||||
tokenizer = RobertaTokenizerFast.from_pretrained(base_model)
|
||||
model = GPT2LMHeadModel.from_pretrained(base_model).to(device) # type: ignore[arg-type]
|
||||
|
||||
print(f"[*] Reading training file: {training_file}", file=sys.stderr)
|
||||
with open(training_file, encoding="utf-8", errors="replace") as f:
|
||||
passwords = [line.strip() for line in f if line.strip()]
|
||||
print(f"[*] Loaded {len(passwords)} passwords", file=sys.stderr)
|
||||
|
||||
print("[*] Tokenizing passwords...", file=sys.stderr)
|
||||
max_length = model.config.n_positions if hasattr(model.config, "n_positions") else 16
|
||||
encodings = tokenizer(
|
||||
passwords,
|
||||
truncation=True,
|
||||
padding="max_length",
|
||||
max_length=max_length,
|
||||
return_tensors="pt",
|
||||
)
|
||||
|
||||
class PasswordDataset(torch.utils.data.Dataset): # type: ignore[type-arg]
|
||||
def __init__(self, encodings):
|
||||
self.input_ids = encodings["input_ids"]
|
||||
self.attention_mask = encodings["attention_mask"]
|
||||
|
||||
def __len__(self):
|
||||
return len(self.input_ids)
|
||||
|
||||
def __getitem__(self, idx):
|
||||
return {
|
||||
"input_ids": self.input_ids[idx],
|
||||
"attention_mask": self.attention_mask[idx],
|
||||
"labels": self.input_ids[idx],
|
||||
}
|
||||
|
||||
dataset = PasswordDataset(encodings)
|
||||
|
||||
# Use CPU for training args if device is MPS (Trainer handles device placement)
|
||||
use_cpu = device not in ("cuda",)
|
||||
training_args = TrainingArguments(
|
||||
output_dir=output_dir,
|
||||
num_train_epochs=epochs,
|
||||
per_device_train_batch_size=batch_size,
|
||||
save_strategy="epoch",
|
||||
logging_steps=100,
|
||||
use_cpu=use_cpu,
|
||||
report_to="none",
|
||||
push_to_hub=False,
|
||||
)
|
||||
|
||||
trainer = Trainer(
|
||||
model=model,
|
||||
args=training_args,
|
||||
train_dataset=dataset,
|
||||
)
|
||||
|
||||
print(
|
||||
f"[*] Starting training: {epochs} epochs, batch_size={batch_size}, device={device}",
|
||||
file=sys.stderr,
|
||||
)
|
||||
trainer.train()
|
||||
|
||||
print(f"[*] Saving model to {output_dir}", file=sys.stderr)
|
||||
model.save_pretrained(output_dir)
|
||||
tokenizer.save_pretrained(output_dir)
|
||||
print("[*] Training complete.", file=sys.stderr)
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Fine-tune a PassGPT model on a password wordlist"
|
||||
)
|
||||
parser.add_argument(
|
||||
"--training-file",
|
||||
type=str,
|
||||
required=True,
|
||||
help="Path to the password wordlist for training",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--base-model",
|
||||
type=str,
|
||||
default="javirandor/passgpt-10characters",
|
||||
help="Base HuggingFace model to fine-tune (default: javirandor/passgpt-10characters)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--output-dir",
|
||||
type=str,
|
||||
required=True,
|
||||
help="Directory to save the fine-tuned model",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--epochs",
|
||||
type=int,
|
||||
default=3,
|
||||
help="Number of training epochs (default: 3)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--batch-size",
|
||||
type=int,
|
||||
default=8,
|
||||
help="Training batch size (default: 8)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--device",
|
||||
type=str,
|
||||
default=None,
|
||||
help="Device: cuda, mps, or cpu (default: auto-detect)",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
train(
|
||||
training_file=args.training_file,
|
||||
output_dir=args.output_dir,
|
||||
base_model=args.base_model,
|
||||
epochs=args.epochs,
|
||||
batch_size=args.batch_size,
|
||||
device=args.device,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
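The training file feeds the Trainer a dataset whose `labels` are simply the `input_ids`: that is the standard causal-LM setup, where the model shifts the targets internally when computing the loss. A torch-free stand-in that demonstrates the same item contract (`PasswordDatasetSketch` is illustrative only):

```python
class PasswordDatasetSketch:
    """Plain-Python stand-in for the torch Dataset in passgpt_train.py.

    Each item carries labels identical to input_ids, which is how
    HuggingFace's Trainer expects causal-LM targets: the one-position
    shift between input and target happens inside the model's loss
    computation, not in the dataset."""

    def __init__(self, input_ids, attention_mask):
        self.input_ids = input_ids
        self.attention_mask = attention_mask

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, idx):
        return {
            "input_ids": self.input_ids[idx],
            "attention_mask": self.attention_mask[idx],
            "labels": self.input_ids[idx],
        }
```

Padding tokens are masked out via `attention_mask`, so padded positions do not dominate the loss even though every sequence is padded to `max_length`.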
@@ -22,6 +22,7 @@ hate_crack = "hate_crack.__main__:main"
 ml = [
     "torch>=2.0.0",
     "transformers>=4.30.0",
+    "datasets>=2.14.0",
 ]
 dev = [
     "mypy>=1.8.0",
@@ -1,3 +1,4 @@
+import os
 import sys
 from unittest.mock import MagicMock, patch

@@ -78,8 +79,99 @@ class TestHcatPassGPT:
         assert "512" in gen_cmd


+class TestHcatPassGPTTrain:
+    def test_builds_correct_subprocess_command(self, main_module, tmp_path):
+        training_file = tmp_path / "wordlist.txt"
+        training_file.write_text("password123\nabc456\n")
+
+        with patch.object(
+            main_module, "passgptModel", "javirandor/passgpt-10characters"
+        ), patch("hate_crack.main.subprocess.Popen") as mock_popen:
+            mock_proc = MagicMock()
+            mock_proc.returncode = 0
+            mock_proc.wait.return_value = None
+            mock_popen.return_value = mock_proc
+
+            with patch.object(
+                main_module,
+                "_passgpt_model_dir",
+                return_value=str(tmp_path / "models"),
+            ):
+                result = main_module.hcatPassGPTTrain(str(training_file))
+
+        assert result is not None
+        assert mock_popen.call_count == 1
+        cmd = mock_popen.call_args[0][0]
+        assert cmd[0] == sys.executable
+        assert "-m" in cmd
+        assert "hate_crack.passgpt_train" in cmd
+        assert "--training-file" in cmd
+        assert str(training_file) in cmd
+        assert "--base-model" in cmd
+        assert "javirandor/passgpt-10characters" in cmd
+        assert "--output-dir" in cmd
+
+    def test_missing_training_file(self, main_module, capsys):
+        result = main_module.hcatPassGPTTrain("/nonexistent/wordlist.txt")
+        assert result is None
+        captured = capsys.readouterr()
+        assert "Training file not found" in captured.out
+
+    def test_custom_base_model(self, main_module, tmp_path):
+        training_file = tmp_path / "wordlist.txt"
+        training_file.write_text("test\n")
+
+        with patch("hate_crack.main.subprocess.Popen") as mock_popen:
+            mock_proc = MagicMock()
+            mock_proc.returncode = 0
+            mock_proc.wait.return_value = None
+            mock_popen.return_value = mock_proc
+
+            with patch.object(
+                main_module,
+                "_passgpt_model_dir",
+                return_value=str(tmp_path / "models"),
+            ):
+                main_module.hcatPassGPTTrain(
+                    str(training_file), base_model="custom/base-model"
+                )
+
+        cmd = mock_popen.call_args[0][0]
+        assert "custom/base-model" in cmd
+
+    def test_training_failure_returns_none(self, main_module, tmp_path):
+        training_file = tmp_path / "wordlist.txt"
+        training_file.write_text("test\n")
+
+        with patch.object(
+            main_module, "passgptModel", "javirandor/passgpt-10characters"
+        ), patch("hate_crack.main.subprocess.Popen") as mock_popen:
+            mock_proc = MagicMock()
+            mock_proc.returncode = 1
+            mock_proc.wait.return_value = None
+            mock_popen.return_value = mock_proc
+
+            with patch.object(
+                main_module,
+                "_passgpt_model_dir",
+                return_value=str(tmp_path / "models"),
+            ):
+                result = main_module.hcatPassGPTTrain(str(training_file))
+
+        assert result is None
+
+
+class TestPassGPTModelDir:
+    def test_creates_directory(self, main_module, tmp_path):
+        target = str(tmp_path / "passgpt_models")
+        with patch("hate_crack.main.os.path.expanduser", return_value=str(tmp_path)):
+            result = main_module._passgpt_model_dir()
+        assert os.path.isdir(result)
+        assert result.endswith("passgpt")
+
+
 class TestPassGPTAttackHandler:
-    def test_prompts_and_calls_hcatPassGPT(self):
+    def _make_ctx(self, model_dir=None):
         ctx = MagicMock()
         ctx.HAS_ML_DEPS = True
         ctx.passgptMaxCandidates = 1000000
@@ -87,8 +179,21 @@ class TestPassGPTAttackHandler:
         ctx.passgptBatchSize = 1024
         ctx.hcatHashType = "1000"
         ctx.hcatHashFile = "/tmp/hashes.txt"
+        ctx.hcatWordlists = "/tmp/wordlists"
+        if model_dir is None:
+            ctx._passgpt_model_dir.return_value = "/nonexistent/empty"
+        else:
+            ctx._passgpt_model_dir.return_value = model_dir
+        return ctx

-        with patch("builtins.input", return_value=""):
+    def test_select_default_model_and_generate(self):
+        ctx = self._make_ctx()
+
+        # "1" selects default model, "" accepts default max candidates
+        inputs = iter(["1", ""])
+        with patch("builtins.input", side_effect=inputs), patch(
+            "hate_crack.attacks.os.path.isdir", return_value=False
+        ):
             from hate_crack.attacks import passgpt_attack

             passgpt_attack(ctx)
@@ -101,28 +206,70 @@ class TestPassGPTAttackHandler:
             batch_size=1024,
         )

-    def test_custom_values(self):
-        ctx = MagicMock()
-        ctx.HAS_ML_DEPS = True
-        ctx.passgptMaxCandidates = 1000000
-        ctx.passgptModel = "javirandor/passgpt-10characters"
-        ctx.passgptBatchSize = 1024
-        ctx.hcatHashType = "1000"
-        ctx.hcatHashFile = "/tmp/hashes.txt"
+    def test_select_local_model(self, tmp_path):
+        # Create a fake local model directory
+        model_dir = tmp_path / "passgpt"
+        local_model = model_dir / "my_model"
+        local_model.mkdir(parents=True)
+        (local_model / "config.json").write_text("{}")

-        inputs = iter(["500000", "custom/model"])
-        with patch("builtins.input", side_effect=inputs):
+        ctx = self._make_ctx(model_dir=str(model_dir))
+
+        # "2" selects the local model, "" accepts default max candidates
+        inputs = iter(["2", ""])
+        with patch("builtins.input", side_effect=inputs), patch(
+            "hate_crack.attacks.os.path.isdir", return_value=True
+        ), patch("hate_crack.attacks.os.listdir", return_value=["my_model"]), patch(
+            "hate_crack.attacks.os.path.isfile", return_value=True
+        ), patch(
+            "hate_crack.attacks.os.path.isdir",
+            side_effect=lambda p: True,
+        ):
             from hate_crack.attacks import passgpt_attack

             passgpt_attack(ctx)

-        ctx.hcatPassGPT.assert_called_once_with(
-            "1000",
-            "/tmp/hashes.txt",
-            500000,
-            model_name="custom/model",
-            batch_size=1024,
-        )
+        ctx.hcatPassGPT.assert_called_once()
+        call_kwargs = ctx.hcatPassGPT.call_args
+        # The model_name should be the local path
+        assert call_kwargs[1]["model_name"] == str(local_model)
+
+    def test_train_new_model(self):
+        ctx = self._make_ctx()
+        ctx.select_file_with_autocomplete.return_value = "/tmp/wordlist.txt"
+        ctx.hcatPassGPTTrain.return_value = "/home/user/.hate_crack/passgpt/wordlist"
+
+        # "T" for train, "" for default base model, "" for default max candidates
+        inputs = iter(["T", "", ""])
+        with patch("builtins.input", side_effect=inputs), patch(
+            "hate_crack.attacks.os.path.isdir", return_value=False
+        ):
+            from hate_crack.attacks import passgpt_attack
+
+            passgpt_attack(ctx)
+
+        ctx.hcatPassGPTTrain.assert_called_once_with(
+            "/tmp/wordlist.txt", "javirandor/passgpt-10characters"
+        )
+        ctx.hcatPassGPT.assert_called_once()
+        call_kwargs = ctx.hcatPassGPT.call_args
+        assert call_kwargs[1]["model_name"] == "/home/user/.hate_crack/passgpt/wordlist"
+
+    def test_train_failure_aborts(self):
+        ctx = self._make_ctx()
+        ctx.select_file_with_autocomplete.return_value = "/tmp/wordlist.txt"
+        ctx.hcatPassGPTTrain.return_value = None
+
+        inputs = iter(["T", ""])
+        with patch("builtins.input", side_effect=inputs), patch(
+            "hate_crack.attacks.os.path.isdir", return_value=False
+        ):
+            from hate_crack.attacks import passgpt_attack
+
+            passgpt_attack(ctx)
+
+        ctx.hcatPassGPTTrain.assert_called_once()
+        ctx.hcatPassGPT.assert_not_called()

     def test_ml_deps_missing(self, capsys):
         ctx = MagicMock()
@@ -136,3 +283,23 @@ class TestPassGPTAttackHandler:
         assert "ML dependencies" in captured.out
         assert "uv pip install" in captured.out
         ctx.hcatPassGPT.assert_not_called()
+
+    def test_custom_max_candidates(self):
+        ctx = self._make_ctx()
+
+        # "1" selects default model, "500000" for custom max candidates
+        inputs = iter(["1", "500000"])
+        with patch("builtins.input", side_effect=inputs), patch(
+            "hate_crack.attacks.os.path.isdir", return_value=False
+        ):
+            from hate_crack.attacks import passgpt_attack
+
+            passgpt_attack(ctx)
+
+        ctx.hcatPassGPT.assert_called_once_with(
+            "1000",
+            "/tmp/hashes.txt",
+            500000,
+            model_name="javirandor/passgpt-10characters",
+            batch_size=1024,
+        )