mirror of
https://github.com/trustedsec/hate_crack.git
4eacf9d9ee
macOS `sort` is locale-strict: under LC_COLLATE=en_US.UTF-8 (the default
on most macOS shells) it exits with "sort: Illegal byte sequence"
when stdin contains bytes that are not valid UTF-8. Cracked-password
streams routinely contain such bytes - hex-encoded fields, mixed
encodings, binary garbage from poorly-encoded source hashes - so this
fires in real fingerprint runs whenever the pot already has any output
that is not valid UTF-8.

Symptom in the fingerprint attack: the expander -> sort pipeline emits
"sort: Illegal byte sequence" and produces an empty .expanded file. The
empty-.expanded guard added in the previous patch then triggers the
"no candidates to expand" skip message - which is misleading, because
the user does have cracks; they were just dropped at the sort step.

Pass env={**os.environ, "LC_ALL": "C"} to all three subprocess.Popen
calls that invoke `sort -u`:

- _write_field_sorted_unique (main.py:1163)
- hcatFingerprint expander (main.py:1544)
- hcatLMtoNT combinator dedupe (main.py:2995)

LC_ALL=C makes sort collate byte-by-byte, with no character decoding.
Dedup correctness is unaffected (byte equality is locale-independent),
and hashcat doesn't care about wordlist order.

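A minimal sketch of the pattern, not the code in main.py: the helper
name sort_unique_bytes and the sample bytes are hypothetical, and it
assumes a `sort` binary on PATH. It pipes raw bytes, including
sequences that are not valid UTF-8, through `sort -u` with LC_ALL=C.

```python
import os
import subprocess


def sort_unique_bytes(data: bytes) -> bytes:
    """Pipe raw bytes through `sort -u` using byte-wise collation.

    Hypothetical helper for illustration only.
    """
    # LC_ALL overrides LC_COLLATE and LC_CTYPE, so sort compares raw
    # bytes and never tries to decode its input as UTF-8.
    env = {**os.environ, "LC_ALL": "C"}
    proc = subprocess.Popen(
        ["sort", "-u"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env=env,
    )
    out, _ = proc.communicate(data)
    if proc.returncode != 0:
        raise RuntimeError(f"sort exited with status {proc.returncode}")
    return out


# Made-up sample: one duplicate line plus two lines that are not valid UTF-8.
sample = b"password\n\xff\xfebad\ncaf\xe9\npassword\n"
result = sort_unique_bytes(sample)
```

Under these assumptions the duplicate "password" line collapses to one
entry and the two non-UTF-8 lines pass through unchanged, in plain
byte order.
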
Also add an AST-level test that fails if any future `sort` Popen call
lacks an env kwarg, so the locale fix can't silently regress.

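One way such a check could look, as a sketch only: the test added in
this commit is not reproduced here, and the function name
popen_sort_calls_missing_env and the example source are hypothetical.
It assumes the command is passed to Popen as a literal list or tuple
whose first element is "sort".

```python
import ast


def popen_sort_calls_missing_env(source: str) -> list[int]:
    """Return line numbers of Popen(["sort", ...]) calls with no env= kwarg."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both `subprocess.Popen(...)` and a bare `Popen(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "Popen" or not node.args:
            continue
        argv = node.args[0]
        if not isinstance(argv, (ast.List, ast.Tuple)) or not argv.elts:
            continue
        first = argv.elts[0]
        is_sort = isinstance(first, ast.Constant) and first.value == "sort"
        has_env = any(kw.arg == "env" for kw in node.keywords)
        if is_sort and not has_env:
            missing.append(node.lineno)
    return sorted(missing)


# Hypothetical source under test: the first Popen lacks env=, the second has it.
example = '''
import os, subprocess
subprocess.Popen(["sort", "-u"], stdin=subprocess.PIPE)
subprocess.Popen(["sort", "-u"], env={**os.environ, "LC_ALL": "C"})
'''
offenders = popen_sort_calls_missing_env(example)
```

A test built on this would read the real module source and assert
that the returned list is empty. It only checks that an env kwarg is
present, not that it sets LC_ALL=C.
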
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>