Commit Graph

81 Commits

Author SHA1 Message Date
vmfunc 8078978a44 feat: notify integrations (slack, discord, telegram, webhook)
ship findings to chat/webhook sinks after a scan so continuous recon can
alert on what it turns up. each provider is one POST through httpx.Client,
so the global proxy/rate-limit/header config applies and there's no extra
http stack. config resolves env-first (SLACK_WEBHOOK_URL, DISCORD_WEBHOOK_URL,
TELEGRAM_BOT_TOKEN/TELEGRAM_CHAT_ID, NOTIFY_WEBHOOK_URL), overridable by a
notify-compatible yaml file so existing projectdiscovery/notify configs port
over. -notify enables it, -notify-severity gates on the finding severity
ladder (default medium), -notify-config points at the yaml. wired after the
scan loop on the severity-filtered finding set; no provider configured is a
silent no-op.
2026-06-10 16:40:14 -07:00
vmfunc 6ec0b60e5a feat: diff mode with json snapshot store
re-scans become a monitor: -diff snapshots each target's normalized
findings to a per-target json file and, on the next run, surfaces only
the delta (+ new / - gone) against the last snapshot, then overwrites it
so each run diffs against the previous one. behavior is unchanged when
-diff is off.

new internal/store keys the set-difference off finding.Key (already
stable across runs) and uses only encoding/json + os - no new deps.
snapshot files are sanitized per target (no traversal), written 0600
under 0750 dirs. -store picks the location: explicit dir, else the log
dir, else <user-config>/sif/state. a missing snapshot is a clean
baseline, a corrupt one self-heals on the next save.
2026-06-10 16:39:04 -07:00
vmfunc 22168611e4 perf(scan): work-stealing worker pool replacing static stride partition
static i%threads stride partitions assign each item to a fixed worker, so a
goroutine stuck on a slow or timing-out host stalls while the rest idle behind
it (head-of-line blocking). add internal/pool.Each: workers all pull from one
shared channel, so a slow item only blocks its own worker and the others keep
draining. migrate git, ports, robots (scan), dnslist, dork, dirlist and
subdomaintakeover off the stride loops; per-item work, mutex-guarded appends and
progress increments are unchanged, results were already unordered.
2026-06-10 16:39:04 -07:00
vmfunc 57b1bd7113 fix(favicon): use twmb/murmur3; spaolacci does checkptr-unsafe pointer math that crashes under -race (which CI runs) 2026-06-10 16:08:47 -07:00
vmfunc ab731d0562 feat(scan): add jwt, openapi and favicon-hash scanners
jwt fetches the target once then analyzes every harvested token offline:
flags alg:none, the rs256->hs256 confusion surface, missing/expired exp
and plaintext sensitive claims, and cracks a small bundled weak-hmac list.
openapi probes the conventional spec paths, parses json/yaml and enumerates
paths plus unauthenticated operations. favicon computes the shodan-style
mmh3 hash (python base64.encodebytes chunking, signed int32) for tech
fingerprinting and the http.favicon.hash pivot, pinned by a golden test.
2026-06-10 15:50:59 -07:00
vmfunc ef0408ee8d feat: pipe mode (stdin targets, naked-host, -silent plain output)
sif can now slot into unix pipelines. stdin is drained for targets when
it's a pipe (keyed off stdin's mode, not stdout), alongside -u/-f. naked
hosts are accepted and default to https://; explicit http(s) is kept,
other schemes rejected. -silent routes all banner/spinner/log chrome to
stderr and prints one normalized finding per line to stdout via
finding.Flatten, so `subfinder | sif -silent | notify` works.
2026-06-10 15:50:58 -07:00
vmfunc 0383a7bcd2 perf(scan): drain response bodies so pooled connections get reused
go only returns a conn to the idle pool when the body is read to EOF
before Close. the header-only and early-return scan paths closed an
unread body, leaking the conn and forcing a fresh dial each request.
route those close sites through httpx.DrainClose so the tuned pool from
phase 1 actually gets reused. body-read paths (scanner/io.ReadAll) are
left untouched.
2026-06-10 15:50:58 -07:00
vmfunc a5f42ddfa6 feat(dnslist): async dns resolution with wildcard filtering
dnslist previously http-probed every wordlist candidate through the
blocking os resolver, so a big list meant a request per dead name and a
wildcard zone flooded results. resolve each candidate first via a new
internal/dnsx (retryabledns over a bundled 1.1.1.1/8.8.8.8/9.9.9.9 pool,
promoted to a direct dep), fingerprint the apex with random nonexistent
labels to detect catch-all zones, and http-probe only the names that
actually resolve and aren't wildcard. add -resolvers to override the
pool. resolverFn is a package-level seam so the dnsx tests stay
hermetic; the dnslist newDNSResolver seam keeps the integration test
network-free.
2026-06-10 15:30:03 -07:00
vmfunc 1237f3f09e feat(finding): normalized finding layer for notify and diff
scan results live in ~two dozen structs with no shared shape, so every
consumer that wants "what did this run turn up" reimplements the
type-switch. add internal/finding: an ordered Severity (info<low<medium<
high<critical, with parse/compare) and Flatten, the single type-switch
that collapses every scan result struct into flat, severity-ranked
Findings keyed module:identifier for stable dedup/diff.

wire collectFindings off Flatten in the run loop so notify and diff
(later bundles) build on one normalization path instead of re-deriving
it; the report path keeps emitting raw json blobs unchanged. expose
JavascriptScanResult.SupabaseFindings so the js internals stay private.

the guard test iterates a representative instance of every ResultType
and fails if Flatten lacks a case (falls through to :unhandled) - so a
new scanner can't ship without a Flatten case landing too.
2026-06-10 15:29:20 -07:00
vmfunc 546ab091da perf(httpx): tune transport for connection reuse and add DrainClose
the shared transport was a bare DefaultTransport.Clone() with the stock
MaxIdleConnsPerHost=2, and call-sites only close response bodies without
draining them - so go could never return a conn to the idle pool and every
request re-dialed. high thread counts just thrashed the dialer.

- plumb Threads through Options into buildTransport; size MaxIdleConnsPerHost
  to the worker count (floored) so concurrent workers on one host pool instead
  of re-dialing, MaxIdleConns=512, MaxConnsPerHost=0, IdleConnTimeout=90s,
  ForceAttemptHTTP2. the socks5 branch gets its own keepalive net.Dialer so it
  doesn't lose os-level pooling under proxy.Direct.
- add DrainClose to read (capped) and close a body so the conn is reusable.
- benchmark proves it: 50 sequential requests reuse 1 conn tuned vs 50 bare.
2026-06-10 15:29:20 -07:00
vmfunc c3a755f934 feat: live-host probe and sarif/markdown report export
adds an httpx-style -probe scanner reporting liveness, final status, page
title, server header and the redirect chain, plus -sarif/-markdown export
flags that serialize the collected run after the scan loop. the report
serializers live in a decoupled internal/report package consuming a raw-json
result model so they never import scan types.
2026-06-10 14:47:17 -07:00
vmfunc 5050900f29 feat(dirlist): response filters, wildcard calibration and custom wordlists
the old scanner surfaced every response that wasn't 404/403, so modern SPA
catch-all 200s flooded the output and made -dirlist near-useless. add ffuf-style
matching:

- -mc/-fc/-fr and -fs/-fw filter by status, regex, body size and word count;
  bodies are read through a capped io.LimitReader so size/word counts are
  deterministic and memory stays flat. filters win over matches.
- -ac auto-calibrates the soft-404 baseline from a few deterministic
  non-existent paths and drops responses matching that wildcard shape.
- -w overrides the size switch with a local file or remote list (fetched through
  the shared client so proxy/rate-limit apply); -e appends extensions per word.

size and words are added to DirectoryResult for the json output.
2026-06-10 14:47:17 -07:00
vmfunc 320fc3d4e7 test(modules): cover matchers, extractors, loader and executor
the yaml module engine (the user-facing extensibility surface) had 0%
test coverage. add table-driven tests for the matcher types
(status/word/regex + and/or + negative), checkWords/checkRegex (incl
invalid-pattern fail-closed under AND, skip under OR), runExtractors
(regex capture groups, group-index bounds, part selection),
substituteVariables and generateHTTPRequests (path x payload expansion),
and ParseYAMLModule on valid + malformed yaml. drive ExecuteHTTPModule
end-to-end against an httptest server through the shared httpx client so
matcher hits and extractor captures are exercised for real. coverage
0% -> 93.7%.

also: ExecuteDNSModule/ExecuteTCPModule were stubs returning an empty
result with nil error, so a type:dns/type:tcp module silently reported
"0 findings" - indistinguishable from a real clean scan. make them
return ErrUnsupportedModuleType (sentinel, wrapped with the module id) so
the existing caller logs a clear failure instead. a test pins the new
behavior.

bodyclose is excluded for test files in .golangci.yml: the synthetic
*http.Response fixtures carry no socket, mirroring the existing _test.go
slack for errcheck/noctx/gosec.
2026-06-10 14:47:17 -07:00
vmfunc 839c0a779c fix(scan): dnslist dedup, robots recursion bound, framework version lookup, takeover cname
four recon-flagged bugs, each with a focused test:

- dnslist fired both http and https per candidate and counted a "found"
  on any non-error response (incl 404 and wildcard catch-all redirects),
  so every host double-counted and a wildcard-dns host flooded results.
  probe http then https with per-subdomain dedupe, gate on a meaningful
  (2xx, non-redirect) status, and stop chasing redirects so a catch-all
  301 reads as a redirect instead of a 200.

- fetchRobotsTXT recursed on every 301 Location with no depth limit and
  no visited set, so an A->B->A loop blew the stack. bound it to a named
  hop cap and a visited set, iteratively.

- framework cve lookup used best.version ("unknown" when the detector
  only fingerprints the framework) and threw away the version
  ExtractVersionOptimized dug out of the body, missing CVEs. reconcile
  via resolveVersion, preferring the extracted concrete version.

- subdomain takeover flagged a dangling cname whenever a no-such-host
  coincided with ANY cname (LookupCNAME echoes the host back for plain A
  records). only flag when the cname points off-host at a known
  takeoverable provider.
2026-06-10 14:47:17 -07:00
vmfunc dbe79c495e feat(scan): add web crawler and passive subdomain/url discovery
-crawl spiders same-host links/scripts/forms through the shared httpx
client so proxy/headers/rate-limit and robots.txt are honored, bounded
by -crawl-depth. -passive pulls subdomains from keyless ct feeds (crt.sh,
certspotter) and historical urls from wayback, each source isolated so
one feed being down doesn't sink the rest and the target sees no traffic.
2026-06-09 18:11:38 -07:00
vmfunc 9401aa669e feat(scan): add cors, open-redirect and reflected-xss probes
three active web-vuln probes wired into the per-target loop:

- cors: crafts attacker origins (evil sentinel, null, prefix/suffix
  bypass, http downgrade) and flags responses that reflect them in
  access-control-allow-origin, ranking reflection+credentials high.
- redirect: injects a controlled sentinel host plus bypass variants
  (//, https:/, backslash, null-byte, userinfo @) into redirect-prone
  params and catches 30x location, meta-refresh and js redirects that
  resolve off-site.
- xss: injects a unique canary wrapped in breaking chars, classifies
  the reflection context (html/attribute/script) and reports only the
  chars that survive unescaped where they matter, so escaped
  reflections don't false-positive.

all route through httpx.Client so proxy/-H/-cookie/-rate-limit apply.
hermetic httptest coverage plus integration testbed entries.
2026-06-09 18:11:38 -07:00
vmfunc b4e78114d7 feat(js): extract secrets and endpoints from scanned javascript
the -js pipeline already pulls every <script> into a buffer but only
mined supabase jwts from it. reuse that buffer to run a credential
regex bank (aws/github/slack/stripe/google keys, pem blocks, plus
entropy-gated generic apikey/secret/token assignments) and a
linkfinder-style endpoint extractor that resolves relatives to
absolute urls. both dedupe across scripts and surface through the
existing js logger and result struct, no new flag.
2026-06-09 18:11:38 -07:00
vmfunc d0bdcf1690 feat: shared http client with proxy, custom headers and rate limiting
every scanner spun up its own &http.Client, so there was no single place
to apply a proxy, custom headers, a cookie or a rate limit. add an
internal/httpx package that builds one configured transport at startup and
hand it to every scanner via httpx.Client(timeout), keeping behavior
identical when nothing is set (plain client when Configure was never
called).

- httpx.Configure wires -proxy (http/https/socks5), -H/--header, -cookie
  and -rate-limit into a package-level RoundTripper that paces via a
  rate.Limiter and only sets headers the caller hasn't already, so a
  scanner's explicit api key still wins.
- route the scan/wordlist downloads that used http.DefaultClient through
  the shared client too; ports tcp dialing is left untouched.
- clamp -threads to a floor of 1: it feeds wg.Add across the scanners, so
  0 was a silent no-op and a negative value panicked the waitgroup.

document the new flags in the readme, usage docs and man page.
2026-06-09 17:28:14 -07:00
celeste cb194406a7 Merge pull request #116 from vmfunc/test/scanner-seams
test(scan): seam shodan/securitytrails/cloudstorage/dnslist for hermetic tests
2026-06-09 16:26:40 -07:00
vmfunc 912f6e8e0e test(scan): seam shodan/securitytrails/cloudstorage/dnslist for hermetic integration tests
the remaining hardcoded base urls had no test seam, so their drivers could
only be exercised against the live apis. promote them to package vars (matching
the dirlist/git/ports pattern from #112) and route dnslist's per-host probes
through an injectable transport, then add integration tests that pin each at a
local httptest fixture. defaults equal the old const values so behavior is
unchanged.
2026-06-09 16:07:20 -07:00
vmfunc 094f1e7806 fix(output): dedupe non-tty progress milestones
concurrent workers (-threads 40) all hit the same milestone bucket on
increment, spamming ~10 duplicate [25%] lines. track the last printed
bucket under p.mu and only print when it advances.
2026-06-09 16:03:52 -07:00
celeste 83ac92a4b8 Merge pull request #113 from vmfunc/fix/frameworks-confidence
fix(frameworks): require a real signature match + fix cve version matching
2026-06-09 15:02:40 -07:00
celeste 7f0e4cd128 Merge pull request #112 from vmfunc/test/e2e-integration
test: hermetic e2e integration suite
2026-06-09 14:50:54 -07:00
vmfunc 29d94e5352 fix(frameworks): require a real signature match, fix cve version matching
- recenter the detection confidence (sigmoid centered at 0.3) so a single weak
  signature match no longer clears the 0.5 threshold. before, sigmoid(0) was 0.5
  so *any* match counted as a detection - that's the magento-on-a-plain-page
  false positive from the live run. real detections match ~50%+ of signature
  weight, so the existing detector tests are unaffected
- getVulnerabilities matched affected versions with a raw string prefix, so "4.2"
  also matched "4.20"; match only on dotted boundaries now
- break confidence ties on name so the picked framework is deterministic
- add regression tests for the confidence floor and the version boundary
2026-06-09 14:46:10 -07:00
vmfunc ce3075ad91 test: hermetic e2e integration suite
- make the four wordlist base urls (dirlist/dnslist/git/ports) package vars
  instead of consts so tests can repoint them at a local fixture; the default
  values are byte-for-byte unchanged
- add internal/scan/integration_test.go behind a //go:build integration tag: it
  stands up a local "vulnerable app" httptest server with planted artifacts and
  runs git/dirlist/cms/headers/sql/lfi/ports against it, asserting real findings
- go.yml runs them via `go test -tags=integration`; the default test run is
  untouched (the tag keeps them out)
- document the integration run in docs/development.md
2026-06-09 14:32:26 -07:00
vmfunc 661480a56d feat: stamp and surface the build version
- add internal/version: resolve from the release ldflag, else the go build
  info (module tag / vcs revision), else "dev"
- show the version on the boot banner and for `sif version`
- Makefile now stamps `make` builds via git describe (matching the release ci),
  so local/go-install builds report a real version instead of "dev"
- patchnotes.ShowOnce skips pseudo/dev versions so non-release builds dont make
  a doomed github call
- document sif version / sif patchnote / SIF_NO_PATCHNOTES in the readme + usage
2026-06-09 14:18:28 -07:00
celeste 76e8893ee2 Merge pull request #110 from vmfunc/fix/supabase-timeout
fix(js): give supabase requests a real timeout
2026-06-09 14:11:53 -07:00
vmfunc eb33321102 fix(js): give supabase requests a real timeout
doSupabaseRequest and the signup call used a bare http.Client{} with no
timeout, so a slow supabase project could hang the whole js scan. thread the
scan's --timeout down through ScanSupabase into every supabase request.
2026-06-09 13:56:01 -07:00
vmfunc 133224c348 refactor: dedupe url scheme stripping
`strings.Split(url, "://")[1]` was copy-pasted in 18 spots and panics on a
schemeless target (index out of range). add a small stripScheme helper in the
scan package - and a guarded equivalent in logger, which cant import scan - so
a bare host degrades gracefully instead of crashing the scan.
2026-06-09 13:51:04 -07:00
vmfunc 5e10c1857b feat: show release notes via patch notes
- `sif patchnote` (also `-pn`) fetches the latest github release and renders
  its notes with glamour
- on the first run of a new version those notes are shown once, then recorded
  so they dont show again - best-effort, so dev builds, the SIF_NO_PATCHNOTES
  opt-out, and any network failure stay quiet
- wire up `var version` so the release `-X main.version` ldflag actually lands,
  and add `sif version`
2026-06-08 19:13:03 -07:00
vmfunc 9326465a46 chore: drop decorative emoji and redundant comments
strips the emoji from the subdomaintakeover/cloudstorage/supabase log
prefixes and trims comments that just restate the code (the
parameters/returns block on SubdomainTakeover, a couple of "this provides
type safety" notes) so the scanners read like whois.go/headers.go.
2026-06-08 18:53:06 -07:00
vmfunc 50c9933812 chore: drop the unused worker pool
internal/worker.Pool[T,R] (run/runwithfilter/foreach) was added with tests
but never wired into anything - the scanners all hand-roll their own
waitgroup loops. dead weight, so remove it.
2026-06-08 18:53:06 -07:00
vmfunc af0167859a fix: data races and a progress divide-by-zero
- git.go and dork.go appended to a shared results slice from every worker
  goroutine with no mutex - a real race that -race never caught since
  neither has a test. guard the appends like dirlist/dnslist already do.
- progress.go's non-tty milestone path divided by total with no guard, so
  a zero-total bar panicked when output was piped/redirected. bail early
  on total <= 0 to match the tty branch, and add an output test for it.
2026-06-08 18:53:06 -07:00
vmfunc 7efd62c804 feat: add security header analysis scan
adds a -sh/--security-headers scan that flags missing or weak response
headers (hsts, csp, x-frame-options, x-content-type-options,
referrer-policy, permissions-policy, coop) and headers that leak server
internals (server, x-powered-by, ...). hsts is only graded over https
where it actually applies. wired into App.Run and the module results.
2026-06-08 18:53:06 -07:00
vmfunc 648fa8d2c8 chore: bump copyright headers to 2026
rolls the (c) 2022-2025 banner to 2022-2026 across all go files, the
startup banner in sif.go, and the header-check workflow's expected
format. comment-only, nothing else changes.
2026-06-08 18:30:48 -07:00
vmfunc 4fc0df5a01 fix(templates): guard tar extraction against path traversal
The nuclei-templates tarball is fetched over the network and its entry
names flowed directly into os.Mkdir/os.Create, so a malicious or
compromised archive could write outside the extraction directory
("Zip Slip", CWE-22). Resolve each entry against the working directory
and reject any path that escapes it before touching the filesystem.

CodeQL flagged this as a high-severity alert on the lines this branch
already touched. gosec's G305 fires on filepath.Join with archive data
regardless of the traversal guard, so it's excluded with a note.
2026-06-08 17:35:05 -07:00
Claude ece5b2b0b0 chore: clean up lint exclusions deferred in #98
Address pre-existing code issues that were suppressed in #98 to keep that
PR scoped to the Go 1.25 / golangci-lint v2 toolchain bump.

https://claude.ai/code/session_01S433Zq3Xzm3ZethsqkyaZF
2026-06-08 16:56:49 -07:00
vmfunc 495d2c5496 feat: add securitytrails integration for domain discovery + target expansion 2026-02-17 13:38:07 +01:00
vmfunc 75da3e3131 fix: resolve all golangci-lint issues across codebase
- noctx: use http.NewRequestWithContext instead of http.Get/client.Get
- bodyclose: close response bodies on all code paths
- httpNoBody: use http.NoBody instead of nil for GET request bodies
- ifElseChain: convert if/else chains to switch in sif.go
- sloppyReassign: use := in logger.go where possible
- nilnil: annotate intentional nil,nil returns in lfi.go, sql.go
- errcheck: handle template install error in nuclei.go
- govet copylock: pass mutex by pointer in executor.go
- log.Fatalf: replace with log.Errorf+continue in api mode
2026-02-13 02:11:17 +01:00
vmfunc bad5b598c9 test: add fuzz tests for LFI detection, SQL patterns, version parsing
fuzz targets: DetectLFIFromResponse, isAdminPanel, databaseErrorPatterns,
isValidVersionString, ExtractVersionOptimized - should bump the scorecard
fuzzing check.

Signed-off-by: vmfunc <celeste@linux.com>
2026-02-13 01:57:46 +01:00
vmfunc 03a9488b65 internal/scan: migrate nuclei integration to v3 SDK
replace ~100 lines of manual nuclei v2 plumbing (catalog, loader, core,
protocolstate, protocolinit, hosterrorscache, interactsh, reporting,
ratelimit, testutils) with the v3 lib SDK - NewNucleiEngineCtx +
functional options.

drops direct ratelimit dep, mholt/archiver and nwaples/rardecode
(resolves dependabot CVE alerts for path traversal + DoS).

Signed-off-by: vmfunc <celeste@linux.com>
2026-02-13 01:22:25 +01:00
vmfunc f50f1b933a Merge branch 'main' into feat/builtin-shodan 2026-02-08 19:22:32 +01:00
vmfunc 6f460425be Merge pull request #63 from 0x4bs3nt/feat/builtin-whois
feat(modules): builtin whois scan as module
2026-02-08 14:12:27 +01:00
vmfunc 16ea9047f0 Merge branch 'main' into feat/builtin-frameworks 2026-01-12 11:22:56 +01:00
vmfunc 39bd115d3c Merge branch 'main' into feat/builtin-shodan 2026-01-12 11:22:36 +01:00
vmfunc ccf093b7e9 fix: rename to snakecase 2026-01-12 11:19:54 +01:00
vmfunc b5398ec687 fix: renamed whois module file
Renamed whois scan module file to differentiate from legacy whois scan
file.
2026-01-12 11:19:54 +01:00
vmfunc b298e2ec2c fix(conflicts): fix PR conflicts on 2026-01-12 11:19:48 +01:00
vmfunc 95cebab47f fix: rename to snakecase 2026-01-07 22:39:56 +01:00
vmfunc 579f5aff4b fix: rename to snakecase 2026-01-07 22:39:35 +01:00