feat(scan): add web crawler and passive subdomain/url discovery

-crawl spiders same-host links, scripts and forms through the shared httpx
client, so proxy, header and rate-limit settings and robots.txt are honored;
recursion is bounded by -crawl-depth. -passive pulls subdomains from keyless
ct feeds (crt.sh, certspotter) and historical urls from wayback. each source
is isolated so one feed being down doesn't sink the rest, and the target
itself sees no traffic.
vmfunc
2026-06-09 17:57:42 -07:00
committed by vmfunc
parent 9401aa669e
commit dbe79c495e
10 changed files with 787 additions and 1 deletions
@@ -186,6 +186,26 @@ export SHODAN_API_KEY=your-api-key
./sif -u https://example.com -framework
```
### web crawler
`-crawl` - spider the target, following same-host links, scripts and forms
`-crawl-depth` - max recursion depth (default 2). the crawler respects robots.txt and stays on the target host.
```bash
./sif -u https://example.com -crawl -crawl-depth 3
```
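a minimal sketch of the idea behind a depth-bounded, same-host crawl, written in python for illustration only. it is not sif's implementation: `crawl`, `fake_fetch` and the `SITE` map are hypothetical names, the start page is assumed to count as depth 0, and robots.txt handling is left out.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(start, fetch_links, max_depth=2):
    """Breadth-first crawl from start, staying on its host, up to max_depth hops."""
    host = urlparse(start).netloc
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # depth limit reached: record the page but do not expand it
        for href in fetch_links(url):
            # resolve relative links against the current page and drop #fragments
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != host:
                continue  # off-host link: never followed
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# Hypothetical in-memory site standing in for real HTTP fetches.
SITE = {
    "https://example.com/": ["/a", "https://other.example.net/x"],
    "https://example.com/a": ["/b", "/#top"],
    "https://example.com/b": ["/c"],
    "https://example.com/c": [],
}


def fake_fetch(url):
    """Return the links 'found' on a page of the fake site."""
    return SITE.get(url, [])


if __name__ == "__main__":
    for page in crawl("https://example.com/", fake_fetch, max_depth=2):
        print(page)
```

with `max_depth=2` the sketch visits `/`, `/a` and `/b` but never reaches `/c`, and the off-host link is skipped entirely.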
### passive discovery
`-passive` - gather subdomains from certificate transparency (crt.sh, certspotter) and historical urls from the wayback machine
no api keys are needed and no traffic reaches the target itself: all lookups hit third-party feeds.
```bash
./sif -u https://example.com -passive
```
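the per-source isolation can be sketched roughly as follows, in python for illustration only. this is not sif's code: `gather` and the three feed functions are hypothetical stand-ins that return canned data instead of querying crt.sh, certspotter or wayback.

```python
def gather(domain, sources):
    """Run each source on its own; a failing source is recorded, not fatal."""
    found, errors = set(), {}
    for name, source in sources.items():
        try:
            # normalise case and trailing dots so duplicates across feeds merge
            found.update(host.lower().rstrip(".") for host in source(domain))
        except Exception as exc:  # isolate: one broken feed must not abort the rest
            errors[name] = str(exc)
    return sorted(found), errors


# Hypothetical stand-ins for real feeds; nothing here touches the network.
def feed_a(domain):
    return [f"www.{domain}", f"API.{domain}"]


def feed_b(domain):
    raise TimeoutError("feed unavailable")


def feed_c(domain):
    return [f"api.{domain}", f"mail.{domain}."]


if __name__ == "__main__":
    hosts, failed = gather(
        "example.com",
        {"feed_a": feed_a, "feed_b": feed_b, "feed_c": feed_c},
    )
    print(hosts)
    print(failed)
```

here `feed_b` fails, yet the results from the other two feeds are still merged and deduplicated, and the failure is reported separately.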
### whois lookup
`-whois` - perform whois lookups