3 Commits

Author SHA1 Message Date
Willi Ballenthin 877d8da73c mapa: add --output=html-map
Polish MAPA html split view
2026-03-16 19:54:45 +01:00
Willi Ballenthin a364659cc3 mapa: port from Lancelot/BinExport2 to IDALib/ida-domain
Replace the Lancelot/BinExport2 backend with an IDALib-only implementation
using ida-domain as the primary query surface.

New mapa/ package with five layers:
- model.py: backend-neutral dataclasses (MapaReport, MapaFunction, etc.)
- ida_db.py: database lifecycle with SHA-256 caching and flock guards
- collector.py: populates MapaReport from an open ida_domain.Database
- renderer.py: Rich-based text output from MapaReport
- cli.py: argument parsing, capa/assemblage loading, orchestration
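
The SHA-256 caching and flock guard attributed to ida_db.py can be sketched
roughly as follows. This is a minimal illustration, not the actual module:
the function names, the .i64 cache layout, and the sidecar .lock file are
assumptions, and fcntl is only available on Unix-like systems.

```python
import fcntl
import hashlib
from contextlib import contextmanager
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large samples are not read into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_db_path(sample: Path, cache_dir: Path) -> Path:
    # Hypothetical layout: one database per sample, named by content hash.
    return cache_dir / f"{sha256_of(sample)}.i64"


@contextmanager
def exclusive_lock(db_path: Path):
    """Hold an advisory flock on a sidecar file while the database is in use."""
    lock_path = db_path.with_name(db_path.name + ".lock")
    with lock_path.open("w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield db_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Keying the cache on the content hash means a renamed copy of the same
sample reuses the existing database, and the lock keeps two processes from
opening that database at the same time.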

Key behaviors preserved from the original:
- Report sections: meta, sections, libraries, functions (modules removed)
- Thunk chain resolution (depth 5, matching capa's THUNK_CHAIN_DEPTH_DELTA)
- Caller forwarding through thunks
- CFG stats with NOEXT|PREDS flags
- String extraction via data-reference chains (depth 10)
- Assemblage overlay and capa match attachment
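
As an illustration of the depth-limited thunk chain resolution listed above,
here is a minimal sketch. The thunk_targets mapping and the function name
are hypothetical stand-ins for whatever the collector derives from IDA; only
the depth limit of 5 comes from this message.

```python
THUNK_CHAIN_DEPTH = 5  # depth limit named in the commit message


def resolve_thunk(ea: int, thunk_targets: dict[int, int],
                  max_depth: int = THUNK_CHAIN_DEPTH) -> int:
    """Follow thunk -> target links for at most max_depth hops.

    thunk_targets maps a thunk's address to the address it jumps to
    (a hypothetical precomputed table). The hop limit also bounds cycles.
    """
    for _ in range(max_depth):
        target = thunk_targets.get(ea)
        if target is None:
            return ea  # not a thunk: this is the real function
        ea = target
    return ea  # limit reached: return the last address seen
```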

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: suppress Lumina via IdaCommandOptions.plugin_options

Match capa's loader.py behavior: disable primary and secondary Lumina
servers by passing plugin_options through IdaCommandOptions, which maps
to IDA's -O switch. load_resources=True already provides -R.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: add __main__.py for python -m mapa invocation

scripts/mapa.py shadows the mapa package when run directly because
Python adds scripts/ to sys.path. The canonical invocation is now:

    python -m mapa <input_file> [options]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: import idapro before ida_auto

idapro must be imported first because it mutates sys.path to make
ida_auto and other IDA modules available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: guard against InvalidEAError in string/xref lookups

ida-domain raises InvalidEAError for unmapped addresses instead of
returning None. Guard data_refs_from_ea and strings.get_at calls
so the collector handles broken reference chains gracefully.
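
A minimal sketch of the guard pattern, runnable without IDA. The
InvalidEAError class below is a local stand-in for ida-domain's exception,
and guarded() is a hypothetical helper name, not the collector's actual code.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class InvalidEAError(Exception):
    """Local stand-in for ida-domain's exception so the sketch runs without IDA."""


def guarded(lookup: Callable[[int], T], ea: int) -> Optional[T]:
    """Call lookup(ea), mapping InvalidEAError to None.

    This restores the "None means nothing there" contract for callers
    that walk reference chains.
    """
    try:
        return lookup(ea)
    except InvalidEAError:
        return None


def fake_get_string_at(ea: int) -> str:
    # Hypothetical lookup: only one address is mapped.
    if ea != 0x401000:
        raise InvalidEAError(hex(ea))
    return "hello"
```

With this shape, guarded(fake_get_string_at, 0xDEAD) yields None instead of
aborting the whole collection pass.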

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: change default/key theme color from black to blue

Black text is invisible on dark terminals. Use blue for function names,
keys, and values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: use module.dll!func format for APIs and libraries

IDA strips .dll from PE import module names. Add it back so libraries
render as 'KERNEL32.dll' and API entries as 'KERNEL32.dll!CreateFileW'.
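
A minimal sketch of the formatting rule, with hypothetical function names.
It assumes every import module is a .dll, which glosses over modules with
other extensions, and it leaves case untouched (lowercasing is a separate
commit in this log).

```python
def format_library(module: str) -> str:
    """Restore the .dll suffix if the import module name lacks it."""
    return module if module.lower().endswith(".dll") else module + ".dll"


def format_api(module: str, function: str) -> str:
    """Render an import as module.dll!function."""
    return f"{format_library(module)}!{function}"
```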

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: lowercase module names in libraries and API entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: use FLOSS/capa regex-based string extraction instead of IDA string list

IDA's built-in string list has a minimum length threshold (~5 chars)
that silently drops short strings like "exec". Replace db.strings and
ida_bytes.get_strlit_contents with regex-based extraction from FLOSS/capa
that scans raw segment bytes for ASCII and UTF-16 LE strings (min 4 chars).
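
The regex approach can be sketched as follows. The character class is an
approximation of the FLOSS/capa printable set (space through tilde, plus
tab), not a copy of their patterns, and extract_strings is a hypothetical
name.

```python
import re

MIN_LEN = 4  # minimum length from the commit message
PRINTABLE = rb"[\x20-\x7e\t]"  # approximation of the printable ASCII set

# Runs of printable bytes, and runs of printable bytes each followed by NUL.
ASCII_RE = re.compile(PRINTABLE + rb"{%d,}" % MIN_LEN)
WIDE_RE = re.compile(rb"(?:" + PRINTABLE + rb"\x00){%d,}" % MIN_LEN)


def extract_strings(buf: bytes) -> list[tuple[int, str]]:
    """Return (offset, text) for ASCII and UTF-16 LE strings in buf."""
    found = [(m.start(), m.group().decode("ascii"))
             for m in ASCII_RE.finditer(buf)]
    found += [(m.start(), m.group().decode("utf-16-le"))
              for m in WIDE_RE.finditer(buf)]
    return sorted(found)
```

Because the threshold is 4, a short string such as "exec" is kept, while
three-character runs are still ignored.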

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: simplify string extraction to on-demand via get_cstring_at

Replace upfront segment-scanning index with on-demand reads using
db.bytes.get_cstring_at, validated against FLOSS/capa printable ASCII
charset. The index approach missed mid-string references and did
unnecessary work scanning entire segments.
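
A minimal sketch of the on-demand read over a plain bytes buffer instead of
db.bytes.get_cstring_at. The function name, the min_len of 4, and the
max_len cap are assumptions for illustration.

```python
PRINTABLE_ASCII = frozenset(range(0x20, 0x7F)) | {0x09}  # space..tilde, plus tab


def read_cstring_at(buf: bytes, offset: int, min_len: int = 4,
                    max_len: int = 2048):
    """Read a NUL-terminated string starting exactly at offset.

    Returns None if no terminator is found within max_len bytes, the string
    is shorter than min_len, or it contains a non-printable byte.
    """
    end = buf.find(b"\x00", offset, offset + max_len)
    if end == -1:
        return None
    raw = buf[offset:end]
    if len(raw) < min_len or any(b not in PRINTABLE_ASCII for b in raw):
        return None
    return raw.decode("ascii")
```

Because the read starts at whatever address the reference points to, a
pointer into the middle of "hello world" still yields "world", which a
start-offset index would miss.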

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mapa: add UTF-16 LE wide string extraction

Read raw bytes at data reference targets and check for both ASCII and
UTF-16 LE strings using FLOSS/capa printability heuristics. Neither
ida_domain's get_cstring_at nor get_string_at handles wide strings, so
we parse the byte patterns directly.
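
Parsing the byte patterns directly might look like the sketch below, which
tries ASCII first and then UTF-16 LE at a given offset. The names, the
min_len of 4, and the printable range are assumptions that approximate the
FLOSS/capa heuristics rather than reproduce them.

```python
def _is_printable(b: int) -> bool:
    # Approximation of the printable set: space..tilde, plus tab.
    return 0x20 <= b <= 0x7E or b == 0x09


def read_string_at(buf: bytes, offset: int, min_len: int = 4):
    """Return the ASCII or UTF-16 LE string starting at offset, else None."""
    # ASCII: a run of printable bytes.
    i = offset
    while i < len(buf) and _is_printable(buf[i]):
        i += 1
    if i - offset >= min_len:
        return buf[offset:i].decode("ascii")

    # UTF-16 LE: printable byte followed by a zero byte, repeated.
    i = offset
    while i + 1 < len(buf) and _is_printable(buf[i]) and buf[i + 1] == 0:
        i += 2
    if (i - offset) // 2 >= min_len:
        return buf[offset:i].decode("utf-16-le")
    return None
```

This only covers wide strings whose code units fall in the ASCII range,
which matches the byte pattern described above but not arbitrary UTF-16 text.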

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 15:04:26 +01:00
Willi Ballenthin 5dd1f49023 import codemap as mapa
2026-03-16 11:23:35 +01:00