Compare commits

379 Commits

Author SHA1 Message Date
Ana María Martínez Gómez
407ecab162 Merge pull request #515 from Ana06/v1-6-1 2021-04-07 18:03:56 +02:00
Ana Maria Martinez Gomez
cbc1f57b21 changelog: add master (unreleased) to CHANGELOG
Add placeholder for master (unreleased changes) in CHANGELOG. Document
this in the release checklist.
2021-04-07 17:50:19 +02:00
Ana Maria Martinez Gomez
374a9e4337 changelog: v1.6.1
This release includes several bug fixes, such as a vivisect fix for a bug that prevented capa from working on Windows with Python 3. It also adds 17 new rules and a number of improvements to the rules and the IDA rule generator. We appreciate everyone who opened issues, provided feedback, and contributed code and rules.

This is the very last capa release that supports Python 2.
2021-04-07 17:50:16 +02:00
Capa Bot
83e2f80d10 Sync capa-testfiles submodule 2021-04-07 13:53:32 +00:00
Ana Maria Martinez Gomez
576211c4ef version: bump to v1.6.1 2021-04-07 11:11:43 +02:00
Ana María Martínez Gómez
31fc5a31d6 Merge pull request #513 from Ana06/ping-dependencies
setup: pin dependencies
2021-04-07 10:19:04 +02:00
Ana Maria Martinez Gomez
eb08943d4f setup: pin dependencies
Pin all dependencies in setup to the currently used versions so that a
new release of a dependency cannot break capa without being noticed.

Closes https://github.com/fireeye/capa/issues/498
2021-04-07 09:40:13 +02:00
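The pinning idea in this commit can be checked mechanically. The following is a minimal sketch, not code from capa's setup: the helper name `unpinned` and the `PINNED` pattern are hypothetical, and the pattern only covers simple `name==version` requirement strings with an optional environment marker.

```python
import re

# Hypothetical pattern: "name==version", optionally followed by "; marker".
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.]+(\s*;.*)?$")


def unpinned(requirements):
    """Return the requirement strings that are not pinned to an exact version."""
    return [req for req in requirements if not PINNED.match(req.strip())]
```

Calling `unpinned(["viv-utils==0.6.0", "smda>=1.5.11"])` would report only the second entry, since a `>=` specifier still lets a new upstream release be installed.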
Ana María Martínez Gómez
c36ed71353 Merge pull request #470 from fireeye/ci/test-windows 2021-04-07 09:38:34 +02:00
Ana Maria Martinez Gomez
fa52dbcf84 ci: skip smda tests in win32
Due to a bug, two `test_smda_features` tests are failing:
https://github.com/danielplohmann/smda/issues/20

Disable them until the bug is fixed.
2021-04-06 21:53:22 +02:00
Ana Maria Martinez Gomez
d412e66cea ci: do not test Python 2.7 with Windows
The Python 2.7 tests fail on Windows with vivisect because the Windows
filesystem encoding is not UTF-8. This shouldn't be a problem when using
capa, as the given filename most likely uses the same encoding, but we
force UTF-8 in our tests. As we are planning to remove Python 2 support,
it is not worthwhile to invest time in making this test work. Instead,
test Python 2.7 only on Ubuntu.
2021-04-06 21:39:01 +02:00
Moritz Raabe
efe50d3313 ci: test on Windows and macOS
Run the tests on Windows and macOS to catch OS-related failures.

closes #460
2021-04-06 21:38:07 +02:00
Ana María Martínez Gómez
1062ba995e doc: add milestones link to release checklist
This makes it a bit easier to check if all milestoned issues/PRs are addressed, or reassign to a new milestone.

I am committing directly to master as this is a minor change which doesn't need review.
2021-04-06 10:21:43 +02:00
Ana María Martínez Gómez
7f93bd5b59 Merge pull request #512 from fireeye/williballenthin-patch-2
setup: bump viv to v1.0.1
2021-04-06 10:17:44 +02:00
Willi Ballenthin
275d170680 setup: bump viv to v1.0.1 2021-04-05 21:22:17 -06:00
Moritz
6d7e10b804 Merge pull request #511 from fireeye/ci/fix-typos
fix submodule typos
2021-04-05 13:13:41 +02:00
Moritz Raabe
25944864f7 fix submodule typos 2021-04-05 12:52:08 +02:00
Capa Bot
5e84a16eba Sync capa rules submodule 2021-04-01 16:44:59 +00:00
Capa Bot
244ec163a3 Sync capa-testfiles submodule 2021-04-01 16:44:11 +00:00
Capa Bot
dabd2174d4 Sync capa rules submodule 2021-03-29 16:25:18 +00:00
Moritz
f8d2b41a86 Merge pull request #495 from fireeye/gh/add-pr-template
add PR template
2021-03-29 17:31:05 +02:00
Capa Bot
902972a1ee Sync capa-testfiles submodule 2021-03-29 12:49:24 +00:00
Capa Bot
bddb5fbd2f Sync capa rules submodule 2021-03-26 11:17:46 +00:00
Capa Bot
adfd769963 Sync capa-testfiles submodule 2021-03-26 11:00:35 +00:00
Capa Bot
c75e70ec74 Sync capa-testfiles submodule 2021-03-26 11:00:15 +00:00
Moritz
6118183105 Merge pull request #504 from fireeye/mr-tz-patch-1
Update setup.py
2021-03-26 11:58:52 +01:00
Moritz
da755d8411 Update setup.py 2021-03-26 11:44:04 +01:00
mike-hunhoff
742e03d90f Merge pull request #503 from fireeye/explorer/update-readme
updating capa explorer README
2021-03-25 14:51:21 -06:00
Capa Bot
744228a03e Sync capa rules submodule 2021-03-25 20:48:41 +00:00
Michael Hunhoff
5d1c6f54cd updating capa explorer README 2021-03-25 14:30:28 -06:00
mike-hunhoff
0a3dd4600b Merge pull request #468 from fireeye/features/support-string-values-special-chars
add support for string features with special characters e.g. '\n'
2021-03-25 12:58:00 -06:00
Michael Hunhoff
0289891d07 merging upstream 2021-03-25 12:43:59 -06:00
Michael Hunhoff
87cdf837e6 merging upstream 2021-03-25 12:42:36 -06:00
Capa Bot
ea4c7d6403 Sync capa rules submodule 2021-03-25 18:37:22 +00:00
Capa Bot
2807549564 Sync capa rules submodule 2021-03-25 07:21:21 +00:00
Capa Bot
c0fe96cec6 Sync capa-testfiles submodule 2021-03-25 07:17:41 +00:00
mike-hunhoff
8c967ac237 Merge pull request #500 from fireeye/explorer/improve-rulegen-search
explorer: add checks to validate matched data when searching
2021-03-24 15:55:34 -06:00
Michael Hunhoff
c48b46e932 explorer: adding checks to validate matched data when searching 2021-03-24 15:33:20 -06:00
mike-hunhoff
49d1af7798 improve unit tests for strings containing special characters
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-03-24 13:22:18 -06:00
mike-hunhoff
d44fd008ae improve unit tests for strings containing special characters
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-03-24 13:22:04 -06:00
Moritz Raabe
c0c9ea3403 incorporate Ana's feedback 2021-03-24 09:22:40 +01:00
Michael Hunhoff
21359da766 updating test for strings with special characters 2021-03-23 16:02:47 -06:00
Michael Hunhoff
e51c79c241 adding lint for incorrect rule string format, refined rendering for strings 2021-03-23 15:55:48 -06:00
Capa Bot
195bae903f Sync capa rules submodule 2021-03-23 12:25:20 +00:00
Moritz Raabe
5aff21a9a1 add PR template 2021-03-23 10:52:01 +01:00
Ana María Martínez Gómez
6f289d1b8e Merge pull request #476 from Ana06/tag-workflow 2021-03-23 09:54:59 +01:00
Moritz
71b21aec59 Merge pull request #492 from fireeye/ignore-gitfiles
rule loading: ignore files starting with .git
2021-03-23 08:16:29 +01:00
Capa Bot
42a87d4eaa Sync capa-testfiles submodule 2021-03-23 07:14:58 +00:00
Capa Bot
51d125642f Sync capa rules submodule 2021-03-23 07:14:21 +00:00
mike-hunhoff
ddebf2e1cb Merge pull request #493 from fireeye/enhance/472
rule generator: support subscope rules
2021-03-22 17:28:43 -06:00
Michael Hunhoff
7f3e8f1fb1 adding support to match subscope rules and auto insert child statements when creating a new basic block subscope 2021-03-22 17:12:13 -06:00
Ana María Martínez Gómez
ab7dbcd2e4 Merge pull request #491 from fireeye/williballenthin-patch-3 2021-03-22 19:16:49 +01:00
Ana Maria Martinez Gomez
7e5cbddf5d doc: document release process
Add a release checklist.

Closes https://github.com/fireeye/capa/issues/184
2021-03-22 19:14:02 +01:00
Moritz Raabe
44f517c20d rule loading: ignore files starting with .git 2021-03-22 18:11:29 +01:00
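A minimal sketch of what ignoring such files during rule loading could look like. The function name `iter_rule_paths` is hypothetical and this is not capa's actual loader; pruning directories whose names start with `.git` is an extra assumption beyond what the commit subject states.

```python
import os


def iter_rule_paths(rule_dir):
    """Yield paths of candidate rule files, skipping names that start with ".git"."""
    for root, dirs, files in os.walk(rule_dir):
        # Assumption: also prune directories such as ".git" or ".github".
        # Editing dirs in place stops os.walk from descending into them.
        dirs[:] = [d for d in dirs if not d.startswith(".git")]
        for name in sorted(files):
            if name.startswith(".git"):
                # e.g. ".gitignore" or ".gitattributes"
                continue
            yield os.path.join(root, name)
```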
Michael Hunhoff
7bf8c6e3a1 merging upstream 2021-03-22 10:33:36 -06:00
Michael Hunhoff
31ea683335 merge upstream 2021-03-22 09:53:07 -06:00
Willi Ballenthin
29d8f1fd27 ci: tests: pin OS version 2021-03-22 09:51:20 -06:00
Willi Ballenthin
a6c472bb2a ci: publish: pin OS version 2021-03-22 09:50:47 -06:00
Willi Ballenthin
b880d419a3 ci: build: pin OS versions 2021-03-22 09:50:04 -06:00
Capa Bot
a2ff87af8a Sync capa rules submodule 2021-03-22 15:45:10 +00:00
Willi Ballenthin
5b9c577380 Merge pull request #489 from fireeye/dependabot/pip/viv-utils-0.6.0
Bump viv-utils from 0.5.0 to 0.6.0
2021-03-22 09:39:52 -06:00
Capa Bot
4775e124db Sync capa rules submodule 2021-03-22 09:02:35 +00:00
Moritz
c243158d7c Merge pull request #486 from fireeye/fix/eol-improvements
EOL improvements
2021-03-22 09:58:29 +01:00
Capa Bot
8afc3f46f6 Sync capa rules submodule 2021-03-22 08:41:21 +00:00
dependabot[bot]
8b5dc54397 Bump viv-utils from 0.5.0 to 0.6.0
Bumps [viv-utils](https://github.com/williballenthin/viv-utils) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/williballenthin/viv-utils/releases)
- [Commits](https://github.com/williballenthin/viv-utils/compare/v0.5.0...v0.6.0)

Signed-off-by: dependabot[bot] <support@github.com>
2021-03-22 06:20:47 +00:00
Capa Bot
1dbb34df9f Sync capa-testfiles submodule 2021-03-21 19:28:58 +00:00
mike-hunhoff
9383f0bc77 Merge pull request #474 from fireeye/explorer/fix-471
explorer: adding support for multi-line tab and SHIFT + Tab
2021-03-19 19:11:14 -06:00
Moritz Raabe
13306b71e0 add file 2021-03-19 09:40:44 +01:00
Moritz Raabe
8719a23de4 dos2unix 2021-03-19 09:40:44 +01:00
Moritz Raabe
7e0b5236af better deal with CRLF/LF issues 2021-03-19 09:40:43 +01:00
Moritz Raabe
c7798b3254 ensure LF end of line 2021-03-19 09:40:43 +01:00
Willi Ballenthin
7d668550f5 Merge pull request #485 from fireeye/ci/ensure-lf-eol
ensure LF end of line
2021-03-18 14:41:13 -06:00
Capa Bot
c945eaf804 Sync capa rules submodule 2021-03-18 20:41:05 +00:00
Moritz Raabe
1bfe0e0874 ensure LF end of line 2021-03-18 20:15:23 +01:00
Capa Bot
153c6a7b01 Sync capa-testfiles submodule 2021-03-18 18:04:33 +00:00
Ana Maria Martinez Gomez
30a83fa382 doc: Fix broken link in README
Introduced in https://github.com/fireeye/capa/pull/478
2021-03-16 16:37:33 +01:00
Willi Ballenthin
c0bcefe0bf Merge pull request #479 from Ana06/viv-utils5
setup: bump viv-utils to 0.5.0
2021-03-16 07:02:43 -06:00
Ana Maria Martinez Gomez
5d16a77891 ci: tag capa-rules on release
Add a GitHub Action to tag capa-rules when releasing capa. The tag name
used is the same as the one in capa.
2021-03-16 12:45:02 +01:00
Ana Maria Martinez Gomez
cd01a01894 setup: bump viv-utils to 0.5.0
In viv-utils `getWorkspace` raises `IncompatibleVivVersion` on Python 3
when `vw.loadWorkspace(viv_file)` raises `UnicodeDecodeError`.

Fixes https://github.com/fireeye/capa/issues/469

As we use the same version in py2 and py3, define the viv-utils
requirement once.
2021-03-16 10:51:50 +01:00
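The behaviour described above is an exception translation. The sketch below illustrates the pattern only and is not the viv-utils implementation: the `IncompatibleVivVersion` class here is a stand-in with an assumed base class, and `load_workspace` with its `loader` parameter is a hypothetical wrapper.

```python
class IncompatibleVivVersion(ValueError):
    """Stand-in for the viv-utils exception of the same name (base class assumed)."""


def load_workspace(loader, viv_file):
    """Call loader(viv_file) and translate a decoding failure into a clearer error."""
    try:
        return loader(viv_file)
    except UnicodeDecodeError as exc:
        # A .viv file written under Python 2 may fail to decode under Python 3.
        raise IncompatibleVivVersion(
            "%s could not be loaded, possibly created with another Python version" % viv_file
        ) from exc
```

Callers can then catch one well-named exception instead of a low-level `UnicodeDecodeError`.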
Willi Ballenthin
df36bb9f35 Merge pull request #478 from Ana06/badges
doc: Improve README badges
2021-03-15 14:42:57 -06:00
Ana María Martínez Gómez
030893e125 Merge pull request #475 from Ana06/incompatible-viv
changelog: document incompatibility of viv files
2021-03-15 17:30:17 +01:00
Ana Maria Martinez Gomez
b2ab8ab54c doc: Improve README badges
- Add a link to the `PyPI - Python Version` badge. Otherwise it opens
the image when clicking on it, which is inconsistent with the other
labels. I arrived too late to point this out in:
https://github.com/fireeye/capa/pull/477
- Add release badge with last release version. This may help users to
realize that a new version has been released.
- Add downloads badge.
- Order labels by color.

Closes https://github.com/fireeye/capa/issues/196
2021-03-15 16:47:15 +01:00
Willi Ballenthin
12eb1b96de Merge pull request #477 from fireeye/mr-tz-patch-1
Update README.md with Python version badge
2021-03-15 08:35:27 -06:00
Moritz
cff7d4bad4 Update README.md 2021-03-15 11:54:11 +01:00
Ana Maria Martinez Gomez
a31c616a21 changelog: document incompatibility of viv files
`.viv` files (generated by vivisect) are not compatible between Python 2
and Python 3. This causes capa to raise a `UnicodeDecodeError`
exception and should be documented better. I'll add this change to the
release notes after the review.

Related to https://github.com/fireeye/capa/issues/469
2021-03-15 10:26:32 +01:00
Michael Hunhoff
3d2b4dcc26 adding support for multi-line tab and SHIFT + Tab 2021-03-11 17:13:43 -07:00
Michael Hunhoff
c7d24ee290 adding support for string features with special characters e.g. '\n' 2021-03-10 13:56:54 -07:00
mike-hunhoff
06c958f081 Merge pull request #465 from fireeye/explorer/fix-463
explorer: improve settings modification
2021-03-10 11:30:23 -07:00
Michael Hunhoff
b8efe585d5 fix 463, improve settings UI 2021-03-09 14:56:44 -07:00
Willi Ballenthin
e7eb2152cc Merge pull request #464 from fireeye/explorer/fix-462
fix 462
2021-03-09 12:13:54 -07:00
Michael Hunhoff
e1a8641399 fixes 462, default to empty string when accessing rule path stored in ida_settings 2021-03-09 12:09:35 -07:00
Capa Bot
cffac62e68 Sync capa rules submodule 2021-03-09 10:00:48 +00:00
Ana María Martínez Gómez
7a8c0572e9 Merge pull request #455 from Ana06/v1-6-0 2021-03-09 10:48:01 +01:00
Ana Maria Martinez Gomez
5596d5f8b2 version: bump to v1.6.0 2021-03-09 10:36:26 +01:00
Ana Maria Martinez Gomez
06fd02cd61 changelog: v1.6.0
This release adds the capa explorer rule generator plugin for IDA Pro,
vivisect support for Python 3 and 12 new rules. We appreciate everyone
who opened issues, provided feedback, and contributed code and rules.
Thank you also to the vivisect development team (rakuy0, atlas0fd00m)
for the Python 3 support (v1.0.0) and the fixes for Python 2
(v0.2.1). This is the last capa release which supports Python 2. The
next release will be Python 3 only.
2021-03-09 10:36:26 +01:00
Capa Bot
6b9d1047cf Sync capa rules submodule 2021-03-08 19:39:47 +00:00
Ana Maria Martinez Gomez
a7b3fd72ca changelog: v1.5.1 2021-03-08 20:09:31 +01:00
Ana María Martínez Gómez
dd3deb2358 Merge pull request #454 from fireeye/mr-tz-patch-1
setup: bump viv to 0.2.1
2021-03-08 11:36:18 +01:00
Moritz
c99fce3183 setup: bump viv to 0.2.1 2021-03-08 09:07:04 +01:00
Willi Ballenthin
3e55581bf7 Merge pull request #450 from fireeye/feature-refactor-args
refactor common cli argument handling
2021-03-05 15:07:50 -07:00
Willi Ballenthin
dfbe1418d4 Merge pull request #452 from fireeye/feature-py3-pyinstaller
pyinstaller: update for py3/pyinstaller 4.2
2021-03-05 15:06:47 -07:00
William Ballenthin
7671fca373 pep8 2021-03-05 13:27:16 -07:00
William Ballenthin
c01dde3fb2 ci: disable test building of pyinstaller upon push 2021-03-05 13:26:15 -07:00
William Ballenthin
bb17adeda2 pyinstaller: smda: collect capstone shared library 2021-03-05 13:23:15 -07:00
Willi Ballenthin
9f743f1c59 main: fix reference error 2021-03-05 13:19:54 -07:00
William Ballenthin
ee85c929da pyinstaller: install capstone for smda 2021-03-05 12:59:21 -07:00
William Ballenthin
6f9c660082 ci: test pyinstaller CI 2021-03-05 12:55:19 -07:00
William Ballenthin
e02bb7f5a1 pep8 2021-03-05 12:53:50 -07:00
William Ballenthin
9aaaa044da ci: use py3.9 and pyinstaller 4.2 to build standalone binaries 2021-03-05 12:52:38 -07:00
William Ballenthin
54da8444df pyinstaller: update for py3/pyinstaller 4.2
closes #451
2021-03-05 12:40:21 -07:00
William Ballenthin
063e1229bc pep8 2021-03-05 11:10:12 -07:00
William Ballenthin
eacd70329a merge from master, sorry 2021-03-05 11:06:40 -07:00
William Ballenthin
3a1d5d068c scripts: use common argument handler
closes #449
2021-03-05 10:58:40 -07:00
William Ballenthin
f2749d884f main: factor out common cli argument handling
ref #449
2021-03-05 10:57:39 -07:00
William Ballenthin
bdea61f93b scripts: remove old migration script 2021-03-05 10:57:14 -07:00
Ana María Martínez Gómez
829274cd5e Merge pull request #421 from Ana06/viv-py3 2021-03-03 21:40:08 +01:00
Ana Maria Martinez Gomez
c522f5094a Use -j option in test_backend_option
Use the `-j` option in `test_backend_option` to check the extractor and that
rules have been extracted. This way we don't need to check whether a specific
rule matches, only that at least one rule matches.
2021-03-03 18:33:20 +01:00
Ana Maria Martinez Gomez
29b6772721 Test backend option
As `get_extractor` returns only vivisect now, `test_main` is not run for
smda. Test that capa works with all backends. It doesn't test that the
backend is actually called.
2021-03-03 17:36:51 +01:00
Ana Maria Martinez Gomez
695b5b50ab Remove va not None check
Instead of checking if `va` is `None` in `get_section()`, we should avoid
calling this function with `None`. This has been fixed in the following
PR, so the check is no longer needed:
https://github.com/fireeye/capa/pull/442
2021-03-03 17:36:51 +01:00
Ana Maria Martinez Gomez
42af7b2d8b Use default backend instead of None
Set the `backend` variable to the default backend by default instead of
`None`. The `backend` variable is needed in Python 2 as `args.backend`
is only set in Python 3. The value of the backend variable is ignored in
Python 2, however, so the default value is not used there.

Co-authored-by: William Ballenthin <william.ballenthin@fireeye.com>
2021-03-03 17:36:51 +01:00
Ana Maria Martinez Gomez
079a9b5204 Remove backend option from Python 2
Only provide the backend option in Python 3, as there is only one
backend in Python 2. This way we keep the help text simpler.
2021-03-03 17:36:51 +01:00
Ana Maria Martinez Gomez
e5048fd3ac Add missing va parameter to SegmentationViolation
The `envi.SegmentationViolation()` was missing the `va` required
parameter. This has started failing now, because calling
`vw.getSegment(0x4BA190)` for the `tests/data/mimikatz.exe_` produces
different results in Python 2 and Python 3. It returns `None` in Python
3 while the output in Python 2 is:
`(4939776, 16840, '.data', 'mimikatz')`

I have reported the issue to vivisect:
https://github.com/vivisect/vivisect/issues/370
2021-03-03 17:36:51 +01:00
Ana Maria Martinez Gomez
18eaea95fa Fix TypeError exception in Python3
`va` can be None and this causes Python 3 to raise a TypeError
exception. This is caused by the following breaking change in Python3:
> The ordering comparison operators (<, <=, >=, >) raise a TypeError
> exception when the operands don’t have a meaningful natural ordering.

This didn't fail in the previously tried vivisect version (master from
one week ago and not the release). This may have been caused by a bug in
vivisect that has been fixed.
2021-03-03 17:36:51 +01:00
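The Python 3 behaviour quoted in this commit can be reproduced with a few lines. This is a minimal sketch: the helper names `is_in_range` and `compare_none_raises` are hypothetical and not taken from capa. It shows the failing comparison and one way to guard against a `None` address before comparing.

```python
def is_in_range(va, start, size):
    """Return True if va lies in [start, start + size); a None va is never in range."""
    if va is None:
        # On Python 3, an ordering comparison with None raises TypeError,
        # so return before comparing.
        return False
    return start <= va < start + size


def compare_none_raises():
    """Return True if ordering None against an int raises TypeError (Python 3)."""
    try:
        _ = None < 5
    except TypeError:
        return True
    return False
```

On Python 2 the same comparison silently succeeds, which is why the problem only surfaced after the move to Python 3.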
Ana Maria Martinez Gomez
a4a0a56448 Vivisect 1.0.0 released
Vivisect 1.0.0 (Python 3) has been released, so we do not need to link
to my GitHub branch anymore.

https://pypi.org/project/vivisect
2021-03-03 17:36:50 +01:00
Ana Maria Martinez Gomez
40ed2f39a4 Make backend a required parameter in get_extractor
Make the `backend` argument required in the `get_extractor` internal
routine. Specify a backend in the scripts which call this function. Add
a CLI backend option in capa/features/freeze.py as well.
2021-03-03 17:36:50 +01:00
Ana Maria Martinez Gomez
2859b037aa Use constants for backend option
Use constants instead of string literals for the backend option.
2021-03-03 17:36:50 +01:00
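A minimal sketch of a backend option built on constants rather than string literals. The constant names, their values, the `-b` short flag, and `build_parser` are assumptions for illustration; the commit message does not show them. The default follows the log, which states that vivisect is the default backend.

```python
import argparse

# Assumed names and values; defining them once avoids scattered string literals.
BACKEND_VIV = "vivisect"
BACKEND_SMDA = "smda"


def build_parser():
    """Build a parser with a backend option restricted to the known backends."""
    parser = argparse.ArgumentParser(description="backend selection sketch")
    parser.add_argument(
        "-b",
        "--backend",
        choices=(BACKEND_VIV, BACKEND_SMDA),
        default=BACKEND_VIV,
        help="select the backend to use",
    )
    return parser
```

With `choices`, argparse rejects an unknown backend name at parse time, so later code only has to compare against the two constants.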
Ana Maria Martinez Gomez
bbb7878e0a Enable tests for vivisect in Python3
Now we support vivisect as backend in Python3. We should test it.
2021-03-03 17:36:50 +01:00
Ana Maria Martinez Gomez
fc438866ec Add option to select the backend in Py3
Now we have two working backends in Python3! Add an option to select
which one to use. With this code, vivisect is the default backend, but
this is really easy to change. We could do some analysis to see if smda
performs better than vivisect once the vivisect implementation is complete.
2021-03-03 17:36:50 +01:00
Ana Maria Martinez Gomez
2da2f498a2 Add script to compare vivisect Python 2 vs 3
Compare the performance of vivisect Python 2 vs 3 by counting the number
of features of each type extracted for every binary in `tests/data`.
Render the ones that perform badly (below a threshold of 98) and the total
performance. Also render the running time per binary for both Python 2 and 3.

From this result, it seems that vivisect behaves properly with Python3.
2021-03-03 17:36:50 +01:00
Ana Maria Martinez Gomez
29dffffe1b Python3 support for vivisect
Vivisect has moved to Python3. Allow running vivisect with Python3 in
capa.

I am using the following version of vivisect (which includes fixes for
some bugs I have found and some open PRs in vivisect):
https://github.com/Ana06/vivisect/tree/py-3
2021-03-03 17:36:49 +01:00
Capa Bot
1ecaad5413 Sync capa rules submodule 2021-03-02 15:06:24 +00:00
Willi Ballenthin
cd56d672c0 Merge pull request #442 from fireeye/williballenthin-patch-2
viv: ignore empty branch targets
2021-03-01 08:43:26 -07:00
Willi Ballenthin
68aed3c190 insn: better document when branch va may be none 2021-02-28 23:03:08 -07:00
Willi Ballenthin
68fcc03d5c viv: ignore empty branch targets
but what does this really mean? why would `getBranches` return `None`?

closes #441
2021-02-25 13:34:59 -07:00
Capa Bot
939b29bf60 Sync capa rules submodule 2021-02-24 23:00:34 +00:00
Capa Bot
2f6a6e4628 Sync capa rules submodule 2021-02-24 08:07:52 +00:00
Capa Bot
7938ea34d0 Sync capa rules submodule 2021-02-24 08:06:30 +00:00
Capa Bot
ed94e36f7a Sync capa rules submodule 2021-02-24 00:12:19 +00:00
mike-hunhoff
1c3a8df136 Merge pull request #439 from fireeye/explorer/rulegen-support-file-scope
adding file scope support to rule generator IDA plugin
2021-02-23 11:50:54 -07:00
Michael Hunhoff
9f254b22ee adding file scope support to rule generator IDA plugin 2021-02-23 11:10:34 -07:00
Capa Bot
753f8ce84e Sync capa rules submodule 2021-02-23 17:33:38 +00:00
Capa Bot
acf3b549de Sync capa rules submodule 2021-02-23 15:29:20 +00:00
Capa Bot
669f6dcf98 Sync capa rules submodule 2021-02-23 15:23:19 +00:00
Capa Bot
e4f7c4aab1 Sync capa rules submodule 2021-02-23 15:22:43 +00:00
Moritz
5836d55e21 Merge pull request #438 from fireeye/explorer/show-results-by-function
explorer: adding option to show results by function
2021-02-22 18:23:44 +01:00
Michael Hunhoff
e17bf1a1f4 explorer: adding option to show results by function 2021-02-22 08:16:18 -07:00
Willi Ballenthin
acb253ae9c Merge pull request #437 from fireeye/scripts/show-capabilities
update to support running in IDA w/ Python 3
2021-02-19 17:02:53 -07:00
Michael Hunhoff
cc0aaa301f update to support running in IDA w/ Python 3 2021-02-19 14:28:20 -07:00
mike-hunhoff
4256316045 Merge pull request #436 from fireeye/fix/ida/unmapped-data-ref
check for unmapped addresses when resolving data references
2021-02-19 12:58:16 -07:00
Capa Bot
78ab0c9400 Sync capa-testfiles submodule 2021-02-19 19:39:18 +00:00
Capa Bot
944a670af0 Sync capa rules submodule 2021-02-19 17:17:33 +00:00
Michael Hunhoff
e4e517b334 checked for unmapped address when resolving data references 2021-02-19 10:07:23 -07:00
Capa Bot
ccd7f1ee4b Sync capa-testfiles submodule 2021-02-19 09:54:02 +00:00
Capa Bot
9db7ed88aa Sync capa rules submodule 2021-02-18 21:36:08 +00:00
Capa Bot
a5e7497f56 Sync capa-testfiles submodule 2021-02-18 21:35:02 +00:00
Capa Bot
754f302493 Sync capa rules submodule 2021-02-18 17:56:06 +00:00
Moritz
7783543153 Merge pull request #429 from fireeye/scripts/multiple-backends-show-features
mirror show-capabilities-by-function to enable multiple backends
2021-02-18 09:33:36 +01:00
Moritz
b02f92b3ea Merge pull request #428 from fireeye/linter/ntoskrnl-ntdll-overlap
linter: adding ntoskrnl, ntdll overlap lint
2021-02-18 09:23:02 +01:00
Michael Hunhoff
47b3ef29be removing viv dep from show-capabilities-by-function.py 2021-02-17 14:49:52 -07:00
Michael Hunhoff
1eb615f97c mirror show-capabilities-by-function to enable multiple backends 2021-02-17 14:40:33 -07:00
mike-hunhoff
cfa904a0a0 Merge pull request #426 from fireeye/explorer/rule-generator
initial commit of capa explorer rule generator plugin for IDA Pro
2021-02-17 13:44:54 -07:00
Michael Hunhoff
2d34458d10 linter: adding ntoskrnl, ntdll overlap lint 2021-02-17 13:29:36 -07:00
Capa Bot
e39713c4fd Sync capa rules submodule 2021-02-17 17:10:12 +00:00
Capa Bot
320b734da8 Sync capa rules submodule 2021-02-17 17:00:43 +00:00
Capa Bot
887848625c Sync capa-testfiles submodule 2021-02-17 16:52:43 +00:00
Capa Bot
685f06582d Sync capa rules submodule 2021-02-17 15:18:16 +00:00
Capa Bot
a3c21dba32 Sync capa rules submodule 2021-02-17 14:59:46 +00:00
Capa Bot
9744cde8aa Sync capa rules submodule 2021-02-17 07:27:24 +00:00
Capa Bot
0ba8c9ec00 Sync capa-testfiles submodule 2021-02-16 23:44:50 +00:00
Capa Bot
0764c603b4 Sync capa-testfiles submodule 2021-02-16 23:32:23 +00:00
mike-hunhoff
2d4f7a6946 Update README.md 2021-02-12 14:38:11 -07:00
mike-hunhoff
5346eec84d Update README.md 2021-02-12 14:35:34 -07:00
Michael Hunhoff
b704dd967b updating README related to capa explorer 2021-02-12 14:32:08 -07:00
Michael Hunhoff
84ace24b35 merging upstream 2021-02-12 14:19:23 -07:00
Michael Hunhoff
ea42f76cff updating README related to capa explorer 2021-02-12 14:18:30 -07:00
Michael Hunhoff
dd147dd040 format fixes, strip strings before display 2021-02-12 12:03:48 -07:00
Capa Bot
9a79136d15 Sync capa-testfiles submodule 2021-02-11 15:19:46 +00:00
Capa Bot
b722dd016a Sync capa rules submodule 2021-02-11 07:39:06 +00:00
Capa Bot
054853dc06 Sync capa-testfiles submodule 2021-02-11 07:36:27 +00:00
Capa Bot
e5ceef52c6 Sync capa rules submodule 2021-02-10 16:11:34 +00:00
Capa Bot
92747e8efc Sync capa-testfiles submodule 2021-02-10 14:11:34 +00:00
Capa Bot
6171de54f9 Sync capa-testfiles submodule 2021-02-10 14:05:17 +00:00
Capa Bot
287ef31081 Sync capa rules submodule 2021-02-10 13:44:47 +00:00
Willi Ballenthin
8121f291c3 version: bump to v1.5.1 2021-02-09 09:20:03 -07:00
Moritz
b721b5fcff Merge pull request #420 from fireeye/williballenthin-patch-2
setup: pin viv-utils version
2021-02-09 16:49:11 +01:00
Willi Ballenthin
521dfe0337 setup: bump viv-utils to 0.3.19 2021-02-09 08:18:17 -07:00
Capa Bot
7dc78b7837 Sync capa rules submodule 2021-02-09 15:17:09 +00:00
Michael Hunhoff
1a804ed97b merge upstream 2021-02-09 07:55:53 -07:00
Capa Bot
6636b9d56c Sync capa-testfiles submodule 2021-02-09 12:56:48 +00:00
Capa Bot
325c6cc805 Sync capa rules submodule 2021-02-09 09:58:41 +00:00
Capa Bot
6a6e205973 Sync capa-testfiles submodule 2021-02-08 19:07:40 +00:00
Capa Bot
46ec25d286 Sync capa rules submodule 2021-02-08 17:49:32 +00:00
Capa Bot
6e33a22676 Sync capa rules submodule 2021-02-08 17:48:52 +00:00
Capa Bot
6e81de9e44 Sync capa rules submodule 2021-02-08 17:45:01 +00:00
Willi Ballenthin
03f7bbc3a5 setup: pin viv-utils version 2021-02-08 10:30:31 -07:00
Willi Ballenthin
4354bc9108 Merge pull request #415 from fireeye/williballenthin-patch-2
v1.5.0
2021-02-08 09:55:43 -07:00
Willi Ballenthin
b8fcc2ff0c Merge pull request #417 from fireeye/smda/calls-from-no-api
remove apirefs from calls from
2021-02-08 09:54:04 -07:00
Moritz Raabe
55b7ae10a7 remove apirefs from calls from
closes #416
2021-02-08 11:56:01 +01:00
Willi Ballenthin
6d2a6c98d1 changelog: v1.5.0 2021-02-05 10:59:30 -07:00
Capa Bot
05998b5d05 Sync capa-testfiles submodule 2021-02-04 08:19:32 +00:00
Capa Bot
1063f3fcda Sync capa rules submodule 2021-02-03 18:13:29 +00:00
Capa Bot
93c5e4637b Sync capa rules submodule 2021-02-03 15:15:51 +00:00
Moritz
073c2b5754 Merge pull request #412 from fireeye/ida/meta-add-baseaddr
add imagebase to IDA meta data
2021-02-02 16:48:22 +01:00
mike-hunhoff
ef41d74b82 Merge pull request #411 from fireeye/fix/410
fixes #410
2021-02-02 08:38:23 -07:00
Moritz Raabe
84b3f38810 add imagebase to IDA meta data 2021-02-02 13:54:46 +01:00
mike-hunhoff
2288f38a11 Update capa/main.py
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
2021-02-01 12:45:36 -07:00
mike-hunhoff
dbc4e06657 Update capa/main.py
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
2021-02-01 12:45:29 -07:00
Michael Hunhoff
2433777a76 fixes #410 2021-02-01 11:43:24 -07:00
Moritz
bb7001f5f2 Merge pull request #409 from fireeye/fix/extract-bytes
improve bytes feature extraction
2021-02-01 17:38:40 +01:00
Moritz Raabe
9b5aaa40de improve bytes feature extraction 2021-02-01 17:17:22 +01:00
Capa Bot
96d74f48f4 Sync capa rules submodule 2021-02-01 11:55:33 +00:00
Michael Hunhoff
c8a99c247c rulegen python2.x support 2021-01-29 12:45:04 -07:00
Michael Hunhoff
9f50a37e40 rulegen filtering basic blocks, adding support for double-click to add feature 2021-01-29 11:47:58 -07:00
Michael Hunhoff
54c9e39654 rulegen reorder context menu actions 2021-01-29 11:11:41 -07:00
Michael Hunhoff
3386a1e9f9 rulegen adding vert and hort splitters, moving save button to right 2021-01-29 10:51:26 -07:00
Michael Hunhoff
b413f2eafe rulegen adding support for sync between editor and preview windows 2021-01-28 17:15:18 -07:00
Capa Bot
f07af25a6a Sync capa rules submodule 2021-01-28 16:52:21 +00:00
Willi Ballenthin
14e65c4601 Merge pull request #401 from fireeye/linter-format
Lint rule formatting and improved rule dump
2021-01-28 09:18:20 -07:00
Capa Bot
b5c2fb0259 Sync capa rules submodule 2021-01-28 16:06:09 +00:00
Capa Bot
92d98db7bb Sync capa-testfiles submodule 2021-01-28 15:25:17 +00:00
Michael Hunhoff
9caafedb8d merging upstream 2021-01-28 08:14:16 -07:00
Moritz
e6f7ef604a Merge pull request #404 from fireeye/bugfix/403
fixing #403
2021-01-28 11:17:39 +01:00
Moritz Raabe
0eb8d3e47c fix time debug output 2021-01-28 11:09:25 +01:00
Moritz Raabe
072e30498b adjust negative hex numbers in to_yaml 2021-01-28 10:54:17 +01:00
Moritz Raabe
d6e73577af dont change quotes when dumping 2021-01-28 10:54:17 +01:00
Moritz Raabe
a81f98be8e manual adjust negative numbers 2021-01-28 10:54:17 +01:00
Moritz Raabe
0980e35c29 simplify string comparison 2021-01-28 10:54:17 +01:00
Moritz Raabe
336c2a3aff add option to only check reformat status 2021-01-28 10:54:17 +01:00
Moritz Raabe
e3055bc740 check rule format consistency 2021-01-28 10:54:17 +01:00
Capa Bot
9406e3dbfb Sync capa rules submodule 2021-01-28 09:52:43 +00:00
Moritz
5307b7e1b1 Merge pull request #408 from fireeye/fix/lint-lib-path
adjust expected lib path and log time
2021-01-28 10:28:30 +01:00
Moritz Raabe
f18a8f5b31 adjust expected lib path and log time 2021-01-28 10:18:03 +01:00
Moritz
cfe99c4b72 Merge pull request #407 from fireeye/fix/lint-logging
disable extractor progress
2021-01-28 09:25:07 +01:00
Moritz Raabe
0d439c0f55 disable extractor progress 2021-01-28 09:22:15 +01:00
Moritz
6288a96a8b Merge pull request #406 from fireeye/ci/disable-python36
Disable Python 3.6 tests
2021-01-28 08:35:42 +01:00
Moritz
819b6f6ccf Merge pull request #402 from fireeye/lib-rules-subscoped
potential fix for #398
2021-01-28 08:35:28 +01:00
Moritz Raabe
4bc06aa8cd closes #405 2021-01-28 08:23:15 +01:00
Moritz Raabe
7b64425c24 update doc and test case 2021-01-28 08:18:23 +01:00
Michael Hunhoff
44c9d6a22b fixing #403 2021-01-27 18:29:53 -07:00
Moritz Raabe
c750447d62 potential fix for #398 2021-01-27 17:59:56 +01:00
Michael Hunhoff
b1c99d82fd rulegen adding special handling for count description 2021-01-22 09:41:17 -07:00
Michael Hunhoff
10db79f636 rulegen changes for backwards compat w/ Python 2.x 2021-01-22 08:22:37 -07:00
Willi Ballenthin
059ec8f3f2 Merge pull request #400 from fireeye/ci/enable-py39-2
bump smda, enable Python 3.9
2021-01-22 07:18:54 -07:00
Moritz Raabe
2c5508febd bump smda, enable Python 3.9 2021-01-22 10:00:25 +01:00
Capa Bot
905fff041b Sync capa rules submodule 2021-01-21 21:32:42 +00:00
Michael Hunhoff
cd27a64f4e rulegen clear ruleset cache when user configures new directory 2021-01-21 14:15:52 -07:00
Michael Hunhoff
d1b7a5c2e4 rulegen fixing bug in handling of subscope-rules 2021-01-21 14:05:24 -07:00
Michael Hunhoff
4b81b086db rulegen removing unneeded file 2021-01-21 10:19:37 -07:00
Michael Hunhoff
0db42c28a7 rulegen adding support to use cached ruleset, user click reset to reload rules from disk 2021-01-21 10:09:43 -07:00
Michael Hunhoff
0eca6ce2e3 rulegen adding save button, reducing menu complexity 2021-01-21 09:29:10 -07:00
Michael Hunhoff
34685bf80e rulegen adding header comment to generated rules 2021-01-20 15:22:56 -07:00
Michael Hunhoff
271dc2a6a9 rulegen add ability to configure default values for rule author and scope 2021-01-20 15:12:44 -07:00
Michael Hunhoff
bf0376f73f rulegen adding auto check if new rule matches current function 2021-01-20 14:31:48 -07:00
Michael Hunhoff
cf8656eb2d adding search bar for feature tree in rule generator 2021-01-19 12:03:15 -07:00
Willi Ballenthin
20ce29b033 Merge pull request #396 from fireeye/dependabot/pip/smda-1.5.11
Bump smda from 1.5.10 to 1.5.11
2021-01-19 08:21:00 -07:00
Capa Bot
4bd93a680e Sync capa-testfiles submodule 2021-01-18 08:02:29 +00:00
dependabot[bot]
c9bf7f424d Bump smda from 1.5.10 to 1.5.11
Bumps [smda](https://github.com/danielplohmann/smda) from 1.5.10 to 1.5.11.
- [Release notes](https://github.com/danielplohmann/smda/releases)
- [Commits](https://github.com/danielplohmann/smda/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2021-01-18 06:44:33 +00:00
Capa Bot
4cde2e1a78 Sync capa rules submodule 2021-01-16 15:39:09 +00:00
Michael Hunhoff
15625b5f8c capa explorer rulegen -> adding styling; adding support for descriptions 2021-01-15 12:52:52 -07:00
Michael Hunhoff
e5f9da1f2b adding submenus to rulegen editor; empty expressions auto pruned from rulegen editor 2021-01-14 16:22:56 -07:00
Michael Hunhoff
ab33c46c87 init commit capa explorer rulegen 2021-01-14 15:46:24 -07:00
Capa Bot
48c045d381 Sync capa rules submodule 2021-01-12 18:30:44 +00:00
Capa Bot
2b385ead7f Sync capa rules submodule 2021-01-12 18:30:11 +00:00
Capa Bot
0fcc9f3df6 Sync capa-testfiles submodule 2021-01-12 18:27:32 +00:00
Capa Bot
b251202804 Sync capa-testfiles submodule 2021-01-12 18:27:11 +00:00
Capa Bot
6967010281 Sync capa-testfiles submodule 2021-01-12 18:26:12 +00:00
Capa Bot
7e0846e66a Sync capa rules submodule 2021-01-12 17:55:13 +00:00
Moritz
4e3daad96d Merge pull request #391 from fireeye/fix/freeze-base-addr
add base address to freeze
2021-01-11 11:30:29 +01:00
Capa Bot
37fb3da5db Sync capa rules submodule 2021-01-08 16:36:36 +00:00
Capa Bot
762f48957c Sync capa rules submodule 2021-01-08 15:16:32 +00:00
Capa Bot
c1af7b8783 Sync capa-testfiles submodule 2021-01-08 15:14:26 +00:00
Moritz Raabe
f89084677d add base address to freeze 2021-01-08 14:48:26 +01:00
Capa Bot
0716084bbb Sync capa-testfiles submodule 2021-01-08 08:46:53 +00:00
Capa Bot
a6c946e6c9 Sync capa rules submodule 2021-01-07 13:59:20 +00:00
Capa Bot
3f6e088faa Sync capa-testfiles submodule 2021-01-07 11:53:24 +00:00
Capa Bot
9abdd5813b Sync capa rules submodule 2021-01-07 07:47:28 +00:00
Capa Bot
f33ea36e6f Sync capa rules submodule 2021-01-05 15:49:04 +00:00
Moritz
8788e0a9c9 Merge pull request #388 from fireeye/ci/linter-update
lint with tags
2021-01-05 16:37:21 +01:00
Moritz Raabe
b1c1cb4b9b lint with --tag 2021-01-05 16:16:35 +01:00
Capa Bot
982d4ac472 Sync capa-testfiles submodule 2021-01-04 14:42:43 +00:00
Capa Bot
b7a8d667b9 Sync capa rules submodule 2021-01-04 12:51:43 +00:00
Capa Bot
8f8729df05 Sync capa-testfiles submodule 2020-12-30 19:06:28 +00:00
Capa Bot
e928d281dd Sync capa-testfiles submodule 2020-12-30 15:21:36 +00:00
Capa Bot
625583f5ab Sync capa rules submodule 2020-12-23 12:44:25 +00:00
Capa Bot
ab54553dd2 Sync capa rules submodule 2020-12-22 17:16:54 +00:00
Moritz
47bf7b1325 Merge pull request #375 from doomedraven/return_dict
add render to dict, is the same as default but just in dictionary so …
2020-12-22 15:52:50 +01:00
Moritz
145d75f579 Merge pull request #381 from fireeye/fix/viv-set-logger-levels
set level of more viv loggers explicitly
2020-12-22 15:52:05 +01:00
Capa Bot
01d976d7f7 Sync capa rules submodule 2020-12-22 13:17:37 +00:00
Capa Bot
095e3720ab Sync capa-testfiles submodule 2020-12-22 12:00:35 +00:00
Capa Bot
d62a37fe1f Sync capa-testfiles submodule 2020-12-21 16:17:33 +00:00
Capa Bot
5323f2fc31 Sync capa rules submodule 2020-12-17 17:14:43 +00:00
Capa Bot
5539cb0d08 Sync capa rules submodule 2020-12-17 17:12:21 +00:00
Capa Bot
76e80106d6 Sync capa-testfiles submodule 2020-12-17 09:29:56 +00:00
Capa Bot
9ab7b9a033 Sync capa rules submodule 2020-12-16 20:47:34 +00:00
Capa Bot
fe97d6a349 Sync capa-testfiles submodule 2020-12-15 19:23:15 +00:00
Capa Bot
2242c2afe8 Sync capa-testfiles submodule 2020-12-15 19:19:09 +00:00
Willi Ballenthin
ec25fb5c36 Merge pull request #384 from fireeye/dependabot/pip/smda-1.5.10
Bump smda from 1.5.9 to 1.5.10
2020-12-14 10:32:31 -07:00
dependabot[bot]
ce25f5cadd Bump smda from 1.5.9 to 1.5.10
Bumps [smda](https://github.com/danielplohmann/smda) from 1.5.9 to 1.5.10.
- [Release notes](https://github.com/danielplohmann/smda/releases)
- [Commits](https://github.com/danielplohmann/smda/commits)

Signed-off-by: dependabot[bot] <support@github.com>
2020-12-14 07:15:58 +00:00
Capa Bot
1099f40f19 Sync capa rules submodule 2020-12-12 05:43:31 +00:00
Capa Bot
70368b3f1e Sync capa rules submodule 2020-12-11 10:42:16 +00:00
Capa Bot
0181ebad45 Sync capa-testfiles submodule 2020-12-10 17:38:00 +00:00
DoomedRaven
e158e3f13c remove type hint to make CI happy 2020-12-08 21:46:39 +01:00
DoomedRaven
b1bbded23c black -l 120 . 2020-12-08 21:39:50 +01:00
DoomedRaven
b77d9d3738 isort --profile black --length-sort --line-width 120 capa_as_library.py 2020-12-08 21:34:42 +01:00
DoomedRaven
d0b2421752 isort capa_as_library.py 2020-12-08 20:53:26 +01:00
DoomedRaven
96b65a7c60 add example how to render it as library
```
>>> from capa_as_library import capa_details
>>> details = capa_details("/opt/CAPEv2/storage/analyses/83/binary", "dictionary")
>>> from pprint import pprint as pp
>>> pp(details)
{'ATTCK': {'DEFENSE EVASION': ['Obfuscated Files or Information [T1027]',
                               'Virtualization/Sandbox Evasion::System Checks '
                               '[T1497.001]'],
           'EXECUTION': ['Shared Modules [T1129]']},
 'CAPABILITY': {'anti-analysis/anti-vm/vm-detection': ['execute anti-VM '
                                                       'instructions (3 '
                                                       'matches)'],
                'anti-analysis/obfuscation/string/stackstring': ['contain '
                                                                 'obfuscated '
                                                                 'stackstrings'],
                'data-manipulation/encryption/rc4': ['encrypt data using RC4 '
                                                     'PRGA'],
                'executable/pe/section/rsrc': ['contain a resource (.rsrc) '
                                               'section'],
                'host-interaction/cli': ['accept command line arguments'],
                'host-interaction/environment-variable': ['query environment '
                                                          'variable'],
                'host-interaction/file-system/read': ['read .ini file',
                                                      'read file'],
                'host-interaction/file-system/write': ['write file (3 '
                                                       'matches)'],
                'host-interaction/process': ['get thread local storage value '
                                             '(3 matches)',
                                             'set thread local storage value '
                                             '(2 matches)'],
                'host-interaction/process/terminate': ['terminate process (3 '
                                                       'matches)'],
                'host-interaction/thread/terminate': ['terminate thread'],
                'linking/runtime-linking': ['link function at runtime (7 '
                                            'matches)',
                                            'link many functions at runtime'],
                'load-code/pe': ['parse PE header (3 matches)']},
 'MBC': {'ANTI-BEHAVIORAL ANALYSIS': ['Virtual Machine Detection::Instruction '
                                      'Testing [B0009.029]'],
         'ANTI-STATIC ANALYSIS': ['Disassembler Evasion::Argument Obfuscation '
                                  '[B0012.001]'],
         'CRYPTOGRAPHY': ['Encrypt Data::RC4 [C0027.009]',
                          'Generate Pseudo-random Sequence::RC4 PRGA '
                          '[C0021.004]']},
 'md5': 'ad56c384476a81faef9aebd60b2f4623',
 'path': '/opt/CAPEv2/storage/analyses/83/binary',
 'sha1': 'aa027d89f5d3f991ad3e14ffb681616a77621836',
 'sha256': '16995e059eb47de0b58a95ce2c3d863d964a7a16064d4298cee9db1de266e68d'}
>>>
```
2020-12-08 20:00:24 +01:00
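The dictionary shown in commit 96b65a7c60 is convenient for integrations because it can be post-processed with plain Python. The following is a minimal sketch, assuming a result shaped like the output above; `attack_ids` and `capability_names` are hypothetical helpers for illustration and are not part of capa or `capa_as_library.py`.

```python
import re

# Trimmed copy of the dictionary printed in the commit message above.
details = {
    "ATTCK": {
        "DEFENSE EVASION": [
            "Obfuscated Files or Information [T1027]",
            "Virtualization/Sandbox Evasion::System Checks [T1497.001]",
        ],
        "EXECUTION": ["Shared Modules [T1129]"],
    },
    "CAPABILITY": {
        "data-manipulation/encryption/rc4": ["encrypt data using RC4 PRGA"],
        "load-code/pe": ["parse PE header (3 matches)"],
    },
}


def attack_ids(details):
    """Collect the bracketed ATT&CK IDs, sorted (hypothetical helper)."""
    ids = set()
    for techniques in details.get("ATTCK", {}).values():
        for technique in techniques:
            # The ID is the bracketed suffix, e.g. "[T1027]".
            match = re.search(r"\[(T\d+(?:\.\d+)?)\]$", technique)
            if match:
                ids.add(match.group(1))
    return sorted(ids)


def capability_names(details):
    """Flatten CAPABILITY into sorted rule names (hypothetical helper)."""
    names = []
    for rules in details.get("CAPABILITY", {}).values():
        for rule in rules:
            # Drop a trailing " (N matches)" suffix if present.
            names.append(re.sub(r" \(\d+ matches\)$", "", rule))
    return sorted(names)
```

Using `.get(..., {})` keeps the helpers tolerant of results where a top-level key is absent, which is an assumption about how such a dictionary might vary between samples.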
Willi Ballenthin
177c90093e Merge pull request #380 from doomedraven/patch-1
fix is_ordinal IndexError
2020-12-08 09:21:53 -07:00
Moritz Raabe
28ee091107 set level of more viv loggers explicitly 2020-12-08 16:30:23 +01:00
doomedraven
64c71d8e6d fix is_ordinal IndexError
```
 Traceback (most recent call last):
   File "/opt/CAPE/utils/../lib/cuckoo/common/cape_utils.py", line 223, in flare_capa_details
     capabilities, counts = capa.main.find_capabilities(rules, extractor, disable_progress=True)
   File "/usr/local/lib/python2.7/dist-packages/capa/main.py", line 116, in find_capabilities
     function_matches, bb_matches, feature_count = find_function_capabilities(ruleset, extractor, f)
   File "/usr/local/lib/python2.7/dist-packages/capa/main.py", line 68, in find_function_capabilities
     for feature, va in extractor.extract_insn_features(f, bb, insn):
   File "/usr/local/lib/python2.7/dist-packages/capa/features/extractors/viv/__init__.py", line 84, in extract_insn_features
     for feature, va in capa.features.extractors.viv.insn.extract_features(f, bb, insn):
   File "/usr/local/lib/python2.7/dist-packages/capa/features/extractors/viv/insn.py", line 599, in extract_features
     for feature, va in insn_handler(f, bb, insn):
   File "/usr/local/lib/python2.7/dist-packages/capa/features/extractors/viv/insn.py", line 93, in extract_insn_api_features
     for name in capa.features.extractors.helpers.generate_symbols(dll, symbol):
   File "/usr/local/lib/python2.7/dist-packages/capa/features/extractors/helpers.py", line 61, in generate_symbols
     if not is_ordinal(symbol):
   File "/usr/local/lib/python2.7/dist-packages/capa/features/extractors/helpers.py", line 45, in is_ordinal
     return symbol[0] == "#"
 IndexError: string index out of range
```
2020-12-08 09:50:00 +01:00
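The traceback in commit 64c71d8e6d ends in `symbol[0] == "#"`, which raises `IndexError` when `symbol` is an empty string. The commit's actual change is not shown in this view; the following is only a sketch of one way to guard the check, assuming `is_ordinal` receives the symbol name as a string (or possibly `None`).

```python
def is_ordinal(symbol):
    """Return True if `symbol` looks like an ordinal import such as "#1".

    Illustrative guard only; not necessarily the fix applied in the commit.
    """
    # An empty or missing symbol is not an ordinal. Checking truthiness
    # first avoids indexing into an empty string.
    return bool(symbol) and symbol.startswith("#")
```

`str.startswith` is already safe on an empty string; the extra truthiness check is there so that a `None` symbol also returns `False` instead of raising.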
Moritz
9ce0c94e17 Merge pull request #379 from fireeye/fix/nzxor-xor-instructions
add more xor instructions
2020-12-08 09:37:35 +01:00
Moritz Raabe
08c3372635 add more xor instructions 2020-12-08 09:21:50 +01:00
Capa Bot
2fafc70b69 Sync capa-testfiles submodule 2020-12-07 18:06:53 +00:00
Capa Bot
0e62ebe3a2 Sync capa-testfiles submodule 2020-12-07 17:16:01 +00:00
Moritz
1cc4d20b89 Merge pull request #373 from fireeye/ci/setup-dependabot
add dependabot config
2020-12-07 18:03:57 +01:00
Capa Bot
af4889894a Sync capa rules submodule 2020-12-04 08:31:42 +00:00
Moritz
429a5e1ea3 Merge pull request #378 from fireeye/fix/viv-string-extractor
fix: add viv extract strings for i386ImmMemOper operands
2020-12-04 08:55:23 +01:00
Moritz Raabe
4ef860eb07 fix: add viv extract strings for i386ImmMemOper operands 2020-12-03 20:24:29 +01:00
Capa Bot
b59ebf30c6 Sync capa-testfiles submodule 2020-12-03 18:57:45 +00:00
Capa Bot
a1ae8d54a6 Sync capa rules submodule 2020-12-02 15:24:15 +00:00
Capa Bot
8155207bea Sync capa rules submodule 2020-12-02 15:13:30 +00:00
Capa Bot
337d2cfa6d Sync capa rules submodule 2020-12-02 15:12:27 +00:00
Capa Bot
df2229782b Sync capa rules submodule 2020-12-02 15:08:55 +00:00
doomedraven
5920552649 small improvements 2020-12-01 20:31:56 +01:00
doomedraven
b4827fcb00 add render to dict, is the same as default but just in dictionary so simplifies the integrations 2020-12-01 19:43:54 +01:00
Willi Ballenthin
63983ccb65 Merge pull request #372 from doomedraven/patch-1
Simple example how to use capa as library
2020-12-01 06:56:44 -07:00
Willi Ballenthin
eac7e2b749 capa_as_library: style and comments 2020-12-01 06:54:55 -07:00
Moritz Raabe
65a365bca1 update halo requirements py2/3 2020-12-01 11:46:53 +01:00
Moritz Raabe
fecd0e11eb add dependabot config 2020-12-01 11:46:14 +01:00
doomedraven
51ad526cfc Simple example how to use capa as library
Just quick example how to use capa as library, to save time to someone, reading code and scripts
2020-12-01 11:20:49 +01:00
Moritz
10a062017d Merge pull request #370 from fireeye/pin-smda
pin smda
2020-12-01 11:10:23 +01:00
Moritz Raabe
0d351794db pin smda
addresses #369
2020-12-01 11:02:36 +01:00
Capa Bot
067e3ffced Sync capa-testfiles submodule 2020-11-30 19:36:59 +00:00
Capa Bot
50d55fae56 Sync capa-testfiles submodule 2020-11-23 17:55:56 +00:00
Capa Bot
ce63628d3d Sync capa rules submodule 2020-11-19 15:43:59 +00:00
Capa Bot
13df7f90f6 Sync capa rules submodule 2020-11-19 15:09:24 +00:00
Capa Bot
f5099b873d Sync capa rules submodule 2020-11-19 11:40:38 +00:00
Capa Bot
70eb38895d Sync capa-testfiles submodule 2020-11-18 16:28:34 +00:00
Capa Bot
7aea9fa1d2 Sync capa rules submodule 2020-11-16 19:38:02 +00:00
Capa Bot
5d30be31e0 Sync capa rules submodule 2020-11-16 09:44:08 +00:00
Capa Bot
7abe66e3de Sync capa rules submodule 2020-11-16 06:40:23 +00:00
mike-hunhoff
49ef5e5e64 Merge pull request #364 from fireeye/viv/fix-353
improve viv extractor unicode string detection
2020-11-10 17:56:47 -07:00
Michael Hunhoff
c2266bc105 improve viv extractor unicode string detection with supporting unit test 2020-11-10 12:23:07 -07:00
Moritz
a813e219e6 Merge pull request #363 from fireeye/williballenthin-patch-1
ci: disable py3.9 testing
2020-11-09 21:14:36 +01:00
Moritz
1c1fb20546 Merge pull request #355 from danielplohmann/backend-smda
initial commit for backend-smda
2020-11-09 21:13:51 +01:00
Willi Ballenthin
65feb60bb8 ci: disable py3.9 testing 2020-11-09 13:06:37 -07:00
Daniel Plohmann (jupiter)
f7492c7dc7 throw UnsupportedRuntimeError if SmdaFeatureExtractor is used with a Python version < 3.0 2020-11-09 16:20:08 +01:00
Moritz Raabe
dfc805b89b improvements for PR #355 2020-11-09 13:39:19 +01:00
Moritz Raabe
75defc13a0 disable fail-fast for tests job 2020-11-09 13:22:23 +01:00
Daniel Plohmann (jupiter)
7d4888bb77 addressing the comments in the PR discussion 2020-11-06 10:09:06 +01:00
Daniel Plohmann (jupiter)
1a34029171 Merge branch 'master' of github.com:fireeye/capa into backend-smda 2020-11-06 09:50:09 +01:00
Willi Ballenthin
f6ad4652e4 Merge pull request #358 from fireeye/doc/pyinstaller
document PyInstaller build process
2020-11-05 09:19:51 -07:00
pnx@pyrite
1e25604b0b replacement test for nested x64 thunks - still needs to be verified for vivisect 2020-11-05 16:31:47 +01:00
pnx@pyrite
3a43ffa641 adjusted identification of thunks via SMDA. 2020-11-05 12:58:07 +01:00
Capa Bot
8f6bcf3d98 Sync capa rules submodule 2020-11-03 14:23:36 +00:00
Moritz Raabe
0fd9753681 document PyInstaller build process
closes #357
2020-11-03 15:03:32 +01:00
Capa Bot
76a04dfe25 Sync capa rules submodule 2020-11-03 13:20:30 +00:00
Capa Bot
16317182e3 Sync capa-testfiles submodule 2020-11-03 13:14:45 +00:00
Daniel Plohmann (jupiter)
6bcdf64f67 formatting 2020-10-30 15:34:02 +01:00
Daniel Plohmann (jupiter)
d276a07a71 comments on a test where disassembly differs among backends 2020-10-30 15:29:38 +01:00
Daniel Plohmann (jupiter)
f3b59b342a Merge branch 'backend-smda' of github.com:danielplohmann/capa into backend-smda 2020-10-30 15:25:45 +01:00
Daniel Plohmann (jupiter)
4a0f1f22ba test fixes 2020-10-30 15:25:42 +01:00
Jon Crussell
0c85e7604c use magical derefs
Found derefs in viv/insn.py, does exactly what we need!
2020-10-30 07:23:24 -07:00
Jon Crussell
8f6a46e2d8 add check for pointer to string
Check if memory referenced is a pointer to a string. Fixes mimikatz
string test.
2020-10-30 07:01:07 -07:00
Daniel Plohmann (jupiter)
74b2c18296 down to 14 failed 2020-10-29 20:05:50 +01:00
Jon Crussell
b12d0b6424 tests: add smda backend test
40 failed, 73 passed.
2020-10-29 09:56:28 -07:00
Daniel Plohmann (jupiter)
60ddf0400e addressing review 2020-10-29 17:47:10 +01:00
Daniel Plohmann (jupiter)
669d3484c0 Merge remote-tracking branch 'origin/master' into backend-smda 2020-10-29 17:38:21 +01:00
William Ballenthin
5420ad97a3 sync submodules 2020-10-29 09:42:56 -06:00
Daniel Plohmann (jupiter)
36822926af initial commit for backend-smda 2020-10-29 11:28:22 +01:00
Capa Bot
eef8f2e781 Sync capa rules submodule 2020-10-29 03:50:40 +00:00
Capa Bot
31ac667623 Sync capa rules submodule 2020-10-27 15:16:07 +00:00
Capa Bot
868ceb25bf Sync capa rules submodule 2020-10-27 15:15:30 +00:00
Capa Bot
ee3ab94774 Sync capa rules submodule 2020-10-27 15:15:04 +00:00
Capa Bot
1c47877a8c Sync capa rules submodule 2020-10-27 15:14:22 +00:00
Capa Bot
84698462f3 Sync capa rules submodule 2020-10-27 15:13:25 +00:00
Capa Bot
da7dc793e7 Sync capa rules submodule 2020-10-27 15:12:51 +00:00
Capa Bot
044ee83fbc Sync capa-testfiles submodule 2020-10-26 16:48:15 +00:00
Capa Bot
aea324c4a8 Sync capa rules submodule 2020-10-26 16:47:44 +00:00
Capa Bot
4d05b20830 Sync capa rules submodule 2020-10-26 16:46:53 +00:00
Willi Ballenthin
276928951c build: event published/edited, not created 2020-10-23 15:17:32 -06:00
67 changed files with 5329 additions and 2226 deletions

.gitattributes (new file, 9 lines)

@@ -0,0 +1,9 @@
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
# Explicitly declare text files you want to always be normalized and converted
# to native line endings on checkout.
*.py text
*.yml text
*.md text
*.txt text
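With `* text=auto`, Git normalizes the line endings of files it detects as text, and the explicit `text` entries force that behavior for the listed extensions. To see which files in a working tree still contain CRLF endings, a small check like the following can help. This is a sketch for illustration; `count_line_endings` is a hypothetical helper, not part of capa or Git.

```python
def count_line_endings(data):
    """Return (crlf, lone_lf) counts for a bytes object."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get bare LFs.
    lone_lf = data.count(b"\n") - crlf
    return crlf, lone_lf
```

Reading a file in binary mode (`open(path, "rb").read()`) and passing the bytes to this function avoids Python's own newline translation, which would otherwise hide the difference.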


@@ -1,46 +1,46 @@
# Contributor Covenant Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [https://contributor-covenant.org/version/1/4][version]
[homepage]: https://contributor-covenant.org
[version]: https://contributor-covenant.org/version/1/4/


@@ -1,197 +1,197 @@
# Contributing to Capa
First off, thanks for taking the time to contribute!
The following is a set of guidelines for contributing to capa and its packages, which are hosted in the [FireEye Organization](https://github.com/fireeye) on GitHub. These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.
#### Table Of Contents
[Code of Conduct](#code-of-conduct)
[What should I know before I get started?](#what-should-i-know-before-i-get-started)
* [Capa and its Repositories](#capa-and-its-repositories)
* [Capa Design Decisions](#design-decisions)
[How Can I Contribute?](#how-can-i-contribute)
* [Reporting Bugs](#reporting-bugs)
* [Suggesting Enhancements](#suggesting-enhancements)
* [Your First Code Contribution](#your-first-code-contribution)
* [Pull Requests](#pull-requests)
[Styleguides](#styleguides)
* [Git Commit Messages](#git-commit-messages)
* [Python Styleguide](#python-styleguide)
* [Rules Styleguide](#rules-styleguide)
## Code of Conduct
This project and everyone participating in it is governed by the [Capa Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to the maintainers.
## What should I know before I get started?
### Capa and its repositories
We host the capa project as three GitHub repositories:
- [capa](https://github.com/fireeye/capa)
- [capa-rules](https://github.com/fireeye/capa-rules)
- [capa-testfiles](https://github.com/fireeye/capa-testfiles)
The command line tools, logic engine, and other Python source code are found in the `capa` repository.
This is the repository to fork when you want to enhance the features, performance, or user interface of capa.
Do *not* push rules directly to this repository, instead...
The standard rules contributed by the community are found in the `capa-rules` repository.
When you have an idea for a new rule, you should open a PR against `capa-rules`.
We keep `capa` and `capa-rules` separate to distinguish where ideas, bugs, and discussions should happen.
If you're writing YAML, it probably goes in `capa-rules`; if you're writing Python, it probably goes in `capa`.
Also, we encourage users to develop their own rule repositories, so we treat our default set of rules in the same way.
Test fixtures, such as malware samples and analysis workspaces, are found in the `capa-testfiles` repository.
These are files you'll need in order to run the linter (in `--thorough` mode) and full test suites;
however, they take up a lot of space (1GB+), so by keeping `capa-testfiles` separate,
a shallow checkout of `capa` and `capa-rules` doesn't take much bandwidth.
### Design Decisions
When we make a significant decision in how we maintain the project and what we can or cannot support,
we will document it in the [capa issues tracker](https://github.com/fireeye/capa/issues).
This is the best place to review our discussions about what/how/why we do things in the project.
If you have a question, check to see if it is documented there.
If it is *not* documented there, or you can't find an answer, please open an issue.
We'll link to existing issues when appropriate to keep discussions in one place.
## How Can I Contribute?
### Reporting Bugs
This section guides you through submitting a bug report for capa.
Following these guidelines helps maintainers and the community understand your report, reproduce the behavior, and find related reports.
Before creating bug reports, please check [this list](#before-submitting-a-bug-report)
as you might find out that you don't need to create one.
When you are creating a bug report, please [include as many details as possible](#how-do-i-submit-a-good-bug-report).
Fill out [the required template](./ISSUE_TEMPLATE/bug_report.md);
the information it asks for helps us resolve issues faster.
> **Note:** If you find a **Closed** issue that seems like it is the same thing that you're experiencing, open a new issue and include a link to the original issue in the body of your new one.
#### Before Submitting A Bug Report
* **Determine [which repository the problem should be reported in](#capa-and-its-repositories)**.
* **Perform a [cursory search](https://github.com/fireeye/capa/issues?q=is%3Aissue)** to see if the problem has already been reported. If it has **and the issue is still open**, add a comment to the existing issue instead of opening a new one.
#### How Do I Submit A (Good) Bug Report?
Bugs are tracked as [GitHub issues](https://guides.github.com/features/issues/).
After you've determined [which repository](#capa-and-its-repositories) your bug is related to,
create an issue on that repository and provide the following information by filling in
[the template](./ISSUE_TEMPLATE/bug_report.md).
Explain the problem and include additional details to help maintainers reproduce the problem:
* **Use a clear and descriptive title** for the issue to identify the problem.
* **Describe the exact steps which reproduce the problem** in as much detail as possible. For example, start by explaining how you started capa, e.g. which command exactly you used in the terminal, or how you started capa otherwise.
* **Provide specific examples to demonstrate the steps**. Include links to files or GitHub projects, or copy/pasteable snippets, which you use in those examples. If you're providing snippets in the issue, use [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the behavior you observed after following the steps** and point out what exactly is the problem with that behavior.
* **Explain which behavior you expected to see instead and why.**
* **Include screenshots and animated GIFs** which show you following the described steps and clearly demonstrate the problem. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **If you're reporting that capa crashed**, include the stack trace from the terminal. Include the stack trace in the issue in a [code block](https://help.github.com/articles/markdown-basics/#multiple-lines), a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or put it in a [gist](https://gist.github.com/) and provide a link to that gist.
* **If the problem wasn't triggered by a specific action**, describe what you were doing before the problem happened and share more information using the guidelines below.
Provide more context by answering these questions:
* **Did the problem start happening recently** (e.g. after updating to a new version of capa) or was this always a problem?
* If the problem started happening recently, **can you reproduce the problem in an older version of capa?** What's the most recent version in which the problem doesn't happen? You can download older versions of capa from [the releases page](https://github.com/fireeye/capa/releases).
* **Can you reliably reproduce the issue?** If not, provide details about how often the problem happens and under which conditions it normally happens.
* If the problem is related to working with files (e.g. opening and editing files), **does the problem happen for all files and projects or only some?** Does the problem happen only when working with local or remote files (e.g. on network drives), with files of a specific type (e.g. only JavaScript or Python files), with large files or files with very long lines, or with files in a specific encoding? Is there anything else special about the files you are using?
Include details about your configuration and environment:
* **Which version of capa are you using?** You can get the exact version by running `capa --version` in your terminal.
* **What's the name and version of the OS you're using**?
### Suggesting Enhancements
This section guides you through submitting an enhancement suggestion for capa, including completely new features and minor improvements to existing functionality. Following these guidelines helps maintainers and the community understand your suggestion and find related suggestions.
Before creating enhancement suggestions, please check [this list](#before-submitting-an-enhancement-suggestion) as you might find out that you don't need to create one. When you are creating an enhancement suggestion, please [include as many details as possible](#how-do-i-submit-a-good-enhancement-suggestion). Fill in [the template](./ISSUE_TEMPLATE/feature_request.md), including the steps that you imagine you would take if the feature you're requesting existed.
#### Before Submitting An Enhancement Suggestion
* **Determine [which repository the enhancement should be suggested in](#capa-and-its-repositories).**
* **Perform a [cursory search](https://github.com/fireeye/capa/issues?q=is%3Aissue)** to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one.
#### How Do I Submit A (Good) Enhancement Suggestion?
Enhancement suggestions are tracked as [GitHub issues](https://guides.github.com/features/issues/). After you've determined [which repository](#capa-and-its-repositories) your enhancement suggestion is related to, create an issue on that repository and provide the following information:
* **Use a clear and descriptive title** for the issue to identify the suggestion.
* **Provide a step-by-step description of the suggested enhancement** in as many details as possible.
* **Provide specific examples to demonstrate the steps**. Include copy/pasteable snippets which you use in those examples, as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the current behavior** and **explain which behavior you expected to see instead** and why.
* **Include screenshots and animated GIFs** which help you demonstrate the steps or point out the part of capa which the suggestion is related to. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **Explain why this enhancement would be useful** to most capa users and isn't something that can or should be implemented as an external tool that uses capa as a library.
* **Specify which version of capa you're using.** You can get the exact version by running `capa --version` in your terminal.
* **Specify the name and version of the OS you're using.**
### Your First Code Contribution
Unsure where to begin contributing to capa? You can start by looking through these `good-first-issue` and `rule-idea` issues:
* [good-first-issue](https://github.com/fireeye/capa/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) - issues which should only require a few lines of code, and a test or two.
* [rule-idea](https://github.com/fireeye/capa-rules/issues?q=is%3Aissue+is%3Aopen+label%3A%22rule+idea%22) - issues that describe potential new rule ideas.
Both issue lists are sorted by total number of comments. While not perfect, the number of comments is a reasonable proxy for the impact a given change will have.
#### Local development
capa and all its resources can be developed locally.
For instructions on how to do this, see the "Method 3" section of the [installation guide](https://github.com/fireeye/capa/blob/master/doc/installation.md).
### Pull Requests
The process described here has several goals:
- Maintain capa's quality
- Fix problems that are important to users
- Engage the community in working toward the best possible capa
- Enable a sustainable system for capa's maintainers to review contributions
Please follow these steps to have your contribution considered by the maintainers:
1. Follow all instructions in [the template](PULL_REQUEST_TEMPLATE.md)
2. Follow the [styleguides](#styleguides)
3. After you submit your pull request, verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing <details><summary>What if the status checks are failing? </summary>If a status check is failing, and you believe that the failure is unrelated to your change, please leave a comment on the pull request explaining why you believe the failure is unrelated. A maintainer will re-run the status check for you. If we conclude that the failure was a false positive, then we will open an issue to track that problem with our status check suite.</details>
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.
## Styleguides
### Git Commit Messages
* Use the present tense ("Add feature" not "Added feature")
* Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
* Prefix the first line with the component in question ("rules: ..." or "render: ...")
* Reference issues and pull requests liberally after the first line
### Python Styleguide
All Python code must adhere to the style guide used by capa:
1. [PEP8](https://www.python.org/dev/peps/pep-0008/), with clarifications from
2. [Willi's style guide](https://docs.google.com/document/d/1iRpeg-w4DtibwytUyC_dDT7IGhNGBP25-nQfuBa-Fyk/edit?usp=sharing), formatted with
3. [isort](https://pypi.org/project/isort/) (with line width 120 and ordered by line length), and formatted with
4. [black](https://github.com/psf/black) (with line width 120), and formatted with
5. [dos2unix](https://linux.die.net/man/1/dos2unix)
Our CI pipeline will reformat and enforce the Python styleguide.
### Rules Styleguide
All (non-nursery) capa rules must:
1. pass the [linter](https://github.com/fireeye/capa/blob/master/scripts/lint.py), and
2. be formatted with [capafmt](https://github.com/fireeye/capa/blob/master/scripts/capafmt.py)
This ensures that all rules meet the same minimum level of quality and are structured in a consistent way.
Our CI pipeline will reformat and enforce the capa rules styleguide.
# Contributing to Capa
First off, thanks for taking the time to contribute!
The following is a set of guidelines for contributing to capa and its packages, which are hosted in the [FireEye Organization](https://github.com/fireeye) on GitHub. These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.
#### Table Of Contents
[Code of Conduct](#code-of-conduct)
[What should I know before I get started?](#what-should-i-know-before-i-get-started)
* [Capa and its Repositories](#capa-and-its-repositories)
* [Capa Design Decisions](#design-decisions)
[How Can I Contribute?](#how-can-i-contribute)
* [Reporting Bugs](#reporting-bugs)
* [Suggesting Enhancements](#suggesting-enhancements)
* [Your First Code Contribution](#your-first-code-contribution)
* [Pull Requests](#pull-requests)
[Styleguides](#styleguides)
* [Git Commit Messages](#git-commit-messages)
* [Python Styleguide](#python-styleguide)
* [Rules Styleguide](#rules-styleguide)
## Code of Conduct
This project and everyone participating in it is governed by the [Capa Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to the maintainers.
## What should I know before I get started?
### Capa and its repositories
We host the capa project as three Github repositories:
- [capa](https://github.com/fireeye/capa)
- [capa-rules](https://github.com/fireeye/capa-rules)
- [capa-testfiles](https://github.com/fireeye/capa-testfiles)
The command line tools, logic engine, and other Python source code are found in the `capa` repository.
This is the repository to fork when you want to enhance the features, performance, or user interface of capa.
Do *not* push rules directly to this repository, instead...
The standard rules contributed by the community are found in the `capa-rules` repository.
When you have an idea for a new rule, you should open a PR against `capa-rules`.
We keep `capa` and `capa-rules` separate to distinguish where ideas, bugs, and discussions should happen.
If you're writing yaml it probably goes in `capa-rules` and if you're writing Python it probably goes in `capa`.
Also, we encourage users to develop their own rule repositories, so we treat our default set of rules in the same way.
Test fixtures, such as malware samples and analysis workspaces, are found in the `capa-testfiles` repository.
These are files you'll need in order to run the linter (in `--thorough` mode) and full test suites;
however, they take up a lot of space (1GB+), so by keeping `capa-testfiles` separate,
a shallow checkout of `capa` and `capa-rules` doesn't take much bandwidth.
### Design Decisions
When we make a significant decision in how we maintain the project and what we can or cannot support,
we will document it in the [capa issues tracker](https://github.com/fireeye/capa/issues).
This is the best place to review our discussions about what/how/why we do things in the project.
If you have a question, check to see if it is documented there.
If it is *not* documented there, or you can't find an answer, please open an issue.
We'll link to existing issues when appropriate to keep discussions in one place.
## How Can I Contribute?
### Reporting Bugs
This section guides you through submitting a bug report for capa.
Following these guidelines helps maintainers and the community understand your report, reproduce the behavior, and find related reports.
Before creating bug reports, please check [this list](#before-submitting-a-bug-report)
as you might find out that you don't need to create one.
When you are creating a bug report, please [include as many details as possible](#how-do-i-submit-a-good-bug-report).
Fill out [the required template](./ISSUE_TEMPLATE/bug_report.md),
the information it asks for helps us resolve issues faster.
> **Note:** If you find a **Closed** issue that seems like it is the same thing that you're experiencing, open a new issue and include a link to the original issue in the body of your new one.
#### Before Submitting A Bug Report
* **Determine [which repository the problem should be reported in](#capa-and-its-repositories)**.
* **Perform a [cursory search](https://github.com/fireeye/capa/issues?q=is%3Aissue)** to see if the problem has already been reported. If it has **and the issue is still open**, add a comment to the existing issue instead of opening a new one.
#### How Do I Submit A (Good) Bug Report?
Bugs are tracked as [GitHub issues](https://guides.github.com/features/issues/).
After you've determined [which repository](#capa-and-its-repositories) your bug is related to,
create an issue on that repository and provide the following information by filling in
[the template](./ISSUE_TEMPLATE/bug_report.md).
Explain the problem and include additional details to help maintainers reproduce the problem:
* **Use a clear and descriptive title** for the issue to identify the problem.
* **Describe the exact steps which reproduce the problem** in as many details as possible. For example, start by explaining how you started capa, e.g. which command exactly you used in the terminal, or how you started capa otherwise.
* **Provide specific examples to demonstrate the steps**. Include links to files or GitHub projects, or copy/pasteable snippets, which you use in those examples. If you're providing snippets in the issue, use [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the behavior you observed after following the steps** and point out what exactly is the problem with that behavior.
* **Explain which behavior you expected to see instead and why.**
* **Include screenshots and animated GIFs** which show you following the described steps and clearly demonstrate the problem. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **If you're reporting that capa crashed**, include the stack trace from the terminal. Include the stack trace in the issue in a [code block](https://help.github.com/articles/markdown-basics/#multiple-lines), a [file attachment](https://help.github.com/articles/file-attachments-on-issues-and-pull-requests/), or put it in a [gist](https://gist.github.com/) and provide a link to that gist.
* **If the problem wasn't triggered by a specific action**, describe what you were doing before the problem happened and share more information using the guidelines below.
Provide more context by answering these questions:
* **Did the problem start happening recently** (e.g. after updating to a new version of capa) or was this always a problem?
* If the problem started happening recently, **can you reproduce the problem in an older version of capa?** What's the most recent version in which the problem doesn't happen? You can download older versions of capa from [the releases page](https://github.com/fireeye/capa/releases).
* **Can you reliably reproduce the issue?** If not, provide details about how often the problem happens and under which conditions it normally happens.
* If the problem is related to working with files (e.g. opening and editing files), **does the problem happen for all files and projects or only some?** Does the problem happen only when working with local or remote files (e.g. on network drives), with files of a specific type (e.g. only JavaScript or Python files), with large files or files with very long lines, or with files in a specific encoding? Is there anything else special about the files you are using?
Include details about your configuration and environment:
* **Which version of capa are you using?** You can get the exact version by running `capa --version` in your terminal.
* **What's the name and version of the OS you're using**?
### Suggesting Enhancements
This section guides you through submitting an enhancement suggestion for capa, including completely new features and minor improvements to existing functionality. Following these guidelines helps maintainers and the community understand your suggestion and find related suggestions.
Before creating enhancement suggestions, please check [this list](#before-submitting-an-enhancement-suggestion) as you might find out that you don't need to create one. When you are creating an enhancement suggestion, please [include as many details as possible](#how-do-i-submit-a-good-enhancement-suggestion). Fill in [the template](./ISSUE_TEMPLATE/feature_request.md), including the steps that you imagine you would take if the feature you're requesting existed.
#### Before Submitting An Enhancement Suggestion
* **Determine [which repository the enhancement should be suggested in](#capa-and-its-repositories).**
* **Perform a [cursory search](https://github.com/fireeye/capa/issues?q=is%3Aissue)** to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one.
#### How Do I Submit A (Good) Enhancement Suggestion?
Enhancement suggestions are tracked as [GitHub issues](https://guides.github.com/features/issues/). After you've determined [which repository](#capa-and-its-repositories) your enhancement suggestion is related to, create an issue on that repository and provide the following information:
* **Use a clear and descriptive title** for the issue to identify the suggestion.
* **Provide a step-by-step description of the suggested enhancement** in as many details as possible.
* **Provide specific examples to demonstrate the steps**. Include copy/pasteable snippets which you use in those examples, as [Markdown code blocks](https://help.github.com/articles/markdown-basics/#multiple-lines).
* **Describe the current behavior** and **explain which behavior you expected to see instead** and why.
* **Include screenshots and animated GIFs** which help you demonstrate the steps or point out the part of capa which the suggestion is related to. You can use [this tool](https://www.cockos.com/licecap/) to record GIFs on macOS and Windows, and [this tool](https://github.com/colinkeenan/silentcast) or [this tool](https://github.com/GNOME/byzanz) on Linux.
* **Explain why this enhancement would be useful** to most capa users and isn't something that can or should be implemented as an external tool that uses capa as a library.
* **Specify which version of capa you're using.** You can get the exact version by running `capa --version` in your terminal.
* **Specify the name and version of the OS you're using.**
### Your First Code Contribution
Unsure where to begin contributing to capa? You can start by looking through these `good-first-issue` and `rule-idea` issues:
* [good-first-issue](https://github.com/fireeye/capa/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) - issues which should only require a few lines of code, and a test or two.
* [rule-idea](https://github.com/fireeye/capa-rules/issues?q=is%3Aissue+is%3Aopen+label%3A%22rule+idea%22) - issues that describe potential new rule ideas.
Both issue lists are sorted by total number of comments. While not perfect, the number of comments is a reasonable proxy for the impact a given change will have.
#### Local development
capa and all its resources can be developed locally.
For instructions on how to do this, see the "Method 3" section of the [installation guide](https://github.com/fireeye/capa/blob/master/doc/installation.md).
### Pull Requests
The process described here has several goals:
- Maintain capa's quality
- Fix problems that are important to users
- Engage the community in working toward the best possible capa
- Enable a sustainable system for capa's maintainers to review contributions
Please follow these steps to have your contribution considered by the maintainers:
1. Follow all instructions in [the template](PULL_REQUEST_TEMPLATE.md)
2. Follow the [styleguides](#styleguides)
3. After you submit your pull request, verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing <details><summary>What if the status checks are failing? </summary>If a status check is failing, and you believe that the failure is unrelated to your change, please leave a comment on the pull request explaining why you believe the failure is unrelated. A maintainer will re-run the status check for you. If we conclude that the failure was a false positive, then we will open an issue to track that problem with our status check suite.</details>
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.
## Styleguides
### Git Commit Messages
* Use the present tense ("Add feature" not "Added feature")
* Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
* Prefix the first line with the component in question ("rules: ..." or "render: ...")
* Reference issues and pull requests liberally after the first line
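As an illustration of the prefix convention above, here is a minimal sketch of a subject-line check. The pattern and the `check_subject` helper are hypothetical and not part of capa or its CI; the sketch assumes component names are lowercase and that merge commits are exempt.

```python
import re

# Hypothetical pattern: a lowercase component name, a colon, a space, then the summary.
SUBJECT_PATTERN = re.compile(r"^(?P<component>[a-z][a-z0-9_./-]*): (?P<summary>\S.*)$")


def check_subject(subject):
    """Return the component prefix if the subject follows the convention, else None."""
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        return None
    return match.group("component")


# A subject with a component prefix is accepted; one without is rejected.
print(check_subject("rules: add linter check for missing examples"))
print(check_subject("Added a new feature"))
```

A check like this only covers the prefix; tense and mood still need a human reviewer.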
### Python Styleguide
All Python code must adhere to the style guide used by capa:
1. [PEP8](https://www.python.org/dev/peps/pep-0008/), with clarifications from
2. [Willi's style guide](https://docs.google.com/document/d/1iRpeg-w4DtibwytUyC_dDT7IGhNGBP25-nQfuBa-Fyk/edit?usp=sharing), formatted with
3. [isort](https://pypi.org/project/isort/) (with line width 120 and ordered by line length), and formatted with
4. [black](https://github.com/psf/black) (with line width 120), and formatted with
5. [dos2unix](https://linux.die.net/man/1/dos2unix)
Our CI pipeline will reformat and enforce the Python styleguide.
### Rules Styleguide
All (non-nursery) capa rules must:
1. pass the [linter](https://github.com/fireeye/capa/blob/master/scripts/lint.py), and
2. be formatted with [capafmt](https://github.com/fireeye/capa/blob/master/scripts/capafmt.py)
This ensures that all rules meet the same minimum level of quality and are structured in a consistent way.
Our CI pipeline will reformat and enforce the capa rules styleguide.


@@ -1,47 +1,47 @@
---
name: Bug report
about: Create a report to help us improve
---
<!--
# Is your bug report related to capa rules (for example a false positive)?
We use sybmodules to separate code, rules and test data. If your issue is related to capa rules, please report it at https://github.com/fireeye/capa-rules/issues.
# Have you checked that your issue isn't already filed?
Please search if there is a similar issue at https://github.com/fireeye/capa/issues. If there is already a similar issue, please add more details there instead of opening a new one.
# Have you read capa's Code of Conduct?
By filing an Issue, you are expected to comply with it, including treating everyone with respect: https://github.com/fireeye/capa/blob/master/.github/CODE_OF_CONDUCT.md
# Have you read capa's CONTRIBUTING guide?
It contains helpful information about how to contribute to capa. Check https://github.com/fireeye/capa/blob/master/.github/CONTRIBUTING.md#reporting-bugs
-->
### Description
<!-- Description of the issue -->
### Steps to Reproduce
<!-- 1. First Step -->
<!-- 2. Second Step -->
<!-- 3. and so on… -->
**Expected behavior:**
<!-- What you expect to happen -->
**Actual behavior:**
<!-- What actually happens -->
### Versions
<!-- You can get this information from copy and pasting the output of `capa --version` from the command line.
Please specify the component you're using (e.g. standalone tool or IDA Pro integration) and your Python version.
Also, please include the OS and what version of the OS you're running. -->
### Additional Information
<!-- Any additional information, configuration or data that might be necessary to reproduce the issue. -->
---
name: Bug report
about: Create a report to help us improve
---
<!--
# Is your bug report related to capa rules (for example a false positive)?
We use submodules to separate code, rules and test data. If your issue is related to capa rules, please report it at https://github.com/fireeye/capa-rules/issues.
# Have you checked that your issue isn't already filed?
Please search if there is a similar issue at https://github.com/fireeye/capa/issues. If there is already a similar issue, please add more details there instead of opening a new one.
# Have you read capa's Code of Conduct?
By filing an Issue, you are expected to comply with it, including treating everyone with respect: https://github.com/fireeye/capa/blob/master/.github/CODE_OF_CONDUCT.md
# Have you read capa's CONTRIBUTING guide?
It contains helpful information about how to contribute to capa. Check https://github.com/fireeye/capa/blob/master/.github/CONTRIBUTING.md#reporting-bugs
-->
### Description
<!-- Description of the issue -->
### Steps to Reproduce
<!-- 1. First Step -->
<!-- 2. Second Step -->
<!-- 3. and so on… -->
**Expected behavior:**
<!-- What you expect to happen -->
**Actual behavior:**
<!-- What actually happens -->
### Versions
<!-- You can get this information from copy and pasting the output of `capa --version` from the command line.
Please specify the component you're using (e.g. standalone tool or IDA Pro integration) and your Python version.
Also, please include the OS and what version of the OS you're running. -->
### Additional Information
<!-- Any additional information, configuration or data that might be necessary to reproduce the issue. -->


@@ -1,35 +1,35 @@
---
name: Feature request
about: Suggest an idea for capa
---
<!--
# Is your issue related to capa rules (for example an idea for a new rule)?
We use sybmodules to separate code, rules and test data. If your issue is related to capa rules, please report it at https://github.com/fireeye/capa-rules/issues.
# Have you checked that your issue isn't already filed?
Please search if there is a similar issue at https://github.com/fireeye/capa/issues. If there is already a similar issue, please add more details there instead of opening a new one.
# Have you read capa's Code of Conduct?
By filing an Issue, you are expected to comply with it, including treating everyone with respect: https://github.com/fireeye/capa/blob/master/.github/CODE_OF_CONDUCT.md
# Have you read capa's CONTRIBUTING guide?
It contains helpful information about how to contribute to capa. Check https://github.com/fireeye/capa/blob/master/.github/CONTRIBUTING.md#suggesting-enhancements
-->
### Summary
<!-- One paragraph explanation of the feature. -->
### Motivation
<!-- Why are we doing this? What use cases does it support? What is the expected outcome? -->
### Describe alternatives you've considered
<!-- A clear and concise description of the alternative solutions you've considered. -->
## Additional context
<!-- Add any other context or screenshots about the feature request here. -->
---
name: Feature request
about: Suggest an idea for capa
---
<!--
# Is your issue related to capa rules (for example an idea for a new rule)?
We use submodules to separate code, rules and test data. If your issue is related to capa rules, please report it at https://github.com/fireeye/capa-rules/issues.
# Have you checked that your issue isn't already filed?
Please search if there is a similar issue at https://github.com/fireeye/capa/issues. If there is already a similar issue, please add more details there instead of opening a new one.
# Have you read capa's Code of Conduct?
By filing an Issue, you are expected to comply with it, including treating everyone with respect: https://github.com/fireeye/capa/blob/master/.github/CODE_OF_CONDUCT.md
# Have you read capa's CONTRIBUTING guide?
It contains helpful information about how to contribute to capa. Check https://github.com/fireeye/capa/blob/master/.github/CONTRIBUTING.md#suggesting-enhancements
-->
### Summary
<!-- One paragraph explanation of the feature. -->
### Motivation
<!-- Why are we doing this? What use cases does it support? What is the expected outcome? -->
### Describe alternatives you've considered
<!-- A clear and concise description of the alternative solutions you've considered. -->
## Additional context
<!-- Add any other context or screenshots about the feature request here. -->

6
.github/dependabot.yml vendored Normal file

@@ -0,0 +1,6 @@
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"

32
.github/pull_request_template.md vendored Normal file

@@ -0,0 +1,32 @@
<!--
Thank you for contributing to capa! :heart:
IMPORTANT NOTE
It's most important that you submit your improvements. So even if you don't use this complete template we look forward to collaborating!
Please read capa's CONTRIBUTING guide if you haven't done so already.
It contains helpful information about how to contribute to capa. Check https://github.com/fireeye/capa/blob/master/.github/CONTRIBUTING.md
PR template based on https://embeddedartistry.com/blog/2017/08/04/a-github-pull-request-template-for-your-projects/
-->
### Description
<!-- Please describe the changes in this PR. Including your motivation and context helps us to review. -->
closes # (issue)
### Type of change
Please update the [CHANGELOG.md](/CHANGELOG.md)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
- [ ] I have made the corresponding changes to the documentation
### Tests
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] No new tests needed


@@ -0,0 +1,5 @@
# Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
import PyInstaller.utils.hooks
# ref: https://groups.google.com/g/pyinstaller/c/amWi0-66uZI/m/miPoKfWjBAAJ
binaries = PyInstaller.utils.hooks.collect_dynamic_libs("capstone")


@@ -13,3 +13,144 @@ from PyInstaller.utils.hooks import copy_metadata
#
# ref: https://github.com/pyinstaller/pyinstaller/issues/1713#issuecomment-162682084
datas = copy_metadata("vivisect")
excludedimports = [
# viv gui requires these heavy libraries,
# but viv as a library doesn't.
# they shouldn't be installed in our configuration,
# but we'll ensure they don't slip in here (such as on developers' systems).
"PyQt5",
"qt5",
"pyqtwebengine",
# the above are imported by these viv modules.
# so really, we'd want to exclude these submodules of viv.
# but i dont think this works.
"vqt",
"vdb.qt",
"envi.qt",
# unused by capa
"pyasn1",
]
hiddenimports = [
# vivisect does manual/runtime importing of its modules,
# so declare the things that could be imported here.
"vivisect",
"vivisect.analysis",
"vivisect.analysis.amd64",
"vivisect.analysis.amd64",
"vivisect.analysis.amd64.emulation",
"vivisect.analysis.amd64.golang",
"vivisect.analysis.crypto",
"vivisect.analysis.crypto",
"vivisect.analysis.crypto.constants",
"vivisect.analysis.elf",
"vivisect.analysis.elf",
"vivisect.analysis.elf.elfplt",
"vivisect.analysis.elf.libc_start_main",
"vivisect.analysis.generic",
"vivisect.analysis.generic",
"vivisect.analysis.generic.codeblocks",
"vivisect.analysis.generic.emucode",
"vivisect.analysis.generic.entrypoints",
"vivisect.analysis.generic.funcentries",
"vivisect.analysis.generic.impapi",
"vivisect.analysis.generic.mkpointers",
"vivisect.analysis.generic.pointers",
"vivisect.analysis.generic.pointertables",
"vivisect.analysis.generic.relocations",
"vivisect.analysis.generic.strconst",
"vivisect.analysis.generic.switchcase",
"vivisect.analysis.generic.thunks",
"vivisect.analysis.generic.noret",
"vivisect.analysis.i386",
"vivisect.analysis.i386",
"vivisect.analysis.i386.calling",
"vivisect.analysis.i386.golang",
"vivisect.analysis.i386.importcalls",
"vivisect.analysis.i386.instrhook",
"vivisect.analysis.i386.thunk_bx",
"vivisect.analysis.ms",
"vivisect.analysis.ms",
"vivisect.analysis.ms.hotpatch",
"vivisect.analysis.ms.localhints",
"vivisect.analysis.ms.msvc",
"vivisect.analysis.ms.msvcfunc",
"vivisect.analysis.ms.vftables",
"vivisect.analysis.pe",
"vivisect.impapi.posix.amd64",
"vivisect.impapi.posix.i386",
"vivisect.impapi.windows",
"vivisect.impapi.windows.amd64",
"vivisect.impapi.windows.i386",
"vivisect.impapi.winkern.i386",
"vivisect.impapi.winkern.amd64",
"vivisect.parsers.blob",
"vivisect.parsers.elf",
"vivisect.parsers.ihex",
"vivisect.parsers.macho",
"vivisect.parsers.pe",
"vivisect.storage",
"vivisect.storage.basicfile",
"vstruct.constants",
"vstruct.constants.ntstatus",
"vstruct.defs",
"vstruct.defs.arm7",
"vstruct.defs.bmp",
"vstruct.defs.dns",
"vstruct.defs.elf",
"vstruct.defs.gif",
"vstruct.defs.ihex",
"vstruct.defs.inet",
"vstruct.defs.java",
"vstruct.defs.kdcom",
"vstruct.defs.macho",
"vstruct.defs.macho.const",
"vstruct.defs.macho.fat",
"vstruct.defs.macho.loader",
"vstruct.defs.macho.stabs",
"vstruct.defs.minidump",
"vstruct.defs.pcap",
"vstruct.defs.pe",
"vstruct.defs.pptp",
"vstruct.defs.rar",
"vstruct.defs.swf",
"vstruct.defs.win32",
"vstruct.defs.windows",
"vstruct.defs.windows.win_5_1_i386",
"vstruct.defs.windows.win_5_1_i386.ntdll",
"vstruct.defs.windows.win_5_1_i386.ntoskrnl",
"vstruct.defs.windows.win_5_1_i386.win32k",
"vstruct.defs.windows.win_5_2_i386",
"vstruct.defs.windows.win_5_2_i386.ntdll",
"vstruct.defs.windows.win_5_2_i386.ntoskrnl",
"vstruct.defs.windows.win_5_2_i386.win32k",
"vstruct.defs.windows.win_6_1_amd64",
"vstruct.defs.windows.win_6_1_amd64.ntdll",
"vstruct.defs.windows.win_6_1_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_1_amd64.win32k",
"vstruct.defs.windows.win_6_1_i386",
"vstruct.defs.windows.win_6_1_i386.ntdll",
"vstruct.defs.windows.win_6_1_i386.ntoskrnl",
"vstruct.defs.windows.win_6_1_i386.win32k",
"vstruct.defs.windows.win_6_1_wow64",
"vstruct.defs.windows.win_6_1_wow64.ntdll",
"vstruct.defs.windows.win_6_2_amd64",
"vstruct.defs.windows.win_6_2_amd64.ntdll",
"vstruct.defs.windows.win_6_2_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_2_amd64.win32k",
"vstruct.defs.windows.win_6_2_i386",
"vstruct.defs.windows.win_6_2_i386.ntdll",
"vstruct.defs.windows.win_6_2_i386.ntoskrnl",
"vstruct.defs.windows.win_6_2_i386.win32k",
"vstruct.defs.windows.win_6_2_wow64",
"vstruct.defs.windows.win_6_2_wow64.ntdll",
"vstruct.defs.windows.win_6_3_amd64",
"vstruct.defs.windows.win_6_3_amd64.ntdll",
"vstruct.defs.windows.win_6_3_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_3_i386",
"vstruct.defs.windows.win_6_3_i386.ntdll",
"vstruct.defs.windows.win_6_3_i386.ntoskrnl",
"vstruct.defs.windows.win_6_3_wow64",
"vstruct.defs.windows.win_6_3_wow64.ntdll",
]
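
A hand-maintained list like the one above can drift as the library adds modules (the hunk adds `vivisect.analysis.generic.noret`, for example). Below is a minimal sketch of how a package's submodules could be enumerated for cross-checking, using only the standard library. The helper name `list_submodules` is hypothetical, the demonstration uses `json` rather than `vivisect`, and it is an assumption that such a walk would find everything vivisect imports at runtime.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the package name plus the dotted names of its submodules, sorted."""
    package = importlib.import_module(package_name)
    names = [package_name]
    # Plain modules have no __path__, so there is nothing to walk.
    path = getattr(package, "__path__", None)
    if path is None:
        return names
    # walk_packages imports subpackages in order to descend into them.
    for info in pkgutil.walk_packages(path, prefix=package_name + "."):
        names.append(info.name)
    return sorted(names)


for name in list_submodules("json"):
    print(repr(name) + ",")
```

The printed lines have the same shape as the entries in `hiddenimports`, so they could be compared against the list by eye or with a set difference.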


@@ -16,9 +16,10 @@ with open('./capa/version.py', 'wb') as f:
# - commits since
# g------- git hash fragment
version = (subprocess.check_output(["git", "describe", "--always", "--tags", "--long"])
.decode("utf-8")
.strip()
.replace("tags/", ""))
f.write("__version__ = '%s'" % version)
f.write(("__version__ = '%s'" % version).encode("utf-8"))
a = Analysis(
# when invoking pyinstaller from the project root,
@@ -41,128 +42,6 @@ a = Analysis(
# ref: https://stackoverflow.com/a/62278462/87207
(os.path.dirname(wcwidth.__file__), 'wcwidth')
],
hiddenimports=[
# vivisect does manual/runtime importing of its modules,
# so declare the things that could be imported here.
"vivisect",
"vivisect.analysis",
"vivisect.analysis.amd64",
"vivisect.analysis.amd64",
"vivisect.analysis.amd64.emulation",
"vivisect.analysis.amd64.golang",
"vivisect.analysis.crypto",
"vivisect.analysis.crypto",
"vivisect.analysis.crypto.constants",
"vivisect.analysis.elf",
"vivisect.analysis.elf",
"vivisect.analysis.elf.elfplt",
"vivisect.analysis.elf.libc_start_main",
"vivisect.analysis.generic",
"vivisect.analysis.generic",
"vivisect.analysis.generic.codeblocks",
"vivisect.analysis.generic.emucode",
"vivisect.analysis.generic.entrypoints",
"vivisect.analysis.generic.funcentries",
"vivisect.analysis.generic.impapi",
"vivisect.analysis.generic.mkpointers",
"vivisect.analysis.generic.pointers",
"vivisect.analysis.generic.pointertables",
"vivisect.analysis.generic.relocations",
"vivisect.analysis.generic.strconst",
"vivisect.analysis.generic.switchcase",
"vivisect.analysis.generic.thunks",
"vivisect.analysis.i386",
"vivisect.analysis.i386",
"vivisect.analysis.i386.calling",
"vivisect.analysis.i386.golang",
"vivisect.analysis.i386.importcalls",
"vivisect.analysis.i386.instrhook",
"vivisect.analysis.i386.thunk_bx",
"vivisect.analysis.ms",
"vivisect.analysis.ms",
"vivisect.analysis.ms.hotpatch",
"vivisect.analysis.ms.localhints",
"vivisect.analysis.ms.msvc",
"vivisect.analysis.ms.msvcfunc",
"vivisect.analysis.ms.vftables",
"vivisect.analysis.pe",
"vivisect.impapi.posix.amd64",
"vivisect.impapi.posix.i386",
"vivisect.impapi.windows",
"vivisect.impapi.windows.amd64",
"vivisect.impapi.windows.i386",
"vivisect.impapi.winkern.i386",
"vivisect.impapi.winkern.amd64",
"vivisect.parsers.blob",
"vivisect.parsers.elf",
"vivisect.parsers.ihex",
"vivisect.parsers.macho",
"vivisect.parsers.pe",
"vivisect.parsers.utils",
"vivisect.storage",
"vivisect.storage.basicfile",
"vstruct.constants",
"vstruct.constants.ntstatus",
"vstruct.defs",
"vstruct.defs.arm7",
"vstruct.defs.bmp",
"vstruct.defs.dns",
"vstruct.defs.elf",
"vstruct.defs.gif",
"vstruct.defs.ihex",
"vstruct.defs.inet",
"vstruct.defs.java",
"vstruct.defs.kdcom",
"vstruct.defs.macho",
"vstruct.defs.macho.const",
"vstruct.defs.macho.fat",
"vstruct.defs.macho.loader",
"vstruct.defs.macho.stabs",
"vstruct.defs.minidump",
"vstruct.defs.pcap",
"vstruct.defs.pe",
"vstruct.defs.pptp",
"vstruct.defs.rar",
"vstruct.defs.swf",
"vstruct.defs.win32",
"vstruct.defs.windows",
"vstruct.defs.windows.win_5_1_i386",
"vstruct.defs.windows.win_5_1_i386.ntdll",
"vstruct.defs.windows.win_5_1_i386.ntoskrnl",
"vstruct.defs.windows.win_5_1_i386.win32k",
"vstruct.defs.windows.win_5_2_i386",
"vstruct.defs.windows.win_5_2_i386.ntdll",
"vstruct.defs.windows.win_5_2_i386.ntoskrnl",
"vstruct.defs.windows.win_5_2_i386.win32k",
"vstruct.defs.windows.win_6_1_amd64",
"vstruct.defs.windows.win_6_1_amd64.ntdll",
"vstruct.defs.windows.win_6_1_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_1_amd64.win32k",
"vstruct.defs.windows.win_6_1_i386",
"vstruct.defs.windows.win_6_1_i386.ntdll",
"vstruct.defs.windows.win_6_1_i386.ntoskrnl",
"vstruct.defs.windows.win_6_1_i386.win32k",
"vstruct.defs.windows.win_6_1_wow64",
"vstruct.defs.windows.win_6_1_wow64.ntdll",
"vstruct.defs.windows.win_6_2_amd64",
"vstruct.defs.windows.win_6_2_amd64.ntdll",
"vstruct.defs.windows.win_6_2_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_2_amd64.win32k",
"vstruct.defs.windows.win_6_2_i386",
"vstruct.defs.windows.win_6_2_i386.ntdll",
"vstruct.defs.windows.win_6_2_i386.ntoskrnl",
"vstruct.defs.windows.win_6_2_i386.win32k",
"vstruct.defs.windows.win_6_2_wow64",
"vstruct.defs.windows.win_6_2_wow64.ntdll",
"vstruct.defs.windows.win_6_3_amd64",
"vstruct.defs.windows.win_6_3_amd64.ntdll",
"vstruct.defs.windows.win_6_3_amd64.ntoskrnl",
"vstruct.defs.windows.win_6_3_i386",
"vstruct.defs.windows.win_6_3_i386.ntdll",
"vstruct.defs.windows.win_6_3_i386.ntoskrnl",
"vstruct.defs.windows.win_6_3_wow64",
"vstruct.defs.windows.win_6_3_wow64.ntdll",
],
# when invoking pyinstaller from the project root,
# this gets run from the project root.
hookspath=['.github/pyinstaller/hooks'],
@@ -180,6 +59,25 @@ a = Analysis(
# since we don't spawn a notebook, we can safely remove these.
"IPython",
"ipywidgets",
# these are pulled in by networkx
# but we don't need to compute the strongly connected components.
"numpy",
"scipy",
"matplotlib",
"pandas",
"pytest",
# deps from viv that we don't use.
# this duplicates the entries in `hook-vivisect`,
# but works better this way.
"vqt",
"vdb.qt",
"envi.qt",
"PyQt5",
"qt5",
"pyqtwebengine",
"pyasn1"
])
a.binaries = a.binaries - TOC([


@@ -1,82 +1,77 @@
name: build
on:
release:
types: [created, edited, published]
jobs:
build:
name: PyInstaller for ${{ matrix.os }}
runs-on: ${{ matrix.os }}
strategy:
matrix:
include:
- os: ubuntu-16.04
# use old linux so that the shared library versioning is more portable
artifact_name: capa
asset_name: linux
- os: windows-latest
artifact_name: capa.exe
asset_name: windows
- os: macos-latest
artifact_name: capa
asset_name: macos
steps:
- name: Checkout capa
uses: actions/checkout@v2
with:
submodules: true
- name: Set up Python 2.7
uses: actions/setup-python@v2
with:
python-version: 2.7
- if: matrix.os == 'ubuntu-latest'
run: sudo apt-get install -y libyaml-dev
- if: matrix.os == 'windows-latest'
run: |
choco install vcredist2008
choco install --ignore-dependencies vcpython27
- name: Install PyInstaller
# pyinstaller 4 doesn't support Python 2.7
run: pip install 'pyinstaller==3.*'
- name: Install capa
run: pip install -e .
- name: Build standalone executable
run: pyinstaller .github/pyinstaller/pyinstaller.spec
- name: Does it run?
run: dist/capa "tests/data/Practical Malware Analysis Lab 01-01.dll_"
- uses: actions/upload-artifact@v2
with:
name: ${{ matrix.asset_name }}
path: dist/${{ matrix.artifact_name }}
zip:
name: zip ${{ matrix.asset_name }}
runs-on: ubuntu-latest
needs: build
strategy:
matrix:
include:
- asset_name: linux
artifact_name: capa
- asset_name: windows
artifact_name: capa.exe
- asset_name: macos
artifact_name: capa
steps:
- name: Download ${{ matrix.asset_name }}
uses: actions/download-artifact@v2
with:
name: ${{ matrix.asset_name }}
- name: Set executable flag
run: chmod +x ${{ matrix.artifact_name }}
- name: Set zip name
run: echo "zip_name=capa-${GITHUB_REF#refs/tags/}-${{ matrix.asset_name }}.zip" >> $GITHUB_ENV
- name: Zip ${{ matrix.artifact_name }} into ${{ env.zip_name }}
run: zip ${{ env.zip_name }} ${{ matrix.artifact_name }}
- name: Upload ${{ env.zip_name }} to GH Release
uses: svenstaro/upload-release-action@v2
with:
repo_token: ${{ secrets.GITHUB_TOKEN}}
file: ${{ env.zip_name }}
tag: ${{ github.ref }}
name: build
on:
release:
types: [edited, published]
jobs:
build:
name: PyInstaller for ${{ matrix.os }}
runs-on: ${{ matrix.os }}
strategy:
matrix:
include:
- os: ubuntu-16.04
# use old linux so that the shared library versioning is more portable
artifact_name: capa
asset_name: linux
- os: windows-2019
artifact_name: capa.exe
asset_name: windows
- os: macos-10.15
artifact_name: capa
asset_name: macos
steps:
- name: Checkout capa
uses: actions/checkout@v2
with:
submodules: true
- name: Set up Python 3.9
uses: actions/setup-python@v2
with:
python-version: 3.9
- if: matrix.os == 'ubuntu-16.04'
run: sudo apt-get install -y libyaml-dev
- name: Install PyInstaller
run: pip install 'pyinstaller==4.2'
- name: Install capa
run: pip install -e .
- name: Build standalone executable
run: pyinstaller .github/pyinstaller/pyinstaller.spec
- name: Does it run?
run: dist/capa "tests/data/Practical Malware Analysis Lab 01-01.dll_"
- uses: actions/upload-artifact@v2
with:
name: ${{ matrix.asset_name }}
path: dist/${{ matrix.artifact_name }}
zip:
name: zip ${{ matrix.asset_name }}
runs-on: ubuntu-20.04
needs: build
strategy:
matrix:
include:
- asset_name: linux
artifact_name: capa
- asset_name: windows
artifact_name: capa.exe
- asset_name: macos
artifact_name: capa
steps:
- name: Download ${{ matrix.asset_name }}
uses: actions/download-artifact@v2
with:
name: ${{ matrix.asset_name }}
- name: Set executable flag
run: chmod +x ${{ matrix.artifact_name }}
- name: Set zip name
run: echo "zip_name=capa-${GITHUB_REF#refs/tags/}-${{ matrix.asset_name }}.zip" >> $GITHUB_ENV
- name: Zip ${{ matrix.artifact_name }} into ${{ env.zip_name }}
run: zip ${{ env.zip_name }} ${{ matrix.artifact_name }}
- name: Upload ${{ env.zip_name }} to GH Release
uses: svenstaro/upload-release-action@v2
with:
repo_token: ${{ secrets.GITHUB_TOKEN}}
file: ${{ env.zip_name }}
tag: ${{ github.ref }}


@@ -1,29 +1,29 @@
# This workflows will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
name: publish to pypi
on:
release:
types: [published]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '2.7'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
- name: Build and publish
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
python setup.py sdist bdist_wheel
twine upload --skip-existing dist/*
# This workflows will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
name: publish to pypi
on:
release:
types: [published]
jobs:
deploy:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '2.7'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine
- name: Build and publish
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
python setup.py sdist bdist_wheel
twine upload --skip-existing dist/*

.github/workflows/tag.yml

@@ -0,0 +1,24 @@
name: tag
on:
release:
types: [published]
jobs:
tag:
name: Tag capa rules
runs-on: ubuntu-20.04
steps:
- name: Checkout capa-rules
uses: actions/checkout@v2
with:
repository: fireeye/capa-rules
token: ${{ secrets.CAPA_TOKEN }}
- name: Tag capa-rules
run: git tag ${{ github.event.release.tag_name }}
- name: Push tag to capa-rules
uses: ad-m/github-push-action@master
with:
repository: fireeye/capa-rules
github_token: ${{ secrets.CAPA_TOKEN }}
tags: true


@@ -8,7 +8,7 @@ on:
jobs:
code_style:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- name: Checkout capa
uses: actions/checkout@v2
@@ -24,7 +24,7 @@ jobs:
run: black -l 120 --check .
rule_linter:
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- name: Checkout capa with rules submodule
uses: actions/checkout@v2
@@ -41,30 +41,39 @@ jobs:
run: python scripts/lint.py rules/
tests:
name: Tests in ${{ matrix.python }}
runs-on: ubuntu-latest
name: Tests in ${{ matrix.python-version }} on ${{ matrix.os }}
runs-on: ${{ matrix.os }}
needs: [code_style, rule_linter]
strategy:
fail-fast: false
matrix:
os: [ubuntu-20.04, windows-2019, macos-10.15]
# across all operating systems
python-version: [3.6, 3.9]
include:
- python: 2.7
- python: 3.6
- python: 3.7
- python: 3.8
- python: '3.9.0-rc.1' # Python latest
# on Ubuntu run these as well
- os: ubuntu-20.04
python-version: 2.7
- os: ubuntu-20.04
python-version: 3.7
- os: ubuntu-20.04
python-version: 3.8
steps:
- name: Checkout capa with submodules
uses: actions/checkout@v2
with:
submodules: true
- name: Set up Python ${{ matrix.python }}
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python }}
python-version: ${{ matrix.python-version }}
- name: Install pyyaml
if: matrix.os == 'ubuntu-20.04'
run: sudo apt-get install -y libyaml-dev
- name: Install Microsoft Visual C++ 9.0
if: matrix.os == 'windows-2019' && matrix.python-version == '2.7'
run: choco install vcpython27
- name: Install capa
run: pip install -e .[dev]
- name: Run tests
run: pytest tests/

File diff suppressed because it is too large


@@ -1,7 +1,10 @@
![capa](.github/logo.png)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/flare-capa)](https://pypi.org/project/flare-capa)
[![Last release](https://img.shields.io/github/v/release/fireeye/capa)](https://github.com/fireeye/capa/releases)
[![Number of rules](https://img.shields.io/badge/rules-485-blue.svg)](https://github.com/fireeye/capa-rules)
[![CI status](https://github.com/fireeye/capa/workflows/CI/badge.svg)](https://github.com/fireeye/capa/actions?query=workflow%3ACI+event%3Apush+branch%3Amaster)
[![Number of rules](https://img.shields.io/badge/rules-414-blue.svg)](https://github.com/fireeye/capa-rules)
[![Downloads](https://img.shields.io/github/downloads/fireeye/capa/total)](https://github.com/fireeye/capa/releases)
[![License](https://img.shields.io/badge/license-Apache--2.0-green.svg)](LICENSE.txt)
capa detects capabilities in executable files.
@@ -146,11 +149,10 @@ rule:
The [github.com/fireeye/capa-rules](https://github.com/fireeye/capa-rules) repository contains hundreds of standard library rules that are distributed with capa.
Please learn to write rules and contribute new entries as you find interesting techniques in malware.
If you use IDA Pro, then you use can use the [capa explorer IDA plugin](capa/ida/plugin/).
capa explorer lets you quickly identify and navigate to interesting areas of a program and dissect capa rule matches at
the assembly level.
If you use IDA Pro, then you can use the [capa explorer](capa/ida/plugin/) plugin.
capa explorer helps you identify interesting areas of a program and build new capa rules using features extracted directly from your IDA Pro database.
![capa + IDA Pro integration](doc/img/ida_plugin_intro.gif)
![capa + IDA Pro integration](doc/img/explorer_expanded.png)
# further information
## capa


@@ -38,6 +38,20 @@ def hex_string(h):
return " ".join(h[i : i + 2] for i in range(0, len(h), 2)).upper()
def escape_string(s):
"""escape special characters"""
s = repr(s)
if not s.startswith(('"', "'")):
# u'hello\r\nworld' -> hello\\r\\nworld
s = s[2:-1]
else:
# 'hello\r\nworld' -> hello\\r\\nworld
s = s[1:-1]
s = s.replace("\\'", "'") # repr() may escape "'" in some edge cases, remove
s = s.replace('"', '\\"') # repr() does not escape '"', add
return s
class Feature(object):
def __init__(self, value, arch=None, description=None):
"""
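
The `escape_string` helper added in the hunk above leans on `repr()` to escape control characters and then normalizes the quoting. A minimal sketch of its behavior, assuming Python 3 and repeating the function body so the snippet runs on its own (the sample strings are illustrative and not taken from the capa test suite):

```python
def escape_string(s):
    """escape special characters (body repeated from the hunk above)"""
    s = repr(s)
    if not s.startswith(('"', "'")):
        # prefixed literal such as u'...' on Python 2: drop prefix and quotes
        s = s[2:-1]
    else:
        # plain '...' or "..." literal: drop the surrounding quotes
        s = s[1:-1]
    s = s.replace("\\'", "'")  # undo the escaping that repr() may add for '
    s = s.replace('"', '\\"')  # add the escaping that repr() does not do for "
    return s


# illustrative inputs: control characters, double quotes, a single quote
for text in ["hello\r\nworld", 'say "hi"', "it's"]:
    print(escape_string(text))
```

Control characters come out as visible backslash sequences, double quotes gain a backslash, and single quotes are left bare.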


@@ -42,7 +42,9 @@ def is_ordinal(symbol):
"""
is the given symbol an ordinal that is prefixed by "#"?
"""
return symbol[0] == "#"
if symbol:
return symbol[0] == "#"
return False
def generate_symbols(dll, symbol):


@@ -166,6 +166,10 @@ def basic_block_size(bb):
def read_bytes_at(ea, count):
""" """
# check if byte has a value, see get_wide_byte doc
if not idc.is_loaded(ea):
return b""
segm_end = idc.get_segm_end(ea)
if ea + count > segm_end:
return idc.get_bytes(ea, segm_end - ea)
@@ -347,6 +351,10 @@ def find_data_reference_from_insn(insn, max_depth=10):
# break if circular reference
break
if not idaapi.is_mapped(data_refs[0]):
# break if address is not mapped
break
depth += 1
if depth > max_depth:
# break if max depth


@@ -148,6 +148,9 @@ def extract_insn_bytes_features(f, bb, insn):
example:
push offset iid_004118d4_IShellLinkA ; riid
"""
if idaapi.is_call_insn(insn):
return
ref = capa.features.extractors.ida.helpers.find_data_reference_from_insn(insn)
if ref != insn.ea:
extracted_bytes = capa.features.extractors.ida.helpers.read_bytes_at(ref, MAX_BYTES_FEATURE_SIZE)
@@ -302,7 +305,7 @@ def extract_insn_nzxor_characteristic_features(f, bb, insn):
bb (IDA BasicBlock)
insn (IDA insn_t)
"""
if insn.itype != idaapi.NN_xor:
if insn.itype not in (idaapi.NN_xor, idaapi.NN_xorpd, idaapi.NN_xorps, idaapi.NN_pxor):
return
if capa.features.extractors.ida.helpers.is_operand_equal(insn.Op1, insn.Op2):
return


@@ -0,0 +1,52 @@
import sys
import types
from smda.common.SmdaReport import SmdaReport
from smda.common.SmdaInstruction import SmdaInstruction
import capa.features.extractors.smda.file
import capa.features.extractors.smda.insn
import capa.features.extractors.smda.function
import capa.features.extractors.smda.basicblock
from capa.main import UnsupportedRuntimeError
from capa.features.extractors import FeatureExtractor
class SmdaFeatureExtractor(FeatureExtractor):
def __init__(self, smda_report: SmdaReport, path):
super(SmdaFeatureExtractor, self).__init__()
if sys.version_info < (3, 0):
raise UnsupportedRuntimeError("SMDA should only be used with Python 3.")
self.smda_report = smda_report
self.path = path
def get_base_address(self):
return self.smda_report.base_addr
def extract_file_features(self):
for feature, va in capa.features.extractors.smda.file.extract_features(self.smda_report, self.path):
yield feature, va
def get_functions(self):
for function in self.smda_report.getFunctions():
yield function
def extract_function_features(self, f):
for feature, va in capa.features.extractors.smda.function.extract_features(f):
yield feature, va
def get_basic_blocks(self, f):
for bb in f.getBlocks():
yield bb
def extract_basic_block_features(self, f, bb):
for feature, va in capa.features.extractors.smda.basicblock.extract_features(f, bb):
yield feature, va
def get_instructions(self, f, bb):
for smda_ins in bb.getInstructions():
yield smda_ins
def extract_insn_features(self, f, bb, insn):
for feature, va in capa.features.extractors.smda.insn.extract_features(f, bb, insn):
yield feature, va


@@ -0,0 +1,131 @@
import sys
import string
import struct
from capa.features import Characteristic
from capa.features.basicblock import BasicBlock
from capa.features.extractors.helpers import MIN_STACKSTRING_LEN
def _bb_has_tight_loop(f, bb):
"""
parse tight loops, true if last instruction in basic block branches to bb start
"""
return bb.offset in f.blockrefs[bb.offset] if bb.offset in f.blockrefs else False
def extract_bb_tight_loop(f, bb):
""" check basic block for tight loop indicators """
if _bb_has_tight_loop(f, bb):
yield Characteristic("tight loop"), bb.offset
def _bb_has_stackstring(f, bb):
"""
extract potential stackstring creation, using the following heuristics:
- basic block contains enough moves of constant bytes to the stack
"""
count = 0
for instr in bb.getInstructions():
if is_mov_imm_to_stack(instr):
count += get_printable_len(instr.getDetailed())
if count > MIN_STACKSTRING_LEN:
return True
return False
def get_operands(smda_ins):
return [o.strip() for o in smda_ins.operands.split(",")]
def extract_stackstring(f, bb):
""" check basic block for stackstring indicators """
if _bb_has_stackstring(f, bb):
yield Characteristic("stack string"), bb.offset
def is_mov_imm_to_stack(smda_ins):
"""
Return if instruction moves immediate onto stack
"""
if not smda_ins.mnemonic.startswith("mov"):
return False
try:
dst, src = get_operands(smda_ins)
except ValueError:
# not two operands
return False
try:
int(src, 16)
except ValueError:
return False
if not any(regname in dst for regname in ["ebp", "rbp", "esp", "rsp"]):
return False
return True
def is_printable_ascii(chars):
return all(c < 127 and chr(c) in string.printable for c in chars)
def is_printable_utf16le(chars):
if all(c == 0x00 for c in chars[1::2]):
return is_printable_ascii(chars[::2])
def get_printable_len(instr):
"""
Return string length if all operand bytes are ascii or utf16-le printable
Works on a capstone instruction
"""
# should have exactly two operands for mov immediate
if len(instr.operands) != 2:
return 0
op_value = instr.operands[1].value.imm
if instr.imm_size == 1:
chars = struct.pack("<B", op_value & 0xFF)
elif instr.imm_size == 2:
chars = struct.pack("<H", op_value & 0xFFFF)
elif instr.imm_size == 4:
chars = struct.pack("<I", op_value & 0xFFFFFFFF)
elif instr.imm_size == 8:
chars = struct.pack("<Q", op_value & 0xFFFFFFFFFFFFFFFF)
else:
raise ValueError("Unhandled operand data type 0x%x." % instr.imm_size)
if is_printable_ascii(chars):
return instr.imm_size
if is_printable_utf16le(chars):
return instr.imm_size // 2
return 0
def extract_features(f, bb):
"""
extract features from the given basic block.
args:
f (smda.common.SmdaFunction): the function from which to extract features
bb (smda.common.SmdaBasicBlock): the basic block to process.
yields:
Feature, set[VA]: the features and their location found in this basic block.
"""
yield BasicBlock(), bb.offset
for bb_handler in BASIC_BLOCK_HANDLERS:
for feature, va in bb_handler(f, bb):
yield feature, va
BASIC_BLOCK_HANDLERS = (
extract_bb_tight_loop,
extract_stackstring,
)
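
The stack string heuristic in this file packs each `mov` immediate into little-endian bytes and counts it only when those bytes are printable ASCII or UTF-16LE. A minimal sketch of that check, decoupled from SMDA and capstone: `printable_len` and `PACK_FORMATS` are hypothetical names, and the function takes the immediate value and its size in bytes directly instead of an instruction object.

```python
import string
import struct

# struct format codes per immediate size in bytes, little-endian
PACK_FORMATS = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}


def is_printable_ascii(chars):
    return all(c < 127 and chr(c) in string.printable for c in chars)


def is_printable_utf16le(chars):
    # every second byte must be zero, the remaining bytes printable ASCII
    return all(c == 0x00 for c in chars[1::2]) and is_printable_ascii(chars[::2])


def printable_len(value, imm_size):
    """return the number of characters an immediate contributes, or 0"""
    fmt = PACK_FORMATS.get(imm_size)
    if fmt is None:
        raise ValueError("Unhandled operand data type 0x%x." % imm_size)
    mask = (1 << (8 * imm_size)) - 1
    chars = struct.pack(fmt, value & mask)
    if is_printable_ascii(chars):
        return imm_size
    if is_printable_utf16le(chars):
        return imm_size // 2
    return 0


print(printable_len(0x41424344, 4))  # four ASCII bytes
print(printable_len(0x00420041, 4))  # two UTF-16LE characters
print(printable_len(0x01020304, 4))  # not printable
```

Unlike the diffed `is_printable_utf16le`, which falls through and returns `None` when the high bytes are non-zero, this sketch always returns a boolean.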


@@ -0,0 +1,139 @@
import struct
# if we have SMDA we definitely have lief
import lief
import capa.features.extractors.helpers
import capa.features.extractors.strings
from capa.features import String, Characteristic
from capa.features.file import Export, Import, Section
def carve(pbytes, offset=0):
"""
Return a list of (offset, size, xor) tuples of embedded PEs
Based on the version from vivisect:
https://github.com/vivisect/vivisect/blob/7be4037b1cecc4551b397f840405a1fc606f9b53/PE/carve.py#L19
And its IDA adaptation:
capa/features/extractors/ida/file.py
"""
mz_xor = [
(
capa.features.extractors.helpers.xor_static(b"MZ", i),
capa.features.extractors.helpers.xor_static(b"PE", i),
i,
)
for i in range(256)
]
pblen = len(pbytes)
todo = [(pbytes.find(mzx, offset), mzx, pex, i) for mzx, pex, i in mz_xor]
todo = [(off, mzx, pex, i) for (off, mzx, pex, i) in todo if off != -1]
while len(todo):
off, mzx, pex, i = todo.pop()
# The MZ header has one field we will check
# e_lfanew is at 0x3c
e_lfanew = off + 0x3C
if pblen < (e_lfanew + 4):
continue
newoff = struct.unpack("<I", capa.features.extractors.helpers.xor_static(pbytes[e_lfanew : e_lfanew + 4], i))[0]
nextres = pbytes.find(mzx, off + 1)
if nextres != -1:
todo.append((nextres, mzx, pex, i))
peoff = off + newoff
if pblen < (peoff + 2):
continue
if pbytes[peoff : peoff + 2] == pex:
yield (off, i)
def extract_file_embedded_pe(smda_report, file_path):
with open(file_path, "rb") as f:
fbytes = f.read()
for offset, i in carve(fbytes, 1):
yield Characteristic("embedded pe"), offset
def extract_file_export_names(smda_report, file_path):
lief_binary = lief.parse(file_path)
if lief_binary is not None:
for function in lief_binary.exported_functions:
yield Export(function.name), function.address
def extract_file_import_names(smda_report, file_path):
# extract import table info via LIEF
lief_binary = lief.parse(file_path)
if not isinstance(lief_binary, lief.PE.Binary):
return
for imported_library in lief_binary.imports:
library_name = imported_library.name.lower()
library_name = library_name[:-4] if library_name.endswith(".dll") else library_name
for func in imported_library.entries:
if func.name:
va = func.iat_address + smda_report.base_addr
for name in capa.features.extractors.helpers.generate_symbols(library_name, func.name):
yield Import(name), va
elif func.is_ordinal:
for name in capa.features.extractors.helpers.generate_symbols(library_name, "#%s" % func.ordinal):
yield Import(name), va
def extract_file_section_names(smda_report, file_path):
lief_binary = lief.parse(file_path)
if not isinstance(lief_binary, lief.PE.Binary):
return
if lief_binary and lief_binary.sections:
base_address = lief_binary.optional_header.imagebase
for section in lief_binary.sections:
yield Section(section.name), base_address + section.virtual_address
def extract_file_strings(smda_report, file_path):
"""
extract ASCII and UTF-16 LE strings from file
"""
with open(file_path, "rb") as f:
b = f.read()
for s in capa.features.extractors.strings.extract_ascii_strings(b):
yield String(s.s), s.offset
for s in capa.features.extractors.strings.extract_unicode_strings(b):
yield String(s.s), s.offset
def extract_features(smda_report, file_path):
"""
extract file features from given workspace
args:
smda_report (smda.common.SmdaReport): a SmdaReport
file_path: path to the input file
yields:
Tuple[Feature, VA]: a feature and its location.
"""
for file_handler in FILE_HANDLERS:
result = file_handler(smda_report, file_path)
for feature, va in file_handler(smda_report, file_path):
yield feature, va
FILE_HANDLERS = (
extract_file_embedded_pe,
extract_file_export_names,
extract_file_import_names,
extract_file_section_names,
extract_file_strings,
)
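
The `carve` routine above looks for embedded PE files that may be XOR-encoded with a single byte key: for each key it searches for the encoded `MZ` marker, decodes `e_lfanew` at offset 0x3C, and checks for the encoded `PE` signature at that location. The following is a simplified sketch of that idea, not the function from the diff: it iterates keys in order instead of using a work list, and `xor_static` here is a local stand-in whose behavior is assumed to match `capa.features.extractors.helpers.xor_static`.

```python
import struct


def xor_static(data, key):
    # stand-in helper (assumption): XOR every byte with a single-byte key
    return bytes(b ^ key for b in data)


def carve(pbytes, offset=0):
    """yield (offset, key) for each single-byte-XOR-encoded PE header found"""
    for key in range(256):
        mzx = xor_static(b"MZ", key)
        pex = xor_static(b"PE", key)
        off = pbytes.find(mzx, offset)
        while off != -1:
            e_lfanew = off + 0x3C
            if len(pbytes) >= e_lfanew + 4:
                raw = xor_static(pbytes[e_lfanew : e_lfanew + 4], key)
                peoff = off + struct.unpack("<I", raw)[0]
                # slicing past the end yields b"", which never equals pex
                if pbytes[peoff : peoff + 2] == pex:
                    yield off, key
            off = pbytes.find(mzx, off + 1)


# build a minimal fake header: "MZ", e_lfanew = 0x40, "PE" at 0x40
header = bytearray(0x80)
header[0:2] = b"MZ"
header[0x3C:0x40] = struct.pack("<I", 0x40)
header[0x40:0x42] = b"PE"

# hide it behind 16 padding bytes, encoded with key 0x5A
blob = b"\x00" * 16 + xor_static(bytes(header), 0x5A)
hits = list(carve(blob))
print(hits)
```

Key 0 corresponds to an unencoded PE, which is presumably why `extract_file_embedded_pe` starts the search at offset 1 and so skips the header of the input file itself.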


@@ -0,0 +1,38 @@
from capa.features import Characteristic
from capa.features.extractors import loops
def extract_function_calls_to(f):
for inref in f.inrefs:
yield Characteristic("calls to"), inref
def extract_function_loop(f):
"""
parse if a function has a loop
"""
edges = []
for bb_from, bb_tos in f.blockrefs.items():
for bb_to in bb_tos:
edges.append((bb_from, bb_to))
if edges and loops.has_loop(edges):
yield Characteristic("loop"), f.offset
def extract_features(f):
"""
extract features from the given function.
args:
f (smda.common.SmdaFunction): the function from which to extract features
yields:
Feature, set[VA]: the features and their location found in this function.
"""
for func_handler in FUNCTION_HANDLERS:
for feature, va in func_handler(f):
yield feature, va
FUNCTION_HANDLERS = (extract_function_calls_to, extract_function_loop)
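
`extract_function_loop` flattens SMDA's `blockrefs` mapping (basic block offset to successor offsets) into an edge list and delegates to `capa.features.extractors.loops.has_loop`. To illustrate the underlying idea without capa or networkx, here is a sketch that detects a cycle with a plain depth-first search. This is a swapped-in technique rather than capa's implementation, `has_cycle` is a hypothetical name, and edge cases may differ: a block that references itself counts as a cycle here.

```python
def has_cycle(blockrefs):
    """return True if the directed graph {node: [successors]} contains a cycle"""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for succ in blockrefs.get(node, ()):
            state = color.get(succ, WHITE)
            if state == GRAY:
                # back edge to a node on the current path: cycle found
                return True
            if state == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(blockrefs))


# illustrative control flow graphs keyed by basic block offset
looping = {0x10: [0x20], 0x20: [0x30], 0x30: [0x10]}
diamond = {0x10: [0x20, 0x30], 0x20: [0x30]}
print(has_cycle(looping), has_cycle(diamond))
```

The recursion depth grows with the longest path through the function, so an iterative traversal would be the safer choice for very large control flow graphs.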


@@ -0,0 +1,393 @@
import re
import string
import struct
from smda.common.SmdaReport import SmdaReport
import capa.features.extractors.helpers
from capa.features import (
ARCH_X32,
ARCH_X64,
MAX_BYTES_FEATURE_SIZE,
THUNK_CHAIN_DEPTH_DELTA,
Bytes,
String,
Characteristic,
)
from capa.features.insn import API, Number, Offset, Mnemonic
# security cookie checks may perform non-zeroing XORs, these are expected within a certain
# byte range within the first and returning basic blocks, this helps to reduce FP features
SECURITY_COOKIE_BYTES_DELTA = 0x40
PATTERN_HEXNUM = re.compile(r"[+\-] (?P<num>0x[a-fA-F0-9]+)")
PATTERN_SINGLENUM = re.compile(r"[+\-] (?P<num>[0-9])")
def get_arch(smda_report):
if smda_report.architecture == "intel":
if smda_report.bitness == 32:
return ARCH_X32
elif smda_report.bitness == 64:
return ARCH_X64
else:
raise NotImplementedError
def extract_insn_api_features(f, bb, insn):
"""parse API features from the given instruction."""
if insn.offset in f.apirefs:
api_entry = f.apirefs[insn.offset]
# reformat
dll_name, api_name = api_entry.split("!")
dll_name = dll_name.split(".")[0]
dll_name = dll_name.lower()
for name in capa.features.extractors.helpers.generate_symbols(dll_name, api_name):
yield API(name), insn.offset
elif insn.offset in f.outrefs:
current_function = f
current_instruction = insn
for index in range(THUNK_CHAIN_DEPTH_DELTA):
if current_function and len(current_function.outrefs[current_instruction.offset]) == 1:
target = current_function.outrefs[current_instruction.offset][0]
referenced_function = current_function.smda_report.getFunction(target)
if referenced_function:
# TODO SMDA: implement this function for both jmp and call, checking if function has 1 instruction which refs an API
if referenced_function.isApiThunk():
api_entry = (
referenced_function.apirefs[target] if target in referenced_function.apirefs else None
)
if api_entry:
# reformat
dll_name, api_name = api_entry.split("!")
dll_name = dll_name.split(".")[0]
dll_name = dll_name.lower()
for name in capa.features.extractors.helpers.generate_symbols(dll_name, api_name):
yield API(name), insn.offset
elif referenced_function.num_instructions == 1 and referenced_function.num_outrefs == 1:
current_function = referenced_function
current_instruction = [i for i in referenced_function.getInstructions()][0]
else:
return
def extract_insn_number_features(f, bb, insn):
"""parse number features from the given instruction."""
# example:
#
# push 3136B0h ; dwControlCode
operands = [o.strip() for o in insn.operands.split(",")]
if insn.mnemonic == "add" and operands[0] in ["esp", "rsp"]:
# skip things like:
#
# .text:00401140 call sub_407E2B
# .text:00401145 add esp, 0Ch
return
for operand in operands:
try:
yield Number(int(operand, 16)), insn.offset
yield Number(int(operand, 16), arch=get_arch(f.smda_report)), insn.offset
except:
continue
def read_bytes(smda_report, va, num_bytes=None):
"""
read up to MAX_BYTES_FEATURE_SIZE from the given address.
"""
rva = va - smda_report.base_addr
if smda_report.buffer is None:
return
buffer_end = len(smda_report.buffer)
max_bytes = num_bytes if num_bytes is not None else MAX_BYTES_FEATURE_SIZE
if rva + max_bytes > buffer_end:
return smda_report.buffer[rva:]
else:
return smda_report.buffer[rva : rva + max_bytes]
def derefs(smda_report, p):
"""
recursively follow the given pointer, yielding the valid memory addresses along the way.
useful when you may have a pointer to string, or pointer to pointer to string, etc.
this is a "do what i mean" type of helper function.
based on the implementation in viv/insn.py
"""
depth = 0
while True:
if not smda_report.isAddrWithinMemoryImage(p):
return
yield p
bytes_ = read_bytes(smda_report, p, num_bytes=4)
val = struct.unpack("I", bytes_)[0]
# sanity: pointer points to self
if val == p:
return
# sanity: avoid chains of pointers that are unreasonably deep
depth += 1
if depth > 10:
return
p = val
def extract_insn_bytes_features(f, bb, insn):
"""
parse byte sequence features from the given instruction.
example:
# push offset iid_004118d4_IShellLinkA ; riid
"""
for data_ref in insn.getDataRefs():
for v in derefs(f.smda_report, data_ref):
bytes_read = read_bytes(f.smda_report, v)
if bytes_read is None:
continue
if capa.features.extractors.helpers.all_zeros(bytes_read):
continue
yield Bytes(bytes_read), insn.offset
def detect_ascii_len(smda_report, offset):
if smda_report.buffer is None:
return 0
ascii_len = 0
rva = offset - smda_report.base_addr
char = smda_report.buffer[rva]
while char < 127 and chr(char) in string.printable:
ascii_len += 1
rva += 1
char = smda_report.buffer[rva]
if char == 0:
return ascii_len
return 0
def detect_unicode_len(smda_report, offset):
if smda_report.buffer is None:
return 0
unicode_len = 0
rva = offset - smda_report.base_addr
char = smda_report.buffer[rva]
second_char = smda_report.buffer[rva + 1]
while char < 127 and chr(char) in string.printable and second_char == 0:
unicode_len += 2
rva += 2
char = smda_report.buffer[rva]
second_char = smda_report.buffer[rva + 1]
if char == 0 and second_char == 0:
return unicode_len
return 0
def read_string(smda_report, offset):
alen = detect_ascii_len(smda_report, offset)
if alen > 1:
return read_bytes(smda_report, offset, alen).decode("utf-8")
ulen = detect_unicode_len(smda_report, offset)
if ulen > 2:
return read_bytes(smda_report, offset, ulen).decode("utf-16")
def extract_insn_string_features(f, bb, insn):
    """parse string features from the given instruction."""
    # example:
    #
    # push offset aAcr ; "ACR > "
    for data_ref in insn.getDataRefs():
        for v in derefs(f.smda_report, data_ref):
            string_read = read_string(f.smda_report, v)
            if string_read:
                yield String(string_read.rstrip("\x00")), insn.offset


def extract_insn_offset_features(f, bb, insn):
    """parse structure offset features from the given instruction."""
    # examples:
    #
    # mov eax, [esi + 4]
    # mov eax, [esi + ecx + 16384]
    operands = [o.strip() for o in insn.operands.split(",")]
    for operand in operands:
        if "ptr" not in operand:
            continue
        if "esp" in operand or "ebp" in operand or "rbp" in operand:
            continue
        number = 0
        number_hex = re.search(PATTERN_HEXNUM, operand)
        number_int = re.search(PATTERN_SINGLENUM, operand)
        if number_hex:
            number = int(number_hex.group("num"), 16)
            number = -1 * number if number_hex.group().startswith("-") else number
        elif number_int:
            number = int(number_int.group("num"))
            number = -1 * number if number_int.group().startswith("-") else number
        yield Offset(number), insn.offset
        yield Offset(number, arch=get_arch(f.smda_report)), insn.offset


def is_security_cookie(f, bb, insn):
    """
    check if an instruction is related to security cookie checks
    """
    # security cookie check should use SP or BP
    operands = [o.strip() for o in insn.operands.split(",")]
    if operands[1] not in ["esp", "ebp", "rsp", "rbp"]:
        return False
    for index, block in enumerate(f.getBlocks()):
        # expect security cookie init in first basic block within first bytes (instructions)
        block_instructions = [i for i in block.getInstructions()]
        if index == 0 and insn.offset < (block_instructions[0].offset + SECURITY_COOKIE_BYTES_DELTA):
            return True
        # ... or within last bytes (instructions) before a return
        if block_instructions[-1].mnemonic.startswith("ret") and insn.offset > (
            block_instructions[-1].offset - SECURITY_COOKIE_BYTES_DELTA
        ):
            return True
    return False


def extract_insn_nzxor_characteristic_features(f, bb, insn):
    """
    parse non-zeroing XOR instruction from the given instruction.
    ignore expected non-zeroing XORs, e.g. security cookies.
    """
    if insn.mnemonic not in ("xor", "xorpd", "xorps", "pxor"):
        return
    operands = [o.strip() for o in insn.operands.split(",")]
    if operands[0] == operands[1]:
        return
    if is_security_cookie(f, bb, insn):
        return
    yield Characteristic("nzxor"), insn.offset


def extract_insn_mnemonic_features(f, bb, insn):
    """parse mnemonic features from the given instruction."""
    yield Mnemonic(insn.mnemonic), insn.offset


def extract_insn_peb_access_characteristic_features(f, bb, insn):
    """
    parse peb access from the given function. fs:[0x30] on x86, gs:[0x60] on x64
    """
    if insn.mnemonic not in ["push", "mov"]:
        return
    operands = [o.strip() for o in insn.operands.split(",")]
    for operand in operands:
        if "fs:" in operand and "0x30" in operand:
            yield Characteristic("peb access"), insn.offset
        elif "gs:" in operand and "0x60" in operand:
            yield Characteristic("peb access"), insn.offset


def extract_insn_segment_access_features(f, bb, insn):
    """ parse the instruction for access to fs or gs """
    operands = [o.strip() for o in insn.operands.split(",")]
    for operand in operands:
        if "fs:" in operand:
            yield Characteristic("fs access"), insn.offset
        elif "gs:" in operand:
            yield Characteristic("gs access"), insn.offset


def extract_insn_cross_section_cflow(f, bb, insn):
    """
    inspect the instruction for a CALL or JMP that crosses section boundaries.
    """
    if insn.mnemonic in ["call", "jmp"]:
        if insn.offset in f.apirefs:
            return
        smda_report = insn.smda_function.smda_report
        if insn.offset in f.outrefs:
            for target in f.outrefs[insn.offset]:
                if smda_report.getSection(insn.offset) != smda_report.getSection(target):
                    yield Characteristic("cross section flow"), insn.offset
        elif insn.operands.startswith("0x"):
            target = int(insn.operands, 16)
            if smda_report.getSection(insn.offset) != smda_report.getSection(target):
                yield Characteristic("cross section flow"), insn.offset


# this is a feature that's most relevant at the function scope,
# however, it's most efficient to extract at the instruction scope.
def extract_function_calls_from(f, bb, insn):
    if insn.mnemonic != "call":
        return
    if insn.offset in f.outrefs:
        for outref in f.outrefs[insn.offset]:
            yield Characteristic("calls from"), outref
            if outref == f.offset:
                # if we found a jump target and it's the function address
                # mark as recursive
                yield Characteristic("recursive call"), outref
    if insn.offset in f.apirefs:
        yield Characteristic("calls from"), insn.offset


# this is a feature that's most relevant at the function or basic block scope,
# however, it's most efficient to extract at the instruction scope.
def extract_function_indirect_call_characteristic_features(f, bb, insn):
    """
    extract indirect function call characteristic (e.g., call eax or call dword ptr [edx+4])
    does not include calls like => call ds:dword_ABD4974
    """
    if insn.mnemonic != "call":
        return
    if insn.operands.startswith("0x"):
        return False
    if "qword ptr" in insn.operands and "rip" in insn.operands:
        return False
    if insn.operands.startswith("dword ptr [0x"):
        return False
    # call edx
    # call dword ptr [eax+50h]
    # call qword ptr [rsp+78h]
    yield Characteristic("indirect call"), insn.offset


def extract_features(f, bb, insn):
    """
    extract features from the given insn.

    args:
      f (smda.common.SmdaFunction): the function to process.
      bb (smda.common.SmdaBasicBlock): the basic block to process.
      insn (smda.common.SmdaInstruction): the instruction to process.

    yields:
      Feature, set[VA]: the features and their location found in this insn.
    """
    for insn_handler in INSTRUCTION_HANDLERS:
        for feature, va in insn_handler(f, bb, insn):
            yield feature, va


INSTRUCTION_HANDLERS = (
    extract_insn_api_features,
    extract_insn_number_features,
    extract_insn_string_features,
    extract_insn_bytes_features,
    extract_insn_offset_features,
    extract_insn_nzxor_characteristic_features,
    extract_insn_mnemonic_features,
    extract_insn_peb_access_characteristic_features,
    extract_insn_cross_section_cflow,
    extract_insn_segment_access_features,
    extract_function_calls_from,
    extract_function_indirect_call_characteristic_features,
)
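
The string helpers in this file (`detect_ascii_len`, `detect_unicode_len`, `read_string`) can be illustrated on a plain `bytes` buffer. The following is a minimal sketch, not the extractor's code: it assumes the buffer is a `bytes` object indexed from zero, and the names `ascii_len`, `utf16le_len`, and `read_string_at` are hypothetical. Unlike the functions above, it checks the buffer bounds before every read.

```python
import string

# byte values of all printable ASCII characters (NUL is not among them)
PRINTABLE = set(string.printable.encode("ascii"))


def ascii_len(buf, start):
    """Length of a NUL-terminated printable ASCII run at `start`, else 0."""
    pos = start
    while pos < len(buf) and buf[pos] in PRINTABLE:
        pos += 1
    if pos < len(buf) and buf[pos] == 0:
        return pos - start
    return 0


def utf16le_len(buf, start):
    """Byte length of a NUL-terminated UTF-16LE run of printable ASCII, else 0."""
    pos = start
    while pos + 1 < len(buf) and buf[pos] in PRINTABLE and buf[pos + 1] == 0:
        pos += 2
    if pos + 1 < len(buf) and buf[pos] == 0 and buf[pos + 1] == 0:
        return pos - start
    return 0


def read_string_at(buf, start):
    """Try ASCII first, then UTF-16LE; return None when neither matches."""
    n = ascii_len(buf, start)
    if n > 1:
        return buf[start : start + n].decode("ascii")
    n = utf16le_len(buf, start)
    if n > 2:
        return buf[start : start + n].decode("utf-16-le")
    return None
```

The length thresholds (`> 1` for ASCII, `> 2` for UTF-16LE) follow `read_string` above, so a single character followed by a NUL is not reported as a string.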

View File

@@ -8,11 +8,7 @@
import types
import file
import insn
import function
import viv_utils
import basicblock
import capa.features.extractors
import capa.features.extractors.viv.file
@@ -42,7 +38,7 @@ def add_va_int_cast(o):
this bit of skullduggery lets us cast viv-utils objects as ints.
the correct way of doing this is to update viv-utils (or subclass the objects here).
"""
setattr(o, "__int__", types.MethodType(get_va, o, type(o)))
setattr(o, "__int__", types.MethodType(get_va, o))
return o
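
The hunk above drops the third argument to `types.MethodType`. Python 2 accepted an optional class argument, while Python 3 takes only the function and the instance. A minimal sketch of binding a function to a single object this way; the `Function` class and this `get_va` are stand-ins for illustration, not the viv-utils types:

```python
import types


class Function:
    """Stand-in for an object that carries a virtual address."""

    def __init__(self, va):
        self.va = va


def get_va(self):
    # plain function; it becomes a bound method once attached below
    return self.va


f = Function(0x401000)
# Python 3 signature: MethodType(function, instance)
f.get_va = types.MethodType(get_va, f)
```

Calling `f.get_va()` then behaves like a regular method call on that one instance; other `Function` objects are unaffected.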

View File

@@ -125,11 +125,16 @@ def get_printable_len(oper):
def is_printable_ascii(chars):
return all(ord(c) < 127 and c in string.printable for c in chars)
try:
chars_str = chars.decode("ascii")
except UnicodeDecodeError:
return False
else:
return all(c in string.printable for c in chars_str)
def is_printable_utf16le(chars):
if all(c == "\x00" for c in chars[1::2]):
if all(c == b"\x00" for c in chars[1::2]):
return is_printable_ascii(chars[::2])
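
On Python 3, iterating or slicing-then-iterating a `bytes` object yields integers, so a NUL check over the high bytes of a UTF-16LE buffer compares against `0`. The sketch below shows the two checks under that assumption. The function names mirror the ones in this hunk, but this is an illustration for Python 3 `bytes` input, not the committed code.

```python
import string


def is_printable_ascii(chars: bytes) -> bool:
    """True when every byte decodes as ASCII and is a printable character."""
    try:
        text = chars.decode("ascii")
    except UnicodeDecodeError:
        return False
    return all(c in string.printable for c in text)


def is_printable_utf16le(chars: bytes) -> bool:
    """True when every second byte is NUL and the remaining bytes are printable ASCII."""
    # iterating bytes yields ints on Python 3, hence the comparison with 0
    if all(c == 0 for c in chars[1::2]):
        return is_printable_ascii(chars[::2])
    return False
```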

View File

@@ -239,7 +239,7 @@ def read_bytes(vw, va):
"""
segm = vw.getSegment(va)
if not segm:
raise envi.SegmentationViolation()
raise envi.SegmentationViolation(va)
segm_end = segm[0] + segm[1]
try:
@@ -258,10 +258,10 @@ def extract_insn_bytes_features(f, bb, insn):
example:
# push offset iid_004118d4_IShellLinkA ; riid
"""
for oper in insn.opers:
if insn.mnem == "call":
continue
if insn.mnem == "call":
return
for oper in insn.opers:
if isinstance(oper, envi.archs.i386.disasm.i386ImmOper):
v = oper.getOperValue(oper)
elif isinstance(oper, envi.archs.i386.disasm.i386RegMemOper):
@@ -311,6 +311,10 @@ def read_string(vw, offset):
# vivisect seems to mis-detect the end of unicode strings
# off by one, too short
ulen += 1
else:
# vivisect seems to mis-detect the end of unicode strings
# off by two, too short
ulen += 2
return read_memory(vw, offset, ulen).decode("utf-16")
raise ValueError("not a string", offset)
@@ -325,6 +329,9 @@ def extract_insn_string_features(f, bb, insn):
for oper in insn.opers:
if isinstance(oper, envi.archs.i386.disasm.i386ImmOper):
v = oper.getOperValue(oper)
elif isinstance(oper, envi.archs.i386.disasm.i386ImmMemOper):
# like 0x10056CB4 in `lea eax, dword [0x10056CB4]`
v = oper.imm
elif isinstance(oper, envi.archs.i386.disasm.i386SibOper):
# like 0x401000 in `mov eax, 0x401000[2 * ebx]`
v = oper.imm
@@ -415,7 +422,7 @@ def extract_insn_nzxor_characteristic_features(f, bb, insn):
parse non-zeroing XOR instruction from the given instruction.
ignore expected non-zeroing XORs, e.g. security cookies.
"""
if insn.mnem != "xor":
if insn.mnem not in ("xor", "xorpd", "xorps", "pxor"):
return
if insn.opers[0] == insn.opers[1]:
@@ -492,6 +499,10 @@ def extract_insn_cross_section_cflow(f, bb, insn):
inspect the instruction for a CALL or JMP that crosses section boundaries.
"""
for va, flags in insn.getBranches():
if va is None:
# va may be none for dynamic branches that haven't been resolved, such as `jmp eax`.
continue
if flags & envi.BR_FALL:
continue

View File

@@ -5,6 +5,7 @@ json format:
{
'version': 1,
'base address': int(base address),
'functions': {
int(function va): {
'basic blocks': {
@@ -86,6 +87,7 @@ def dumps(extractor):
"""
ret = {
"version": 1,
"base address": extractor.get_base_address(),
"functions": {},
"scopes": {
"file": [],
@@ -147,6 +149,7 @@ def loads(s):
raise ValueError("unsupported freeze format version: %d" % (doc.get("version")))
features = {
"base address": doc.get("base address"),
"file features": [],
"functions": {},
}
@@ -261,6 +264,15 @@ def main(argv=None):
parser.add_argument(
"-f", "--format", choices=[f[0] for f in formats], default="auto", help="Select sample format, %s" % format_help
)
if sys.version_info >= (3, 0):
parser.add_argument(
"-b",
"--backend",
type=str,
help="select the backend to use",
choices=(capa.main.BACKEND_VIV, capa.main.BACKEND_SMDA),
default=capa.main.BACKEND_VIV,
)
args = parser.parse_args(args=argv)
if args.quiet:
@@ -273,7 +285,8 @@ def main(argv=None):
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
extractor = capa.main.get_extractor(args.sample, args.format)
backend = args.backend if sys.version_info > (3, 0) else capa.main.BACKEND_VIV
extractor = capa.main.get_extractor(args.sample, args.format, backend)
with open(args.output, "wb") as f:
f.write(dump(extractor))
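
The freeze hunks above add a `base address` key next to `version`. A reduced sketch of that round trip, assuming a document with only these keys plus `functions`; the helper names `dumps_minimal` and `loads_minimal` are hypothetical and the real format carries more fields:

```python
import json

FREEZE_VERSION = 1  # matches the "version": 1 shown in the hunk


def dumps_minimal(base_address, functions):
    """Serialize a reduced freeze document to a JSON string."""
    doc = {
        "version": FREEZE_VERSION,
        "base address": base_address,
        "functions": functions,
    }
    return json.dumps(doc)


def loads_minimal(s):
    """Parse a reduced freeze document, rejecting unknown versions."""
    doc = json.loads(s)
    if doc.get("version") != FREEZE_VERSION:
        raise ValueError("unsupported freeze format version: %s" % doc.get("version"))
    return {
        "base address": doc.get("base address"),
        "functions": doc.get("functions", {}),
    }
```

Reading the key with `doc.get("base address")` means a document written before this change loads with a base address of `None` rather than failing.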

View File

@@ -16,7 +16,7 @@ class API(Feature):
modname, _, impname = name.rpartition(".")
name = modname.lower() + "." + impname
super(API, self).__init__(name, description)
super(API, self).__init__(name, description=description)
class Number(Feature):

View File

@@ -82,14 +82,26 @@ def get_func_start_ea(ea):
return f if f is None else f.start_ea
def collect_metadata():
def get_file_md5():
""" """
md5 = idautils.GetInputFileMD5()
if not isinstance(md5, six.string_types):
md5 = capa.features.bytes_to_str(md5)
return md5
def get_file_sha256():
""" """
sha256 = idaapi.retrieve_input_file_sha256()
if not isinstance(sha256, six.string_types):
sha256 = capa.features.bytes_to_str(sha256)
return sha256
def collect_metadata():
""" """
md5 = get_file_md5()
sha256 = get_file_sha256()
return {
"timestamp": datetime.datetime.now().isoformat(),
@@ -103,6 +115,7 @@ def collect_metadata():
"analysis": {
"format": idaapi.get_file_type_name(),
"extractor": "ida",
"base_address": idaapi.get_imagebase(),
},
"version": capa.version.__version__,
}

View File

@@ -1,49 +1,35 @@
![capa explorer](../../../.github/capa-explorer-logo.png)
capa explorer is an IDA Pro plugin written in Python that integrates the FLARE team's open-source framework, capa, with IDA. capa is a framework that uses a well-defined collection of rules to
capa explorer is an IDAPython plugin that integrates the FLARE team's open-source framework, capa, with IDA Pro. capa is a framework that uses a well-defined collection of rules to
identify capabilities in a program. You can run capa against a PE file or shellcode and it tells you what it thinks the program can do. For example, it might suggest that
the program is a backdoor, can install services, or relies on HTTP to communicate. You can use capa explorer to run capa directly on an IDA database without requiring access
to the source binary. Once a database has been analyzed, capa explorer can be used to quickly identify and navigate to interesting areas of a program
and dissect capa rule matches at the assembly level.
the program is a backdoor, can install services, or relies on HTTP to communicate. capa explorer runs capa directly against your IDA Pro database (IDB) without requiring access
to the original binary file. Once a database has been analyzed, capa explorer helps you identify interesting areas of a program and build new capa rules using features extracted from your IDB.
We love using capa explorer during malware analysis because it teaches us what parts of a program suggest a behavior. As we click on rows, capa explorer jumps directly
to important addresses in the IDA Pro database and highlights key features in the Disassembly view so they stand out visually. To illustrate, we use capa explorer to
to important addresses in the IDB and highlights key features in the Disassembly view so they stand out visually. To illustrate, we use capa explorer to
analyze Lab 14-02 from [Practical Malware Analysis](https://nostarch.com/malware) (PMA) available [here](https://practicalmalwareanalysis.com/labs/). Our goal is to understand
the program's functionality.
After loading Lab 14-02 into IDA and analyzing the database with capa explorer, we see that capa detected a rule match for `self delete via COMSPEC environment variable`:
![](../../../doc/img/ida_plugin_example_1.png)
![](../../../doc/img/explorer_condensed.png)
We can use capa explorer to navigate the IDA Disassembly view directly to the suspect function and get an assembly-level breakdown of why capa matched `self delete via COMSPEC environment variable`
for this particular function.
We can use capa explorer to navigate our Disassembly view directly to the suspect function and get an assembly-level breakdown of why capa matched `self delete via COMSPEC environment variable`.
![](../../../doc/img/ida_plugin_example_2.png)
![](../../../doc/img/explorer_expanded.png)
Using the `Rule Information` and `Details` columns, capa explorer shows us that the suspect function matched `self delete via COMSPEC environment variable` because it contains capa rule matches for `create process`, `get COMSPEC environment variable`,
and `query environment variable`, references to the strings `COMSPEC`, ` > nul`, and `/c del`, and calls to the Windows API functions `GetEnvironmentVariableA` and `ShellExecuteEx`.
and `query environment variable`, references to the strings `COMSPEC`, ` > nul`, and `/c del `, and calls to the Windows API functions `GetEnvironmentVariableA` and `ShellExecuteEx`.
capa explorer also helps you build new capa rules. To start select the `Rule Generator` tab, navigate to a function in your Disassembly view,
and click `Analyze`. capa explorer will extract features from the function and display them in the `Features` pane. You can add features listed in this pane to the `Editor` pane
by either double-clicking a feature or using multi-select + right-click to add multiple features at once. The `Preview` and `Editor` panes help edit your rule. Use the `Preview` pane
to modify the rule text directly and the `Editor` pane to construct and rearrange your hierarchy of statements and features. When you finish a rule you can save it directly to a file by clicking `Save`.
![](../../../doc/img/rulegen_expanded.png)
For more information on the FLARE team's open-source framework, capa, check out the overview in our first [blog](https://www.fireeye.com/blog/threat-research/2020/07/capa-automatically-identify-malware-capabilities.html).
## Features
![](../../../doc/img/ida_plugin_intro.gif)
* Display capa results in an interactive tree view of rule matches and their locations in the current database
* Search for keywords or phrases found in the `Rule Information`, `Address`, or `Details` columns
* Display rule source content when a user hovers their cursor over a rule match
* Double-click `Address` column to view associated feature in the IDA Disassembly view
* Limit tree view results to the function currently displayed in the IDA Disassembly view; update results as a user navigates to different functions
* Export results as formatted JSON by navigating to `File > Export results...`
* Remember a user's capa rules directory for future runs; change capa rules directory by navigating to `Rules > Change rules directory...`
* Automatically re-analyze database when user performs a program rebase
* Automatically update results when IDA is used to rename a function
* Select one or more checkboxes to highlight the associated addresses in the IDA Disassembly view
* Right-click a function match to rename it; the new function name is propagated to the current IDA database
* Right-click to copy a result by column or by row
* Sort results by column
* Reset tree view and IDA Disassembly view highlighting by clicking `Reset`
## Getting Started
### Requirements
@@ -56,7 +42,7 @@ If you encounter issues with your specific setup, please open a new [Issue](http
### Supported File Types
capa explorer is limited to the file types supported by capa, which includes:
capa explorer is limited to the file types supported by capa, which include:
* Windows 32-bit and 64-bit PE files
* Windows 32-bit and 64-bit shellcode
@@ -74,38 +60,48 @@ You can install capa explorer using the following steps:
### Usage
1. Run IDA and analyze a supported file type (select the `Manual Load` and `Load Resources` options in IDA for best results)
1. Open IDA and analyze a supported file type (select the `Manual Load` and `Load Resources` options in IDA for best results)
2. Open capa explorer in IDA by navigating to `Edit > Plugins > FLARE capa explorer` or using the keyboard shortcut `Alt+F5`
3. Click the `Analyze` button
3. Select the `Program Analysis` tab
4. Click the `Analyze` button
When running capa explorer for the first time you are prompted to select a file directory containing capa rules. The plugin conveniently
remembers your selection for future runs; you can change this selection by navigating to `Rules > Change rules directory...`. We recommend
remembers your selection for future runs; you can change this selection and other default settings by clicking `Settings`. We recommend
downloading and using the [standard collection of capa rules](https://github.com/fireeye/capa-rules) when getting started with the plugin.
#### Tips
#### Tips for Program Analysis
* Start analysis by clicking the `Analyze` button
* Reset the plugin user interface and remove highlighting from IDA disassembly view by clicking the `Reset` button
* Change your capa rules directory by navigating to `Rules > Change rules directory...` from the plugin menu
* Reset the plugin user interface and remove highlighting from your Disassembly view by clicking the `Reset` button
* Change your capa rules directory and other default settings by clicking `Settings`
* Hover your cursor over a rule match to view the source content of the rule
* Double-click the `Address` column to navigate the IDA Disassembly view to the associated feature
* Double-click the `Address` column to navigate your Disassembly view to the address of the associated feature
* Double-click a result in the `Rule Information` column to expand its children
* Select a checkbox in the `Rule Information` column to highlight the address of the associated feature in the IDA Disassembly view
* Select a checkbox in the `Rule Information` column to highlight the address of the associated feature in your Disassembly view
#### Tips for Rule Generator
* Navigate to a function in your Disassembly view and click `Analyze` to get started
* Double-click or use multi-select + right-click to add features from the `Features` pane to the `Editor` pane
* Right-click features in the `Editor` pane to make context-specific modifications
* Drag-and-drop (single click + multi-select support) features in the `Editor` pane to construct your hierarchy of statements and features
* Right-click anywhere in the `Editor` pane not on a feature to remove all features
* Add descriptions or comments to a feature by editing the corresponding column in the `Editor` pane
* Directly edit rule text and metadata fields using the `Preview` pane
* Change the default rule author and default rule scope displayed in the `Preview` pane by clicking `Settings`
## Development
Because capa explorer is packaged with capa you will need to install capa locally for development.
You can install capa locally by following the steps outlined in `Method 3: Inspecting the capa source code` of the [capa
capa explorer is packaged with capa so you will need to install capa locally for development. You can install capa locally by following the steps outlined in `Method 3: Inspecting the capa source code` of the [capa
installation guide](https://github.com/fireeye/capa/blob/master/doc/installation.md#method-3-inspecting-the-capa-source-code). Once installed, copy [capa_explorer.py](https://raw.githubusercontent.com/fireeye/capa/master/capa/ida/plugin/capa_explorer.py)
to your IDA plugins directory to run the plugin in IDA.
to your plugins directory to install capa explorer in IDA.
### Components
capa explorer consists of two main components:
* An IDA [feature extractor](https://github.com/fireeye/capa/tree/master/capa/features/extractors/ida) built on top of IDA's binary analysis engine
* This component uses IDAPython to extract [capa features](https://github.com/fireeye/capa-rules/blob/master/doc/format.md#extracted-features) from the IDA database such as strings,
* A [feature extractor](https://github.com/fireeye/capa/tree/master/capa/features/extractors/ida) built on top of IDA's binary analysis engine
* This component uses IDAPython to extract [capa features](https://github.com/fireeye/capa-rules/blob/master/doc/format.md#extracted-features) from your IDBs such as strings,
disassembly, and control flow; these extracted features are used by capa to find feature combinations that result in a rule match
* An [interactive user interface](https://github.com/fireeye/capa/tree/master/capa/ida/plugin) for displaying and exploring capa rule matches
* This component integrates the IDA feature extractor and capa, providing an interactive user interface to dissect rule matches found by capa using features extracted by the IDA feature extractor
* This component integrates the feature extractor and capa, providing an interactive user interface to dissect rule matches found by capa using features extracted directly from your IDBs

File diff suppressed because it is too large

View File

@@ -35,20 +35,19 @@ def location_to_hex(location):
class CapaExplorerDataItem(object):
"""store data for CapaExplorerDataModel"""
def __init__(self, parent, data):
def __init__(self, parent, data, can_check=True):
"""initialize item"""
self.pred = parent
self._data = data
self.children = []
self._checked = False
self._can_check = can_check
# default state for item
self.flags = (
QtCore.Qt.ItemIsEnabled
| QtCore.Qt.ItemIsSelectable
| QtCore.Qt.ItemIsTristate
| QtCore.Qt.ItemIsUserCheckable
)
self.flags = QtCore.Qt.ItemIsEnabled | QtCore.Qt.ItemIsSelectable
if self._can_check:
self.flags = self.flags | QtCore.Qt.ItemIsUserCheckable | QtCore.Qt.ItemIsTristate
if self.pred:
self.pred.appendChild(self)
@@ -70,6 +69,10 @@ class CapaExplorerDataItem(object):
"""
self._checked = checked
def canCheck(self):
""" """
return self._can_check
def isChecked(self):
"""get item is checked"""
return self._checked
@@ -165,7 +168,7 @@ class CapaExplorerRuleItem(CapaExplorerDataItem):
fmt = "%s (%d matches)"
def __init__(self, parent, name, namespace, count, source):
def __init__(self, parent, name, namespace, count, source, can_check=True):
"""initialize item
@param parent: parent node
@@ -175,7 +178,7 @@ class CapaExplorerRuleItem(CapaExplorerDataItem):
@param source: rule source (tooltip)
"""
display = self.fmt % (name, count) if count > 1 else name
super(CapaExplorerRuleItem, self).__init__(parent, [display, "", namespace])
super(CapaExplorerRuleItem, self).__init__(parent, [display, "", namespace], can_check)
self._source = source
@property
@@ -208,14 +211,14 @@ class CapaExplorerFunctionItem(CapaExplorerDataItem):
fmt = "function(%s)"
def __init__(self, parent, location):
def __init__(self, parent, location, can_check=True):
"""initialize item
@param parent: parent node
@param location: virtual address of function as seen by IDA
"""
super(CapaExplorerFunctionItem, self).__init__(
parent, [self.fmt % idaapi.get_name(location), location_to_hex(location), ""]
parent, [self.fmt % idaapi.get_name(location), location_to_hex(location), ""], can_check
)
@property

View File

@@ -6,7 +6,7 @@
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
from collections import deque
from collections import deque, defaultdict
import idc
import idaapi
@@ -110,6 +110,8 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
if role == QtCore.Qt.CheckStateRole and column == CapaExplorerDataModel.COLUMN_INDEX_RULE_INFORMATION:
# inform view how to display content of checkbox - un/checked
if not item.canCheck():
return None
return QtCore.Qt.Checked if item.isChecked() else QtCore.Qt.Unchecked
if role == QtCore.Qt.FontRole and column in (
@@ -424,14 +426,28 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
for child in match.get("children", []):
self.render_capa_doc_match(parent2, child, doc)
def render_capa_doc(self, doc):
"""render capa features specified in doc
@param doc: capa result doc
"""
# inform model that changes are about to occur
self.beginResetModel()
def render_capa_doc_by_function(self, doc):
""" """
matches_by_function = {}
for rule in rutils.capability_rules(doc):
for ea in rule["matches"].keys():
ea = capa.ida.helpers.get_func_start_ea(ea)
if ea is None:
# file scope, skip for rendering in this mode
continue
if None is matches_by_function.get(ea, None):
matches_by_function[ea] = CapaExplorerFunctionItem(self.root_node, ea, can_check=False)
CapaExplorerRuleItem(
matches_by_function[ea],
rule["meta"]["name"],
rule["meta"].get("namespace"),
len(rule["matches"]),
rule["source"],
can_check=False,
)
def render_capa_doc_by_program(self, doc):
""" """
for rule in rutils.capability_rules(doc):
rule_name = rule["meta"]["name"]
rule_namespace = rule["meta"].get("namespace")
@@ -451,6 +467,19 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
self.render_capa_doc_match(parent2, match, doc)
def render_capa_doc(self, doc, by_function):
"""render capa features specified in doc
@param doc: capa result doc
"""
# inform model that changes are about to occur
self.beginResetModel()
if by_function:
self.render_capa_doc_by_function(doc)
else:
self.render_capa_doc_by_program(doc)
# inform model changes have ended
self.endResetModel()
@@ -459,13 +488,17 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
@param feature: capa feature read from doc
"""
if feature[feature["type"]]:
key = feature["type"]
value = feature[feature["type"]]
if value:
if key == "string":
value = '"%s"' % capa.features.escape_string(value)
if feature.get("description", ""):
return "%s(%s = %s)" % (feature["type"], feature[feature["type"]], feature["description"])
return "%s(%s = %s)" % (key, value, feature["description"])
else:
return "%s(%s)" % (feature["type"], feature[feature["type"]])
return "%s(%s)" % (key, value)
else:
return "%s" % feature["type"]
return "%s" % key
def render_capa_doc_feature_node(self, parent, feature, locations, doc):
"""process capa doc feature node
@@ -522,7 +555,9 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
)
if feature["type"] == "regex":
return CapaExplorerStringViewItem(parent, display, location, feature["match"])
return CapaExplorerStringViewItem(
parent, display, location, '"%s"' % capa.features.escape_string(feature["match"])
)
if feature["type"] == "basicblock":
return CapaExplorerBlockItem(parent, location)
@@ -547,7 +582,9 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
if feature["type"] in ("string",):
# display string preview
return CapaExplorerStringViewItem(parent, display, location, feature[feature["type"]])
return CapaExplorerStringViewItem(
parent, display, location, '"%s"' % capa.features.escape_string(feature[feature["type"]])
)
if feature["type"] in ("import", "export"):
# display no preview
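
The `render_capa_doc_by_function` method added in this file groups rule matches under the function that contains each match address and skips file-scope matches. The same grouping can be sketched without Qt. In this sketch `group_rules_by_function` and the `get_func_start` callback are hypothetical; the callback stands in for `capa.ida.helpers.get_func_start_ea`, and the rule dictionaries are reduced to the two keys used here.

```python
def group_rules_by_function(rules, get_func_start):
    """Map function start address -> list of rule names matched inside it."""
    by_function = {}
    for rule in rules:
        for ea in rule["matches"]:
            start = get_func_start(ea)
            if start is None:
                # file-scope match: no containing function, so skip it
                continue
            by_function.setdefault(start, []).append(rule["meta"]["name"])
    return by_function


# illustrative data only, not taken from a real capa result document
func_starts = {0x401010: 0x401000, 0x402020: 0x402000}
rules = [
    {"meta": {"name": "create process"}, "matches": {0x401010: {}, 0x402020: {}}},
    {"meta": {"name": "file-scope rule"}, "matches": {0x0: {}}},
]
grouped = group_rules_by_function(rules, func_starts.get)
```

As in the method above, a rule is recorded once per matching address, so a rule that matches twice inside one function is listed twice under it.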

View File

@@ -5,15 +5,936 @@
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import re
from collections import Counter
import idc
from PyQt5 import QtCore, QtWidgets
from PyQt5 import QtGui, QtCore, QtWidgets
import capa.rules
import capa.engine
import capa.ida.helpers
import capa.features.basicblock
from capa.ida.plugin.item import CapaExplorerFunctionItem
from capa.ida.plugin.model import CapaExplorerDataModel
MAX_SECTION_SIZE = 750
# default colors used in views
COLOR_GREEN_RGB = (79, 121, 66)
COLOR_BLUE_RGB = (37, 147, 215)
def calc_level_by_indent(line, prev_level=0):
""" """
if not len(line.strip()):
# blank line, which may occur for comments so we simply use the last level
return prev_level
stripped = line.lstrip()
if stripped.startswith("description"):
# need to adjust two spaces when encountering string description
line = line[2:]
# calc line level based on preceding whitespace
return len(line) - len(stripped)
def parse_feature_for_node(feature):
""" """
description = ""
comment = ""
if feature.startswith("- count"):
# count is weird, so we need to handle it specially
# first, we need to grab the comment, if exists
# next, we need to check for an embedded description
feature, _, comment = feature.partition("#")
m = re.search(r"- count\(([a-zA-Z]+)\((.+)\s+=\s+(.+)\)\):\s*(.+)", feature)
if m:
# reconstruct count without description
feature, value, description, count = m.groups()
feature = "- count(%s(%s)): %s" % (feature, value, count)
elif not feature.startswith("#"):
feature, _, comment = feature.partition("#")
feature, _, description = feature.partition("=")
return map(lambda o: o.strip(), (feature, description, comment))
def parse_node_for_feature(feature, description, comment, depth):
""" """
depth = (depth * 2) + 4
display = ""
if feature.startswith("#"):
display += "%s%s\n" % (" " * depth, feature)
elif description:
if feature.startswith(("- and", "- or", "- optional", "- basic block", "- not")):
display += "%s%s" % (" " * depth, feature)
if comment:
display += " # %s" % comment
display += "\n%s- description: %s\n" % (" " * (depth + 2), description)
elif feature.startswith("- string"):
display += "%s%s" % (" " * depth, feature)
if comment:
display += " # %s" % comment
display += "\n%sdescription: %s\n" % (" " * (depth + 2), description)
elif feature.startswith("- count"):
# count is weird, we need to format description based on feature type, so we parse with regex
# assume format - count(<feature_name>(<feature_value>)): <count>
m = re.search(r"- count\(([a-zA-Z]+)\((.+)\)\): (.+)", feature)
if m:
name, value, count = m.groups()
if name in ("string",):
display += "%s%s" % (" " * depth, feature)
if comment:
display += " # %s" % comment
display += "\n%sdescription: %s\n" % (" " * (depth + 2), description)
else:
display += "%s- count(%s(%s = %s)): %s" % (
" " * depth,
name,
value,
description,
count,
)
if comment:
display += " # %s\n" % comment
else:
display += "%s%s = %s" % (" " * depth, feature, description)
if comment:
display += " # %s\n" % comment
else:
display += "%s%s" % (" " * depth, feature)
if comment:
display += " # %s\n" % comment
return display if display.endswith("\n") else display + "\n"
def yaml_to_nodes(s):
level = 0
for line in s.splitlines():
feature, description, comment = parse_feature_for_node(line.strip())
o = QtWidgets.QTreeWidgetItem(None)
# set node attributes
setattr(o, "capa_level", calc_level_by_indent(line, level))
if feature.startswith(("- and:", "- or:", "- not:", "- basic block:", "- optional:")):
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_expression())
elif feature.startswith("#"):
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_comment())
else:
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_feature())
# set node text
for (i, v) in enumerate((feature, description, comment)):
o.setText(i, v)
yield o
def iterate_tree(o):
""" """
itr = QtWidgets.QTreeWidgetItemIterator(o)
while itr.value():
yield itr.value()
itr += 1
def calc_item_depth(o):
"""count the ancestors of item o; top-level items have depth 0"""
depth = 0
while True:
if not o.parent():
break
depth += 1
o = o.parent()
return depth
def build_action(o, display, data, slot):
"""create a QAction labeled display, carrying data, that calls slot(action) when triggered"""
action = QtWidgets.QAction(display, o)
action.setData(data)
action.triggered.connect(lambda checked: slot(action))
return action
def build_context_menu(o, actions):
"""build a QMenu from (display, data, slot) tuples; QMenu entries are added as submenus"""
menu = QtWidgets.QMenu()
for action in actions:
if isinstance(action, QtWidgets.QMenu):
menu.addMenu(action)
else:
menu.addAction(build_action(o, *action))
return menu
class CapaExplorerRulgenPreview(QtWidgets.QTextEdit):
INDENT = " " * 2
def __init__(self, parent=None):
""" """
super(CapaExplorerRulgenPreview, self).__init__(parent)
self.setFont(QtGui.QFont("Courier", weight=QtGui.QFont.Bold))
self.setLineWrapMode(QtWidgets.QTextEdit.NoWrap)
self.setHorizontalScrollBarPolicy(QtCore.Qt.ScrollBarAsNeeded)
def reset_view(self):
""" """
self.clear()
def load_preview_meta(self, ea, author, scope):
""" """
metadata_default = [
"# generated using capa explorer for IDA Pro",
"rule:",
" meta:",
" name: <insert_name>",
" namespace: <insert_namespace>",
" author: %s" % author,
" scope: %s" % scope,
" references: <insert_references>",
" examples:",
" - %s:0x%X" % (capa.ida.helpers.get_file_md5().upper(), ea)
if ea
else " - %s" % (capa.ida.helpers.get_file_md5().upper()),
" features:",
]
self.setText("\n".join(metadata_default))
def keyPressEvent(self, e):
"""intercept key press events"""
if e.key() in (QtCore.Qt.Key_Tab, QtCore.Qt.Key_Backtab):
# apparently it's not easy to implement tabs as spaces, or multi-line tab or SHIFT + Tab
# so we need to implement it ourselves so we can retain properly formatted capa rules
# when a user uses the Tab key
if self.textCursor().selection().isEmpty():
# single line, only worry about Tab
if e.key() == QtCore.Qt.Key_Tab:
self.insertPlainText(self.INDENT)
else:
# multi-line tab or SHIFT + Tab
cur = self.textCursor()
select_start_ppos = cur.selectionStart()
select_end_ppos = cur.selectionEnd()
scroll_ppos = self.verticalScrollBar().sliderPosition()
# determine lineno for first selected line, and column
cur.setPosition(select_start_ppos)
start_lineno = self.count_previous_lines_from_block(cur.block())
start_lineco = cur.columnNumber()
# determine lineno for last selected line
cur.setPosition(select_end_ppos)
end_lineno = self.count_previous_lines_from_block(cur.block())
# now we need to indent or dedent the selected lines. for now, we read the text, modify
# the lines between start_lineno and end_lineno accordingly, and then reset the view
# this might not be the best solution, but it avoids messing around with cursor positions
# to determine the beginning of lines
plain = self.toPlainText().splitlines()
if e.key() == QtCore.Qt.Key_Tab:
# user Tab, indent selected lines
lines_modified = end_lineno - start_lineno
first_modified = True
change = [self.INDENT + line for line in plain[start_lineno : end_lineno + 1]]
else:
# user SHIFT + Tab, dedent selected lines
lines_modified = 0
first_modified = False
change = []
for (lineno, line) in enumerate(plain[start_lineno : end_lineno + 1]):
if line.startswith(self.INDENT):
if lineno == 0:
# keep track if first line is modified, so we can properly display
# the text selection later
first_modified = True
lines_modified += 1
line = line[len(self.INDENT) :]
change.append(line)
# apply modifications, and reset view
plain[start_lineno : end_lineno + 1] = change
self.setPlainText("\n".join(plain) + "\n")
# now we need to properly adjust the selection positions, so users don't have to
# re-select when indenting or dedenting the same lines repeatedly
if e.key() == QtCore.Qt.Key_Tab:
# user Tab, increase selection positions
select_start_ppos += len(self.INDENT)
select_end_ppos += (lines_modified * len(self.INDENT)) + len(self.INDENT)
elif lines_modified:
# user SHIFT + Tab, decrease selection positions
if start_lineco not in (0, 1) and first_modified:
# only decrease start position if not in first column
select_start_ppos -= len(self.INDENT)
select_end_ppos -= lines_modified * len(self.INDENT)
# apply updated selection and restore previous scroll position
self.set_selection(select_start_ppos, select_end_ppos, len(self.toPlainText()))
self.verticalScrollBar().setSliderPosition(scroll_ppos)
else:
super(CapaExplorerRulgenPreview, self).keyPressEvent(e)
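The multi-line Tab / SHIFT + Tab branch above is, at its core, a plain text transform over a range of lines. A minimal sketch of that transform outside Qt follows. `shift_lines` and its parameters are hypothetical names used for illustration, not part of capa, and the cursor and selection bookkeeping is left out. Unlike the handler above, the sketch counts every changed line when indenting.

```python
INDENT = " " * 2  # same two-space unit as CapaExplorerRulgenPreview.INDENT


def shift_lines(text, start_lineno, end_lineno, dedent=False):
    """indent or dedent lines start_lineno..end_lineno (inclusive) of text.

    returns (new_text, lines_modified); hypothetical helper for illustration only.
    """
    lines = text.splitlines()
    changed = []
    lines_modified = 0
    for line in lines[start_lineno : end_lineno + 1]:
        if not dedent:
            # Tab: every selected line gains one indent unit
            line = INDENT + line
            lines_modified += 1
        elif line.startswith(INDENT):
            # SHIFT + Tab: only lines that start with a full indent unit are dedented
            line = line[len(INDENT) :]
            lines_modified += 1
        changed.append(line)
    # splice the modified lines back in, as the key handler does before resetting the view
    lines[start_lineno : end_lineno + 1] = changed
    return "\n".join(lines) + "\n", lines_modified


rule = "features:\n  - or:\n    - api: CreateFileA\n"
indented, n_indented = shift_lines(rule, 1, 2)
dedented, n_dedented = shift_lines(rule, 0, 2, dedent=True)
```

Rewriting the whole text and splicing the changed range back in avoids locating line starts with cursor positions, which is the trade-off the handler's own comments describe.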
def count_previous_lines_from_block(self, block):
"""calculate number of lines preceding block"""
count = 0
while True:
block = block.previous()
if not block.isValid():
break
count += block.lineCount()
return count
def set_selection(self, start, end, limit):
"""set text selection, clamping the end position to limit"""
cursor = self.textCursor()
cursor.setPosition(start)
cursor.setPosition(min(end, limit), QtGui.QTextCursor.KeepAnchor)
self.setTextCursor(cursor)
class CapaExplorerRulgenEditor(QtWidgets.QTreeWidget):
updated = QtCore.pyqtSignal()
def __init__(self, preview, parent=None):
""" """
super(CapaExplorerRulgenEditor, self).__init__(parent)
self.preview = preview
self.setHeaderLabels(["Feature", "Description", "Comment"])
self.header().setSectionResizeMode(QtWidgets.QHeaderView.ResizeToContents)
self.header().setStretchLastSection(False)
self.setExpandsOnDoubleClick(False)
self.setEditTriggers(QtWidgets.QAbstractItemView.NoEditTriggers)
self.setContextMenuPolicy(QtCore.Qt.CustomContextMenu)
self.setSelectionMode(QtWidgets.QAbstractItemView.ExtendedSelection)
self.setStyleSheet("QTreeView::item {padding-right: 15 px;padding-bottom: 2 px;}")
# enable drag and drop
self.setDragEnabled(True)
self.setAcceptDrops(True)
self.setDragDropMode(QtWidgets.QAbstractItemView.InternalMove)
# connect slots
self.itemChanged.connect(self.slot_item_changed)
self.customContextMenuRequested.connect(self.slot_custom_context_menu_requested)
self.itemDoubleClicked.connect(self.slot_item_double_clicked)
self.root = None
self.reset_view()
self.is_editing = False
@staticmethod
def get_column_feature_index():
""" """
return 0
@staticmethod
def get_column_description_index():
""" """
return 1
@staticmethod
def get_column_comment_index():
""" """
return 2
@staticmethod
def get_node_type_expression():
""" """
return 0
@staticmethod
def get_node_type_feature():
""" """
return 1
@staticmethod
def get_node_type_comment():
""" """
return 2
def dragMoveEvent(self, e):
""" """
super(CapaExplorerRulgenEditor, self).dragMoveEvent(e)
def dragEnterEvent(self, e):
""" """
super(CapaExplorerRulgenEditor, self).dragEnterEvent(e)
def dropEvent(self, e):
""" """
if not self.indexAt(e.pos()).isValid():
return
super(CapaExplorerRulgenEditor, self).dropEvent(e)
# self.prune_expressions()
self.update_preview()
self.expandAll()
def reset_view(self):
""" """
self.root = None
self.clear()
def slot_item_changed(self, item, column):
""" """
if self.is_editing:
self.update_preview()
self.is_editing = False
def slot_remove_selected(self, action):
""" """
for o in self.selectedItems():
if o == self.root:
self.takeTopLevelItem(self.indexOfTopLevelItem(o))
self.root = None
continue
o.parent().removeChild(o)
def slot_nest_features(self, action):
""" """
# create a new parent under root node, by default; new node added last position in tree
new_parent = self.new_expression_node(self.root, (action.data()[0], ""))
if "basic block" in action.data()[0]:
# add default child expression when nesting under basic block
new_parent.setExpanded(True)
new_parent = self.new_expression_node(new_parent, ("- or:", ""))
for o in self.get_features(selected=True):
# take child from its parent by index, add to new parent
new_parent.addChild(o.parent().takeChild(o.parent().indexOfChild(o)))
# ensure new parent expanded
new_parent.setExpanded(True)
def slot_edit_expression(self, action):
""" """
expression, o = action.data()
if "basic block" in expression and "basic block" not in o.text(
CapaExplorerRulgenEditor.get_column_feature_index()
):
# current expression is "basic block", and not changing to "basic block" expression
children = o.takeChildren()
new_parent = self.new_expression_node(o, ("- or:", ""))
for child in children:
new_parent.addChild(child)
new_parent.setExpanded(True)
o.setText(CapaExplorerRulgenEditor.get_column_feature_index(), expression)
def slot_clear_all(self, action):
""" """
self.reset_view()
def slot_custom_context_menu_requested(self, pos):
""" """
if not self.indexAt(pos).isValid():
# user selected invalid index
self.load_custom_context_menu_invalid_index(pos)
elif self.itemAt(pos).capa_type == CapaExplorerRulgenEditor.get_node_type_expression():
# user selected expression node
self.load_custom_context_menu_expression(pos)
else:
# user selected feature node
self.load_custom_context_menu_feature(pos)
self.update_preview()
def slot_item_double_clicked(self, o, column):
""" """
if column in (
CapaExplorerRulgenEditor.get_column_comment_index(),
CapaExplorerRulgenEditor.get_column_description_index(),
):
o.setFlags(o.flags() | QtCore.Qt.ItemIsEditable)
self.editItem(o, column)
o.setFlags(o.flags() & ~QtCore.Qt.ItemIsEditable)
self.is_editing = True
def update_preview(self):
""" """
rule_text = self.preview.toPlainText()
if -1 != rule_text.find("features:"):
rule_text = rule_text[: rule_text.find("features:") + len("features:")]
rule_text += "\n"
else:
rule_text = rule_text.rstrip()
rule_text += "\n features:\n"
for o in iterate_tree(self):
feature, description, comment = (o.text(i).strip() for i in range(3))
rule_text += parse_node_for_feature(feature, description, comment, calc_item_depth(o))
# FIXME we avoid circular update by disabling signals when updating
# the preview. Preferably we would refactor the code to avoid this
# in the first place
self.preview.blockSignals(True)
self.preview.setPlainText(rule_text)
self.preview.blockSignals(False)
# emit signal so views can update
self.updated.emit()
def load_custom_context_menu_invalid_index(self, pos):
""" """
actions = (("Remove all", (), self.slot_clear_all),)
menu = build_context_menu(self.parent(), actions)
menu.exec_(self.viewport().mapToGlobal(pos))
def load_custom_context_menu_feature(self, pos):
""" """
actions = (("Remove selection", (), self.slot_remove_selected),)
sub_actions = (
("and", ("- and:",), self.slot_nest_features),
("or", ("- or:",), self.slot_nest_features),
("not", ("- not:",), self.slot_nest_features),
("optional", ("- optional:",), self.slot_nest_features),
("basic block", ("- basic block:",), self.slot_nest_features),
)
# build submenu with modify actions
sub_menu = build_context_menu(self.parent(), sub_actions)
sub_menu.setTitle("Nest feature%s" % ("" if len(tuple(self.get_features(selected=True))) == 1 else "s"))
# build main menu with submenu + main actions
menu = build_context_menu(self.parent(), (sub_menu,) + actions)
menu.exec_(self.viewport().mapToGlobal(pos))
def load_custom_context_menu_expression(self, pos):
""" """
actions = (("Remove expression", (), self.slot_remove_selected),)
sub_actions = (
("and", ("- and:", self.itemAt(pos)), self.slot_edit_expression),
("or", ("- or:", self.itemAt(pos)), self.slot_edit_expression),
("not", ("- not:", self.itemAt(pos)), self.slot_edit_expression),
("optional", ("- optional:", self.itemAt(pos)), self.slot_edit_expression),
("basic block", ("- basic block:", self.itemAt(pos)), self.slot_edit_expression),
)
# build submenu with modify actions
sub_menu = build_context_menu(self.parent(), sub_actions)
sub_menu.setTitle("Modify")
# build main menu with submenu + main actions
menu = build_context_menu(self.parent(), (sub_menu,) + actions)
menu.exec_(self.viewport().mapToGlobal(pos))
def style_expression_node(self, o):
""" """
font = QtGui.QFont()
font.setBold(True)
o.setFont(CapaExplorerRulgenEditor.get_column_feature_index(), font)
def style_feature_node(self, o):
""" """
font = QtGui.QFont()
brush = QtGui.QBrush()
font.setFamily("Courier")
font.setWeight(QtGui.QFont.Medium)
brush.setColor(QtGui.QColor(*COLOR_GREEN_RGB))
o.setFont(CapaExplorerRulgenEditor.get_column_feature_index(), font)
o.setForeground(CapaExplorerRulgenEditor.get_column_feature_index(), brush)
def style_comment_node(self, o):
""" """
font = QtGui.QFont()
font.setBold(True)
font.setFamily("Courier")
o.setFont(CapaExplorerRulgenEditor.get_column_feature_index(), font)
def set_expression_node(self, o):
""" """
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_expression())
self.style_expression_node(o)
def set_feature_node(self, o):
""" """
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_feature())
o.setFlags(o.flags() & ~QtCore.Qt.ItemIsDropEnabled)
self.style_feature_node(o)
def set_comment_node(self, o):
""" """
setattr(o, "capa_type", CapaExplorerRulgenEditor.get_node_type_comment())
o.setFlags(o.flags() & ~QtCore.Qt.ItemIsDropEnabled)
self.style_comment_node(o)
def new_expression_node(self, parent, values=()):
""" """
o = QtWidgets.QTreeWidgetItem(parent)
self.set_expression_node(o)
for (i, v) in enumerate(values):
o.setText(i, v)
return o
def new_feature_node(self, parent, values=()):
""" """
o = QtWidgets.QTreeWidgetItem(parent)
self.set_feature_node(o)
for (i, v) in enumerate(values):
o.setText(i, v)
return o
def new_comment_node(self, parent, values=()):
""" """
o = QtWidgets.QTreeWidgetItem(parent)
self.set_comment_node(o)
for (i, v) in enumerate(values):
o.setText(i, v)
return o
def update_features(self, features):
""" """
if not self.root:
# root node does not exist, create default node, set expanded
self.root = self.new_expression_node(self, ("- or:", ""))
# build feature counts
counted = list(Counter(features).items())
# single features
for (k, v) in filter(lambda t: t[1] == 1, counted):
if isinstance(k, (capa.features.String,)):
value = '"%s"' % capa.features.escape_string(k.get_value_str())
else:
value = k.get_value_str()
self.new_feature_node(self.root, ("- %s: %s" % (k.name.lower(), value), ""))
# n > 1 features
for (k, v) in filter(lambda t: t[1] > 1, counted):
if k.value:
if isinstance(k, (capa.features.String,)):
value = '"%s"' % capa.features.escape_string(k.get_value_str())
else:
value = k.get_value_str()
display = "- count(%s(%s)): %d" % (k.name.lower(), value, v)
else:
display = "- count(%s): %d" % (k.name.lower(), v)
self.new_feature_node(self.root, (display, ""))
self.expandAll()
self.update_preview()
def load_features_from_yaml(self, rule_text, update_preview=False):
""" """
def add_node(parent, node):
# slice off the prefix: str.lstrip(chars) strips a set of characters, not a prefix,
# and would also eat the start of descriptions such as "do network stuff"
if node.text(0).startswith("description:"):
if parent.childCount():
parent.child(parent.childCount() - 1).setText(1, node.text(0)[len("description:") :].lstrip())
else:
parent.setText(1, node.text(0)[len("description:") :].lstrip())
elif node.text(0).startswith("- description:"):
parent.setText(1, node.text(0)[len("- description:") :].lstrip())
else:
parent.addChild(node)
def build(parent, nodes):
if nodes:
child_lvl = nodes[0].capa_level
while nodes:
node = nodes.pop(0)
if node.capa_level == child_lvl:
add_node(parent, node)
elif node.capa_level > child_lvl:
nodes.insert(0, node)
build(parent.child(parent.childCount() - 1), nodes)
else:
parent = parent.parent() if parent.parent() else parent
add_node(parent, node)
self.reset_view()
# check for lack of features block
if -1 == rule_text.find("features:"):
return
rule_features = rule_text[rule_text.find("features:") + len("features:") :].strip()
rule_nodes = list(yaml_to_nodes(rule_features))
# check for lack of nodes
if not rule_nodes:
return
for o in rule_nodes:
(self.set_expression_node, self.set_feature_node, self.set_comment_node)[o.capa_type](o)
self.root = rule_nodes.pop(0)
self.addTopLevelItem(self.root)
if update_preview:
self.preview.blockSignals(True)
self.preview.setPlainText(rule_text)
self.preview.blockSignals(False)
build(self.root, rule_nodes)
self.expandAll()
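When a literal prefix such as `- description:` has to be removed from node text, it helps to remember that `str.lstrip` takes a set of characters rather than a prefix. The sketch below contrasts the two behaviours. `strip_prefix` is a hypothetical helper name, not something capa defines.

```python
def strip_prefix(text, prefix):
    """return text without a leading prefix, then drop leading whitespace"""
    if text.startswith(prefix):
        text = text[len(prefix) :]
    return text.lstrip()


line = "- description: do network stuff"

# character-set stripping keeps going while characters are in "- description:",
# so it also consumes "do net" from the description itself
by_charset = line.lstrip("- description:").lstrip()

# prefix slicing removes exactly the prefix and nothing more
by_prefix = strip_prefix(line, "- description:")
```

The difference only shows up when the description begins with characters that also occur in the prefix, which makes it easy to miss in testing.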
def get_features(self, selected=False, ignore=()):
""" """
for feature in filter(
lambda o: o.capa_type
in (CapaExplorerRulgenEditor.get_node_type_feature(), CapaExplorerRulgenEditor.get_node_type_comment()),
tuple(iterate_tree(self)),
):
if feature in ignore:
continue
if selected and not feature.isSelected():
continue
yield feature
def get_expressions(self, selected=False, ignore=()):
""" """
for expression in filter(
lambda o: o.capa_type == CapaExplorerRulgenEditor.get_node_type_expression(), tuple(iterate_tree(self))
):
if expression in ignore:
continue
if selected and not expression.isSelected():
continue
yield expression
class CapaExplorerRulegenFeatures(QtWidgets.QTreeWidget):
def __init__(self, editor, parent=None):
""" """
super(CapaExplorerRulegenFeatures, self).__init__(parent)
self.parent_items = {}
self.editor = editor
self.setHeaderLabels(["Feature", "Virtual Address"])
self.header().setSectionResizeMode(QtWidgets.QHeaderView.ResizeToContents)
self.setStyleSheet("QTreeView::item {padding-right: 15 px;padding-bottom: 2 px;}")
self.setExpandsOnDoubleClick(False)
self.setContextMenuPolicy(QtCore.Qt.CustomContextMenu)
self.setSelectionMode(QtWidgets.QAbstractItemView.ExtendedSelection)
# connect slots
self.itemDoubleClicked.connect(self.slot_item_double_clicked)
self.customContextMenuRequested.connect(self.slot_custom_context_menu_requested)
self.reset_view()
@staticmethod
def get_column_feature_index():
""" """
return 0
@staticmethod
def get_column_address_index():
""" """
return 1
@staticmethod
def get_node_type_parent():
""" """
return 0
@staticmethod
def get_node_type_leaf():
""" """
return 1
def reset_view(self):
""" """
self.clear()
def slot_add_selected_features(self, action):
""" """
selected = [item.data(0, 0x100) for item in self.selectedItems()]
if selected:
self.editor.update_features(selected)
def slot_custom_context_menu_requested(self, pos):
""" """
actions = []
action_add_features_fmt = ""
selected_items_count = len(self.selectedItems())
if selected_items_count == 0:
return
if selected_items_count == 1:
action_add_features_fmt = "Add feature"
else:
action_add_features_fmt = "Add %d features" % selected_items_count
actions.append((action_add_features_fmt, (), self.slot_add_selected_features))
menu = build_context_menu(self.parent(), actions)
menu.exec_(self.viewport().mapToGlobal(pos))
def slot_item_double_clicked(self, o, column):
""" """
if column == CapaExplorerRulegenFeatures.get_column_address_index() and o.text(column):
idc.jumpto(int(o.text(column), 0x10))
elif o.capa_type == CapaExplorerRulegenFeatures.get_node_type_leaf():
self.editor.update_features([o.data(0, 0x100)])
def show_all_items(self):
""" """
for o in iterate_tree(self):
o.setHidden(False)
o.setExpanded(False)
def filter_items_by_text(self, text):
""" """
if text:
for o in iterate_tree(self):
data = o.data(0, 0x100)
if data:
to_match = data.get_value_str()
if not to_match or text.lower() not in to_match.lower():
o.setHidden(True)
continue
o.setHidden(False)
o.setExpanded(True)
else:
self.show_all_items()
def style_parent_node(self, o):
""" """
font = QtGui.QFont()
font.setBold(True)
o.setFont(CapaExplorerRulegenFeatures.get_column_feature_index(), font)
def style_leaf_node(self, o):
""" """
font = QtGui.QFont("Courier", weight=QtGui.QFont.Bold)
brush = QtGui.QBrush()
o.setFont(CapaExplorerRulegenFeatures.get_column_feature_index(), font)
o.setFont(CapaExplorerRulegenFeatures.get_column_address_index(), font)
brush.setColor(QtGui.QColor(*COLOR_GREEN_RGB))
o.setForeground(CapaExplorerRulegenFeatures.get_column_feature_index(), brush)
brush.setColor(QtGui.QColor(*COLOR_BLUE_RGB))
o.setForeground(CapaExplorerRulegenFeatures.get_column_address_index(), brush)
def set_parent_node(self, o):
""" """
o.setFlags(o.flags() & ~QtCore.Qt.ItemIsSelectable)
setattr(o, "capa_type", CapaExplorerRulegenFeatures.get_node_type_parent())
self.style_parent_node(o)
def set_leaf_node(self, o):
""" """
setattr(o, "capa_type", CapaExplorerRulegenFeatures.get_node_type_leaf())
self.style_leaf_node(o)
def new_parent_node(self, parent, data, feature=None):
""" """
o = QtWidgets.QTreeWidgetItem(parent)
self.set_parent_node(o)
for (i, v) in enumerate(data):
o.setText(i, v)
if feature:
o.setData(0, 0x100, feature)
return o
def new_leaf_node(self, parent, data, feature=None):
""" """
o = QtWidgets.QTreeWidgetItem(parent)
self.set_leaf_node(o)
for (i, v) in enumerate(data):
o.setText(i, v)
if feature:
o.setData(0, 0x100, feature)
return o
def load_features(self, file_features, func_features=None):
""" """
self.parse_features_for_tree(self.new_parent_node(self, ("File Scope",)), file_features)
if func_features:
self.parse_features_for_tree(self.new_parent_node(self, ("Function/Basic Block Scope",)), func_features)
def parse_features_for_tree(self, parent, features):
""" """
self.parent_items = {}
def format_address(e):
return "%X" % e if e else ""
def format_feature(feature):
""" """
name = feature.name.lower()
value = feature.get_value_str()
if isinstance(feature, (capa.features.String,)):
value = '"%s"' % capa.features.escape_string(value)
return "%s(%s)" % (name, value)
for (feature, eas) in sorted(features.items(), key=lambda k: sorted(k[1])):
if isinstance(feature, capa.features.basicblock.BasicBlock):
# filter basic blocks for now, we may want to add these back in some time
# in the future
continue
# level 0
if type(feature) not in self.parent_items:
self.parent_items[type(feature)] = self.new_parent_node(parent, (feature.name.lower(),))
# level 1
if feature not in self.parent_items:
if len(eas) > 1:
self.parent_items[feature] = self.new_parent_node(
self.parent_items[type(feature)], (format_feature(feature),), feature=feature
)
else:
self.parent_items[feature] = self.new_leaf_node(
self.parent_items[type(feature)], (format_feature(feature),), feature=feature
)
# level n > 1
if len(eas) > 1:
for ea in sorted(eas):
self.new_leaf_node(
self.parent_items[feature], (format_feature(feature), format_address(ea)), feature=feature
)
else:
ea = eas.pop()
for (i, v) in enumerate((format_feature(feature), format_address(ea))):
self.parent_items[feature].setText(i, v)
self.parent_items[feature].setData(0, 0x100, feature)
class CapaExplorerQtreeView(QtWidgets.QTreeView):
"""tree view used to display hierarchical capa results

@@ -32,7 +32,9 @@ import capa.features.extractors
from capa.helpers import oint, get_file_taste
RULES_PATH_DEFAULT_STRING = "(embedded rules)"
SUPPORTED_FILE_MAGIC = set(["MZ"])
SUPPORTED_FILE_MAGIC = set([b"MZ"])
BACKEND_VIV = "vivisect"
BACKEND_SMDA = "smda"
logger = logging.getLogger("capa")
@@ -40,8 +42,11 @@ logger = logging.getLogger("capa")
def set_vivisect_log_level(level):
logging.getLogger("vivisect").setLevel(level)
logging.getLogger("vivisect.base").setLevel(level)
logging.getLogger("vivisect.impemu").setLevel(level)
logging.getLogger("vtrace").setLevel(level)
logging.getLogger("envi").setLevel(level)
logging.getLogger("envi.codeflow").setLevel(level)
def find_function_capabilities(ruleset, extractor, f):
@@ -112,7 +117,13 @@ def find_capabilities(ruleset, extractor, disable_progress=None):
}
}
for f in tqdm.tqdm(list(extractor.get_functions()), disable=disable_progress, desc="matching", unit=" functions"):
pbar = tqdm.tqdm
if disable_progress:
# do not use tqdm to avoid unnecessary side effects when caller intends
# to disable progress completely
pbar = lambda s, *args, **kwargs: s
for f in pbar(list(extractor.get_functions()), desc="matching", unit=" functions"):
function_matches, bb_matches, feature_count = find_function_capabilities(ruleset, extractor, f)
meta["feature_counts"]["functions"][f.__int__()] = feature_count
logger.debug("analyzed function 0x%x and extracted %d features", f.__int__(), feature_count)
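The change in this hunk swaps `tqdm.tqdm` for a pass-through callable when progress is disabled, so the loop body stays the same in both cases. A minimal sketch of that pattern, assuming nothing about tqdm itself: `make_progress` and `recording_progress` are hypothetical names, and `recording_progress` merely stands in for a real progress-bar factory.

```python
def make_progress(disable_progress, progress_factory):
    """return progress_factory, or a no-op wrapper when progress is disabled"""
    if disable_progress:
        # accept and ignore the same keyword arguments a progress bar would take
        return lambda s, *args, **kwargs: s
    return progress_factory


calls = []


def recording_progress(iterable, **kwargs):
    # hypothetical stand-in for a progress bar factory such as tqdm.tqdm
    calls.append(kwargs)
    return iterable


functions = [0x401000, 0x401020]

# progress disabled: the factory is never touched
pbar = make_progress(True, recording_progress)
seen_quiet = list(pbar(functions, desc="matching", unit=" functions"))
calls_when_disabled = list(calls)

# progress enabled: the factory receives the display options
pbar = make_progress(False, recording_progress)
seen_loud = list(pbar(functions, desc="matching", unit=" functions"))
```

Because the disabled path never calls the factory, any side effects of constructing a progress bar are avoided entirely, which is the stated motivation in the hunk's comment.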
@@ -271,6 +282,8 @@ def get_workspace(path, format, should_save=True):
vw = get_shellcode_vw(path, arch="i386", should_save=should_save)
elif format == "sc64":
vw = get_shellcode_vw(path, arch="amd64", should_save=should_save)
else:
raise ValueError("unexpected format: " + format)
logger.debug("%s", get_meta_str(vw))
return vw
@@ -294,17 +307,43 @@ class UnsupportedRuntimeError(RuntimeError):
pass
def get_extractor_py3(path, format, disable_progress=False):
raise UnsupportedRuntimeError()
def get_extractor_py3(path, format, backend, disable_progress=False):
if backend == "smda":
from smda.SmdaConfig import SmdaConfig
from smda.Disassembler import Disassembler
import capa.features.extractors.smda
smda_report = None
with halo.Halo(text="analyzing program", spinner="simpleDots", stream=sys.stderr, enabled=not disable_progress):
config = SmdaConfig()
config.STORE_BUFFER = True
smda_disasm = Disassembler(config)
smda_report = smda_disasm.disassembleFile(path)
return capa.features.extractors.smda.SmdaFeatureExtractor(smda_report, path)
else:
import capa.features.extractors.viv
with halo.Halo(text="analyzing program", spinner="simpleDots", stream=sys.stderr, enabled=not disable_progress):
vw = get_workspace(path, format, should_save=False)
try:
vw.saveWorkspace()
except IOError:
# see #168 for discussion around how to handle non-writable directories
logger.info("source directory is not writable, won't save intermediate workspace")
return capa.features.extractors.viv.VivisectFeatureExtractor(vw, path)
def get_extractor(path, format, disable_progress=False):
def get_extractor(path, format, backend, disable_progress=False):
"""
raises:
UnsupportedFormatError:
"""
if sys.version_info >= (3, 0):
return get_extractor_py3(path, format, disable_progress=disable_progress)
return get_extractor_py3(path, format, backend, disable_progress=disable_progress)
else:
return get_extractor_py2(path, format, disable_progress=disable_progress)
@@ -340,8 +379,8 @@ def get_rules(rule_path, disable_progress=False):
for file in files:
if not file.endswith(".yml"):
if not (file.endswith(".md") or file.endswith(".git") or file.endswith(".txt")):
# expect to see readme.md, format.md, and maybe a .git directory
if not (file.startswith(".git") or file.endswith((".git", ".md", ".txt"))):
# expect to see .git* files, readme.md, format.md, and maybe a .git directory
# other things may be rules, but are mis-named.
logger.warning("skipping non-.yml file: %s", file)
continue
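The updated condition decides which non-`.yml` files are skipped silently and which produce a warning. A small sketch of that predicate in isolation follows; `should_warn_about` is a hypothetical name and the file names are made up for illustration.

```python
def should_warn_about(file):
    """return True for non-.yml files that are not expected repository clutter"""
    if file.endswith(".yml"):
        # real rule files are loaded, never warned about
        return False
    # .git* files, .git directories, and .md/.txt documents are expected
    return not (file.startswith(".git") or file.endswith((".git", ".md", ".txt")))


names = ["create-file.yml", ".gitignore", "readme.md", "notes.txt", "create-file.yaml"]
warned = [name for name in names if should_warn_about(name)]
```

Passing a tuple to `str.endswith` checks all suffixes in one call, which is what lets the new condition replace the chained `or` of the old one.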
@@ -351,7 +390,13 @@ def get_rules(rule_path, disable_progress=False):
rules = []
for rule_path in tqdm.tqdm(list(rule_paths), disable=disable_progress, desc="loading ", unit=" rules"):
pbar = tqdm.tqdm
if disable_progress:
# do not use tqdm to avoid unnecessary side effects when caller intends
# to disable progress completely
pbar = lambda s, *args, **kwargs: s
for rule_path in pbar(list(rule_paths), desc="loading ", unit=" rules"):
try:
rule = capa.rules.Rule.from_yaml_file(rule_path)
except capa.rules.InvalidRule:
@@ -401,19 +446,162 @@ def collect_metadata(argv, sample_path, rules_path, format, extractor):
}
def install_common_args(parser, wanted=None):
"""
register a common set of command line arguments for re-use by main & scripts.
these are things like logging/coloring/etc.
also enable callers to opt-in to common arguments, like specifying the input sample.
this routine lets many scripts use the same language for cli arguments.
see `handle_common_args` to do common configuration.
args:
parser (argparse.ArgumentParser): a parser to update in place, adding common arguments.
wanted (Set[str]): collection of arguments to opt-into, including:
- "sample": required positional argument to input file.
- "format": flag to override file format.
- "backend": flag to override analysis backend under py3.
- "rules": flag to override path to capa rules.
- "tag": flag to override/specify which rules to match.
"""
if wanted is None:
wanted = set()
#
# common arguments that all scripts will have
#
parser.add_argument("--version", action="version", version="%(prog)s {:s}".format(capa.version.__version__))
parser.add_argument(
"-v", "--verbose", action="store_true", help="enable verbose result document (no effect with --json)"
)
parser.add_argument(
"-vv", "--vverbose", action="store_true", help="enable very verbose result document (no effect with --json)"
)
parser.add_argument("-d", "--debug", action="store_true", help="enable debugging output on STDERR")
parser.add_argument("-q", "--quiet", action="store_true", help="disable all output but errors")
parser.add_argument(
"--color",
type=str,
choices=("auto", "always", "never"),
default="auto",
help="enable ANSI color codes in results, default: only during interactive session",
)
#
# arguments that may be opted into:
#
# - sample
# - format
# - backend
# - rules
# - tag
#
if "sample" in wanted:
if sys.version_info >= (3, 0):
parser.add_argument(
# Python 3 str handles non-ASCII arguments correctly
"sample",
type=str,
help="path to sample to analyze",
)
else:
parser.add_argument(
# in #328 we noticed that the sample path is not handled correctly if it contains non-ASCII characters
# https://stackoverflow.com/a/22947334/ offers a solution and decoding using getfilesystemencoding works
# in our testing, however other sources suggest `sys.stdin.encoding` (https://stackoverflow.com/q/4012571/)
"sample",
type=lambda s: s.decode(sys.getfilesystemencoding()),
help="path to sample to analyze",
)
if "format" in wanted:
formats = [
("auto", "(default) detect file type automatically"),
("pe", "Windows PE file"),
("sc32", "32-bit shellcode"),
("sc64", "64-bit shellcode"),
("freeze", "features previously frozen by capa"),
]
format_help = ", ".join(["%s: %s" % (f[0], f[1]) for f in formats])
parser.add_argument(
"-f",
"--format",
choices=[f[0] for f in formats],
default="auto",
help="select sample format, %s" % format_help,
)
if "backend" in wanted and sys.version_info >= (3, 0):
parser.add_argument(
"-b",
"--backend",
type=str,
help="select the backend to use",
choices=(BACKEND_VIV, BACKEND_SMDA),
default=BACKEND_VIV,
)
if "rules" in wanted:
parser.add_argument(
"-r",
"--rules",
type=str,
default=RULES_PATH_DEFAULT_STRING,
help="path to rule file or directory, use embedded rules by default",
)
if "tag" in wanted:
parser.add_argument("-t", "--tag", type=str, help="filter on rule meta field values")
def handle_common_args(args):
"""
handle the global config specified by `install_common_args`,
such as configuring logging/coloring/etc.
args:
args (argparse.Namespace): parsed arguments that included at least `install_common_args` args.
"""
if args.quiet:
logging.basicConfig(level=logging.WARNING)
logging.getLogger().setLevel(logging.WARNING)
elif args.debug:
logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
else:
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
# disable vivisect-related logging, it's verbose and not relevant for capa users
set_vivisect_log_level(logging.CRITICAL)
# py2 doesn't know about cp65001, which is a variant of utf-8 on windows
# tqdm bails when trying to render the progress bar in this setup.
# because cp65001 is utf-8, we just map that codepage to the utf-8 codec.
# see #380 and: https://stackoverflow.com/a/3259271/87207
import codecs
codecs.register(lambda name: codecs.lookup("utf-8") if name == "cp65001" else None)
if args.color == "always":
colorama.init(strip=False)
elif args.color == "auto":
# colorama will detect:
# - when on Windows console, and fixup coloring, and
# - when not an interactive session, and disable coloring
# renderers should use coloring and assume it will be stripped out if necessary.
colorama.init()
elif args.color == "never":
colorama.init(strip=True)
else:
raise RuntimeError("unexpected --color value: " + args.color)
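The opt-in design of `install_common_args` can be illustrated with a trimmed-down analogue that uses only `argparse`. This is a sketch under stated assumptions: `install_args` is a hypothetical name, it registers only a few of the arguments shown above, and it does not import capa.

```python
import argparse


def install_args(parser, wanted=None):
    """hypothetical, trimmed-down analogue of install_common_args"""
    if wanted is None:
        wanted = set()
    # always registered, regardless of what the caller opts into
    parser.add_argument("-q", "--quiet", action="store_true", help="disable all output but errors")
    # registered only when the caller opts in
    if "sample" in wanted:
        parser.add_argument("sample", type=str, help="path to sample to analyze")
    if "rules" in wanted:
        parser.add_argument("-r", "--rules", type=str, default="(embedded rules)")


# a script that analyzes a sample opts into both optional arguments
full = argparse.ArgumentParser()
install_args(full, {"sample", "rules"})
full_args = full.parse_args(["-q", "sample.exe_"])

# a script that needs only the shared flags opts into nothing
minimal = argparse.ArgumentParser()
install_args(minimal)
minimal_args = minimal.parse_args([])
```

A script that does not opt into an argument never sees the corresponding attribute on the parsed namespace, so shared handling code should only read the attributes that are always installed.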
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
formats = [
("auto", "(default) detect file type automatically"),
("pe", "Windows PE file"),
("sc32", "32-bit shellcode"),
("sc64", "64-bit shellcode"),
("freeze", "features previously frozen by capa"),
]
format_help = ", ".join(["%s: %s" % (f[0], f[1]) for f in formats])
desc = "The FLARE team's open-source tool to identify capabilities in executable files."
epilog = textwrap.dedent(
"""
@@ -446,56 +634,10 @@ def main(argv=None):
parser = argparse.ArgumentParser(
description=desc, epilog=epilog, formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument(
# in #328 we noticed that the sample path is not handled correctly if it contains non-ASCII characters
# https://stackoverflow.com/a/22947334/ offers a solution and decoding using getfilesystemencoding works
# in our testing, however other sources suggest `sys.stdin.encoding` (https://stackoverflow.com/q/4012571/)
"sample",
type=lambda s: s.decode(sys.getfilesystemencoding()),
help="path to sample to analyze",
)
parser.add_argument("--version", action="version", version="%(prog)s {:s}".format(capa.version.__version__))
parser.add_argument(
"-r",
"--rules",
type=str,
default=RULES_PATH_DEFAULT_STRING,
help="path to rule file or directory, use embedded rules by default",
)
parser.add_argument(
"-f", "--format", choices=[f[0] for f in formats], default="auto", help="select sample format, %s" % format_help
)
parser.add_argument("-t", "--tag", type=str, help="filter on rule meta field values")
install_common_args(parser, {"sample", "format", "backend", "rules", "tag"})
parser.add_argument("-j", "--json", action="store_true", help="emit JSON instead of text")
parser.add_argument(
"-v", "--verbose", action="store_true", help="enable verbose result document (no effect with --json)"
)
parser.add_argument(
"-vv", "--vverbose", action="store_true", help="enable very verbose result document (no effect with --json)"
)
parser.add_argument("-d", "--debug", action="store_true", help="enable debugging output on STDERR")
parser.add_argument("-q", "--quiet", action="store_true", help="disable all output but errors")
parser.add_argument(
"--color",
type=str,
choices=("auto", "always", "never"),
default="auto",
help="enable ANSI color codes in results, default: only during interactive session",
)
args = parser.parse_args(args=argv)
if args.quiet:
logging.basicConfig(level=logging.WARNING)
logging.getLogger().setLevel(logging.WARNING)
elif args.debug:
logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
else:
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
# disable vivisect-related logging, it's verbose and not relevant for capa users
set_vivisect_log_level(logging.CRITICAL)
handle_common_args(args)
try:
taste = get_file_taste(args.sample)
@@ -505,14 +647,6 @@ def main(argv=None):
logger.error("%s", e.args[0])
return -1
# py2 doesn't know about cp65001, which is a variant of utf-8 on windows
# tqdm bails when trying to render the progress bar in this setup.
# because cp65001 is utf-8, we just map that codepage to the utf-8 codec.
# see #380 and: https://stackoverflow.com/a/3259271/87207
import codecs
codecs.register(lambda name: codecs.lookup("utf-8") if name == "cp65001" else None)
if args.rules == RULES_PATH_DEFAULT_STRING:
logger.debug("-" * 80)
logger.debug(" Using default embedded rules.")
@@ -550,7 +684,7 @@ def main(argv=None):
# during the load of the RuleSet, we extract subscope statements into their own rules
# that are subsequently `match`ed upon. this inflates the total rule count.
# so, filter out the subscope rules when reporting total number of loaded rules.
len(filter(lambda r: "capa/subscope-rule" not in r.meta, rules.rules.values())),
len([i for i in filter(lambda r: "capa/subscope-rule" not in r.meta, rules.rules.values())]),
)
if args.tag:
rules = rules.filter_rules_by_meta(args.tag)
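
Note on the `len(filter(...))` change in this hunk: in Python 3, `filter` returns a lazy iterator, which has no `len()`, so the result has to be materialized or counted another way. A minimal sketch of an equivalent count follows, assuming rules are stored in a dict whose values expose a `meta` mapping. `FakeRule`, `count_user_rules` and the sample data are hypothetical stand-ins for illustration, not capa's API.

```python
class FakeRule:
    """hypothetical stand-in for a rule object that carries a `meta` dict"""

    def __init__(self, name, meta):
        self.name = name
        self.meta = meta


def count_user_rules(rules_by_name):
    """count the rules that are not auto-generated subscope rules"""
    # summing over a generator avoids building an intermediate list
    # and behaves the same on Python 2 and Python 3.
    return sum(1 for r in rules_by_name.values() if "capa/subscope-rule" not in r.meta)


rules_by_name = {
    "a": FakeRule("a", {"name": "a"}),
    "b": FakeRule("b", {"name": "b", "capa/subscope-rule": True}),
    "c": FakeRule("c", {"name": "c"}),
}
user_rule_count = count_user_rules(rules_by_name)
```

The list-comprehension form used in the hunk gives the same number; the generator form only avoids the temporary list.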
@@ -569,7 +703,8 @@ def main(argv=None):
else:
format = args.format
try:
extractor = get_extractor(args.sample, args.format, disable_progress=args.quiet)
backend = args.backend if sys.version_info > (3, 0) else BACKEND_VIV
extractor = get_extractor(args.sample, args.format, backend, disable_progress=args.quiet)
except UnsupportedFormatError:
logger.error("-" * 80)
logger.error(" Input file does not appear to be a PE file.")
@@ -602,19 +737,6 @@ def main(argv=None):
if not (args.verbose or args.vverbose or args.json):
return -1
if args.color == "always":
colorama.init(strip=False)
elif args.color == "auto":
# colorama will detect:
# - when on Windows console, and fixup coloring, and
# - when not an interactive session, and disable coloring
# renderers should use coloring and assume it will be stripped out if necessary.
colorama.init()
elif args.color == "never":
colorama.init(strip=True)
else:
raise RuntimeError("unexpected --color value: " + args.color)
if args.json:
print(capa.render.render_json(meta, rules, capabilities))
elif args.vverbose:
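
Note on the refactoring in this file: the logging, codec and color setup moves out of `main()` so that other scripts can reuse it through `install_common_args` and `handle_common_args`. The pattern can be sketched roughly as below. This is an illustration only: `install_shared_flags` and `pick_log_level` are hypothetical names, and the real helpers in `capa.main` take care of more options than shown here.

```python
import argparse
import logging


def install_shared_flags(parser, wanted=None):
    """hypothetical helper: add the flags shared by several scripts to `parser`"""
    wanted = wanted or set()
    parser.add_argument("-d", "--debug", action="store_true", help="enable debugging output on STDERR")
    parser.add_argument("-q", "--quiet", action="store_true", help="disable all output but errors")
    # optional flags are only installed when the calling script asks for them
    if "tag" in wanted:
        parser.add_argument("-t", "--tag", type=str, help="filter on rule meta field values")


def pick_log_level(args):
    """map the parsed flags to a logging level, mirroring the quiet/debug/else branches"""
    if args.quiet:
        return logging.WARNING
    elif args.debug:
        return logging.DEBUG
    return logging.INFO


parser = argparse.ArgumentParser(description="demo")
install_shared_flags(parser, wanted={"tag"})
args = parser.parse_args(["-d", "--tag", "t1"])
level = pick_log_level(args)
```

Keeping the flag definitions and their handling in two small functions lets each script opt in to only the arguments it needs.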


@@ -56,7 +56,11 @@ def render_statement(ostream, match, statement, indent=0):
child = statement["child"]
if child[child["type"]]:
value = rutils.bold2(child[child["type"]])
if child["type"] == "string":
value = '"%s"' % capa.features.escape_string(child[child["type"]])
else:
value = child[child["type"]]
value = rutils.bold2(value)
if child.get("description"):
ostream.write("count(%s(%s = %s)): " % (child["type"], value, child["description"]))
else:
@@ -90,6 +94,9 @@ def render_feature(ostream, match, feature, indent=0):
key = "string" # render string for regex to mirror the rule source
value = feature["match"] # the match provides more information than the value for regex
if key == "string":
value = '"%s"' % capa.features.escape_string(value)
ostream.write(key)
ostream.write(": ")


@@ -6,6 +6,7 @@
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import re
import uuid
import codecs
import logging
@@ -600,6 +601,9 @@ class Rule(object):
# use block mode, not inline json-like mode
y.default_flow_style = False
# leave quotes unchanged
y.preserve_quotes = True
# indent lists by two spaces below their parent
#
# features:
@@ -614,16 +618,20 @@ class Rule(object):
return y
@classmethod
def from_yaml(cls, s):
# use pyyaml because it can be much faster than ruamel (pure python)
doc = yaml.load(s, Loader=cls._get_yaml_loader())
def from_yaml(cls, s, use_ruamel=False):
if use_ruamel:
# ruamel enables nice formatting and doc roundtripping with comments
doc = cls._get_ruamel_yaml_parser().load(s)
else:
# use pyyaml because it can be much faster than ruamel (pure python)
doc = yaml.load(s, Loader=cls._get_yaml_loader())
return cls.from_dict(doc, s)
@classmethod
def from_yaml_file(cls, path):
def from_yaml_file(cls, path, use_ruamel=False):
with open(path, "rb") as f:
try:
return cls.from_yaml(f.read().decode("utf-8"))
return cls.from_yaml(f.read().decode("utf-8"), use_ruamel=use_ruamel)
except InvalidRule as e:
raise InvalidRuleWithPath(path, str(e))
@@ -716,7 +724,20 @@ class Rule(object):
# tweaking `ruamel.indent()` doesn't quite give us the control we want.
# so, add the two extra spaces that we've determined we need through experimentation.
# see #263
doc = doc.replace(" description:", " description:")
# only do this for the features section, so the meta description doesn't get reformatted
# assumes features section always exists
features_offset = doc.find("features")
doc = doc[:features_offset] + doc[features_offset:].replace(" description:", " description:")
# for negative hex numbers, yaml dump outputs:
# - offset: !!int '0x-30'
# we prefer:
# - offset: -0x30
# the below regex makes these adjustments and while ugly, we don't have to explore the ruamel.yaml insides
doc = re.sub(r"!!int '0x-([0-9a-fA-F]+)'", r"-0x\1", doc)
# normalize CRLF to LF
doc = doc.replace("\r\n", "\n")
return doc
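
Note on the post-processing in this hunk: the negative-hex rewrite and the CRLF normalization can be tried in isolation. The sketch below reuses the regular expression from the hunk; the function name `fix_negative_hex` and the sample document are assumptions made for illustration.

```python
import re

# same pattern as in the hunk: matches the tagged form that the YAML dump emits
NEGATIVE_HEX = re.compile(r"!!int '0x-([0-9a-fA-F]+)'")


def fix_negative_hex(doc):
    """rewrite `!!int '0x-30'` as `-0x30`, leaving other content untouched"""
    return NEGATIVE_HEX.sub(r"-0x\1", doc)


dumped = "- offset: !!int '0x-30'\r\n- number: 0x10\r\n"
# normalize CRLF to LF after the substitution, as the hunk does
fixed = fix_negative_hex(dumped).replace("\r\n", "\n")
```

Only the tagged negative values are touched; positive hex numbers such as `0x10` pass through unchanged.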
@@ -866,7 +887,8 @@ class RuleSet(object):
given a collection of rules, collect the rules that are needed at the given scope.
these rules are ordered topologically.
don't include "lib" rules, unless they are dependencies of other rules.
don't include auto-generated "subscope" rules.
we want to include general "lib" rules here - even if they are not dependencies of other rules, see #398
"""
scope_rules = set([])
@@ -875,7 +897,7 @@ class RuleSet(object):
# at lower scope, e.g. function scope.
# so, we find all dependencies of all rules, and later will filter them down.
for rule in rules:
if rule.meta.get("lib", False):
if rule.meta.get("capa/subscope-rule", False):
continue
scope_rules.update(get_rules_and_dependencies(rules, rule.name))


@@ -1 +1 @@
__version__ = "1.4.0"
__version__ = "1.6.1"

BIN
doc/img/changelog/tab.gif Normal file

Binary files not shown.

After: 136 KiB, 141 KiB, 322 KiB
Before: 84 KiB, 173 KiB, 3.4 MiB
After: 135 KiB


@@ -74,8 +74,20 @@ Note that some development dependencies (including the black code formatter) req
To check the code style, formatting and run the tests you can run the script `scripts/ci.sh`.
You can run it with the argument `no_tests` to skip the tests and only run the code style and formatting: `scripts/ci.sh no_tests`
### 3. Setup hooks [optional]
### 3. Compile binary using PyInstaller
We compile capa standalone binaries using PyInstaller. To reproduce the build process check out the source code as described above and follow these steps.
#### Install PyInstaller:
For Python 2.7: `$ pip install 'pyinstaller==3.*'` (PyInstaller 4 doesn't support Python 2.7)
For Python 3: `$ pip install pyinstaller`
#### Run PyInstaller
`$ pyinstaller .github/pyinstaller/pyinstaller.spec`
You can find the compiled binary in the created directory `dist/`.
### 4. Setup hooks [optional]
If you plan to contribute to capa, you may want to set up the hooks.
Run `scripts/setup-hooks.sh` to set the following hooks up:
- The `pre-commit` hook runs checks before every `git commit`.
@@ -84,4 +96,3 @@ Run `scripts/setup-hooks.sh` to set the following hooks up:
- The `pre-push` hook runs checks before every `git push`.
It runs `scripts/ci.sh` aborting the push if there are code style or rule linter offenses or if the tests fail.
This way you can ensure everything is alright before sending a pull request.

44
doc/release.md Normal file

@@ -0,0 +1,44 @@
# Release checklist
- [ ] Ensure all [milestoned issues/PRs](https://github.com/fireeye/capa/milestones) are addressed, or reassign to a new milestone.
- [ ] Add the `dont merge` label to all PRs that are close to being ready to merge (or merge them if they are ready) in [capa](https://github.com/fireeye/capa/pulls) and [capa-rules](https://github.com/fireeye/capa-rules/pulls).
- [ ] Ensure the [CI workflow succeeds in master](https://github.com/fireeye/capa/actions/workflows/tests.yml?query=branch%3Amaster).
- [ ] Ensure that `python scripts/lint.py rules/ --thorough` succeeds (only `missing examples` offenses are allowed in the nursery).
- [ ] Review changes
- capa https://github.com/fireeye/capa/compare/\<last-release\>...master
- capa-rules https://github.com/fireeye/capa-rules/compare/\<last-release\>...master
- [ ] Update [CHANGELOG.md](https://github.com/fireeye/capa/blob/master/CHANGELOG.md)
- Do not forget to add a nice introduction thanking contributors
- Remember that we need a major release if we introduce breaking changes
- Sections
- New Features
- New Rules
- Bug Fixes
- Changes
- Development
- Raw diffs
- Update `Raw diffs` links
- Create placeholder for `master (unreleased)` section
```
## master (unreleased)
### New Features
### New Rules
### Bug Fixes
### Changes
### Development
### Raw diffs
- [capa <release>...master](https://github.com/fireeye/capa/compare/<release>...master)
- [capa-rules <release>...master](https://github.com/fireeye/capa-rules/compare/<release>...master)
```
- [ ] Update [capa/version.py](https://github.com/fireeye/capa/blob/master/capa/version.py)
- [ ] Create a PR with the updated [CHANGELOG.md](https://github.com/fireeye/capa/blob/master/CHANGELOG.md) and [capa/version.py](https://github.com/fireeye/capa/blob/master/capa/version.py). Copy this checklist in the PR description.
- [ ] After PR review, merge the PR and [create the release in GH](https://github.com/fireeye/capa/releases/new) using text from the [CHANGELOG.md](https://github.com/fireeye/capa/blob/master/CHANGELOG.md).
- [ ] Verify GH actions [upload artifacts](https://github.com/fireeye/capa/releases), [publish to PyPI](https://pypi.org/project/flare-capa) and [create a tag in capa rules](https://github.com/fireeye/capa-rules/tags) upon completion.
- [ ] [Spread the word](https://twitter.com)

2
rules

Submodule rules updated: 6830d707c7...93ea28dd32


@@ -1,247 +1,220 @@
#!/usr/bin/env python
"""
bulk-process
Invoke capa recursively against a directory of samples
and emit a JSON document mapping the file paths to their results.
By default, this will use subprocesses for parallelism.
Use `-n/--parallelism` to change the subprocess count from
the default of current CPU count.
Use `--no-mp` to use threads instead of processes,
which is probably not useful unless you set `--parallelism=1`.
example:
$ python scripts/bulk-process /tmp/suspicious
{
"/tmp/suspicious/suspicious.dll_": {
"rules": {
"encode data using XOR": {
"matches": {
"268440358": {
[...]
"/tmp/suspicious/1.dll_": { ... }
"/tmp/suspicious/2.dll_": { ... }
}
usage:
usage: bulk-process.py [-h] [-r RULES] [-d] [-q] [-n PARALLELISM] [--no-mp]
input
detect capabilities in programs.
positional arguments:
input Path to directory of files to recursively analyze
optional arguments:
-h, --help show this help message and exit
-r RULES, --rules RULES
Path to rule file or directory, use embedded rules by
default
-d, --debug Enable debugging output on STDERR
-q, --quiet Disable all output but errors
-n PARALLELISM, --parallelism PARALLELISM
parallelism factor
--no-mp disable subprocesses
Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at: [package root]/LICENSE.txt
Unless required by applicable law or agreed to in writing, software distributed under the License
is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.
"""
import sys
import json
import logging
import os.path
import argparse
import multiprocessing
import multiprocessing.pool
import capa
import capa.main
import capa.render
logger = logging.getLogger("capa")
def get_capa_results(args):
"""
run capa against the file at the given path, using the given rules.
args is a tuple, containing:
rules (capa.rules.RuleSet): the rules to match
format (str): the name of the sample file format
path (str): the file system path to the sample to process
args is a tuple because i'm not quite sure how to unpack multiple arguments using `map`.
returns a dict with two required keys:
path (str): the file system path of the sample to process
status (str): either "error" or "ok"
when status == "error", then a human readable message is found in property "error".
when status == "ok", then the capa results are found in the property "ok".
the capa results are a dictionary with the following keys:
meta (dict): the meta analysis results
capabilities (dict): the matched capabilities and their result objects
"""
rules, format, path = args
logger.info("computing capa results for: %s", path)
try:
extractor = capa.main.get_extractor(path, format, disable_progress=True)
except capa.main.UnsupportedFormatError:
# i'm not 100% sure if multiprocessing will reliably raise exceptions across process boundaries.
# so instead, return an object with explicit success/failure status.
#
# if success, then status=ok, and results found in property "ok"
# if error, then status=error, and human readable message in property "error"
return {
"path": path,
"status": "error",
"error": "input file does not appear to be a PE file: %s" % path,
}
except capa.main.UnsupportedRuntimeError:
return {
"path": path,
"status": "error",
"error": "unsupported runtime or Python interpreter",
}
except Exception as e:
return {
"path": path,
"status": "error",
"error": "unexpected error: %s" % (e),
}
meta = capa.main.collect_metadata("", path, "", format, extractor)
capabilities, counts = capa.main.find_capabilities(rules, extractor, disable_progress=True)
meta["analysis"].update(counts)
return {
"path": path,
"status": "ok",
"ok": {
"meta": meta,
"capabilities": capabilities,
},
}
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
parser = argparse.ArgumentParser(description="detect capabilities in programs.")
parser.add_argument("input", type=str, help="Path to directory of files to recursively analyze")
parser.add_argument(
"-r",
"--rules",
type=str,
default="(embedded rules)",
help="Path to rule file or directory, use embedded rules by default",
)
parser.add_argument("-d", "--debug", action="store_true", help="Enable debugging output on STDERR")
parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
parser.add_argument(
"-n", "--parallelism", type=int, default=multiprocessing.cpu_count(), help="parallelism factor"
)
parser.add_argument("--no-mp", action="store_true", help="disable subprocesses")
args = parser.parse_args(args=argv)
if args.quiet:
logging.basicConfig(level=logging.ERROR)
logging.getLogger().setLevel(logging.ERROR)
elif args.debug:
logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
else:
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
# disable vivisect-related logging, it's verbose and not relevant for capa users
capa.main.set_vivisect_log_level(logging.CRITICAL)
# py2 doesn't know about cp65001, which is a variant of utf-8 on windows
# tqdm bails when trying to render the progress bar in this setup.
# because cp65001 is utf-8, we just map that codepage to the utf-8 codec.
# see #380 and: https://stackoverflow.com/a/3259271/87207
import codecs
codecs.register(lambda name: codecs.lookup("utf-8") if name == "cp65001" else None)
if args.rules == "(embedded rules)":
logger.info("using default embedded rules")
logger.debug("detected running from source")
args.rules = os.path.join(os.path.dirname(__file__), "..", "rules")
logger.debug("default rule path (source method): %s", args.rules)
else:
logger.info("using rules path: %s", args.rules)
try:
rules = capa.main.get_rules(args.rules)
rules = capa.rules.RuleSet(rules)
logger.info("successfully loaded %s rules", len(rules))
except (IOError, capa.rules.InvalidRule, capa.rules.InvalidRuleSet) as e:
logger.error("%s", str(e))
return -1
samples = []
for (base, directories, files) in os.walk(args.input):
for file in files:
samples.append(os.path.join(base, file))
def pmap(f, args, parallelism=multiprocessing.cpu_count()):
"""apply the given function f to the given args using subprocesses"""
return multiprocessing.Pool(parallelism).imap(f, args)
def tmap(f, args, parallelism=multiprocessing.cpu_count()):
"""apply the given function f to the given args using threads"""
return multiprocessing.pool.ThreadPool(parallelism).imap(f, args)
def map(f, args, parallelism=None):
"""apply the given function f to the given args in the current thread"""
for arg in args:
yield f(arg)
if args.no_mp:
if args.parallelism == 1:
logger.debug("using current thread mapper")
mapper = map
else:
logger.debug("using threading mapper")
mapper = tmap
else:
logger.debug("using process mapper")
mapper = pmap
results = {}
for result in mapper(
get_capa_results, [(rules, "pe", sample) for sample in samples], parallelism=args.parallelism
):
if result["status"] == "error":
logger.warning(result["error"])
elif result["status"] == "ok":
meta = result["ok"]["meta"]
capabilities = result["ok"]["capabilities"]
# our renderer expects to emit a json document for a single sample
# so we deserialize the json document, store it in a larger dict, and we'll subsequently re-encode.
results[result["path"]] = json.loads(capa.render.render_json(meta, rules, capabilities))
else:
raise ValueError("unexpected status: %s" % (result["status"]))
print(json.dumps(results))
logger.info("done.")
return 0
if __name__ == "__main__":
sys.exit(main())
#!/usr/bin/env python
"""
bulk-process
Invoke capa recursively against a directory of samples
and emit a JSON document mapping the file paths to their results.
By default, this will use subprocesses for parallelism.
Use `-n/--parallelism` to change the subprocess count from
the default of current CPU count.
Use `--no-mp` to use threads instead of processes,
which is probably not useful unless you set `--parallelism=1`.
example:
$ python scripts/bulk-process /tmp/suspicious
{
"/tmp/suspicious/suspicious.dll_": {
"rules": {
"encode data using XOR": {
"matches": {
"268440358": {
[...]
"/tmp/suspicious/1.dll_": { ... }
"/tmp/suspicious/2.dll_": { ... }
}
usage:
usage: bulk-process.py [-h] [-r RULES] [-d] [-q] [-n PARALLELISM] [--no-mp]
input
detect capabilities in programs.
positional arguments:
input Path to directory of files to recursively analyze
optional arguments:
-h, --help show this help message and exit
-r RULES, --rules RULES
Path to rule file or directory, use embedded rules by
default
-d, --debug Enable debugging output on STDERR
-q, --quiet Disable all output but errors
-n PARALLELISM, --parallelism PARALLELISM
parallelism factor
--no-mp disable subprocesses
Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at: [package root]/LICENSE.txt
Unless required by applicable law or agreed to in writing, software distributed under the License
is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.
"""
import sys
import json
import logging
import os.path
import argparse
import multiprocessing
import multiprocessing.pool
import capa
import capa.main
import capa.rules
import capa.render
logger = logging.getLogger("capa")
def get_capa_results(args):
"""
run capa against the file at the given path, using the given rules.
args is a tuple, containing:
rules (capa.rules.RuleSet): the rules to match
format (str): the name of the sample file format
path (str): the file system path to the sample to process
args is a tuple because i'm not quite sure how to unpack multiple arguments using `map`.
returns a dict with two required keys:
path (str): the file system path of the sample to process
status (str): either "error" or "ok"
when status == "error", then a human readable message is found in property "error".
when status == "ok", then the capa results are found in the property "ok".
the capa results are a dictionary with the following keys:
meta (dict): the meta analysis results
capabilities (dict): the matched capabilities and their result objects
"""
rules, format, path = args
logger.info("computing capa results for: %s", path)
try:
extractor = capa.main.get_extractor(path, format, capa.main.BACKEND_VIV, disable_progress=True)
except capa.main.UnsupportedFormatError:
# i'm not 100% sure if multiprocessing will reliably raise exceptions across process boundaries.
# so instead, return an object with explicit success/failure status.
#
# if success, then status=ok, and results found in property "ok"
# if error, then status=error, and human readable message in property "error"
return {
"path": path,
"status": "error",
"error": "input file does not appear to be a PE file: %s" % path,
}
except capa.main.UnsupportedRuntimeError:
return {
"path": path,
"status": "error",
"error": "unsupported runtime or Python interpreter",
}
except Exception as e:
return {
"path": path,
"status": "error",
"error": "unexpected error: %s" % (e),
}
meta = capa.main.collect_metadata("", path, "", format, extractor)
capabilities, counts = capa.main.find_capabilities(rules, extractor, disable_progress=True)
meta["analysis"].update(counts)
return {
"path": path,
"status": "ok",
"ok": {
"meta": meta,
"capabilities": capabilities,
},
}
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
parser = argparse.ArgumentParser(description="detect capabilities in programs.")
capa.main.install_common_args(parser, wanted={"rules"})
parser.add_argument("input", type=str, help="Path to directory of files to recursively analyze")
parser.add_argument(
"-n", "--parallelism", type=int, default=multiprocessing.cpu_count(), help="parallelism factor"
)
parser.add_argument("--no-mp", action="store_true", help="disable subprocesses")
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)
if args.rules == "(embedded rules)":
logger.info("using default embedded rules")
logger.debug("detected running from source")
args.rules = os.path.join(os.path.dirname(__file__), "..", "rules")
logger.debug("default rule path (source method): %s", args.rules)
else:
logger.info("using rules path: %s", args.rules)
try:
rules = capa.main.get_rules(args.rules)
rules = capa.rules.RuleSet(rules)
logger.info("successfully loaded %s rules", len(rules))
except (IOError, capa.rules.InvalidRule, capa.rules.InvalidRuleSet) as e:
logger.error("%s", str(e))
return -1
samples = []
for (base, directories, files) in os.walk(args.input):
for file in files:
samples.append(os.path.join(base, file))
def pmap(f, args, parallelism=multiprocessing.cpu_count()):
"""apply the given function f to the given args using subprocesses"""
return multiprocessing.Pool(parallelism).imap(f, args)
def tmap(f, args, parallelism=multiprocessing.cpu_count()):
"""apply the given function f to the given args using threads"""
return multiprocessing.pool.ThreadPool(parallelism).imap(f, args)
def map(f, args, parallelism=None):
"""apply the given function f to the given args in the current thread"""
for arg in args:
yield f(arg)
if args.no_mp:
if args.parallelism == 1:
logger.debug("using current thread mapper")
mapper = map
else:
logger.debug("using threading mapper")
mapper = tmap
else:
logger.debug("using process mapper")
mapper = pmap
results = {}
for result in mapper(
get_capa_results, [(rules, "pe", sample) for sample in samples], parallelism=args.parallelism
):
if result["status"] == "error":
logger.warning(result["error"])
elif result["status"] == "ok":
meta = result["ok"]["meta"]
capabilities = result["ok"]["capabilities"]
# our renderer expects to emit a json document for a single sample
# so we deserialize the json document, store it in a larger dict, and we'll subsequently re-encode.
results[result["path"]] = json.loads(capa.render.render_json(meta, rules, capabilities))
else:
raise ValueError("unexpected status: %s" % (result["status"]))
print(json.dumps(results))
logger.info("done.")
return 0
if __name__ == "__main__":
sys.exit(main())
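
Note on the worker pattern in this script: each job is passed as a single tuple, and the worker returns an explicit status dict instead of raising, so failures survive the trip back from the pool. A minimal sketch of that pattern with a thread pool follows. `process_one`, `run_all` and the integer-division "work" are hypothetical and stand in for the capa analysis.

```python
import multiprocessing.pool


def process_one(args):
    """hypothetical worker: takes one tuple, returns a status dict rather than raising"""
    divisor, path = args
    try:
        value = 100 // divisor
    except Exception as e:
        return {"path": path, "status": "error", "error": "unexpected error: %s" % (e)}
    return {"path": path, "status": "ok", "ok": value}


def run_all(jobs, parallelism=2):
    """run every job through the pool and split the outcomes by status"""
    results = {}
    errors = []
    pool = multiprocessing.pool.ThreadPool(parallelism)
    try:
        # imap yields results in the same order as the inputs
        for result in pool.imap(process_one, jobs):
            if result["status"] == "ok":
                results[result["path"]] = result["ok"]
            else:
                errors.append(result["path"])
    finally:
        pool.close()
        pool.join()
    return results, errors


results, errors = run_all([(5, "a.bin"), (0, "b.bin"), (4, "c.bin")])
```

Swapping `ThreadPool` for `multiprocessing.Pool` gives the subprocess variant, provided the worker and its arguments can be pickled.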

215
scripts/capa_as_library.py Normal file

@@ -0,0 +1,215 @@
#!/usr/bin/env python3
import json
import collections
import capa.main
import capa.rules
import capa.engine
import capa.render
import capa.features
import capa.render.utils as rutils
from capa.engine import *
from capa.render import convert_capabilities_to_result_document
# edit this to set the path for file to analyze and rule directory
RULES_PATH = "/tmp/capa/rules/"
# load rules from disk
rules = capa.main.get_rules(RULES_PATH, disable_progress=True)
rules = capa.rules.RuleSet(rules)
# == Render dictionary helpers
def render_meta(doc, ostream):
ostream["md5"] = doc["meta"]["sample"]["md5"]
ostream["sha1"] = doc["meta"]["sample"]["sha1"]
ostream["sha256"] = doc["meta"]["sample"]["sha256"]
ostream["path"] = doc["meta"]["sample"]["path"]
def find_subrule_matches(doc):
"""
collect the rule names that have been matched as a subrule match.
this way we can avoid displaying entries for things that are too specific.
"""
matches = set([])
def rec(node):
if not node["success"]:
# there's probably a bug here for rules that do `not: match: ...`
# but we don't have any examples of this yet
return
elif node["node"]["type"] == "statement":
for child in node["children"]:
rec(child)
elif node["node"]["type"] == "feature":
if node["node"]["feature"]["type"] == "match":
matches.add(node["node"]["feature"]["match"])
for rule in rutils.capability_rules(doc):
for node in rule["matches"].values():
rec(node)
return matches
def render_capabilities(doc, ostream):
"""
example::
{'CAPABILITY': {'accept command line arguments': 'host-interaction/cli',
'allocate thread local storage (2 matches)': 'host-interaction/process',
'check for time delay via GetTickCount': 'anti-analysis/anti-debugging/debugger-detection',
'check if process is running under wine': 'anti-analysis/anti-emulation/wine',
'contain a resource (.rsrc) section': 'executable/pe/section/rsrc',
'write file (3 matches)': 'host-interaction/file-system/write'}
}
"""
subrule_matches = find_subrule_matches(doc)
ostream["CAPABILITY"] = dict()
for rule in rutils.capability_rules(doc):
if rule["meta"]["name"] in subrule_matches:
# rules that are also matched by other rules should not get rendered by default.
# this cuts down on the amount of output while giving approx the same detail.
# see #224
continue
count = len(rule["matches"])
if count == 1:
capability = rule["meta"]["name"]
else:
capability = "%s (%d matches)" % (rule["meta"]["name"], count)
ostream["CAPABILITY"].setdefault(rule["meta"]["namespace"], list())
ostream["CAPABILITY"][rule["meta"]["namespace"]].append(capability)
def render_attack(doc, ostream):
"""
example::
{'ATT&CK': {'COLLECTION': ['Input Capture::Keylogging [T1056.001]'],
'DEFENSE EVASION': ['Obfuscated Files or Information [T1027]',
'Virtualization/Sandbox Evasion::System Checks '
'[T1497.001]'],
'DISCOVERY': ['File and Directory Discovery [T1083]',
'Query Registry [T1012]',
'System Information Discovery [T1082]'],
'EXECUTION': ['Shared Modules [T1129]']}
}
"""
ostream["ATTCK"] = dict()
tactics = collections.defaultdict(set)
for rule in rutils.capability_rules(doc):
if not rule["meta"].get("att&ck"):
continue
for attack in rule["meta"]["att&ck"]:
tactic, _, rest = attack.partition("::")
if "::" in rest:
technique, _, rest = rest.partition("::")
subtechnique, _, id = rest.rpartition(" ")
tactics[tactic].add((technique, subtechnique, id))
else:
technique, _, id = rest.rpartition(" ")
tactics[tactic].add((technique, id))
for tactic, techniques in sorted(tactics.items()):
inner_rows = []
for spec in sorted(techniques):
if len(spec) == 2:
technique, id = spec
inner_rows.append("%s %s" % (technique, id))
elif len(spec) == 3:
technique, subtechnique, id = spec
inner_rows.append("%s::%s %s" % (technique, subtechnique, id))
else:
raise RuntimeError("unexpected ATT&CK spec format")
ostream["ATTCK"].setdefault(tactic.upper(), inner_rows)
def render_mbc(doc, ostream):
"""
example::
{'MBC': {'ANTI-BEHAVIORAL ANALYSIS': ['Debugger Detection::Timing/Delay Check '
'GetTickCount [B0001.032]',
'Emulator Detection [B0004]',
'Virtual Machine Detection::Instruction '
'Testing [B0009.029]',
'Virtual Machine Detection [B0009]'],
'COLLECTION': ['Keylogging::Polling [F0002.002]'],
'CRYPTOGRAPHY': ['Encrypt Data::RC4 [C0027.009]',
'Generate Pseudo-random Sequence::RC4 PRGA '
'[C0021.004]']}
}
"""
ostream["MBC"] = dict()
objectives = collections.defaultdict(set)
for rule in rutils.capability_rules(doc):
if not rule["meta"].get("mbc"):
continue
mbcs = rule["meta"]["mbc"]
if not isinstance(mbcs, list):
raise ValueError("invalid rule: MBC mapping is not a list")
for mbc in mbcs:
objective, _, rest = mbc.partition("::")
if "::" in rest:
behavior, _, rest = rest.partition("::")
method, _, id = rest.rpartition(" ")
objectives[objective].add((behavior, method, id))
else:
behavior, _, id = rest.rpartition(" ")
objectives[objective].add((behavior, id))
for objective, behaviors in sorted(objectives.items()):
inner_rows = []
for spec in sorted(behaviors):
if len(spec) == 2:
behavior, id = spec
inner_rows.append("%s %s" % (behavior, id))
elif len(spec) == 3:
behavior, method, id = spec
inner_rows.append("%s::%s %s" % (behavior, method, id))
else:
raise RuntimeError("unexpected MBC spec format")
ostream["MBC"].setdefault(objective.upper(), inner_rows)
def render_dictionary(doc):
ostream = dict()
render_meta(doc, ostream)
render_attack(doc, ostream)
render_mbc(doc, ostream)
render_capabilities(doc, ostream)
return ostream
# ==== render dictionary helpers
def capa_details(file_path, output_format="dictionary"):
# extract features and find capabilities
extractor = capa.main.get_extractor(file_path, "auto", capa.main.BACKEND_VIV, disable_progress=True)
capabilities, counts = capa.main.find_capabilities(rules, extractor, disable_progress=True)
# collect metadata (used only to make rendering more complete)
meta = capa.main.collect_metadata("", file_path, RULES_PATH, "auto", extractor)
meta["analysis"].update(counts)
capa_output = False
if output_format == "dictionary":
# ...as a Python dictionary: the same simplified content as the text table, but as a dictionary
doc = convert_capabilities_to_result_document(meta, rules, capabilities)
capa_output = render_dictionary(doc)
elif output_format == "json":
# render results
# ...as json
capa_output = json.loads(capa.render.render_json(meta, rules, capabilities))
elif output_format == "texttable":
# ...as human readable text table
capa_output = capa.render.render_default(meta, rules, capabilities)
return capa_output
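
The ATT&CK and MBC renderers above share one parsing scheme: split each spec on the first "::" to get the tactic (or objective), split again if a sub-technique is present, and take the trailing "[ID]" with rpartition. A minimal, self-contained sketch of that grouping follows; the function name group_attack_specs and the sample specs are illustrative assumptions, not part of capa.

```python
import collections


def group_attack_specs(specs):
    """Group specs like 'Tactic::Technique::Subtechnique [ID]' by upper-cased tactic."""
    tactics = collections.defaultdict(set)
    for spec in specs:
        tactic, _, rest = spec.partition("::")
        if "::" in rest:
            # three-part spec: technique plus sub-technique
            technique, _, rest = rest.partition("::")
            subtechnique, _, id_ = rest.rpartition(" ")
            tactics[tactic].add((technique, subtechnique, id_))
        else:
            technique, _, id_ = rest.rpartition(" ")
            tactics[tactic].add((technique, id_))

    grouped = {}
    for tactic, techniques in sorted(tactics.items()):
        rows = []
        for parts in sorted(techniques):
            if len(parts) == 2:
                rows.append("%s %s" % parts)
            else:
                rows.append("%s::%s %s" % parts)
        grouped[tactic.upper()] = rows
    return grouped
```

Using a set per tactic removes duplicates when several rules map to the same technique, and sorting keeps the rendered rows stable between runs.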

@@ -38,6 +38,12 @@ def main(argv=None):
)
parser.add_argument("-v", "--verbose", action="store_true", help="Enable debug logging")
parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
parser.add_argument(
"-c",
"--check",
action="store_true",
help="Don't output (reformatted) rule, only return status. 0 = no changes, 1 = would reformat",
)
args = parser.parse_args(args=argv)
if args.verbose:
@@ -50,12 +56,24 @@ def main(argv=None):
logging.basicConfig(level=level)
logging.getLogger("capafmt").setLevel(level)
rule = capa.rules.Rule.from_yaml_file(args.path)
rule = capa.rules.Rule.from_yaml_file(args.path, use_ruamel=True)
reformatted_rule = rule.to_yaml()
if args.check:
if rule.definition == reformatted_rule:
logger.info("rule is formatted correctly, nice! (%s)", rule.name)
return 0
else:
logger.info("rule requires reformatting (%s)", rule.name)
if "\r\n" in rule.definition:
logger.info("please make sure that the file uses LF (\\n) line endings only")
return 1
if args.in_place:
with open(args.path, "wb") as f:
f.write(rule.to_yaml().encode("utf-8"))
f.write(reformatted_rule.encode("utf-8"))
else:
print(rule.to_yaml().rstrip("\n"))
print(reformatted_rule)
return 0
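
The --check mode above boils down to comparing the text on disk with its reformatted form and returning a status code instead of printing. A rough sketch of that idea, assuming a hypothetical normalize() as a stand-in for rule.to_yaml() (the real formatter is capa's YAML serializer, which is not reproduced here):

```python
def normalize(text):
    # hypothetical stand-in for the real formatter: strip trailing whitespace,
    # use LF line endings, and end with exactly one newline
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


def check(text):
    """Return (status, messages): status 0 when text needs no changes, else 1."""
    messages = []
    if text == normalize(text):
        return 0, messages
    messages.append("requires reformatting")
    if "\r\n" in text:
        # CRLF endings are a common cause, so mention them explicitly
        messages.append("please make sure that the file uses LF (\\n) line endings only")
    return 1, messages
```

Returning the status rather than exiting inside the function keeps the check easy to call from tests or a CI wrapper.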

@@ -31,10 +31,8 @@ See the License for the specific language governing permissions and limitations
import json
import logging
import idc
import idautils
import ida_funcs
import ida_idaapi
import ida_kernwin
logger = logging.getLogger("capa")

@@ -15,7 +15,9 @@ See the License for the specific language governing permissions and limitations
"""
import os
import sys
import time
import string
import difflib
import hashlib
import logging
import os.path
@@ -23,7 +25,10 @@ import argparse
import itertools
import posixpath
import ruamel.yaml
import capa.main
import capa.rules
import capa.engine
import capa.features
import capa.features.insn
@@ -32,7 +37,11 @@ logger = logging.getLogger("capa.lint")
class Lint(object):
WARN = "WARN"
FAIL = "FAIL"
name = "lint"
level = FAIL
recommendation = ""
def check_rule(self, ctx, rule):
@@ -194,7 +203,7 @@ class DoesntMatchExample(Lint):
continue
try:
extractor = capa.main.get_extractor(path, "auto")
extractor = capa.main.get_extractor(path, "auto", capa.main.BACKEND_VIV, disable_progress=True)
capabilities, meta = capa.main.find_capabilities(ctx["rules"], extractor, disable_progress=True)
except Exception as e:
logger.error("failed to extract capabilities: %s %s %s", rule.name, path, e)
@@ -232,7 +241,7 @@ class LibRuleNotInLibDirectory(Lint):
if "lib" not in rule.meta:
return False
return "/lib/" not in get_normpath(rule.meta["capa/path"])
return "lib/" not in get_normpath(rule.meta["capa/path"])
class LibRuleHasNamespace(Lint):
@@ -276,6 +285,91 @@ class FeatureNegativeNumber(Lint):
return False
class FeatureNtdllNtoskrnlApi(Lint):
name = "feature api may overlap with ntdll and ntoskrnl"
level = Lint.WARN
recommendation_template = (
"check if {:s} is exported by both ntdll and ntoskrnl; if true, consider removing {:s} "
"module requirement to improve detection"
)
def check_features(self, ctx, features):
for feature in features:
if isinstance(feature, capa.features.insn.API):
modname, _, impname = feature.value.rpartition(".")
if modname in ("ntdll", "ntoskrnl"):
# format from the template so later matches do not reuse the first API's message
self.recommendation = self.recommendation_template.format(impname, modname)
return True
return False
class FormatLineFeedEOL(Lint):
name = "line(s) end with CRLF (\\r\\n)"
recommendation = "convert line endings to LF (\\n) for example using dos2unix"
def check_rule(self, ctx, rule):
# str.split() always returns at least one element, so test for CRLF directly
if "\r\n" in rule.definition:
return True
return False
class FormatSingleEmptyLineEOF(Lint):
name = "EOF format"
recommendation = "end file with a single empty line"
def check_rule(self, ctx, rule):
if rule.definition.endswith("\n") and not rule.definition.endswith("\n\n"):
return False
return True
class FormatIncorrect(Lint):
name = "rule format incorrect"
recommendation_template = "use scripts/capafmt.py or adjust as follows\n{:s}"
def check_rule(self, ctx, rule):
actual = rule.definition
expected = capa.rules.Rule.from_yaml(rule.definition, use_ruamel=True).to_yaml()
if actual != expected:
diff = difflib.ndiff(actual.splitlines(True), expected.splitlines(True))
recommendation_template = self.recommendation_template
if "\r\n" in actual:
recommendation_template = (
self.recommendation_template + "\nplease make sure that the file uses LF (\\n) line endings only"
)
self.recommendation = recommendation_template.format("".join(diff))
return True
return False
class FormatStringQuotesIncorrect(Lint):
name = "rule string quotes incorrect"
def check_rule(self, ctx, rule):
events = capa.rules.Rule._get_ruamel_yaml_parser().parse(rule.definition)
for key in events:
if not (isinstance(key, ruamel.yaml.ScalarEvent) and key.value == "string"):
continue
value = next(events) # assume value is next event
if not isinstance(value, ruamel.yaml.ScalarEvent):
# ignore non-scalar
continue
if value.value.startswith("/") and value.value.endswith(("/", "/i")):
# ignore regex for now
continue
if value.style is None:
# no quotes
self.recommendation = 'add double quotes to "%s"' % value.value
return True
if value.style == "'":
# single quote
self.recommendation = 'change single quotes to double quotes for "%s"' % value.value
return True
return False
def run_lints(lints, ctx, rule):
for lint in lints:
if lint.check_rule(ctx, rule):
@@ -325,14 +419,7 @@ def lint_meta(ctx, rule):
return run_lints(META_LINTS, ctx, rule)
FEATURE_LINTS = (
FeatureStringTooShort(),
FeatureNegativeNumber(),
)
def get_normpath(path):
return posixpath.normpath(path).replace(os.sep, "/")
FEATURE_LINTS = (FeatureStringTooShort(), FeatureNegativeNumber(), FeatureNtdllNtoskrnlApi())
def lint_features(ctx, rule):
@@ -340,6 +427,22 @@ def lint_features(ctx, rule):
return run_feature_lints(FEATURE_LINTS, ctx, features)
FORMAT_LINTS = (
FormatLineFeedEOL(),
FormatSingleEmptyLineEOF(),
FormatStringQuotesIncorrect(),
FormatIncorrect(),
)
def lint_format(ctx, rule):
return run_lints(FORMAT_LINTS, ctx, rule)
def get_normpath(path):
return posixpath.normpath(path).replace(os.sep, "/")
def get_features(ctx, rule):
# get features from rule and all dependencies including subscopes and matched rules
features = []
@@ -390,6 +493,7 @@ def lint_rule(ctx, rule):
lint_meta(ctx, rule),
lint_logic(ctx, rule),
lint_features(ctx, rule),
lint_format(ctx, rule),
)
)
@@ -406,25 +510,28 @@ def lint_rule(ctx, rule):
)
)
level = "WARN" if is_nursery_rule(rule) else "FAIL"
for violation in violations:
print(
"%s %s: %s: %s"
% (
" " if is_nursery_rule(rule) else "",
level,
Lint.WARN if is_nursery_rule(rule) else violation.level,
violation.name,
violation.recommendation,
)
)
elif len(violations) == 0 and is_nursery_rule(rule):
print("")
lints_failed = any(map(lambda v: v.level == Lint.FAIL, violations))
if not lints_failed and is_nursery_rule(rule):
print("")
print("%s%s" % (" (nursery) ", rule.name))
print("%s %s: %s: %s" % (" ", "WARN", "no violations", "Graduate the rule"))
print("%s %s: %s: %s" % (" ", Lint.WARN, "no lint failures", "Graduate the rule"))
print("")
return len(violations) > 0 and not is_nursery_rule(rule)
return lints_failed and not is_nursery_rule(rule)
def lint(ctx, rules):
@@ -492,7 +599,8 @@ def main(argv=None):
samples_path = os.path.join(os.path.dirname(__file__), "..", "tests", "data")
parser = argparse.ArgumentParser(description="A program.")
parser = argparse.ArgumentParser(description="Lint capa rules.")
capa.main.install_common_args(parser, wanted={"tag"})
parser.add_argument("rules", type=str, help="Path to rules")
parser.add_argument("--samples", type=str, default=samples_path, help="Path to samples")
parser.add_argument(
@@ -500,31 +608,28 @@ def main(argv=None):
action="store_true",
help="Enable thorough linting - takes more time, but does a better job",
)
parser.add_argument("-v", "--verbose", action="store_true", help="Enable debug logging")
parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)
if args.verbose:
level = logging.DEBUG
elif args.quiet:
level = logging.ERROR
if args.debug:
logging.getLogger("capa").setLevel(logging.DEBUG)
logging.getLogger("viv_utils").setLevel(logging.DEBUG)
else:
level = logging.INFO
logging.getLogger("capa").setLevel(logging.ERROR)
logging.getLogger("viv_utils").setLevel(logging.ERROR)
logging.basicConfig(level=level)
logging.getLogger("capa.lint").setLevel(level)
capa.main.set_vivisect_log_level(logging.CRITICAL)
logging.getLogger("capa").setLevel(logging.CRITICAL)
time0 = time.time()
try:
rules = capa.main.get_rules(args.rules)
rules = capa.main.get_rules(args.rules, disable_progress=True)
rules = capa.rules.RuleSet(rules)
logger.info("successfully loaded %s rules", len(rules))
except IOError as e:
logger.error("%s", str(e))
return -1
except capa.rules.InvalidRule as e:
if args.tag:
rules = rules.filter_rules_by_meta(args.tag)
logger.debug("selected %s rules", len(rules))
for i, r in enumerate(rules.rules, 1):
logger.debug(" %d. %s", i, r)
except (IOError, capa.rules.InvalidRule, capa.rules.InvalidRuleSet) as e:
logger.error("%s", str(e))
return -1
@@ -542,8 +647,12 @@ def main(argv=None):
}
did_violate = lint(ctx, rules)
minutes, seconds = divmod(time.time() - time0, 60)
logger.debug("lints ran for ~ %02d:%02dm", minutes, seconds)
if not did_violate:
logger.info("no suggestions, nice!")
logger.info("no lints failed, nice!")
return 0
else:
return 1
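
The linter changes above follow one pattern: each lint is a small class with a name, a level (WARN or FAIL), and a check that returns True on violation; only FAIL-level violations make the run fail. A self-contained sketch of that pattern, operating on plain text rather than capa rule objects (the class and function names here are illustrative assumptions, and TrailingWhitespace is not one of capa's lints):

```python
class Lint(object):
    WARN = "WARN"
    FAIL = "FAIL"
    name = "lint"
    level = FAIL
    recommendation = ""

    def check(self, text):
        # return True when the text violates the lint
        return False


class LineFeedEOL(Lint):
    name = "line(s) end with CRLF (\\r\\n)"
    recommendation = "convert line endings to LF (\\n)"

    def check(self, text):
        return "\r\n" in text


class SingleEmptyLineEOF(Lint):
    name = "EOF format"
    recommendation = "end file with a single empty line"

    def check(self, text):
        return not (text.endswith("\n") and not text.endswith("\n\n"))


class TrailingWhitespace(Lint):
    name = "trailing whitespace"
    level = Lint.WARN  # reported, but does not fail the run
    recommendation = "remove spaces and tabs at the end of lines"

    def check(self, text):
        return any(line.endswith((" ", "\t")) for line in text.splitlines())


LINTS = (LineFeedEOL(), SingleEmptyLineEOF(), TrailingWhitespace())


def run_lints(lints, text):
    return [lint for lint in lints if lint.check(text)]


def lints_failed(violations):
    return any(v.level == Lint.FAIL for v in violations)
```

Separating "which lints fired" from "did anything fail" is what lets warnings be printed without changing the exit status.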

@@ -1,167 +0,0 @@
#!/usr/bin/env python
"""
migrate rules and their namespaces.
example:
$ python scripts/migrate-rules.py migration.csv ./rules ./new-rules
Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at: [package root]/LICENSE.txt
Unless required by applicable law or agreed to in writing, software distributed under the License
is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.
"""
import os
import csv
import sys
import logging
import os.path
import argparse
import collections
import capa.rules
logger = logging.getLogger("migrate-rules")
def read_plan(plan_path):
with open(plan_path, "rb") as f:
return list(
csv.DictReader(
f,
restkey="other",
fieldnames=(
"existing path",
"existing name",
"existing rule-category",
"proposed name",
"proposed namespace",
"ATT&CK",
"MBC",
"comment1",
),
)
)
def read_rules(rule_directory):
rules = {}
for root, dirs, files in os.walk(rule_directory):
for file in files:
path = os.path.join(root, file)
if not path.endswith(".yml"):
logger.info("skipping file: %s", path)
continue
rule = capa.rules.Rule.from_yaml_file(path)
rules[rule.name] = rule
if "nursery" in path:
rule.meta["capa/nursery"] = True
return rules
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
parser = argparse.ArgumentParser(description="migrate rules.")
parser.add_argument("plan", type=str, help="Path to CSV describing migration")
parser.add_argument("source", type=str, help="Source directory of rules")
parser.add_argument("destination", type=str, help="Destination directory of rules")
args = parser.parse_args(args=argv)
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
plan = read_plan(args.plan)
logger.info("read %d plan entries", len(plan))
rules = read_rules(args.source)
logger.info("read %d rules", len(rules))
planned_rules = set([row["existing name"] for row in plan])
unplanned_rules = [rule for (name, rule) in rules.items() if name not in planned_rules]
if unplanned_rules:
logger.error("plan does not account for %d rules:" % (len(unplanned_rules)))
for rule in unplanned_rules:
logger.error(" " + rule.name)
return -1
# pairs of strings (needle, replacement)
match_translations = []
for row in plan:
if not row["existing name"]:
continue
rule = rules[row["existing name"]]
if rule.meta["name"] != row["proposed name"]:
logger.info("renaming rule '%s' -> '%s'", rule.meta["name"], row["proposed name"])
# this assumes the YAML is formatted like `- match: $rule-name`;
# since the rules have been linted, that should be ok.
match_translations.append(("- match: " + rule.meta["name"], "- match: " + row["proposed name"]))
rule.meta["name"] = row["proposed name"]
rule.name = row["proposed name"]
if "rule-category" in rule.meta:
logger.info("deleting rule category '%s'", rule.meta["rule-category"])
del rule.meta["rule-category"]
rule.meta["namespace"] = row["proposed namespace"]
if row["ATT&CK"] != "n/a" and row["ATT&CK"] != "":
tag = row["ATT&CK"]
name, _, id = tag.rpartition(" ")
tag = "%s [%s]" % (name, id)
rule.meta["att&ck"] = [tag]
if row["MBC"] != "n/a" and row["MBC"] != "":
tag = row["MBC"]
rule.meta["mbc"] = [tag]
for rule in rules.values():
filename = rule.name
filename = filename.lower()
filename = filename.replace(" ", "-")
filename = filename.replace("(", "")
filename = filename.replace(")", "")
filename = filename.replace("+", "")
filename = filename.replace("/", "")
filename = filename + ".yml"
try:
if rule.meta.get("capa/nursery"):
directory = os.path.join(args.destination, "nursery")
elif rule.meta.get("lib"):
directory = os.path.join(args.destination, "lib")
else:
directory = os.path.join(args.destination, rule.meta.get("namespace"))
os.makedirs(directory)
except OSError:
pass
else:
logger.info("created namespace: %s", directory)
path = os.path.join(directory, filename)
logger.info("writing rule %s", path)
doc = rule.to_yaml().decode("utf-8")
for (needle, replacement) in match_translations:
doc = doc.replace(needle, replacement)
with open(path, "wb") as f:
f.write(doc.encode("utf-8"))
return 0
if __name__ == "__main__":
sys.exit(main())

@@ -63,7 +63,6 @@ import capa.render
import capa.features
import capa.render.utils as rutils
import capa.features.freeze
import capa.features.extractors.viv
from capa.helpers import get_file_taste
logger = logging.getLogger("capa.show-capabilities-by-function")
@@ -111,143 +110,93 @@ def main(argv=None):
if argv is None:
argv = sys.argv[1:]
formats = [
("auto", "(default) detect file type automatically"),
("pe", "Windows PE file"),
("sc32", "32-bit shellcode"),
("sc64", "64-bit shellcode"),
("freeze", "features previously frozen by capa"),
]
format_help = ", ".join(["%s: %s" % (f[0], f[1]) for f in formats])
parser = argparse.ArgumentParser(description="detect capabilities in programs.")
capa.main.install_common_args(parser, wanted={"format", "sample", "rules", "tag"})
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)
parser = argparse.ArgumentParser(description="detect capabilities in programs.")
parser.add_argument("sample", type=str, help="Path to sample to analyze")
parser.add_argument(
"-r",
"--rules",
type=str,
default="(embedded rules)",
help="Path to rule file or directory, use embedded rules by default",
)
parser.add_argument("-t", "--tag", type=str, help="Filter on rule meta field values")
parser.add_argument("-d", "--debug", action="store_true", help="Enable debugging output on STDERR")
parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
parser.add_argument(
"-f",
"--format",
choices=[f[0] for f in formats],
default="auto",
help="Select sample format, %s" % format_help,
)
args = parser.parse_args(args=argv)
try:
taste = get_file_taste(args.sample)
except IOError as e:
logger.error("%s", str(e))
return -1
if args.quiet:
logging.basicConfig(level=logging.ERROR)
logging.getLogger().setLevel(logging.ERROR)
elif args.debug:
logging.basicConfig(level=logging.DEBUG)
logging.getLogger().setLevel(logging.DEBUG)
else:
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
if args.rules == "(embedded rules)":
logger.info("-" * 80)
logger.info(" Using default embedded rules.")
logger.info(" To provide your own rules, use the form `capa.exe -r ./path/to/rules/ /path/to/mal.exe`.")
logger.info(" You can see the current default rule set here:")
logger.info(" https://github.com/fireeye/capa-rules")
logger.info("-" * 80)
# disable vivisect-related logging, it's verbose and not relevant for capa users
capa.main.set_vivisect_log_level(logging.CRITICAL)
logger.debug("detected running from source")
args.rules = os.path.join(os.path.dirname(__file__), "..", "rules")
logger.debug("default rule path (source method): %s", args.rules)
else:
logger.info("using rules path: %s", args.rules)
try:
rules = capa.main.get_rules(args.rules)
rules = capa.rules.RuleSet(rules)
logger.info("successfully loaded %s rules", len(rules))
if args.tag:
rules = rules.filter_rules_by_meta(args.tag)
logger.info("selected %s rules", len(rules))
except (IOError, capa.rules.InvalidRule, capa.rules.InvalidRuleSet) as e:
logger.error("%s", str(e))
return -1
if (args.format == "freeze") or (args.format == "auto" and capa.features.freeze.is_freeze(taste)):
format = "freeze"
with open(args.sample, "rb") as f:
extractor = capa.features.freeze.load(f.read())
else:
format = args.format
try:
taste = get_file_taste(args.sample)
except IOError as e:
logger.error("%s", str(e))
extractor = capa.main.get_extractor(args.sample, args.format)
except capa.main.UnsupportedFormatError:
logger.error("-" * 80)
logger.error(" Input file does not appear to be a PE file.")
logger.error(" ")
logger.error(
" capa currently only supports analyzing PE files (or shellcode, when using --format sc32|sc64)."
)
logger.error(" If you don't know the input file type, you can try using the `file` utility to guess it.")
logger.error("-" * 80)
return -1
except capa.main.UnsupportedRuntimeError:
logger.error("-" * 80)
logger.error(" Unsupported runtime or Python interpreter.")
logger.error(" ")
logger.error(" capa supports running under Python 2.7 using Vivisect for binary analysis.")
logger.error(" It can also run within IDA Pro, using either Python 2.7 or 3.5+.")
logger.error(" ")
logger.error(" If you're seeing this message on the command line, please ensure you're running Python 2.7.")
logger.error("-" * 80)
return -1
# py2 doesn't know about cp65001, which is a variant of utf-8 on windows
# tqdm bails when trying to render the progress bar in this setup.
# because cp65001 is utf-8, we just map that codepage to the utf-8 codec.
# see #380 and: https://stackoverflow.com/a/3259271/87207
import codecs
meta = capa.main.collect_metadata(argv, args.sample, args.rules, format, extractor)
capabilities, counts = capa.main.find_capabilities(rules, extractor)
meta["analysis"].update(counts)
codecs.register(lambda name: codecs.lookup("utf-8") if name == "cp65001" else None)
if args.rules == "(embedded rules)":
logger.info("-" * 80)
logger.info(" Using default embedded rules.")
logger.info(" To provide your own rules, use the form `capa.exe -r ./path/to/rules/ /path/to/mal.exe`.")
logger.info(" You can see the current default rule set here:")
logger.info(" https://github.com/fireeye/capa-rules")
logger.info("-" * 80)
logger.debug("detected running from source")
args.rules = os.path.join(os.path.dirname(__file__), "..", "rules")
logger.debug("default rule path (source method): %s", args.rules)
else:
logger.info("using rules path: %s", args.rules)
try:
rules = capa.main.get_rules(args.rules)
rules = capa.rules.RuleSet(rules)
logger.info("successfully loaded %s rules", len(rules))
if args.tag:
rules = rules.filter_rules_by_meta(args.tag)
logger.info("selected %s rules", len(rules))
except (IOError, capa.rules.InvalidRule, capa.rules.InvalidRuleSet) as e:
logger.error("%s", str(e))
if capa.main.has_file_limitation(rules, capabilities):
# bail if capa encountered file limitation e.g. a packed binary
# do show the output in verbose mode, though.
if not (args.verbose or args.vverbose or args.json):
return -1
if (args.format == "freeze") or (args.format == "auto" and capa.features.freeze.is_freeze(taste)):
format = "freeze"
with open(args.sample, "rb") as f:
extractor = capa.features.freeze.load(f.read())
else:
format = args.format
try:
extractor = capa.main.get_extractor(args.sample, args.format)
except capa.main.UnsupportedFormatError:
logger.error("-" * 80)
logger.error(" Input file does not appear to be a PE file.")
logger.error(" ")
logger.error(
" capa currently only supports analyzing PE files (or shellcode, when using --format sc32|sc64)."
)
logger.error(
" If you don't know the input file type, you can try using the `file` utility to guess it."
)
logger.error("-" * 80)
return -1
except capa.main.UnsupportedRuntimeError:
logger.error("-" * 80)
logger.error(" Unsupported runtime or Python interpreter.")
logger.error(" ")
logger.error(" capa supports running under Python 2.7 using Vivisect for binary analysis.")
logger.error(" It can also run within IDA Pro, using either Python 2.7 or 3.5+.")
logger.error(" ")
logger.error(
" If you're seeing this message on the command line, please ensure you're running Python 2.7."
)
logger.error("-" * 80)
return -1
# colorama will detect:
# - when on Windows console, and fixup coloring, and
# - when not an interactive session, and disable coloring
# renderers should use coloring and assume it will be stripped out if necessary.
colorama.init()
doc = capa.render.convert_capabilities_to_result_document(meta, rules, capabilities)
print(render_matches_by_function(doc))
colorama.deinit()
meta = capa.main.collect_metadata(argv, args.sample, args.rules, format, extractor)
capabilities, counts = capa.main.find_capabilities(rules, extractor)
meta["analysis"].update(counts)
logger.info("done.")
if capa.main.has_file_limitation(rules, capabilities):
# bail if capa encountered file limitation e.g. a packed binary
# do show the output in verbose mode, though.
if not (args.verbose or args.vverbose or args.json):
return -1
# colorama will detect:
# - when on Windows console, and fixup coloring, and
# - when not an interactive session, and disable coloring
# renderers should use coloring and assume it will be stripped out if necessary.
colorama.init()
doc = capa.render.convert_capabilities_to_result_document(meta, rules, capabilities)
print(render_matches_by_function(doc))
colorama.deinit()
logger.info("done.")
return 0
return 0
if __name__ == "__main__":
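
The diff above replaces per-script argparse boilerplate with a shared helper that takes a set of wanted arguments. The idea can be sketched with argparse alone; install_shared_args below is a hypothetical stand-in written for illustration, not capa's actual install_common_args, and its option set is an assumption based on the flags visible in the removed lines.

```python
import argparse

FORMATS = ("auto", "pe", "sc32", "sc64", "freeze")


def install_shared_args(parser, wanted=None):
    """Hypothetical helper: add the options that every script shares."""
    wanted = wanted or set()
    # logging flags are always installed
    parser.add_argument("-d", "--debug", action="store_true", help="Enable debugging output on STDERR")
    parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
    # the remaining options are opt-in, so each script only exposes what it uses
    if "sample" in wanted:
        parser.add_argument("sample", type=str, help="Path to sample to analyze")
    if "format" in wanted:
        parser.add_argument("-f", "--format", choices=FORMATS, default="auto", help="Select sample format")
    if "tag" in wanted:
        parser.add_argument("-t", "--tag", type=str, help="Filter on rule meta field values")


def build_parser():
    parser = argparse.ArgumentParser(description="Show the features extracted from a sample")
    install_shared_args(parser, wanted={"format", "sample"})
    return parser
```

With this shape, a script declares what it needs in one line and the help text stays consistent across scripts.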

@@ -71,41 +71,56 @@ import argparse
import capa.main
import capa.rules
import capa.engine
import capa.helpers
import capa.features
import capa.features.freeze
import capa.features.extractors.viv
logger = logging.getLogger("capa.show-features")
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
formats = [
("auto", "(default) detect file type automatically"),
("pe", "Windows PE file"),
("sc32", "32-bit shellcode"),
("sc64", "64-bit shellcode"),
("freeze", "features previously frozen by capa"),
]
format_help = ", ".join(["%s: %s" % (f[0], f[1]) for f in formats])
parser = argparse.ArgumentParser(description="Show the features that capa extracts from the given sample")
parser.add_argument("sample", type=str, help="Path to sample to analyze")
parser.add_argument(
"-f", "--format", choices=[f[0] for f in formats], default="auto", help="Select sample format, %s" % format_help
)
capa.main.install_common_args(parser, wanted={"format", "sample"})
parser.add_argument("-F", "--function", type=lambda x: int(x, 0x10), help="Show features for specific function")
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)
try:
taste = capa.helpers.get_file_taste(args.sample)
except IOError as e:
logger.error("%s", str(e))
return -1
if args.format == "freeze":
if (args.format == "freeze") or (args.format == "auto" and capa.features.freeze.is_freeze(taste)):
with open(args.sample, "rb") as f:
extractor = capa.features.freeze.load(f.read())
else:
vw = capa.main.get_workspace(args.sample, args.format)
extractor = capa.features.extractors.viv.VivisectFeatureExtractor(vw, args.sample)
try:
extractor = capa.main.get_extractor(args.sample, args.format, capa.main.BACKEND_VIV)
except capa.main.UnsupportedFormatError:
logger.error("-" * 80)
logger.error(" Input file does not appear to be a PE file.")
logger.error(" ")
logger.error(
" capa currently only supports analyzing PE files (or shellcode, when using --format sc32|sc64)."
)
logger.error(" If you don't know the input file type, you can try using the `file` utility to guess it.")
logger.error("-" * 80)
return -1
except capa.main.UnsupportedRuntimeError:
logger.error("-" * 80)
logger.error(" Unsupported runtime or Python interpreter.")
logger.error(" ")
logger.error(" capa supports running under Python 2.7 using Vivisect for binary analysis.")
logger.error(" It can also run within IDA Pro, using either Python 2.7 or 3.5+.")
logger.error(" ")
logger.error(" If you're seeing this message on the command line, please ensure you're running Python 2.7.")
logger.error("-" * 80)
return -1
if not args.function:
for feature, va in extractor.extract_file_features():
@@ -118,15 +133,13 @@ def main(argv=None):
if args.function:
if args.format == "freeze":
functions = filter(lambda f: f == args.function, functions)
functions = tuple(filter(lambda f: f == args.function, functions))
else:
functions = filter(lambda f: f.va == args.function, functions)
functions = tuple(filter(lambda f: capa.helpers.oint(f) == args.function, functions))
if args.function not in [f.va for f in functions]:
print("0x%X not a function, creating it" % args.function)
vw.makeFunction(args.function)
functions = extractor.get_functions()
functions = filter(lambda f: f.va == args.function, functions)
if args.function not in [capa.helpers.oint(f) for f in functions]:
print("0x%X not a function" % args.function)
return -1
if len(functions) == 0:
print("0x%X not a function" % args.function)
@@ -154,7 +167,7 @@ def ida_main():
functions = extractor.get_functions()
if function:
functions = filter(lambda f: f.start_ea == function, functions)
functions = tuple(filter(lambda f: f.start_ea == function, functions))
if len(functions) == 0:
print("0x%X not a function" % function)
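
The tuple(filter(...)) changes above matter because filter() returns a lazy iterator on Python 3, so the later len(functions) check would raise TypeError. A small sketch with made-up addresses (select_function is an illustrative name):

```python
def select_function(functions, va):
    # tuple() materializes the lazy filter object so len() works on Python 3
    return tuple(filter(lambda f: f == va, functions))


functions = [0x401000, 0x401020, 0x401040]  # hypothetical function addresses
selected = select_function(functions, 0x401020)
if len(selected) == 0:
    print("0x%X not a function" % 0x401020)
```

On Python 2 filter() already returned a list, which is why the original code only broke after the move to Python 3.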

scripts/vivisect-py2-vs-py3.sh Executable file

@@ -0,0 +1,69 @@
#!/usr/bin/env bash
int() {
int=$(bc <<< "scale=0; ($1 + 0.5)/1")
}
export TIMEFORMAT='%3R'
threshold_time=90
threshold_py3_time=60 # Do not warn if it doesn't take at least 1 minute to run
rm tests/data/*.viv 2>/dev/null
mkdir results
for file in tests/data/*
do
file=$(printf %q "$file") # Handle names with white spaces
file_name=$(basename $file)
echo $file_name
rm "$file.viv" 2>/dev/null
py3_time=$(sh -c "time python3 scripts/show-features.py $file >> results/p3-$file_name.out 2>/dev/null" 2>&1)
rm "$file.viv" 2>/dev/null
py2_time=$(sh -c "time python2 scripts/show-features.py $file >> results/p2-$file_name.out 2>/dev/null" 2>&1)
int $py3_time
if (($int > $threshold_py3_time))
then
percentage=$(bc <<< "scale=3; $py2_time/$py3_time*100 + 0.5")
int $percentage
if (($int < $threshold_time))
then
echo -n " SLOWER ($percentage): "
fi
fi
echo " PY2($py2_time) PY3($py3_time)"
done
threshold_features=98
count=0
average=0
results_for() {
py3=$(cat "results/p3-$file_name.out" | grep "$1" | wc -l)
py2=$(cat "results/p2-$file_name.out" | grep "$1" | wc -l)
if (($py2 > 0))
then
percentage=$(bc <<< "scale=2; 100*$py3/$py2")
average=$(bc <<< "scale=2; $percentage + $average")
count=$(($count + 1))
int $percentage
if (($int < $threshold_features))
then
echo -e "$1: py2($py2) py3($py3) $percentage% - $file_name"
fi
fi
}
rm tests/data/*.viv 2>/dev/null
echo -e '\nRESULTS:'
for file in tests/data/*
do
file_name=$(basename $file)
if test -f "results/p2-$file_name.out"; then
results_for 'insn'
results_for 'file'
results_for 'func'
results_for 'bb'
fi
done
average=$(bc <<< "scale=2; $average/$count")
echo "TOTAL: $average"
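
The feature comparison in results_for counts the output lines that mention a scope in each interpreter's results and reports the file when Python 3 finds fewer than the threshold percentage of what Python 2 found. The same logic restated in Python, as a sketch only; count_features and compare_counts are hypothetical names and the sample outputs in the usage note are made up:

```python
def count_features(output, scope):
    """Count output lines mentioning scope, like `grep scope | wc -l`."""
    return sum(1 for line in output.splitlines() if scope in line)


def compare_counts(py2_output, py3_output, scope, threshold=98):
    """Return (percentage, below_threshold), or None when py2 found nothing."""
    py2 = count_features(py2_output, scope)
    py3 = count_features(py3_output, scope)
    if py2 == 0:
        # mirrors the shell guard: skip scopes without Python 2 results
        return None
    percentage = 100.0 * py3 / py2
    # round to the nearest integer before comparing, as the int helper does
    return percentage, int(percentage + 0.5) < threshold
```

For example, compare_counts("insn: a\ninsn: b\n", "insn: a\n", "insn") reports 50.0 percent and flags the sample as below the threshold.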

@@ -11,30 +11,33 @@ import sys
import setuptools
# halo==0.0.30 is the last version to support py2.7
requirements = [
"six",
"tqdm",
"pyyaml",
"tabulate",
"colorama",
"termcolor",
"ruamel.yaml",
"wcwidth",
"halo==0.0.30",
"six==1.15.0",
"tqdm==4.60.0",
"pyyaml==5.4.1",
"tabulate==0.8.9",
"colorama==0.4.4",
"termcolor==1.1.0",
"wcwidth==0.2.5",
"ida-settings==2.1.0",
"viv-utils==0.6.0",
]
if sys.version_info >= (3, 0):
# py3
requirements.append("networkx")
requirements.append("halo==0.0.31")
requirements.append("networkx==2.5.1")
requirements.append("ruamel.yaml==0.17.0")
requirements.append("vivisect==1.0.1")
requirements.append("smda==1.5.13")
else:
# py2
requirements.append("enum34==1.1.6") # v1.1.6 is needed by halo 0.0.30 / spinners 0.0.24
requirements.append("vivisect==0.1.0")
requirements.append("viv-utils")
requirements.append("halo==0.0.30") # halo==0.0.30 is the last version to support py2.7
requirements.append("vivisect==0.2.1")
requirements.append("networkx==2.2") # v2.2 is last version supported by Python 2.7
requirements.append("backports.functools-lru-cache")
requirements.append("ruamel.yaml==0.16.13") # last version tested with Python 2.7
requirements.append("backports.functools-lru-cache==1.6.1")
# this sets __version__
# via: http://stackoverflow.com/a/7071358/87207
@@ -74,13 +77,13 @@ setuptools.setup(
install_requires=requirements,
extras_require={
"dev": [
"pytest",
"pytest-sugar",
"pytest-instafail",
"pytest-cov",
"pycodestyle",
"black ; python_version>'3.0'",
"isort",
"pytest==4.6.11", # TODO: Change to 6.2.3 when removing py2
"pytest-sugar==0.9.4",
"pytest-instafail==0.4.2",
"pytest-cov==2.11.1",
"pycodestyle==2.7.0",
"black==20.8b1 ; python_version>'3.0'",
"isort==4.3.21", # TODO: Change to 5.8.0 when removing py2
]
},
zip_safe=False,
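
The setup.py change pins every dependency and selects interpreter-specific pins at install time. A reduced sketch of that selection, using a few of the pins shown above; get_requirements is a hypothetical helper introduced only so the branch can be exercised without running setup:

```python
import sys


def get_requirements(version_info=sys.version_info):
    # pins shared by both interpreters
    requirements = ["six==1.15.0", "tqdm==4.60.0"]
    if version_info >= (3, 0):
        # py3
        requirements.append("networkx==2.5.1")
        requirements.append("ruamel.yaml==0.17.0")
    else:
        # py2: last versions known to work with Python 2.7
        requirements.append("networkx==2.2")
        requirements.append("ruamel.yaml==0.16.13")
    return requirements
```

Passing the version tuple as a parameter is a design choice for testability; the real setup.py reads sys.version_info directly.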

@@ -10,6 +10,7 @@
import os
import sys
import os.path
import binascii
import contextlib
import collections
@@ -78,7 +79,33 @@ def get_viv_extractor(path):
vw = capa.main.get_workspace(path, "sc64", should_save=False)
else:
vw = capa.main.get_workspace(path, "auto", should_save=True)
return capa.features.extractors.viv.VivisectFeatureExtractor(vw, path)
extractor = capa.features.extractors.viv.VivisectFeatureExtractor(vw, path)
fixup_viv(path, extractor)
return extractor
def fixup_viv(path, extractor):
"""
vivisect fixups to overcome differences between backends
"""
if "3b13b" in path:
# vivisect only recognizes calling thunk function at 0x10001573
extractor.vw.makeFunction(0x10006860)
@lru_cache()
def get_smda_extractor(path):
from smda.SmdaConfig import SmdaConfig
from smda.Disassembler import Disassembler
import capa.features.extractors.smda
config = SmdaConfig()
config.STORE_BUFFER = True
disasm = Disassembler(config)
report = disasm.disassembleFile(path)
return capa.features.extractors.smda.SmdaFeatureExtractor(report, path)
@lru_cache()
@@ -129,6 +156,8 @@ def get_data_path_by_name(name):
return os.path.join(CD, "data", "Practical Malware Analysis Lab 21-01.exe_")
elif name == "al-khaser x86":
return os.path.join(CD, "data", "al-khaser_x86.exe_")
elif name == "al-khaser x64":
return os.path.join(CD, "data", "al-khaser_x64.exe_")
elif name.startswith("39c05"):
return os.path.join(CD, "data", "39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.dll_")
elif name.startswith("499c2"):
@@ -149,8 +178,12 @@ def get_data_path_by_name(name):
return os.path.join(CD, "data", "82BF6347ACF15E5D883715DC289D8A2B.exe_")
elif name.startswith("pingtaest"):
return os.path.join(CD, "data", "ping_täst.exe_")
elif name.startswith("77329"):
return os.path.join(CD, "data", "773290480d5445f11d3dc1b800728966.exe_")
elif name.startswith("3b13b"):
return os.path.join(CD, "data", "3b13b6f1d7cd14dc4a097a12e2e505c0a4cff495262261e2bfc991df238b9b04.dll_")
else:
raise ValueError("unexpected sample fixture")
raise ValueError("unexpected sample fixture: %s" % name)
def get_sample_md5_by_name(name):
@@ -169,6 +202,8 @@ def get_sample_md5_by_name(name):
return "c8403fb05244e23a7931c766409b5e22"
elif name == "al-khaser x86":
return "db648cd247281954344f1d810c6fd590"
elif name == "al-khaser x64":
return "3cb21ae76ff3da4b7e02d77ff76e82be"
elif name.startswith("39c05"):
return "b7841b9d5dc1f511a93cc7576672ec0c"
elif name.startswith("499c2"):
@@ -187,8 +222,13 @@ def get_sample_md5_by_name(name):
return "64d9f7d96b99467f36e22fada623c3bb"
elif name.startswith("82bf6"):
return "82bf6347acf15e5d883715dc289d8a2b"
elif name.startswith("77329"):
return "773290480d5445f11d3dc1b800728966"
elif name.startswith("3b13b"):
# file name is SHA256 hash
return "56a6ffe6a02941028cc8235204eef31d"
else:
raise ValueError("unexpected sample fixture")
raise ValueError("unexpected sample fixture: %s" % name)
def resolve_sample(sample):
@@ -377,7 +417,7 @@ FEATURE_PRESENCE_TESTS = [
),
("kernel32-64", "function=0x1800202B0", capa.features.insn.API("RtlCaptureContext"), True),
# insn/api: x64 nested thunk
("82bf6", "function=0x140059342", capa.features.insn.API("ElfClearEventLogFile"), True),
("al-khaser x64", "function=0x14004B4F0", capa.features.insn.API("__vcrt_GetModuleHandle"), True),
# insn/api: call via jmp
("mimikatz", "function=0x40B3C6", capa.features.insn.API("LocalFree"), True),
("c91887...", "function=0x40156F", capa.features.insn.API("CloseClipboard"), True),
@@ -392,16 +432,21 @@ FEATURE_PRESENCE_TESTS = [
("mimikatz", "function=0x40105D", capa.features.String("SCardTransmit"), True),
("mimikatz", "function=0x40105D", capa.features.String("ACR > "), True),
("mimikatz", "function=0x40105D", capa.features.String("nope"), False),
("773290...", "function=0x140001140", capa.features.String(r"%s:\\OfficePackagesForWDAG"), True),
# insn/regex, issue #262
("pma16-01", "function=0x4021B0", capa.features.Regex("HTTP/1.0"), True),
("pma16-01", "function=0x4021B0", capa.features.Regex("www.practicalmalwareanalysis.com"), False),
# insn/string, pointer to string
("mimikatz", "function=0x44EDEF", capa.features.String("INPUTEVENT"), True),
# insn/string, direct memory reference
("mimikatz", "function=0x46D6CE", capa.features.String("(null)"), True),
# insn/bytes
("mimikatz", "function=0x40105D", capa.features.Bytes("SCardControl".encode("utf-16le")), True),
("mimikatz", "function=0x40105D", capa.features.Bytes("SCardTransmit".encode("utf-16le")), True),
("mimikatz", "function=0x40105D", capa.features.Bytes("ACR > ".encode("utf-16le")), True),
("mimikatz", "function=0x40105D", capa.features.Bytes("nope".encode("ascii")), False),
# IDA features included byte sequences read from invalid memory, fixed in #409
("mimikatz", "function=0x44570F", capa.features.Bytes(binascii.unhexlify("FF" * 256)), False),
# insn/bytes, pointer to bytes
("mimikatz", "function=0x44EDEF", capa.features.Bytes("INPUTEVENT".encode("utf-16le")), True),
# insn/characteristic(nzxor)
@@ -409,6 +454,9 @@ FEATURE_PRESENCE_TESTS = [
("mimikatz", "function=0x40105D", capa.features.Characteristic("nzxor"), False),
# insn/characteristic(nzxor): no security cookies
("mimikatz", "function=0x46D534", capa.features.Characteristic("nzxor"), False),
# insn/characteristic(nzxor): xorps
# viv needs fixup to recognize function, see above
("3b13b...", "function=0x10006860", capa.features.Characteristic("nzxor"), True),
# insn/characteristic(peb access)
("kernel32-64", "function=0x1800017D0", capa.features.Characteristic("peb access"), True),
("mimikatz", "function=0x4556E5", capa.features.Characteristic("peb access"), False),
@@ -472,11 +520,7 @@ def do_test_feature_count(get_extractor, sample, scope, feature, expected):
def get_extractor(path):
if sys.version_info >= (3, 0):
raise RuntimeError("no supported py3 backends yet")
else:
extractor = get_viv_extractor(path)
extractor = get_viv_extractor(path)
# overload the extractor so that the fixture exposes `extractor.path`
setattr(extractor, "path", path)
return extractor
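
Note on the fixture lookups above: they map a short name or hash prefix to a file under `data`, and this change makes the error message name the fixture that was not found. The same behavior can be sketched with a table instead of an `elif` chain. `SAMPLES_BY_PREFIX` and `get_data_path_by_prefix` are hypothetical names used only for illustration; the two entries are taken from this hunk.

```python
import os.path

# Hypothetical table; both entries are copied from the hunk above.
SAMPLES_BY_PREFIX = {
    "77329": "773290480d5445f11d3dc1b800728966.exe_",
    "3b13b": "3b13b6f1d7cd14dc4a097a12e2e505c0a4cff495262261e2bfc991df238b9b04.dll_",
}


def get_data_path_by_prefix(name, data_dir="data"):
    """Resolve a fixture name to a path, or fail with a message that includes the name."""
    for prefix, filename in SAMPLES_BY_PREFIX.items():
        if name.startswith(prefix):
            return os.path.join(data_dir, filename)
    raise ValueError("unexpected sample fixture: %s" % name)
```

Including the name in the exception is what makes a typo in a parametrized test ID easy to spot.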


@@ -1,104 +1,104 @@
# run this script from within IDA with ./tests/data/mimikatz.exe open
import sys
import logging
import os.path
import binascii
import traceback
import pytest
try:
sys.path.append(os.path.dirname(__file__))
from fixtures import *
finally:
sys.path.pop()
logger = logging.getLogger("test_ida_features")
def check_input_file(wanted):
import idautils
# some versions (7.4) of IDA return a truncated version of the MD5.
# https://github.com/idapython/bin/issues/11
try:
found = idautils.GetInputFileMD5()[:31].decode("ascii").lower()
except UnicodeDecodeError:
# in IDA 7.5 or so, GetInputFileMD5 started returning raw binary
# rather than the hex digest
found = binascii.hexlify(idautils.GetInputFileMD5()[:15]).decode("ascii").lower()
if not wanted.startswith(found):
raise RuntimeError("please run the tests against sample with MD5: `%s`" % (wanted))
def get_ida_extractor(_path):
check_input_file("5f66b82558ca92e54e77f216ef4c066c")
# have to import this inline so pytest doesn't bail outside of IDA
import capa.features.extractors.ida
return capa.features.extractors.ida.IdaFeatureExtractor()
@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
def test_ida_features():
for (sample, scope, feature, expected) in FEATURE_PRESENCE_TESTS + FEATURE_PRESENCE_TESTS_IDA:
id = make_test_id((sample, scope, feature, expected))
try:
check_input_file(get_sample_md5_by_name(sample))
except RuntimeError:
print("SKIP %s" % (id))
continue
scope = resolve_scope(scope)
sample = resolve_sample(sample)
try:
do_test_feature_presence(get_ida_extractor, sample, scope, feature, expected)
except Exception as e:
print("FAIL %s" % (id))
traceback.print_exc()
else:
print("OK %s" % (id))
@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
def test_ida_feature_counts():
for (sample, scope, feature, expected) in FEATURE_COUNT_TESTS:
id = make_test_id((sample, scope, feature, expected))
try:
check_input_file(get_sample_md5_by_name(sample))
except RuntimeError:
print("SKIP %s" % (id))
continue
scope = resolve_scope(scope)
sample = resolve_sample(sample)
try:
do_test_feature_count(get_ida_extractor, sample, scope, feature, expected)
except Exception as e:
print("FAIL %s" % (id))
traceback.print_exc()
else:
print("OK %s" % (id))
if __name__ == "__main__":
print("-" * 80)
# invoke all functions in this module that start with `test_`
for name in dir(sys.modules[__name__]):
if not name.startswith("test_"):
continue
test = getattr(sys.modules[__name__], name)
logger.debug("invoking test: %s", name)
sys.stderr.flush()
test()
print("DONE")
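
Note on `check_input_file`: its comments describe two IDA quirks, a hex digest truncated by one character and, in later versions, a raw binary digest instead of hex text. The normalization can be exercised outside IDA with a small stand-alone sketch. `normalize_md5` and `md5_matches` are hypothetical names, and the input is assumed to be a `bytes` value like the one `GetInputFileMD5` is described as returning.

```python
import binascii


def normalize_md5(value):
    """Return a lowercase hex prefix for an MD5 given as hex text bytes or raw digest bytes."""
    try:
        # hex digest, possibly truncated to 31 characters
        return value[:31].decode("ascii").lower()
    except UnicodeDecodeError:
        # raw digest bytes: hex-encode the first 15 bytes (30 hex characters)
        return binascii.hexlify(value[:15]).decode("ascii").lower()


def md5_matches(wanted, value):
    """Compare by prefix, since either form may be truncated."""
    return wanted.startswith(normalize_md5(value))
```

The sketch inherits one limitation of the approach: a raw digest whose bytes all happen to be below 0x80 would decode as ASCII and be misread as text.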


@@ -7,6 +7,7 @@
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import sys
import json
import textwrap
import pytest
@@ -19,7 +20,6 @@ import capa.features
from capa.engine import *
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_main(z9324d_extractor):
# tests rules can be loaded successfully and all output modes
path = z9324d_extractor.path
@@ -29,7 +29,6 @@ def test_main(z9324d_extractor):
assert capa.main.main([path]) == 0
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_main_single_rule(z9324d_extractor, tmpdir):
# tests a single rule can be loaded successfully
RULE_CONTENT = textwrap.dedent(
@@ -58,7 +57,6 @@ def test_main_single_rule(z9324d_extractor, tmpdir):
)
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_main_non_ascii_filename(pingtaest_extractor, tmpdir, capsys):
# on py2.7, need to be careful about str (which can hold bytes)
# vs unicode (which is only unicode characters).
@@ -71,18 +69,22 @@ def test_main_non_ascii_filename(pingtaest_extractor, tmpdir, capsys):
std = capsys.readouterr()
# but here, we have to use a unicode instance,
# because capsys has decoded the output for us.
assert pingtaest_extractor.path.decode("utf-8") in std.out
if sys.version_info >= (3, 0):
assert pingtaest_extractor.path in std.out
else:
assert pingtaest_extractor.path.decode("utf-8") in std.out
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_main_non_ascii_filename_nonexistent(tmpdir, caplog):
NON_ASCII_FILENAME = "täst_not_there.exe"
assert capa.main.main(["-q", NON_ASCII_FILENAME]) == -1
assert NON_ASCII_FILENAME.decode("utf-8") in caplog.text
if sys.version_info >= (3, 0):
assert NON_ASCII_FILENAME in caplog.text
else:
assert NON_ASCII_FILENAME.decode("utf-8") in caplog.text
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_main_shellcode(z499c2_extractor):
path = z499c2_extractor.path
assert capa.main.main([path, "-vv", "-f", "sc32"]) == 0
@@ -137,7 +139,6 @@ def test_ruleset():
assert len(rules.basic_block_rules) == 1
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_match_across_scopes_file_function(z9324d_extractor):
rules = capa.rules.RuleSet(
[
@@ -201,7 +202,6 @@ def test_match_across_scopes_file_function(z9324d_extractor):
assert ".text section and install service" in capabilities
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_match_across_scopes(z9324d_extractor):
rules = capa.rules.RuleSet(
[
@@ -264,7 +264,6 @@ def test_match_across_scopes(z9324d_extractor):
assert "kill thread program" in capabilities
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_subscope_bb_rules(z9324d_extractor):
rules = capa.rules.RuleSet(
[
@@ -289,7 +288,6 @@ def test_subscope_bb_rules(z9324d_extractor):
assert "test rule" in capabilities
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_byte_matching(z9324d_extractor):
rules = capa.rules.RuleSet(
[
@@ -312,7 +310,6 @@ def test_byte_matching(z9324d_extractor):
assert "byte match test" in capabilities
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_count_bb(z9324d_extractor):
rules = capa.rules.RuleSet(
[
@@ -336,7 +333,6 @@ def test_count_bb(z9324d_extractor):
assert "count bb" in capabilities
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_fix262(pma16_01_extractor, capsys):
# tests rules can be loaded successfully and all output modes
path = pma16_01_extractor.path
@@ -347,7 +343,6 @@ def test_fix262(pma16_01_extractor, capsys):
assert "www.practicalmalwareanalysis.com" not in std.out
@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
def test_not_render_rules_also_matched(z9324d_extractor, capsys):
# rules that are also matched by other rules should not get rendered by default.
# this cuts down on the amount of output while giving approx the same detail.
@@ -371,3 +366,20 @@ def test_not_render_rules_also_matched(z9324d_extractor, capsys):
assert "act as TCP client" in std.out
assert "connect TCP socket" in std.out
assert "create TCP socket" in std.out
# tests that main works with different backends
def test_backend_option(capsys):
if sys.version_info > (3, 0):
path = get_data_path_by_name("pma16-01")
assert capa.main.main([path, "-j", "-b", capa.main.BACKEND_VIV]) == 0
std = capsys.readouterr()
std_json = json.loads(std.out)
assert std_json["meta"]["analysis"]["extractor"] == "VivisectFeatureExtractor"
assert len(std_json["rules"]) > 0
assert capa.main.main([path, "-j", "-b", capa.main.BACKEND_SMDA]) == 0
std = capsys.readouterr()
std_json = json.loads(std.out)
assert std_json["meta"]["analysis"]["extractor"] == "SmdaFeatureExtractor"
assert len(std_json["rules"]) > 0
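
Note on the two non-ASCII filename tests above: they branch on `sys.version_info` because a path is a byte string on Python 2.7 but already text on Python 3. One way to express that without a version check is a small coercion helper. `to_text` is a hypothetical name and not part of capa; this is only a sketch of the idea.

```python
def to_text(value, encoding="utf-8"):
    """Return text for either a byte string (py2 `str`, py3 `bytes`) or a text string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

With such a helper, an assertion like `assert to_text(NON_ASCII_FILENAME) in caplog.text` would read the same on both interpreters.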


@@ -282,7 +282,8 @@ def test_lib_rules():
),
]
)
assert len(rules.function_rules) == 1
# lib rules are added to the rule set
assert len(rules.function_rules) == 2
def test_subscope_rules():
@@ -680,6 +681,25 @@ def test_explicit_string_values_int():
assert (String("0x123") in children) == True
def test_string_values_special_characters():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- or:
- string: "hello\\r\\nworld"
- string: "bye\\nbye"
description: "test description"
"""
)
r = capa.rules.Rule.from_yaml(rule)
children = list(r.statement.get_children())
assert (String("hello\r\nworld") in children) == True
assert (String("bye\nbye") in children) == True
def test_regex_values_always_string():
rules = [
capa.rules.Rule.from_yaml(
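
Note on `test_string_values_special_characters`: it relies on two layers of escaping. In the Python source, `\\r\\n` produces a literal backslash-r backslash-n in the rule text, and the double-quoted YAML scalar then decodes those escapes into CR and LF. The sketch below illustrates both layers using only the standard library. It uses `json.loads` as a stand-in for the YAML parser, on the assumption that the `\r` and `\n` escapes behave the same in both double-quoted forms. `decode_double_quoted` is a hypothetical name.

```python
import json


def decode_double_quoted(scalar_text):
    """Decode a double-quoted scalar; stand-in for YAML, valid only for escapes shared with JSON."""
    return json.loads(scalar_text)


# layer 1: Python source escapes leave literal backslashes in the rule text
rule_scalar = "\"hello\\r\\nworld\""

# layer 2: the double-quoted scalar turns them into real control characters
decoded = decode_double_quoted(rule_scalar)
```

After both layers, `decoded` holds an actual carriage return and line feed, which is what the test compares against `String("hello\r\nworld")`.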


@@ -0,0 +1,31 @@
# Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at: [package root]/LICENSE.txt
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import sys
from fixtures import *
@parametrize(
"sample,scope,feature,expected",
FEATURE_PRESENCE_TESTS,
indirect=["sample", "scope"],
)
@pytest.mark.xfail(sys.version_info < (3, 0), reason="SMDA only works on py3")
@pytest.mark.xfail(sys.platform == "win32", reason="SMDA bug: https://github.com/danielplohmann/smda/issues/20")
def test_smda_features(sample, scope, feature, expected):
do_test_feature_presence(get_smda_extractor, sample, scope, feature, expected)
@parametrize(
"sample,scope,feature,expected",
FEATURE_COUNT_TESTS,
indirect=["sample", "scope"],
)
def test_smda_feature_counts(sample, scope, feature, expected):
with xfail(sys.version_info < (3, 0), reason="SMDA only works on py3"):
do_test_feature_count(get_smda_extractor, sample, scope, feature, expected)


@@ -16,8 +16,7 @@ from fixtures import *
indirect=["sample", "scope"],
)
def test_viv_features(sample, scope, feature, expected):
with xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2"):
do_test_feature_presence(get_viv_extractor, sample, scope, feature, expected)
do_test_feature_presence(get_viv_extractor, sample, scope, feature, expected)
@parametrize(
@@ -26,5 +25,4 @@ def test_viv_features(sample, scope, feature, expected):
indirect=["sample", "scope"],
)
def test_viv_feature_counts(sample, scope, feature, expected):
with xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2"):
do_test_feature_count(get_viv_extractor, sample, scope, feature, expected)
do_test_feature_count(get_viv_extractor, sample, scope, feature, expected)