Compare commits

203 Commits

Author SHA1 Message Date
Willi Ballenthin
c0851fc643 Merge pull request #863 from mandiant/v3.1.0
version: v3.1.0
2022-01-12 14:18:22 -07:00
Willi Ballenthin
de7592b351 changelog: add additional contributor 2022-01-11 14:29:15 -07:00
Willi Ballenthin
5530bbad53 Update CHANGELOG.md
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2022-01-11 14:28:17 -07:00
Willi Ballenthin
4f0067e408 Update CHANGELOG.md
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2022-01-11 14:27:59 -07:00
Willi Ballenthin
b444c28a19 changelog: fix format 2022-01-11 10:05:40 -07:00
Willi Ballenthin
a4cc409c95 Update capa/version.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2022-01-10 12:39:07 -07:00
Moritz
fcb08501c0 Merge pull request #865 from mandiant/mr-tz-patch-1
Update global_.py
2022-01-10 19:21:24 +01:00
Moritz
cb2d00cefc Update global_.py 2022-01-10 19:04:52 +01:00
Willi Ballenthin
1cb9fc8a40 Merge pull request #864 from doomedraven/patch-1
Fix deprecation warning from IDA
2022-01-10 10:52:10 -07:00
doomedraven
85cfc04bdb Fix deprecation warning from IDA
```
    if info.procName == "metapc" and info.is_64bit():
```
Please use "procname" instead of "procName" ("procName" is kept for backward-compatibility, and will be removed soon.)
2022-01-10 18:37:59 +01:00
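The warning quoted above asks callers to read `procname` instead of `procName`. A minimal sketch of a compatibility lookup follows, assuming only that the info object exposes one of the two attributes. The helper `get_procname` and the stand-in objects are hypothetical and are not part of capa or IDA.

```python
from types import SimpleNamespace


def get_procname(info):
    """Return the processor name, preferring the newer `procname` attribute."""
    # Hypothetical helper: try the new spelling first, then the deprecated one.
    for attr in ("procname", "procName"):
        if hasattr(info, attr):
            return getattr(info, attr)
    raise AttributeError("info object has neither procname nor procName")


# Stand-ins for an IDA info structure (assumption: real code obtains it from IDA).
new_style = SimpleNamespace(procname="metapc")
old_style = SimpleNamespace(procName="metapc")

print(get_procname(new_style), get_procname(old_style))
```

Trying the new name first means the deprecated attribute is only touched on versions that lack the replacement.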
Willi Ballenthin
6555a3604f changelog: intro section 2022-01-10 09:49:00 -07:00
Willi Ballenthin
a97262d022 changelog: v3.1.0 2022-01-10 09:39:46 -07:00
Willi Ballenthin
8ad54271e9 version: v3.1.0 2022-01-10 09:33:39 -07:00
Willi Ballenthin
e5b9a20d09 changelog: add rule changes and contributors 2022-01-10 09:32:49 -07:00
Willi Ballenthin
0d37d182ea changelog: add some additional entries 2022-01-10 09:26:14 -07:00
Willi Ballenthin
6690634a3f Merge pull request #858 from mandiant/dependabot/pip/types-pyyaml-6.0.3
build(deps-dev): bump types-pyyaml from 6.0.1 to 6.0.3
2022-01-10 08:26:25 -07:00
dependabot[bot]
8f3730bae3 build(deps-dev): bump types-pyyaml from 6.0.1 to 6.0.3
Bumps [types-pyyaml](https://github.com/python/typeshed) from 6.0.1 to 6.0.3.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-pyyaml
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-10 15:25:42 +00:00
Willi Ballenthin
8f4e726774 Merge pull request #859 from mandiant/dependabot/pip/types-tabulate-0.8.5
build(deps-dev): bump types-tabulate from 0.8.4 to 0.8.5
2022-01-10 08:25:12 -07:00
Willi Ballenthin
5b8eda0f08 Merge pull request #861 from mandiant/dependabot/pip/mypy-0.931
build(deps-dev): bump mypy from 0.930 to 0.931
2022-01-10 08:24:59 -07:00
Willi Ballenthin
f5f62bbd71 Merge pull request #862 from mandiant/dependabot/pip/types-psutil-5.8.19
build(deps-dev): bump types-psutil from 5.8.17 to 5.8.19
2022-01-10 08:24:41 -07:00
dependabot[bot]
24c3edc7ec build(deps-dev): bump types-psutil from 5.8.17 to 5.8.19
Bumps [types-psutil](https://github.com/python/typeshed) from 5.8.17 to 5.8.19.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-10 14:18:21 +00:00
dependabot[bot]
0e3d46ef5e build(deps-dev): bump mypy from 0.930 to 0.931
Bumps [mypy](https://github.com/python/mypy) from 0.930 to 0.931.
- [Release notes](https://github.com/python/mypy/releases)
- [Commits](https://github.com/python/mypy/compare/v0.930...v0.931)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-10 14:18:19 +00:00
dependabot[bot]
a3546b65f7 build(deps-dev): bump types-tabulate from 0.8.4 to 0.8.5
Bumps [types-tabulate](https://github.com/python/typeshed) from 0.8.4 to 0.8.5.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-tabulate
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-10 14:18:09 +00:00
Willi Ballenthin
01b694b6ab Merge pull request #851 from kn0wl3dge/fix/430
smda: fix negative number extraction
2022-01-03 12:08:41 -07:00
Moritz
3598f83091 Merge pull request #856 from mandiant/dependabot/pip/psutil-5.9.0
build(deps-dev): bump psutil from 5.8.0 to 5.9.0
2022-01-03 17:33:56 +01:00
Moritz
2085dd7b02 Merge pull request #853 from mandiant/dependabot/pip/ruamel-yaml-0.17.20
build(deps): bump ruamel-yaml from 0.17.19 to 0.17.20
2022-01-03 17:33:40 +01:00
Moritz
65d916332d Merge pull request #855 from mandiant/dependabot/pip/types-psutil-5.8.17
build(deps-dev): bump types-psutil from 5.8.16 to 5.8.17
2022-01-03 17:33:26 +01:00
Moritz
1937efce88 Merge pull request #852 from mandiant/dependabot/pip/types-tabulate-0.8.4
build(deps-dev): bump types-tabulate from 0.8.3 to 0.8.4
2022-01-03 17:33:19 +01:00
Moritz
501d607b3a Merge pull request #854 from mandiant/dependabot/pip/types-colorama-0.4.5
build(deps-dev): bump types-colorama from 0.4.4 to 0.4.5
2022-01-03 17:33:07 +01:00
dependabot[bot]
7d6670c59e build(deps-dev): bump psutil from 5.8.0 to 5.9.0
Bumps [psutil](https://github.com/giampaolo/psutil) from 5.8.0 to 5.9.0.
- [Release notes](https://github.com/giampaolo/psutil/releases)
- [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst)
- [Commits](https://github.com/giampaolo/psutil/compare/release-5.8.0...release-5.9.0)

---
updated-dependencies:
- dependency-name: psutil
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 14:11:03 +00:00
dependabot[bot]
fe608db16a build(deps-dev): bump types-psutil from 5.8.16 to 5.8.17
Bumps [types-psutil](https://github.com/python/typeshed) from 5.8.16 to 5.8.17.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 14:10:58 +00:00
dependabot[bot]
be1f313d57 build(deps-dev): bump types-colorama from 0.4.4 to 0.4.5
Bumps [types-colorama](https://github.com/python/typeshed) from 0.4.4 to 0.4.5.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-colorama
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 14:10:55 +00:00
dependabot[bot]
cb77c55d2c build(deps): bump ruamel-yaml from 0.17.19 to 0.17.20
Bumps [ruamel-yaml](https://sourceforge.net/p/ruamel-yaml/code/ci/default/tree) from 0.17.19 to 0.17.20.

---
updated-dependencies:
- dependency-name: ruamel-yaml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 14:10:53 +00:00
dependabot[bot]
417aa35c60 build(deps-dev): bump types-tabulate from 0.8.3 to 0.8.4
Bumps [types-tabulate](https://github.com/python/typeshed) from 0.8.3 to 0.8.4.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-tabulate
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-03 14:10:46 +00:00
Baptistin Boilot
18877eb676 changelog: add fixed issue 2021-12-31 21:14:56 +01:00
Baptistin Boilot
a9670c9510 smda: fix number extractor to return only unsigned values
SmdaInstruction operands are python `str` objects. SMDA number operands are signed integers.
This commit adds a converter to the SMDA number extractor.
The goal is to convert any signed number to the two’s complement representation with the correct bitness.
2021-12-31 20:10:36 +01:00
Baptistin Boilot
8474369575 tests: add fixtures for two's complement numbers
Add fixtures to validate the following number features:
- number(0x0): to check feature extraction for null number
- number(0xFFFFFFFF): to check feature extraction for -1 number
- number(0xFFFFFFF0): to check feature extraction for negative number (-0x10 in this case)
2021-12-31 20:08:56 +01:00
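The two commits above describe converting signed operands to their two's complement form at a given bitness. A minimal sketch of that idea, assuming operands arrive as strings such as `-0x10`; the names `to_unsigned` and `parse_operand` are illustrative and are not taken from the capa source.

```python
def to_unsigned(value: int, bits: int) -> int:
    """Map a signed integer onto its two's complement representation of `bits` width."""
    # Masking with 2**bits - 1 wraps negative values into the unsigned range.
    return value & ((1 << bits) - 1)


def parse_operand(text: str, bits: int = 32) -> int:
    # Assumption: the operand is a string like "-0x10", "0x0" or "-1".
    # Base 0 lets int() honour both the sign and the 0x prefix.
    return to_unsigned(int(text, 0), bits)


for operand in ("0x0", "-1", "-0x10"):
    print(operand, hex(parse_operand(operand)))
```

The three inputs mirror the fixtures listed above: zero, -1 and -0x10 at 32 bits.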
Baptistin Boilot
4739d121a2 scripts: add backend parameter (-b) to show-features.py 2021-12-31 20:07:34 +01:00
Mike Hunhoff
e47f5a2548 Merge pull request #849 from mandiant/fix/845
capa explorer: updating supported IDA versions
2021-12-31 10:48:53 -07:00
Willi Ballenthin
51f5628383 Merge pull request #847 from mandiant/dependabot/pip/ruamel-yaml-0.17.19
build(deps): bump ruamel-yaml from 0.17.17 to 0.17.19
2021-12-29 09:44:24 -07:00
Willi Ballenthin
aa67a1b285 Merge pull request #846 from mandiant/dependabot/pip/types-psutil-5.8.16
build(deps-dev): bump types-psutil from 5.8.15 to 5.8.16
2021-12-29 09:44:15 -07:00
Willi Ballenthin
d22e51fd84 Merge pull request #848 from mandiant/dependabot/pip/mypy-0.930
build(deps-dev): bump mypy from 0.920 to 0.930
2021-12-29 09:42:21 -07:00
Michael Hunhoff
cde4af40fe capa explorer: updating supported IDA versions 2021-12-28 10:51:53 -07:00
dependabot[bot]
a147755d13 build(deps-dev): bump mypy from 0.920 to 0.930
Bumps [mypy](https://github.com/python/mypy) from 0.920 to 0.930.
- [Release notes](https://github.com/python/mypy/releases)
- [Commits](https://github.com/python/mypy/compare/v0.920...v0.930)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-27 14:12:16 +00:00
dependabot[bot]
7b6c293069 build(deps): bump ruamel-yaml from 0.17.17 to 0.17.19
Bumps [ruamel-yaml](https://sourceforge.net/p/ruamel-yaml/code/ci/default/tree) from 0.17.17 to 0.17.19.

---
updated-dependencies:
- dependency-name: ruamel-yaml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-27 14:12:12 +00:00
dependabot[bot]
b3f1244641 build(deps-dev): bump types-psutil from 5.8.15 to 5.8.16
Bumps [types-psutil](https://github.com/python/typeshed) from 5.8.15 to 5.8.16.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-27 14:12:06 +00:00
Capa Bot
e6423700b9 Sync capa rules submodule 2021-12-23 16:34:46 +00:00
Moritz
9462a26a05 Merge pull request #844 from mandiant/dependabot/pip/mypy-0.920
build(deps-dev): bump mypy from 0.910 to 0.920
2021-12-20 16:31:41 +01:00
dependabot[bot]
c059a52d0e build(deps-dev): bump mypy from 0.910 to 0.920
Bumps [mypy](https://github.com/python/mypy) from 0.910 to 0.920.
- [Release notes](https://github.com/python/mypy/releases)
- [Commits](https://github.com/python/mypy/compare/v0.910...v0.920)

---
updated-dependencies:
- dependency-name: mypy
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-20 14:09:06 +00:00
Capa Bot
a221db8a59 Sync capa rules submodule 2021-12-20 12:48:22 +00:00
Moritz
df43ed0219 Merge pull request #842 from mandiant/fix/maec-mal-fam
support maec/malware-family meta
2021-12-20 13:15:50 +01:00
Capa Bot
90430f52c6 Sync capa-testfiles submodule 2021-12-15 15:33:39 +00:00
Moritz Raabe
4e7f0b4591 support maec/malware-family meta 2021-12-15 10:40:34 +01:00
Capa Bot
bda76c22ec Sync capa rules submodule 2021-12-14 21:52:49 +00:00
Capa Bot
d67223c321 Sync capa rules submodule 2021-12-14 21:46:38 +00:00
Capa Bot
21278ff595 Sync capa rules submodule 2021-12-14 21:45:58 +00:00
Capa Bot
21fd6b27e2 Sync capa rules submodule 2021-12-13 18:48:16 +00:00
Capa Bot
cc8d57b242 Sync capa-testfiles submodule 2021-12-13 17:24:52 +00:00
Capa Bot
6081f4573c Sync capa-testfiles submodule 2021-12-13 17:24:32 +00:00
Capa Bot
ea2cafa715 Sync capa-testfiles submodule 2021-12-13 17:24:02 +00:00
Capa Bot
a34c993e31 Sync capa rules submodule 2021-12-07 04:32:49 +00:00
Willi Ballenthin
1a5fc3a21a Merge pull request #839 from cl3o/master
types: Add assert_never for exhaustiveness checking with mypy
2021-12-06 13:55:41 -07:00
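The `assert_never` pattern named in this pull request is a common way to have mypy verify that every case of a closed type is handled. A minimal sketch under stated assumptions: the `Format` enum and `describe` function are hypothetical, and capa's own `assert_never` may differ in detail.

```python
from enum import Enum
from typing import NoReturn


class Format(Enum):
    # Hypothetical enum used only for illustration.
    PE = "pe"
    ELF = "elf"


def assert_never(value: NoReturn) -> NoReturn:
    # Reached at runtime only if a case was missed. mypy reports an error at
    # the call site whenever `value` can still have some type there.
    raise AssertionError(f"unhandled value: {value!r}")


def describe(fmt: Format) -> str:
    if fmt is Format.PE:
        return "Windows executable"
    elif fmt is Format.ELF:
        return "Linux executable"
    else:
        # If a new member is added to Format, mypy flags this line.
        assert_never(fmt)


print(describe(Format.PE))
```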
cl3o
c15a9a72f5 Add local variable for easy_rules_by_feature at the beginning of match 2021-12-06 20:55:15 +01:00
cl3o
5b35058338 Forgot to add the second fix to the first commit. 2021-12-06 20:32:44 +01:00
cl3o
a0ca6e18c8 Made proposed changes to fix mypy errors 2021-12-06 20:30:07 +01:00
Capa Bot
1917004292 Sync capa rules submodule 2021-12-06 19:22:59 +00:00
Capa Bot
8ee3bb08bc Sync capa rules submodule 2021-12-06 18:24:54 +00:00
Capa Bot
7e96059fb5 Sync capa rules submodule 2021-12-06 17:58:59 +00:00
Capa Bot
4f7f06d316 Sync capa rules submodule 2021-12-06 17:57:11 +00:00
Capa Bot
448b5392be Sync capa rules submodule 2021-12-06 17:56:26 +00:00
Willi Ballenthin
6f5f3e091a Merge pull request #840 from mandiant/dependabot/pip/black-21.12b0
build(deps-dev): bump black from 21.11b1 to 21.12b0
2021-12-06 10:45:51 -07:00
dependabot[bot]
fa6a2069ce build(deps-dev): bump black from 21.11b1 to 21.12b0
Bumps [black](https://github.com/psf/black) from 21.11b1 to 21.12b0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/commits)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-12-06 14:12:23 +00:00
Capa Bot
09fd371b9d Sync capa-testfiles submodule 2021-12-06 10:13:41 +00:00
Capa Bot
a598745938 Sync capa-testfiles submodule 2021-12-06 10:06:57 +00:00
Capa Bot
7751f693c8 Sync capa-testfiles submodule 2021-12-06 10:02:45 +00:00
Capa Bot
7ade9ca43e Sync capa-testfiles submodule 2021-12-06 10:01:17 +00:00
cl3o
061a66e437 create function assert_never 2021-12-04 19:02:54 +01:00
Capa Bot
39536e2727 Sync capa rules submodule 2021-12-03 15:29:51 +00:00
Capa Bot
38038626d4 Sync capa rules submodule 2021-12-03 15:29:28 +00:00
Capa Bot
c3d34abe89 Sync capa-testfiles submodule 2021-12-03 12:12:30 +00:00
Capa Bot
baf5005998 Sync capa-testfiles submodule 2021-12-03 12:12:20 +00:00
Capa Bot
107c3c0cf9 Sync capa rules submodule 2021-11-30 22:06:21 +00:00
Capa Bot
2d1bd37816 Sync capa rules submodule 2021-11-30 15:24:28 +00:00
Capa Bot
de017b15d0 Sync capa-testfiles submodule 2021-11-30 15:24:09 +00:00
Capa Bot
3b0974ae3e Sync capa rules submodule 2021-11-29 23:46:52 +00:00
Willi Ballenthin
cf6cbc16df Merge pull request #838 from mandiant/dependabot/pip/types-psutil-5.8.15
build(deps-dev): bump types-psutil from 5.8.14 to 5.8.15
2021-11-29 08:47:44 -07:00
dependabot[bot]
bd60a8d9cd build(deps-dev): bump types-psutil from 5.8.14 to 5.8.15
Bumps [types-psutil](https://github.com/python/typeshed) from 5.8.14 to 5.8.15.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-29 14:09:09 +00:00
Capa Bot
c77240c6b4 Sync capa rules submodule 2021-11-26 16:21:34 +00:00
Moritz
14d803c604 Merge pull request #837 from mandiant/dependabot/pip/black-21.11b1
build(deps-dev): bump black from 21.10b0 to 21.11b1
2021-11-22 18:45:02 +01:00
dependabot[bot]
f764829ca9 build(deps-dev): bump black from 21.10b0 to 21.11b1
Bumps [black](https://github.com/psf/black) from 21.10b0 to 21.11b1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/commits)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-22 14:10:19 +00:00
Willi Ballenthin
418eedd7bd freeze: fix doc describing format 2021-11-17 12:06:56 -07:00
Willi Ballenthin
b9f1fe56c8 Merge pull request #834 from mandiant/williballenthin-patch-1
setup: bump viv-utils to v0.6.9
2021-11-16 11:21:30 -07:00
Willi Ballenthin
7e50a957ff ci: tests: python versions are strings not floats 2021-11-16 10:12:34 -07:00
Willi Ballenthin
137cff6127 ci: tests: test under py3.10 too 2021-11-16 10:06:32 -07:00
Willi Ballenthin
807b99e5e5 changelog 2021-11-15 14:12:07 -07:00
Willi Ballenthin
e21c69f4e3 setup: bump viv-utils to v0.6.9
closes #816 
closes #683
2021-11-15 14:10:48 -07:00
Moritz
9f7daca86e Merge pull request #833 from mandiant/dependabot/pip/types-pyyaml-6.0.1
build(deps-dev): bump types-pyyaml from 6.0.0 to 6.0.1
2021-11-15 16:54:11 +01:00
Moritz
1b89e274c9 Merge pull request #832 from mandiant/dependabot/pip/isort-5.10.1
build(deps-dev): bump isort from 5.10.0 to 5.10.1
2021-11-15 16:54:02 +01:00
Moritz
dd768dc080 Merge pull request #831 from mandiant/dependabot/pip/viv-utils-flirt--0.6.8
build(deps): bump viv-utils[flirt] from 0.6.7 to 0.6.8
2021-11-15 16:53:53 +01:00
dependabot[bot]
4aea481967 build(deps-dev): bump types-pyyaml from 6.0.0 to 6.0.1
Bumps [types-pyyaml](https://github.com/python/typeshed) from 6.0.0 to 6.0.1.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-pyyaml
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-15 14:12:07 +00:00
dependabot[bot]
265629d127 build(deps-dev): bump isort from 5.10.0 to 5.10.1
Bumps [isort](https://github.com/pycqa/isort) from 5.10.0 to 5.10.1.
- [Release notes](https://github.com/pycqa/isort/releases)
- [Changelog](https://github.com/PyCQA/isort/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pycqa/isort/compare/5.10.0...5.10.1)

---
updated-dependencies:
- dependency-name: isort
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-15 14:12:04 +00:00
dependabot[bot]
cef0cb809f build(deps): bump viv-utils[flirt] from 0.6.7 to 0.6.8
Bumps [viv-utils[flirt]](https://github.com/williballenthin/viv-utils) from 0.6.7 to 0.6.8.
- [Release notes](https://github.com/williballenthin/viv-utils/releases)
- [Commits](https://github.com/williballenthin/viv-utils/compare/v0.6.7...v0.6.8)

---
updated-dependencies:
- dependency-name: viv-utils[flirt]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-15 14:11:59 +00:00
Willi Ballenthin
57fe1e27b6 Merge pull request #830 from mandiant/perf/rule-selection
perf: don't try to match rules that will never match
2021-11-12 11:54:29 -07:00
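One way to avoid evaluating rules that can never match is to index rules by the features they mention and only consider rules that share a feature with the input. The sketch below illustrates that general idea and is not capa's implementation. The rule names, the feature strings and the "all features required" rule model are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical, simplified rules: a rule matches when all listed features are present.
RULES: Dict[str, Set[str]] = {
    "create file": {"api(CreateFileA)"},
    "encrypt data": {"mnemonic(xor)", "characteristic(loop)"},
    "open socket": {"api(socket)"},
}


def build_index(rules: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each feature to the names of the rules that mention it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, required in rules.items():
        for feature in required:
            index[feature].add(name)
    return index


def match(rules: Dict[str, Set[str]], index: Dict[str, Set[str]], features: Set[str]) -> List[str]:
    # Only rules that mention at least one observed feature can possibly match.
    candidates: Set[str] = set()
    for feature in features:
        candidates |= index.get(feature, set())
    # Fully evaluate the (usually much smaller) candidate set.
    return sorted(name for name in candidates if rules[name] <= features)


index = build_index(RULES)
print(match(RULES, index, {"api(CreateFileA)", "mnemonic(xor)"}))
```

Rules built on negation cannot be selected this way, because they may match when a feature is absent; several commits in this branch deal with `not` and `optional` for that reason.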
Willi Ballenthin
83253eb7d0 rules: better variable name 2021-11-12 11:53:03 -07:00
Willi Ballenthin
9b5e8ff45d Update capa/rules.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-11-12 11:51:39 -07:00
William Ballenthin
cdfacc6247 Merge branch 'master' of github.com:fireeye/capa into perf/rule-selection 2021-11-10 14:30:08 -07:00
Capa Bot
10d747cc8c Sync capa rules submodule 2021-11-10 21:29:25 +00:00
William Ballenthin
a6b366602c mypy 2021-11-10 14:21:28 -07:00
William Ballenthin
80fb9dec3c pep8 2021-11-10 14:15:52 -07:00
William Ballenthin
68c86cf620 rules: easy/hard: better detect edge cases in optional, some, and range 2021-11-10 14:13:57 -07:00
William Ballenthin
e550d48bcd linter: optional maps to some, not range 2021-11-10 14:13:37 -07:00
William Ballenthin
1aaaa8919c rules: easy/hard: simplify indexing by considering not: hard 2021-11-10 13:55:34 -07:00
William Ballenthin
72c2ffc40b linter: add checks for not and optional not under and 2021-11-10 13:47:30 -07:00
William Ballenthin
f7ab2fb13a rules: easy/hard rules: detect not/optional at the root 2021-11-10 13:36:10 -07:00
William Ballenthin
3a1272246f rules: code consistency 2021-11-10 13:36:00 -07:00
William Ballenthin
6039a33bf8 engine: remove old import 2021-11-10 12:56:40 -07:00
William Ballenthin
2d68fb2536 pep8 2021-11-10 12:51:27 -07:00
William Ballenthin
845df282ef tests: split out match tests and validate alternative algorithms 2021-11-10 12:44:58 -07:00
William Ballenthin
1406dc28d9 rules: ruleset: fix collection of features under not statements 2021-11-10 12:44:19 -07:00
William Ballenthin
67884dd255 rules: match: more documentation 2021-11-09 16:42:32 -07:00
William Ballenthin
2bf05ac631 rules: index easy/hard: better handle not: statements 2021-11-09 16:37:30 -07:00
William Ballenthin
8cb04e4737 Merge branch 'master' into perf/rule-selection 2021-11-09 16:28:03 -07:00
William Ballenthin
733126591e Merge branch 'perf/query-optimizer' 2021-11-09 16:27:09 -07:00
William Ballenthin
d4d801c246 optimizer: tweak costs slightly 2021-11-09 16:26:26 -07:00
Willi Ballenthin
84ba32a8fe Merge pull request #829 from mandiant/perf/query-optimizer
perf: add query optimizer
2021-11-09 16:25:22 -07:00
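A query optimizer of this kind can be pictured as reordering the children of each logic node so that cheap checks run before expensive ones. The following is a minimal sketch, assuming a recursive cost model. The `Node` class and the numbers in `COSTS` are hypothetical, since the source does not state the actual costs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # e.g. "api", "regex", "and", "or"
    children: List["Node"] = field(default_factory=list)


# Hypothetical per-node costs, chosen only for illustration.
COSTS = {"api": 1, "number": 1, "string": 2, "regex": 5}


def cost(node: Node) -> int:
    # A statement costs its own weight plus the cost of everything beneath it.
    return COSTS.get(node.kind, 1) + sum(cost(child) for child in node.children)


def optimize(node: Node) -> Node:
    for child in node.children:
        optimize(child)
    # Stable sort: children with equal cost keep their original order.
    node.children.sort(key=cost)
    return node


tree = Node("and", [Node("regex"), Node("or", [Node("string"), Node("api")]), Node("api")])
optimize(tree)
print([child.kind for child in tree.children])
```

Ordering by cost pays off mainly when evaluation can stop early, so this step pairs naturally with short circuiting.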
William Ballenthin
ea386d02b6 tests: add test demonstrating optimizer 2021-11-09 16:24:26 -07:00
William Ballenthin
77cac63443 Merge branch 'master' into perf/query-optimizer 2021-11-09 16:12:30 -07:00
Willi Ballenthin
9350ee9479 Merge pull request #827 from mandiant/perf/short-circuit
perf: short circuit logic nodes when appropriate
2021-11-09 16:10:20 -07:00
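Short circuiting means a logic node stops evaluating its children as soon as the outcome is decided. The sketch below shows the idea for `or`, `and` and `some` nodes. The function names and the `calls` log are assumptions made for illustration and do not reproduce `capa.engine`.

```python
from typing import Callable, List

calls: List[str] = []  # records which children were actually evaluated


def feature(name: str, present: bool) -> Callable[[], bool]:
    """Build a child node that logs its own evaluation."""
    def evaluate() -> bool:
        calls.append(name)
        return present
    return evaluate


def eval_or(children) -> bool:
    for child in children:
        if child():
            return True  # one success decides an `or`
    return False


def eval_and(children) -> bool:
    for child in children:
        if not child():
            return False  # one failure decides an `and`
    return True


def eval_some(count: int, children) -> bool:
    satisfied = 0
    for child in children:
        if child():
            satisfied += 1
            if satisfied >= count:
                return True  # enough children matched, skip the rest
    return satisfied >= count


print(eval_or([feature("a", True), feature("b", True)]), calls)
```

The trade-off is that children skipped this way are never recorded as matches, which may be why a later commit in this branch makes short circuiting configurable.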
Willi Ballenthin
025d156068 Merge pull request #828 from mandiant/profiling
profile infrastructure
2021-11-09 16:09:34 -07:00
William Ballenthin
7a4aee592b profile-time: add doc 2021-11-09 16:08:39 -07:00
Willi Ballenthin
f427c5e961 Update capa/engine.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-11-09 10:49:10 -07:00
Willi Ballenthin
51af2d4a56 Update capa/engine.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-11-09 10:49:01 -07:00
Willi Ballenthin
a68812b223 Update capa/engine.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2021-11-09 10:48:54 -07:00
William Ballenthin
e05f8c7034 changelog 2021-11-09 10:27:33 -07:00
William Ballenthin
182377581a main: use ruleset.match instead of engine.match 2021-11-09 09:52:45 -07:00
William Ballenthin
e647ae2ac4 rules: ruleset: add optimized match routine 2021-11-09 09:52:32 -07:00
William Ballenthin
1311da99ff rules: make Scope an enum 2021-11-09 09:51:50 -07:00
William Ballenthin
8badf226a2 engine: document match routine 2021-11-09 09:51:18 -07:00
William Ballenthin
6909d6a541 changelog 2021-11-08 16:04:15 -07:00
William Ballenthin
e287dc9a32 optimizer: fix sort order 2021-11-08 15:54:14 -07:00
William Ballenthin
152d0f3244 ruleset: add query optimizer 2021-11-08 15:34:59 -07:00
William Ballenthin
a6e2cfc90a Merge branch 'profiling' into perf/short-circuit 2021-11-08 15:24:50 -07:00
William Ballenthin
18c30e4f12 main: remove perf debug msgs 2021-11-08 15:24:43 -07:00
William Ballenthin
3c4f4d302c Merge branch 'profiling' into perf/short-circuit 2021-11-08 15:23:23 -07:00
William Ballenthin
2abebfbce7 main: remove perf messages 2021-11-08 15:22:58 -07:00
William Ballenthin
0b517c51d8 main: remove perf messages 2021-11-08 15:22:01 -07:00
William Ballenthin
9fbbda11b8 Merge branch 'profiling' into perf/short-circuit 2021-11-08 15:20:22 -07:00
William Ballenthin
6f6831f812 perf: document that counters is unstable 2021-11-08 15:20:11 -07:00
William Ballenthin
d425bb31c4 Merge branch 'profiling' into perf/short-circuit 2021-11-08 15:16:22 -07:00
William Ballenthin
334425a08f changelog 2021-11-08 15:16:08 -07:00
William Ballenthin
3e74da96a6 engine: make short circuiting configurable 2021-11-08 14:55:11 -07:00
William Ballenthin
ad119d789b Merge branch 'profiling' into perf/short-circuit 2021-11-08 14:35:26 -07:00
William Ballenthin
6c8d246af9 fix bad merge 2021-11-08 14:31:43 -07:00
William Ballenthin
26b7a0b91d Merge branch 'master' into profiling 2021-11-08 14:29:40 -07:00
Willi Ballenthin
0b6c6227b9 Merge pull request #825 from mandiant/fix/circular-import-freeze
fix circular import freeze
2021-11-08 14:28:01 -07:00
William Ballenthin
94fd7673fd common: mypy 2021-11-08 14:27:44 -07:00
William Ballenthin
f598acb8fc scripts: remove old profiling scripts 2021-11-08 14:24:48 -07:00
William Ballenthin
b621205a06 mypy 2021-11-08 14:24:13 -07:00
William Ballenthin
9fa9c6a5d0 tests: add test demonstrating short circuiting 2021-11-08 14:07:44 -07:00
William Ballenthin
1a84051679 changelog 2021-11-08 14:07:31 -07:00
William Ballenthin
d987719889 engine: some: correctly count satisfied children 2021-11-08 13:53:37 -07:00
William Ballenthin
96813c37b7 remove old import 2021-11-08 13:48:33 -07:00
William Ballenthin
70f007525d pep8 2021-11-08 12:11:01 -07:00
William Ballenthin
e3496b0660 engine: move optimizer into its own module 2021-11-08 12:10:22 -07:00
William Ballenthin
24b4c99635 changelog 2021-11-08 11:58:02 -07:00
William Ballenthin
27b4a8ba73 common: remove old import 2021-11-08 11:55:58 -07:00
William Ballenthin
51b3f38f55 common: move Result to capa.common from capa.engine
fixes circular import error in capa.features.freeze
2021-11-08 11:54:36 -07:00
William Ballenthin
a35be4a666 scripts: add py script for profiling time 2021-11-08 11:52:34 -07:00
William Ballenthin
5770d0c12d perf: add reset routine 2021-11-08 11:52:25 -07:00
William Ballenthin
0629c584e1 common: move Result to capa.common from capa.engine
fixes circular import error in capa.features.freeze
2021-11-08 11:52:13 -07:00
William Ballenthin
480df323e5 scripts: add py script for profiling time 2021-11-08 11:51:09 -07:00
William Ballenthin
a995b53c38 perf: add reset routine 2021-11-08 11:50:49 -07:00
William Ballenthin
35fa50dbee pep8 2021-11-08 11:50:37 -07:00
William Ballenthin
d86c3f4d48 common: move Result to capa.common from capa.engine
fixes circular import error in capa.features.freeze
2021-11-08 11:50:16 -07:00
Moritz
4696c0ebb6 Merge pull request #822 from mandiant/dependabot/pip/types-psutil-5.8.14
build(deps-dev): bump types-psutil from 5.8.13 to 5.8.14
2021-11-08 17:02:58 +01:00
Moritz
09724e9787 Merge pull request #823 from mandiant/dependabot/pip/isort-5.10.0
build(deps-dev): bump isort from 5.9.3 to 5.10.0
2021-11-08 17:02:33 +01:00
dependabot[bot]
636548cdec build(deps-dev): bump isort from 5.9.3 to 5.10.0
Bumps [isort](https://github.com/pycqa/isort) from 5.9.3 to 5.10.0.
- [Release notes](https://github.com/pycqa/isort/releases)
- [Changelog](https://github.com/PyCQA/isort/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pycqa/isort/compare/5.9.3...5.10.0)

---
updated-dependencies:
- dependency-name: isort
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-08 14:16:28 +00:00
dependabot[bot]
b3970808df build(deps-dev): bump types-psutil from 5.8.13 to 5.8.14
Bumps [types-psutil](https://github.com/python/typeshed) from 5.8.13 to 5.8.14.
- [Release notes](https://github.com/python/typeshed/releases)
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-psutil
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-08 14:16:15 +00:00
William Ballenthin
d573b83c94 rule: optimization: add some documentation 2021-11-05 16:49:38 -06:00
William Ballenthin
e63f072e40 rules: optimizer: use recursive cost of statements 2021-11-05 16:39:00 -06:00
William Ballenthin
a329147d28 engine: some: short circuit 2021-11-05 16:32:23 -06:00
William Ballenthin
18ba986eba engine: or: short circuit 2021-11-05 16:32:12 -06:00
William Ballenthin
8d9f418b2b rules: optimize by cost 2021-11-05 16:20:22 -06:00
William Ballenthin
623bac1a40 engine: statement: document that the order of children is important 2021-11-05 16:19:16 -06:00
William Ballenthin
702d00da91 gitignore 2021-11-05 15:24:24 -06:00
William Ballenthin
3a12472be8 perf: render: show evaluate.feature counter 2021-11-05 15:23:34 -06:00
William Ballenthin
6524449ad1 main: perf: human format the numbers 2021-11-05 15:23:22 -06:00
William Ballenthin
86cab26a69 add perf counters in module capa.perf 2021-11-05 14:59:22 -06:00
William Ballenthin
3d068fe3cd scripts: add utilities for collecting profile traces 2021-11-04 13:17:38 -06:00
William Ballenthin
f98236046b main: add coarse timing measurements 2021-11-04 12:38:35 -06:00
William Ballenthin
ed3bd4ef75 main: add timing ctx manager 2021-11-04 12:20:05 -06:00
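A timing context manager of the kind this commit names can be written in a few lines. This is a minimal sketch, not the code from the commit: the name `timing` and the dictionary used to collect results are assumptions.

```python
import time
from contextlib import contextmanager
from typing import Dict


@contextmanager
def timing(label: str, sink: Dict[str, float]):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed phases are still measured.
        sink[label] = time.perf_counter() - start


timings: Dict[str, float] = {}
with timing("sum", timings):
    total = sum(range(100_000))

print(f"sum took {timings['sum']:.6f}s")
```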
Capa Bot
7d3ae7a91b Sync capa rules submodule 2021-11-03 18:29:09 +00:00
Capa Bot
0409c431b8 Sync capa rules submodule 2021-11-02 18:47:47 +00:00
Capa Bot
ffbb841b03 Sync capa rules submodule 2021-11-02 18:47:18 +00:00
Willi Ballenthin
e9a7dbc2ff Merge pull request #820 from mandiant/fix/linter-file-format
auto recognize shellcode based on file extension
2021-11-02 11:31:33 -06:00
Capa Bot
10dc8950c1 Sync capa rules submodule 2021-11-02 17:29:30 +00:00
Capa Bot
fe0fb1ccd2 Sync capa rules submodule 2021-11-02 17:17:47 +00:00
Moritz Raabe
e9170a1d4b auto recognize shellcode based on file extension 2021-11-02 18:02:37 +01:00
Capa Bot
02bd8581d8 Sync capa-testfiles submodule 2021-11-02 16:42:40 +00:00
Moritz
ca574201a4 Merge pull request #818 from mandiant/dependabot/pip/ruamel-yaml-0.17.17
build(deps): bump ruamel-yaml from 0.17.16 to 0.17.17
2021-11-02 17:36:03 +01:00
Moritz
8e744d94e6 Merge pull request #817 from mandiant/dependabot/pip/black-21.10b0
build(deps-dev): bump black from 21.9b0 to 21.10b0
2021-11-02 17:35:52 +01:00
dependabot[bot]
6a28330dd1 build(deps): bump ruamel-yaml from 0.17.16 to 0.17.17
Bumps [ruamel-yaml](https://sourceforge.net/p/ruamel-yaml/code/ci/default/tree) from 0.17.16 to 0.17.17.

---
updated-dependencies:
- dependency-name: ruamel-yaml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 14:11:49 +00:00
dependabot[bot]
4537b52c18 build(deps-dev): bump black from 21.9b0 to 21.10b0
Bumps [black](https://github.com/psf/black) from 21.9b0 to 21.10b0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/commits)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-11-01 14:11:42 +00:00
30 changed files with 1497 additions and 599 deletions

@@ -30,7 +30,7 @@ jobs:
     - name: Set up Python 3.8
       uses: actions/setup-python@v2
       with:
-        python-version: 3.8
+        python-version: "3.8"
     - name: Install dependencies
       run: pip install -e .[dev]
     - name: Lint with isort
@@ -50,7 +50,7 @@ jobs:
     - name: Set up Python 3.8
       uses: actions/setup-python@v2
       with:
-        python-version: 3.8
+        python-version: "3.8"
     - name: Install capa
       run: pip install -e .
     - name: Run rule linter
@@ -65,13 +65,15 @@ jobs:
       matrix:
         os: [ubuntu-20.04, windows-2019, macos-10.15]
         # across all operating systems
-        python-version: [3.6, 3.9]
+        python-version: ["3.6", "3.10"]
         include:
           # on Ubuntu run these as well
           - os: ubuntu-20.04
-            python-version: 3.7
+            python-version: "3.7"
           - os: ubuntu-20.04
-            python-version: 3.8
+            python-version: "3.8"
+          - os: ubuntu-20.04
+            python-version: "3.9"
     steps:
     - name: Checkout capa with submodules
       uses: actions/checkout@v2

3
.gitignore vendored
View File

@@ -115,3 +115,6 @@ isort-output.log
black-output.log
rule-linter-output.log
.vscode
scripts/perf/*.txt
scripts/perf/*.svg
scripts/perf/*.zip

View File

@@ -17,8 +17,81 @@
### Development
### Raw diffs
- [capa <release>...master](https://github.com/mandiant/capa/compare/v3.0.3...master)
- [capa-rules <release>...master](https://github.com/mandiant/capa-rules/compare/v3.0.3...master)
- [capa v3.1.0...master](https://github.com/mandiant/capa/compare/v3.1.0...master)
- [capa-rules v3.1.0...master](https://github.com/mandiant/capa-rules/compare/v3.1.0...master)
## v3.1.0 (2022-01-10)
This release improves the performance of capa while also adding 23 new rules and many code quality enhancements. We profiled capa's CPU usage and optimized the way that it matches rules, such as by short circuiting when appropriate. According to our testing, the matching phase is approximately 66% faster than v3.0.3! We also added support for Python 3.10, aarch64 builds, and additional MAEC metadata in the rule headers.
The 23 new rules include nine by Jakub Jozwiak of Mandiant. @ryantxu1 and @dzbeck updated the ATT&CK and MBC mappings for many rules. Thank you!
And as always, welcome first time contributors!
- @kn0wl3dge
- @jtothej
- @cl30
### New Features
- engine: short circuit logic nodes for better performance #824 @williballenthin
- engine: add optimizer that orders faster nodes first #829 @williballenthin
- engine: optimize rule evaluation by skipping rules that can't match #830 @williballenthin
- support python 3.10 #816 @williballenthin
- support aarch64 #683 @williballenthin
- rules: support maec/malware-family meta #841 @mr-tz
- engine: better type annotations/exhaustiveness checking #839 @cl30
### Breaking Changes: None
### New Rules (23)
- nursery/delete-windows-backup-catalog michael.hunhoff@mandiant.com
- nursery/disable-automatic-windows-recovery-features michael.hunhoff@mandiant.com
- nursery/capture-webcam-video @johnk3r
- nursery/create-registry-key-via-stdregprov michael.hunhoff@mandiant.com
- nursery/delete-registry-key-via-stdregprov michael.hunhoff@mandiant.com
- nursery/delete-registry-value-via-stdregprov michael.hunhoff@mandiant.com
- nursery/query-or-enumerate-registry-key-via-stdregprov michael.hunhoff@mandiant.com
- nursery/query-or-enumerate-registry-value-via-stdregprov michael.hunhoff@mandiant.com
- nursery/set-registry-value-via-stdregprov michael.hunhoff@mandiant.com
- data-manipulation/compression/decompress-data-using-ucl jakub.jozwiak@mandiant.com
- linking/static/wolfcrypt/linked-against-wolfcrypt jakub.jozwiak@mandiant.com
- linking/static/wolfssl/linked-against-wolfssl jakub.jozwiak@mandiant.com
- anti-analysis/packer/pespin/packed-with-pespin jakub.jozwiak@mandiant.com
- load-code/shellcode/execute-shellcode-via-windows-fibers jakub.jozwiak@mandiant.com
- load-code/shellcode/execute-shellcode-via-enumuilanguages jakub.jozwiak@mandiant.com
- anti-analysis/packer/themida/packed-with-themida william.ballenthin@mandiant.com
- load-code/shellcode/execute-shellcode-via-createthreadpoolwait jakub.jozwiak@mandiant.com
- host-interaction/process/inject/inject-shellcode-using-a-file-mapping-object jakub.jozwiak@mandiant.com
- load-code/shellcode/execute-shellcode-via-copyfile2 jakub.jozwiak@mandiant.com
- malware-family/plugx/match-known-plugx-module still@teamt5.org
### Rule Changes
- update ATT&CK mappings by @ryantxu1
- update ATT&CK and MBC mappings by @dzbeck
- aplib detection by @cdong1012
- golang runtime detection by @stevemk14eber
### Bug Fixes
- fix circular import error #825 @williballenthin
- fix smda negative number extraction #430 @kn0wl3dge
### capa explorer IDA Pro plugin
- pin supported versions to >= 7.4 and < 8.0 #849 @mike-hunhoff
### Development
- add profiling infrastructure #828 @williballenthin
- linter: detect shellcode extension #820 @mr-tz
- show features script: add backend flag #430 @kn0wl3dge
### Raw diffs
- [capa v3.0.3...v3.1.0](https://github.com/mandiant/capa/compare/v3.0.3...v3.1.0)
- [capa-rules v3.0.3...v3.1.0](https://github.com/mandiant/capa-rules/compare/v3.0.3...v3.1.0)
## v3.0.3 (2021-10-27)

View File

@@ -2,7 +2,7 @@
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/flare-capa)](https://pypi.org/project/flare-capa)
[![Last release](https://img.shields.io/github/v/release/mandiant/capa)](https://github.com/mandiant/capa/releases)
[![Number of rules](https://img.shields.io/badge/rules-639-blue.svg)](https://github.com/mandiant/capa-rules)
[![Number of rules](https://img.shields.io/badge/rules-658-blue.svg)](https://github.com/mandiant/capa-rules)
[![CI status](https://github.com/mandiant/capa/workflows/CI/badge.svg)](https://github.com/mandiant/capa/actions?query=workflow%3ACI+event%3Apush+branch%3Amaster)
[![Downloads](https://img.shields.io/github/downloads/mandiant/capa/total)](https://github.com/mandiant/capa/releases)
[![License](https://img.shields.io/badge/license-Apache--2.0-green.svg)](LICENSE.txt)

View File

@@ -8,11 +8,16 @@
import copy
import collections
from typing import Set, Dict, List, Tuple, Union, Mapping, Iterable
from typing import TYPE_CHECKING, Set, Dict, List, Tuple, Mapping, Iterable
import capa.rules
import capa.perf
import capa.features.common
from capa.features.common import Feature
from capa.features.common import Result, Feature
if TYPE_CHECKING:
# circular import, otherwise
import capa.rules
# a collection of features and the locations at which they are found.
#
@@ -45,15 +50,12 @@ class Statement:
def __repr__(self):
return str(self)
def evaluate(self, features: FeatureSet) -> "Result":
def evaluate(self, features: FeatureSet, short_circuit=True) -> Result:
"""
classes that inherit `Statement` must implement `evaluate`
args:
ctx (defaultdict[Feature, set[VA]])
returns:
Result
short_circuit (bool): if true, then statements like and/or/some may short circuit.
"""
raise NotImplementedError()
@@ -77,70 +79,70 @@ class Statement:
children[i] = new
class Result:
"""
represents the results of an evaluation of statements against features.
instances of this class should behave like a bool,
e.g. `assert Result(True, ...) == True`
instances track additional metadata about evaluation results.
they contain references to the statement node (e.g. an And statement),
as well as the children Result instances.
we need this so that we can render the tree of expressions and their results.
"""
def __init__(self, success: bool, statement: Union[Statement, Feature], children: List["Result"], locations=None):
"""
args:
success (bool)
statement (capa.engine.Statement or capa.features.Feature)
children (list[Result])
locations (iterable[VA])
"""
super(Result, self).__init__()
self.success = success
self.statement = statement
self.children = children
self.locations = locations if locations is not None else ()
def __eq__(self, other):
if isinstance(other, bool):
return self.success == other
return False
def __bool__(self):
return self.success
def __nonzero__(self):
return self.success
class And(Statement):
"""match if all of the children evaluate to True."""
"""
match if all of the children evaluate to True.
the order of evaluation is dictated by the property
`And.children` (type: List[Statement|Feature]).
a query optimizer may safely manipulate the order of these children.
"""
def __init__(self, children, description=None):
super(And, self).__init__(description=description)
self.children = children
def evaluate(self, ctx):
results = [child.evaluate(ctx) for child in self.children]
success = all(results)
return Result(success, self, results)
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.and"] += 1
if short_circuit:
results = []
for child in self.children:
result = child.evaluate(ctx, short_circuit=short_circuit)
results.append(result)
if not result:
# short circuit
return Result(False, self, results)
return Result(True, self, results)
else:
results = [child.evaluate(ctx, short_circuit=short_circuit) for child in self.children]
success = all(results)
return Result(success, self, results)
class Or(Statement):
"""match if any of the children evaluate to True."""
"""
match if any of the children evaluate to True.
the order of evaluation is dictated by the property
`Or.children` (type: List[Statement|Feature]).
a query optimizer may safely manipulate the order of these children.
"""
def __init__(self, children, description=None):
super(Or, self).__init__(description=description)
self.children = children
def evaluate(self, ctx):
results = [child.evaluate(ctx) for child in self.children]
success = any(results)
return Result(success, self, results)
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.or"] += 1
if short_circuit:
results = []
for child in self.children:
result = child.evaluate(ctx, short_circuit=short_circuit)
results.append(result)
if result:
# short circuit as soon as we hit one match
return Result(True, self, results)
return Result(False, self, results)
else:
results = [child.evaluate(ctx, short_circuit=short_circuit) for child in self.children]
success = any(results)
return Result(success, self, results)
class Not(Statement):
@@ -150,28 +152,55 @@ class Not(Statement):
super(Not, self).__init__(description=description)
self.child = child
def evaluate(self, ctx):
results = [self.child.evaluate(ctx)]
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.not"] += 1
results = [self.child.evaluate(ctx, short_circuit=short_circuit)]
success = not results[0]
return Result(success, self, results)
class Some(Statement):
"""match if at least N of the children evaluate to True."""
"""
match if at least N of the children evaluate to True.
the order of evaluation is dictated by the property
`Some.children` (type: List[Statement|Feature]).
a query optimizer may safely manipulate the order of these children.
"""
def __init__(self, count, children, description=None):
super(Some, self).__init__(description=description)
self.count = count
self.children = children
def evaluate(self, ctx):
results = [child.evaluate(ctx) for child in self.children]
# note that here we cast the child result as a bool
# because we've overridden `__bool__` above.
#
# we can't use `if child is True` because the instance is not True.
success = sum([1 for child in results if bool(child) is True]) >= self.count
return Result(success, self, results)
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.some"] += 1
if short_circuit:
results = []
satisfied_children_count = 0
for child in self.children:
result = child.evaluate(ctx, short_circuit=short_circuit)
results.append(result)
if result:
satisfied_children_count += 1
if satisfied_children_count >= self.count:
# short circuit as soon as we hit the threshold
return Result(True, self, results)
return Result(False, self, results)
else:
results = [child.evaluate(ctx, short_circuit=short_circuit) for child in self.children]
# note that here we cast the child result as a bool
# because we've overridden `__bool__` above.
#
# we can't use `if child is True` because the instance is not True.
success = sum([1 for child in results if bool(child) is True]) >= self.count
return Result(success, self, results)
class Range(Statement):
@@ -183,7 +212,10 @@ class Range(Statement):
self.min = min if min is not None else 0
self.max = max if max is not None else (1 << 64 - 1)
def evaluate(self, ctx):
def evaluate(self, ctx, **kwargs):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.range"] += 1
count = len(ctx.get(self.child, []))
if self.min == 0 and count == 0:
return Result(True, self, [])
@@ -208,7 +240,7 @@ class Subscope(Statement):
self.scope = scope
self.child = child
def evaluate(self, ctx):
def evaluate(self, ctx, **kwargs):
raise ValueError("cannot evaluate a subscope directly!")
@@ -247,15 +279,20 @@ def index_rule_matches(features: FeatureSet, rule: "capa.rules.Rule", locations:
def match(rules: List["capa.rules.Rule"], features: FeatureSet, va: int) -> Tuple[FeatureSet, MatchResults]:
"""
Args:
rules (List[capa.rules.Rule]): these must already be ordered topologically by dependency.
features (Mapping[capa.features.Feature, int]):
va (int): location of the features
match the given rules against the given features,
returning an updated set of features and the matches.
Returns:
Tuple[FeatureSet, MatchResults]: two-tuple with entries:
- set of features used for matching (which may be a superset of the given `features` argument, due to rule match features), and
- mapping from rule name to [(location of match, result object)]
the updated features are just like the input,
but extended to include the match features (e.g. names of rules that matched).
the given feature set is not modified; an updated copy is returned.
the given list of rules must be ordered topologically by dependency,
or else `match` statements will not be handled correctly.
this routine should be fairly optimized, but is not guaranteed to be the fastest matcher possible.
it has a particularly convenient signature: (rules, features) -> matches
other strategies can be imagined that match differently; implement these elsewhere.
specifically, this routine does "top down" matching of the given rules against the feature set.
"""
results = collections.defaultdict(list) # type: MatchResults
@@ -266,8 +303,18 @@ def match(rules: List["capa.rules.Rule"], features: FeatureSet, va: int) -> Tupl
features = collections.defaultdict(set, copy.copy(features))
for rule in rules:
res = rule.evaluate(features)
res = rule.evaluate(features, short_circuit=True)
if res:
# we first matched the rule with short circuiting enabled.
# this is much faster than without short circuiting.
# however, we want to collect all results thoroughly,
# so once we've found a match quickly,
# go back and capture results without short circuiting.
res = rule.evaluate(features, short_circuit=False)
# sanity check
assert bool(res) is True
results[rule.name].append((va, res))
# we need to update the current `features`
# because subsequent iterations of this loop may use newly added features,
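
The two-pass strategy in `match` above (evaluate once with short circuiting, then re-evaluate only the rules that matched to collect complete results) can be sketched in isolation. This is a minimal illustration, not capa's implementation: `Leaf`, `match_rule`, and the string-keyed feature set are hypothetical stand-ins for capa's feature and rule classes.

```python
from typing import Dict, Set

# hypothetical stand-in for capa's FeatureSet: feature name -> locations
FeatureSet = Dict[str, Set[int]]


class Result:
    """Behaves like a bool, but remembers the evaluated children."""

    def __init__(self, success, node, children):
        self.success = success
        self.node = node
        self.children = children

    def __bool__(self):
        return self.success


class Leaf:
    """Hypothetical hash-lookup feature."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, ctx, short_circuit=True):
        return Result(self.name in ctx, self, [])


class And:
    def __init__(self, children):
        self.children = children

    def evaluate(self, ctx, short_circuit=True):
        results = []
        for child in self.children:
            result = child.evaluate(ctx, short_circuit=short_circuit)
            results.append(result)
            if short_circuit and not result:
                # stop at the first failing child
                return Result(False, self, results)
        return Result(all(results), self, results)


class Or:
    def __init__(self, children):
        self.children = children

    def evaluate(self, ctx, short_circuit=True):
        results = []
        for child in self.children:
            result = child.evaluate(ctx, short_circuit=short_circuit)
            results.append(result)
            if short_circuit and result:
                # stop at the first matching child
                return Result(True, self, results)
        return Result(any(results), self, results)


def match_rule(rule, ctx: FeatureSet) -> Result:
    # fast pass: decide whether the rule matches at all
    res = rule.evaluate(ctx, short_circuit=True)
    if res:
        # slow pass: only for matches, collect every child result
        res = rule.evaluate(ctx, short_circuit=False)
    return res
```

In this sketch the fast pass for `Or([Leaf("a"), Leaf("b")])` records one child result when `"a"` is present, while `match_rule` returns a result with both children, which is what a renderer of the full expression tree would need.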

View File

@@ -10,9 +10,13 @@ import re
import codecs
import logging
import collections
from typing import Set, Dict, Union
from typing import TYPE_CHECKING, Set, Dict, List, Union
import capa.engine
if TYPE_CHECKING:
# circular import, otherwise
import capa.engine
import capa.perf
import capa.features
import capa.features.extractors.elf
@@ -46,6 +50,52 @@ def escape_string(s: str) -> str:
return s
class Result:
"""
represents the results of an evaluation of statements against features.
instances of this class should behave like a bool,
e.g. `assert Result(True, ...) == True`
instances track additional metadata about evaluation results.
they contain references to the statement node (e.g. an And statement),
as well as the children Result instances.
we need this so that we can render the tree of expressions and their results.
"""
def __init__(
self,
success: bool,
statement: Union["capa.engine.Statement", "Feature"],
children: List["Result"],
locations=None,
):
"""
args:
success (bool)
statement (capa.engine.Statement or capa.features.Feature)
children (list[Result])
locations (iterable[VA])
"""
super(Result, self).__init__()
self.success = success
self.statement = statement
self.children = children
self.locations = locations if locations is not None else ()
def __eq__(self, other):
if isinstance(other, bool):
return self.success == other
return False
def __bool__(self):
return self.success
def __nonzero__(self):
return self.success
class Feature:
def __init__(self, value: Union[str, int, bytes], bitness=None, description=None):
"""
@@ -96,8 +146,10 @@ class Feature:
def __repr__(self):
return str(self)
def evaluate(self, ctx: Dict["Feature", Set[int]]) -> "capa.engine.Result":
return capa.engine.Result(self in ctx, self, [], locations=ctx.get(self, []))
def evaluate(self, ctx: Dict["Feature", Set[int]], **kwargs) -> Result:
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature." + self.name] += 1
return Result(self in ctx, self, [], locations=ctx.get(self, []))
def freeze_serialize(self):
if self.bitness is not None:
@@ -140,7 +192,10 @@ class Substring(String):
super(Substring, self).__init__(value, description=description)
self.value = value
def evaluate(self, ctx):
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.substring"] += 1
# mapping from string value to list of locations.
# will unique the locations later on.
matches = collections.defaultdict(list)
@@ -155,6 +210,10 @@ class Substring(String):
if self.value in feature.value:
matches[feature.value].extend(locations)
if short_circuit:
# we found one matching string, that's sufficient to match.
# don't collect other matching strings in this mode.
break
if matches:
# finalize: defaultdict -> dict
@@ -170,9 +229,9 @@ class Substring(String):
# unlike other features, we cannot put a reference to `self` directly in a `Result`.
# this is because `self` may match on many strings, so we can't stuff the matched value into it.
# instead, return a new instance that has a reference to both the substring and the matched values.
return capa.engine.Result(True, _MatchedSubstring(self, matches), [], locations=locations)
return Result(True, _MatchedSubstring(self, matches), [], locations=locations)
else:
return capa.engine.Result(False, _MatchedSubstring(self, None), [])
return Result(False, _MatchedSubstring(self, None), [])
def __str__(self):
return "substring(%s)" % self.value
@@ -225,7 +284,10 @@ class Regex(String):
"invalid regular expression: %s it should use Python syntax, try it at https://pythex.org" % value
)
def evaluate(self, ctx):
def evaluate(self, ctx, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.regex"] += 1
# mapping from string value to list of locations.
# will unique the locations later on.
matches = collections.defaultdict(list)
@@ -244,6 +306,10 @@ class Regex(String):
# so that they don't have to prefix/suffix their terms like: /.*foo.*/.
if self.re.search(feature.value):
matches[feature.value].extend(locations)
if short_circuit:
# we found one matching string, that's sufficient to match.
# don't collect other matching strings in this mode.
break
if matches:
# finalize: defaultdict -> dict
@@ -260,9 +326,9 @@ class Regex(String):
# this is because `self` may match on many strings, so we can't stuff the matched value into it.
# instead, return a new instance that has a reference to both the regex and the matched values.
# see #262.
return capa.engine.Result(True, _MatchedRegex(self, matches), [], locations=locations)
return Result(True, _MatchedRegex(self, matches), [], locations=locations)
else:
return capa.engine.Result(False, _MatchedRegex(self, None), [])
return Result(False, _MatchedRegex(self, None), [])
def __str__(self):
return "regex(string =~ %s)" % self.value
@@ -308,15 +374,18 @@ class Bytes(Feature):
super(Bytes, self).__init__(value, description=description)
self.value = value
def evaluate(self, ctx):
def evaluate(self, ctx, **kwargs):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.bytes"] += 1
for feature, locations in ctx.items():
if not isinstance(feature, (Bytes,)):
continue
if feature.value.startswith(self.value):
return capa.engine.Result(True, self, [], locations=locations)
return Result(True, self, [], locations=locations)
return capa.engine.Result(False, self, [])
return Result(False, self, [])
def get_value_str(self):
return hex_string(bytes_to_str(self.value))

View File

@@ -40,11 +40,11 @@ def extract_os():
def extract_arch():
info = idaapi.get_inf_structure()
if info.procName == "metapc" and info.is_64bit():
if info.procname == "metapc" and info.is_64bit():
yield Arch(ARCH_AMD64), 0x0
elif info.procName == "metapc" and info.is_32bit():
elif info.procname == "metapc" and info.is_32bit():
yield Arch(ARCH_I386), 0x0
elif info.procName == "metapc":
elif info.procname == "metapc":
logger.debug("unsupported architecture: non-32-bit nor non-64-bit intel")
return
else:
@@ -52,5 +52,5 @@ def extract_arch():
# 1. handling a new architecture (e.g. aarch64)
#
# for (1), this logic will need to be updated as the format is implemented.
logger.debug("unsupported architecture: %s", info.procName)
logger.debug("unsupported architecture: %s", info.procname)
return

View File

@@ -84,8 +84,12 @@ def extract_insn_number_features(f, bb, insn):
return
for operand in operands:
try:
yield Number(int(operand, 16)), insn.offset
yield Number(int(operand, 16), bitness=get_bitness(f.smda_report)), insn.offset
# The result of bitwise operations is calculated as though carried out
# in twos complement with an infinite number of sign bits
value = int(operand, 16) & ((1 << f.smda_report.bitness) - 1)
yield Number(value), insn.offset
yield Number(value, bitness=get_bitness(f.smda_report)), insn.offset
except:
continue
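
The masking step in the hunk above converts a negative operand into its unsigned two's complement form at the given bitness. A small sketch of that arithmetic follows; `to_unsigned` is a hypothetical helper, and the operand strings such as `"-0x1"` are an assumption about how the disassembler renders negative immediates.

```python
def to_unsigned(operand: str, bitness: int) -> int:
    """Parse a hex operand and wrap it into the range [0, 2**bitness)."""
    # Python integers behave as if they had infinitely many sign bits,
    # so AND-ing with a mask of `bitness` ones yields the two's complement value.
    mask = (1 << bitness) - 1
    return int(operand, 16) & mask


print(hex(to_unsigned("-0x1", 32)))
print(hex(to_unsigned("-0x80", 32)))
print(hex(to_unsigned("0x10", 64)))
```

Non-negative operands pass through unchanged because they already fit under the mask.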

View File

@@ -8,12 +8,7 @@ json format:
'base address': int(base address),
'functions': {
int(function va): {
'basic blocks': {
int(basic block va): {
'instructions': [instruction va, ...]
},
...
},
int(basic block va): [int(instruction va), ...]
...
},
...

View File

@@ -7,6 +7,7 @@
# See the License for the specific language governing permissions and limitations under the License.
import os
from typing import NoReturn
_hex = hex
@@ -30,3 +31,7 @@ def is_runtime_ida():
return False
else:
return True
def assert_never(value: NoReturn) -> NoReturn:
assert False, f"Unhandled value: {value} ({type(value).__name__})"
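
A sketch of how `assert_never` is typically used for exhaustiveness checking: a type checker such as mypy narrows the value in each branch, so the final `else` only type-checks if every enum member has been handled. The `Scope` enum mirrors the one introduced in this change set; `describe` is a hypothetical function for illustration.

```python
from enum import Enum
from typing import NoReturn


def assert_never(value: NoReturn) -> NoReturn:
    assert False, f"Unhandled value: {value} ({type(value).__name__})"


class Scope(str, Enum):
    FILE = "file"
    FUNCTION = "function"
    BASIC_BLOCK = "basic block"


def describe(scope: Scope) -> str:
    # hypothetical consumer: one branch per enum member
    if scope is Scope.FILE:
        return "whole file"
    elif scope is Scope.FUNCTION:
        return "single function"
    elif scope is Scope.BASIC_BLOCK:
        return "single basic block"
    else:
        # if a new Scope member is added and not handled above,
        # a type checker should flag this call
        assert_never(scope)
```

Because `Scope` mixes in `str`, members also compare equal to their plain string values, for example `Scope.FILE == "file"`. Note that the runtime check relies on an `assert` statement, which Python removes when run with `-O`.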

View File

@@ -21,13 +21,6 @@ import capa.features.common
logger = logging.getLogger("capa")
# IDA version as returned by idaapi.get_kernel_version()
SUPPORTED_IDA_VERSIONS = (
"7.4",
"7.5",
"7.6",
)
# file type as returned by idainfo.file_type
SUPPORTED_FILE_TYPES = (
idaapi.f_PE,
@@ -45,13 +38,11 @@ def inform_user_ida_ui(message):
def is_supported_ida_version():
version = idaapi.get_kernel_version()
if version not in SUPPORTED_IDA_VERSIONS:
version = float(idaapi.get_kernel_version())
if version < 7.4 or version >= 8:
warning_msg = "This plugin does not support your IDA Pro version"
logger.warning(warning_msg)
logger.warning(
"Your IDA Pro version is: %s. Supported versions are: %s." % (version, ", ".join(SUPPORTED_IDA_VERSIONS))
)
logger.warning("Your IDA Pro version is: %s. Supported versions are: IDA >= 7.4 and IDA < 8.0." % version)
return False
return True
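
The range check above can be exercised without IDA by factoring the comparison into a pure function. This is a sketch under assumptions: `is_supported_version` is a hypothetical helper, and the version strings are assumed to look like `"7.6"`, as in the tuple that this hunk removes.

```python
def is_supported_version(version_str: str) -> bool:
    """Mirror the check in the diff: supported means 7.4 <= version < 8.0."""
    version = float(version_str)
    return 7.4 <= version < 8


for candidate in ("7.3", "7.4", "7.7", "8.0"):
    print(candidate, is_supported_version(candidate))
```

One property of the `float` approach worth keeping in mind: a hypothetical version string such as `"7.10"` parses to `7.1`, and a three-component string such as `"7.6.1"` raises `ValueError`, so the check assumes single-digit minor versions in `major.minor` form.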

View File

@@ -39,6 +39,7 @@ capa explorer supports Python versions >= 3.6.x and the following IDA Pro versio
* IDA 7.4
* IDA 7.5
* IDA 7.6 (caveat below)
* IDA 7.7
capa explorer is however limited to the Python versions supported by your IDA installation (which may not include all Python versions >= 3.6.x). Based on our testing the following matrix shows the Python versions supported
by each supported IDA version:

View File

@@ -10,6 +10,7 @@ See the License for the specific language governing permissions and limitations
"""
import os
import sys
import time
import hashlib
import logging
import os.path
@@ -17,6 +18,7 @@ import argparse
import datetime
import textwrap
import itertools
import contextlib
import collections
from typing import Any, Dict, List, Tuple
@@ -26,6 +28,7 @@ import colorama
from pefile import PEFormatError
from elftools.common.exceptions import ELFError
import capa.perf
import capa.rules
import capa.engine
import capa.version
@@ -39,7 +42,7 @@ import capa.features.extractors
import capa.features.extractors.common
import capa.features.extractors.pefile
import capa.features.extractors.elffile
from capa.rules import Rule, RuleSet
from capa.rules import Rule, Scope, RuleSet
from capa.engine import FeatureSet, MatchResults
from capa.helpers import get_file_taste
from capa.features.extractors.base_extractor import FunctionHandle, FeatureExtractor
@@ -65,6 +68,14 @@ E_UNSUPPORTED_IDA_VERSION = -19
logger = logging.getLogger("capa")
@contextlib.contextmanager
def timing(msg: str):
t0 = time.time()
yield
t1 = time.time()
logger.debug("perf: %s: %0.2fs", msg, t1 - t0)
def set_vivisect_log_level(level):
logging.getLogger("vivisect").setLevel(level)
logging.getLogger("vivisect.base").setLevel(level)
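
A usage sketch for the `timing` context manager added in this hunk. The context manager body is copied from the diff; the logger name `"example"` and the `busy_work` function are hypothetical, chosen only to give the block something to measure.

```python
import contextlib
import logging
import time

logger = logging.getLogger("example")


@contextlib.contextmanager
def timing(msg: str):
    t0 = time.time()
    yield
    t1 = time.time()
    logger.debug("perf: %s: %0.2fs", msg, t1 - t0)


def busy_work(n: int) -> int:
    # hypothetical workload to time
    return sum(i * i for i in range(n))


with timing("busy work"):
    total = busy_work(1000)
```

The elapsed time is emitted at debug level, so it only appears when debug logging is enabled. As written, there is no `try`/`finally` around the `yield`, so no timing line is logged if the wrapped block raises.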
@@ -103,7 +114,7 @@ def find_function_capabilities(ruleset: RuleSet, extractor: FeatureExtractor, f:
bb_features[feature].add(va)
function_features[feature].add(va)
_, matches = capa.engine.match(ruleset.basic_block_rules, bb_features, int(bb))
_, matches = ruleset.match(Scope.BASIC_BLOCK, bb_features, int(bb))
for rule_name, res in matches.items():
bb_matches[rule_name].extend(res)
@@ -111,7 +122,7 @@ def find_function_capabilities(ruleset: RuleSet, extractor: FeatureExtractor, f:
for va, _ in res:
capa.engine.index_rule_matches(function_features, rule, [va])
_, function_matches = capa.engine.match(ruleset.function_rules, function_features, int(f))
_, function_matches = ruleset.match(Scope.FUNCTION, function_features, int(f))
return function_matches, bb_matches, len(function_features)
@@ -132,7 +143,7 @@ def find_file_capabilities(ruleset: RuleSet, extractor: FeatureExtractor, functi
file_features.update(function_features)
_, matches = capa.engine.match(ruleset.file_rules, file_features, 0x0)
_, matches = ruleset.match(Scope.FILE, file_features, 0x0)
return matches, len(file_features)
@@ -892,6 +903,7 @@ def main(argv=None):
try:
rules = get_rules(args.rules, disable_progress=args.quiet)
rules = capa.rules.RuleSet(rules)
logger.debug(
"successfully loaded %s rules",
# during the load of the RuleSet, we extract subscope statements into their own rules

70
capa/optimizer.py Normal file
View File

@@ -0,0 +1,70 @@
import logging
import capa.engine as ceng
import capa.features.common
logger = logging.getLogger(__name__)
def get_node_cost(node):
if isinstance(node, (capa.features.common.OS, capa.features.common.Arch, capa.features.common.Format)):
# we assume these are the most restrictive features:
# authors commonly use them at the start of rules to restrict the category of samples to inspect
return 0
# elif "everything else":
# return 1
#
# this should be all hash-lookup features.
# see below.
elif isinstance(node, (capa.features.common.Substring, capa.features.common.Regex, capa.features.common.Bytes)):
# substring and regex features require a full scan of each string
# which we anticipate is more expensive than a hash lookup feature (e.g. mnemonic or count).
#
# TODO: compute the average cost of these features relative to hash features
# and adjust the factor accordingly.
return 2
elif isinstance(node, (ceng.Not, ceng.Range)):
# the cost of these nodes is defined by the complexity of their single child.
return 1 + get_node_cost(node.child)
elif isinstance(node, (ceng.And, ceng.Or, ceng.Some)):
# the cost of these nodes is the full cost of their children
# as this is the worst-case scenario.
return 1 + sum(map(get_node_cost, node.children))
else:
# this should be all hash-lookup features.
# we give this an arbitrary weight of 1.
# the only thing more "important" than this is checking OS/Arch/Format.
return 1
def optimize_statement(statement):
# this routine operates in-place
if isinstance(statement, (ceng.And, ceng.Or, ceng.Some)):
# has .children
statement.children = sorted(statement.children, key=lambda n: get_node_cost(n))
return
elif isinstance(statement, (ceng.Not, ceng.Range)):
# has .child
optimize_statement(statement.child)
return
else:
# appears to be "simple"
return
def optimize_rule(rule):
# this routine operates in-place
optimize_statement(rule.statement)
def optimize_rules(rules):
logger.debug("optimizing %d rules", len(rules))
for rule in rules:
optimize_rule(rule)
return rules
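
The cost model above pays off in combination with short circuiting: if cheap, restrictive checks come first, an `and:` can fail before any expensive scan runs. A reduced sketch of the ordering step follows. `Leaf`, `Branch`, and the `LEAF_COST` table are hypothetical simplifications of capa's feature and statement classes; the weights 0, 1, and 2 follow the diff, and the sort is one level deep, as in `optimize_statement`.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    kind: str  # "os" (most restrictive), "hash" (lookup), or "scan" (substring/regex/bytes)


@dataclass
class Branch:
    children: List["Node"] = field(default_factory=list)


Node = Union[Leaf, Branch]

# weights taken from the diff: OS/Arch/Format = 0, hash lookup = 1, scanning = 2
LEAF_COST = {"os": 0, "hash": 1, "scan": 2}


def get_node_cost(node: Node) -> int:
    if isinstance(node, Leaf):
        return LEAF_COST[node.kind]
    # worst case for a branch: every child gets evaluated
    return 1 + sum(get_node_cost(child) for child in node.children)


def optimize(node: Node) -> None:
    # operates in place; cheapest children are evaluated first
    if isinstance(node, Branch):
        node.children = sorted(node.children, key=get_node_cost)


rule = Branch([Leaf("scan"), Branch([Leaf("hash"), Leaf("hash")]), Leaf("hash"), Leaf("os")])
optimize(rule)
print([get_node_cost(child) for child in rule.children])
```

`sorted` is stable, so children with equal cost keep the order the rule author gave them.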

10
capa/perf.py Normal file
View File

@@ -0,0 +1,10 @@
import collections
from typing import Dict
# this structure is unstable and may change before the next major release.
counters: Dict[str, int] = collections.Counter()
def reset():
global counters
counters = collections.Counter()
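
A sketch of how these counters accumulate and reset. The `counters` and `reset` definitions are copied from the new module; `evaluate_feature` is a hypothetical caller that follows the key naming used elsewhere in this change set (`"evaluate.feature"` plus a per-type suffix).

```python
import collections
from typing import Dict

counters: Dict[str, int] = collections.Counter()


def reset():
    global counters
    counters = collections.Counter()


def evaluate_feature(name: str) -> None:
    # hypothetical caller: bump the total and the per-type counter
    counters["evaluate.feature"] += 1
    counters["evaluate.feature." + name] += 1


for name in ("mnemonic", "mnemonic", "regex"):
    evaluate_feature(name)

snapshot = dict(counters)
reset()
```

Since `reset` rebinds the module-level name rather than clearing the existing object, callers should reach the counters through the module (as the diff does with `capa.perf.counters[...]`); a reference obtained via `from capa.perf import counters` would keep pointing at the old `Counter` after a reset.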

View File

@@ -60,6 +60,8 @@ def capability_rules(doc):
continue
if rule["meta"].get("maec/analysis-conclusion-ov"):
continue
if rule["meta"].get("maec/malware-family"):
continue
if rule["meta"].get("maec/malware-category"):
continue
if rule["meta"].get("maec/malware-category-ov"):

View File

@@ -14,6 +14,9 @@ import logging
import binascii
import functools
import collections
from enum import Enum
from capa.helpers import assert_never
try:
from functools import lru_cache
@@ -22,13 +25,15 @@ except ImportError:
# https://github.com/python/mypy/issues/1153
from backports.functools_lru_cache import lru_cache # type: ignore
from typing import Any, Dict, List, Union, Iterator
from typing import Any, Set, Dict, List, Tuple, Union, Iterator
import yaml
import ruamel.yaml
import capa.perf
import capa.engine as ceng
import capa.features
import capa.optimizer
import capa.features.file
import capa.features.insn
import capa.features.common
@@ -46,6 +51,7 @@ META_KEYS = (
"rule-category",
"maec/analysis-conclusion",
"maec/analysis-conclusion-ov",
"maec/malware-family",
"maec/malware-category",
"maec/malware-category-ov",
"author",
@@ -64,9 +70,15 @@ META_KEYS = (
HIDDEN_META_KEYS = ("capa/nursery", "capa/path")
FILE_SCOPE = "file"
FUNCTION_SCOPE = "function"
BASIC_BLOCK_SCOPE = "basic block"
class Scope(str, Enum):
FILE = "file"
FUNCTION = "function"
BASIC_BLOCK = "basic block"
FILE_SCOPE = Scope.FILE.value
FUNCTION_SCOPE = Scope.FUNCTION.value
BASIC_BLOCK_SCOPE = Scope.BASIC_BLOCK.value
SUPPORTED_FEATURES = {
@@ -619,8 +631,10 @@ class Rule:
for new_rule in self._extract_subscope_rules_rec(self.statement):
yield new_rule
def evaluate(self, features: FeatureSet):
return self.statement.evaluate(features)
def evaluate(self, features: FeatureSet, short_circuit=True):
capa.perf.counters["evaluate.feature"] += 1
capa.perf.counters["evaluate.feature.rule"] += 1
return self.statement.evaluate(features, short_circuit=short_circuit)
@classmethod
def from_dict(cls, d, definition):
@@ -958,12 +972,23 @@ class RuleSet:
if len(rules) == 0:
raise InvalidRuleSet("no rules selected")
rules = capa.optimizer.optimize_rules(rules)
self.file_rules = self._get_rules_for_scope(rules, FILE_SCOPE)
self.function_rules = self._get_rules_for_scope(rules, FUNCTION_SCOPE)
self.basic_block_rules = self._get_rules_for_scope(rules, BASIC_BLOCK_SCOPE)
self.rules = {rule.name: rule for rule in rules}
self.rules_by_namespace = index_rules_by_namespace(rules)
# unstable
(self._easy_file_rules_by_feature, self._hard_file_rules) = self._index_rules_by_feature(self.file_rules)
(self._easy_function_rules_by_feature, self._hard_function_rules) = self._index_rules_by_feature(
self.function_rules
)
(self._easy_basic_block_rules_by_feature, self._hard_basic_block_rules) = self._index_rules_by_feature(
self.basic_block_rules
)
def __len__(self):
return len(self.rules)
@@ -973,6 +998,141 @@ class RuleSet:
def __contains__(self, rulename):
return rulename in self.rules
@staticmethod
def _index_rules_by_feature(rules) -> Tuple[Dict[Feature, Set[str]], List[str]]:
"""
split the given rules into two structures:
- "easy rules" are indexed by feature,
such that you can quickly find the rules that contain a given feature.
- "hard rules" are those that contain substring/regex/bytes features or match statements.
these continue to be ordered topologically.
a rule evaluator can use the "easy rule" index to restrict the
candidate rules that might match a given set of features.
at this time, a rule evaluator can't do anything special with
the "hard rules". it must still do a full top-down match of each
rule, in topological order.
"""
# we'll do a couple phases:
#
# 1. recursively visit all nodes in all rules,
# a. indexing all features
# b. recording the types of features found per rule
# 2. compute the easy and hard rule sets
# 3. remove hard rules from the rules-by-feature index
# 4. construct the topologically ordered list of hard rules
rules_with_easy_features: Set[str] = set()
rules_with_hard_features: Set[str] = set()
rules_by_feature: Dict[Feature, Set[str]] = collections.defaultdict(set)
def rec(rule_name: str, node: Union[Feature, Statement]):
"""
walk through a rule's logic tree, indexing the easy and hard rules,
and the features referenced by easy rules.
"""
if isinstance(
node,
(
# these are the "hard features"
# substring: scanning feature
capa.features.common.Substring,
# regex: scanning feature
capa.features.common.Regex,
# bytes: scanning feature
capa.features.common.Bytes,
# match: dependency on another rule,
# which we have to evaluate first,
# and is therefore tricky.
capa.features.common.MatchedRule,
),
):
# hard feature: requires scan or match lookup
rules_with_hard_features.add(rule_name)
elif isinstance(node, capa.features.common.Feature):
# easy feature: hash lookup
rules_with_easy_features.add(rule_name)
rules_by_feature[node].add(rule_name)
elif isinstance(node, (ceng.Not)):
# `not:` statements are tricky to deal with.
#
# first, features found under a `not:` should not be indexed,
# because the rule expects them to be absent.
# second, `not:` can be nested under another `not:`, or two, etc.
# third, `not:` at the root or directly under an `or:`
# means the rule will match against *anything* not specified there,
# which is a difficult set of things to compute and index.
#
# so, if a rule has a `not:` statement, it's hard.
# as of writing, this is an uncommon statement, with only 6 instances in 740 rules.
rules_with_hard_features.add(rule_name)
elif isinstance(node, (ceng.Some)) and node.count == 0:
# `optional:` and `0 or more:` are tricky to deal with.
#
# when a subtree is optional, it may match, but not matching
# doesn't have any impact either.
# now, our rule authors *should* not put this under `or:`
# and this is checked by the linter,
# but this could still happen (e.g. private rule set without linting)
# and would be hard to trace down.
#
# so better to be safe than sorry and consider this a hard case.
rules_with_hard_features.add(rule_name)
elif isinstance(node, (ceng.Range)) and node.min == 0:
# `count(foo): 0 or more` is tricky to deal with.
# because the min is 0,
# this subtree *can* match just about any feature
# (except the given one)
# which is a difficult set of things to compute and index.
rules_with_hard_features.add(rule_name)
elif isinstance(node, (ceng.Range)):
rec(rule_name, node.child)
elif isinstance(node, (ceng.And, ceng.Or, ceng.Some)):
for child in node.children:
rec(rule_name, child)
elif isinstance(node, ceng.Statement):
# unhandled type of statement.
# this should only happen if a new subtype of `Statement`
# has since been added to capa.
#
# ideally, we'd like to use mypy for exhaustiveness checking
# for all the subtypes of `Statement`.
# but, as far as i can tell, mypy does not support this type
# of checking.
#
# in a way, this makes some intuitive sense:
# the set of subtypes of type A is unbounded,
# because any user might come along and create a new subtype B,
# so mypy can't reason about this set of types.
assert False, f"Unhandled value: {node} ({type(node).__name__})"
else:
# programming error
assert_never(node)
for rule in rules:
rule_name = rule.meta["name"]
root = rule.statement
rec(rule_name, root)
# if a rule has a hard feature,
# don't consider it easy, and therefore,
# don't index any of its features.
#
# otherwise, it's an easy rule, and index its features
for rules_with_feature in rules_by_feature.values():
rules_with_feature.difference_update(rules_with_hard_features)
easy_rules_by_feature = rules_by_feature
# `rules` is already topologically ordered,
# so extract our hard set into the topological ordering.
hard_rules = []
for rule in rules:
if rule.meta["name"] in rules_with_hard_features:
hard_rules.append(rule.meta["name"])
return (easy_rules_by_feature, hard_rules)
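The easy/hard split described in the docstring above can be sketched in isolation. This is a minimal illustration, not capa's implementation: rules are modeled as plain dicts, features as strings, and `index_rules` and `HARD_PREFIXES` are hypothetical names introduced only for this example.

```python
import collections
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for the "hard" feature kinds:
# those that need scanning or depend on another rule's match.
HARD_PREFIXES = ("substring:", "regex:", "bytes:", "match:")


def index_rules(rules: List[dict]) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Split rules into a feature -> rule-names index (easy) and an ordered list (hard)."""
    hard: Set[str] = set()
    by_feature: Dict[str, Set[str]] = collections.defaultdict(set)

    for rule in rules:
        for feature in rule["features"]:
            if feature.startswith(HARD_PREFIXES):
                hard.add(rule["name"])
            else:
                by_feature[feature].add(rule["name"])

    # A rule with any hard feature is removed from the index entirely.
    for names in by_feature.values():
        names.difference_update(hard)

    # Keep the input order (assumed to be topological) for the hard rules.
    hard_rules = [rule["name"] for rule in rules if rule["name"] in hard]
    return dict(by_feature), hard_rules


rules = [
    {"name": "a", "features": ["api:CreateFile"]},
    {"name": "b", "features": ["api:CreateFile", "regex:/foo/"]},
    {"name": "c", "features": ["number:100"]},
]
easy_index, hard_rules = index_rules(rules)
```

In this toy input, rule `b` mixes an easy and a hard feature, so it is dropped from the `api:CreateFile` entry and appears only in the hard list.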
@staticmethod
def _get_rules_for_scope(rules, scope):
"""
@@ -1035,3 +1195,66 @@ class RuleSet:
rules_filtered.update(set(capa.rules.get_rules_and_dependencies(rules, rule.name)))
break
return RuleSet(list(rules_filtered))
def match(self, scope: Scope, features: FeatureSet, va: int) -> Tuple[FeatureSet, ceng.MatchResults]:
"""
match rules from this ruleset at the given scope against the given features.
this routine should act just like `capa.engine.match`,
except that it may be more performant.
"""
easy_rules_by_feature = {}
if scope is Scope.FILE:
easy_rules_by_feature = self._easy_file_rules_by_feature
hard_rule_names = self._hard_file_rules
elif scope is Scope.FUNCTION:
easy_rules_by_feature = self._easy_function_rules_by_feature
hard_rule_names = self._hard_function_rules
elif scope is Scope.BASIC_BLOCK:
easy_rules_by_feature = self._easy_basic_block_rules_by_feature
hard_rule_names = self._hard_basic_block_rules
else:
assert_never(scope)
candidate_rule_names = set()
for feature in features:
easy_rule_names = easy_rules_by_feature.get(feature)
if easy_rule_names:
candidate_rule_names.update(easy_rule_names)
# first, match against the set of rules that have at least one
# feature shared with our feature set.
candidate_rules = [self.rules[name] for name in candidate_rule_names]
features2, easy_matches = ceng.match(candidate_rules, features, va)
# note that we've stored the updated feature set in `features2`.
# this contains a superset of the features in `features`;
# it contains additional features for any easy rule matches.
# we'll pass this feature set to hard rule matching, since one
# of those rules might rely on an easy rule match.
#
# the updated feature set from hard matching will go into `features3`.
# this is a superset of `features2`, which is a superset of `features`.
# ultimately, this is what we'll return to the caller.
#
# in each case, we could have assigned the updated feature set back to `features`,
# but this is slightly more explicit how we're tracking the data.
# now, match against (topologically ordered) list of rules
# that we can't really make any guesses about.
# these are rules with hard features, like substring/regex/bytes and match statements.
hard_rules = [self.rules[name] for name in hard_rule_names]
features3, hard_matches = ceng.match(hard_rules, features2, va)
# note that above, we probably are skipping matching a bunch of
# rules that definitely would never hit.
# specifically, "easy rules" that don't share any features with
# the feature set.
# MatchResults doesn't technically have an .update() method
# but a dict does.
matches = {} # type: ignore
matches.update(easy_matches)
matches.update(hard_matches)
return (features3, matches)
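The two-phase flow of `RuleSet.match` can be illustrated with a toy model. This is a sketch under stated assumptions: rules are plain predicates over a set of string features, and `match_all`, `two_phase_match`, and the `match:<name>` feature convention are hypothetical stand-ins rather than capa's API. Candidates are sorted only to make the example deterministic.

```python
from typing import Callable, Dict, List, Set, Tuple

FeatureSet = Set[str]
Predicate = Callable[[FeatureSet], bool]


def match_all(rules: List[Tuple[str, Predicate]], features: FeatureSet):
    """Evaluate rules in order; each match adds a 'match:<name>' feature for later rules."""
    features = set(features)
    matches = []
    for name, predicate in rules:
        if predicate(features):
            matches.append(name)
            features.add("match:" + name)
    return features, matches


def two_phase_match(
    easy_index: Dict[str, Set[str]],
    hard_rules: List[str],
    predicates: Dict[str, Predicate],
    features: FeatureSet,
):
    # Phase 1: only easy rules sharing at least one feature are candidates.
    candidates: Set[str] = set()
    for feature in features:
        candidates.update(easy_index.get(feature, ()))
    features2, easy = match_all([(n, predicates[n]) for n in sorted(candidates)], features)

    # Phase 2: hard rules are always evaluated, in their given order,
    # against the feature set extended by the easy matches.
    features3, hard = match_all([(n, predicates[n]) for n in hard_rules], features2)
    return features3, easy + hard


predicates = {
    "open file": lambda fs: "api:CreateFile" in fs,
    "send data": lambda fs: "api:send" in fs,
    "uses file io": lambda fs: "match:open file" in fs,
}
easy_index = {"api:CreateFile": {"open file"}, "api:send": {"send data"}}
hard_rules = ["uses file io"]

features, matched = two_phase_match(easy_index, hard_rules, predicates, {"api:CreateFile"})
```

Here `send data` is never evaluated because none of its features are present, while `uses file io` still matches because it sees the `match:open file` feature produced in the first phase.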


@@ -1 +1 @@
__version__ = "3.0.3"
__version__ = "3.1.0"

rules

Submodule rules updated: 6481a5e82f...954f22acd8


@@ -230,9 +230,16 @@ def get_sample_capabilities(ctx: Context, path: Path) -> Set[str]:
logger.debug("found cached results: %s: %d capabilities", nice_path, len(ctx.capabilities_by_sample[path]))
return ctx.capabilities_by_sample[path]
if nice_path.endswith(capa.main.EXTENSIONS_SHELLCODE_32):
format = "sc32"
elif nice_path.endswith(capa.main.EXTENSIONS_SHELLCODE_64):
format = "sc64"
else:
format = "auto"
logger.debug("analyzing sample: %s", nice_path)
extractor = capa.main.get_extractor(
nice_path, "auto", capa.main.BACKEND_VIV, DEFAULT_SIGNATURES, False, disable_progress=True
nice_path, format, capa.main.BACKEND_VIV, DEFAULT_SIGNATURES, False, disable_progress=True
)
capabilities, _ = capa.main.find_capabilities(ctx.rules, extractor, disable_progress=True)
@@ -332,6 +339,52 @@ class OrStatementWithAlwaysTrueChild(Lint):
return self.violation
class NotNotUnderAnd(Lint):
name = "rule contains a `not` statement that's not found under an `and` statement"
recommendation = "clarify the rule logic and ensure `not` is always found under `and`"
violation = False
def check_rule(self, ctx: Context, rule: Rule):
self.violation = False
def rec(statement):
if isinstance(statement, capa.engine.Statement):
if not isinstance(statement, capa.engine.And):
for child in statement.get_children():
if isinstance(child, capa.engine.Not):
self.violation = True
for child in statement.get_children():
rec(child)
rec(rule.statement)
return self.violation
class OptionalNotUnderAnd(Lint):
name = "rule contains an `optional` or `0 or more` statement that's not found under an `and` statement"
recommendation = "clarify the rule logic and ensure `optional` and `0 or more` is always found under `and`"
violation = False
def check_rule(self, ctx: Context, rule: Rule):
self.violation = False
def rec(statement):
if isinstance(statement, capa.engine.Statement):
if not isinstance(statement, capa.engine.And):
for child in statement.get_children():
if isinstance(child, capa.engine.Some) and child.count == 0:
self.violation = True
for child in statement.get_children():
rec(child)
rec(rule.statement)
return self.violation
class UnusualMetaField(Lint):
name = "unusual meta field"
recommendation = "Remove the meta field"
@@ -653,6 +706,8 @@ LOGIC_LINTS = (
DoesntMatchExample(),
StatementWithSingleChildStatement(),
OrStatementWithAlwaysTrueChild(),
NotNotUnderAnd(),
OptionalNotUnderAnd(),
)
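The parent/child check that `NotNotUnderAnd` performs can be sketched on a toy tree. This is a minimal illustration under assumptions: `Node` and `has_not_outside_and` are hypothetical names, and the tree shape is simplified compared with capa's statement classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str  # e.g. "and", "or", "not", "feature"
    children: List["Node"] = field(default_factory=list)


def has_not_outside_and(node: Node) -> bool:
    """Return True if any `not` node has a parent that is not an `and` node."""
    if node.kind != "and" and any(child.kind == "not" for child in node.children):
        return True
    return any(has_not_outside_and(child) for child in node.children)


ok = Node("and", [Node("feature"), Node("not", [Node("feature")])])
bad = Node("or", [Node("feature"), Node("not", [Node("feature")])])
```

Note that in this sketch a `not` at the root has no parent, so it is not flagged.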

scripts/profile-time.py Normal file

@@ -0,0 +1,150 @@
"""
Invoke capa multiple times and record profiling information.
Use the --number and --repeat options to change the number of iterations.
By default, the script will emit a markdown table with a label pulled from git.
Note: you can run this script against pre-generated .frz files to reduce the startup time.
usage:
usage: profile-time.py [--number NUMBER] [--repeat REPEAT] [--label LABEL] sample
Profile capa performance
positional arguments:
sample path to sample to analyze
optional arguments:
--number NUMBER batch size of profile collection
--repeat REPEAT batch count of profile collection
--label LABEL description of the profile collection
example:
$ python profile-time.py ./tests/data/kernel32.dll_.frz --number 1 --repeat 2
| label | count(evaluations) | min(time) | avg(time) | max(time) |
|--------------------------------------|----------------------|-------------|-------------|-------------|
| 18c30e4 main: remove perf debug msgs | 66,561,622 | 125.14s | 132.13s | 139.12s |
^^^ --label or git hash
"""
import sys
import timeit
import logging
import argparse
import subprocess
import tqdm
import tabulate
import capa.main
import capa.perf
import capa.rules
import capa.engine
import capa.helpers
import capa.features
import capa.features.common
import capa.features.freeze
logger = logging.getLogger("capa.profile")
def main(argv=None):
if argv is None:
argv = sys.argv[1:]
label = subprocess.run(
"git show --pretty=oneline --abbrev-commit | head -n 1", shell=True, capture_output=True, text=True
).stdout.strip()
is_dirty = (
subprocess.run(
"git status | grep 'modified: ' | grep -v 'rules' | grep -v 'tests/data'",
shell=True,
capture_output=True,
text=True,
).stdout
!= ""
)
if is_dirty:
label += " (dirty)"
parser = argparse.ArgumentParser(description="Profile capa performance")
capa.main.install_common_args(parser, wanted={"format", "sample", "signatures", "rules"})
parser.add_argument("--number", type=int, default=3, help="batch size of profile collection")
parser.add_argument("--repeat", type=int, default=30, help="batch count of profile collection")
parser.add_argument("--label", type=str, default=label, help="description of the profile collection")
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)
try:
taste = capa.helpers.get_file_taste(args.sample)
except IOError as e:
logger.error("%s", str(e))
return -1
try:
with capa.main.timing("load rules"):
rules = capa.rules.RuleSet(capa.main.get_rules(args.rules, disable_progress=True))
except (IOError) as e:
logger.error("%s", str(e))
return -1
try:
sig_paths = capa.main.get_signatures(args.signatures)
except (IOError) as e:
logger.error("%s", str(e))
return -1
if (args.format == "freeze") or (args.format == "auto" and capa.features.freeze.is_freeze(taste)):
with open(args.sample, "rb") as f:
extractor = capa.features.freeze.load(f.read())
else:
extractor = capa.main.get_extractor(
args.sample, args.format, capa.main.BACKEND_VIV, sig_paths, should_save_workspace=False
)
with tqdm.tqdm(total=args.number * args.repeat) as pbar:
def do_iteration():
capa.perf.reset()
capa.main.find_capabilities(rules, extractor, disable_progress=True)
pbar.update(1)
samples = timeit.repeat(do_iteration, number=args.number, repeat=args.repeat)
logger.debug("perf: find capabilities: min: %0.2fs" % (min(samples) / float(args.number)))
logger.debug("perf: find capabilities: avg: %0.2fs" % (sum(samples) / float(args.repeat) / float(args.number)))
logger.debug("perf: find capabilities: max: %0.2fs" % (max(samples) / float(args.number)))
for (counter, count) in capa.perf.counters.most_common():
logger.debug("perf: counter: {:}: {:,}".format(counter, count))
print(
tabulate.tabulate(
[
(
args.label,
"{:,}".format(capa.perf.counters["evaluate.feature"]),
# python documentation indicates that min(samples) should be preferred,
# so let's put that first.
#
# https://docs.python.org/3/library/timeit.html#timeit.Timer.repeat
"%0.2fs" % (min(samples) / float(args.number)),
"%0.2fs" % (sum(samples) / float(args.repeat) / float(args.number)),
"%0.2fs" % (max(samples) / float(args.number)),
)
],
headers=["label", "count(evaluations)", "min(time)", "avg(time)", "max(time)"],
tablefmt="github",
)
)
return 0
if __name__ == "__main__":
sys.exit(main())
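The per-iteration statistics in the script follow from how `timeit.repeat` reports its samples: each sample is the total time for `number` calls, so it has to be divided by `number`. A minimal sketch, with a placeholder `work` function standing in for the capa analysis:

```python
import timeit


def work():
    # placeholder workload; the script above times capa.main.find_capabilities instead
    sum(range(1000))


number, repeat = 3, 5

# timeit.repeat returns `repeat` samples; each is the total time of `number` calls.
samples = timeit.repeat(work, number=number, repeat=repeat)

per_call_min = min(samples) / number
per_call_avg = sum(samples) / repeat / number
per_call_max = max(samples) / number
```

The minimum is usually the most stable of the three figures, since higher values tend to reflect interference from other processes rather than the code under test.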


@@ -86,7 +86,7 @@ def main(argv=None):
argv = sys.argv[1:]
parser = argparse.ArgumentParser(description="Show the features that capa extracts from the given sample")
capa.main.install_common_args(parser, wanted={"format", "sample", "signatures"})
capa.main.install_common_args(parser, wanted={"format", "sample", "signatures", "backend"})
parser.add_argument("-F", "--function", type=lambda x: int(x, 0x10), help="Show features for specific function")
args = parser.parse_args(args=argv)
@@ -111,7 +111,7 @@ def main(argv=None):
should_save_workspace = os.environ.get("CAPA_SAVE_WORKSPACE") not in ("0", "no", "NO", "n", None)
try:
extractor = capa.main.get_extractor(
args.sample, args.format, capa.main.BACKEND_VIV, sig_paths, should_save_workspace
args.sample, args.format, args.backend, sig_paths, should_save_workspace
)
except capa.main.UnsupportedFormatError:
logger.error("-" * 80)


@@ -18,10 +18,10 @@ requirements = [
"termcolor==1.1.0",
"wcwidth==0.2.5",
"ida-settings==2.1.0",
"viv-utils[flirt]==0.6.7",
"viv-utils[flirt]==0.6.9",
"halo==0.0.31",
"networkx==2.5.1",
"ruamel.yaml==0.17.16",
"ruamel.yaml==0.17.20",
"vivisect==1.0.5",
"smda==1.6.2",
"pefile==2021.9.3",
@@ -72,17 +72,17 @@ setuptools.setup(
"pytest-instafail==0.4.2",
"pytest-cov==3.0.0",
"pycodestyle==2.8.0",
"black==21.9b0",
"isort==5.9.3",
"mypy==0.910",
"psutil==5.8.0",
"black==21.12b0",
"isort==5.10.1",
"mypy==0.931",
"psutil==5.9.0",
# type stubs for mypy
"types-backports==0.1.3",
"types-colorama==0.4.4",
"types-PyYAML==6.0.0",
"types-tabulate==0.8.3",
"types-colorama==0.4.5",
"types-PyYAML==6.0.3",
"types-tabulate==0.8.5",
"types-termcolor==1.1.2",
"types-psutil==5.8.13",
"types-psutil==5.8.19",
],
},
zip_safe=False,


@@ -413,6 +413,7 @@ FEATURE_PRESENCE_TESTS = sorted(
# insn/number
("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF), True),
("mimikatz", "function=0x40105D", capa.features.insn.Number(0x3136B0), True),
("mimikatz", "function=0x401000", capa.features.insn.Number(0x0), True),
# insn/number: stack adjustments
("mimikatz", "function=0x40105D", capa.features.insn.Number(0xC), False),
("mimikatz", "function=0x40105D", capa.features.insn.Number(0x10), False),
@@ -420,6 +421,9 @@ FEATURE_PRESENCE_TESTS = sorted(
("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF), True),
("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF, bitness=BITNESS_X32), True),
("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF, bitness=BITNESS_X64), False),
# insn/number: negative
("mimikatz", "function=0x401553", capa.features.insn.Number(0xFFFFFFFF), True),
("mimikatz", "function=0x43e543", capa.features.insn.Number(0xFFFFFFF0), True),
# insn/offset
("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x0), True),
("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x4), True),
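The "negative" test cases above look for `Number(0xFFFFFFFF)` and `Number(0xFFFFFFF0)`. Assuming these correspond to negative immediates reported as unsigned 32-bit two's complement values (an assumption; the diff does not show the extractor change), the conversion can be sketched with a hypothetical helper:

```python
def to_unsigned(value: int, bits: int = 32) -> int:
    """Reinterpret a signed integer as an unsigned value of the given bit width."""
    return value & ((1 << bits) - 1)


minus_one = to_unsigned(-1)
minus_sixteen = to_unsigned(-16)
```

Under that assumption, a rule author who wants to match `-1` in 32-bit code would write the number as `0xFFFFFFFF`.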


@@ -5,13 +5,6 @@
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import textwrap
import capa.rules
import capa.engine
import capa.features.insn
import capa.features.common
from capa.engine import *
from capa.features import *
from capa.features.insn import *
@@ -117,419 +110,27 @@ def test_range():
assert Range(Number(1), min=1, max=3).evaluate({Number(1): {1, 2, 3, 4}}) == False
def test_range_exact():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): 2
"""
)
r = capa.rules.Rule.from_yaml(rule)
def test_short_circuit():
assert Or([Number(1), Number(2)]).evaluate({Number(1): {1}}) == True
# just enough matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" in matches
# not enough matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
# too many matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2, 3}}, 0x0)
assert "test rule" not in matches
# with short circuiting, only the children up until the first satisfied child are captured.
assert len(Or([Number(1), Number(2)]).evaluate({Number(1): {1}}, short_circuit=True).children) == 1
assert len(Or([Number(1), Number(2)]).evaluate({Number(1): {1}}, short_circuit=False).children) == 2
def test_range_range():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): (2, 3)
"""
)
r = capa.rules.Rule.from_yaml(rule)
def test_eval_order():
# base cases.
assert Or([Number(1), Number(2)]).evaluate({Number(1): {1}}) == True
assert Or([Number(1), Number(2)]).evaluate({Number(2): {1}}) == True
# just enough matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" in matches
# with short circuiting, only the children up until the first satisfied child are captured.
assert len(Or([Number(1), Number(2)]).evaluate({Number(1): {1}}).children) == 1
assert len(Or([Number(1), Number(2)]).evaluate({Number(2): {1}}).children) == 2
assert len(Or([Number(1), Number(2)]).evaluate({Number(1): {1}, Number(2): {1}}).children) == 1
# enough matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2, 3}}, 0x0)
assert "test rule" in matches
# and it's guaranteed that children are evaluated in order.
assert Or([Number(1), Number(2)]).evaluate({Number(1): {1}}).children[0].statement == Number(1)
assert Or([Number(1), Number(2)]).evaluate({Number(1): {1}}).children[0].statement != Number(2)
# not enough matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
# too many matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2, 3, 4}}, 0x0)
assert "test rule" not in matches
def test_range_exact_zero():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): 0
"""
)
r = capa.rules.Rule.from_yaml(rule)
# feature isn't indexed - good.
features, matches = capa.engine.match([r], {}, 0x0)
assert "test rule" in matches
# feature is indexed, but no matches.
# i don't think we should ever really have this case, but good to check anyways.
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {}}, 0x0)
assert "test rule" in matches
# too many matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
def test_range_with_zero():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): (0, 1)
"""
)
r = capa.rules.Rule.from_yaml(rule)
# ok
features, matches = capa.engine.match([r], {}, 0x0)
assert "test rule" in matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {}}, 0x0)
assert "test rule" in matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" in matches
# too many matches
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" not in matches
def test_match_adds_matched_rule_feature():
"""show that using `match` adds a feature for matched rules."""
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- number: 100
"""
)
r = capa.rules.Rule.from_yaml(rule)
features, matches = capa.engine.match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert capa.features.common.MatchedRule("test rule") in features
def test_match_matched_rules():
"""show that using `match` adds a feature for matched rules."""
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: test rule1
features:
- number: 100
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: test rule2
features:
- match: test rule1
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.insn.Number(100): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule1") in features
assert capa.features.common.MatchedRule("test rule2") in features
# the ordering of the rules must not matter,
# the engine should match rules in an appropriate order.
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(reversed(rules)),
{capa.features.insn.Number(100): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule1") in features
assert capa.features.common.MatchedRule("test rule2") in features
def test_substring():
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- and:
- substring: abc
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("aaaa"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") not in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("abc"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("111abc222"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("111abc"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("abc222"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
def test_regex():
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- and:
- string: /.*bbbb.*/
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: rule with implied wildcards
features:
- and:
- string: /bbbb/
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: rule with anchor
features:
- and:
- string: /^bbbb/
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.insn.Number(100): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") not in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("aaaa"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") not in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("aBBBBa"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") not in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("abbbba"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
assert capa.features.common.MatchedRule("rule with implied wildcards") in features
assert capa.features.common.MatchedRule("rule with anchor") not in features
def test_regex_ignorecase():
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- and:
- string: /.*bbbb.*/i
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String("aBBBBa"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
def test_regex_complex():
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
r"""
rule:
meta:
name: test rule
features:
- or:
- string: /.*HARDWARE\\Key\\key with spaces\\.*/i
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.common.String(r"Hardware\Key\key with spaces\some value"): {1}},
0x0,
)
assert capa.features.common.MatchedRule("test rule") in features
def test_match_namespace():
rules = [
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: CreateFile API
namespace: file/create/CreateFile
features:
- api: CreateFile
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: WriteFile API
namespace: file/write
features:
- api: WriteFile
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: file-create
features:
- match: file/create
"""
)
),
capa.rules.Rule.from_yaml(
textwrap.dedent(
"""
rule:
meta:
name: filesystem-any
features:
- match: file
"""
)
),
]
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.insn.API("CreateFile"): {1}},
0x0,
)
assert "CreateFile API" in matches
assert "file-create" in matches
assert "filesystem-any" in matches
assert capa.features.common.MatchedRule("file") in features
assert capa.features.common.MatchedRule("file/create") in features
assert capa.features.common.MatchedRule("file/create/CreateFile") in features
features, matches = capa.engine.match(
capa.rules.topologically_order_rules(rules),
{capa.features.insn.API("WriteFile"): {1}},
0x0,
)
assert "WriteFile API" in matches
assert "file-create" not in matches
assert "filesystem-any" in matches
def test_render_number():
assert str(capa.features.insn.Number(1)) == "number(0x1)"
assert str(capa.features.insn.Number(1, bitness=capa.features.common.BITNESS_X32)) == "number/x32(0x1)"
assert str(capa.features.insn.Number(1, bitness=capa.features.common.BITNESS_X64)) == "number/x64(0x1)"
def test_render_offset():
assert str(capa.features.insn.Offset(1)) == "offset(0x1)"
assert str(capa.features.insn.Offset(1, bitness=capa.features.common.BITNESS_X32)) == "offset/x32(0x1)"
assert str(capa.features.insn.Offset(1, bitness=capa.features.common.BITNESS_X64)) == "offset/x64(0x1)"
assert Or([Number(1), Number(2)]).evaluate({Number(2): {1}}).children[1].statement == Number(2)
assert Or([Number(1), Number(2)]).evaluate({Number(2): {1}}).children[1].statement != Number(1)
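The short-circuit and ordering behavior exercised by `test_short_circuit` and `test_eval_order` can be reproduced with a toy evaluator. This is a sketch only: `AnyOf`, `Num`, and `Result` are hypothetical stand-ins for capa's `Or`, `Number`, and result types, and the default of `short_circuit=True` is assumed from the tests.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Result:
    success: bool
    statement: object
    children: List["Result"] = field(default_factory=list)


@dataclass(frozen=True)
class Num:
    value: int

    def evaluate(self, features: Dict["Num", Set[int]], short_circuit: bool = True) -> Result:
        # a leaf succeeds when the feature is present in the feature set
        return Result(self in features, self)


class AnyOf:
    def __init__(self, children):
        self.children = children

    def evaluate(self, features, short_circuit: bool = True) -> Result:
        results = []
        for child in self.children:  # children are visited in declaration order
            result = child.evaluate(features, short_circuit=short_circuit)
            results.append(result)
            if short_circuit and result.success:
                break  # stop at the first satisfied child
        return Result(any(r.success for r in results), self, results)


stmt = AnyOf([Num(1), Num(2)])
first_hit = stmt.evaluate({Num(1): {1}})
full = stmt.evaluate({Num(1): {1}}, short_circuit=False)
second_hit = stmt.evaluate({Num(2): {1}})
```

With short-circuiting, the number of captured children depends on where the first satisfied child sits, which is why the tests assert both the count and the identity of each captured child.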

tests/test_match.py Normal file

@@ -0,0 +1,533 @@
# Copyright (C) 2020 FireEye, Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at: [package root]/LICENSE.txt
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import textwrap
import capa.rules
import capa.engine
import capa.features.insn
import capa.features.common
from capa.rules import Scope
from capa.features import *
from capa.features.insn import *
from capa.features.common import *
def match(rules, features, va, scope=Scope.FUNCTION):
"""
use all matching algorithms and verify that they compute the same result.
then, return those results to the caller so they can make their asserts.
"""
features1, matches1 = capa.engine.match(rules, features, va)
ruleset = capa.rules.RuleSet(rules)
features2, matches2 = ruleset.match(scope, features, va)
for feature, locations in features1.items():
assert feature in features2
assert locations == features2[feature]
for rulename, results in matches1.items():
assert rulename in matches2
assert len(results) == len(matches2[rulename])
return features1, matches1
def test_match_simple():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
namespace: testns1/testns2
features:
- number: 100
"""
)
r = capa.rules.Rule.from_yaml(rule)
features, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" in matches
assert MatchedRule("test rule") in features
assert MatchedRule("testns1") in features
assert MatchedRule("testns1/testns2") in features
def test_match_range_exact():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): 2
"""
)
r = capa.rules.Rule.from_yaml(rule)
# just enough matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" in matches
# not enough matches
_, matches = match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
# too many matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2, 3}}, 0x0)
assert "test rule" not in matches
def test_match_range_range():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): (2, 3)
"""
)
r = capa.rules.Rule.from_yaml(rule)
# just enough matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" in matches
# enough matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2, 3}}, 0x0)
assert "test rule" in matches
# not enough matches
_, matches = match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
# too many matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2, 3, 4}}, 0x0)
assert "test rule" not in matches
def test_match_range_exact_zero():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): 0
"""
)
r = capa.rules.Rule.from_yaml(rule)
# feature isn't indexed - good.
_, matches = match([r], {}, 0x0)
assert "test rule" in matches
# feature is indexed, but no matches.
# i don't think we should ever really have this case, but good to check anyways.
_, matches = match([r], {capa.features.insn.Number(100): {}}, 0x0)
assert "test rule" in matches
# too many matches
_, matches = match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" not in matches
def test_match_range_with_zero():
rule = textwrap.dedent(
"""
rule:
meta:
name: test rule
features:
- count(number(100)): (0, 1)
"""
)
r = capa.rules.Rule.from_yaml(rule)
# ok
_, matches = match([r], {}, 0x0)
assert "test rule" in matches
_, matches = match([r], {capa.features.insn.Number(100): {}}, 0x0)
assert "test rule" in matches
_, matches = match([r], {capa.features.insn.Number(100): {1}}, 0x0)
assert "test rule" in matches
# too many matches
_, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
assert "test rule" not in matches
def test_match_adds_matched_rule_feature():
    """show that using `match` adds a feature for matched rules."""
    rule = textwrap.dedent(
        """
        rule:
            meta:
                name: test rule
            features:
                - number: 100
        """
    )
    r = capa.rules.Rule.from_yaml(rule)
    features, _ = match([r], {capa.features.insn.Number(100): {1}}, 0x0)
    assert capa.features.common.MatchedRule("test rule") in features
def test_match_matched_rules():
    """show that a rule can match on another matched rule, regardless of rule order."""
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule1
                    features:
                        - number: 100
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule2
                    features:
                        - match: test rule1
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.insn.Number(100): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule1") in features
    assert capa.features.common.MatchedRule("test rule2") in features

    # the ordering of the rules must not matter,
    # the engine should match rules in an appropriate order.
    features, _ = match(
        capa.rules.topologically_order_rules(reversed(rules)),
        {capa.features.insn.Number(100): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule1") in features
    assert capa.features.common.MatchedRule("test rule2") in features
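
Review note: the test above relies on rules being evaluated after the rules they reference through `match:`. One way to picture that ordering is a topological sort over a dependency map. The sketch below uses the standard library's `graphlib` (Python 3.9+) purely as an illustration; the `dependencies` dictionary and `order_rules` are hypothetical and say nothing about how `capa.rules.topologically_order_rules` is actually implemented.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def order_rules(dependencies: Dict[str, Set[str]]) -> List[str]:
    """hypothetical helper: return rule names so that every dependency comes first."""
    # TopologicalSorter expects a mapping of node -> set of predecessors
    return list(TopologicalSorter(dependencies).static_order())


# rule name -> names of the rules it references via `match:`
dependencies = {
    "test rule2": {"test rule1"},
    "test rule1": set(),
}

ordered = order_rules(dependencies)

# the same rules, declared in the opposite order
ordered_reversed = order_rules(dict(reversed(list(dependencies.items()))))
```

Whichever order the rules are declared in, `test rule1` ends up ahead of `test rule2`, which is the property the test asserts.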
def test_match_namespace():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: CreateFile API
                        namespace: file/create/CreateFile
                    features:
                        - api: CreateFile
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: WriteFile API
                        namespace: file/write
                    features:
                        - api: WriteFile
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: file-create
                    features:
                        - match: file/create
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: filesystem-any
                    features:
                        - match: file
                """
            )
        ),
    ]

    features, matches = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.insn.API("CreateFile"): {1}},
        0x0,
    )
    assert "CreateFile API" in matches
    assert "file-create" in matches
    assert "filesystem-any" in matches
    assert capa.features.common.MatchedRule("file") in features
    assert capa.features.common.MatchedRule("file/create") in features
    assert capa.features.common.MatchedRule("file/create/CreateFile") in features

    features, matches = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.insn.API("WriteFile"): {1}},
        0x0,
    )
    assert "WriteFile API" in matches
    assert "file-create" not in matches
    assert "filesystem-any" in matches
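
Review note: the namespace test asserts that a match on a rule in `file/create/CreateFile` also yields matched-rule features for `file` and `file/create`. A minimal sketch of that prefix expansion follows; `namespace_prefixes` is a hypothetical name used only for illustration and is not taken from capa.

```python
from typing import List


def namespace_prefixes(namespace: str) -> List[str]:
    """hypothetical helper: list every ancestor namespace, shortest first, plus the namespace itself."""
    parts = namespace.split("/")
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


create_prefixes = namespace_prefixes("file/create/CreateFile")
write_prefixes = namespace_prefixes("file/write")
```

Under this model a rule that says `match: file/create` is satisfied by the first namespace but not the second, while `match: file` is satisfied by both.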
def test_match_substring():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule
                    features:
                        - and:
                            - substring: abc
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("aaaa"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") not in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("abc"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("111abc222"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("111abc"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("abc222"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
def test_match_regex():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule
                    features:
                        - and:
                            - string: /.*bbbb.*/
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: rule with implied wildcards
                    features:
                        - and:
                            - string: /bbbb/
                """
            )
        ),
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: rule with anchor
                    features:
                        - and:
                            - string: /^bbbb/
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.insn.Number(100): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") not in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("aaaa"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") not in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("aBBBBa"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") not in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("abbbba"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
    assert capa.features.common.MatchedRule("rule with implied wildcards") in features
    assert capa.features.common.MatchedRule("rule with anchor") not in features
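
Review note: the assertions above are consistent with unanchored, case-sensitive search semantics, where `/bbbb/` behaves like `/.*bbbb.*/` and `/^bbbb/` only matches at the start of the string. The sketch below reproduces those outcomes with Python's `re.search`. It assumes the engine behaves like `re.search`, which this diff does not confirm, and `rule_regex_matches` is a hypothetical helper.

```python
import re


def rule_regex_matches(pattern: str, value: str, ignorecase: bool = False) -> bool:
    """hypothetical helper: unanchored regex test, optionally case-insensitive (the `/i` suffix)."""
    flags = re.IGNORECASE if ignorecase else 0
    # re.search scans the whole string, so leading/trailing `.*` are implied
    return re.search(pattern, value, flags) is not None


implied_wildcards = rule_regex_matches("bbbb", "abbbba")
anchored = rule_regex_matches("^bbbb", "abbbba")
case_sensitive = rule_regex_matches(".*bbbb.*", "aBBBBa")
case_insensitive = rule_regex_matches(".*bbbb.*", "aBBBBa", ignorecase=True)
```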
def test_match_regex_ignorecase():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule
                    features:
                        - and:
                            - string: /.*bbbb.*/i
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("aBBBBa"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
def test_match_regex_complex():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                r"""
                rule:
                    meta:
                        name: test rule
                    features:
                        - or:
                            - string: /.*HARDWARE\\Key\\key with spaces\\.*/i
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String(r"Hardware\Key\key with spaces\some value"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
def test_match_regex_values_always_string():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule
                    features:
                        - or:
                            - string: /123/
                            - string: /0x123/
                """
            )
        ),
    ]

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("123"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features

    features, _ = match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("0x123"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
def test_match_not():
    rule = textwrap.dedent(
        """
        rule:
            meta:
                name: test rule
                namespace: testns1/testns2
            features:
                - not:
                    - number: 99
        """
    )
    r = capa.rules.Rule.from_yaml(rule)
    _, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
    assert "test rule" in matches
def test_match_not_not():
    rule = textwrap.dedent(
        """
        rule:
            meta:
                name: test rule
                namespace: testns1/testns2
            features:
                - not:
                    - not:
                        - number: 100
        """
    )
    r = capa.rules.Rule.from_yaml(rule)
    _, matches = match([r], {capa.features.insn.Number(100): {1, 2}}, 0x0)
    assert "test rule" in matches

tests/test_optimizer.py Normal file

@@ -0,0 +1,65 @@
# Copyright (C) 2021 FireEye, Inc. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at: [package root]/LICENSE.txt
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import textwrap
import pytest
import capa.rules
import capa.engine
import capa.optimizer
import capa.features.common
from capa.engine import Or, And
from capa.features.insn import Mnemonic
from capa.features.common import Arch, Bytes, Substring
def test_optimizer_order():
    rule = textwrap.dedent(
        """
        rule:
            meta:
                name: test rule
                scope: function
            features:
                - and:
                    - substring: "foo"
                    - arch: amd64
                    - mnemonic: cmp
                    - and:
                        - bytes: 3
                        - offset: 2
                    - or:
                        - number: 1
                        - offset: 4
        """
    )
    r = capa.rules.Rule.from_yaml(rule)

    # before optimization
    children = list(r.statement.get_children())
    assert isinstance(children[0], Substring)
    assert isinstance(children[1], Arch)
    assert isinstance(children[2], Mnemonic)
    assert isinstance(children[3], And)
    assert isinstance(children[4], Or)

    # after optimization
    capa.optimizer.optimize_rules([r])
    children = list(r.statement.get_children())

    # cost: 0
    assert isinstance(children[0], Arch)
    # cost: 1
    assert isinstance(children[1], Mnemonic)
    # cost: 2
    assert isinstance(children[2], Substring)
    # cost: 3
    assert isinstance(children[3], Or)
    # cost: 4
    assert isinstance(children[4], And)
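
Review note: the `cost: N` comments in the test suggest that the optimizer sorts the children of a statement by an estimated evaluation cost, cheapest first. The sketch below shows one cost model that reproduces exactly the ordering asserted above. The `Node` class, the cost table, and `optimize` are all assumptions made for illustration; this diff does not show the body of `capa.optimizer`, so the real implementation may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """hypothetical stand-in for a rule statement or feature."""

    kind: str  # e.g. "arch", "mnemonic", "substring", "and", "or"
    children: List["Node"] = field(default_factory=list)


# assumed cost classes, chosen only to agree with the comments in the test
GLOBAL_KINDS = {"os", "arch", "format"}  # cost 0
SCANNING_KINDS = {"substring", "regex", "bytes"}  # cost 2


def node_cost(node: Node) -> int:
    if node.kind in GLOBAL_KINDS:
        return 0
    if node.kind in SCANNING_KINDS:
        return 2
    if node.kind in {"and", "or"}:
        # a compound statement costs one more than the sum of its children
        return 1 + sum(node_cost(child) for child in node.children)
    return 1  # every other feature, e.g. mnemonic, number, offset


def optimize(node: Node) -> None:
    """sort children in place, cheapest first, recursing into nested statements."""
    for child in node.children:
        optimize(child)
    node.children.sort(key=node_cost)  # list.sort is stable, so ties keep their order


rule = Node(
    "and",
    [
        Node("substring"),
        Node("arch"),
        Node("mnemonic"),
        Node("and", [Node("bytes"), Node("offset")]),
        Node("or", [Node("number"), Node("offset")]),
    ],
)
optimize(rule)
order = [child.kind for child in rule.children]
costs = [node_cost(child) for child in rule.children]
```

Putting cheap checks first lets an `and` fail fast before any string or byte comparison runs, which is presumably the motivation for the reordering.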


@@ -2,9 +2,23 @@ import textwrap
import capa.rules
import capa.render.utils
import capa.features.insn
import capa.features.common
import capa.render.result_document
def test_render_number():
    assert str(capa.features.insn.Number(1)) == "number(0x1)"
    assert str(capa.features.insn.Number(1, bitness=capa.features.common.BITNESS_X32)) == "number/x32(0x1)"
    assert str(capa.features.insn.Number(1, bitness=capa.features.common.BITNESS_X64)) == "number/x64(0x1)"
def test_render_offset():
    assert str(capa.features.insn.Offset(1)) == "offset(0x1)"
    assert str(capa.features.insn.Offset(1, bitness=capa.features.common.BITNESS_X32)) == "offset/x32(0x1)"
    assert str(capa.features.insn.Offset(1, bitness=capa.features.common.BITNESS_X64)) == "offset/x64(0x1)"
def test_render_meta_attack():
    # Persistence::Create or Modify System Process::Windows Service [T1543.003]
    id = "T1543.003"


@@ -785,37 +785,6 @@ def test_substring_description():
    assert (Substring("abc") in children) == True
def test_regex_values_always_string():
    rules = [
        capa.rules.Rule.from_yaml(
            textwrap.dedent(
                """
                rule:
                    meta:
                        name: test rule
                    features:
                        - or:
                            - string: /123/
                            - string: /0x123/
                """
            )
        ),
    ]
    features, matches = capa.engine.match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("123"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
    features, matches = capa.engine.match(
        capa.rules.topologically_order_rules(rules),
        {capa.features.common.String("0x123"): {1}},
        0x0,
    )
    assert capa.features.common.MatchedRule("test rule") in features
def test_filter_rules():
    rules = capa.rules.RuleSet(
        [