109 Commits

Author SHA1 Message Date
Willi Ballenthin 4d81b7ab98 rules: add references to existing issues 2024-06-07 05:54:49 +02:00
Willi Ballenthin b068890fa6 rules: match: optimize rule matching by better indexing rule by features
Implement the "tighten rule pre-selection" algorithm described here:
https://github.com/mandiant/capa/issues/2063#issuecomment-2100498720

In summary:

> Rather than indexing all features from all rules,
> we should pick and index the minimal set (ideally, one) of
> features from each rule that must be present for the rule to match.
> When we have multiple candidates, pick the feature that is
> probably most uncommon and therefore "selective".

This seems to work pretty well. Total evaluations when running against
mimikatz drop from 19M to 1.1M (wow!) and capa seems to match around
3x more functions per second (wow wow).

When doing large scale runs, capa is about 25% faster when using the
vivisect backend (analysis heavy) or 3x faster when using the
upcoming BinExport2 backend (minimal analysis).
2024-06-07 05:54:49 +02:00
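The pre-selection idea summarized in commit b068890fa6 can be illustrated with a small sketch. This is not capa's implementation: the rule names, the feature strings, and the helpers `build_index` and `match` are hypothetical, rules are reduced to plain "all features required" lists, and "uncommon" is approximated by how many rules mention a feature.

```python
from collections import Counter, defaultdict

# Hypothetical toy rules: each rule lists features that must ALL be present.
# (Real rules combine and/or/not logic; this sketch handles only "and".)
RULES = {
    "create-process": ["api:CreateProcess", "os:windows"],
    "write-file": ["api:WriteFile", "os:windows"],
    "rc4-ksa": ["number:0x100", "mnemonic:xor"],
}


def build_index(rules):
    """Index each rule under the single rarest feature it requires."""
    # Count how many rules mention each feature, as a stand-in for "how common".
    popularity = Counter(f for feats in rules.values() for f in feats)
    index = defaultdict(list)
    for name, feats in rules.items():
        # Pick the least common feature; break ties by name for determinism.
        selective = min(feats, key=lambda f: (popularity[f], f))
        index[selective].append(name)
    return index


def match(rules, index, observed):
    """Return (matching rule names, number of rules fully evaluated)."""
    # Only rules whose indexed feature was observed become candidates.
    candidates = sorted({name for f in observed for name in index.get(f, [])})
    matches = [n for n in candidates if all(f in observed for f in rules[n])]
    return matches, len(candidates)


index = build_index(RULES)
result = match(RULES, index, {"api:WriteFile", "os:windows"})
```

Indexing every feature would make both Windows rules candidates whenever `os:windows` is observed. Indexing only the rarest required feature means a single rule is evaluated here, which is the kind of reduction in total evaluations the commit message reports.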
Fariss 508a09ef25 include rule caching in PyInstaller build process (#2097)
* include rule caching in PyInstaller build process

The following commit introduces a new function that caches the capa
rule set, so that users don't have to manually run
./scripts/cache-ruleset.py before running pyinstaller.

* ci: omit Cache rule set step from build.yml workflow

* refactor: move cache generation to cache.py

* mkdir cache directory when it does not exist

---------

Co-authored-by: Soufiane Fariss <soufiane.fariss@um5s.net.ma>
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2024-06-04 18:47:41 +02:00
RainRat 8ad74ddbb6 fix typos 2024-06-01 11:48:19 -07:00
RainRat a4a4016463 fix typos 2024-04-29 23:31:15 -07:00
Aayush Goel 49231366f1 Handles circular dependencies while getting rules and dependencies (#2014)
* Remove test for scope "unspecified"

* raise error on circular dependency

* test for circular dependency
2024-03-06 11:39:21 +01:00
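Raising an error on a circular dependency, as commit 49231366f1 describes, might look like the depth-first walk below. This is a minimal sketch under stated assumptions: `CircularDependencyError`, `dependencies_in_order`, and the dict-of-lists graph are hypothetical names, not capa's API.

```python
class CircularDependencyError(ValueError):
    """Raised when a rule (transitively) depends on itself."""


def dependencies_in_order(deps, name):
    """Return `name` and its dependencies, dependencies first.

    deps: dict mapping a rule name to the list of rule names it depends on.
    """
    ordered = []
    done = set()
    path = []  # rules on the current depth-first path

    def visit(rule):
        if rule in done:
            return
        if rule in path:
            # We came back to a rule we are still resolving: that is a cycle.
            cycle = path[path.index(rule):] + [rule]
            raise CircularDependencyError(" -> ".join(cycle))
        path.append(rule)
        for dep in deps.get(rule, []):
            visit(dep)
        path.pop()
        done.add(rule)
        ordered.append(rule)

    visit(name)
    return ordered
```

For example, `dependencies_in_order({"a": ["b"], "b": ["a"]}, "a")` raises instead of recursing forever, and the exception message names the rules that form the cycle.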
Willi Ballenthin 0f9dd9095b fmt 2024-02-14 15:57:24 +01:00
Willi Ballenthin 4e2f175b9f rules: don't eagerly import ruamel until needed 2024-02-14 15:57:24 +01:00
Willi Ballenthin c3301d3b3f refactor main for ease of integration (#1948)
* main: split main into a bunch of "main routines"

[wip] since there are a few references to BinExport2
that are in progress elsewhere. Next commit will remove them.

* main: remove references to wip BinExport2 code

* changelog

* main: rename first position argument "input_file"

closes #1946

* main: linters

* main: move rule-related routines to capa.rules

ref #1821

* main: extract routines to capa.loader module

closes #1821

* add loader module

* loader: learn to load freeze format

* freeze: use new cli arg handling

* Update capa/loader.py

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>

* main: remove duplicate documentation

* main: add doc about where some functions live

* scripts: migrate to new main wrapper helper functions

* scripts: port to main routines

* main: better handle auto-detection of backend

* scripts: migrate bulk-process to main wrappers

* scripts: migrate scripts to main wrappers

* main: rename *_from_args to *_from_cli

* changelog

* cache-ruleset: remove duplication

* main: fix tag handling

* cache-ruleset: fix cli args

* cache-ruleset: fix special rule cli handling

* scripts: fix type bytes

* main: remove old TODO message

* loader: fix references to binja extractor

---------

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2024-01-29 13:59:05 +01:00
Willi Ballenthin ad46b33bb7 com: move database into python files (#1924)
* com: move database into python files

* com: pep8 and lints

* com: fix generated string feature type

* pyinstaller: remove reference to old assets directory
2024-01-11 14:06:24 +01:00
Mike Hunhoff f37b598010 fix: do not trim api names that include :: (#1897) 2024-01-08 10:59:24 -07:00
mr-tz 51ddadbc87 fix symbol generation, ordinals 2023-12-03 17:49:54 +02:00
Yacine 0097822e51 Merge pull request #1820 from yelhamer/capabilities-module
add a capabilities module
2023-10-27 13:39:49 +02:00
Yacine Elhamer e559cc27d5 capa.rules: remove redundant ceng.MatchResults import 2023-10-26 19:43:26 +02:00
Yacine Elhamer a0cec3f07d capa.rules: remove redundant is_internal_rule() and has_file_limitations() from capa source code 2023-10-26 19:41:09 +02:00
Yacine Elhamer ab06c94d80 capa/main.py: move has_rule_with_namespace() to capa.rules.RuleSet 2023-10-20 20:10:29 +02:00
Moritz c9df78252a Ignore DLL names for API features (#1824)
* ignore DLL name for api features

* keep DLL name for import features

* fix tests
2023-10-20 13:39:15 +02:00
Yacine Elhamer aae72667a3 Merge branch 'capabilities-module' of https://github.com/yelhamer/capa into capabilities-module 2023-10-20 10:16:41 +02:00
Yacine Elhamer d6c5d98b0d move is_file_limitation_rule() to the rules module (Rule class) 2023-10-20 10:16:09 +02:00
Willi Ballenthin d5e187bc70 Merge branch 'master' into dynamic-feature-extraction 2023-10-19 09:15:57 +00:00
Aayush Goel 94cf53a1e3 Update __init__.py 2023-10-18 16:33:31 +05:30
Aayush Goel 6dbd3768ce Update __init__.py 2023-10-17 21:04:21 +05:30
Aayush Goel 7cd5aa1c40 Added Enum for comType 2023-10-17 20:28:49 +05:30
Willi Ballenthin 1aac4a1a69 mypy 2023-10-17 14:42:58 +00:00
Aayush Goel 884b714be2 loading com db only once
avoid loading db multiple times by caching it.
2023-10-17 19:48:06 +05:30
Willi Ballenthin e1b3a3f6b4 rules: fix rendering of yaml 2023-10-17 12:22:32 +00:00
Willi Ballenthin 44d05f9498 dynamic: fix some tests 2023-10-17 11:41:40 +00:00
Aayush Goel 23ecb248a5 Update __init__.py 2023-10-10 18:08:07 +05:30
Aayush Goel bc165331db Update __init__.py 2023-10-10 17:56:18 +05:30
Aayush Goel 24dad6bcc4 Update capa/rules/__init__.py
Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
2023-08-30 21:48:48 +05:30
Aayush Goel ab3747e448 added com prefix CLSID, IID 2023-08-30 01:00:07 +05:30
Yacine Elhamer 49adecb25c add yaml representer for the Scope class, as well as other bugfixes 2023-08-26 18:11:35 +02:00
Willi Ballenthin 9bbd3184b0 rules: handle unsupported scopes again 2023-08-25 13:15:55 +00:00
Willi Ballenthin a734358377 rules: use Scope enum instead of constants 2023-08-25 12:54:57 +00:00
Aayush Goel bd0d8eb403 Update __init__.py
added parse_description for com feature
Update CHANGELOG.md
added comments, dealt with errors
2023-08-25 16:04:25 +05:30
Aayush Goel 95e279a03b update com db
moved code to rules/init.py, create db for coms
2023-08-25 15:32:40 +05:30
Yacine 86effec1a2 capa/rules/__init__.py: merge features from small scopes into larger ones
Co-authored-by: Willi Ballenthin <wballenthin@google.com>
2023-08-23 08:49:36 +03:00
Willi Ballenthin 4ab240e990 rules: add scope terms "unsupported" and "unspecified"
closes #1744
2023-08-22 12:58:06 +00:00
Willi Ballenthin 89c8c6d212 Update capa/rules/__init__.py 2023-08-22 09:38:41 +02:00
Aayush Goel 1027da9be0 add new feature for com 2023-08-20 00:36:37 +05:30
Willi Ballenthin 8202e9e921 main: don't use analysis flavor to filter rules
I'm worried this will interact poorly with our rule cache,
unless we add more handling there, which needs more testing.
so, since the filtering likely has only a small impact on performance,
revert the rule filtering changes for simplicity.
2023-08-11 10:36:59 +00:00
Willi Ballenthin 3c069a6784 rules: don't change passed-in argument
make a local copy of the scopes dict
2023-08-11 10:35:40 +00:00
Willi Ballenthin e100a63cc8 rules: use set instead of tuple, add doc
since the primary operation is `contain()`,
set is more appropriate than tuple.
2023-08-11 10:34:41 +00:00
Willi Ballenthin c1fbb27d73 Merge branch 'master' into dynamic-feature-extraction 2023-08-10 13:21:49 +00:00
Aayush Goel df9828dd7f Update capa/rules/__init__.py
Co-authored-by: Willi Ballenthin <wballenthin@google.com>
2023-08-09 15:32:12 +05:30
Aayush Goel 0fdc1dd3f5 Type Hints done, get_all_feature to Rule class 2023-08-07 21:00:29 +05:30
Yacine Elhamer f461f65a86 move thread-scope features into the call-scope 2023-08-06 18:12:29 +01:00
Yacine Elhamer 8b36cd1e35 add call-scope tests 2023-08-04 16:20:37 +01:00
Yacine eafed0f1d4 build_statements(): fix call-scope InvalidRule message typo
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
2023-08-03 14:38:38 +01:00
Yacine Elhamer ca2760fb46 Initial commit 2023-08-02 22:46:54 +01:00