Merge pull request #266 from fireeye/release-v1.2.0

release v1.2.0
Merge pull request #264 from winniepe/master
2025-12-13 08:00:44 -08:00 · 2020-08-31 10:29:38 -06:00 · 2020-08-31 09:22:34 -06:00 · 2020-08-31 13:51:49 +00:00 · 2020-08-31 13:02:37 +00:00 · 2020-08-30 16:48:51 +00:00
42 changed files with 1965 additions and 1628 deletions
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -4,7 +4,6 @@ on:
  release:
    types: [created, edited]

-
 jobs:
  build:
    name: PyInstaller for ${{ matrix.os }}
@@ -15,13 +14,13 @@ jobs:
          - os: ubuntu-16.04
            # use old linux so that the shared library versioning is more portable
            artifact_name: capa
-            asset_name: capa-linux
+            asset_name: linux
          - os: windows-latest
            artifact_name: capa.exe
-            asset_name: capa-windows.exe
+            asset_name: windows
          - os: macos-latest
            artifact_name: capa
-            asset_name: capa-macos
+            asset_name: macos
    steps:
      - name: Checkout capa
        uses: actions/checkout@v2
@@ -32,7 +31,8 @@ jobs:
        with:
          python-version: 2.7
      - name: Install PyInstaller
-      run: pip install pyinstaller
+        # pyinstaller 4 doesn't support Python 2.7
+        run: pip install 'pyinstaller==3.*'
      - name: Install capa
        run: pip install -e .
      - name: Build standalone executable
@@ -43,10 +43,35 @@ jobs:
        with:
          name: ${{ matrix.asset_name }}
          path: dist/${{ matrix.artifact_name }}
-    - name: Upload binaries to GH Release
+
+  zip:
+    name: zip ${{ matrix.asset_name }}
+    runs-on: ubuntu-latest
+    needs: build
+    strategy:
+      matrix:
+        include:
+          - asset_name: linux
+            artifact_name: capa
+          - asset_name: windows
+            artifact_name: capa.exe
+          - asset_name: macos
+            artifact_name: capa
+    steps:
+      - name: Download ${{ matrix.asset_name }}
+        uses: actions/download-artifact@v2
+        with:
+          name: ${{ matrix.asset_name }}
+      - name: Set executable flag
+        run: chmod +x ${{ matrix.artifact_name }}
+      - name: Set zip name
+        run: echo ::set-env name=zip_name::capa-${GITHUB_REF#refs/tags/}-${{ matrix.asset_name }}.zip
+      - name: Zip ${{ matrix.artifact_name }} into ${{ env.zip_name }}
+        run: zip ${{ env.zip_name }} ${{ matrix.artifact_name }}
+      - name: Upload ${{ env.zip_name }} to GH Release
        uses: svenstaro/upload-release-action@v2
        with:
-        repo_token: ${{ secrets.CAPA_TOKEN }}
-        file: dist/${{ matrix.artifact_name }}
-        asset_name: ${{ matrix.asset_name }}
+          repo_token: ${{ secrets.GITHUB_TOKEN}}
+          file: ${{ env.zip_name }}
          tag: ${{ github.ref }}
+
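Review note on the `Set zip name` step: the shell expansion `${GITHUB_REF#refs/tags/}` strips a leading `refs/tags/` from the ref, so a release tagged `v1.2.0` produces `capa-v1.2.0-linux.zip` and so on. The same derivation, as a minimal Python sketch (the helper name `release_zip_name` is hypothetical and not part of the workflow):

```python
def release_zip_name(github_ref, asset_name):
    """Build the archive name the way the `Set zip name` step does.

    Mirrors the shell expansion ${GITHUB_REF#refs/tags/}, which removes a
    leading "refs/tags/" and leaves any other ref untouched.
    """
    prefix = "refs/tags/"
    tag = github_ref[len(prefix):] if github_ref.startswith(prefix) else github_ref
    return "capa-%s-%s.zip" % (tag, asset_name)


# one archive per entry in the `zip` job matrix
names = [release_zip_name("refs/tags/v1.2.0", asset) for asset in ("linux", "windows", "macos")]
print(names)
```

A ref that is not a tag passes through unchanged, which is why the workflow only makes sense on `release` events.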
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -41,17 +41,26 @@ jobs:
      run: python scripts/lint.py rules/

  tests:
+    name: Tests in ${{ matrix.python }}
    runs-on: ubuntu-latest
    needs: [code_style, rule_linter]
+    strategy:
+      matrix:
+        include:
+          - python: 2.7
+          - python: 3.6
+          - python: 3.7
+          - python: 3.8
+          - python: '3.9.0-alpha - 3.9.x' # Python latest
    steps:
    - name: Checkout capa with submodules
      uses: actions/checkout@v2
      with:
        submodules: true
-    - name: Set up Python 2.7
+    - name: Set up Python ${{ matrix.python }}
      uses: actions/setup-python@v2
      with:
-        python-version: 2.7
+        python-version: ${{ matrix.python }}
    - name: Install capa
      run: pip install -e .[dev]
    - name: Run tests
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,6 +1,6 @@
 [submodule "rules"]
 	path = rules
-	url = git@github.com:fireeye/capa-rules.git
+	url = ../capa-rules.git
 [submodule "tests/data"]
 	path = tests/data
-	url = git@github.com:fireeye/capa-testfiles.git
+	url = ../capa-testfiles.git
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,186 @@
+# Change Log
+
+## v1.2.0 (2020-08-31)
+
+This release brings UI enhancements, especially for the IDA Pro plugin,
+invests in py3 support,
+fixes some bugs identified by the community,
+and adds 46 (!) new rules.
+We received contributions from ten reverse engineers, including five new ones:
+
+  - @agithubuserlol
+  - @recvfrom
+  - @D4nch3n
+  - @edeca
+  - @winniepe 
+  
+Download a standalone binary below and check out the readme [here on GitHub](https://github.com/fireeye/capa/).
+Report issues on our [issue tracker](https://github.com/fireeye/capa/issues)
+and contribute new rules at [capa-rules](https://github.com/fireeye/capa-rules/).
+ 
+### New features
+
+  - ida plugin: display arch flavors @mike-hunhoff
+  - ida plugin: display block descriptions @mike-hunhoff
+  - ida backend: extract features from nested pointers @mike-hunhoff
+  - main: show more progress output @williballenthin
+  - core: pin dependency versions #258 @recvfrom
+
+### New rules
+  - bypass UAC via AppInfo ALPC @agithubuserlol
+  - bypass UAC via token manipulation @agithubuserlol
+  - check for sandbox and av modules @re-fox
+  - check for sandbox username @re-fox
+  - check if process is running under wine @re-fox
+  - validate credit card number using luhn algorithm @re-fox
+  - validate credit card number using luhn algorithm with no lookup table @re-fox
+  - hash data using FNV @edeca @mr-tz
+  - link many functions at runtime @mr-tz
+  - reference public RSA key @mr-tz
+  - packed with ASPack @williballenthin
+  - delete internet cache @mike-hunhoff
+  - enumerate internet cache @mike-hunhoff
+  - send ICMP echo request @mike-hunhoff
+  - check for debugger via API @mike-hunhoff
+  - check for hardware breakpoints @mike-hunhoff
+  - check for kernel debugger via shared user data structure @mike-hunhoff
+  - check for protected handle exception @mike-hunhoff
+  - check for software breakpoints @mike-hunhoff
+  - check for trap flag exception @mike-hunhoff
+  - check for unexpected memory writes @mike-hunhoff
+  - check process job object @mike-hunhoff
+  - reference anti-VM strings targeting Parallels @mike-hunhoff
+  - reference anti-VM strings targeting Qemu @mike-hunhoff
+  - reference anti-VM strings targeting VirtualBox @mike-hunhoff
+  - reference anti-VM strings targeting VirtualPC @mike-hunhoff
+  - reference anti-VM strings targeting VMWare @mike-hunhoff
+  - reference anti-VM strings targeting Xen @mike-hunhoff
+  - reference analysis tools strings @mike-hunhoff
+  - reference WMI statements @mike-hunhoff
+  - get number of processor cores @mike-hunhoff
+  - get number of processors @mike-hunhoff
+  - enumerate disk properties @mike-hunhoff
+  - get disk size @mike-hunhoff
+  - get process heap flags @mike-hunhoff
+  - get process heap force flags @mike-hunhoff
+  - get Explorer PID @mike-hunhoff
+  - delay execution @mike-hunhoff
+  - check for process debug object @mike-hunhoff
+  - check license value @mike-hunhoff
+  - check ProcessDebugFlags @mike-hunhoff
+  - check ProcessDebugPort @mike-hunhoff
+  - check SystemKernelDebuggerInformation @mike-hunhoff
+  - check thread yield allowed @mike-hunhoff
+  - enumerate system firmware tables @mike-hunhoff
+  - get system firmware table @mike-hunhoff
+  - hide thread from debugger @mike-hunhoff
+
+### Bug fixes
+
+  - ida backend: extract unmapped immediate number features @mike-hunhoff
+  - ida backend: fix stack cookie check #257 @mike-hunhoff
+  - viv backend: better extract gs segment access @williballenthin
+  - core: enable counting of string features #241 @D4nch3n @williballenthin
+  - core: enable descriptions on feature with arch flavors @mike-hunhoff
+  - core: update git links for non-SSH access #259 @recvfrom
+
+### Changes
+
+  - ida plugin: better default display showing first level nesting @winniepe
+  - remove unused `characteristic(switch)` feature @ana06
+  - prepare testing infrastructure for multiple backends/py3 @williballenthin
+  - ci: zip build artifacts @ana06
+  - ci: build all supported python versions @ana06
+  - code style and formatting @mr-tz
+
+### Raw diffs
+
+  - [capa v1.1.0...v1.2.0](https://github.com/fireeye/capa/compare/v1.1.0...v1.2.0)
+  - [capa-rules v1.1.0...v1.2.0](https://github.com/fireeye/capa-rules/compare/v1.1.0...v1.2.0)
+
+## v1.1.0 (2020-08-05)
+
+This release brings new rule format updates, such as adding `offset/x32` and negative offsets,
+fixes some bugs identified by the community, and 28 (!) new rules.
+We received contributions from eight reverse engineers, including four new ones:
+
+  - @re-fox
+  - @psifertex
+  - @bitsofbinary
+  - @threathive
+  
+Download a standalone binary below and check out the readme [here on GitHub](https://github.com/fireeye/capa/). Report issues on our [issue tracker](https://github.com/fireeye/capa/issues) and contribute new rules at [capa-rules](https://github.com/fireeye/capa-rules/).
+  
+### New features
+
+  - import: add Binary Ninja import script #205 #207 @psifertex
+  - rules: offsets can be negative #197 #208 @williballenthin
+  - rules: enable descriptions for statement nodes #194 #209 @Ana06
+  - rules: add arch flavors to number and offset features #210 #216 @williballenthin
+  - render: show SHA1/SHA256 in default report #164 @threathive
+  - tests: add tests for IDA Pro backend #202 @williballenthin
+  
+### New rules
+
+  - check for unmoving mouse cursor @BitsOfBinary
+  - check mutex and exit @re-fox
+  - parse credit card information @re-fox
+  - read ini file @re-fox
+  - validate credit card number with luhn algorithm @re-fox
+  - change the wallpaper @re-fox
+  - acquire debug privileges @williballenthin
+  - import public key @williballenthin
+  - terminate process by name @williballenthin
+  - encrypt data using DES @re-fox
+  - encrypt data using DES via WinAPI @re-fox
+  - hash data using sha1 via x86 extensions @re-fox
+  - hash data using sha256 via x86 extensions @re-fox
+  - capture network configuration via ipconfig @re-fox
+  - hash data via WinCrypt @mike-hunhoff
+  - get file attributes @mike-hunhoff
+  - allocate thread local storage @mike-hunhoff
+  - get thread local storage value @mike-hunhoff
+  - set thread local storage @mike-hunhoff
+  - get session integrity level @mike-hunhoff
+  - add file to cabinet file @mike-hunhoff
+  - flush cabinet file @mike-hunhoff
+  - open cabinet file @mike-hunhoff
+  - gather firefox profile information @re-fox
+  - encrypt data using skipjack @re-fox
+  - encrypt data using camellia @re-fox
+  - hash data using tiger @re-fox
+  - encrypt data using blowfish @re-fox
+  - encrypt data using twofish @re-fox
+
+### Bug fixes
+
+  - linter: fix exception when examples is `None` @Ana06
+  - linter: fix suggested recommendations via templating @williballenthin
+  - render: fix exception when rendering counts @williballenthin
+  - render: fix render of negative offsets @williballenthin
+  - extractor: fix segmentation violation from vivisect @williballenthin
+  - main: fix crash when .viv cannot be saved #168 @secshoggoth @williballenthin
+  - main: fix shellcode .viv save path @williballenthin
+
+### Changes
+
+  - doc: explain how to bypass gatekeeper on macOS @psifertex
+  - doc: explain supported linux distributions @Ana06
+  - doc: explain submodule update with --init @psifertex
+  - main: improve program help output @mr-tz
+  - main: disable progress when run in quiet mode @mr-tz
+  - main: assert supported IDA versions @mr-tz
+  - extractor: better identify nested pointers to strings @williballenthin
+  - setup: specify vivisect download url @Ana06
+  - setup: pin vivisect version @williballenthin
+  - setup: bump vivisect dependency version @williballenthin
+  - setup: set Python project name to `flare-capa` @williballenthin
+  - ci: run tests and linter via GitHub Actions @Ana06
+  - hooks: run style checkers and hide stashed output @Ana06
+  - linter: ignore period in rule filename @williballenthin
+  - linter: warn on nursery rule with no changes needed @williballenthin
+
+### Raw diffs
+
+  - [capa v1.0.0...v1.1.0](https://github.com/fireeye/capa/compare/v1.0.0...v1.1.0)
+  - [capa-rules v1.0.0...v1.1.0](https://github.com/fireeye/capa-rules/compare/v1.0.0...v1.1.0)
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 ![capa](.github/logo.png)

 [![CI status](https://github.com/fireeye/capa/workflows/CI/badge.svg)](https://github.com/fireeye/capa/actions?query=workflow%3ACI+event%3Apush+branch%3Amaster)
-[![Number of rules](https://img.shields.io/badge/rules-290-blue.svg)](https://github.com/fireeye/capa-rules)
+[![Number of rules](https://img.shields.io/badge/rules-341-blue.svg)](https://github.com/fireeye/capa-rules)
 [![License](https://img.shields.io/badge/license-Apache--2.0-green.svg)](LICENSE.txt)

 capa detects capabilities in executable files.
--- a/capa/features/__init__.py
+++ b/capa/features/__init__.py
@@ -161,7 +161,7 @@ class Regex(String):


 class StringFactory(object):
-    def __new__(self, value, description):
+    def __new__(self, value, description=None):
        if value.startswith("/") and (value.endswith("/") or value.endswith("/i")):
            return Regex(value, description=description)
        return String(value, description=description)
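Review note: with `description=None` as the default, callers can build a string feature from a bare value. A self-contained sketch of the dispatch, assuming simplified stand-ins for `String` and `Regex` (the real classes carry more behavior than shown here):

```python
class String(object):
    # simplified stand-in for capa's String feature
    def __init__(self, value, description=None):
        self.value = value
        self.description = description


class Regex(String):
    # simplified stand-in: the real class also compiles the pattern
    pass


class StringFactory(object):
    def __new__(cls, value, description=None):
        # values shaped like /pattern/ or /pattern/i are treated as regular expressions
        if value.startswith("/") and (value.endswith("/") or value.endswith("/i")):
            return Regex(value, description=description)
        return String(value, description=description)


plain = StringFactory("CreateFileA")
pattern = StringFactory("/create.*file/i", description="case-insensitive match")
print(type(plain).__name__, type(pattern).__name__)
```

Because `__new__` returns an object that is not an instance of `StringFactory`, Python does not call `StringFactory.__init__` afterwards, so the factory never produces an instance of itself.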
--- a/capa/features/extractors/__init__.py
+++ b/capa/features/extractors/__init__.py
@@ -196,7 +196,7 @@ class NullFeatureExtractor(FeatureExtractor):
            'functions': {
                0x401000: {
                    'features': [
-                        (0x401000, capa.features.Characteristic('switch')),
+                        (0x401000, capa.features.Characteristic('nzxor')),
                    ],
                    'basic blocks': {
                        0x401000: {
--- a/capa/features/extractors/ida/__init__.py
+++ b/capa/features/extractors/ida/__init__.py
@@ -55,16 +55,27 @@ class IdaFeatureExtractor(FeatureExtractor):
    def get_functions(self):
        import capa.features.extractors.ida.helpers as ida_helpers

+        # data structure shared across functions yielded here.
+        # useful for caching analysis relevant across a single workspace.
+        ctx = {}
+
        # ignore library functions and thunk functions as identified by IDA
        for f in ida_helpers.get_functions(skip_thunks=True, skip_libs=True):
+            setattr(f, "ctx", ctx)
            yield add_ea_int_cast(f)

+    @staticmethod
+    def get_function(ea):
+        f = idaapi.get_func(ea)
+        setattr(f, "ctx", {})
+        return add_ea_int_cast(f)
+
    def extract_function_features(self, f):
        for (feature, ea) in capa.features.extractors.ida.function.extract_features(f):
            yield feature, ea

    def get_basic_blocks(self, f):
-        for bb in idaapi.FlowChart(f, flags=idaapi.FC_PREDS):
+        for bb in capa.features.extractors.ida.helpers.get_function_blocks(f):
            yield add_ea_int_cast(bb)

    def extract_basic_block_features(self, f, bb):
--- a/capa/features/extractors/ida/function.py
+++ b/capa/features/extractors/ida/function.py
@@ -14,16 +14,6 @@ from capa.features import Characteristic
 from capa.features.extractors import loops


-def extract_function_switch(f):
-    """ extract switch indicators from a function
-
-        arg:
-            f (IDA func_t)
-    """
-    if capa.features.extractors.ida.helpers.is_function_switch_statement(f):
-        yield Characteristic("switch"), f.start_ea
-
-
 def extract_function_calls_to(f):
    """extract callers to a function

@@ -72,7 +62,7 @@ def extract_features(f):
            yield feature, ea


-FUNCTION_HANDLERS = (extract_function_calls_to, extract_function_switch, extract_function_loop, extract_recursive_call)
+FUNCTION_HANDLERS = (extract_function_calls_to, extract_function_loop, extract_recursive_call)


 def main():
--- a/capa/features/extractors/ida/helpers.py
+++ b/capa/features/extractors/ida/helpers.py
@@ -300,22 +300,6 @@ def is_function_recursive(f):
    return False


-def is_function_switch_statement(f):
-    """ check a function for switch statement indicators
-
-        adapted from:
-        https://reverseengineering.stackexchange.com/questions/17548/calc-switch-cases-in-idapython-cant-iterate-over-results?rq=1
-
-        arg:
-            f (IDA func_t)
-    """
-    for (start, end) in idautils.Chunks(f.start_ea):
-        for head in idautils.Heads(start, end):
-            if idaapi.get_switch_info(head):
-                return True
-    return False
-
-
 def is_basic_block_tight_loop(bb):
    """check basic block loops to self

@@ -331,3 +315,47 @@ def is_basic_block_tight_loop(bb):
            if ref == bb.start_ea:
                return True
    return False
+
+
+def find_data_reference_from_insn(insn, max_depth=10):
+    """ search for data reference from instruction, return address of instruction if no reference exists """
+    depth = 0
+    ea = insn.ea
+
+    while True:
+        data_refs = list(idautils.DataRefsFrom(ea))
+
+        if len(data_refs) != 1:
+            # break if no refs or more than one ref (assume nested pointers only have one data reference)
+            break
+
+        if ea == data_refs[0]:
+            # break if circular reference
+            break
+
+        depth += 1
+        if depth > max_depth:
+            # break if max depth
+            break
+
+        ea = data_refs[0]
+
+    return ea
+
+
+def get_function_blocks(f):
+    """yield basic blocks contained in specified function
+
+    args:
+        f (IDA func_t)
+    yield:
+        block (IDA BasicBlock)
+    """
+    # leverage idaapi.FC_NOEXT flag to ignore useless external blocks referenced by the function
+    for block in idaapi.FlowChart(f, flags=(idaapi.FC_PREDS | idaapi.FC_NOEXT)):
+        yield block
+
+
+def is_basic_block_return(bb):
+    """ check if basic block is return block """
+    return bb.type == idaapi.fcb_ret
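Review note on `find_data_reference_from_insn`: the loop follows a chain of pointers only while each address has exactly one outgoing data reference, and it stops on self-references or once `max_depth` is exceeded. The sketch below reproduces that control flow outside IDA. `follow_single_data_refs`, `toy_refs`, and the `REFS` table are hypothetical stand-ins for the helper and for `idautils.DataRefsFrom`.

```python
def follow_single_data_refs(start_ea, data_refs_from, max_depth=10):
    """Follow single-reference pointer chains; return start_ea if there is nothing to follow."""
    depth = 0
    ea = start_ea
    while True:
        refs = list(data_refs_from(ea))
        if len(refs) != 1:
            # no refs, or more than one ref: not a simple nested pointer
            break
        if refs[0] == ea:
            # circular reference to self
            break
        depth += 1
        if depth > max_depth:
            break
        ea = refs[0]
    return ea


# toy reference graph standing in for an IDA database
REFS = {
    0x1000: [0x2000],          # instruction -> pointer
    0x2000: [0x3000],          # pointer -> data
    0x4000: [0x4000],          # self reference
    0x5000: [0x5001, 0x5002],  # ambiguous: two references
}


def toy_refs(ea):
    return REFS.get(ea, [])


resolved = follow_single_data_refs(0x1000, toy_refs)
print(hex(resolved))
```

Callers in the diff compare the result against `insn.ea` to decide whether any reference was found, which is why the start address doubles as the "nothing found" value.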
--- a/capa/features/extractors/ida/insn.py
+++ b/capa/features/extractors/ida/insn.py
@@ -15,41 +15,42 @@ import capa.features.extractors.ida.helpers
 from capa.features import ARCH_X32, ARCH_X64, MAX_BYTES_FEATURE_SIZE, Bytes, String, Characteristic
 from capa.features.insn import Number, Offset, Mnemonic

-_file_imports_cache = None
+# security cookie checks may perform non-zeroing XORs; these are expected within a certain byte
+# range in the first and returning basic blocks, so ignoring them there helps reduce FP features
+SECURITY_COOKIE_BYTES_DELTA = 0x40


-def get_arch():
+def get_arch(ctx):
    """
    fetch the ARCH_* constant for the currently open workspace.
-    we expect this routine to be pretty lightweight, so we don't cache it.

    via Tamir Bahar/@tmr232
    https://reverseengineering.stackexchange.com/a/11398/17194
    """
+    if "arch" not in ctx:
        info = idaapi.get_inf_structure()
        if info.is_64bit():
-        return ARCH_X64
+            ctx["arch"] = ARCH_X64
        elif info.is_32bit():
-        return ARCH_X32
+            ctx["arch"] = ARCH_X32
        else:
            raise ValueError("unexpected architecture")
+    return ctx["arch"]


-def get_imports():
-    """ """
-    global _file_imports_cache
-    if _file_imports_cache is None:
-        _file_imports_cache = capa.features.extractors.ida.helpers.get_file_imports()
-    return _file_imports_cache
+def get_imports(ctx):
+    if "imports_cache" not in ctx:
+        ctx["imports_cache"] = capa.features.extractors.ida.helpers.get_file_imports()
+    return ctx["imports_cache"]


-def check_for_api_call(insn):
+def check_for_api_call(ctx, insn):
    """ check instruction for API call """
    if not idaapi.is_call_insn(insn):
        return

    for ref in idautils.CodeRefsFrom(insn.ea, False):
-        info = get_imports().get(ref, ())
+        info = get_imports(ctx).get(ref, ())
        if info:
            yield "%s.%s" % (info[0], info[1])
        else:
@@ -59,7 +60,7 @@ def check_for_api_call(insn):
            if f and (f.flags & idaapi.FUNC_THUNK):
                for thunk_ref in idautils.DataRefsFrom(ref):
                    # TODO: always data ref for thunk??
-                    info = get_imports().get(thunk_ref, ())
+                    info = get_imports(ctx).get(thunk_ref, ())
                    if info:
                        yield "%s.%s" % (info[0], info[1])

@@ -75,7 +76,7 @@ def extract_insn_api_features(f, bb, insn):
    example:
        call dword [0x00473038]
    """
-    for api in check_for_api_call(insn):
+    for api in check_for_api_call(f.ctx, insn):
        for (feature, ea) in capa.features.extractors.helpers.generate_api_features(api, insn.ea):
            yield feature, ea

@@ -101,11 +102,14 @@ def extract_insn_number_features(f, bb, insn):
        #   .text:00401145 add esp, 0Ch
        return

-    for op in capa.features.extractors.ida.helpers.get_insn_ops(insn, target_ops=(idaapi.o_imm,)):
+    for op in capa.features.extractors.ida.helpers.get_insn_ops(insn, target_ops=(idaapi.o_imm, idaapi.o_mem)):
+        if op.type == idaapi.o_imm:
            const = capa.features.extractors.ida.helpers.mask_op_val(op)
+        else:
+            const = op.addr
        if not idaapi.is_mapped(const):
            yield Number(const), insn.ea
-            yield Number(const, arch=get_arch()), insn.ea
+            yield Number(const, arch=get_arch(f.ctx)), insn.ea


 def extract_insn_bytes_features(f, bb, insn):
@@ -119,11 +123,8 @@ def extract_insn_bytes_features(f, bb, insn):
    example:
        push    offset iid_004118d4_IShellLinkA ; riid
    """
-    if idaapi.is_call_insn(insn):
-        # ignore call instructions
-        return
-
-    for ref in idautils.DataRefsFrom(insn.ea):
+    ref = capa.features.extractors.ida.helpers.find_data_reference_from_insn(insn)
+    if ref != insn.ea:
        extracted_bytes = capa.features.extractors.ida.helpers.read_bytes_at(ref, MAX_BYTES_FEATURE_SIZE)
        if extracted_bytes and not capa.features.extractors.helpers.all_zeros(extracted_bytes):
            yield Bytes(extracted_bytes), insn.ea
@@ -140,7 +141,8 @@ def extract_insn_string_features(f, bb, insn):
    example:
        push offset aAcr     ; "ACR  > "
    """
-    for ref in idautils.DataRefsFrom(insn.ea):
+    ref = capa.features.extractors.ida.helpers.find_data_reference_from_insn(insn)
+    if ref != insn.ea:
        found = capa.features.extractors.ida.helpers.find_string_at(ref)
        if found:
            yield String(found), insn.ea
@@ -173,7 +175,7 @@ def extract_insn_offset_features(f, bb, insn):
        op_off = capa.features.extractors.helpers.twos_complement(op_off, 32)

        yield Offset(op_off), insn.ea
-        yield Offset(op_off, arch=get_arch()), insn.ea
+        yield Offset(op_off, arch=get_arch(f.ctx)), insn.ea


 def contains_stack_cookie_keywords(s):
@@ -225,12 +227,37 @@ def bb_stack_cookie_registers(bb):
                    yield op.reg


+def is_nzxor_stack_cookie_delta(f, bb, insn):
+    """ check if nzxor exists within stack cookie delta """
+    # security cookie check should use SP or BP
+    if not capa.features.extractors.ida.helpers.is_frame_register(insn.Op2.reg):
+        return False
+
+    f_bbs = tuple(capa.features.extractors.ida.helpers.get_function_blocks(f))
+
+    # expect security cookie init in first basic block within first bytes (instructions)
+    if capa.features.extractors.ida.helpers.is_basic_block_equal(bb, f_bbs[0]) and insn.ea < (
+        bb.start_ea + SECURITY_COOKIE_BYTES_DELTA
+    ):
+        return True
+
+    # ... or within last bytes (instructions) before a return
+    if capa.features.extractors.ida.helpers.is_basic_block_return(bb) and insn.ea > (
+        bb.start_ea + capa.features.extractors.ida.helpers.basic_block_size(bb) - SECURITY_COOKIE_BYTES_DELTA
+    ):
+        return True
+
+    return False
+
+
 def is_nzxor_stack_cookie(f, bb, insn):
    """ check if nzxor is related to stack cookie """
    if contains_stack_cookie_keywords(idaapi.get_cmt(insn.ea, False)):
        # Example:
        #   xor     ecx, ebp        ; StackCookie
        return True
+    if is_nzxor_stack_cookie_delta(f, bb, insn):
+        return True
    stack_cookie_regs = tuple(bb_stack_cookie_registers(bb))
    if any(op_reg in stack_cookie_regs for op_reg in (insn.Op1.reg, insn.Op2.reg)):
        # Example:
@@ -322,7 +349,7 @@ def extract_insn_cross_section_cflow(f, bb, insn):
        insn (IDA insn_t)
    """
    for ref in idautils.CodeRefsFrom(insn.ea, False):
-        if ref in get_imports().keys():
+        if ref in get_imports(f.ctx).keys():
            # ignore API calls
            continue
        if not idaapi.getseg(ref):
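Review note on the `ctx` change: the module-level `_file_imports_cache` global is replaced by a dictionary attached to each function object, and every function yielded by `get_functions` shares the same dictionary. A minimal sketch of that pattern, assuming a hypothetical `expensive_imports` in place of `capa.features.extractors.ida.helpers.get_file_imports` and a hypothetical `Func` class in place of IDA's `func_t`:

```python
calls = []


def expensive_imports():
    # hypothetical stand-in for get_file_imports(); records each invocation
    calls.append(1)
    return {0x401000: ("kernel32", "CreateFileA")}


def get_imports(ctx):
    # compute once per context, then reuse
    if "imports_cache" not in ctx:
        ctx["imports_cache"] = expensive_imports()
    return ctx["imports_cache"]


class Func(object):
    # hypothetical stand-in for an IDA func_t
    pass


# one shared context for all functions of a workspace, as in get_functions()
ctx = {}
funcs = [Func(), Func(), Func()]
for f in funcs:
    f.ctx = ctx

results = [get_imports(f.ctx) for f in funcs]
print(len(calls))
```

A function obtained through the new `get_function(ea)` static method gets a fresh `{}` instead, so its cache is not shared with the others.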
--- a/capa/features/extractors/viv/function.py
+++ b/capa/features/extractors/viv/function.py
@@ -25,45 +25,6 @@ def interface_extract_function_XXX(f):
    yield NotImplementedError("feature"), NotImplementedError("virtual address")


-def get_switches(vw):
-    """
-    caching accessor to vivisect workspace switch constructs.
-    """
-    if "switches" in vw.metadata:
-        return vw.metadata["switches"]
-    else:
-        # addresses of switches in the program
-        switches = set()
-
-        for case_va, _ in filter(lambda t: "case" in t[1], vw.getNames()):
-            # assume that the xref to a case location is a switch construct
-            for switch_va, _, _, _ in vw.getXrefsTo(case_va):
-                switches.add(switch_va)
-
-        vw.metadata["switches"] = switches
-        return switches
-
-
-def get_functions_with_switch(vw):
-    if "functions_with_switch" in vw.metadata:
-        return vw.metadata["functions_with_switch"]
-    else:
-        functions = set()
-        for switch in get_switches(vw):
-            functions.add(vw.getFunction(switch))
-        vw.metadata["functions_with_switch"] = functions
-        return functions
-
-
-def extract_function_switch(f):
-    """
-    parse if a function contains a switch statement based on location names
-    method can be optimized
-    """
-    if f.va in get_functions_with_switch(f.vw):
-        yield Characteristic("switch"), f.va
-
-
 def extract_function_calls_to(f):
    for src, _, _, _ in f.vw.getXrefsTo(f.va, rtype=vivisect.const.REF_CODE):
        yield Characteristic("calls to"), src
@@ -106,4 +67,4 @@ def extract_features(f):
            yield feature, va


-FUNCTION_HANDLERS = (extract_function_switch, extract_function_calls_to, extract_function_loop)
+FUNCTION_HANDLERS = (extract_function_calls_to, extract_function_loop)
--- a/capa/features/extractors/viv/insn.py
+++ b/capa/features/extractors/viv/insn.py
@@ -128,10 +128,13 @@ def extract_insn_number_features(f, bb, insn):
    #     push    3136B0h         ; dwControlCode
    for oper in insn.opers:
        # this is for both x32 and x64
-        if not isinstance(oper, envi.archs.i386.disasm.i386ImmOper):
+        if not isinstance(oper, (envi.archs.i386.disasm.i386ImmOper, envi.archs.i386.disasm.i386ImmMemOper)):
            continue

+        if isinstance(oper, envi.archs.i386.disasm.i386ImmOper):
            v = oper.getOperValue(oper)
+        else:
+            v = oper.getOperAddr(oper)

        if f.vw.probeMemory(v, 1, envi.memory.MM_READ):
            # this is a valid address
@@ -162,7 +165,12 @@ def derefs(vw, p):
            return
        yield p

+        try:
            next = vw.readMemoryPtr(p)
+        except Exception:
+            # if not enough bytes can be read, such as end of the section.
+            # unfortunately, viv returns a plain old generic `Exception` for this.
+            return

        # sanity: pointer points to self
        if next == p:
@@ -390,7 +398,9 @@ def extract_insn_peb_access_characteristic_features(f, bb, insn):
    if insn.mnem not in ["push", "mov"]:
        return

-    if "fs" in insn.getPrefixName():
+    prefix = insn.getPrefixName()
+
+    if "fs" in prefix:
        for oper in insn.opers:
            # examples
            #
@@ -403,10 +413,12 @@ def extract_insn_peb_access_characteristic_features(f, bb, insn):
                isinstance(oper, envi.archs.i386.disasm.i386ImmMemOper) and oper.imm == 0x30
            ):
                yield Characteristic("peb access"), insn.va
-    elif "gs" in insn.getPrefixName():
+    elif "gs" in prefix:
        for oper in insn.opers:
-            if (isinstance(oper, envi.archs.amd64.disasm.i386RegMemOper) and oper.disp == 0x60) or (
-                isinstance(oper, envi.archs.amd64.disasm.i386ImmMemOper) and oper.imm == 0x60
+            if (
+                (isinstance(oper, envi.archs.amd64.disasm.i386RegMemOper) and oper.disp == 0x60)
+                or (isinstance(oper, envi.archs.amd64.disasm.i386SibOper) and oper.imm == 0x60)
+                or (isinstance(oper, envi.archs.amd64.disasm.i386ImmMemOper) and oper.imm == 0x60)
            ):
                yield Characteristic("peb access"), insn.va
    else:
--- a/capa/features/freeze.py
+++ b/capa/features/freeze.py
@@ -84,7 +84,16 @@ def dumps(extractor):
    returns:
      str: the serialized features.
    """
-    ret = {"version": 1, "functions": {}, "scopes": {"file": [], "function": [], "basic block": [], "instruction": [],}}
+    ret = {
+        "version": 1,
+        "functions": {},
+        "scopes": {
+            "file": [],
+            "function": [],
+            "basic block": [],
+            "instruction": [],
+        },
+    }

    for feature, va in extractor.extract_file_features():
        ret["scopes"]["file"].append(serialize_feature(feature) + (hex(va), ()))
@@ -99,14 +108,33 @@ def dumps(extractor):
            ret["functions"][hex(f)][hex(bb)] = []

            for feature, va in extractor.extract_basic_block_features(f, bb):
-                ret["scopes"]["basic block"].append(serialize_feature(feature) + (hex(va), (hex(f), hex(bb),)))
+                ret["scopes"]["basic block"].append(
+                    serialize_feature(feature)
+                    + (
+                        hex(va),
+                        (
+                            hex(f),
+                            hex(bb),
+                        ),
+                    )
+                )

-            for insn, insnva in sorted([(insn, int(insn)) for insn in extractor.get_instructions(f, bb)]):
+            for insnva, insn in sorted(
+                [(insn.__int__(), insn) for insn in extractor.get_instructions(f, bb)], key=lambda p: p[0]
+            ):
                ret["functions"][hex(f)][hex(bb)].append(hex(insnva))

                for feature, va in extractor.extract_insn_features(f, bb, insn):
                    ret["scopes"]["instruction"].append(
-                        serialize_feature(feature) + (hex(va), (hex(f), hex(bb), hex(insnva),))
+                        serialize_feature(feature)
+                        + (
+                            hex(va),
+                            (
+                                hex(f),
+                                hex(bb),
+                                hex(insnva),
+                            ),
+                        )
                    )
    return json.dumps(ret)

@@ -245,12 +273,7 @@ def main(argv=None):
        logging.basicConfig(level=logging.INFO)
        logging.getLogger().setLevel(logging.INFO)

-    vw = capa.main.get_workspace(args.sample, args.format)
-
-    # don't import this at top level to support ida/py3 backend
-    import capa.features.extractors.viv
-
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(vw, args.sample)
+    extractor = capa.main.get_extractor(args.sample, args.format)
    with open(args.output, "wb") as f:
        f.write(dump(extractor))

--- a/capa/ida/explorer/model.py
+++ b/capa/ida/explorer/model.py
@@ -348,12 +348,18 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
        },
        """
        if statement["type"] in ("and", "or", "optional"):
-            return CapaExplorerDefaultItem(parent, statement["type"])
+            display = statement["type"]
+            if statement.get("description"):
+                display += " (%s)" % statement["description"]
+            return CapaExplorerDefaultItem(parent, display)
        elif statement["type"] == "not":
            # TODO: do we display 'not'
            pass
        elif statement["type"] == "some":
-            return CapaExplorerDefaultItem(parent, statement["count"] + " or more")
+            display = "%d or more" % statement["count"]
+            if statement.get("description"):
+                display += " (%s)" % statement["description"]
+            return CapaExplorerDefaultItem(parent, display)
        elif statement["type"] == "range":
            # `range` is a weird node, its almost a hybrid of statement + feature.
            # it is a specific feature repeated multiple times.
@@ -370,6 +376,9 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
            else:
                display += "between %d and %d" % (statement["min"], statement["max"])

+            if statement.get("description"):
+                display += " (%s)" % statement["description"]
+
            parent2 = CapaExplorerFeatureItem(parent, display=display)

            for location in locations:
@@ -378,7 +387,10 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):

            return parent2
        elif statement["type"] == "subscope":
-            return CapaExplorerSubscopeItem(parent, statement[statement["type"]])
+            display = statement[statement["type"]]
+            if statement.get("description"):
+                display += " (%s)" % statement["description"]
+            return CapaExplorerSubscopeItem(parent, display)
        else:
            raise RuntimeError("unexpected match statement type: " + str(statement))

@@ -497,7 +509,13 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):

        if len(locations) == 1:
            # only one location for feature so no need to nest children
-            parent2 = self.render_capa_doc_feature(parent, feature, next(iter(locations)), doc, display=display,)
+            parent2 = self.render_capa_doc_feature(
+                parent,
+                feature,
+                next(iter(locations)),
+                doc,
+                display=display,
+            )
        else:
            # feature has multiple children, nest  under one parent feature node
            parent2 = CapaExplorerFeatureItem(parent, display)
@@ -528,7 +546,7 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
            if feature[feature["type"]] in ("embedded pe",):
                return CapaExplorerByteViewItem(parent, display, location)

-            if feature[feature["type"]] in ("loop", "recursive call", "tight loop", "switch"):
+            if feature[feature["type"]] in ("loop", "recursive call", "tight loop"):
                return CapaExplorerFeatureItem(parent, display=display)

            # default to instruction view for all other characteristics
@@ -546,7 +564,17 @@ class CapaExplorerDataModel(QtCore.QAbstractItemModel):
        if feature["type"] == "basicblock":
            return CapaExplorerBlockItem(parent, location)

-        if feature["type"] in ("bytes", "api", "mnemonic", "number", "offset"):
+        if feature["type"] in (
+            "bytes",
+            "api",
+            "mnemonic",
+            "number",
+            "offset",
+            "number/x32",
+            "number/x64",
+            "offset/x32",
+            "offset/x64",
+        ):
            # display instruction preview
            return CapaExplorerInstructionViewItem(parent, display, location)

--- a/capa/ida/explorer/proxy.py
+++ b/capa/ida/explorer/proxy.py
@@ -16,6 +16,9 @@ class CapaExplorerSortFilterProxyModel(QtCore.QSortFilterProxyModel):
        """ """
        super(CapaExplorerSortFilterProxyModel, self).__init__(parent)

+        self.min_ea = None
+        self.max_ea = None
+
    def lessThan(self, left, right):
        """true if the value of the left item is less than value of right item

@@ -62,15 +65,6 @@ class CapaExplorerSortFilterProxyModel(QtCore.QSortFilterProxyModel):

        return False

-    def add_single_string_filter(self, column, string):
-        """ add fixed string filter
-
-            @param column: key column
-            @param string: string to sort
-        """
-        self.setFilterKeyColumn(column)
-        self.setFilterFixedString(string)
-
    def index_has_accepted_children(self, row, parent):
        """ """
        model_index = self.sourceModel().index(row, 0, parent)
@@ -86,4 +80,33 @@ class CapaExplorerSortFilterProxyModel(QtCore.QSortFilterProxyModel):

    def filter_accepts_row_self(self, row, parent):
        """ """
-        return super(CapaExplorerSortFilterProxyModel, self).filterAcceptsRow(row, parent)
+        # filter not set
+        if self.min_ea is None and self.max_ea is None:
+            return True
+
+        index = self.sourceModel().index(row, 0, parent)
+        data = index.internalPointer().data(CapaExplorerDataModel.COLUMN_INDEX_VIRTUAL_ADDRESS)
+
+        if not data:
+            return False
+
+        ea = int(data, 16)
+
+        if self.min_ea <= ea and ea < self.max_ea:
+            return True
+
+        return False
+
+    def add_address_range_filter(self, min_ea, max_ea):
+        """limit accepted rows to those whose virtual address is in [min_ea, max_ea)"""
+        self.min_ea = min_ea
+        self.max_ea = max_ea
+
+        self.setFilterKeyColumn(CapaExplorerDataModel.COLUMN_INDEX_VIRTUAL_ADDRESS)
+        self.invalidateFilter()
+
+    def reset_address_range_filter(self):
+        """clear the address range filter so that all rows are accepted again"""
+        self.min_ea = None
+        self.max_ea = None
+        self.invalidateFilter()
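Note on the range check in `filter_accepts_row_self` above: the filter treats the bounds as a half-open interval, so a function's `start_ea` is included and its `end_ea` is excluded. Below is a rough, Qt-free sketch of that logic. `accepts_address` is a hypothetical helper for illustration only and is not part of capa.

```python
def accepts_address(data, min_ea, max_ea):
    """Hypothetical stand-in for the proxy model's per-row check (not capa API)."""
    # no filter configured: accept every row
    if min_ea is None and max_ea is None:
        return True

    # rows without an address are hidden while a filter is active
    if not data:
        return False

    # the address column holds a hex string such as "0x401000"
    ea = int(data, 16)

    # half-open interval: start address included, end address excluded
    return min_ea <= ea < max_ea
```

With bounds of `(-1, -1)`, which is what `limit_results_to_function` passes when the cursor is not inside a function, the interval is empty and no row is accepted.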
--- a/capa/ida/explorer/view.py
+++ b/capa/ida/explorer/view.py
@@ -59,7 +59,7 @@ class CapaExplorerQtreeView(QtWidgets.QTreeView):
        called when view should reset any user interface changes
        made since the last reset e.g. IDA window highlighting
        """
-        self.collapseAll()
+        self.expandToDepth(0)
        self.resize_columns_to_content()

    def resize_columns_to_content(self):
--- a/capa/ida/helpers/init.py
+++ b/capa/ida/helpers/init.py
@@ -102,6 +102,9 @@ def collect_metadata():
            "sha256": sha256,
            "path": idaapi.get_input_file_path(),
        },
-        "analysis": {"format": idaapi.get_file_type_name(), "extractor": "ida",},
+        "analysis": {
+            "format": idaapi.get_file_type_name(),
+            "extractor": "ida",
+        },
        "version": capa.version.__version__,
    }
--- a/capa/ida/ida_capa_explorer.py
+++ b/capa/ida/ida_capa_explorer.py
@@ -324,35 +324,25 @@ class CapaExplorerForm(idaapi.PluginForm):
    def ida_hook_screen_ea_changed(self, widget, new_ea, old_ea):
        """hook for IDA screen ea changed

+        this hook is currently only relevant for limiting results displayed in the UI
+
        @param widget: IDA widget type
        @param new_ea: destination ea
        @param old_ea: source ea
        """
        if not self.view_limit_results_by_function.isChecked():
-            # ignore if checkbox not selected
+            # ignore if limit checkbox not selected
            return

        if idaapi.get_widget_type(widget) != idaapi.BWN_DISASM:
-            # ignore views other than asm
+            # ignore views other than the disassembly view
            return

-        # attempt to map virtual addresses to function start addresses
-        new_func_start = capa.ida.helpers.get_func_start_ea(new_ea)
-        old_func_start = capa.ida.helpers.get_func_start_ea(old_ea)
-
-        if new_func_start and new_func_start == old_func_start:
-            # navigated within the same function - do nothing
+        if idaapi.get_func(new_ea) == idaapi.get_func(old_ea):
+            # user navigated within the same function - ignore
            return

-        if new_func_start:
-            # navigated to new function - filter for function start virtual address
-            match = capa.ida.explorer.item.location_to_hex(new_func_start)
-        else:
-            # navigated to virtual address not in valid function - clear filter
-            match = ""
-
-        # filter on virtual address to avoid updating filter string if function name is changed
-        self.model_proxy.add_single_string_filter(CapaExplorerDataModel.COLUMN_INDEX_VIRTUAL_ADDRESS, match)
+        self.limit_results_to_function(idaapi.get_func(new_ea))
        self.view_tree.resize_columns_to_content()

    def load_capa_results(self):
@@ -534,15 +524,23 @@ class CapaExplorerForm(idaapi.PluginForm):
        if checked, configure function filter if screen location is located
        in function, otherwise clear filter
        """
-        match = ""
        if self.view_limit_results_by_function.isChecked():
-            location = capa.ida.helpers.get_func_start_ea(idaapi.get_screen_ea())
-            if location:
-                match = capa.ida.explorer.item.location_to_hex(location)
+            self.limit_results_to_function(idaapi.get_func(idaapi.get_screen_ea()))
+        else:
+            self.model_proxy.reset_address_range_filter()

-        self.model_proxy.add_single_string_filter(CapaExplorerDataModel.COLUMN_INDEX_VIRTUAL_ADDRESS, match)
+        self.view_tree.reset()

-        self.view_tree.resize_columns_to_content()
+    def limit_results_to_function(self, f):
+        """add filter to limit results to current function
+
+        @param f: (IDA func_t)
+        """
+        if f:
+            self.model_proxy.add_address_range_filter(f.start_ea, f.end_ea)
+        else:
+            # if the function does not exist, don't display any results (no address should equal -1)
+            self.model_proxy.add_address_range_filter(-1, -1)


 def main():
--- a/capa/main.py
+++ b/capa/main.py
@@ -18,6 +18,7 @@ import datetime
 import textwrap
 import collections

+import halo
 import tqdm
 import colorama

@@ -104,9 +105,14 @@ def find_capabilities(ruleset, extractor, disable_progress=None):
    all_function_matches = collections.defaultdict(list)
    all_bb_matches = collections.defaultdict(list)

-    meta = {"feature_counts": {"file": 0, "functions": {},}}
+    meta = {
+        "feature_counts": {
+            "file": 0,
+            "functions": {},
+        }
+    }

-    for f in tqdm.tqdm(extractor.get_functions(), disable=disable_progress, unit=" functions"):
+    for f in tqdm.tqdm(list(extractor.get_functions()), disable=disable_progress, desc="matching", unit=" functions"):
        function_matches, bb_matches, feature_count = find_function_capabilities(ruleset, extractor, f)
        meta["feature_counts"]["functions"][f.__int__()] = feature_count
        logger.debug("analyzed function 0x%x and extracted %d features", f.__int__(), feature_count)
@@ -269,9 +275,10 @@ def get_workspace(path, format, should_save=True):
    return vw


-def get_extractor_py2(path, format):
+def get_extractor_py2(path, format, disable_progress=False):
    import capa.features.extractors.viv

+    with halo.Halo(text="analyzing program", spinner="simpleDots", stream=sys.stderr, enabled=not disable_progress):
        vw = get_workspace(path, format, should_save=False)

        try:
@@ -287,19 +294,19 @@ class UnsupportedRuntimeError(RuntimeError):
    pass


-def get_extractor_py3(path, format):
+def get_extractor_py3(path, format, disable_progress=False):
    raise UnsupportedRuntimeError()


-def get_extractor(path, format):
+def get_extractor(path, format, disable_progress=False):
    """
    raises:
      UnsupportedFormatError:
    """
    if sys.version_info >= (3, 0):
-        return get_extractor_py3(path, format)
+        return get_extractor_py3(path, format, disable_progress=disable_progress)
    else:
-        return get_extractor_py2(path, format)
+        return get_extractor_py2(path, format, disable_progress=disable_progress)


 def is_nursery_rule_path(path):
@@ -315,7 +322,7 @@ def is_nursery_rule_path(path):
    return "nursery" in path


-def get_rules(rule_path):
+def get_rules(rule_path, disable_progress=False):
    if not os.path.exists(rule_path):
        raise IOError("rule path %s does not exist or cannot be accessed" % rule_path)

@@ -343,7 +350,8 @@ def get_rules(rule_path):
                rule_paths.append(rule_path)

    rules = []
-    for rule_path in rule_paths:
+
+    for rule_path in tqdm.tqdm(list(rule_paths), disable=disable_progress, desc="loading ", unit="     rules"):
        try:
            rule = capa.rules.Rule.from_yaml_file(rule_path)
        except capa.rules.InvalidRule:
@@ -526,7 +534,7 @@ def main(argv=None):
        logger.debug("using rules path: %s", rules_path)

    try:
-        rules = get_rules(rules_path)
+        rules = get_rules(rules_path, disable_progress=args.quiet)
        rules = capa.rules.RuleSet(rules)
        logger.debug("successfully loaded %s rules", len(rules))
        if args.tag:
@@ -546,7 +554,7 @@ def main(argv=None):
    else:
        format = args.format
        try:
-            extractor = get_extractor(args.sample, args.format)
+            extractor = get_extractor(args.sample, args.format, disable_progress=args.quiet)
        except UnsupportedFormatError:
            logger.error("-" * 80)
            logger.error(" Input file does not appear to be a PE file.")
--- a/capa/render/init.py
+++ b/capa/render/init.py
@@ -152,7 +152,10 @@ def convert_match_to_result_document(rules, capabilities, result):
            scope = rule.meta["scope"]
            doc["node"] = {
                "type": "statement",
-                "statement": {"type": "subscope", "subscope": scope,},
+                "statement": {
+                    "type": "subscope",
+                    "subscope": scope,
+                },
            }

        for location in doc["locations"]:
@@ -257,5 +260,7 @@ class CapaJsonObjectEncoder(json.JSONEncoder):

 def render_json(meta, rules, capabilities):
    return json.dumps(
-        convert_capabilities_to_result_document(meta, rules, capabilities), cls=CapaJsonObjectEncoder, sort_keys=True,
+        convert_capabilities_to_result_document(meta, rules, capabilities),
+        cls=CapaJsonObjectEncoder,
+        sort_keys=True,
    )
--- a/capa/render/default.py
+++ b/capa/render/default.py
@@ -109,7 +109,12 @@ def render_attack(doc, ostream):
                inner_rows.append("%s::%s %s" % (rutils.bold(technique), subtechnique, id))
            else:
                raise RuntimeError("unexpected ATT&CK spec format")
-        rows.append((rutils.bold(tactic.upper()), "\n".join(inner_rows),))
+        rows.append(
+            (
+                rutils.bold(tactic.upper()),
+                "\n".join(inner_rows),
+            )
+        )

    if rows:
        ostream.write(
--- a/capa/rules.py
+++ b/capa/rules.py
@@ -69,7 +69,6 @@ SUPPORTED_FEATURES = {
    FUNCTION_SCOPE: {
        # plus basic block scope features, see below
        capa.features.basicblock.BasicBlock,
-        capa.features.Characteristic("switch"),
        capa.features.Characteristic("calls from"),
        capa.features.Characteristic("calls to"),
        capa.features.Characteristic("loop"),
@@ -263,7 +262,7 @@ def parse_description(s, value_type, description=None):
                raise InvalidRule(
                    "unexpected bytes value: byte sequences must be no larger than %s bytes" % MAX_BYTES_FEATURE_SIZE
                )
-        elif value_type in {"number", "offset"}:
+        elif value_type in ("number", "offset") or value_type.startswith(("number/", "offset/")):
            try:
                value = parse_int(value)
            except ValueError:
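Note on the widened `value_type` check above: `str.startswith` accepts a tuple of prefixes, which is how the arch-flavored types such as `number/x32` and `offset/x64` are matched alongside the plain `number` and `offset`. A minimal sketch of just that predicate follows; `is_int_valued` is a hypothetical name used for illustration and does not exist in capa.

```python
def is_int_valued(value_type):
    """Hypothetical helper mirroring the widened check (not capa API)."""
    # exact match for the plain types, prefix match for arch-flavored ones
    return value_type in ("number", "offset") or value_type.startswith(("number/", "offset/"))


for candidate in ("number", "offset/x64", "number/x32", "bytes", "numbers"):
    print(candidate, is_int_valued(candidate))
```

The trailing slash in each prefix keeps unrelated names that merely begin with `number` or `offset` from matching.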
--- a/capa/version.py
+++ b/capa/version.py
@@ -1 +1 @@
-__version__ = "1.0.0"
+__version__ = "1.2.0"
--- a/2
+++ b/2
--- a/scripts/import-to-bn.py
+++ b/scripts/import-to-bn.py
@@ -13,8 +13,7 @@ It will mark up functions with their capa matches, like:
    UninstallService proc near
    ...

-To use, invoke from the Binary Ninja Tools menu, or from the 
-command-palette.
+To use, invoke from the Binary Ninja Tools menu, or from the command-palette.

 Adapted for Binary Ninja by @psifertex

--- a/scripts/lint.py
+++ b/scripts/lint.py
@@ -399,7 +399,11 @@ def lint_rule(ctx, rule):
        print("")
        print(
            "%s%s %s"
-            % ("    (nursery) " if is_nursery_rule(rule) else "", rule.name, ("(%s)" % category) if category else "",)
+            % (
+                "    (nursery) " if is_nursery_rule(rule) else "",
+                rule.name,
+                ("(%s)" % category) if category else "",
+            )
        )

        level = "WARN" if is_nursery_rule(rule) else "FAIL"
@@ -407,7 +411,12 @@ def lint_rule(ctx, rule):
        for violation in violations:
            print(
                "%s  %s: %s: %s"
-                % ("    " if is_nursery_rule(rule) else "", level, violation.name, violation.recommendation,)
+                % (
+                    "    " if is_nursery_rule(rule) else "",
+                    level,
+                    violation.name,
+                    violation.recommendation,
+                )
            )

    elif len(violations) == 0 and is_nursery_rule(rule):
@@ -487,7 +496,9 @@ def main(argv=None):
    parser.add_argument("rules", type=str, help="Path to rules")
    parser.add_argument("--samples", type=str, default=samples_path, help="Path to samples")
    parser.add_argument(
-        "--thorough", action="store_true", help="Enable thorough linting - takes more time, but does a better job",
+        "--thorough",
+        action="store_true",
+        help="Enable thorough linting - takes more time, but does a better job",
    )
    parser.add_argument("-v", "--verbose", action="store_true", help="Enable debug logging")
    parser.add_argument("-q", "--quiet", action="store_true", help="Disable all output but errors")
--- a/setup.py
+++ b/setup.py
@@ -11,17 +11,19 @@ import sys

 import setuptools

-requirements = ["six", "tqdm", "pyyaml", "tabulate", "colorama", "termcolor", "ruamel.yaml", "wcwidth"]
+# halo==0.0.30 is the last version to support py2.7
+requirements = ["six", "tqdm", "pyyaml", "tabulate", "colorama", "termcolor", "ruamel.yaml", "wcwidth", "halo==0.0.30"]

 if sys.version_info >= (3, 0):
    # py3
    requirements.append("networkx")
 else:
    # py2
-    requirements.append("enum34")
-    requirements.append("vivisect @ https://github.com/williballenthin/vivisect/tarball/v0.0.20200708#egg=vivisect")
+    requirements.append("enum34==1.1.6")  # v1.1.6 is needed by halo 0.0.30 / spinners 0.0.24
+    requirements.append("vivisect @ https://github.com/williballenthin/vivisect/tarball/v0.0.20200804#egg=vivisect")
    requirements.append("viv-utils")
    requirements.append("networkx==2.2")  # v2.2 is last version supported by Python 2.7
+    requirements.append("backports.functools-lru-cache")

 # this sets __version__
 # via: http://stackoverflow.com/a/7071358/87207
@@ -40,7 +42,11 @@ setuptools.setup(
    url="https://www.github.com/fireeye/capa",
    packages=setuptools.find_packages(exclude=["tests"]),
    package_dir={"capa": "capa"},
-    entry_points={"console_scripts": ["capa=capa.main:main",]},
+    entry_points={
+        "console_scripts": [
+            "capa=capa.main:main",
+        ]
+    },
    include_package_data=True,
    install_requires=requirements,
    extras_require={
--- a/tests/data
+++ b/tests/data
--- a/tests/fixtures.py
+++ b/tests/fixtures.py
@@ -7,79 +7,500 @@
 # See the License for the specific language governing permissions and limitations under the License.

 import os
+import sys
 import os.path
+import contextlib
 import collections

 import pytest
-import viv_utils
+
+import capa.main
+import capa.features.file
+import capa.features.insn
+import capa.features.basicblock
+from capa.features import ARCH_X32, ARCH_X64
+
+try:
+    from functools import lru_cache
+except ImportError:
+    from backports.functools_lru_cache import lru_cache
+

 CD = os.path.dirname(__file__)


-Sample = collections.namedtuple("Sample", ["vw", "path"])
+@contextlib.contextmanager
+def xfail(condition, reason=None):
+    """
+    context manager that wraps a block that is expected to fail in some cases.
+    when it does fail (and is expected), then mark this as pytest.xfail.
+    if it's unexpected, raise an exception, so the test fails.
+
+    example::
+
+        # this test:
+        #  - passes on py3 if foo() works
+        #  - fails  on py3 if foo() fails
+        #  - xfails on py2 if foo() fails
+        #  - fails  on py2 if foo() works
+        with xfail(sys.version_info < (3, 0), reason="py2 doesn't foo"):
+            foo()
+    """
+    try:
+        # do the block
+        yield
+    except:
+        if condition:
+            # we expected the test to fail, so raise and register this via pytest
+            pytest.xfail(reason)
+        else:
+            # we don't expect an exception, so the test should fail
+            raise
+    else:
+        if not condition:
+            # here we expect the block to run successfully,
+            # and we've received no exception,
+            # so this is good
+            pass
+        else:
+            # we expected an exception, but didn't find one. that's an error.
+            raise RuntimeError("expected to fail, but didn't")
+
+
+@lru_cache()
+def get_viv_extractor(path):
+    import capa.features.extractors.viv
+
+    if "raw32" in path:
+        vw = capa.main.get_workspace(path, "sc32", should_save=False)
+    elif "raw64" in path:
+        vw = capa.main.get_workspace(path, "sc64", should_save=False)
+    else:
+        vw = capa.main.get_workspace(path, "auto", should_save=True)
+    return capa.features.extractors.viv.VivisectFeatureExtractor(vw, path)
+
+
+@lru_cache()
+def extract_file_features(extractor):
+    features = collections.defaultdict(set)
+    for feature, va in extractor.extract_file_features():
+        features[feature].add(va)
+    return features
+
+
+# f may not be hashable (e.g. ida func_t) so cannot @lru_cache this
+def extract_function_features(extractor, f):
+    features = collections.defaultdict(set)
+    for bb in extractor.get_basic_blocks(f):
+        for insn in extractor.get_instructions(f, bb):
+            for feature, va in extractor.extract_insn_features(f, bb, insn):
+                features[feature].add(va)
+        for feature, va in extractor.extract_basic_block_features(f, bb):
+            features[feature].add(va)
+    for feature, va in extractor.extract_function_features(f):
+        features[feature].add(va)
+    return features
+
+
+# f may not be hashable (e.g. ida func_t) so cannot @lru_cache this
+def extract_basic_block_features(extractor, f, bb):
+    features = collections.defaultdict(set)
+    for insn in extractor.get_instructions(f, bb):
+        for feature, va in extractor.extract_insn_features(f, bb, insn):
+            features[feature].add(va)
+    for feature, va in extractor.extract_basic_block_features(f, bb):
+        features[feature].add(va)
+    return features
+
+
+def get_data_path_by_name(name):
+    if name == "mimikatz":
+        return os.path.join(CD, "data", "mimikatz.exe_")
+    elif name == "kernel32":
+        return os.path.join(CD, "data", "kernel32.dll_")
+    elif name == "kernel32-64":
+        return os.path.join(CD, "data", "kernel32-64.dll_")
+    elif name == "pma12-04":
+        return os.path.join(CD, "data", "Practical Malware Analysis Lab 12-04.exe_")
+    elif name == "pma21-01":
+        return os.path.join(CD, "data", "Practical Malware Analysis Lab 21-01.exe_")
+    elif name == "al-khaser x86":
+        return os.path.join(CD, "data", "al-khaser_x86.exe_")
+    elif name.startswith("39c05"):
+        return os.path.join(CD, "data", "39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.dll_")
+    elif name.startswith("499c2"):
+        return os.path.join(CD, "data", "499c2a85f6e8142c3f48d4251c9c7cd6.raw32")
+    elif name.startswith("9324d"):
+        return os.path.join(CD, "data", "9324d1a8ae37a36ae560c37448c9705a.exe_")
+    elif name.startswith("a1982"):
+        return os.path.join(CD, "data", "a198216798ca38f280dc413f8c57f2c2.exe_")
+    elif name.startswith("a933a"):
+        return os.path.join(CD, "data", "a933a1a402775cfa94b6bee0963f4b46.dll_")
+    elif name.startswith("bfb9b"):
+        return os.path.join(CD, "data", "bfb9b5391a13d0afd787e87ab90f14f5.dll_")
+    elif name.startswith("c9188"):
+        return os.path.join(CD, "data", "c91887d861d9bd4a5872249b641bc9f9.exe_")
+    else:
+        raise ValueError("unexpected sample fixture")
+
+
+def get_sample_md5_by_name(name):
+    """used by IDA tests to ensure the correct IDB is loaded"""
+    if name == "mimikatz":
+        return "5f66b82558ca92e54e77f216ef4c066c"
+    elif name == "kernel32":
+        return "e80758cf485db142fca1ee03a34ead05"
+    elif name == "kernel32-64":
+        return "a8565440629ac87f6fef7d588fe3ff0f"
+    elif name == "pma12-04":
+        return "56bed8249e7c2982a90e54e1e55391a2"
+    elif name == "pma21-01":
+        return "c8403fb05244e23a7931c766409b5e22"
+    elif name == "al-khaser x86":
+        return "db648cd247281954344f1d810c6fd590"
+    elif name.startswith("39c05"):
+        return "b7841b9d5dc1f511a93cc7576672ec0c"
+    elif name.startswith("499c2"):
+        return "499c2a85f6e8142c3f48d4251c9c7cd6"
+    elif name.startswith("9324d"):
+        return "9324d1a8ae37a36ae560c37448c9705a"
+    elif name.startswith("a1982"):
+        return "a198216798ca38f280dc413f8c57f2c2"
+    elif name.startswith("a933a"):
+        return "a933a1a402775cfa94b6bee0963f4b46"
+    elif name.startswith("bfb9b"):
+        return "bfb9b5391a13d0afd787e87ab90f14f5"
+    elif name.startswith("c9188"):
+        return "c91887d861d9bd4a5872249b641bc9f9"
+    else:
+        raise ValueError("unexpected sample fixture")
+
+
+def resolve_sample(sample):
+    return get_data_path_by_name(sample)


@pytest.fixture
-def mimikatz():
-    path = os.path.join(CD, "data", "mimikatz.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def sample(request):
+    return resolve_sample(request.param)
+
+
+def get_function(extractor, fva):
+    for f in extractor.get_functions():
+        if f.__int__() == fva:
+            return f
+    raise ValueError("function not found")
+
+
+def get_basic_block(extractor, f, va):
+    for bb in extractor.get_basic_blocks(f):
+        if bb.__int__() == va:
+            return bb
+    raise ValueError("basic block not found")
+
+
+def resolve_scope(scope):
+    if scope == "file":
+
+        def inner(extractor):
+            return extract_file_features(extractor)
+
+        inner.__name__ = scope
+        return inner
+    elif "bb=" in scope:
+        # like `function=0x401000,bb=0x40100A`
+        fspec, _, bbspec = scope.partition(",")
+        fva = int(fspec.partition("=")[2], 0x10)
+        bbva = int(bbspec.partition("=")[2], 0x10)
+
+        def inner(extractor):
+            f = get_function(extractor, fva)
+            bb = get_basic_block(extractor, f, bbva)
+            return extract_basic_block_features(extractor, f, bb)
+
+        inner.__name__ = scope
+        return inner
+    elif scope.startswith("function"):
+        # like `function=0x401000`
+        va = int(scope.partition("=")[2], 0x10)
+
+        def inner(extractor):
+            f = get_function(extractor, va)
+            return extract_function_features(extractor, f)
+
+        inner.__name__ = scope
+        return inner
+    else:
+        raise ValueError("unexpected scope fixture")


@pytest.fixture
-def sample_a933a1a402775cfa94b6bee0963f4b46():
-    path = os.path.join(CD, "data", "a933a1a402775cfa94b6bee0963f4b46.dll_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def scope(request):
+    return resolve_scope(request.param)
+
+
+def make_test_id(values):
+    return "-".join(map(str, values))
+
+
+def parametrize(params, values, **kwargs):
+    """
+    extend `pytest.mark.parametrize` to pretty-print features.
+    by default, it renders objects as an opaque value.
+    ref: https://docs.pytest.org/en/2.9.0/example/parametrize.html#different-options-for-test-ids
+    rendered ID might look something like:
+        mimikatz-function=0x403BAC-api(CryptDestroyKey)-True
+    """
+    ids = list(map(make_test_id, values))
+    return pytest.mark.parametrize(params, values, ids=ids, **kwargs)
+
+
+FEATURE_PRESENCE_TESTS = [
+    # file/characteristic("embedded pe")
+    ("pma12-04", "file", capa.features.Characteristic("embedded pe"), True),
+    # file/string
+    ("mimikatz", "file", capa.features.String("SCardControl"), True),
+    ("mimikatz", "file", capa.features.String("SCardTransmit"), True),
+    ("mimikatz", "file", capa.features.String("ACR  > "), True),
+    ("mimikatz", "file", capa.features.String("nope"), False),
+    # file/sections
+    ("mimikatz", "file", capa.features.file.Section(".text"), True),
+    ("mimikatz", "file", capa.features.file.Section(".nope"), False),
+    # IDA doesn't extract unmapped sections by default
+    # ("mimikatz", "file", capa.features.file.Section(".rsrc"), True),
+    # file/exports
+    ("kernel32", "file", capa.features.file.Export("BaseThreadInitThunk"), True),
+    ("kernel32", "file", capa.features.file.Export("lstrlenW"), True),
+    ("kernel32", "file", capa.features.file.Export("nope"), False),
+    # file/imports
+    ("mimikatz", "file", capa.features.file.Import("advapi32.CryptSetHashParam"), True),
+    ("mimikatz", "file", capa.features.file.Import("CryptSetHashParam"), True),
+    ("mimikatz", "file", capa.features.file.Import("kernel32.IsWow64Process"), True),
+    ("mimikatz", "file", capa.features.file.Import("msvcrt.exit"), True),
+    ("mimikatz", "file", capa.features.file.Import("cabinet.#11"), True),
+    ("mimikatz", "file", capa.features.file.Import("#11"), False),
+    ("mimikatz", "file", capa.features.file.Import("#nope"), False),
+    ("mimikatz", "file", capa.features.file.Import("nope"), False),
+    # function/characteristic(loop)
+    ("mimikatz", "function=0x401517", capa.features.Characteristic("loop"), True),
+    ("mimikatz", "function=0x401000", capa.features.Characteristic("loop"), False),
+    # bb/characteristic(tight loop)
+    ("mimikatz", "function=0x402EC4", capa.features.Characteristic("tight loop"), True),
+    ("mimikatz", "function=0x401000", capa.features.Characteristic("tight loop"), False),
+    # bb/characteristic(stack string)
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("stack string"), True),
+    ("mimikatz", "function=0x401000", capa.features.Characteristic("stack string"), False),
+    # bb/characteristic(tight loop)
+    ("mimikatz", "function=0x402EC4,bb=0x402F8E", capa.features.Characteristic("tight loop"), True),
+    ("mimikatz", "function=0x401000,bb=0x401000", capa.features.Characteristic("tight loop"), False),
+    # insn/mnemonic
+    ("mimikatz", "function=0x40105D", capa.features.insn.Mnemonic("push"), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Mnemonic("movzx"), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Mnemonic("xor"), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Mnemonic("in"), False),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Mnemonic("out"), False),
+    # insn/number
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0x3136B0), True),
+    # insn/number: stack adjustments
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0xC), False),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0x10), False),
+    # insn/number: arch flavors
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF, arch=ARCH_X32), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Number(0xFF, arch=ARCH_X64), False),
+    # insn/offset
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x0), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x4), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0xC), True),
+    # insn/offset: stack references
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x8), False),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x10), False),
+    # insn/offset: negative
+    ("mimikatz", "function=0x4011FB", capa.features.insn.Offset(-0x1), True),
+    ("mimikatz", "function=0x4011FB", capa.features.insn.Offset(-0x2), True),
+    # insn/offset: arch flavors
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x0), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x0, arch=ARCH_X32), True),
+    ("mimikatz", "function=0x40105D", capa.features.insn.Offset(0x0, arch=ARCH_X64), False),
+    # insn/api
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.CryptAcquireContextW"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.CryptAcquireContext"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.CryptGenKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.CryptImportKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.CryptDestroyKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("CryptAcquireContextW"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("CryptAcquireContext"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("CryptGenKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("CryptImportKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("CryptDestroyKey"), True),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("Nope"), False),
+    ("mimikatz", "function=0x403BAC", capa.features.insn.API("advapi32.Nope"), False),
+    # insn/api: thunk
+    ("mimikatz", "function=0x4556E5", capa.features.insn.API("advapi32.LsaQueryInformationPolicy"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.insn.API("LsaQueryInformationPolicy"), True),
+    # insn/api: x64
+    (
+        "kernel32-64",
+        "function=0x180001010",
+        capa.features.insn.API("RtlVirtualUnwind"),
+        True,
+    ),
+    ("kernel32-64", "function=0x180001010", capa.features.insn.API("RtlVirtualUnwind"), True),
+    # insn/api: x64 thunk
+    (
+        "kernel32-64",
+        "function=0x1800202B0",
+        capa.features.insn.API("RtlCaptureContext"),
+        True,
+    ),
+    ("kernel32-64", "function=0x1800202B0", capa.features.insn.API("RtlCaptureContext"), True),
+    # insn/api: resolve indirect calls
+    ("c91887...", "function=0x401A77", capa.features.insn.API("kernel32.CreatePipe"), True),
+    ("c91887...", "function=0x401A77", capa.features.insn.API("kernel32.SetHandleInformation"), True),
+    ("c91887...", "function=0x401A77", capa.features.insn.API("kernel32.CloseHandle"), True),
+    ("c91887...", "function=0x401A77", capa.features.insn.API("kernel32.WriteFile"), True),
+    # insn/string
+    ("mimikatz", "function=0x40105D", capa.features.String("SCardControl"), True),
+    ("mimikatz", "function=0x40105D", capa.features.String("SCardTransmit"), True),
+    ("mimikatz", "function=0x40105D", capa.features.String("ACR  > "), True),
+    ("mimikatz", "function=0x40105D", capa.features.String("nope"), False),
+    # insn/string, pointer to string
+    ("mimikatz", "function=0x44EDEF", capa.features.String("INPUTEVENT"), True),
+    # insn/bytes
+    ("mimikatz", "function=0x40105D", capa.features.Bytes("SCardControl".encode("utf-16le")), True),
+    ("mimikatz", "function=0x40105D", capa.features.Bytes("SCardTransmit".encode("utf-16le")), True),
+    ("mimikatz", "function=0x40105D", capa.features.Bytes("ACR  > ".encode("utf-16le")), True),
+    ("mimikatz", "function=0x40105D", capa.features.Bytes("nope".encode("ascii")), False),
+    # insn/bytes, pointer to bytes
+    ("mimikatz", "function=0x44EDEF", capa.features.Bytes("INPUTEVENT".encode("utf-16le")), True),
+    # insn/characteristic(nzxor)
+    ("mimikatz", "function=0x410DFC", capa.features.Characteristic("nzxor"), True),
+    ("mimikatz", "function=0x40105D", capa.features.Characteristic("nzxor"), False),
+    # insn/characteristic(nzxor): no security cookies
+    ("mimikatz", "function=0x46D534", capa.features.Characteristic("nzxor"), False),
+    # insn/characteristic(peb access)
+    ("kernel32-64", "function=0x1800017D0", capa.features.Characteristic("peb access"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("peb access"), False),
+    # insn/characteristic(gs access)
+    ("kernel32-64", "function=0x180001068", capa.features.Characteristic("gs access"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("gs access"), False),
+    # insn/characteristic(cross section flow)
+    ("a1982...", "function=0x4014D0", capa.features.Characteristic("cross section flow"), True),
+    # insn/characteristic(cross section flow): imports don't count
+    ("kernel32-64", "function=0x180001068", capa.features.Characteristic("cross section flow"), False),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("cross section flow"), False),
+    # insn/characteristic(recursive call)
+    ("39c05...", "function=0x10003100", capa.features.Characteristic("recursive call"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("recursive call"), False),
+    # insn/characteristic(indirect call)
+    ("mimikatz", "function=0x4175FF", capa.features.Characteristic("indirect call"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("indirect call"), False),
+    # insn/characteristic(calls from)
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("calls from"), True),
+    ("mimikatz", "function=0x4702FD", capa.features.Characteristic("calls from"), False),
+    # function/characteristic(calls to)
+    ("mimikatz", "function=0x40105D", capa.features.Characteristic("calls to"), True),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("calls to"), False),
+]
+
+FEATURE_COUNT_TESTS = [
+    ("mimikatz", "function=0x40E5C2", capa.features.basicblock.BasicBlock(), 7),
+    ("mimikatz", "function=0x4702FD", capa.features.Characteristic("calls from"), 0),
+    ("mimikatz", "function=0x40E5C2", capa.features.Characteristic("calls from"), 3),
+    ("mimikatz", "function=0x4556E5", capa.features.Characteristic("calls to"), 0),
+    ("mimikatz", "function=0x40B1F1", capa.features.Characteristic("calls to"), 3),
+]
+
+
+def do_test_feature_presence(get_extractor, sample, scope, feature, expected):
+    extractor = get_extractor(sample)
+    features = scope(extractor)
+    if expected:
+        msg = "%s should be found in %s" % (str(feature), scope.__name__)
+    else:
+        msg = "%s should not be found in %s" % (str(feature), scope.__name__)
+    assert feature.evaluate(features) == expected, msg
+
+
+def do_test_feature_count(get_extractor, sample, scope, feature, expected):
+    extractor = get_extractor(sample)
+    features = scope(extractor)
+    msg = "%s should be found %d times in %s, found: %d" % (
+        str(feature),
+        expected,
+        scope.__name__,
+        len(features[feature]),
+    )
+    assert len(features[feature]) == expected, msg
+
+
+def get_extractor(path):
+    if sys.version_info >= (3, 0):
+        raise RuntimeError("no supported py3 backends yet")
+    else:
+        extractor = get_viv_extractor(path)
+
+    # overload the extractor so that the fixture exposes `extractor.path`
+    setattr(extractor, "path", path)
+    return extractor


@pytest.fixture
-def kernel32():
-    path = os.path.join(CD, "data", "kernel32.dll_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def mimikatz_extractor():
+    return get_extractor(get_data_path_by_name("mimikatz"))


@pytest.fixture
-def sample_a198216798ca38f280dc413f8c57f2c2():
-    path = os.path.join(CD, "data", "a198216798ca38f280dc413f8c57f2c2.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def a933a_extractor():
+    return get_extractor(get_data_path_by_name("a933a..."))


@pytest.fixture
-def sample_9324d1a8ae37a36ae560c37448c9705a():
-    path = os.path.join(CD, "data", "9324d1a8ae37a36ae560c37448c9705a.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def kernel32_extractor():
+    return get_extractor(get_data_path_by_name("kernel32"))


@pytest.fixture
-def pma_lab_12_04():
-    path = os.path.join(CD, "data", "Practical Malware Analysis Lab 12-04.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def a1982_extractor():
+    return get_extractor(get_data_path_by_name("a1982..."))


@pytest.fixture
-def sample_bfb9b5391a13d0afd787e87ab90f14f5():
-    path = os.path.join(CD, "data", "bfb9b5391a13d0afd787e87ab90f14f5.dll_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def z9324d_extractor():
+    return get_extractor(get_data_path_by_name("9324d..."))


@pytest.fixture
-def sample_lab21_01():
-    path = os.path.join(CD, "data", "Practical Malware Analysis Lab 21-01.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def pma12_04_extractor():
+    return get_extractor(get_data_path_by_name("pma12-04"))


@pytest.fixture
-def sample_c91887d861d9bd4a5872249b641bc9f9():
-    path = os.path.join(CD, "data", "c91887d861d9bd4a5872249b641bc9f9.exe_")
-    return Sample(viv_utils.getWorkspace(path), path)
+def bfb9b_extractor():
+    return get_extractor(get_data_path_by_name("bfb9b..."))


@pytest.fixture
-def sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41():
-    path = os.path.join(CD, "data", "39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.dll_",)
-    return Sample(viv_utils.getWorkspace(path), path)
+def pma21_01_extractor():
+    return get_extractor(get_data_path_by_name("pma21-01"))


@pytest.fixture
-def sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32():
-    path = os.path.join(CD, "data", "499c2a85f6e8142c3f48d4251c9c7cd6.raw32")
-    return Sample(viv_utils.getShellcodeWorkspace(path), path)
+def c9188_extractor():
+    return get_extractor(get_data_path_by_name("c9188..."))
+
+
+@pytest.fixture
+def z39c05_extractor():
+    return get_extractor(get_data_path_by_name("39c05..."))
+
+
+@pytest.fixture
+def z499c2_extractor():
+    return get_extractor(get_data_path_by_name("499c2..."))
+
+
+@pytest.fixture
+def al_khaser_x86_extractor():
+    return get_extractor(get_data_path_by_name("al-khaser x86"))
--- a/tests/test_engine.py
+++ b/tests/test_engine.py
@@ -59,7 +59,13 @@ def test_some():
    )
    assert (
        Some(2, [Number(1), Number(2), Number(3)]).evaluate(
-            {Number(0): {1}, Number(1): {1}, Number(2): {1}, Number(3): {1}, Number(4): {1},}
+            {
+                Number(0): {1},
+                Number(1): {1},
+                Number(2): {1},
+                Number(3): {1},
+                Number(4): {1},
+            }
        )
        == True
    )
@@ -258,7 +264,9 @@ def test_match_matched_rules():
    ]

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.insn.Number(100): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.insn.Number(100): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule1") in features
    assert capa.features.MatchedRule("test rule2") in features
@@ -266,7 +274,9 @@ def test_match_matched_rules():
    # the ordering of the rules must not matter,
    # the engine should match rules in an appropriate order.
    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(reversed(rules)), {capa.features.insn.Number(100): {1}}, 0x0,
+        capa.engine.topologically_order_rules(reversed(rules)),
+        {capa.features.insn.Number(100): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule1") in features
    assert capa.features.MatchedRule("test rule2") in features
@@ -312,22 +322,30 @@ def test_regex():
        ),
    ]
    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.insn.Number(100): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.insn.Number(100): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") not in features

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("aaaa"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("aaaa"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") not in features

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("aBBBBa"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("aBBBBa"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") not in features

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("abbbba"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("abbbba"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") in features
    assert capa.features.MatchedRule("rule with implied wildcards") in features
@@ -350,7 +368,9 @@ def test_regex_ignorecase():
        ),
    ]
    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("aBBBBa"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("aBBBBa"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") in features

@@ -429,7 +449,9 @@ def test_match_namespace():
    ]

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.insn.API("CreateFile"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.insn.API("CreateFile"): {1}},
+        0x0,
    )
    assert "CreateFile API" in matches
    assert "file-create" in matches
@@ -439,7 +461,9 @@ def test_match_namespace():
    assert capa.features.MatchedRule("file/create/CreateFile") in features

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.insn.API("WriteFile"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.insn.API("WriteFile"): {1}},
+        0x0,
    )
    assert "WriteFile API" in matches
    assert "file-create" not in matches
--- a/tests/test_freeze.py
+++ b/tests/test_freeze.py
@@ -5,9 +5,10 @@
 # Unless required by applicable law or agreed to in writing, software distributed under the License
 #  is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and limitations under the License.
-
+import sys
 import textwrap

+import pytest
 from fixtures import *

 import capa.main
@@ -20,13 +21,19 @@ import capa.features.extractors
 EXTRACTOR = capa.features.extractors.NullFeatureExtractor(
    {
        "base address": 0x401000,
-        "file features": [(0x402345, capa.features.Characteristic("embedded pe")),],
+        "file features": [
+            (0x402345, capa.features.Characteristic("embedded pe")),
+        ],
        "functions": {
            0x401000: {
-                "features": [(0x401000, capa.features.Characteristic("switch")),],
+                "features": [
+                    (0x401000, capa.features.Characteristic("indirect call")),
+                ],
                "basic blocks": {
                    0x401000: {
-                        "features": [(0x401000, capa.features.Characteristic("tight loop")),],
+                        "features": [
+                            (0x401000, capa.features.Characteristic("tight loop")),
+                        ],
                        "instructions": {
                            0x401000: {
                                "features": [
@@ -34,7 +41,11 @@ EXTRACTOR = capa.features.extractors.NullFeatureExtractor(
                                    (0x401000, capa.features.Characteristic("nzxor")),
                                ],
                            },
-                            0x401002: {"features": [(0x401002, capa.features.insn.Mnemonic("mov")),],},
+                            0x401002: {
+                                "features": [
+                                    (0x401002, capa.features.insn.Mnemonic("mov")),
+                                ],
+                            },
                        },
                    },
                },
@@ -104,17 +115,14 @@ def compare_extractors_viv_null(viv_ext, null_ext):
      viv_ext (capa.features.extractors.viv.VivisectFeatureExtractor)
      null_ext (capa.features.extractors.NullFeatureExtractor)
    """
-
-    # TODO: ordering of these things probably doesn't work yet
-
    assert list(viv_ext.extract_file_features()) == list(null_ext.extract_file_features())
-    assert to_int(list(viv_ext.get_functions())) == list(null_ext.get_functions())
+    assert list(map(to_int, viv_ext.get_functions())) == list(null_ext.get_functions())
    for f in viv_ext.get_functions():
-        assert to_int(list(viv_ext.get_basic_blocks(f))) == list(null_ext.get_basic_blocks(to_int(f)))
+        assert list(map(to_int, viv_ext.get_basic_blocks(f))) == list(null_ext.get_basic_blocks(to_int(f)))
        assert list(viv_ext.extract_function_features(f)) == list(null_ext.extract_function_features(to_int(f)))

        for bb in viv_ext.get_basic_blocks(f):
-            assert to_int(list(viv_ext.get_instructions(f, bb))) == list(
+            assert list(map(to_int, viv_ext.get_instructions(f, bb))) == list(
                null_ext.get_instructions(to_int(f), to_int(bb))
            )
            assert list(viv_ext.extract_basic_block_features(f, bb)) == list(
@@ -129,9 +137,6 @@ def compare_extractors_viv_null(viv_ext, null_ext):

 def to_int(o):
    """helper to get int value of extractor items"""
-    if isinstance(o, list):
-        return map(lambda x: capa.helpers.oint(x), o)
-    else:
    return capa.helpers.oint(o)


@@ -169,18 +174,22 @@ def test_serialize_features():
    roundtrip_feature(capa.features.file.Import("#11"))


-def test_freeze_sample(tmpdir, sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivisect only works on py2")
+def test_freeze_sample(tmpdir, z9324d_extractor):
    # tmpdir fixture handles cleanup
    o = tmpdir.mkdir("capa").join("test.frz").strpath
-    assert capa.features.freeze.main([sample_9324d1a8ae37a36ae560c37448c9705a.path, o, "-v"]) == 0
+    path = z9324d_extractor.path
+    assert capa.features.freeze.main([path, o, "-v"]) == 0


-def test_freeze_load_sample(tmpdir, sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivisect only works on py2")
+def test_freeze_load_sample(tmpdir, z9324d_extractor):
    o = tmpdir.mkdir("capa").join("test.frz")
-    viv_extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path,
-    )
+
    with open(o.strpath, "wb") as f:
-        f.write(capa.features.freeze.dump(viv_extractor))
-    null_extractor = capa.features.freeze.load(o.open("rb").read())
-    compare_extractors_viv_null(viv_extractor, null_extractor)
+        f.write(capa.features.freeze.dump(z9324d_extractor))
+
+    with open(o.strpath, "rb") as f:
+        null_extractor = capa.features.freeze.load(f.read())
+
+    compare_extractors_viv_null(z9324d_extractor, null_extractor)
--- a/tests/test_ida_features.py
+++ b/tests/test_ida_features.py
@@ -1,24 +1,25 @@
 # run this script from within IDA with ./tests/data/mimikatz.exe open
+import sys
 import logging
+import os.path
 import binascii
 import traceback
-import collections

 import pytest

-import capa.features
-import capa.features.file
-import capa.features.insn
-import capa.features.basicblock
-from capa.features import ARCH_X32, ARCH_X64
+try:
+    sys.path.append(os.path.dirname(__file__))
+    from fixtures import *
+finally:
+    sys.path.pop()
+

 logger = logging.getLogger("test_ida_features")


-def check_input_file():
+def check_input_file(wanted):
    import idautils

-    wanted = "5f66b82558ca92e54e77f216ef4c066c"
    # some versions (7.4) of IDA return a truncated version of the MD5.
    # https://github.com/idapython/bin/issues/11
    try:
@@ -27,12 +28,13 @@ def check_input_file():
        # in IDA 7.5 or so, GetInputFileMD5 started returning raw binary
        # rather than the hex digest
        found = binascii.hexlify(idautils.GetInputFileMD5()[:15]).decode("ascii").lower()
+
    if not wanted.startswith(found):
-        raise RuntimeError("please run the tests against `mimikatz.exe`")
+        raise RuntimeError("please run the tests against sample with MD5: `%s`" % (wanted))


-def get_extractor():
-    check_input_file()
+def get_ida_extractor(_path):
+    check_input_file("5f66b82558ca92e54e77f216ef4c066c")

    # have to import import this inline so pytest doesn't bail outside of IDA
    import capa.features.extractors.ida
@@ -40,263 +42,50 @@ def get_extractor():
    return capa.features.extractors.ida.IdaFeatureExtractor()


-def extract_file_features():
-    extractor = get_extractor()
-    features = set([])
-    for feature, va in extractor.extract_file_features():
-        features.add(feature)
-    return features
-
-
-def extract_function_features(f):
-    extractor = get_extractor()
-    features = collections.defaultdict(set)
-    for bb in extractor.get_basic_blocks(f):
-        for insn in extractor.get_instructions(f, bb):
-            for feature, va in extractor.extract_insn_features(f, bb, insn):
-                features[feature].add(va)
-        for feature, va in extractor.extract_basic_block_features(f, bb):
-            features[feature].add(va)
-    for feature, va in extractor.extract_function_features(f):
-        features[feature].add(va)
-    return features
-
-
-def extract_basic_block_features(f, bb):
-    extractor = get_extractor()
-    features = collections.defaultdict(set)
-    for insn in extractor.get_instructions(f, bb):
-        for feature, va in extractor.extract_insn_features(f, bb, insn):
-            features[feature].add(va)
-    for feature, va in extractor.extract_basic_block_features(f, bb):
-        features[feature].add(va)
-    return features
-
-
@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_api_features():
-    # have to import import this inline so pytest doesn't bail outside of IDA
-    import idaapi
+def test_ida_features():
+    for (sample, scope, feature, expected) in FEATURE_PRESENCE_TESTS:
+        id = make_test_id((sample, scope, feature, expected))

-    f = idaapi.get_func(0x403BAC)
-    features = extract_function_features(f)
-    assert capa.features.insn.API("advapi32.CryptAcquireContextW") in features
-    assert capa.features.insn.API("advapi32.CryptAcquireContext") in features
-    assert capa.features.insn.API("advapi32.CryptGenKey") in features
-    assert capa.features.insn.API("advapi32.CryptImportKey") in features
-    assert capa.features.insn.API("advapi32.CryptDestroyKey") in features
-    assert capa.features.insn.API("CryptAcquireContextW") in features
-    assert capa.features.insn.API("CryptAcquireContext") in features
-    assert capa.features.insn.API("CryptGenKey") in features
-    assert capa.features.insn.API("CryptImportKey") in features
-    assert capa.features.insn.API("CryptDestroyKey") in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_string_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.String("SCardControl") in features
-    assert capa.features.String("SCardTransmit") in features
-    assert capa.features.String("ACR  > ") in features
-    # other strings not in this function
-    assert capa.features.String("bcrypt.dll") not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_byte_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    wanted = capa.features.Bytes("SCardControl".encode("utf-16le"))
-    # use `==` rather than `is` because the result is not `True` but a truthy value.
-    assert wanted.evaluate(features) == True
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_number_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.insn.Number(0xFF) in features
-    assert capa.features.insn.Number(0x3136B0) in features
-    # the following are stack adjustments
-    assert capa.features.insn.Number(0xC) not in features
-    assert capa.features.insn.Number(0x10) not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_number_arch_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.insn.Number(0xFF) in features
-    assert capa.features.insn.Number(0xFF, arch=ARCH_X32) in features
-    assert capa.features.insn.Number(0xFF, arch=ARCH_X64) not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_offset_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.insn.Offset(0x0) in features
-    assert capa.features.insn.Offset(0x4) in features
-    assert capa.features.insn.Offset(0xC) in features
-    # the following are stack references
-    assert capa.features.insn.Offset(0x8) not in features
-    assert capa.features.insn.Offset(0x10) not in features
-
-    # this function has the following negative offsets
-    # movzx   ecx, byte ptr [eax-1]
-    # movzx   eax, byte ptr [eax-2]
-    f = idaapi.get_func(0x4011FB)
-    features = extract_function_features(f)
-    assert capa.features.insn.Offset(-0x1) in features
-    assert capa.features.insn.Offset(-0x2) in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_offset_arch_features(mimikatz):
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.insn.Offset(0x0) in features
-    assert capa.features.insn.Offset(0x0, arch=ARCH_X32) in features
-    assert capa.features.insn.Offset(0x0, arch=ARCH_X64) not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_nzxor_features():
-    import idaapi
-
-    f = idaapi.get_func(0x410DFC)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("nzxor") in features  # 0x0410F0B
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_mnemonic_features():
-    import idaapi
-
-    f = idaapi.get_func(0x40105D)
-    features = extract_function_features(f)
-    assert capa.features.insn.Mnemonic("push") in features
-    assert capa.features.insn.Mnemonic("movzx") in features
-    assert capa.features.insn.Mnemonic("xor") in features
-
-    assert capa.features.insn.Mnemonic("in") not in features
-    assert capa.features.insn.Mnemonic("out") not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_file_section_name_features():
-    features = extract_file_features()
-    assert capa.features.file.Section(".idata") in features
-    assert capa.features.file.Section(".text") in features
-    assert capa.features.file.Section(".nope") not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_tight_loop_features():
-    import idaapi
-
-    extractor = get_extractor()
-    f = idaapi.get_func(0x402EC4)
-    for bb in extractor.get_basic_blocks(f):
-        if bb.__int__() != 0x402F8E:
+        try:
+            check_input_file(get_sample_md5_by_name(sample))
+        except RuntimeError:
+            print("SKIP %s" % (id))
            continue
-        features = extract_basic_block_features(f, bb)
-        assert capa.features.Characteristic("tight loop") in features
-        assert capa.features.basicblock.BasicBlock() in features
+
+        scope = resolve_scope(scope)
+        sample = resolve_sample(sample)
+
+        try:
+            do_test_feature_presence(get_ida_extractor, sample, scope, feature, expected)
+        except Exception:
+            print("FAIL %s" % (id))
+            traceback.print_exc()
+        else:
+            print("OK   %s" % (id))


@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_tight_loop_bb_features():
-    import idaapi
+def test_ida_feature_counts():
+    for (sample, scope, feature, expected) in FEATURE_COUNT_TESTS:
+        id = make_test_id((sample, scope, feature, expected))

-    extractor = get_extractor()
-    f = idaapi.get_func(0x402EC4)
-    for bb in extractor.get_basic_blocks(f):
-        if bb.__int__() != 0x402F8E:
+        try:
+            check_input_file(get_sample_md5_by_name(sample))
+        except RuntimeError:
+            print("SKIP %s" % (id))
            continue
-        features = extract_basic_block_features(f, bb)
-        assert capa.features.Characteristic("tight loop") in features
-        assert capa.features.basicblock.BasicBlock() in features

+        scope = resolve_scope(scope)
+        sample = resolve_sample(sample)

-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_file_import_name_features():
-    features = extract_file_features()
-    assert capa.features.file.Import("advapi32.CryptSetHashParam") in features
-    assert capa.features.file.Import("CryptSetHashParam") in features
-    assert capa.features.file.Import("kernel32.IsWow64Process") in features
-    assert capa.features.file.Import("msvcrt.exit") in features
-    assert capa.features.file.Import("cabinet.#11") in features
-    assert capa.features.file.Import("#11") not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_stackstring_features():
-    import idaapi
-
-    f = idaapi.get_func(0x4556E5)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("stack string") in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_switch_features():
-    import idaapi
-
-    f = idaapi.get_func(0x409411)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("switch") in features
-
-    f = idaapi.get_func(0x409393)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("switch") not in features
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_function_calls_to():
-    import idaapi
-
-    # this function is used in a function pointer
-    f = idaapi.get_func(0x4011FB)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("calls to") not in features
-
-    # __FindPESection is called once
-    f = idaapi.get_func(0x470360)
-    features = extract_function_features(f)
-    assert len(features[capa.features.Characteristic("calls to")]) == 1
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_function_calls_from():
-    import idaapi
-
-    f = idaapi.get_func(0x4011FB)
-    features = extract_function_features(f)
-    assert capa.features.Characteristic("calls from") in features
-    assert len(features[capa.features.Characteristic("calls from")]) == 3
-
-
-@pytest.mark.skip(reason="IDA Pro tests must be run within IDA")
-def test_basic_block_count():
-    import idaapi
-
-    f = idaapi.get_func(0x4011FB)
-    features = extract_function_features(f)
-    assert len(features[capa.features.basicblock.BasicBlock()]) == 15
+        try:
+            do_test_feature_count(get_ida_extractor, sample, scope, feature, expected)
+        except Exception as e:
+            print("FAIL %s" % (id))
+            traceback.print_exc()
+        else:
+            print("OK   %s" % (id))


 if __name__ == "__main__":
@@ -310,10 +99,6 @@ if __name__ == "__main__":
        test = getattr(sys.modules[__name__], name)
        logger.debug("invoking test: %s", name)
        sys.stderr.flush()
-        try:
        test()
-        except AssertionError as e:
-            print("FAIL %s" % (name))
-            traceback.print_exc()
-        else:
-            print("OK   %s" % (name))
+
+    print("DONE")
--- a/tests/test_main.py
+++ b/tests/test_main.py
@@ -5,28 +5,31 @@
 # Unless required by applicable law or agreed to in writing, software distributed under the License
 #  is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and limitations under the License.
-
+import sys
 import textwrap

+import pytest
 from fixtures import *

 import capa.main
 import capa.rules
 import capa.engine
 import capa.features
-import capa.features.extractors.viv
 from capa.engine import *


-def test_main(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_main(z9324d_extractor):
    # tests rules can be loaded successfully and all output modes
-    assert capa.main.main([sample_9324d1a8ae37a36ae560c37448c9705a.path, "-vv"]) == 0
-    assert capa.main.main([sample_9324d1a8ae37a36ae560c37448c9705a.path, "-v"]) == 0
-    assert capa.main.main([sample_9324d1a8ae37a36ae560c37448c9705a.path, "-j"]) == 0
-    assert capa.main.main([sample_9324d1a8ae37a36ae560c37448c9705a.path]) == 0
+    path = z9324d_extractor.path
+    assert capa.main.main([path, "-vv"]) == 0
+    assert capa.main.main([path, "-v"]) == 0
+    assert capa.main.main([path, "-j"]) == 0
+    assert capa.main.main([path]) == 0


-def test_main_single_rule(sample_9324d1a8ae37a36ae560c37448c9705a, tmpdir):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_main_single_rule(z9324d_extractor, tmpdir):
    # tests a single rule can be loaded successfully
    RULE_CONTENT = textwrap.dedent(
        """
@@ -38,16 +41,29 @@ def test_main_single_rule(sample_9324d1a8ae37a36ae560c37448c9705a, tmpdir):
              - string: test
        """
    )
+    path = z9324d_extractor.path
    rule_file = tmpdir.mkdir("capa").join("rule.yml")
    rule_file.write(RULE_CONTENT)
-    assert capa.main.main([sample_9324d1a8ae37a36ae560c37448c9705a.path, "-v", "-r", rule_file.strpath,]) == 0
+    assert (
+        capa.main.main(
+            [
+                path,
+                "-v",
+                "-r",
+                rule_file.strpath,
+            ]
+        )
+        == 0
+    )


-def test_main_shellcode(sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32):
-    assert capa.main.main([sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32.path, "-vv", "-f", "sc32"]) == 0
-    assert capa.main.main([sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32.path, "-v", "-f", "sc32"]) == 0
-    assert capa.main.main([sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32.path, "-j", "-f", "sc32"]) == 0
-    assert capa.main.main([sample_499c2a85f6e8142c3f48d4251c9c7cd6_raw32.path, "-f", "sc32"]) == 0
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_main_shellcode(z499c2_extractor):
+    path = z499c2_extractor.path
+    assert capa.main.main([path, "-vv", "-f", "sc32"]) == 0
+    assert capa.main.main([path, "-v", "-f", "sc32"]) == 0
+    assert capa.main.main([path, "-j", "-f", "sc32"]) == 0
+    assert capa.main.main([path, "-f", "sc32"]) == 0


 def test_ruleset():
@@ -73,7 +89,7 @@ def test_ruleset():
                            name: function rule
                            scope: function
                        features:
-                          - characteristic: switch
+                          - characteristic: tight loop
                    """
                )
            ),
@@ -96,7 +112,8 @@ def test_ruleset():
    assert len(rules.basic_block_rules) == 1


-def test_match_across_scopes_file_function(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_match_across_scopes_file_function(z9324d_extractor):
    rules = capa.rules.RuleSet(
        [
            # this rule should match on a function (0x4073F0)
@@ -153,16 +170,14 @@ def test_match_across_scopes_file_function(sample_9324d1a8ae37a36ae560c37448c970
            ),
        ]
    )
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path,
-    )
-    capabilities, meta = capa.main.find_capabilities(rules, extractor)
+    capabilities, meta = capa.main.find_capabilities(rules, z9324d_extractor)
    assert "install service" in capabilities
    assert ".text section" in capabilities
    assert ".text section and install service" in capabilities


-def test_match_across_scopes(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_match_across_scopes(z9324d_extractor):
    rules = capa.rules.RuleSet(
        [
            # this rule should match on a basic block (including at least 0x403685)
@@ -218,16 +233,14 @@ def test_match_across_scopes(sample_9324d1a8ae37a36ae560c37448c9705a):
            ),
        ]
    )
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path
-    )
-    capabilities, meta = capa.main.find_capabilities(rules, extractor)
+    capabilities, meta = capa.main.find_capabilities(rules, z9324d_extractor)
    assert "tight loop" in capabilities
    assert "kill thread loop" in capabilities
    assert "kill thread program" in capabilities


-def test_subscope_bb_rules(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_subscope_bb_rules(z9324d_extractor):
    rules = capa.rules.RuleSet(
        [
            capa.rules.Rule.from_yaml(
@@ -247,14 +260,12 @@ def test_subscope_bb_rules(sample_9324d1a8ae37a36ae560c37448c9705a):
        ]
    )
    # tight loop at 0x403685
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path,
-    )
-    capabilities, meta = capa.main.find_capabilities(rules, extractor)
+    capabilities, meta = capa.main.find_capabilities(rules, z9324d_extractor)
    assert "test rule" in capabilities


-def test_byte_matching(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_byte_matching(z9324d_extractor):
    rules = capa.rules.RuleSet(
        [
            capa.rules.Rule.from_yaml(
@@ -272,15 +283,12 @@ def test_byte_matching(sample_9324d1a8ae37a36ae560c37448c9705a):
            )
        ]
    )
-
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path,
-    )
-    capabilities, meta = capa.main.find_capabilities(rules, extractor)
+    capabilities, meta = capa.main.find_capabilities(rules, z9324d_extractor)
    assert "byte match test" in capabilities


-def test_count_bb(sample_9324d1a8ae37a36ae560c37448c9705a):
+@pytest.mark.xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2")
+def test_count_bb(z9324d_extractor):
    rules = capa.rules.RuleSet(
        [
            capa.rules.Rule.from_yaml(
@@ -299,9 +307,5 @@ def test_count_bb(sample_9324d1a8ae37a36ae560c37448c9705a):
            )
        ]
    )
-
-    extractor = capa.features.extractors.viv.VivisectFeatureExtractor(
-        sample_9324d1a8ae37a36ae560c37448c9705a.vw, sample_9324d1a8ae37a36ae560c37448c9705a.path,
-    )
-    capabilities, meta = capa.main.find_capabilities(rules, extractor)
+    capabilities, meta = capa.main.find_capabilities(rules, z9324d_extractor)
    assert "count bb" in capabilities
--- a/tests/test_rules.py
+++ b/tests/test_rules.py
@@ -162,6 +162,23 @@ def test_rule_yaml_count_range():
    assert r.evaluate({Number(100): {1, 2, 3}}) == False


+def test_rule_yaml_count_string():
+    rule = textwrap.dedent(
+        """
+        rule:
+            meta:
+                name: test rule
+            features:
+                - count(string(foo)): 2
+        """
+    )
+    r = capa.rules.Rule.from_yaml(rule)
+    assert r.evaluate({String("foo"): {}}) == False
+    assert r.evaluate({String("foo"): {1}}) == False
+    assert r.evaluate({String("foo"): {1, 2}}) == True
+    assert r.evaluate({String("foo"): {1, 2, 3}}) == False
+
+
 def test_invalid_rule_feature():
    with pytest.raises(capa.rules.InvalidRule):
        capa.rules.Rule.from_yaml(
@@ -267,7 +284,7 @@ def test_subscope_rules():
                                - function:
                                    - and:
                                        - characteristic: nzxor
-                                        - characteristic: switch
+                                        - characteristic: loop
                    """
                )
            )
@@ -466,6 +483,21 @@ def test_number_arch():
    assert r.evaluate({Number(2, arch=ARCH_X64): {1}}) == False


+def test_number_arch_symbol():
+    r = capa.rules.Rule.from_yaml(
+        textwrap.dedent(
+            """
+            rule:
+                meta:
+                    name: test rule
+                features:
+                    - number/x32: 2 = some constant
+            """
+        )
+    )
+    assert r.evaluate({Number(2, arch=ARCH_X32, description="some constant"): {1}}) == True
+
+
 def test_offset_symbol():
    rule = textwrap.dedent(
        """
@@ -529,6 +561,21 @@ def test_offset_arch():
    assert r.evaluate({Offset(2, arch=ARCH_X64): {1}}) == False


+def test_offset_arch_symbol():
+    r = capa.rules.Rule.from_yaml(
+        textwrap.dedent(
+            """
+            rule:
+                meta:
+                    name: test rule
+                features:
+                    - offset/x32: 2 = some constant
+            """
+        )
+    )
+    assert r.evaluate({Offset(2, arch=ARCH_X32, description="some constant"): {1}}) == True
+
+
 def test_invalid_offset():
    with pytest.raises(capa.rules.InvalidRule):
        r = capa.rules.Rule.from_yaml(
@@ -633,12 +680,16 @@ def test_regex_values_always_string():
        ),
    ]
    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("123"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("123"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") in features

    features, matches = capa.engine.match(
-        capa.engine.topologically_order_rules(rules), {capa.features.String("0x123"): {1}}, 0x0,
+        capa.engine.topologically_order_rules(rules),
+        {capa.features.String("0x123"): {1}},
+        0x0,
    )
    assert capa.features.MatchedRule("test rule") in features

--- a/tests/test_viv_features.py
+++ b/tests/test_viv_features.py
@@ -5,340 +5,26 @@
 # Unless required by applicable law or agreed to in writing, software distributed under the License
 #  is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and limitations under the License.
+import sys

-import viv_utils
 from fixtures import *

-import capa.features
-import capa.features.file
-import capa.features.insn
-import capa.features.basicblock
-import capa.features.extractors.viv.file
-import capa.features.extractors.viv.insn
-import capa.features.extractors.viv.function
-import capa.features.extractors.viv.basicblock
-from capa.features import ARCH_X32, ARCH_X64

-
-def extract_file_features(vw, path):
-    features = set([])
-    for feature, va in capa.features.extractors.viv.file.extract_features(vw, path):
-        features.add(feature)
-    return features
-
-
-def extract_function_features(f):
-    features = collections.defaultdict(set)
-    for bb in f.basic_blocks:
-        for insn in bb.instructions:
-            for feature, va in capa.features.extractors.viv.insn.extract_features(f, bb, insn):
-                features[feature].add(va)
-        for feature, va in capa.features.extractors.viv.basicblock.extract_features(f, bb):
-            features[feature].add(va)
-    for feature, va in capa.features.extractors.viv.function.extract_features(f):
-        features[feature].add(va)
-    return features
-
-
-def extract_basic_block_features(f, bb):
-    features = set({})
-    for insn in bb.instructions:
-        for feature, _ in capa.features.extractors.viv.insn.extract_features(f, bb, insn):
-            features.add(feature)
-    for feature, _ in capa.features.extractors.viv.basicblock.extract_features(f, bb):
-        features.add(feature)
-    return features
-
-
-def test_api_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x403BAC))
-    assert capa.features.insn.API("advapi32.CryptAcquireContextW") in features
-    assert capa.features.insn.API("advapi32.CryptAcquireContext") in features
-    assert capa.features.insn.API("advapi32.CryptGenKey") in features
-    assert capa.features.insn.API("advapi32.CryptImportKey") in features
-    assert capa.features.insn.API("advapi32.CryptDestroyKey") in features
-    assert capa.features.insn.API("CryptAcquireContextW") in features
-    assert capa.features.insn.API("CryptAcquireContext") in features
-    assert capa.features.insn.API("CryptGenKey") in features
-    assert capa.features.insn.API("CryptImportKey") in features
-    assert capa.features.insn.API("CryptDestroyKey") in features
-
-
-def test_api_features_64_bit(sample_a198216798ca38f280dc413f8c57f2c2):
-    features = extract_function_features(viv_utils.Function(sample_a198216798ca38f280dc413f8c57f2c2.vw, 0x4011B0))
-    assert capa.features.insn.API("kernel32.GetStringTypeA") in features
-    assert capa.features.insn.API("kernel32.GetStringTypeW") not in features
-    assert capa.features.insn.API("kernel32.GetStringType") in features
-    assert capa.features.insn.API("GetStringTypeA") in features
-    assert capa.features.insn.API("GetStringType") in features
-    # call via thunk in IDA Pro
-    features = extract_function_features(viv_utils.Function(sample_a198216798ca38f280dc413f8c57f2c2.vw, 0x401CB0))
-    assert capa.features.insn.API("msvcrt.vfprintf") in features
-    assert capa.features.insn.API("vfprintf") in features
-
-
-def test_string_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.String("SCardControl") in features
-    assert capa.features.String("SCardTransmit") in features
-    assert capa.features.String("ACR  > ") in features
-    # other strings not in this function
-    assert capa.features.String("bcrypt.dll") not in features
-
-
-def test_string_pointer_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x44EDEF))
-    assert capa.features.String("INPUTEVENT") in features
-
-
-def test_byte_features(sample_9324d1a8ae37a36ae560c37448c9705a):
-    features = extract_function_features(viv_utils.Function(sample_9324d1a8ae37a36ae560c37448c9705a.vw, 0x406F60))
-    wanted = capa.features.Bytes(b"\xED\x24\x9E\xF4\x52\xA9\x07\x47\x55\x8E\xE1\xAB\x30\x8E\x23\x61")
-    # use `==` rather than `is` because the result is not `True` but a truthy value.
-    assert wanted.evaluate(features) == True
-
-
-def test_byte_features64(sample_lab21_01):
-    features = extract_function_features(viv_utils.Function(sample_lab21_01.vw, 0x1400010C0))
-    wanted = capa.features.Bytes(b"\x32\xA2\xDF\x2D\x99\x2B\x00\x00")
-    # use `==` rather than `is` because the result is not `True` but a truthy value.
-    assert wanted.evaluate(features) == True
-
-
-def test_bytes_pointer_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x44EDEF))
-    assert capa.features.Bytes("INPUTEVENT".encode("utf-16le")).evaluate(features) == True
-
-
-def test_number_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.insn.Number(0xFF) in features
-    assert capa.features.insn.Number(0x3136B0) in features
-    # the following are stack adjustments
-    assert capa.features.insn.Number(0xC) not in features
-    assert capa.features.insn.Number(0x10) not in features
-
-
-def test_number_arch_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.insn.Number(0xFF) in features
-    assert capa.features.insn.Number(0xFF, arch=ARCH_X32) in features
-    assert capa.features.insn.Number(0xFF, arch=ARCH_X64) not in features
-
-
-def test_offset_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.insn.Offset(0x0) in features
-    assert capa.features.insn.Offset(0x4) in features
-    assert capa.features.insn.Offset(0xC) in features
-    # the following are stack references
-    assert capa.features.insn.Offset(0x8) not in features
-    assert capa.features.insn.Offset(0x10) not in features
-
-    # this function has the following negative offsets
-    # movzx   ecx, byte ptr [eax-1]
-    # movzx   eax, byte ptr [eax-2]
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x4011FB))
-    assert capa.features.insn.Offset(-0x1) in features
-    assert capa.features.insn.Offset(-0x2) in features
-
-
-def test_offset_arch_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.insn.Offset(0x0) in features
-    assert capa.features.insn.Offset(0x0, arch=ARCH_X32) in features
-    assert capa.features.insn.Offset(0x0, arch=ARCH_X64) not in features
-
-
-def test_nzxor_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x410DFC))
-    assert capa.features.Characteristic("nzxor") in features  # 0x0410F0B
-
-
-def get_bb_insn(f, va):
-    """fetch the BasicBlock and Instruction instances for the given VA in the given function."""
-    for bb in f.basic_blocks:
-        for insn in bb.instructions:
-            if insn.va == va:
-                return (bb, insn)
-    raise KeyError(va)
-
-
-def test_is_security_cookie(mimikatz):
-    # not a security cookie check
-    f = viv_utils.Function(mimikatz.vw, 0x410DFC)
-    for va in [0x0410F0B]:
-        bb, insn = get_bb_insn(f, va)
-        assert capa.features.extractors.viv.insn.is_security_cookie(f, bb, insn) == False
-
-    # security cookie initial set and final check
-    f = viv_utils.Function(mimikatz.vw, 0x46C54A)
-    for va in [0x46C557, 0x46C63A]:
-        bb, insn = get_bb_insn(f, va)
-        assert capa.features.extractors.viv.insn.is_security_cookie(f, bb, insn) == True
-
-
-def test_mnemonic_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x40105D))
-    assert capa.features.insn.Mnemonic("push") in features
-    assert capa.features.insn.Mnemonic("movzx") in features
-    assert capa.features.insn.Mnemonic("xor") in features
-
-    assert capa.features.insn.Mnemonic("in") not in features
-    assert capa.features.insn.Mnemonic("out") not in features
-
-
-def test_peb_access_features(sample_a933a1a402775cfa94b6bee0963f4b46):
-    features = extract_function_features(viv_utils.Function(sample_a933a1a402775cfa94b6bee0963f4b46.vw, 0xABA6FEC))
-    assert capa.features.Characteristic("peb access") in features
-
-
-def test_file_section_name_features(mimikatz):
-    features = extract_file_features(mimikatz.vw, mimikatz.path)
-    assert capa.features.file.Section(".rsrc") in features
-    assert capa.features.file.Section(".text") in features
-    assert capa.features.file.Section(".nope") not in features
-
-
-def test_tight_loop_features(mimikatz):
-    f = viv_utils.Function(mimikatz.vw, 0x402EC4)
-    for bb in f.basic_blocks:
-        if bb.va != 0x402F8E:
-            continue
-        features = extract_basic_block_features(f, bb)
-        assert capa.features.Characteristic("tight loop") in features
-        assert capa.features.basicblock.BasicBlock() in features
-
-
-def test_tight_loop_bb_features(mimikatz):
-    f = viv_utils.Function(mimikatz.vw, 0x402EC4)
-    for bb in f.basic_blocks:
-        if bb.va != 0x402F8E:
-            continue
-        features = extract_basic_block_features(f, bb)
-        assert capa.features.Characteristic("tight loop") in features
-        assert capa.features.basicblock.BasicBlock() in features
-
-
-def test_file_export_name_features(kernel32):
-    features = extract_file_features(kernel32.vw, kernel32.path)
-    assert capa.features.file.Export("BaseThreadInitThunk") in features
-    assert capa.features.file.Export("lstrlenW") in features
-
-
-def test_file_import_name_features(mimikatz):
-    features = extract_file_features(mimikatz.vw, mimikatz.path)
-    assert capa.features.file.Import("advapi32.CryptSetHashParam") in features
-    assert capa.features.file.Import("CryptSetHashParam") in features
-    assert capa.features.file.Import("kernel32.IsWow64Process") in features
-    assert capa.features.file.Import("msvcrt.exit") in features
-    assert capa.features.file.Import("cabinet.#11") in features
-    assert capa.features.file.Import("#11") not in features
-
-
-def test_cross_section_flow_features(sample_a198216798ca38f280dc413f8c57f2c2):
-    features = extract_function_features(viv_utils.Function(sample_a198216798ca38f280dc413f8c57f2c2.vw, 0x4014D0))
-    assert capa.features.Characteristic("cross section flow") in features
-
-    # this function has calls to some imports,
-    # which should not trigger cross-section flow characteristic
-    features = extract_function_features(viv_utils.Function(sample_a198216798ca38f280dc413f8c57f2c2.vw, 0x401563))
-    assert capa.features.Characteristic("cross section flow") not in features
-
-
-def test_segment_access_features(sample_a933a1a402775cfa94b6bee0963f4b46):
-    features = extract_function_features(viv_utils.Function(sample_a933a1a402775cfa94b6bee0963f4b46.vw, 0xABA6FEC))
-    assert capa.features.Characteristic("fs access") in features
-
-
-def test_thunk_features(sample_9324d1a8ae37a36ae560c37448c9705a):
-    features = extract_function_features(viv_utils.Function(sample_9324d1a8ae37a36ae560c37448c9705a.vw, 0x407970))
-    assert capa.features.insn.API("kernel32.CreateToolhelp32Snapshot") in features
-    assert capa.features.insn.API("CreateToolhelp32Snapshot") in features
-
-
-def test_file_embedded_pe(pma_lab_12_04):
-    features = extract_file_features(pma_lab_12_04.vw, pma_lab_12_04.path)
-    assert capa.features.Characteristic("embedded pe") in features
-
-
-def test_stackstring_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x4556E5))
-    assert capa.features.Characteristic("stack string") in features
-
-
-def test_switch_features(mimikatz):
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x409411))
-    assert capa.features.Characteristic("switch") in features
-
-    features = extract_function_features(viv_utils.Function(mimikatz.vw, 0x409393))
-    assert capa.features.Characteristic("switch") not in features
-
-
-def test_recursive_call_feature(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41):
-    features = extract_function_features(
-        viv_utils.Function(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.vw, 0x10003100)
+@parametrize(
+    "sample,scope,feature,expected",
+    FEATURE_PRESENCE_TESTS,
+    indirect=["sample", "scope"],
 )
-    assert capa.features.Characteristic("recursive call") in features
+def test_viv_features(sample, scope, feature, expected):
+    with xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2"):
+        do_test_feature_presence(get_viv_extractor, sample, scope, feature, expected)

-    features = extract_function_features(
-        viv_utils.Function(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.vw, 0x10007B00)
+
+@parametrize(
+    "sample,scope,feature,expected",
+    FEATURE_COUNT_TESTS,
+    indirect=["sample", "scope"],
 )
-    assert capa.features.Characteristic("recursive call") not in features
-
-
-def test_loop_feature(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41):
-    features = extract_function_features(
-        viv_utils.Function(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.vw, 0x10003D30)
-    )
-    assert capa.features.Characteristic("loop") in features
-
-    features = extract_function_features(
-        viv_utils.Function(sample_39c05b15e9834ac93f206bc114d0a00c357c888db567ba8f5345da0529cbed41.vw, 0x10007250)
-    )
-    assert capa.features.Characteristic("loop") not in features
-
-
-def test_file_string_features(sample_bfb9b5391a13d0afd787e87ab90f14f5):
-    features = extract_file_features(
-        sample_bfb9b5391a13d0afd787e87ab90f14f5.vw, sample_bfb9b5391a13d0afd787e87ab90f14f5.path,
-    )
-    assert capa.features.String("WarStop") in features  # ASCII, offset 0x40EC
-    assert capa.features.String("cimage/png") in features  # UTF-16 LE, offset 0x350E
-
-
-def test_function_calls_to(sample_9324d1a8ae37a36ae560c37448c9705a):
-    features = extract_function_features(viv_utils.Function(sample_9324d1a8ae37a36ae560c37448c9705a.vw, 0x406F60))
-    assert capa.features.Characteristic("calls to") in features
-    assert len(features[capa.features.Characteristic("calls to")]) == 1
-
-
-def test_function_calls_to64(sample_lab21_01):
-    features = extract_function_features(viv_utils.Function(sample_lab21_01.vw, 0x1400052D0))  # memcpy
-    assert capa.features.Characteristic("calls to") in features
-    assert len(features[capa.features.Characteristic("calls to")]) == 8
-
-
-def test_function_calls_from(sample_9324d1a8ae37a36ae560c37448c9705a):
-    features = extract_function_features(viv_utils.Function(sample_9324d1a8ae37a36ae560c37448c9705a.vw, 0x406F60))
-    assert capa.features.Characteristic("calls from") in features
-    assert len(features[capa.features.Characteristic("calls from")]) == 23
-
-
-def test_basic_block_count(sample_9324d1a8ae37a36ae560c37448c9705a):
-    features = extract_function_features(viv_utils.Function(sample_9324d1a8ae37a36ae560c37448c9705a.vw, 0x406F60))
-    assert len(features[capa.features.basicblock.BasicBlock()]) == 26
-
-
-def test_indirect_call_features(sample_a933a1a402775cfa94b6bee0963f4b46):
-    features = extract_function_features(viv_utils.Function(sample_a933a1a402775cfa94b6bee0963f4b46.vw, 0xABA68A0))
-    assert capa.features.Characteristic("indirect call") in features
-    assert len(features[capa.features.Characteristic("indirect call")]) == 3
-
-
-def test_indirect_calls_resolved(sample_c91887d861d9bd4a5872249b641bc9f9):
-    features = extract_function_features(viv_utils.Function(sample_c91887d861d9bd4a5872249b641bc9f9.vw, 0x401A77))
-    assert capa.features.insn.API("kernel32.CreatePipe") in features
-    assert capa.features.insn.API("kernel32.SetHandleInformation") in features
-    assert capa.features.insn.API("kernel32.CloseHandle") in features
-    assert capa.features.insn.API("kernel32.WriteFile") in features
+def test_viv_feature_counts(sample, scope, feature, expected):
+    with xfail(sys.version_info >= (3, 0), reason="vivsect only works on py2"):
+        do_test_feature_count(get_viv_extractor, sample, scope, feature, expected)
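
Note on the test refactor above: the per-feature test functions are replaced by shared case tables (FEATURE_PRESENCE_TESTS, FEATURE_COUNT_TESTS) that each backend runs through a common helper, via pytest parametrization for vivisect and a manual OK/FAIL loop for IDA. The helpers themselves live in fixtures.py, which is not shown in this section. The following is only a minimal sketch of the table-driven pattern, assuming nothing about capa's real implementation: every name and value in it (PRESENCE_CASES, FAKE_FEATURES, get_fake_extractor, check_feature_presence, run_cases) is hypothetical.

```python
# Hypothetical stand-ins: none of these names or values come from capa's fixtures.py.
PRESENCE_CASES = [
    # (sample, scope, feature, expected)
    ("sample-a", "function=0x401000", "characteristic(tight loop)", True),
    ("sample-a", "function=0x401000", "characteristic(switch)", False),
    ("sample-b", "file", "section(.text)", True),
]

# Toy feature database keyed by (sample, scope); a real backend would analyze a binary.
FAKE_FEATURES = {
    ("sample-a", "function=0x401000"): {"characteristic(tight loop)"},
    ("sample-b", "file"): {"section(.text)"},
}


def get_fake_extractor(sample):
    """Return a callable that maps a scope to the set of features found there."""

    def extract(scope):
        return FAKE_FEATURES.get((sample, scope), set())

    return extract


def check_feature_presence(get_extractor, sample, scope, feature, expected):
    """Assert that `feature` is present (or absent) for the given sample and scope."""
    features = get_extractor(sample)(scope)
    assert (feature in features) == expected, "%s in %s/%s" % (feature, sample, scope)


def run_cases(get_extractor, cases):
    """Run every case against one backend and collect ((sample, scope, feature), ok)."""
    results = []
    for sample, scope, feature, expected in cases:
        try:
            check_feature_presence(get_extractor, sample, scope, feature, expected)
        except AssertionError:
            results.append(((sample, scope, feature), False))
        else:
            results.append(((sample, scope, feature), True))
    return results
```

Because the backend is passed in as a factory, the same case table can be reused for a second extractor by swapping only the first argument, which is the property the refactor relies on when it calls the shared helpers with get_viv_extractor in one file and get_ida_extractor in the other.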
Author	SHA1	Message	Date
Willi Ballenthin	a801a681b8	Merge pull request #266 from fireeye/release-v1.2.0 release v1.2.0	2020-08-31 10:29:38 -06:00
mike-hunhoff	c25632b12c	Merge pull request #264 from winniepe/master	2020-08-31 09:22:34 -06:00
Capa Bot	8e6974b10f	Sync capa rules submodule	2020-08-31 13:51:49 +00:00
Capa Bot	7616603b11	Sync capa rules submodule	2020-08-31 13:02:37 +00:00
winniepe	7c27af8868	Restore default expansion after unselecting 'Limit results to current function' checkbox.	2020-08-30 16:48:51 +00:00
winniepe	19e5e9b766	Expand one layer by default to make IDA navigation easier.	2020-08-30 16:27:48 +00:00
William Ballenthin	adeee3e834	changelog: don't forget to reference @edeca!	2020-08-29 22:53:51 -06:00
William Ballenthin	c2997c8033	changelog: add entry from #264	2020-08-29 22:32:24 -06:00
William Ballenthin	28b463f145	changelog: add entries for v1.2.0	2020-08-29 22:26:40 -06:00
William Ballenthin	cc59f5b91e	setup: bump version to v1.2.0	2020-08-29 21:54:16 -06:00
William Ballenthin	06ac49e629	submodule: rules, data update	2020-08-29 21:51:40 -06:00
Capa Bot	6c07617082	Sync capa rules submodule	2020-08-29 00:11:38 +00:00
Capa Bot	13390918a1	Sync capa rules submodule	2020-08-28 20:09:50 +00:00
Capa Bot	0f44ec0dd8	Sync capa-testfiles submodule	2020-08-28 19:59:22 +00:00
mike-hunhoff	c49199138e	Merge pull request #261 from fireeye/explorer_include_block_scope_limit_by_func	2020-08-28 10:46:40 -06:00
Michael Hunhoff	3f88bb8500	adding code to include basic block scope when limiting results by a function	2020-08-28 10:30:09 -06:00
Willi Ballenthin	b2b9f15bc1	Merge pull request #260 from fireeye/explorer_plugin_display_statement_description explorer: display statement descriptions	2020-08-27 17:16:38 -06:00
Michael Hunhoff	d2cd224fb3	adding code to display statement description in explorer plugin UI	2020-08-27 14:49:49 -06:00
Capa Bot	aac13164a5	Sync capa rules submodule	2020-08-27 20:40:06 +00:00
Capa Bot	f2fff02b49	Sync capa rules submodule	2020-08-27 20:39:21 +00:00
Willi Ballenthin	662a7eaae6	Merge pull request #259 from recvfrom/master Fix #255: Use relative paths for the git submodule	2020-08-27 14:20:10 -06:00
Willi Ballenthin	f6ba63083b	Merge pull request #258 from recvfrom/fix-256 Fix 256: Pin enum34 version to 1.1.6 for python2.7	2020-08-27 14:19:43 -06:00
Andrew	49774110cc	Fix #255 : Use relative paths for the git submodule Fixes #255 This enables both HTTPS and SSH to be used to checkout the project, per https://stackoverflow.com/a/44630028/9457431	2020-08-27 15:25:14 -04:00
Andrew	c7840e0769	Fix 256: Pin enum34 version to 1.1.6 for python2.7 Fixes #256 - capa requires halo==0.0.30, which has a dependency on spinners>=0.0.24. spinners 0.0.24 has a dependency on enum34==1.1.6, but 1.1.10 gets installed and used on my machine without the version being pinned to 1.1.6. This issue occurs when using python 2.7.	2020-08-27 14:59:58 -04:00
mike-hunhoff	d2155eb3a1	Merge pull request #257 from fireeye/fix-237	2020-08-27 12:39:20 -06:00
Michael Hunhoff	3772c5c0bc	add additional nzxor stack cookie check for IDA extractor	2020-08-27 12:32:44 -06:00
Capa Bot	d47d149196	Sync capa rules submodule	2020-08-27 16:08:48 +00:00
Capa Bot	528645c0d2	Sync capa rules submodule	2020-08-27 13:53:01 +00:00
Willi Ballenthin	7464a62943	Merge pull request #253 from fireeye/black-reformat Black reformat	2020-08-27 07:50:46 -06:00
Moritz Raabe	34e7991081	black 20.8b1 updates	2020-08-27 11:26:28 +02:00
Moritz Raabe	3e20f0fc71	dos2unix	2020-08-27 11:25:43 +02:00
Capa Bot	cb9bd2eab7	Sync capa-testfiles submodule	2020-08-27 08:40:12 +00:00
Willi Ballenthin	9d102843ac	Merge pull request #251 from fireeye/bugfix-249-arch-description bugfix 249	2020-08-26 17:18:34 -06:00
Michael Hunhoff	dc8870861b	fixes 249	2020-08-26 16:31:07 -06:00
Capa Bot	8be1c84fd2	Sync capa rules submodule	2020-08-25 16:35:30 +00:00
Capa Bot	739100d481	Sync capa-testfiles submodule	2020-08-25 16:34:26 +00:00
Willi Ballenthin	fd7d9aafe9	Merge pull request #247 from Ana06/test-pythons Test all supported Python versions	2020-08-21 07:55:08 -06:00
Ana María Martínez Gómez	a39e3cca79	ci: test all supported Python versions I assume once we migrate to Python3, we want to support Python 3.6-9. Python 3.5 will stop receiving security fixes next month, so I don't think we need to support it. As running the tests as many times as we want is free, run them for all supported versions to ensure capa works in all of them.	2020-08-21 15:39:13 +02:00
Ana María Martínez Gómez	ad011b08f6	ci: use matrix in tests workflow to avoid duplication Use a matrix with the Python version to avoid duplication when testing different Python versions.	2020-08-21 15:00:06 +02:00
Capa Bot	b4fa6fc954	Sync capa rules submodule	2020-08-20 10:06:04 +00:00
Willi Ballenthin	585a9c167f	Merge pull request #243 from fireeye/fix-241 fix 241: string counting exception	2020-08-18 12:09:52 -06:00
Willi Ballenthin	5f731f72ed	Merge pull request #239 from fireeye/backport-py3-fixes backport py3 testing enhancements	2020-08-18 12:09:22 -06:00
Willi Ballenthin	385c956184	fixtures: fix doc	2020-08-17 20:53:34 -06:00
Willi Ballenthin	d8f2b7b4df	Merge pull request #236 from fireeye/fix-225 fix 225: declarative tests	2020-08-17 10:06:22 -06:00
Willi Ballenthin	b49ed276a9	Merge pull request #238 from Ana06/zip-binaries Fix build workflow & zip binaries	2020-08-17 07:47:08 -06:00
Ana María Martínez Gómez	a2da55fb6f	Add version number to zip in build workflow Rely on `github.ref` (the release tag).	2020-08-17 11:59:04 +02:00
William Ballenthin	d3dad3a66a	rules: fix bug in string counting closes #241	2020-08-16 21:38:13 -06:00
William Ballenthin	b084f7cb9b	pep8	2020-08-16 05:18:39 -06:00
William Ballenthin	89edaf4c5c	tests: xfail things that won't work on py3	2020-08-16 05:17:17 -06:00
William Ballenthin	6cd2931645	ci: test on both py2 and py3	2020-08-16 05:04:19 -06:00
William Ballenthin	295d3fee5d	tests: limit tests to py2/py3	2020-08-16 05:03:57 -06:00
William Ballenthin	0af6386693	tests: fixtures: add ctxmgr for catching xfail	2020-08-16 05:03:23 -06:00
William Ballenthin	1873d0b7c5	*: py3 compat	2020-08-16 05:03:08 -06:00
William Ballenthin	c032d556fb	tests: freeze: make py3 compatible	2020-08-16 05:02:35 -06:00
William Ballenthin	d7f1c23f4d	tests: show found number of features when unexpected	2020-08-16 05:01:20 -06:00
Ana María Martínez Gómez	f7925c2990	Fix pyinstaller to version 3 in build workflow pyinstaller 4 doesn't support Python 2.7. Without a pinned version, pip takes the latest version, making the workflow fail.	2020-08-15 12:28:51 +02:00
Ana María Martínez Gómez	b94f665d4b	Zip release binaries Update `build` workflow to zip the binaries before uploading them. Use linux to zip all the binaries.	2020-08-15 12:28:48 +02:00
Ana María Martínez Gómez	68f27dfea4	Fix indentation of build workflow Correct indentation to make it easier to read.	2020-08-15 09:11:18 +02:00
Ana María Martínez Gómez	35226e1e4e	Use GitHub default repo token in build action As this workflow modifies only the same repo, no extra token (`CAPA_TOKEN`) is needed and we can use the default `GITHUB_TOKEN` instead.	2020-08-15 09:11:16 +02:00
Capa Bot	9c40befdd3	Sync capa-testfiles submodule	2020-08-14 19:35:00 +00:00
William Ballenthin	c1b7176e36	submodule: testfiles update	2020-08-14 13:34:43 -06:00
William Ballenthin	259a0a2007	tests: ida: remove old print	2020-08-14 13:15:22 -06:00
William Ballenthin	eee565b596	tests: ida: tweak tests to fit IDA behavior	2020-08-14 13:10:38 -06:00
William Ballenthin	26061c25a5	tests: fixtures: add mapping from test data to md5	2020-08-14 12:58:08 -06:00
William Ballenthin	897da4237d	tests: fixtures: remove lru_cache on some accessors	2020-08-14 12:48:19 -06:00
William Ballenthin	1923d479d8	tests: fixtures: fix name error	2020-08-14 12:35:30 -06:00
William Ballenthin	6b8bce4f42	tests: fixtures: factor out resolution of scope/sample	2020-08-14 12:34:00 -06:00
William Ballenthin	107a68628b	tests: ida: attempt to use new framework (wip)	2020-08-14 12:22:59 -06:00
William Ballenthin	26c9811ba1	tests: viv: fix typo preventing some tests from running	2020-08-14 12:22:39 -06:00
William Ballenthin	b784f086b4	tests: make fixtures more consistent in prep for other backends	2020-08-14 12:04:53 -06:00
William Ballenthin	d161c094a6	setup: add backports.lru_cache for py2.7	2020-08-14 11:28:44 -06:00
William Ballenthin	8cbe3f8546	tests: move expected features into fixtures for reuse closes #225	2020-08-14 11:25:00 -06:00
William Ballenthin	0e049ef56d	viv: insn: fix gs extraction	2020-08-14 11:18:19 -06:00
Willi Ballenthin	ac7f079af8	Merge pull request #235 from fireeye/progressbar-tweaks main: progress bar updates (+rules, and realize iterators)	2020-08-14 10:23:43 -06:00
William Ballenthin	5f47280e0d	main: disable spinner when in quiet mode	2020-08-14 10:19:39 -06:00
Capa Bot	b7d39cf4c9	Sync capa rules submodule	2020-08-14 16:02:13 +00:00
William Ballenthin	de2c3c9800	main: display spinner while generating viv workspace	2020-08-14 09:38:08 -06:00
William Ballenthin	6e525a93d7	viv: insn: derefs: fix exception	2020-08-14 09:37:51 -06:00
William Ballenthin	90cdef5232	main: progress bar updates (+rules, and realize iterators)	2020-08-13 17:25:07 -06:00
Capa Bot	e3e13cdb11	Sync capa rules submodule	2020-08-13 18:51:28 +00:00
Willi Ballenthin	db3369fd09	Merge pull request #232 from Ana06/remove-switch extractor: remove characteristic(switch)	2020-08-13 10:07:07 -06:00
Capa Bot	35086d4a69	Sync capa rules submodule	2020-08-13 16:06:21 +00:00
Ana María Martínez Gómez	adaac03d1d	extractor: remove characteristic(switch) Get rid of the `characteristic(switch)` feature as none of our rules use it and its analysis is not very easy. Analysis results most likely differ across backends, leading to inconsistency.	2020-08-13 16:47:01 +02:00
Capa Bot	199cccaef9	Sync capa rules submodule	2020-08-12 23:27:17 +00:00
Capa Bot	e64277ed41	Sync capa-testfiles submodule	2020-08-12 23:26:45 +00:00
Willi Ballenthin	744b4915c9	Merge pull request #226 from fireeye/enhancement-223 IDA: resolve nested data references to strings/bytes	2020-08-12 09:05:11 -06:00
Capa Bot	5d9ccf1f76	Sync capa rules submodule	2020-08-11 21:04:09 +00:00
Capa Bot	15607d63ab	Sync capa-testfiles submodule	2020-08-11 21:03:00 +00:00
Willi Ballenthin	362db6898a	Merge pull request #230 from fireeye/enhancement-immediate-memory-reference-as-number adding support to emit number features for unmapped immediate memory references	2020-08-11 14:59:26 -06:00
Michael Hunhoff	70b4546c33	adding test for unmapped immediate data reference	2020-08-11 14:13:43 -06:00
Michael Hunhoff	791afd7ac8	adding code to emit number feature for unmapped immediate data reference	2020-08-11 14:12:41 -06:00
Capa Bot	6f352283e6	Sync capa-testfiles submodule	2020-08-11 19:36:17 +00:00
Capa Bot	db85fbab4f	Sync capa rules submodule	2020-08-11 14:54:42 +00:00
mike-hunhoff	20cc23adc5	Merge pull request #228 from fireeye/bugfix-explorer-display-arch-decorator explorer: adding support to display arch decorator on numbers/offsets	2020-08-11 07:50:08 -07:00
Michael Hunhoff	828819e13f	switching to iterative solution for data reference search	2020-08-11 08:45:20 -06:00
Michael Hunhoff	79d94144c6	adding IDA extractor code to resolve nested data references for string and bytes features	2020-08-11 08:44:44 -06:00
Michael Hunhoff	c46a1d2b44	black format changes	2020-08-11 08:26:48 -06:00
Capa Bot	7a18fbf9d4	Sync capa rules submodule	2020-08-11 07:19:00 +00:00
Capa Bot	7d62156a29	Sync capa-testfiles submodule	2020-08-11 07:12:56 +00:00
Michael Hunhoff	def8130a24	adding support to display arch decorator on numbers/offsets	2020-08-10 18:27:37 -06:00
Capa Bot	f7cd52826e	Sync capa rules submodule	2020-08-05 18:51:51 +00:00
Capa Bot	23d31c3c2c	Sync capa-testfiles submodule	2020-08-05 18:50:52 +00:00
Willi Ballenthin	732b47e845	changelog: fix @mike-hunhoff handle	2020-08-05 08:20:34 -06:00
Willi Ballenthin	12076eeda2	Merge pull request #222 from fireeye/release-v1.1.0 draft v1.1.0 release	2020-08-05 08:11:08 -06:00
Willi Ballenthin	9af55292ab	changelog: fix feature name	2020-08-04 21:56:54 -06:00
Willi Ballenthin	9943de0746	Merge pull request #219 from fireeye/fix-218 ida: use a local context for cache instead of global	2020-08-04 21:55:50 -06:00
Capa Bot	1c3da73324	Sync capa rules submodule	2020-08-05 03:18:55 +00:00
William Ballenthin	a7484b9dbe	changelog: add download text	2020-08-04 16:28:49 -06:00
William Ballenthin	ea72454d74	init changelog	2020-08-04 16:27:43 -06:00
William Ballenthin	183f533efd	version: bump to v1.1.0	2020-08-04 15:50:13 -06:00
Willi Ballenthin	715c38b4ff	Merge pull request #221 from fireeye/fix-199 setup: bump viv version	2020-08-04 13:07:32 -06:00
William Ballenthin	fd92165f29	setup: bump viv version	2020-08-04 13:06:52 -06:00
William Ballenthin	4bb13d6075	tests: ida: fix offset arch test	2020-08-04 10:35:10 -06:00
William Ballenthin	6aa17782b7	extractors: ida: fix method signature	2020-08-04 10:33:45 -06:00
William Ballenthin	e74b80a318	extractors: ida: add helper method get_function	2020-08-04 10:32:24 -06:00
William Ballenthin	f993efb8f4	extractors: ida: cache data using shared context not globals attempts to close #218	2020-08-04 10:23:47 -06:00