mirror of
https://github.com/mandiant/capa.git
synced 2026-03-12 21:23:12 -07:00
* loader: skip PE files with unrealistically large section virtual sizes

  Some malformed PE samples declare section virtual sizes orders of magnitude
  larger than the file itself (e.g. a ~400 KB file with a 900 MB section).
  vivisect attempts to map these regions, causing unbounded CPU and memory
  consumption (see #1989).

  Add _is_probably_corrupt_pe() which uses pefile (fast_load=True) to check
  whether any section's Misc_VirtualSize exceeds max(file_size * 128, 512 MB).
  If the check fires, get_workspace() raises CorruptFile before vivisect is
  invoked, keeping the existing exception handling path consistent.

  Thresholds are intentionally conservative to avoid false positives on large
  but legitimate binaries. When pefile is unavailable the helper returns False
  and behaviour is unchanged.

  Fixes #1989.

* changelog: add entry for #1989 corrupt PE large sections

* loader: apply Gemini review improvements

  - Extend corrupt-PE check to FORMAT_AUTO so malformed PE files cannot bypass
    the guard when format is auto-detected (the helper returns False for
    non-PE files so there is no false-positive risk).
  - Replace magic literals 128 and 512*1024*1024 with named constants
    _VSIZE_FILE_RATIO and _MAX_REASONABLE_VSIZE for clarity.
  - Remove redundant int() cast around getattr(Misc_VirtualSize); keep the
    `or 0` guard for corrupt files where pefile may return None.
  - Extend test to cover FORMAT_AUTO path alongside FORMAT_PE.

* tests: remove mock-only corrupt PE test per maintainer request

  williballenthin noted the test doesn't add real value since it only
  exercises the mock, not the actual heuristic. Removing it per feedback.

* fix: resolve flake8 NIC002 implicit string concat and add missing test

  Fix the implicit string concatenation across multiple lines that caused
  code_style CI to fail. Also add the
  test_corrupt_pe_with_unrealistic_section_size_short_circuits test that was
  described in the PR body but not committed.
61 lines
1.9 KiB
Python
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pathlib import Path
from unittest.mock import patch

import pytest
import envi.exc

from capa.loader import CorruptFile, get_workspace
from capa.features.common import FORMAT_PE, FORMAT_ELF


def test_segmentation_violation_handling():
    """
    Test that SegmentationViolation from vivisect is caught and
    converted to a CorruptFile exception.

    See #2794.
    """
    fake_path = Path("/tmp/fake_malformed.elf")

    with patch("viv_utils.getWorkspace") as mock_workspace:
        mock_workspace.side_effect = envi.exc.SegmentationViolation(
            0x30A4B8BD60,
        )

        with pytest.raises(CorruptFile, match="Invalid memory access"):
            get_workspace(fake_path, FORMAT_ELF, [])


def test_corrupt_pe_with_unrealistic_section_size_short_circuits():
    """
    Test that a PE with an unrealistically large section virtual size
    is caught early and raises CorruptFile before vivisect is invoked.

    See #1989.
    """
    fake_path = Path("/tmp/fake_corrupt.exe")

    with (
        patch("capa.loader._is_probably_corrupt_pe", return_value=True),
        patch("viv_utils.getWorkspace") as mock_workspace,
    ):
        with pytest.raises(CorruptFile, match="unrealistically large sections"):
            get_workspace(fake_path, FORMAT_PE, [])

        # vivisect should never have been called
        mock_workspace.assert_not_called()