Airflow

Airflow can be integrated with Unravel by installing the Unravel Airflow agent on the same instance as the Airflow instance, or on an instance that has network access to the Airflow host. Airflow pushes its metadata to the agent. The agent pulls additional information such as logs, if necessary, consolidates all the DAG information, and forwards it to Unravel.

After integrating, you can view the details of the Airflow pipelines from Pipeline Health & Observability.

Do the following to integrate Airflow with Unravel:

  1. Download and install the Unravel Airflow agent on the same instance as the Airflow instance, or on an instance that Airflow can reach over the network.

    1. Download the airflow-agent.tar.gz file.

    2. Extract the airflow-agent.tar.gz file.

      tar -zxf airflow-agent.tar.gz
  2. As the unravel user, do the following before you run the Unravel Airflow agent:

    1. Go to the location where the Unravel Airflow agent is installed. For example: /home/unravel/airflow-agent

    2. Under $AGENT_HOME/src/sample_airflow_configuration/config, open the folder that matches the Python version of your Airflow application and copy the airflow_local_settings.py file from it. The following folders are listed:

      • greater_than_py3.6

      • lesser_equalto_py3.6

      For example, if Airflow is running with Python 3.8, copy the file from $AGENT_HOME/src/sample_airflow_configuration/config/greater_than_py3.6 to $AIRFLOW_HOME/config with the following command:

      cp $AGENT_HOME/src/sample_airflow_configuration/config/greater_than_py3.6/airflow_local_settings.py $AIRFLOW_HOME/config

      Note

      If the Unravel Airflow agent is on a different instance than your Airflow application, open the copied file $AIRFLOW_HOME/config/airflow_local_settings.py and edit line number 14:

      airflow_agent_url = "http://localhost:5002"

      Change localhost to the hostname of the instance where the Unravel Airflow agent is running, as in the example below.
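
      For illustration, assuming the agent runs on a host named unravel-agent-host (a hypothetical name) and listens on the default port 5002, the edited line would read as follows.

      # Line 14 of $AIRFLOW_HOME/config/airflow_local_settings.py
      # Points the task callbacks at the host running the Unravel Airflow agent.
      airflow_agent_url = "http://unravel-agent-host:5002"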

    3. Open the airflow.cfg file, which is located in the AIRFLOW_HOME folder, and edit the value of the auth_backends key. For Airflow to support API calls, set auth_backends to airflow.api.auth.backend.basic_auth, as shown in the sample entry after the following note.

      Note

      From Airflow version 2.3.0, the auth_backends key supports multiple comma-separated values.
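
      For reference, a minimal sketch of the relevant entry in airflow.cfg, assuming the key sits in its usual [api] section:

      [api]
      # Basic authentication lets the Unravel Airflow agent call the Airflow REST API
      auth_backends = airflow.api.auth.backend.basic_auth

      On Airflow 2.3.0 and later you can keep the session backend alongside it, for example: auth_backends = airflow.api.auth.backend.basic_auth,airflow.api.auth.backend.session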

    4. Restart the Airflow application. The DAGs start sending callbacks during each task execution.

    5. Open $AGENT_HOME/src/config.py and edit the following configurations, which are needed for making API calls to Airflow. A sample of the edited values is shown after this list.

      • airflow_url: specify the URL where the Airflow webserver is running.

      • airflow_username: specify your Airflow application username.

      • airflow_password: specify your Airflow application password.
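
      For illustration, with hypothetical values, the edited entries in $AGENT_HOME/src/config.py could look like this (only the three keys above are shown):

      # URL of the Airflow webserver the agent calls for additional DAG details
      airflow_url = "http://airflow-webserver-host:8080"
      # Credentials of an Airflow user that is allowed to use the REST API
      airflow_username = "airflow_admin"
      airflow_password = "your-airflow-password"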

  3. Run the Unravel Airflow agent. Go to $AGENT_HOME/src and execute the start.sh file; an optional way to keep it running in the background is shown after the command.

    bash start.sh
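
    Optionally, to keep the agent running after you close the terminal session, you can start it in the background instead; this assumes a standard Linux shell:

    nohup bash start.sh > airflow-agent.out 2>&1 &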
  4. Download Pipeline Health & Observability from here and install it. Refer to Installing Pipeline Health and Observability.

    From the Pipeline Health & Observability interface, you can now view the pipeline run information.