Apache Airflow Installation on Linux / Ubuntu / CentOS

To install Apache Airflow on a Linux machine, you need Python 3.6 or later and a few other dependencies. The steps below outline the installation:

  1. Install Python: If Python is not already installed on your machine, install it either from the official Python website (https://www.python.org/downloads/) or with your Linux distribution's package manager. Make sure to install Python 3.6 or later, as earlier versions are not supported by Airflow. Newer Airflow releases require a more recent Python than 3.6, so check the requirements of the Airflow version you plan to install.

  2. Install virtualenv: virtualenv is a tool that creates isolated Python environments. Installing Airflow inside one keeps Airflow and its dependencies separate from your other Python packages. To install virtualenv, open a terminal and run the following command:

pip install virtualenv
  3. Create a virtual environment: In a terminal, navigate to the directory where you want to create the environment, then run the following command:
virtualenv airflow-env
  4. Activate the virtual environment: To activate the virtual environment, run the following command:
source airflow-env/bin/activate
  5. Install Airflow: With the virtual environment active, install Airflow and its dependencies by running the following command. The quotes keep shells such as zsh from interpreting the square brackets:
pip install 'apache-airflow[postgres,mysql]'

This installs the apache-airflow package together with the optional extras for PostgreSQL and MySQL support.

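  6. Initialize the Airflow database: By default, Airflow uses a SQLite database to store metadata about your DAGs and their executions. To initialize the database, run the following command (this is the Airflow 1.10 syntax; on Airflow 2.x the equivalent command is airflow db init):
airflow initdb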
  7. Start the Airflow web server: To start the web server, run the following command:
airflow webserver
  8. Start the Airflow scheduler: To start the scheduler, run the following command:
airflow scheduler

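Both commands run in the foreground, so start the web server and the scheduler in separate terminals, each with the virtual environment activated. Once both are running, you should be able to access the Airflow web UI at http://localhost:8080.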
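
The steps above can also be collected into a single shell script. The following is a minimal sketch rather than an official installer: the names ENV_DIR, DRY_RUN, version_ok, run, and install_airflow are made up for this example, and it assumes python3 and pip are already available. Instead of activating the environment, it calls the executables inside the environment's bin directory directly, which has the same effect inside a script. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the installation steps as one script (illustrative, not an official installer).
set -eu

ENV_DIR="${ENV_DIR:-airflow-env}"  # hypothetical variable: virtual environment directory
DRY_RUN="${DRY_RUN:-1}"            # hypothetical variable: 1 = print only, 0 = execute

# Return success if a version string such as "3.8" or "3.6.9" is at least 3.6.
version_ok() {
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
}

# Print a command, then execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

install_airflow() {
  # Ask the interpreter for its own "major.minor" version.
  pyver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if ! version_ok "$pyver"; then
    echo "Python 3.6 or later is required, found $pyver" >&2
    return 1
  fi

  run python3 -m pip install virtualenv
  run python3 -m virtualenv "$ENV_DIR"
  # Calling the environment's own executables replaces "source .../bin/activate".
  run "$ENV_DIR/bin/pip" install 'apache-airflow[postgres,mysql]'
  # Airflow 1.10 syntax, as used above; Airflow 2.x uses "airflow db init".
  run "$ENV_DIR/bin/airflow" initdb

  echo "Next, in two separate terminals:"
  echo "  $ENV_DIR/bin/airflow webserver"
  echo "  $ENV_DIR/bin/airflow scheduler"
}

# Only act when called with the argument "install".
if [ "${1:-}" = "install" ]; then
  install_airflow
fi
```

Saved under a hypothetical name such as setup_airflow.sh, running sh setup_airflow.sh install prints the commands without executing them, and DRY_RUN=0 sh setup_airflow.sh install executes them. The dry run is the default so that the script can be reviewed before it changes anything on the machine.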
