workflows

The workflows library provides tools for running a series of tasks in a directed acyclic graph (DAG).

Workflows parses tasks and creates a DAG from files in a folder. Each file has a YAML header which identifies the type of task and its upstream dependencies.
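As a rough illustration of how upstream dependencies determine execution order, the sketch below uses Python's standard-library `graphlib`. It is not the library's own implementation: the task file names and the `upstream` mapping are hypothetical stand-ins for whatever each file's YAML header actually declares.

```python
from graphlib import TopologicalSorter

# Hypothetical task files mapped to the upstream tasks they depend on,
# standing in for the dependencies declared in each file's YAML header.
upstream = {
    "extract.sql": [],
    "clean.sql": ["extract.sql"],
    "report.py": ["clean.sql"],
}


def execution_order(deps):
    """Return task names ordered so each task comes after its upstream tasks."""
    # TopologicalSorter accepts {node: predecessors} and raises
    # graphlib.CycleError if the dependencies do not form a DAG.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(upstream)
print(order)
```

A cycle in the declared dependencies would raise `graphlib.CycleError` here, which is why the tasks must form an acyclic graph before anything can run.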

Workflows can execute tasks from the command line or be integrated with other tools such as Airflow or Kubernetes. It can also run validations on task outputs and includes helpers for writing tests.

An optional scheduler can trigger DAGs on a cron schedule. The scheduler runs alongside a task manager that executes the tasks. Two task manager implementations are included: a long-running worker suitable for use on a server, and a Kubernetes manager that runs tasks as Pods. The scheduler requires a Postgres database to store state.

Table of Contents