Apache Airflow project
This page contains the details of a technical writing project accepted for
Google Season of Docs.
Project summary
- Open source organization: Apache Airflow
- Technical writer: kartik khare
- Project name: How to create a workflow
- Project length: Standard length (3 months)
Project description
I’ll be creating documentation on how to create new workflows easily and effectively.
A workflow typically involves the following steps:
- Read
- Pre-processing
- Processing
- Post-processing
- Save/Action
- Monitoring
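As a rough illustration of how the steps above chain together, here is a minimal sketch in plain Python. It does not use Airflow's API. The stage functions, the toy data, and the `log` list standing in for monitoring are all hypothetical and exist only for this example.

```python
from typing import Dict, List


# Hypothetical stage functions; names and toy data are illustrative only.
def read() -> List[int]:
    """Read: load the raw input."""
    return [3, 1, 2]


def pre_process(rows: List[int]) -> List[int]:
    """Pre-processing: put the input into a predictable order."""
    return sorted(rows)


def process(rows: List[int]) -> List[int]:
    """Processing: apply the main transformation."""
    return [r * 10 for r in rows]


def post_process(rows: List[int]) -> int:
    """Post-processing: reduce the rows to a single summary value."""
    return sum(rows)


def save(total: int, store: Dict[str, int]) -> None:
    """Save/Action: persist the result (here, into an in-memory dict)."""
    store["total"] = total


def run_workflow() -> Dict[str, object]:
    """Run every stage in order and record each finished stage for monitoring."""
    log: List[str] = []
    store: Dict[str, int] = {}

    rows = read()
    log.append("read")
    rows = pre_process(rows)
    log.append("pre_process")
    rows = process(rows)
    log.append("process")
    total = post_process(rows)
    log.append("post_process")
    save(total, store)
    log.append("save")

    return {"store": store, "log": log}


result = run_workflow()
```

In a real workflow each stage would usually be one or more separate tasks, so that a failure can be traced to a specific step.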
Each step can involve multiple tasks, and a multitude of actions can be taken after each step, such as aborting the job if 2 or more tasks fail in a stage, or re-running a task if it fails at least 2 times.
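The retry and abort rules can be sketched in plain Python as follows. This is only an illustration under assumptions: `run_with_retries`, `run_stage`, `StageAborted`, and the default thresholds are hypothetical names and values chosen for this example, not part of Airflow.

```python
from typing import Callable, Dict, List


class StageAborted(Exception):
    """Raised when too many tasks in one stage have failed."""


def run_with_retries(task: Callable[[], object], max_retries: int = 2) -> object:
    """Call `task`; if it raises, retry it up to `max_retries` more times."""
    failures = 0
    while True:
        try:
            return task()
        except Exception:
            failures += 1
            if failures > max_retries:
                raise  # retries exhausted, surface the last error


def run_stage(
    tasks: Dict[str, Callable[[], object]],
    max_failed: int = 2,
    max_retries: int = 2,
) -> Dict[str, object]:
    """Run every task in a stage; abort once `max_failed` tasks have failed."""
    results: Dict[str, object] = {}
    failed: List[str] = []
    for name, task in tasks.items():
        try:
            results[name] = run_with_retries(task, max_retries)
        except Exception:
            failed.append(name)
            if len(failed) >= max_failed:
                raise StageAborted(f"aborting stage, failed tasks: {failed}")
    return results
```

In Airflow itself, retry counts are normally configured on the task (for example through the `retries` argument) rather than hand-written like this; the sketch only shows the logic.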
Another part of a workflow is executing 2 or more jobs in parallel and then using their combined result in the next stage.
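A minimal sketch of this fan-out and combine pattern, using Python's standard `concurrent.futures` module rather than Airflow. The `count_words` job and the choice of summing the results are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def count_words(text: str) -> int:
    """Hypothetical job: count the whitespace-separated words in one input."""
    return len(text.split())


def run_parallel_then_combine(inputs: List[str]) -> int:
    """Run one job per input in parallel, then combine the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # pool.map returns results in the same order as `inputs`.
        counts = list(pool.map(count_words, inputs))
    # The combined result is what the next stage would consume.
    return sum(counts)


total_words = run_parallel_then_combine(["a b c", "d e"])
```

The next stage starts only after every parallel job has finished, which is why the combine step sits outside the `with` block.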
Another aspect of a workflow is alerting the user if anything goes wrong, through email, Slack, or PagerDuty.
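One way to picture alerting is a wrapper that calls a list of notifier functions when a task fails. The sketch below is plain Python with hypothetical names (`run_with_alerts`, `Notifier`); the notifiers are ordinary callables, and no real email, Slack, or PagerDuty client is used.

```python
from typing import Callable, List

# A notifier is any callable that accepts the alert message.
Notifier = Callable[[str], None]


def run_with_alerts(task: Callable[[], object], notifiers: List[Notifier]) -> object:
    """Run `task`; if it raises, send the error to every notifier and re-raise."""
    try:
        return task()
    except Exception as exc:
        message = f"Workflow task failed: {exc!r}"
        for notify in notifiers:
            notify(message)
        raise  # keep the failure visible to the caller after alerting
```

For example, passing `[sent.append]` with `sent = []` collects the alert messages in a list, which is handy for testing. Airflow exposes a similar hook on tasks through the `on_failure_callback` argument.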
I also plan to include some non-trivial ways in which workflows can be used, such as running real-time streaming jobs and restarting them when data is missing in downstream Kafka topics.
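A minimal sketch of such a check, under a loud assumption: "missing data" is interpreted here as "no new record has arrived within a maximum allowed lag". The function names, the 300-second default, and the injected callables are hypothetical, and no Kafka client is involved.

```python
from typing import Callable


def needs_restart(last_record_ts: float, now: float, max_lag_seconds: float = 300.0) -> bool:
    """Return True when the newest downstream record is older than the allowed lag."""
    return (now - last_record_ts) > max_lag_seconds


def monitor_once(
    get_last_record_ts: Callable[[], float],
    restart_job: Callable[[], None],
    now: float,
    max_lag_seconds: float = 300.0,
) -> bool:
    """Check the downstream topic once and restart the streaming job if it looks stale.

    `get_last_record_ts` and `restart_job` are supplied by the caller; in practice
    they would wrap a Kafka consumer and a job launcher respectively.
    """
    if needs_restart(get_last_record_ts(), now, max_lag_seconds):
        restart_job()
        return True
    return False
```

A scheduled workflow could run a check like this periodically, so that the streaming job is restarted without manual intervention.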
I’ll be working with mentors to refine the scope of the project and then complete the tasks from there.
Looking forward to an amazing few months ahead.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-11-08 UTC.