Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail tasks in scheduler when executor reports they failed #15929

Merged

Conversation

ephraimbuddy
Copy link
Contributor

When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler

This can be reproduced with CeleryExecutor in breeze

  1. Start breeze without redis integration
  2. start the webserver and scheduler
  3. Run a dag e.g example_bash_operator.py
  4. The CeleryExecutor will raise exceptions and fails the task
  5. Notice that the tasks remain queued

^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label May 18, 2021
@ephraimbuddy ephraimbuddy force-pushed the set-state-to-executor-reportedd-state branch from 32d8c0e to 4f73511 Compare May 18, 2021 20:59
airflow/jobs/scheduler_job.py Outdated Show resolved Hide resolved
@ephraimbuddy ephraimbuddy force-pushed the set-state-to-executor-reportedd-state branch from 4f73511 to 5e35b83 Compare May 18, 2021 22:27
@ephraimbuddy ephraimbuddy reopened this May 19, 2021
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label May 19, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler
@ephraimbuddy ephraimbuddy force-pushed the set-state-to-executor-reportedd-state branch from 5e35b83 to 9f402e2 Compare May 20, 2021 08:14
@ephraimbuddy ephraimbuddy merged commit deececc into apache:master May 20, 2021
@ephraimbuddy ephraimbuddy deleted the set-state-to-executor-reportedd-state branch May 20, 2021 10:22
@ephraimbuddy ephraimbuddy added this to the Airflow 2.1.2 milestone Jun 26, 2021
ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Jun 26, 2021
When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler
@ashb ashb modified the milestones: Airflow 2.1.2, Airflow 2.1.3 Jul 7, 2021
jhtimmins pushed a commit that referenced this pull request Aug 5, 2021
When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler

(cherry picked from commit deececc)
jhtimmins pushed a commit that referenced this pull request Aug 13, 2021
When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler

(cherry picked from commit deececc)
kaxil pushed a commit that referenced this pull request Aug 17, 2021
When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler

(cherry picked from commit deececc)
jhtimmins pushed a commit that referenced this pull request Aug 17, 2021
When a task fails in executor while still queued in scheduler, the executor reports
this failure but scheduler doesn't change the task state resulting in the task
being queued until the scheduler is restarted. This commit fixes it by ensuring
that when a task is reported to have failed in the executor, the task is failed
in scheduler

(cherry picked from commit deececc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Scheduler including HA (high availability) scheduler full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
  翻译: