SerializedDagNotFound: DAG not found in serialized_dag table #18843
Comments
We have removed the SerializedDagNotFound error in #18554 |
Great news |
I'm experiencing a similar issue. If a task is in the |
cc @kaxil |
Can confirm this kind of behavior on our deployment as well. 2.1.4, 3.9 - the scheduler went down after a DAG file was deleted from the DAGs folder. The error messages changed from: airflow.exceptions.SerializedDagNotFound: DAG 'some_dag' not found in serialized_dag table to AttributeError: 'NoneType' object has no attribute 'dag_id'. We are also seeing ~10x higher CPU usage from a single scheduler on 2.1.4 compared with 2.1.2 (same number of DAGs, same settings), but this seems unrelated, and we will probably open a new issue for it if it has not already been answered. Posting it just in case someone else experiences the same. |
I believe we already fixed this in #18554 (therefore in 2.2.0). cc @ephraimbuddy |
Not sure if this qualifies as a separate bug, but I think it might be related. I'm facing it after upgrading from 2.1.2 to 2.2.0. Deployment details
How to Reproduce: Restart (roll out) the scheduler while a task is running for dag_id = dag_1, dag_run_1.
Error: Scheduler goes into CrashLoopBackOff with error -
Full Stacktrace for an actual Dag:
At this stage, every dag_run that is in the running state will cause this issue. The scheduler can be resurrected by deleting all dag_runs which are in the running state:
|
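The commenter's actual command did not survive in this thread, so the following is only a minimal sketch of the workaround described above. It assumes the Airflow metadata database has a `dag_run` table with a `state` column, and it runs against a toy in-memory SQLite table rather than a real deployment; the function name `delete_running_dag_runs` and the sample rows are hypothetical. Deleting rows is destructive, so against a real database a backup would be prudent first.

```python
import sqlite3


def delete_running_dag_runs(conn):
    """Delete every dag_run row whose state is 'running'; return the row count."""
    cur = conn.execute("DELETE FROM dag_run WHERE state = 'running'")
    conn.commit()
    return cur.rowcount


# Toy stand-in for the metadata database (assumption: simplified columns only).
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, state TEXT);
    INSERT INTO dag_run VALUES ('dag_1', 'dag_run_1', 'running');
    INSERT INTO dag_run VALUES ('dag_1', 'dag_run_0', 'success');
    INSERT INTO dag_run VALUES ('dag_2', 'dag_run_7', 'running');
    """
)

deleted = delete_running_dag_runs(demo_conn)
remaining = demo_conn.execute(
    "SELECT dag_id, run_id, state FROM dag_run"
).fetchall()
print(deleted, remaining)
```

Only the finished run remains afterwards, which matches the described effect of clearing the runs that make the scheduler crash on startup.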
The error is definitely different and should be reported separately IMO. The SerializedDagNotFound: DAG not found in serialized_dag table error specifically should already have been fixed. |
@kaxil I'm doing the 2.2.0 upgrade next week, I will let you know if behavior changes. |
@kaxil Hi, yesterday we upgraded to 2.2.1, so I tested it by deleting a DAG file during runtime:
Scheduler survived - so it seems we are good here. Thank you! |
Awesome, glad to hear that. Thanks for the update |
Apache Airflow version
2.1.4 (latest released)
Operating System
Linux 5.4.149-73.259.amzn2.x86_64
Versions of Apache Airflow Providers
No response
Deployment
Other 3rd-party Helm chart
Deployment details
AWS EKS over own helm chart
What happened
We have had this issue since 2.0.x (see #13504).
Each time the scheduler is restarted, it deletes all DAGs from the serialized_dag table and tries to serialize them again from scratch. Afterwards the scheduler pod fails with the error:
causing all DAGs to be absent from the serialized_dag table
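One way to confirm this symptom is to compare the `dag` table with the `serialized_dag` table. The sketch below assumes both tables carry a `dag_id` column, as in the Airflow metadata database, but it runs against a toy in-memory SQLite schema; the helper name `missing_serialized_dags` and the sample rows are hypothetical.

```python
import sqlite3


def missing_serialized_dags(conn):
    """Return dag_ids that exist in `dag` but have no row in `serialized_dag`."""
    rows = conn.execute(
        """
        SELECT d.dag_id
        FROM dag AS d
        LEFT JOIN serialized_dag AS s ON s.dag_id = d.dag_id
        WHERE s.dag_id IS NULL
        ORDER BY d.dag_id
        """
    ).fetchall()
    return [row[0] for row in rows]


# Toy stand-in for the metadata database (assumption: simplified columns only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag (dag_id TEXT PRIMARY KEY);
    CREATE TABLE serialized_dag (dag_id TEXT PRIMARY KEY, data TEXT);
    INSERT INTO dag VALUES ('dag_1'), ('dag_2');
    INSERT INTO serialized_dag VALUES ('dag_1', '{}');
    """
)

print(missing_serialized_dags(conn))
```

In the situation reported here, every known DAG would show up in the result after a scheduler restart.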
What you expected to happen
Scheduler shouldn't fail
How to reproduce
Restart the scheduler pod
Observe its failure
Open a DAG in the webserver
Observe an error
Anything else
The issue goes away temporarily after I run the "serialize" script from the webserver pod, but it returns at the next scheduler restart.
Are you willing to submit PR?
Code of Conduct