Move min airflow version to 2.3.0 for all providers (#27196)
As of October 11 our providers are supposed to be compatible with
Airflow 2.3+ and all code for backwards compatibility with Airflow 2.2
can be removed now.
potiuk authored Oct 24, 2022
1 parent ffc3548 commit 78b8ea2
Showing 121 changed files with 442 additions and 794 deletions.
34 changes: 12 additions & 22 deletions .github/workflows/ci.yml
@@ -884,32 +884,22 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
         run: |
           pipx install twine
           twine check dist/*.whl
-      - name: "Remove airflow package and replace providers with 2.2-compliant versions"
+      - name: "Remove airflow package and replace providers with 2.3-compliant versions"
         run: |
           rm -vf dist/apache_airflow-*.whl \
-            dist/apache_airflow_providers_cncf_kubernetes*.whl \
-            dist/apache_airflow_providers_celery*.whl
+            dist/apache_airflow_providers_docker*.whl
           pip download --no-deps --dest dist \
-            apache-airflow-providers-cncf-kubernetes==3.0.0 \
-            apache-airflow-providers-celery==2.1.3
-      - name: "Install and test provider packages and airflow on Airflow 2.2 files"
+            apache-airflow-providers-docker==3.1.0
+      - name: "Get all provider extras as AIRFLOW_EXTRAS evn variable"
         run: >
-          breeze release-management verify-provider-packages --use-airflow-version 2.2.0
-          --use-packages-from-dist --package-format wheel --airflow-constraints-reference constraints-2.2.0
-        env:
-          # The extras below are all extras that should be installed with Airflow 2.2.0
-          AIRFLOW_EXTRAS: "airbyte,alibaba,amazon,apache.atlas,apache.beam,apache.cassandra,apache.drill,\
-          apache.druid,apache.hdfs,apache.hive,apache.kylin,apache.livy,apache.pig,apache.pinot,\
-          apache.spark,apache.sqoop,apache.webhdfs,asana,async,\
-          celery,cgroups,cloudant,cncf.kubernetes,dask,databricks,datadog,\
-          deprecated_api,dingding,discord,docker,\
-          elasticsearch,exasol,facebook,ftp,github_enterprise,google,google_auth,\
-          grpc,hashicorp,http,imap,influxdb,jdbc,jenkins,jira,kerberos,ldap,\
-          leveldb,microsoft.azure,microsoft.mssql,microsoft.psrp,microsoft.winrm,mongo,mysql,\
-          neo4j,odbc,openfaas,opsgenie,oracle,pagerduty,pandas,papermill,password,plexus,\
-          postgres,presto,qubole,rabbitmq,redis,salesforce,samba,segment,sendgrid,sentry,\
-          sftp,singularity,slack,snowflake,sqlite,ssh,statsd,tableau,telegram,trino,vertica,\
-          virtualenv,yandex,zendesk"
+          python -c 'from pathlib import Path; import json;
+          providers = json.loads(Path("generated/provider_dependencies.json").read_text());
+          provider_keys = ",".join(providers.keys());
+          print("AIRFLOW_EXTRAS={}".format(provider_keys))' >> $GITHUB_ENV
+      - name: "Install and test provider packages and airflow on Airflow 2.3 files"
+        run: >
+          breeze release-management verify-provider-packages --use-airflow-version 2.3.0
+          --use-packages-from-dist --package-format wheel --airflow-constraints-reference constraints-2.3.0
       - name: "Fix ownership"
        run: breeze ci fix-ownership
        if: always()
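
The new "Get all provider extras as AIRFLOW_EXTRAS" step replaces the hand-maintained extras list with a lookup of generated/provider_dependencies.json. A minimal standalone sketch of what the inline python -c script does, assuming that file maps provider ids (e.g. "amazon", "apache.hive") to their dependency metadata:

    import json
    from pathlib import Path

    # Read the generated mapping of provider id -> dependency metadata.
    providers = json.loads(Path("generated/provider_dependencies.json").read_text())

    # The provider ids double as Airflow extras, so joining the keys reproduces
    # the extras list that used to be spelled out by hand in ci.yml.
    provider_keys = ",".join(providers.keys())

    # In CI this line is appended to $GITHUB_ENV; here it is just printed.
    print("AIRFLOW_EXTRAS={}".format(provider_keys))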
7 changes: 3 additions & 4 deletions .pre-commit-config.yaml
@@ -360,10 +360,9 @@ repos:
         pass_filenames: false
         entry: ./scripts/ci/pre_commit/pre_commit_check_setup_extra_packages_ref.py
         additional_dependencies: ['rich>=12.4.4']
-        # This check might be removed when min-airflow-version in providers is 2.2
-      - id: check-airflow-2-2-compatibility
-        name: Check that providers are 2.2 compatible.
-        entry: ./scripts/ci/pre_commit/pre_commit_check_2_2_compatibility.py
+      - id: check-airflow-provider-compatibility
+        name: Check compatibility of Providers with Airflow
+        entry: ./scripts/ci/pre_commit/pre_commit_check_provider_airflow_compatibility.py
         language: python
         pass_filenames: true
         files: ^airflow/providers/.*\.py$
4 changes: 2 additions & 2 deletions README.md
@@ -407,8 +407,8 @@ that we increase the minimum Airflow version, when 12 months passed since the
 first release for the MINOR version of Airflow.
 
 For example this means that by default we upgrade the minimum version of Airflow supported by providers
-to 2.3.0 in the first Provider's release after 11th of October 2022 (11th of October 2021 is the date when the
-first `PATCHLEVEL` of 2.2 (2.2.0) has been released.
+to 2.4.0 in the first Provider's release after 30th of April 2023. The 30th of April 2022 is the date when the
+first `PATCHLEVEL` of 2.3 (2.3.0) has been released.
 
 Providers are often connected with some stakeholders that are vitally interested in maintaining backwards
 compatibilities in their integrations (for example cloud providers, or specific service providers). But,
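
The README paragraph above encodes a simple date rule: a provider may raise its Airflow floor once 12 months have passed since the first release of the next MINOR version. A small illustrative sketch using the dates quoted in the README (the helper itself is not part of the repository):

    from datetime import date, timedelta

    def min_version_bump_date(minor_release_date: date) -> date:
        """Providers may drop an Airflow MINOR version 12 months after its first release."""
        return minor_release_date + timedelta(days=365)

    # Airflow 2.3.0 was first released on 30th of April 2022 (per the README),
    # so the floor can move to 2.4.0 in the first provider release after:
    print(min_version_bump_date(date(2022, 4, 30)))  # 2023-04-30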
4 changes: 2 additions & 2 deletions STATIC_CODE_CHECKS.rst
@@ -138,10 +138,10 @@ require Breeze Docker image to be build locally.
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
 | blacken-docs | Run black on python code blocks in documentation files | |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
-| check-airflow-2-2-compatibility | Check that providers are 2.2 compatible. | |
-+--------------------------------------------------------+------------------------------------------------------------------+---------+
 | check-airflow-config-yaml-consistent | Checks for consistency between config.yml and default_config.cfg | |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
+| check-airflow-provider-compatibility | Check compatibility of Providers with Airflow | |
++--------------------------------------------------------+------------------------------------------------------------------+---------+
 | check-apache-license-rat | Check if licenses are OK for Apache | |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
 | check-base-operator-partial-arguments | Check BaseOperator and partial() arguments | |
2 changes: 1 addition & 1 deletion airflow/operators/email.py
@@ -19,7 +19,7 @@
 
 from typing import Any, Sequence
 
-from airflow.models import BaseOperator
+from airflow.models.baseoperator import BaseOperator
 from airflow.utils.context import Context
 from airflow.utils.email import send_email
 
2 changes: 1 addition & 1 deletion airflow/providers/airbyte/provider.yaml
@@ -33,7 +33,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-http
 
 integrations:
2 changes: 1 addition & 1 deletion airflow/providers/alibaba/provider.yaml
@@ -31,7 +31,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - oss2>=2.14.0
 
 integrations:
21 changes: 4 additions & 17 deletions airflow/providers/amazon/aws/links/base_aws.py
@@ -17,7 +17,6 @@
 # under the License.
 from __future__ import annotations
 
-from datetime import datetime
 from typing import TYPE_CHECKING, ClassVar
 
 from airflow.models import BaseOperatorLink, XCom
@@ -63,30 +62,18 @@ def format_link(self, **kwargs) -> str:
 
     def get_link(
         self,
-        operator,
-        dttm: datetime | None = None,
-        ti_key: TaskInstanceKey | None = None,
+        operator: BaseOperator,
+        *,
+        ti_key: TaskInstanceKey,
     ) -> str:
         """
         Link to Amazon Web Services Console.
 
         :param operator: airflow operator
         :param ti_key: TaskInstance ID to return link for
-        :param dttm: execution date. Uses for compatibility with Airflow 2.2
         :return: link to external system
         """
-        if ti_key is not None:
-            conf = XCom.get_value(key=self.key, ti_key=ti_key)
-        elif not dttm:
-            conf = {}
-        else:
-            conf = XCom.get_one(
-                key=self.key,
-                dag_id=operator.dag.dag_id,
-                task_id=operator.task_id,
-                execution_date=dttm,
-            )
-
+        conf = XCom.get_value(key=self.key, ti_key=ti_key)
         return self.format_link(**conf) if conf else ""
 
     @classmethod
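
With the Airflow 2.2 fallback gone, get_link always receives ti_key as a keyword argument and reads the stored payload with XCom.get_value. A rough sketch of a concrete link class on top of this base (the class name, key and console URL are invented; BASE_AWS_CONSOLE_LINK is assumed to be the URL prefix defined in this module):

    from airflow.providers.amazon.aws.links.base_aws import BASE_AWS_CONSOLE_LINK, BaseAwsLink

    class DemoEc2InstanceLink(BaseAwsLink):
        """Hypothetical extra link pointing at an EC2 instance in the AWS console."""

        name = "Demo EC2 Instance"
        key = "demo_ec2_instance"
        format_str = BASE_AWS_CONSOLE_LINK + "/ec2/v2/home?region={region_name}#InstanceDetails:instanceId={instance_id}"

    # On Airflow 2.3+ the webserver resolves the link with the task-instance key only:
    #     DemoEc2InstanceLink().get_link(operator, ti_key=ti.key)
    # get_link() fetches whatever dict was pushed to XCom under ``key`` and feeds it
    # into format_str via format_link(); an empty payload yields an empty URL.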
12 changes: 3 additions & 9 deletions airflow/providers/amazon/aws/operators/appflow.py
@@ -24,7 +24,7 @@
 from airflow.models import BaseOperator
 from airflow.operators.python import ShortCircuitOperator
 from airflow.providers.amazon.aws.hooks.appflow import AppflowHook
-from airflow.providers.amazon.aws.utils import datetime_to_epoch_ms, get_airflow_version
+from airflow.providers.amazon.aws.utils import datetime_to_epoch_ms
 
 if TYPE_CHECKING:
     from mypy_boto3_appflow.type_defs import (
@@ -400,7 +400,7 @@ class AppflowRecordsShortCircuitOperator(ShortCircuitOperator):
 
     :param flow_name: The flow name
     :param appflow_run_task_id: Run task ID from where this operator should extract the execution ID
-    :param ignore_downstream_trigger_rules: Ignore downstream trigger rules (Ignored for Airflow < 2.3)
+    :param ignore_downstream_trigger_rules: Ignore downstream trigger rules
     :param aws_conn_id: aws connection to use
     :param region: aws region to use
     """
@@ -417,19 +417,13 @@ def __init__(
         region: str | None = None,
         **kwargs,
     ) -> None:
-        if get_airflow_version() >= (2, 3):
-            kwargs["ignore_downstream_trigger_rules"] = ignore_downstream_trigger_rules
-        else:
-            self.log.warning(
-                "Ignoring argument ignore_downstream_trigger_rules (%s) - Only supported for Airflow >= 2.3",
-                ignore_downstream_trigger_rules,
-            )
         super().__init__(
             python_callable=self._has_new_records_func,
             op_kwargs={
                 "flow_name": flow_name,
                 "appflow_run_task_id": appflow_run_task_id,
             },
+            ignore_downstream_trigger_rules=ignore_downstream_trigger_rules,
            **kwargs,
         )
         self.aws_conn_id = aws_conn_id
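
Because the provider now requires Airflow 2.3+, ignore_downstream_trigger_rules is passed straight through to ShortCircuitOperator instead of being gated on the running Airflow version. A rough usage sketch (the DAG id, flow name and task ids are invented for illustration):

    import pendulum
    from airflow import DAG
    from airflow.providers.amazon.aws.operators.appflow import AppflowRecordsShortCircuitOperator

    with DAG(
        dag_id="appflow_records_example",  # hypothetical DAG
        start_date=pendulum.datetime(2022, 10, 1, tz="UTC"),
        schedule_interval=None,
        catchup=False,
    ):
        # Skip downstream tasks when the referenced Appflow run produced no records.
        # With the 2.3.0 floor this flag is always honoured instead of being dropped
        # with a warning on older Airflow versions.
        has_new_records = AppflowRecordsShortCircuitOperator(
            task_id="has_new_records",
            flow_name="example-salesforce-flow",  # hypothetical flow
            appflow_run_task_id="run_flow",       # task that triggered the flow run
            ignore_downstream_trigger_rules=True,
            aws_conn_id="aws_default",
        )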
6 changes: 1 addition & 5 deletions airflow/providers/amazon/aws/operators/redshift_sql.py
@@ -20,7 +20,6 @@
 from typing import Sequence
 
 from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
-from airflow.www import utils as wwwutils
 
 
 class RedshiftSQLOperator(SQLExecuteQueryOperator):
@@ -46,10 +45,7 @@ class RedshiftSQLOperator(SQLExecuteQueryOperator):
         "redshift_conn_id",
     )
     template_ext: Sequence[str] = (".sql",)
-    # TODO: Remove renderer check when the provider has an Airflow 2.3+ requirement.
-    template_fields_renderers = {
-        "sql": "postgresql" if "postgresql" in wwwutils.get_attr_renderer() else "sql"
-    }
+    template_fields_renderers = {"sql": "postgresql"}
 
     def __init__(self, *, redshift_conn_id: str = "redshift_default", **kwargs) -> None:
         super().__init__(conn_id=redshift_conn_id, **kwargs)
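
template_fields_renderers only controls syntax highlighting of rendered templates in the web UI; the removed TODO comments imply the dialect-specific "postgresql", "mysql" and "tsql" lexers were added in Airflow 2.3, which is why the runtime wwwutils check can be dropped here and in the Hive transfer operators below. A tiny sketch of the pattern in a custom operator (the operator is hypothetical):

    from airflow.models.baseoperator import BaseOperator

    class MyTransferOperator(BaseOperator):  # hypothetical example operator
        template_fields = ("sql", "mysql_preoperator")
        template_ext = (".sql",)
        # Safe to name the dialect-specific lexers unconditionally on Airflow 2.3+.
        template_fields_renderers = {"sql": "tsql", "mysql_preoperator": "mysql"}

        def execute(self, context):
            pass  # no-op; only the class attributes matter for this sketch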
10 changes: 1 addition & 9 deletions airflow/providers/amazon/aws/utils/connection_wrapper.py
@@ -29,15 +29,7 @@
 from airflow.providers.amazon.aws.utils import trim_none_values
 from airflow.utils.log.logging_mixin import LoggingMixin
 from airflow.utils.log.secrets_masker import mask_secret
-
-try:
-    from airflow.utils.types import NOTSET, ArgNotSet
-except ImportError:  # TODO: Remove when the provider has an Airflow 2.3+ requirement.
-
-    class ArgNotSet:  # type: ignore[no-redef]
-        """Sentinel type for annotations, useful when None is not viable."""
-
-    NOTSET = ArgNotSet()
+from airflow.utils.types import NOTSET, ArgNotSet
 
 if TYPE_CHECKING:
     from airflow.models.connection import Connection  # Avoid circular imports.
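
With a 2.3+ floor the provider imports the NOTSET sentinel and its ArgNotSet type directly instead of shipping a local fallback. A short sketch of the pattern this sentinel enables, where None is itself a meaningful value (the function below is illustrative, not provider code):

    from __future__ import annotations

    from airflow.utils.types import NOTSET, ArgNotSet

    def resolve_region(region_name: str | None | ArgNotSet = NOTSET) -> str | None:
        """Distinguish "argument not passed" from an explicit None."""
        if isinstance(region_name, ArgNotSet):
            return "us-east-1"  # hypothetical default used when nothing was passed
        return region_name      # the caller's explicit value, including None

    resolve_region()             # -> "us-east-1"
    resolve_region(None)         # -> None, i.e. explicitly "no region"
    resolve_region("eu-west-1")  # -> "eu-west-1"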
2 changes: 1 addition & 1 deletion airflow/providers/amazon/provider.yaml
@@ -46,7 +46,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-common-sql>=1.3.0
   - boto3>=1.15.0
   # watchtower 3 has been released end Jan and introduced breaking change across the board that might
2 changes: 1 addition & 1 deletion airflow/providers/apache/beam/provider.yaml
@@ -35,7 +35,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-beam>=2.39.0
 
 integrations:
2 changes: 1 addition & 1 deletion airflow/providers/apache/cassandra/provider.yaml
@@ -33,7 +33,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - cassandra-driver>=3.13.0
 
 integrations:
2 changes: 1 addition & 1 deletion airflow/providers/apache/drill/provider.yaml
@@ -33,7 +33,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-common-sql>=1.3.0
   - sqlalchemy-drill>=1.1.0
 
2 changes: 1 addition & 1 deletion airflow/providers/apache/druid/provider.yaml
@@ -40,7 +40,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-common-sql>=1.2.0
   - pydruid>=0.4.1
 
2 changes: 1 addition & 1 deletion airflow/providers/apache/hdfs/provider.yaml
@@ -37,7 +37,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - snakebite-py3
   - hdfs[avro,dataframe,kerberos]>=2.0.4
 
4 changes: 1 addition & 3 deletions airflow/providers/apache/hive/operators/hive.py
@@ -103,13 +103,11 @@ def __init__(
         self.mapred_queue_priority = mapred_queue_priority
         self.mapred_job_name = mapred_job_name
 
-        job_name_template = conf.get(
+        job_name_template = conf.get_mandatory_value(
             "hive",
             "mapred_job_name_template",
             fallback="Airflow HiveOperator task for {hostname}.{dag_id}.{task_id}.{execution_date}",
         )
-        if job_name_template is None:
-            raise ValueError("Job name template should be set !")
         self.mapred_job_name_template: str = job_name_template
 
         # assigned lazily - just for consistency we can create the attribute with a
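
Several call sites in this commit (HiveOperator above, SparkSubmitHook further down) switch from conf.get plus a manual None check to conf.get_mandatory_value, which the code can rely on once Airflow 2.3 is the minimum. A small sketch of the difference as I read the 2.3+ config API (the guard in the old pattern existed mainly to narrow the str | None return type):

    from airflow.configuration import conf

    # Old pattern: get() is typed as returning str | None, so callers added a guard.
    template = conf.get("hive", "mapred_job_name_template", fallback=None)
    if template is None:
        raise ValueError("Job name template should be set !")

    # New pattern: get_mandatory_value() returns str and raises ValueError itself
    # if the resolved value (config entry or fallback) is None.
    template = conf.get_mandatory_value(
        "hive",
        "mapred_job_name_template",
        fallback="Airflow HiveOperator task for {hostname}.{dag_id}.{task_id}.{execution_date}",
    )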
2 changes: 1 addition & 1 deletion airflow/providers/apache/hive/provider.yaml
@@ -42,7 +42,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-common-sql>=1.2.0
   - hmsclient>=0.1.0
   - pandas>=0.17.1
8 changes: 2 additions & 6 deletions airflow/providers/apache/hive/transfers/hive_to_mysql.py
@@ -25,14 +25,10 @@
 from airflow.providers.apache.hive.hooks.hive import HiveServer2Hook
 from airflow.providers.mysql.hooks.mysql import MySqlHook
 from airflow.utils.operator_helpers import context_to_airflow_vars
-from airflow.www import utils as wwwutils
 
 if TYPE_CHECKING:
     from airflow.utils.context import Context
 
-# TODO: Remove renderer check when the provider has an Airflow 2.3+ requirement.
-MYSQL_RENDERER = "mysql" if "mysql" in wwwutils.get_attr_renderer() else "sql"
-
 
 class HiveToMySqlOperator(BaseOperator):
     """
@@ -64,8 +60,8 @@ class HiveToMySqlOperator(BaseOperator):
     template_ext: Sequence[str] = (".sql",)
     template_fields_renderers = {
         "sql": "hql",
-        "mysql_preoperator": MYSQL_RENDERER,
-        "mysql_postoperator": MYSQL_RENDERER,
+        "mysql_preoperator": "mysql",
+        "mysql_postoperator": "mysql",
     }
     ui_color = "#a0e08c"
 
4 changes: 1 addition & 3 deletions airflow/providers/apache/hive/transfers/mssql_to_hive.py
@@ -28,7 +28,6 @@
 from airflow.models import BaseOperator
 from airflow.providers.apache.hive.hooks.hive import HiveCliHook
 from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
-from airflow.www import utils as wwwutils
 
 if TYPE_CHECKING:
     from airflow.utils.context import Context
@@ -66,8 +65,7 @@ class MsSqlToHiveOperator(BaseOperator):
 
     template_fields: Sequence[str] = ("sql", "partition", "hive_table")
     template_ext: Sequence[str] = (".sql",)
-    # TODO: Remove renderer check when the provider has an Airflow 2.3+ requirement.
-    template_fields_renderers = {"sql": "tsql" if "tsql" in wwwutils.get_attr_renderer() else "sql"}
+    template_fields_renderers = {"sql": "tsql"}
     ui_color = "#a0e08c"
 
     def __init__(
2 changes: 1 addition & 1 deletion airflow/providers/apache/kylin/provider.yaml
@@ -32,7 +32,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - kylinpy>=2.6
 
 integrations:
2 changes: 1 addition & 1 deletion airflow/providers/apache/livy/provider.yaml
@@ -35,7 +35,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-http
 
 integrations:
2 changes: 1 addition & 1 deletion airflow/providers/apache/pig/provider.yaml
@@ -32,7 +32,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
 
 integrations:
   - integration-name: Apache Pig
2 changes: 1 addition & 1 deletion airflow/providers/apache/pinot/provider.yaml
@@ -35,7 +35,7 @@ versions:
   - 1.0.0
 
 dependencies:
-  - apache-airflow>=2.2.0
+  - apache-airflow>=2.3.0
   - apache-airflow-providers-common-sql>=1.2.0
   - pinotdb>0.4.7
 
4 changes: 1 addition & 3 deletions airflow/providers/apache/spark/hooks/spark_submit.py
@@ -627,9 +627,7 @@ def on_kill(self) -> None:
             # we still attempt to kill the yarn application
             renew_from_kt(self._principal, self._keytab, exit_on_fail=False)
             env = os.environ.copy()
-            ccacche = airflow_conf.get("kerberos", "ccache")
-            if ccacche is None:
-                raise ValueError("The kerberos/ccache config should be set here!")
+            ccacche = airflow_conf.get_mandatory_value("kerberos", "ccache")
             env["KRB5CCNAME"] = ccacche
 
             with subprocess.Popen(