Changelog#

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.
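
The UPath I/O manager change above resolves a path conflict. The sketch below assumes a layout in which a non-partitioned asset is stored as a single file, and a partitioned asset needs a directory at that same path with one file per partition. It uses only the standard library, and `prepare_partition_path` is a hypothetical helper, not part of the I/O manager's API:

```python
import tempfile
from pathlib import Path


def prepare_partition_path(asset_path: Path, partition_key: str) -> Path:
    """Return the file path for one partition, clearing stale non-partitioned data."""
    # A plain file at asset_path is left over from the non-partitioned asset.
    # It blocks creating a directory with the same name, so delete it first.
    if asset_path.is_file():
        asset_path.unlink()
    asset_path.mkdir(parents=True, exist_ok=True)
    return asset_path / partition_key


with tempfile.TemporaryDirectory() as tmp:
    asset_path = Path(tmp) / "my_asset"
    asset_path.write_bytes(b"old non-partitioned data")

    target = prepare_partition_path(asset_path, "2024-01-01")
    target.write_bytes(b"partition data")

    became_directory = asset_path.is_dir()
    stored = sorted(p.name for p in asset_path.iterdir())
```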

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
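
The "Overdue" rule for observable source assets can be read as a comparison between the observed data time and an allowed lag. This is a rough sketch of that comparison only, assuming a policy expressed as a maximum lag in minutes and ignoring any cron-based component. `is_overdue` is a hypothetical helper, not a Dagster function:

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time: datetime, now: datetime, maximum_lag_minutes: float) -> bool:
    """True when the observed data time is older than the allowed lag."""
    return now - data_time > timedelta(minutes=maximum_lag_minutes)


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)

# Data observed 30 minutes ago is within a 60-minute lag.
fresh = is_overdue(datetime(2024, 3, 1, 11, 30, tzinfo=timezone.utc), now, 60)

# Data observed 3 hours ago exceeds a 60-minute lag.
stale = is_overdue(datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc), now, 60)
```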

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling Kubernetes services if the agent was interrupted while it was being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
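
Lexicographical ordering of partition keys compares them as strings, character by character. The sketch below uses made-up keys to show what that means in practice: zero-padded date keys come out in chronological order, while unpadded numeric keys do not come out in numeric order.

```python
# Zero-padded date keys: string order matches chronological order.
partition_keys = ["2024-01-10", "2024-01-02", "2024-01-01"]
zero_padded_order = sorted(partition_keys)

# Unpadded numeric keys: "10" sorts before "2" because "1" < "2".
unpadded_keys = ["10", "9", "2"]
unpadded_order = sorted(unpadded_keys)
```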

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.
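
For the context.log.exception fix above, the expected behavior is easiest to see with the standard library's own `Logger.exception`, which attaches the active exception and its traceback to the log record. The sketch below uses a plain `logging.Logger` as a stand-in and assumes context.log follows the same semantics. The logger name and message are illustrative:

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("sensor_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

try:
    {}["missing"]
except KeyError:
    # Called inside the except block, so the active exception is attached.
    logger.exception("Sensor evaluation failed")

output = stream.getvalue()
```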

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Added specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O managers versus resources more prominent.

1.4.12 / 0.20.12 (libraries)#

New#

  • The context object now has an asset_key property to get the AssetKey of the current asset.
  • Performance improvements to the auto-materialize daemon when running on large asset graphs.
  • The dagster dev and dagster-daemon run commands now include a --log-level argument that allows you to customize the logger level threshold.
  • [dagster-airbyte] AirbyteResource now includes a poll_interval key that allows you to configure how often it checks an Airbyte sync’s status.

Bugfixes#

  • Fixed an issue where the dagster scheduler would sometimes raise an error if a schedule set its cron_schedule to a list of strings and also had its default status set to AUTOMATICALLY_RUNNING.
  • Fixed an issue where the auto-materialize daemon would sometimes raise a RecursionError when processing asset graphs with long upstream dependency chains.
  • [ui] Fixed an issue where the Raw Compute Logs dropdown on the Run page sometimes didn’t show the current step name or properly account for retried steps.

Community Contributions#

  • [dagster-databricks] Fixed a regression causing DatabricksStepLauncher to fail. Thanks @zyd14!
  • Fixed an issue where Dagster raised an exception when combining observable source assets with multiple partitions definitions. Thanks @aroig!
  • [dagster-databricks] Added support for client authentication with OAuth. Thanks @zyd14!
  • [dagster-databricks] Added support for workspace and volumes init scripts in the databricks client. Thanks @zyd14!
  • Fixed a missing import in our docs. Thanks @C0DK!

Experimental#

  • Asset checks are now displayed in the asset graph and sidebar.

  • [Breaking] Asset check severity is now set at runtime on AssetCheckResult instead of in the @asset_check definition. Now you can define one check that either errors or warns depending on your check logic. ERROR severity no longer causes the run to fail. We plan to reintroduce this functionality with a different API.

  • [Breaking] @asset_check now requires the asset= argument, even if the asset is passed as an input to the decorated function. Example:

    @asset_check(asset=my_asset)
    def my_check(my_asset) -> AssetCheckResult:
        ...
    
  • [Breaking] AssetCheckSpec now takes asset= instead of asset_key=, and can accept either a key or an asset definition.

  • [Bugfix] Asset checks now work on assets with key_prefix set.

  • [Bugfix] Execution failure asset checks are now displayed correctly on the checks tab.

Documentation#

  • [dagster-dbt] Added example of invoking DbtCliResource in custom asset/op to API docs.
  • [dagster-dbt] Added reference to explain how a dbt manifest can be created at run time or build time.
  • [dagster-dbt] Added reference to outline the steps required to deploy a Dagster and dbt project in CI/CD.
  • Miscellaneous fixes to broken links and typos.

1.4.11 / 0.20.11 (libraries)#

New#

  • Dagster code servers now wait to shut down until any calls that they are running have finished, preventing them from stopping while in the middle of executing sensor ticks or other long-running operations.
  • The dagster job execute CLI now accepts --op-selection (thanks @silent-lad!)
  • [ui] Option (Alt) + R now reloads all code locations (OSS only)
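
The code server shutdown change follows a general drain-before-exit pattern: stop accepting new work, then wait for in-flight calls to finish. The sketch below shows that pattern with a standard-library thread pool. It is not Dagster's server implementation, and `long_running_call` is a hypothetical stand-in for a sensor tick:

```python
import time
from concurrent.futures import ThreadPoolExecutor

completed = []


def long_running_call(name: str) -> None:
    time.sleep(0.2)  # stands in for a sensor tick or other slow operation
    completed.append(name)


executor = ThreadPoolExecutor(max_workers=2)
executor.submit(long_running_call, "sensor_tick")

# wait=True blocks until in-flight calls finish instead of abandoning them.
executor.shutdown(wait=True)
```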

Bugfixes#

  • Adds a check to validate partition mappings when directly constructing AssetsDefinition instances.
  • Assets invoked in composition functions like @graph and @job now work again, fixing a regression introduced in 1.4.5.
  • Fixed an issue where a race condition with parallel runs materializing the same asset could cause a run to raise a RecursionError during execution.
  • Fixed an issue where including a resource in both a schedule and a job raised a “Cannot specify resource requirements” exception when the definitions were loaded.
  • The ins argument to graph_asset is now respected correctly.
  • Fixed an issue where the daemon process could sometimes stop with a heartbeat failure when the first sensor it ran took a long time to execute.
  • Fixed an issue where dagster dev failed on startup when the DAGSTER_GRPC_PORT environment variable was set.
  • deps arguments for an asset can now be specified as an iterable instead of a sequence, allowing for sets to be passed.
  • [dagster-aws] Fixed a bug where the S3PickleIOManager didn’t correctly handle missing partitions when allow_missing_partitions was set. Thanks @o-sirawat!
  • [dagster-k8s] In the Helm chart, the daemon securityContext setting now applies correctly to all init containers. Thanks @maowerner!
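
The deps change above hinges on the difference between an iterable and a sequence. A set can be looped over, but it has no ordering or indexing, so it does not qualify as a sequence. A short standard-library check, using made-up dependency names:

```python
from collections.abc import Iterable, Sequence

deps = {"upstream_a", "upstream_b"}

# A set supports iteration but not indexing, so it is not a Sequence.
set_is_sequence = isinstance(deps, Sequence)
set_is_iterable = isinstance(deps, Iterable)

# Converting to a list yields a Sequence.
list_is_sequence = isinstance(list(deps), Sequence)
```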

Community Contributions#

  • [dagster-databricks] Migrated to use new official databricks Python SDK. Thanks @judahrand!

Experimental#

  • New APIs for defining and executing checks on software-defined assets. These APIs are very early and subject to change. The corresponding UI has limited functionality.
  • Adds a new auto-materialize skip rule AutoMaterializeRule.skip_on_not_all_parents_updated that enforces that an asset can only be materialized if all parents have been materialized since the asset's last materialization.
  • Exposed an auto-materialize skip rule, AutoMaterializeRule.skip_on_parent_missing, which is already part of the behavior of the default auto-materialize policy.
  • Auto-materialize evaluation history will now be stored for 1 month, instead of 1 week.
  • The auto-materialize asset daemon now includes more logs about what it’s doing for each asset in each tick in the Dagster Daemon process output.
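
The condition behind AutoMaterializeRule.skip_on_not_all_parents_updated can be sketched as a timestamp comparison: every parent must have a materialization newer than the asset's own last materialization. The code below is an illustration of that condition only, not the rule's implementation. `all_parents_updated` is a hypothetical helper, and treating a never-materialized parent as "not updated" is an assumption:

```python
from datetime import datetime
from typing import Mapping, Optional


def all_parents_updated(
    asset_last_materialized: Optional[datetime],
    parent_last_materialized: Mapping[str, Optional[datetime]],
) -> bool:
    """True when every parent was materialized after the asset last was."""
    for parent_time in parent_last_materialized.values():
        if parent_time is None:
            return False  # assumption: a missing parent counts as not updated
        if asset_last_materialized is not None and parent_time <= asset_last_materialized:
            return False
    return True


asset_time = datetime(2024, 1, 2)

# One parent is older than the asset, so materialization would be skipped.
partially_updated = all_parents_updated(
    asset_time, {"a": datetime(2024, 1, 3), "b": datetime(2024, 1, 1)}
)

# Both parents are newer than the asset, so the skip condition does not apply.
fully_updated = all_parents_updated(
    asset_time, {"a": datetime(2024, 1, 3), "b": datetime(2024, 1, 4)}
)
```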

Documentation#

  • [dagster-dbt] Added reference docs for dagster-dbt project scaffold.

Dagster Cloud#

  • Fixed an issue where the Docker agent would sometimes fail to load code locations with long names with a hostname connection error.

1.4.10 / 0.20.10 (libraries)#

Bugfixes#

  • [dagster-webserver] Fixed an issue that broke loading static files on Windows.

1.4.9 / 0.20.9 (libraries)#

Bugfixes#

  • [dagster-webserver] Fixed an issue that caused some missing icons in the UI.

1.4.8 / 0.20.8 (libraries)#

New#

  • A new @partitioned_config decorator has been added for defining configuration for partitioned jobs. Thanks @danielgafni!
  • [dagster-aws] The ConfigurablePickledObjectS3IOManager has been renamed S3PickleIOManager for simplicity. The ConfigurablePickledObjectS3IOManager will continue to be available but is considered deprecated in favor of S3PickleIOManager. There is no change in the functionality of the I/O manager.
  • [dagster-azure] The ConfigurablePickledObjectADLS2IOManager has been renamed ADLS2PickleIOManager for simplicity. The ConfigurablePickledObjectADLS2IOManager will continue to be available but is considered deprecated in favor of ADLS2PickleIOManager. There is no change in the functionality of the I/O manager.
  • [dagster-dbt] When an exception is raised when invoking a dbt command using DbtCliResource, the exception message now includes a link to the dbt.log produced. This log file can be inspected for debugging.
  • [dagster-gcp] The ConfigurablePickledObjectGCSIOManager has been renamed GCSPickleIOManager for simplicity. The ConfigurablePickledObjectGCSIOManager will continue to be available but is considered deprecated in favor of GCSPickleIOManager. There is no change in the functionality of the I/O manager.

Bugfixes#

  • Fixed a bug that caused a DagsterInvariantViolationError when executing a multi-asset where both assets have self-dependencies on earlier partitions.
  • Fixed an asset backfill issue where some runs continued to be submitted after cancellation of the backfill was requested.
  • [dagster-dbt] Fixed an issue where using the --debug flag raised an exception in the Dagster framework.
  • [ui] “Launched run” and “Launched backfill” toasts in the Dagster UI now behave the same way. To open in a new tab, hold the cmd/ctrl key when clicking “View”.
  • [ui] When opening step compute logs, the view defaults to stderr which aligns with Python’s logging defaults.
  • [ui] When viewing a global asset graph with more than 100 assets, the “choose a subset to display” prompt is correctly aligned to the query input.
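
The stderr default for step compute logs mirrors the standard library: a `logging.StreamHandler` created without a stream argument writes to `sys.stderr`. A quick check:

```python
import logging
import sys

handler = logging.StreamHandler()  # no stream argument given
default_is_stderr = handler.stream is sys.stderr
```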

Community Contributions#

  • Fix for loading assets with a BackfillPolicy, thanks @ruizh22!

Experimental#

  • [dagster-graphql] The Dagster GraphQL Python client now includes a default timeout of 300 seconds for each query, so that GraphQL requests cannot hang indefinitely without returning a response. If a query is expected to take longer than 300 seconds, set the timeout argument when constructing a DagsterGraphQLClient.
  • [ui] We are continuing to improve the new horizontal rendering of the asset graph, which you can enable in Settings. This release increases spacing between nodes and improves the traceability of arrows on the graph.

Documentation#

  • Several Pythonic resources and I/O managers now have API docs entries.
  • Updated the tutorial’s example project and content to be more explicit about resources.
  • [dagster-dbt] Added API docs examples for DbtCliResource and DbtCliResource.cli(...).
  • Some code samples in API docs for InputContext and OutputContext have been fixed. Thanks @Sergey Mezentsev!

Dagster Cloud#

  • When setting up a new organization by importing a dbt project, using GitLab is now supported.