Changelog#

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
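
The "Overdue" rule for observable source assets described above can be pictured as a simple age comparison. The sketch below is only an illustration under stated assumptions: `is_overdue` and its `maximum_lag` parameter are hypothetical names, not part of the Dagster API, and the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time: datetime, now: datetime, maximum_lag: timedelta) -> bool:
    """Return True when the newest data_time is older than the allowed lag.

    Hypothetical helper: it mirrors the rule in the release note, not
    Dagster's internal implementation.
    """
    return now - data_time > maximum_lag


# Illustrative values only.
now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 3, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old
stale = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)    # 3 hours old
lag = timedelta(hours=1)

print(is_overdue(fresh, now, lag))
print(is_overdue(stale, now, lag))
```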

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling Kubernetes services if the agent was interrupted while it was being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
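
Lexicographical order of partition keys, mentioned in the first item above, is plain string ordering, which can differ from numeric ordering. A minimal sketch, assuming partition keys are ordinary strings; `submission_order` is a hypothetical helper, not a Dagster function:

```python
def submission_order(partition_keys):
    """Return partition keys in lexicographical (string) order."""
    return sorted(partition_keys)


# Zero-padded date keys sort chronologically under string ordering.
date_keys = ["2024-01-03", "2024-01-01", "2024-01-02"]
print(submission_order(date_keys))

# Unpadded numeric suffixes do not: "part_10" sorts before "part_2".
numbered_keys = ["part_10", "part_2", "part_1"]
print(submission_order(numbered_keys))
```

If the order of submitted runs matters, zero-padding numeric partition keys keeps string order and numeric order aligned.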

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Added specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.

0.13.0 "Get the Party Started"#

Major Changes#

  • The job, op, and graph APIs now represent the stable core of the system, and replace pipelines, solids, composite solids, modes, and presets as Dagster’s core abstractions. All of Dagster’s documentation - tutorials, examples, table of contents - is in terms of these new core APIs. Pipelines, modes, presets, solids, and composite solids are still supported, but are now considered “Legacy APIs”. We will maintain backwards compatibility with the legacy APIs for some time; however, we believe the new APIs represent an elegant foundation for Dagster going forward. As time goes on, we will be adding new features that only apply to the new core. All in all, the new APIs provide increased clarity - they unify related concepts, make testing more lightweight, and simplify operational workflows in Dagit. For comprehensive instructions on how to transition to the new APIs, refer to the migration guide.
  • Dagit has received a complete makeover. This includes a refresh to the color palette and general design patterns, as well as functional changes that make common Dagit workflows more elegant. These changes are designed to go hand in hand with the new set of core APIs to represent a stable core for the system going forward.
  • You no longer have to pass a context object around to do basic logging. Many updates have been made to our logging system to make it more compatible with the Python logging module. You can now capture logs produced by standard Python loggers, set a global Python log level, and set Python log handlers that will be applied to every log message emitted from the Dagster framework. Check out the docs here!
  • The Dagit “playground” has been renamed to the Dagit “launchpad”. This reflects a vision of the tool closer to how our users actually interact with it - not just as a testing/development tool, but also as a first-class starting point for many one-off workflows.
  • Introduced a new integration with Microsoft Teams, which includes a connection resource and support for sending messages to Microsoft Teams. See details in the API Docs (thanks @iswariyam!).
  • Intermediate storages, which were deprecated in 0.10.0, have now been removed. Refer to the “Deprecation: Intermediate Storage” section of the 0.10.0 release notes for how to use IOManagers instead.
  • The pipeline-level event types in the run log have been renamed so that the PIPELINE prefix has been replaced with RUN. For example, the PIPELINE_START event is now the RUN_START event.
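
For code that matches on event type names from the run log, the PIPELINE-to-RUN rename in the last item above could be handled with a small translation step. This is a sketch only: `to_run_event_name` is a hypothetical helper, and `STEP_START` is used purely as an example of a name without the PIPELINE prefix.

```python
OLD_PREFIX = "PIPELINE_"
NEW_PREFIX = "RUN_"


def to_run_event_name(event_type: str) -> str:
    """Map a legacy PIPELINE_* event name to its RUN_* equivalent.

    Names without the legacy prefix are returned unchanged.
    """
    if event_type.startswith(OLD_PREFIX):
        return NEW_PREFIX + event_type[len(OLD_PREFIX):]
    return event_type


print(to_run_event_name("PIPELINE_START"))  # the example given in the release note
print(to_run_event_name("STEP_START"))      # no legacy prefix, left as-is
```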

New since 0.12.15#

  • Added the get_dagster_logger function, which creates a Python logger whose output messages will be captured and converted into Dagster log messages.

Community Contributions#

  • The run_config attribute is now available on ops/solids built using the build_op_context or build_solid_context functions. Thanks @jiafi!
  • Limit configuration of applyLimitPerUniqueValue in k8s environments. Thanks @cvb!
  • Fix for a solid’s return statement in the intro tutorial. Thanks @dbready!
  • Fix for a bug with output keys in the s3_pickle_io_manager. Thanks @jiafi!

Breaking Changes#

  • We have renamed a lot of our GraphQL Types to reflect our emphasis on the new job/op/graph APIs. We have made the existing types backwards compatible so that GraphQL fragments should still work. However, if you are making custom GraphQL requests to your Dagit webserver, you may need to change your code to handle the new types.
  • We have paired our GraphQL changes with changes to our Python GraphQL client. If you have upgraded the version of your Dagit instance, you will most likely also want to upgrade the version of your Python GraphQL client.

Improvements#

  • Solid, op, pipeline, job, and graph descriptions that are inferred from docstrings now have leading whitespace stripped out.
  • Improvements to how we cache and store step keys should speed up dynamic workflows with many dynamic outputs significantly.
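
To illustrate the docstring change in the first item above: an indented docstring carries its indentation into `__doc__` unless it is cleaned up. The sketch below uses the standard library's `inspect.cleandoc` to show the effect; it is an assumption that this resembles what Dagster does internally, and `describe` and `my_op` are hypothetical names.

```python
import inspect


def describe(fn):
    """Return fn's docstring with indentation and surrounding blank lines removed."""
    doc = fn.__doc__
    return inspect.cleandoc(doc) if doc else None


def my_op():
    """
        Loads the raw table.

        Runs once per day.
    """


# The description no longer starts with a newline or leading spaces.
print(repr(describe(my_op)))
```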

Bugfixes#

  • Fixed a bug where kwargs could not be used to set the context when directly invoking a solid, e.g. my_solid(context=context_obj).
  • Fixed a bug where celery-k8s config did not work in the None case:
execution:
  celery-k8s:
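
The None case in the fix above arises because a YAML key with nothing after the colon, as in the fragment shown, parses to None rather than to an empty mapping. A minimal sketch of guarding against that, with a plain dict standing in for the parsed run config; `executor_config` is a hypothetical helper and not Dagster's actual implementation:

```python
def executor_config(run_config: dict, executor_name: str) -> dict:
    """Return the executor's config mapping, treating a None value as empty."""
    execution = run_config.get("execution") or {}
    # A bare `celery-k8s:` key parses to None, so fall back to an empty dict.
    return execution.get(executor_name) or {}


# What a YAML parser would typically produce for the fragment above.
parsed = {"execution": {"celery-k8s": None}}
print(executor_config(parsed, "celery-k8s"))

# A populated entry is returned unchanged (the inner keys are illustrative).
populated = {"execution": {"celery-k8s": {"config": {"example_key": "example_value"}}}}
print(executor_config(populated, "celery-k8s"))
```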

Experimental#

  • Removed the lakehouse library, whose functionality is subsumed by @asset and build_assets_job in Dagster core.

Documentation#

  • Removed the trigger_pipeline example, which was not referenced in docs.
  • dagster-mlflow APIs have been added to API docs.

0.12.15#

Community Contributions#

  • You can now configure credentials for the GCSComputeLogManager using a string or environment variable instead of passing a path to a credentials file. Thanks @silentsokolov!
  • Fixed a bug in the dagster-dbt integration that caused the DBT RPC solids not to retry when they received errors from the server. Thanks @cdchan!
  • Improved helm schema for the QueuedRunCoordinator config. Thanks @cvb!

Bugfixes#

  • Fixed a bug where dagster instance migrate would run out of memory when migrating over long run histories.

Experimental#

  • Fixed broken links in the Dagit workspace table view for the experimental software-defined assets feature.

0.12.14#

Community Contributions#

  • Updated click version, thanks @ashwin153!
  • Typo fix, thanks @geoHeil!

Bugfixes#

  • Fixed a bug in dagster_aws.s3.sensor.get_s3_keys that would return no keys if an invalid s3 key was provided.
  • Fixed a bug with capturing python logs where statements of the form my_log.info("foo %s", "bar") would cause errors in some scenarios.
  • Fixed a bug where the scheduler would sometimes hang during fall Daylight Saving Time transitions when Pendulum 2 was installed.
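
The log statement form mentioned in the second fix above, my_log.info("foo %s", "bar"), relies on the standard library's deferred %-formatting: the arguments are stored on the log record and merged into the message only when a handler asks for it. A standard-library-only sketch of that behavior follows; `ListHandler` and the logger name are hypothetical, and no Dagster code is involved.

```python
import logging


class ListHandler(logging.Handler):
    """Collect fully formatted messages in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() applies record.args to record.msg ("foo %s" % "bar").
        self.messages.append(record.getMessage())


my_log = logging.getLogger("my_log_example")
my_log.setLevel(logging.INFO)
my_log.propagate = False  # keep the example's output out of the root logger
handler = ListHandler()
my_log.addHandler(handler)

my_log.info("foo %s", "bar")
print(handler.messages)
```

Any code that captures log records needs to call `getMessage()` (or otherwise apply `record.args`) rather than reading `record.msg` alone, since `record.msg` still contains the unformatted "foo %s".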

Experimental#

  • Dagit now uses an asset graph to represent jobs built using build_assets_job. The asset graph shows each node in the job’s graph with metadata about the asset it corresponds to - including asset materializations. It also contains links to upstream jobs that produce assets consumed by the job, as well as downstream jobs that consume assets produced by the job.
  • Fixed a bug in load_assets_from_dbt_project and load_assets_from_dbt_manifest that would cause runs to fail if no runtime_metadata_fn argument was supplied.
  • Fixed a bug that caused @asset not to infer the type of inputs and outputs from type annotations of the decorated function.
  • @asset now accepts a compute_kind argument. You can supply values like “spark”, “pandas”, or “dbt”, and see them represented as a badge on the asset in the Dagit asset graph.

0.12.13#

Community Contributions#

  • Changed VersionStrategy.get_solid_version and VersionStrategy.get_resource_version to take in a SolidVersionContext and ResourceVersionContext, respectively. This gives VersionStrategy access to the config (in addition to the definition object) when determining the code version for memoization. (Thanks @RBrossard!).

    Note: This is a breaking change for anyone using the experimental VersionStrategy API. Instead of directly being passed solid_def and resource_def, you should access them off of the context object using context.solid_def and context.resource_def respectively.

New#

  • [dagster-k8s] When launching a pipeline using the K8sRunLauncher or k8s_job_executor, you can now specify a list of volumes to be mounted in the created pod. See the API docs for more information.
  • [dagster-k8s] When specifying a list of environment variables to be included in a pod using custom configuration, you can now specify the full set of parameters allowed by a V1EnvVar in Kubernetes.

Bugfixes#

  • Fixed a bug where mapping inputs through nested composite solids incorrectly caused validation errors.
  • Fixed a bug in Dagit, where WebSocket reconnections sometimes led to logs being duplicated on the Run page.
  • Fixed a bug in Dagit, where log views that were scrolled all the way down would not auto-scroll as new logs came in.

0.12.12#

Community Contributions#

  • [dagster-msteams] Introduced a new integration with Microsoft Teams, which includes a connection resource and support for sending messages to Microsoft Teams. See details in the API Docs (thanks @iswariyam!).
  • Fixed a mistake in the sensors docs (thanks @vitorbaptista)!

Bugfixes#

  • Fixed a bug that caused run status sensors to sometimes repeatedly fire alerts.
  • Fixed a bug that caused the emr_pyspark_step_launcher to fail when stderr included non-Log4J-formatted lines.
  • Fixed a bug that caused applyPerUniqueValue config on the QueuedRunCoordinator to fail Helm schema validation.
  • [dagster-shell] Fixed an issue where a failure while executing a shell command sometimes didn’t raise a clear explanation for the failure.

Experimental#

  • Added experimental @asset decorator and build_assets_job APIs to construct asset-based jobs, along with Dagit support.
  • Added load_assets_from_dbt_project and load_assets_from_dbt_manifest, which enable constructing asset-based jobs from DBT models.