dagster-polars has been added as an integration. Thanks @danielgafni!
[dagster-dbt] @dbt_assets now supports loading projects with semantic models.
[dagster-dbt] @dbt_assets now supports loading projects with model versions.
[dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
[dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
[UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.
Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
[ui] On the asset graph, nodes are slightly wider, allowing more text to be displayed, and group names are no longer truncated.
[ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
[dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!
Observable source assets can now yield ObserveResults with no data_version.
You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
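The "Overdue" rule above can be illustrated with a minimal sketch. It is not Dagster's implementation: `is_overdue` is a hypothetical helper, and the rule is reduced to a single maximum-lag comparison (the parameter name `maximum_lag_minutes` is used here as an assumption about how the allowed lag is expressed).

```python
from datetime import datetime, timedelta, timezone


def is_overdue(data_time, now, maximum_lag_minutes):
    """Hypothetical helper: True when data_time is older than the allowed lag."""
    return now - data_time > timedelta(minutes=maximum_lag_minutes)


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
recent_data_time = datetime(2024, 3, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old
stale_data_time = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)  # 3 hours old

print(is_overdue(recent_data_time, now, maximum_lag_minutes=60))
print(is_overdue(stale_data_time, now, maximum_lag_minutes=60))
```

With a 60-minute allowance, only the observation whose data time is three hours old is treated as overdue in this sketch.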
[ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
[kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling Kubernetes services if the agent was interrupted while it was being terminated.
Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
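As a small illustration of what lexicographical ordering means for partition keys, the sketch below sorts two made-up sets of keys as plain strings. The keys are assumptions chosen for the example. ISO-formatted dates sort chronologically, while unpadded numeric keys do not sort numerically.

```python
# Partition keys are compared as plain strings, character by character.
date_keys = ["2024-01-10", "2024-01-02", "2023-12-31"]
numeric_keys = ["10", "2", "1"]

sorted_dates = sorted(date_keys)
sorted_numeric = sorted(numeric_keys)

print(sorted_dates)
print(sorted_numeric)
```

If submission order matters for numeric-looking keys, zero-padding them (for example "01", "02", "10") makes string order and numeric order agree.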
[dagster-k8s] Include k8s pod debug info in run worker failure messages.
[dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.
A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
[dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
[instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
[asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
[dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.
@observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
[auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron rule allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
[Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.
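One way to picture priority-aware slot claiming is a priority queue, sketched below. This is an illustration only and says nothing about how Dagster stores or resolves claims: `claim_slots` is a hypothetical function, and breaking ties by arrival order is an assumption.

```python
import heapq


def claim_slots(pending, available_slots):
    """Hypothetical sketch: hand out slots to the highest-priority steps first.

    pending is a list of (step_name, priority) tuples in arrival order.
    Ties are broken by arrival order, which is an assumption of this sketch.
    """
    heap = [(-priority, order, name) for order, (name, priority) in enumerate(pending)]
    heapq.heapify(heap)
    claimed = []
    while heap and len(claimed) < available_slots:
        _, _, name = heapq.heappop(heap)
        claimed.append(name)
    return claimed


pending_steps = [("low_a", 0), ("high", 10), ("low_b", 0), ("medium", 5)]
print(claim_slots(pending_steps, available_slots=2))
```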
Added an IO manager for materializing assets to GCS. You can select the GCS asset IO manager through the resource_defs argument of AssetGroup.
Improved the performance of storage queries run by the sensor daemon to enforce the idempotency of run keys. This should reduce the database CPU when evaluating sensors with a large volume of run requests with run keys that repeat across evaluations.
[dagit] Added information on sensor ticks to show when a sensor has requested runs that did not result in the creation of a new run due to the enforcement of idempotency using run keys.
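The run key behavior described in the two entries above can be sketched as follows. The function and variable names are hypothetical, and an in-memory set stands in for the storage queries. A request whose run key has already been seen is skipped rather than launched again.

```python
def filter_new_requests(requested_run_keys, seen_run_keys):
    """Hypothetical sketch: split requested run keys into launched and skipped."""
    launched, skipped = [], []
    for key in requested_run_keys:
        if key in seen_run_keys:
            skipped.append(key)  # already launched in an earlier evaluation
        else:
            seen_run_keys.add(key)
            launched.append(key)
    return launched, skipped


seen = set()
first_tick = filter_new_requests(["file_a", "file_b"], seen)
second_tick = filter_new_requests(["file_b", "file_c"], seen)
print(first_tick)
print(second_tick)
```

In the second evaluation, "file_b" is requested again but does not produce a new run, which is the case the tick information now surfaces.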
[k8s] Run and step workers are now labeled with the Dagster run id that they are currently handling.
If a step launched with a StepLauncher encounters an exception, that exception / stack trace will now appear in the event log.
Fixed a race condition where canceled backfills would resume under certain conditions.
Fixed an issue where exceptions that were raised during sensor and schedule execution didn’t always show a stack trace in Dagit.
During execution, dependencies will now resolve correctly for certain dynamic graph structures that were previously resolving incorrectly.
When using the forkserver start_method on the multiprocess executor, preload_modules have been adjusted to prevent libraries that change namedtuple serialization from causing unexpected exceptions.
Fixed a naming collision between dagster decorators and submodules that sometimes interfered with static type checkers (e.g. pyright).
[dagit] Postgres database connection management has been improved when watching actively executing runs.
[dagster-databricks] The databricks_pyspark_step_launcher now supports steps with RetryPolicies defined, as well as RetryRequested exceptions.
Dagster now supports non-standard vixie-style cron strings, like @hourly, @daily, @weekly, and @monthly in addition to the standard 5-field cron strings (e.g. * * * * *).
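For readers unfamiliar with these shorthands, the sketch below maps each one to its conventional 5-field equivalent. `CRON_ALIASES` and `expand_cron_string` are hypothetical names for illustration and do not describe how Dagster parses cron strings internally.

```python
# Conventional 5-field equivalents of the shorthand strings.
CRON_ALIASES = {
    "@hourly": "0 * * * *",   # minute 0 of every hour
    "@daily": "0 0 * * *",    # midnight every day
    "@weekly": "0 0 * * 0",   # midnight every Sunday
    "@monthly": "0 0 1 * *",  # midnight on the first day of the month
}


def expand_cron_string(cron_string):
    """Return the 5-field form of a shorthand, or the input unchanged."""
    cleaned = cron_string.strip()
    return CRON_ALIASES.get(cleaned, cleaned)


print(expand_cron_string("@daily"))
print(expand_cron_string("*/15 * * * *"))
```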
The MetadataEntry constructor now accepts value as an alias for the deprecated entry_data argument.
Typed metadata can now be attached to SourceAssets and is rendered in dagit.
When a step fails to upload its compute log to Dagster, it will now add an event to the event log with the stack trace of the error instead of only logging the error to the process output.
[dagit] Made a number of improvements to the Schedule/Sensor pages in Dagit, including showing a paginated table of tick information, showing historical cursor state, and adding the ability to set a cursor from Dagit. Previously, we only showed tick information on the timeline view and cursors could only be set using the dagster CLI.
[dagit] When materializing assets, Dagit presents a link to the run rather than jumping to it, and the status of the materialization (pending, running, failed) is shown on nodes in the asset graph.
[dagit] Dagit now shows sensor and schedule information at the top of asset pages based on the jobs in which the asset appears.
[dagit] Dagit now performs "middle truncation" on gantt chart steps and graph nodes, making it much easier to differentiate long assets and ops.
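A minimal sketch of the idea behind middle truncation follows. It is written in Python for illustration only and is not Dagit's implementation; the function name and the "..." marker are assumptions. Keeping both ends of a name preserves the suffix that usually distinguishes long, similarly prefixed assets and ops.

```python
def middle_truncate(text, max_length, marker="..."):
    """Shorten text to max_length by replacing its middle with marker."""
    if len(text) <= max_length:
        return text
    if max_length <= len(marker):
        return text[:max_length]  # no room for a marker, fall back to a plain cut
    keep = max_length - len(marker)
    head = (keep + 1) // 2
    tail = keep // 2
    return text[:head] + marker + (text[-tail:] if tail else "")


print(middle_truncate("daily_orders_cleaned_v1", 13))
print(middle_truncate("daily_orders_cleaned_v2", 13))
```

Truncating at the end instead would render both names identically, since they share the same first characters.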
[dagit] Dagit no longer refreshes data when tabs are in the background, lowering browser CPU usage.
dagster-k8s, dagster-celery-k8s, and dagster-docker now name step workers dagster-step-... rather than dagster-job-....
[dagit] The launchpad is significantly more responsive when you're working with very large partition sets.
[dagit] We now show an informative message on the Asset catalog table when there are no matching assets to display. Previously, we would show a blank white space.
[dagit] Running Dagit without a backfill daemon no longer generates a warning unless queued backfills are present. Similarly, a missing sensor or schedule daemon only yields a warning if sensors or schedules are turned on.
[dagit] On the instance summary page, hovering over a recent run’s status dot shows a more helpful tooltip.
[dagster-k8s] Improved performance of the k8s_job_executor for runs with many user logs.
[dagster-k8s] When using the dagster-k8s/config tag to configure Dagster Kubernetes pods, the tags can now accept any valid Kubernetes config, and can be written in either snake case (node_selector_terms) or camel case (nodeSelectorTerms). See the docs for more information.
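To illustrate how snake case and camel case spellings of the same field relate, here is a small sketch that rewrites snake case keys into camel case. The helper names are hypothetical and this is not the dagster-k8s implementation. It also rewrites every key it sees, so user-defined map keys that contain underscores (label names, for instance) would need to be excluded in real use.

```python
def snake_to_camel(key):
    """Convert node_selector_terms to nodeSelectorTerms; camel case passes through."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def normalize_keys(value):
    """Recursively rewrite dictionary keys to camel case."""
    if isinstance(value, dict):
        return {snake_to_camel(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


snake_config = {"node_selector_terms": [{"match_expressions": []}]}
camel_config = {"nodeSelectorTerms": [{"matchExpressions": []}]}
print(normalize_keys(snake_config) == normalize_keys(camel_config))
```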
When loading assets from modules using AssetGroup.from_package_name and similar methods, lists of assets at module scope are now loaded.
Added the static methods AssetGroup.from_modules and AssetGroup.from_current_module, which automatically load assets at module scope from particular modules.
Software-defined assets jobs can now load partitioned assets that are defined outside the job.
AssetGroup.from_modules now correctly raises an error if multiple assets with the same key are detected.
The InputContext object provided to IOManager.load_input previously did not include resource config. Now it does.
Previously, if an assets job had a partitioned asset as well as a non-partitioned asset that depended on another non-partitioned asset, it would fail to run. Now it runs without issue.
[dagit] The asset "View Upstream Graph" links no longer select the current asset, making it easier to click "Materialize All".
[dagit] The asset page's "partition health bar" highlights missing partitions better in large partition sets.
[dagit] The asset "Materialize Partitions" modal now presents an error when partition config or tags cannot be generated.
[dagit] The right sidebar of the global asset graph no longer defaults to 0% wide in fresh / incognito browser windows, which made it difficult to click nodes in the global graph.
[dagit] In the asset catalog, the search bar now matches substrings so it's easier to find assets with long path prefixes.
[dagit] Dagit no longer displays duplicate downstream dependencies on the Asset Details page in some scenarios.
[dagster-fivetran] Assets created using build_fivetran_assets will now be properly tagged with a fivetran pill in Dagit.
Fixed issue causing step launchers to fail in many scenarios involving re-execution or dynamic execution.
Previously, incorrect selections (generally, step selections) could be generated for strings of the form ++item. This has been fixed.
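For context, a selection clause such as ++item asks for the item plus two layers of upstream dependencies. The sketch below parses a single clause into upstream and downstream depths. It is a hypothetical illustration of the syntax, not the parser Dagster uses, and `parse_clause` is an invented name.

```python
import re

# Optional leading "*" or run of "+", a name, then an optional trailing "*" or run of "+".
_CLAUSE = re.compile(r"^(\*|\+*)([^*+]+)(\*|\+*)$")


def parse_clause(clause):
    """Return (upstream_depth, name, downstream_depth); None means unlimited."""
    match = _CLAUSE.match(clause.strip())
    if match is None:
        raise ValueError(f"Unrecognized selection clause: {clause!r}")
    upstream, name, downstream = match.groups()

    def depth(marker):
        return None if marker == "*" else len(marker)

    return depth(upstream), name, depth(downstream)


print(parse_clause("++item"))
print(parse_clause("*item+"))
```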
Fixed an issue where run status sensors sometimes logged the wrong status to the event log if the run moved into a different status while the sensor was running.
Fixed an issue where daily schedules sometimes produced an incorrect partition name on spring Daylight Saving Time boundaries.
[dagit] Certain workspace or repo-scoped pages required SQLAlchemy 1.4 or greater to be installed. We are now using queries supported by SQLAlchemy>=1.3. Previously we would raise an error including the message: 'Select' object has no attribute 'filter'.
[dagit] Certain workspace or repo-scoped pages required sqlite 3.25.0 or greater to be installed. This has been relaxed to support older versions of sqlite. This was previously marked as fixed in our 0.14.0 notes, but a handful of cases that were still broken have now been fixed. Previously we would raise an error (sqlite3.OperationalError).
[dagit] When changing presets / partitions in the launchpad, Dagit preserves user-entered tags and replaces only the tags inherited from the previous base.
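The tag behavior described above can be sketched as a dictionary merge. This is an illustration in Python rather than Dagit's code. `switch_base` is a hypothetical name, and letting user-entered tags win when they collide with a tag from the new base is an assumption of this sketch.

```python
def switch_base(current_tags, previous_base_tags, new_base_tags):
    """Keep user-entered tags and swap inherited tags for those of the new base."""
    user_tags = {
        key: value
        for key, value in current_tags.items()
        if previous_base_tags.get(key) != value  # not inherited from the old base
    }
    return {**new_base_tags, **user_tags}


current = {"dagster/partition": "2022-01-01", "owner": "data-team"}
old_base = {"dagster/partition": "2022-01-01"}
new_base = {"dagster/partition": "2022-01-02"}
print(switch_base(current, old_base, new_base))
```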
[dagit] Dagit no longer hangs when rendering the run gantt chart for certain graph structures.
[dagster-airbyte] Fixed issues that could cause failures when generating asset materializations from an Airbyte API response.
[dagster-aws] 0.14.3 removed the ability for the EcsRunLauncher to use sidecars without you providing your own custom task definition. Now, you can continue to inherit sidecars from the launching task’s task definition by setting include_sidecars: True in your run launcher config.
dagster-snowflake has dropped support for Python 3.6. The library it is currently built on, snowflake-connector-python, dropped 3.6 support in their recent 2.7.5 release.
Added Concepts sections for Op Retries and Dynamic Graphs.
The Hacker News Assets demo now uses AssetGroup instead of build_assets_job, and it can now be run entirely from a local machine with no additional infrastructure (storing data inside DuckDB).
The Software-Defined Assets guide in the docs now uses AssetGroup instead of build_assets_job.
When using an executor that runs each op in its own process, exceptions in the Dagster system code that result in the op process failing will now be surfaced in the event log.
Introduced new SecretsManager resources to the dagster-aws package to enable loading secrets into Jobs more easily. For more information, see the documentation.
Daemon heartbeats are now processed in a batch request to the database.
Job definitions now contain a method called run_request_for_partition, which returns a RunRequest that can be returned in a sensor or schedule evaluation function to launch a run for a particular partition for that job. See our documentation for more information.
Renamed the filter class from PipelineRunsFilter to RunsFilter.
Assets can now be directly invoked for unit testing.
[dagster-dbt] load_assets_from_dbt_project will now attach schema information to the generated assets if it is available in the dbt project (schema.yml).
[examples] Added an example that demonstrates using Software Defined Assets with Airbyte, dbt, and custom Python.
The default IO manager used in the AssetGroup API is now the fs_asset_io_manager.
It's now possible to build a job where partitioned assets depend on partitioned assets that are maintained outside the job, and for those upstream partitions to show up on the context in the op and IOManager load_input function.
SourceAssets can now be partitioned, by setting the partitions_def argument.
Fixed an issue where run status sensors would sometimes fire multiple times for the same run if the sensor function raised an error.
[ECS] Previously, setting cpu/memory tags on a job would override the ECS task’s cpu/memory, but not individual containers. If you were using a custom task definition that explicitly sets a container’s cpu/memory, the container would not resize even if you resized the task. Now, setting cpu/memory tags on a job overrides both the ECS task’s cpu/memory and the container's cpu/memory.
[ECS] Previously, if the EcsRunLauncher launched a run from a task with multiple containers - for example if both dagit and daemon were running in the same task - then the run would be launched with too many containers. Now, the EcsRunLauncher only launches tasks with a single container.
Fixed an issue where the run status of a job invoked through execute_in_process was not updated properly.
Fixed some storage queries that were incompatible with versions of SQLAlchemy<1.4.0.
[dagster-dbt] Fixed issue where load_assets_from_dbt_project would fail if models were organized into subdirectories.
[dagster-dbt] Fixed issue where load_assets_from_dbt_project would fail if seeds or snapshots were present in the project.
[dagster-fivetran] A new fivetran_resync_op (along with a corresponding resync_and_poll method on the fivetran_resource) allows you to kick off Fivetran resyncs using Dagster (thanks @dwallace0723!)
[dagster-shell] Fixed an issue where large log output could cause operations to hang (thanks @kbd!)
[documentation] Fixed export message with dagster home path (thanks @proteusiq!)
[documentation] Removed duplicate entries under integrations (thanks @kahnwong!)
An issue preventing the use of default_value on inputs has been resolved. Previously, a defensive error that did not take default_value into account was thrown.
[dagster-aws] Fixed issue where re-emitting log records from the pyspark_step_launcher would occasionally cause a failure.
[dagit] The asset catalog now displays entries for materialized assets when only a subset of repositories were selected. Previously, it only showed the software-defined assets unless all repositories were selected in Dagit.
Fixed an invariant check in the databricks step launcher that was causing failures when setting the local_dagster_job_package_path config option (Thanks Iswariya Manivannan!)