Changelog
1.9.10 (core) / 0.25.10 (libraries)
New
- Added a new
.replace()
method toAutomationCondition
, which allows sub-conditions to be modified in-place. - Added new
.allow()
and.ignore()
methods to the booleanAutomationCondition
operators, which allow asset selections to be propagated to sub-conditions such asAutomationCondition.any_deps_match()
andAutomationCondition.all_deps_match()
. - When using the
DAGSTER_REDACT_USER_CODE_ERRORS
environment variable to mask user code errors, the unmasked log lines are now written using adagster.masked
Python logger instead of being written to stderr, allowing the format of those log lines to be customized. - Added a
get_partition_key()
helper method that can be used on hourly/daily/weekly/monthly partitioned assets to get the partition key for any given partition definition. (Thanks @Gw1p!) - [dagster-aws] Added a
task_definition_prefix
argument toEcsRunLauncher
, allowing the name of the task definition families for launched runs to be customized. Previously, the task definition families always started withrun
. - [dagster-aws] Added the
PipesEMRContainersClient
Dagster Pipes client for running and monitoring workloads on AWS EMR on EKS with Dagster. - [dagster-pipes] Added support for setting timestamp metadata (e.g.
{"my_key": {"raw_value": 111, "type": "timestamp"}}
). - [dagster-databricks, dagster-pipes] Databricks Pipes now support log forwarding when running on existing clusters. It can be enabled by setting
PipesDbfsMessageReader(include_stdio_in_messages=True)
. - [dagster-polars] Added
rust
engine support when writing a Delta Lake table using native partitioning. (Thanks @Milias!)
Bugfixes
- Fixed a bug where setting an
AutomationCondition
on an observable source asset could sometimes result in invalid backfills being launched. - Using
AndAutomationCondition.without()
no longer removes the condition's label. - [ui] Sensors targeting asset checks now list the asset checks when you click to view their targets.
- [dagster-aws] Fixed the execution of EMR Serverless jobs using
PipesEMRServerlessClient
failing if a job is in theQUEUED
state. - [dagster-pipes] Fixed Dagster Pipes log capturing when running on Databricks.
- [dagster-snowflake] Fixed a bug where passing a non-base64-encoded private key to a
SnowflakeResource
resulted in an error. - [dagster-openai] Updated
openai
kinds tag to be "OpenAI" instead of "Open AI" in line with the OpenAI branding.
Documentation
- [dagster-pipes] Added a tutorial for using Dagster Pipes with PySpark.
1.9.9 (core) / 0.25.9 (libraries)
New
- Added a new function
load_definitions_from_module
, which can load all the assets, checks, schedules, sensors, and job objects within a module scope into a single Definitions object. Check out the documentation to learn more. - When using the
DAGSTER_REDACT_USER_CODE_ERRORS
environment variable to mask user code errors, the unmasked log lines are now written using adagster.redacted_errors
Python logger instead of being written to stderr, allowing the format of those log lines to be customized. - The
croniter
package is now vendored in dagster. - [ui] Corrected the
minstral
typo and updated the Mistral logo for assetkinds
tag. - [ui] The relevant runs are now shown within the same dialog when viewing details of an automation evaluation.
- [ui] Clicking to view runs with a specific status from the backfill overview now switches to the new backfill runs tab with your filters applied, instead of the global runs page.
- [ui] In the run timeline, all run ids and timings are now shown in the hover popover.
- [ui] Added a new tab on the Runs page that shows a filterable list of recent backfills.
- [dagster-airlift] Added support for Python 3.7.
- [dagster-aws] Added a
task_definition_prefix
argument toEcsRunLauncher
, allowing the name of the task definition families for launched runs to be customized. Previously, the task definition families always started withrun
. - [dagster-azure] Moved azure fake implementations to its own submodule, paving the way for fake implementations to not be imported by default. (Thanks @futurwasfree!)
- [dagster-dlt] The
dagster-dlt
library is added. It replaces the dlt module ofdagster-embedded-elt
. - [dagster-sling] The
dagster-sling
library is added. It replaces the Sling module ofdagster-embedded-elt
. - [helm] Added support for sidecar containers for all Dagster pods, for versions of K8s after 1.29 (Native Sidecars). (Thanks @hom3r!)
Bugfixes
- Fixed an issue where the tick timeline wouldn't load for an automation condition sensor that emitted a backfill.
- Fixed a bug with asset checks where additional_deps/additional_ins were not being threaded through properly in certain cases, and would result in errors at job creation.
- Fixed a bug where the UI will hit an unexpected error when loading details for a run containing a step retry before the step has started.
- Fixed a bug with load_assets_from_x functions where we began erroring when a spec and AssetsDefinition had the same key in a given module. We now only error in this case if include_specs=True.
- Fixed a bug with
load_assets_from_modules
where AssetSpec objects were being given the key_prefix instead of the source_key_prefix. Going forward, when load_specs is set to True, only the source_key_prefix will affect AssetSpec objects. - Fixed a bug with the run queue criteria UI for branch deployments in Dagster Plus.
- [ui] Fixed the "View evaluation" links from the "Automation condition" tag popover on Runs.
- [dagster-aws] Fixed an issue with the EcsRunLauncher where it would sometimes create a new task definition revision for each run if the "task_role_arn" or "execution_role_arn" parameters were specified without the
arn:aws:iam:
prefix. - [dagster-aws] Fixed a bug with
PipesEMRServerlessClient
trying to get the dashboard URL for a run before it transitions to RUNNING state. - [dagster-dbt] Fixed an issue where group names set on partitioned dbt assets created using the
@dbt_assets
decorator would be ignored. - [dagster-azure] Fixed the default configuration for the
show_url_only
parameter on theAzureBlobComputeLogManager
. (Thanks @ion-elgreco!) - [dagster-aws] Fixed an issue handling null
networkConfiguration
parameters for the ECS run launcher. (Thanks @markgrin!)
Documentation
- Added example potential use cases for sensors. (Thanks @gianfrancodemarco!)
- Updated the tutorial to match the outlined structure. (Thanks @vincent0426!)
Deprecations
- [dagster-embedded-elt] the
dagster-embedded-elt
library is deprecated in favor ofdagster-dlt
anddagster-sling
.
Dagster Plus
- The Alert Policies page will now show a warning if a slack channel for a policy no longer exists.
1.9.8 (core) / 0.25.8 (libraries)
Bugfixes
- Fixed a bug with
load_assets_from_x
functions where we began erroring when a spec and AssetsDefinition had the same key in a given module. We now only error in this case ifinclude_specs=True
. - [dagster-azure] Fixed a bug in 1.9.6 and 1.9.7 where the default behavior of the compute log manager switched from showing logs in the UI to showing a URL. You can toggle the
show_url_only
option toTrue
to enable the URL showing behavior. - [dagster-dbt] Fixed an issue where group names set on partitioned dbt assets created using the
@dbt_assets
decorator would be ignored
1.9.7 (core) / 0.25.7 (libraries)
New
- Added new function
load_definitions_from_module
, which can load all the assets, checks, schedules, sensors, and job objects within a module scope into a single Definitions object. Check out the documentation to learn more: https://docs.dagster.io/_apidocs/definitions#dagster.load_definitions_from_module. - Previously, asset backfills could only target selections of assets in which all assets had a
BackfillPolicy
, or none of them did. Mixed selections are now supported. AssetSpecs
may now contain apartitions_def
. DifferentAssetSpecs
passed to the same invocation of@multi_asset
can now have differentPartitionsDefinitions
, as long ascan_subset=True
.- Added the option to use a thread pool to process backfills in parallel.
- Exceptions that are raised when a schedule or sensor is writing to logs will now write an error message to stdout instead of failing the tick.
- Added validation of
title
for asset backfills (not just for job backfills). - [ui] Design tweaks to the asset Automations tab.
- [ui] Asset selection filtering is now case insensitive.
- [ui] Add Teradata icon for kind tags.
- [ui] When creating and editing alerts, when the form is in an invalid state, display the reason on the disabled buttons.
- [ui] Add Automation history to asset checks.
- [ui] Improve performance of Run page for very long-running runs.
- [dagster-airbyte] The
airbyte_assets
decorator has been added. It can be used with theAirbyteCloudWorkspace
resource andDagsterAirbyteTranslator
translator to load Airbyte tables for a given connection as assets in Dagster. Thebuild_airbyte_assets_definitions
factory can be used to create assets for all the connections in your Airbyte workspace. - [dagster-airbyte] Airbyte Cloud assets can now be materialized using the
AirbyteCloudWorkspace.sync_and_poll(…)
method in the definition of a@airbyte_assets
decorator. - [dagster-airlift] Airflow imports are now compatible with Airflow 1.
- [dagster-aws] new
ecs_executor
which executes Dagster steps via AWS ECS tasks. This can be used in conjunction withECSRunLauncher
. - [dagster-dbt]
dbt-core>=1.9
is now supported. - [dagster-dbt] Adds SQL syntax highlighting to raw sql code in dbt asset descriptions.
- [dagster-looker]
load_looker_asset_specs
andbuild_looker_pdt_assets_definitions
are updated to accept an instance ofDagsterLookerApiTranslator
or custom subclass. - [dagster-looker] Type hints in the signature of
DagsterLookerApiTranslator.get_asset_spec
have been updated - the parameterlooker_structure
is now of typeLookerApiTranslatorStructureData
instead ofLookerStructureData
. Custom Looker API translators should be updated. - [dagster-powerbi]
load_powerbi_asset_specs
has been updated to accept an instance ofDagsterPowerBITranslator
or custom subclass. - [dagster-powerbi] Type hints in the signature of
DagsterPowerBITranslator.get_asset_spec
have been updated - the parameterdata
is now of typePowerBITranslatorData
instead ofPowerBIContentData
. Custom Power BI translators should be updated. - [dagster-sigma]
load_sigma_asset_specs
has been updated to accept an instance ofDagsterSigmaTranslator
or a custom subclass. - [dagster-sigma] Type hints in the signature of
DagsterLookerApiTranslator.get_asset_spec
have been updated - the parameterdata
is now of typeUnion[SigmaDatasetTranslatorData, SigmaWorkbookTranslatorData]
instead ofUnion[SigmaDataset, SigmaWorkbook]
. Custom Looker API translators should be updated. - [dagster-sigma] Added the option to filter to specific workbooks in addition to folders.
- [dagster-sigma] Added the option to skip fetching lineage for workbooks in cases where users want to build this information themselves.
- [dagster-tableau]
load_tableau_asset_specs
has been updated to accept an instance ofDagsterTableauTranslator
or custom subclass. - [dagster-tableau] Type hints in the signature of
DagsterTableauTranslator.get_asset_spec
have been updated - the parameterdata
is now of typeTableauTranslatorData
instead ofTableauContentData
. Custom Tableau translators should be updated.
Bugfixes
- Fixed an issue where sensor and schedule tick logs would accumulate disk over time on Dagster code servers.
- [ui] Fixed an issue where the app sometimes loads with styles missing.
- [ui] Fix search string highlighting in global search results.
- Fixed a race condition where immediately after adding a new asset to the graph, a freshness check sensor targeting that asset might raise an InvalidSubsetError in its first one.
- [ui] Fixed a bug where backfills launched by Declarative Automation were not being shown in the table of launched runs.
- The
dagster-airlift
package erroneously introduced a dependency ondagster
. This has been rectified -dagster
is only required for thedagster-airlift[core]
submodule.
Deprecations
- Deprecation of
@multi_asset_sensor
has been rolled back.
Dagster Plus
- Introduce the Catalog Viewer role for Users and Teams.
- Slack, MS Teams, and email alerts for run failures will list the steps that were successful or not executed.
- [experimental] The option
blobStorageSnapshotUploads
has been added which enables a new process for how definition snapshots are uploaded to Dagster Cloud. - Fixed a catalog search issue where exact prefix matches are not prioritized in the search results.
- Fixed a bug with Insights metric customization.
1.9.6 (core) / 0.25.6 (libraries)
New
- Updated
cronitor
pin to allow versions>= 5.0.1
to enable use ofDayOfWeek
as 7. Cronitor4.0.0
is still disallowed. (Thanks, @joshuataylor!) - Added flag
checkDbReadyInitContainer
to optionally disable db check initContainer. - Added job name filtering to increase the throughput for run status sensors that target jobs.
- [ui] Added Google Drive icon for
kind
tags. (Thanks, @dragos-pop!) - [ui] Renamed the run lineage sidebar on the Run details page to
Re-executions
. - [ui] Sensors and schedules that appear in the Runs page are now clickable.
- [ui] Runs targeting assets now show more of the assets in the Runs page.
- [dagster-airbyte] The destination type for an Airbyte asset is now added as a
kind
tag for display in the UI. - [dagster-gcp]
DataprocResource
now receives an optional parameterlabels
to be attached to Dataproc clusters. (Thanks, @thiagoazcampos!) - [dagster-k8s] Added a
checkDbReadyInitContainer
flag to the Dagster Helm chart to allow disabling the default init container behavior. (Thanks, @easontm!) - [dagster-k8s] K8s pod logs are now logged when a pod fails. (Thanks, @apetryla!)
- [dagster-sigma] Introduced
build_materialize_workbook_assets_definition
which can be used to build assets that run materialize schedules for a Sigma workbook. - [dagster-snowflake]
SnowflakeResource
andSnowflakeIOManager
both acceptadditional_snowflake_connection_args
config. This dictionary of arguments will be passed to thesnowflake.connector.connect
method. This config will be ignored if you are using thesqlalchemy
connector. - [helm] Added the ability to set user-deployments labels on k8s deployments as well as pods.
Bugfixes
- Assets with self dependencies and
BackfillPolicy
are now evaluated correctly during backfills. Self dependent assets no longer result in serial partition submissions or disregarded upstream dependencies. - Previously, the freshness check sensor would not re-evaluate freshness checks if an in-flight run was planning on evaluating that check. Now, the freshness check sensor will kick off an independent run of the check, even if there's already an in flight run, as long as the freshness check can potentially fail.
- Previously, if the freshness check was in a failing state, the sensor would wait for a run to update the freshness check before re-evaluating. Now, if there's a materialization later than the last evaluation of the freshness check and no planned evaluation, we will re-evaluate the freshness check automatically.
- [ui] Fixed run log streaming for runs with a large volume of logs.
- [ui] Fixed a bug in the Backfill Preview where a loading spinner would spin forever if an asset had no valid partitions targeted by the backfill.
- [dagster-aws]
PipesCloudWatchMessageReader
correctly identifies streams which are not ready yet and doesn't fail onThrottlingException
. (Thanks, @jenkoian!) - [dagster-fivetran] Column metadata can now be fetched for Fivetran assets using
FivetranWorkspace.sync_and_poll(...).fetch_column_metadata()
. - [dagster-k8s] The k8s client now waits for the main container to be ready instead of only waiting for sidecar init containers. (Thanks, @OrenLederman!)
Documentation
- Fixed a typo in the
dlt_assets
API docs. (Thanks, @zilto!)
1.9.5 (core) / 0.25.5 (libraries)
New
- The automatic run retry daemon has been updated so that there is a single source of truth for if a run will be retried and if the retry has been launched. Tags are now added to run at failure time indicating if the run will be retried by the automatic retry system. Once the automatic retry has been launched, the run ID of the retry is added to the original run.
- When canceling a backfill of a job, the backfill daemon will now cancel all runs launched by that backfill before marking the backfill as canceled.
- Dagster execution info (tags such as
dagster/run-id
,dagster/code-location
,dagster/user
and Dagster Cloud environment variables) typically attached to external resources are now available underDagsterRun.dagster_execution_info
. SensorReturnTypesUnion
is now exported for typing the output of sensor functions.- [dagster-dbt] dbt seeds now get a valid code version (Thanks @marijncv!).
- Manual and automatic retries of runs launched by backfills that occur while the backfill is still in progress are now incorporated into the backfill's status.
- Manual retries of runs launched by backfills are no longer considered part of the backfill if the backfill is complete when the retry is launched.
- [dagster-fivetran] Fivetran assets can now be materialized using the FivetranWorkspace.sync_and_poll(…) method in the definition of a
@fivetran_assets
decorator. - [dagster-fivetran]
load_fivetran_asset_specs
has been updated to accept an instance ofDagsterFivetranTranslator
or custom subclass. - [dagster-fivetran] The
fivetran_assets
decorator was added. It can be used with theFivetranWorkspace
resource andDagsterFivetranTranslator
translator to load Fivetran tables for a given connector as assets in Dagster. Thebuild_fivetran_assets_definitions
factory can be used to create assets for all the connectors in your Fivetran workspace. - [dagster-aws]
ECSPipesClient.run
now waits up to 70 days for tasks completion (waiter parameters are configurable) (Thanks @jenkoian!) - [dagster-dbt] Update dagster-dbt scaffold template to be compatible with uv (Thanks @wingyplus!).
- [dagster-airbyte] A
load_airbyte_cloud_asset_specs
function has been added. It can be used with theAirbyteCloudWorkspace
resource andDagsterAirbyteTranslator
translator to load your Airbyte Cloud connection streams as external assets in Dagster. - [ui] Add an icon for the
icechunk
kind. - [ui] Improved ui for manual sensor/schedule evaluation.
Bugfixes
- Fixed database locking bug for the
ConsolidatedSqliteEventLogStorage
, which is mostly used for tests. - [dagster-aws] Fixed a bug in the ECSRunLauncher that prevented it from accepting a user-provided task definition when DAGSTER_CURRENT_IMAGE was not set in the code location.
- [ui] Fixed an issue that would sometimes cause the asset graph to fail to render on initial load.
- [ui] Fix global auto-materialize tick timeline when paginating.
1.9.4 (core) / 0.25.4 (libraries)
New
- Global op concurrency is now enabled on the default SQLite storage. Deployments that have not been migrated since
1.6.0
may need to rundagster instance migrate
to enable. - Introduced
map_asset_specs
to enable modifyingAssetSpec
s andAssetsDefinition
s in bulk. - Introduced
AssetSpec.replace_attributes
andAssetSpec.merge_attributes
to easily alter properties of an asset spec. - [ui] Add a "View logs" button to open tick logs in the sensor tick history table.
- [ui] Add Spanner kind icon.
- [ui] The asset catalog now supports filtering using the asset selection syntax.
- [dagster-pipes, dagster-aws]
PipesS3MessageReader
now has a new parameterinclude_stdio_in_messages
which enables log forwarding to Dagster via Pipes messages. - [dagster-pipes] Experimental: A new Dagster Pipes message type
log_external_stream
has been added. It can be used to forward external logs to Dagster via Pipes messages. - [dagster-powerbi] Opts in to using admin scan APIs to pull data from a Power BI instance. This can be disabled by passing
load_powerbi_asset_specs(..., use_workspace_scan=False)
. - [dagster-sigma] Introduced an experimental
dagster-sigma snapshot
command, allowing Sigma workspaces to be captured to a file for faster subsequent loading.
Bugfixes
- Fixed a bug that caused
DagsterExecutionStepNotFoundError
errors when trying to execute an asset check step of a run launched by a backfill. - Fixed an issue where invalid cron strings like "0 0 30 2 *" that represented invalid dates in February were still allowed as Dagster cron strings, but then failed during schedule execution. Now, these invalid cronstrings will raise an exception when they are first loaded.
- Fixed a bug where
owners
added toAssetOut
s when defining a@graph_multi_asset
were not added to the underlyingAssetsDefinition
. - Fixed a bug where using the
&
or|
operators onAutomationCondition
s with labels would cause that label to be erased. - [ui] Launching partitioned asset jobs from the launchpad now warns if no partition is selected.
- [ui] Fixed unnecessary middle truncation occurring in dialogs.
- [ui] Fixed timestamp labels and "Now" line rendering bugs on the sensor tick timeline.
- [ui] Opening Dagster's UI with a single job defined takes you to the Overview page rather than the Job page.
- [ui] Fix stretched tags in backfill table view for non-partitioned assets.
- [ui] Open automation sensor evaluation details in a dialog instead of navigating away.
- [ui] Fix scrollbars in dark mode.
- [dagster-sigma] Workbooks filtered using a
SigmaFilter
no longer fetch lineage information. - [dagster-powerbi] Fixed an issue where reports without an upstream dataset dependency would fail to translate to an asset spec.
Deprecations
- [dagster-powerbi]
DagsterPowerBITranslator.get_asset_key
is deprecated in favor ofDagsterPowerBITranslator.get_asset_spec().key
- [dagster-looker]
DagsterLookerApiTranslator.get_asset_key
is deprecated in favor ofDagsterLookerApiTranslator.get_asset_spec().key
- [dagster-sigma]
DagsterSigmaTranslator.get_asset_key
is deprecated in favor ofDagsterSigmaTranslator.get_asset_spec().key
- [dagster-tableau]
DagsterTableauTranslator.get_asset_key
is deprecated in favor ofDagsterTableauTranslator.get_asset_spec().key
1.9.3 (core) / 0.25.3 (libraries)
New
-
Added
run_id
to therun_tags
index to improve database performance. Rundagster instance migrate
to update the index. (Thanks, @HynekBlaha!) -
Added icons for
kind
tags: Cassandra, ClickHouse, CockroachDB, Doris, Druid, Elasticsearch, Flink, Hadoop, Impala, Kafka, MariaDB, MinIO, Pinot, Presto, Pulsar, RabbitMQ, Redis, Redpanda, ScyllaDB, Starrocks, and Superset. (Thanks, @swrookie!) -
Added a new icon for the Denodo kind tag. (Thanks, @tintamarre!)
-
Errors raised from defining more than one
Definitions
object at module scope now include the object names so that the source of the error is easier to determine. -
[ui] Asset metadata entries like
dagster/row_count
now appear on the events page and are properly hidden on the overview page when they appear in the sidebar. -
[dagster-aws]
PipesGlueClient
now attaches AWS Glue metadata to Dagster results produced during Pipes invocation. -
[dagster-aws]
PipesEMRServerlessClient
now attaches AWS EMR Serverless metadata to Dagster results produced during Pipes invocation and adds Dagster tags to the job run. -
[dagster-aws]
PipesECSClient
now attaches AWS ECS metadata to Dagster results produced during Pipes invocation and adds Dagster tags to the ECS task. -
[dagster-aws]
PipesEMRClient
now attaches AWS EMR metadata to Dagster results produced during Pipes invocation. -
[dagster-databricks]
PipesDatabricksClient
now attaches Databricks metadata to Dagster results produced during Pipes invocation and adds Dagster tags to the Databricks job. -
[dagster-fivetran] Added
load_fivetran_asset_specs
function. It can be used with theFivetranWorkspace
resource andDagsterFivetranTranslator
translator to load your Fivetran connector tables as external assets in Dagster. -
[dagster-looker] Errors are now handled more gracefully when parsing derived tables.
-
[dagster-sigma] Sigma assets now contain extra metadata and kind tags.
-
[dagster-sigma] Added support for direct workbook to warehouse table dependencies.
-
[dagster-sigma] Added
include_unused_datasets
field toSigmaFilter
to disable pulling datasets that aren't used by a downstream workbook. -
[dagster-sigma] Added
skip_fetch_column_data
option to skip loading Sigma column lineage. This can speed up loading large instances. -
[dagster-sigma] Introduced an experimental
dagster-sigma snapshot
command, allowing Sigma workspaces to be captured to a file for faster subsequent loading.Introducing:
dagster-airlift
(experimental)dagster-airlift
is coming out of stealth. See the initial Airlift RFC here, and the following documentation to learn more:- A full Airflow migration tutorial.
- A tutorial on federating between Airflow instances.
More Airflow-related content is coming soon! We'd love for you to check it out, and post any comments / questions in the
#airflow-migration
channel in the Dagster slack.
Bugfixes
- Fixed a bug in run status sensors where setting incompatible arguments
monitor_all_code_locations
andmonitored_jobs
did not raise the expected error. (Thanks, @apetryla!) - Fixed an issue that would cause the label for
AutomationCondition.any_deps_match()
andAutomationCondition.all_deps_match()
to render incorrectly whenallow_selection
orignore_selection
were set. - Fixed a bug which could cause code location load errors when using
CacheableAssetsDefinitions
in code locations that containedAutomationConditions
- Fixed an issue where the default multiprocess executor kept holding onto subprocesses after their step completed, potentially causing
Too many open files
errors for jobs with many steps. - [ui] Fixed an issue introduced in 1.9.2 where the backfill overview page would sometimes display extra assets that were targeted by the backfill.
- [ui] Fixed "Open in Launchpad" button when testing a schedule or sensor by ensuring that it opens to the correct deployment.
- [ui] Fixed an issue where switching a user setting was immediately saved, rather than waiting for the change to be confirmed.
- [dagster-looker] Unions without unique/distinct criteria are now properly handled.
- [dagster-powerbi] Fixed an issue where reports without an upstream dataset dependency would fail to translate to an asset spec.
- [dagster-sigma] Fixed an issue where API fetches did not paginate properly.
Documentation
- Added an Airflow Federation Tutorial.
- Added
dagster-dingtalk
to the list of community supported libraries. - Fixed typos in the
dagster-wandb
(Weights and Biases) documentation. (Thanks, @matt-weingarten!) - Updated the Role-based Access Control (RBAC) documentation.
- Added additional information about filtering to the
dagster-sigma
documentation.
Dagster Plus
- [ui] Fixed an issue with filtering and catalog search in branch deployments.
- [ui] Fixed an issue where the asset graph would reload unexpectedly.
1.9.2 (core) / 0.25.2 (libraries)
New
- Introduced a new constructor,
AssetOut.from_spec
, that will construct anAssetOut
from anAssetSpec
. - [ui] Column tags are now displayed in the
Column name
section of the asset overview page. - [ui] Introduced an icon for the
gcs
(Google Cloud Storage) kind tag. - [ui] Introduced icons for
report
andsemanticmodel
kind tags. - [ui] The tooltip for a tag containing a cron expression now shows a human-readable, timezone-aware cron string.
- [ui] Asset check descriptions are now sourced from docstrings and rendered in the UI. (Thanks, @marijncv!)
- [dagster-aws] Added option to propagate tags to ECS tasks when using the
EcsRunLauncher
. (Thanks, @zyd14!) - [dagster-dbt] You can now implement
DagsterDbtTranslator.get_code_version
to customize the code version for your dbt assets. (Thanks, @Grzyblon!) - [dagster-pipes] Added the ability to pass arbitrary metadata to
PipesClientCompletedInvocation
. This metadata will be attached to all materializations and asset checks stored during the pipes invocation. - [dagster-powerbi] During a full workspace scan, owner and column metadata is now automatically attached to assets.
Bugfixes
- Fixed an issue with
AutomationCondition.execution_in_progress
which would cause it to evaluate toTrue
for unpartitioned assets that were part of a run that was in progress, even if the asset itself had already been materialized. - Fixed an issue with
AutomationCondition.run_in_progress
that would cause it to ignore queued runs. - Fixed an issue that would cause a
default_automation_condition_sensor
to be constructed for user code servers running on dagster version< 1.9.0
even if the legacyauto_materialize: use_sensors
configuration setting was set toFalse
. - [ui] Fixed an issue when executing asset checks where the wrong job name was used in some situations. The correct job name is now used.
- [ui] Selecting assets with 100k+ partitions no longer causes the asset graph to temporarily freeze.
- [ui] Fixed an issue that could cause a GraphQL error on certain pages after removing an asset.
- [ui] The asset events page no longer truncates event history in cases where both materialization and observation events are present.
- [ui] The backfill coordinator logs tab no longer sits in a loading state when no logs are available to display.
- [ui] Fixed issue which would cause the "Partitions evaluated" label on an asset's automation history page to incorrectly display
0
in cases where all partitions were evaluated. - [ui] Fix "Open in Playground" link when testing a schedule or sensor by ensuring that it opens to the correct deployment.
- [ui] Fixed an issue where the asset graph would reload unexpectedly.
- [dagster-dbt] Fixed an issue where the SQL filepath for a dbt model was incorrectly resolved when the dbt manifest file was built on a Windows machine, but executed on a Unix machine.
- [dagster-pipes] Asset keys containing embedded
/
characters now work correctly with Dagster Pipes.
Documentation
- Community-hosted integrations are now listed on the Integrations page.
- Added a tutorial, reference page and API docs for
dagster-airlift
. - Fixed a typo in the label for superseded APIs. (Thanks, @matt-weingarten!)
Deprecations
- The
types-sqlalchemy
package is no longer included in thedagster[pyright]
extra package.
Dagster Plus
- [ui] The Environment Variables table can now be sorted by name and update time.
- [ui] The code location configuration dialog now contains more metadata about the code location.
- [ui] Fixed an issue where the incorrect user icons were shown in the Users table when a search filter had been applied.
1.9.1 (core) / 0.25.1 (libraries)
New
dagster project scaffold
now has an option to create dagster projects from templates with excluded files/filepaths.- [ui] Filters in the asset catalog now persist when navigating subdirectories.
- [ui] The Run page now displays the partition(s) a run was for.
- [ui] Filtering on owners/groups/tags is now case-insensitive.
- [dagster-tableau] the helper function
parse_tableau_external_and_materializable_asset_specs
is now available to parse a list of Tableau asset specs into a list of external asset specs and materializable asset specs. - [dagster-looker] Looker assets now by default have owner and URL metadata.
- [dagster-k8s] Added a per_step_k8s_config configuration option to the k8s_job_executor, allowing the k8s configuration of individual steps to be configured at run launch time (thanks @Kuhlwein!)
- [dagster-fivetran] Introduced
DagsterFivetranTranslator
to customize assets loaded from Fivetran. - [dagster-snowflake]
dagster_snowflake.fetch_last_updated_timestamps
now supports ignoring tables not found in Snowflake instead of raising an error.
Bugfixes
- Fixed issue which would cause a
default_automation_condition_sensor
to be constructed for user code servers running on dagster version < 1.9.0 even if the legacyauto_materialize: use_sensors
configuration setting was set toFalse
. - Fixed an issue where running
dagster instance migrate
on Dagster version 1.9.0 constructed a SQL query that exceeded the maximum allowed depth. - Fixed an issue where wiping a dynamically partitioned asset causes an error.
- [dagster-polars]
ImportError
s are no longer raised when bigquery libraries are not installed [#25708]
Documentation
- [dagster-dbt] A guide on how to use dbt defer with Dagster branch deployments has been added to the dbt reference.
1.9.0 (core) / 0.25.0 (libraries)
Major changes since 1.8.0 (core) / 0.24.0 (libraries)
Automation
- Declarative Automation, the system which enables setting per-asset
AutomationConditions
, is no longer experimental. We now recommend using this system in all cases where asset-centric orchestration is desired. A suite of built-in static constructors have been added for common usecases, such asAutomationCondition.on_missing()
(which can fill in missing partitions of assets as soon as upstream data is available), andAutomationCondition.all_deps_blocking_checks_passed()
(which can prevent materialization of assets until all upstream blocking checks have passed). - You can now assign
AutomationConditions
to asset checks, via theautomation_condition
parameter on@asset_check
orAssetCheckSpec
. - You can now assign
AutomationConditions
to observable source assets, via theautomation_condition
parameter on@observable_source_asset
. - [experimental] You can now define custom subclasses of
AutomationCondition
to execute arbitrary Python code in the context of a broader expression. This allows you to compose built-in conditions with custom business logic. - The
target
arguments on schedules and sensors are now marked stable, allowing a stable way for schedules and sensors to target asset selections without needing to define a job.
Integrations
- Introduced a slate of integrations with business intelligence (BI) tools, enabling dashboards, views, and reports to be represented in the Dagster asset graph.
- A rich set of metadata is now automatically collected by our suite of ELT integrations.
- The
dagster/table_name
metadata tag, containing the fully-qualified name of the destination model, has been added for Airbyte, dlt, Fivetran and Sling assets. - The
dagster/row_count
metadata tag, containing the number of records loaded in the corresponding run, has been added for dlt and Sling assets. - The
dagster/column_schema
metadata tag, containing column schema information of the destination tables, has been added for Fivetran assets. - Column lineage information is now collected for Sling assets.
- The
- dagster-pipes are replacing the now deprecated Step Launchers as the new recommended approach for executing remote Spark jobs. Three new Pipes clients for running Spark applications on Amazon Web Services have been added:
dagster_aws.pipes.PipesGlueClient
dagster_aws.pipes.PipesEMRServerlessClient
dagster_aws.pipes.PipesEMRClient
UI
- Several changes have been made to the information architecture to make it easier to find what you’re looking for:
- Backfills have been moved from their own tab underneath the Overview page to entries within the table on the Runs page. This reflects the fact that backfills and runs are similar entities that share most properties. You can continue to use the legacy Runs page with the “Revert to legacy Runs page” user setting. (GitHub Discussion)
- “Jobs” is now a page reachable from the top-level navigation pane. It replaces the Jobs tab within the Overview page.
- “Automations” is now a page reachable from the top-level navigation pane. It replaces the schedule and sensor tabs within the Overview page.
@asset
andAssetSpec
now have akinds
attribute that enables specifying labels that show up on asset nodes in the asset graph in the UI. This supersedes thecompute_kind
attribute.
Changes since 1.8.13 (core) / 0.24.13 (libraries)
New
- The
tags
parameter to@asset
andAssetSpec
is no longer marked as experimental. - The
@observable_source_asset
decorator now supports anautomation_condition
argument. AutomationCondition
and associated APIs are no longer marked as experimental.- Added a new
use_user_code_server
parameter toAutomationConditionSensorDefinition
. If set, the sensor will be evaluated in the user code server (as traditional sensors are), allowing customAutomationCondition
subclasses to be evaluated. - Added a new column to the BulkActions table, a new column to the Runs table, and a new BackfillTags table to improve the performance of the Runs page. To take advantage of these performance improvements, run
dagster instance migrate
. This migration involves a schema migration to add the new columns and table, and a data migration to populate the new columns for historical backfills and runs. - Performance improvements when loading definitions with multi-assets with many asset keys.
- [ui] The previously-experimental changes to the top nav are now enabled for all users.
- [ui] Added new code location pages which provide information regarding library versions, metadata, and definitions.
- [ui] The new version of the Runs page is now enabled by default. To use the legacy version of the Runs page, toggle the "Revert to legacy Runs page" user setting.
- [ui] Clicking an asset with failed partitions on the asset health overview now takes you to a list of the failed partitions.
- [ui] The Materialize button runs pre-flight checks more efficiently, resulting in faster run launch times.
- [dagster-pipes] Added support for multi-container log streaming (thanks, @MattyKuzyk!)
- [dagster-docker]
container_kwargs.stop_timeout
can now be set when using theDockerRunLauncher
ordocker_executor
to configure the amount of time that Docker will wait when terminating a run for it to clean up before forcibly stopping it with a SIGKILL signal. - [dagster-dbt] Performance improvements when loading definitions using
build_dbt_asset_selection
.
Bugfixes
- [ui] Fixed redirect behavior on full pageloads of the legacy auto-materialize overview page.
- [ui] Plots for assets that emit materialization and observation events at different rates no longer display a time period missing the more frequent event type.
- [ui] Fixed issue causing scrolling to misbehave on the concurrency settings page.
- [helm] The blockOpConcurrencyLimitedRuns section of queuedRunCoordinator now correctly templates the appropriate config.
- [dagster-pipes] Fixed issue where k8s ops would fail after 4 hours (thanks, @MattyKuzyk!)
Documentation
- [dagster-dbt] Added guide for using dbt defer with Dagster branch deployments.
- [docs] Step Launchers documentation has been removed and replaced with references to Dagster Pipes.
- [docs] Fixed code example in Dagster Essentials (thanks, @aleexharris!)
Breaking Changes
dagster
no longer supports Python 3.8, which hit EOL on 2024-10-07.dagster
now requirespydantic>=2
.- By default,
AutomationConditionSensorDefinitions
will now emit backfills to handle cases where more than one partition of an asset is requested on a given tick. This allows that asset'sBackfillPolicy
to be respected. This feature can be disabled by settingallow_backfills
toFalse
. - Passing a custom
PartitionsDefinition
subclass into aDefinitions
object now issues an error instead of a deprecation warning. AssetExecutionContext
is no longer a subclass ofOpExecutionContext
. At this release,AssetExecutionContext
andOpExecutionContext
implement the same methods, but in the future, the methods implemented by each class may diverge. If you have written helper functions withOpExecutionContext
type annotations, they may need to be updated to includeAssetExecutionContext
depending on your usage. Explicit calls toisinstance(context, OpExecutionContext)
will now fail ifcontext
is anAssetExecutionContext
.- The
asset_selection
parameter onAutomationConditionSensorDefinition
has been renamed totarget
, to align with existing sensor APIs. - The experimental
freshness_policy_sensor
has been removed, as it relies on the long-deprecatedFreshnessPolicy
API. - The deprecated
external_assets_from_specs
andexternal_asset_from_spec
methods have been removed. Users should useAssetsDefinition(specs=[...])
, or pass specs directly into theDefinitions
object instead. AssetKey
objects can no longer be iterated over or indexed in to. This behavior was never an intended access pattern and in all observed cases was a mistake.- The
dagster/relation_identifier
metadata key has been renamed todagster/table_name
. - [dagster-ge]
dagster-ge
now only supportsgreat_expectations>=0.17.15
. Thege_validation_op_factory
API has been replaced with the API previously calledge_validation_op_factory_v3
. - [dagster-aws] Removed deprecated parameters from
dagster_aws.pipes.PipesGlueClient.run
. - [dagster-embedded-elt] Removed deprecated parameter
dlt_dagster_translator
from@dlt_assets
. Thedagster_dlt_translator
parameter should be used instead. - [dagster-polars] Dropped support for saving storage-level arbitrary metadata via IOManagers.
Deprecations
- The
DataBricksPysparkStepLauncher
,EmrPySparkStepLauncher
, and any custom subclass ofStepLauncher
have been marked as deprecated, but will not be removed from the codebase until Dagster 2.0 is released, meaning they will continue to function as they currently do for the foreseeable future. Their functionality has been superseded by the interfaces provided bydagster-pipes
, and so future development work will be focused there. - The experimental
multi_asset_sensor
has been marked as deprecated, as its main use cases have been superseded by theAutomationCondition
APIs. However, it will not be removed until version 2.0.0.
1.8.13 (core) / 0.24.13 (libraries)
New
- Performance improvements when loading code locations using multi-assets with many asset keys.
AutomationCondition.in_progress()
now will be true if an asset partition is part of an in-progress backfill that has not yet executed it. The prior behavior, which only considered runs, is encapsulated inAutomationCondition.execution_in_progress()
.- [ui] Added tag filter to the jobs page.
- [ui] Preserve user login state for a longer period of time.
- [dagster-dbt] Performance improvements when loading definitions using
build_dbt_asset_selection
. - [dagster-docker]
container_kwargs.stop_timeout
can now be set when using theDockerRunLauncher
ordocker_executor
to configure the amount of time that Docker will wait when terminating a run for it to clean up before forcibly stopping it with a SIGKILL signal. - [dagster-sigma] The Sigma integration now fetches initial API responses in parallel, speeding up initial load.
- [dagster-looker] Attempt to naively render liquid templates for derived table sql.
- [dagster-looker] Added support for views and explores that rely on refinements or extends.
- [dagster-looker] When fetching explores and dashboards from the Looker API, retrieve in parallel.
Bugfixes
- Fixed an issue with
AutomationCondition.eager()
that could cause it to attempt to launch a second attempt of an asset in cases where it was skipped or failed during a run where one of its parents successfully materialized. - Fixed an issue which would cause
AutomationConditionSensorDefinitions
to not be evaluated if theuse_user_code_server
value was toggled after the initial evaluation. - Fixed an issue where configuration values for aliased pydantic fields would be dropped.
- [ui] Fix an issue in the code locations page where invalid query parameters could crash the page.
- [ui] Fix navigation between deployments when query parameters are present in the URL.
- [helm] the blockOpConcurrencyLimitedRuns section of queuedRunCoordinator now correctly templates the appropriate config.
- [dagster-sigma] Fixed pulling incomplete data for very large workspaces.
1.8.12 (core) / 0.24.12 (libraries)
New
- The
AutomationCondition.eager()
,AutomationCondition.missing()
, andAutomationCondition.on_cron
conditions are now compatible with asset checks. - Added
AssetSelection.materializable()
, which returns only assets that are materializable in an existing selection. - Added a new
AutomationCondition.all_deps_blocking_checks_passed
condition, which can be used to prevent materialization when any upstream blocking checks have failed. - Added a
code_version
parameter to the@graph_asset
decorator. - If a
LaunchPartitionBackfill
mutation is submitted to GQL with invalid partition keys, it will now return an earlyPartitionKeysNotFoundError
. AssetSelection.checks_for_assets
now acceptsAssetKey
s and string asset keys, in addition toAssetsDefinition
s.- [ui] Added a search bar to partitions tab on the asset details page.
- [ui] Restored docked left nav behavior for wide viewports.
- [dagster-aws]
get_objects
now has asince_last_modified
that enables only fetching objects modified after a given timestamp. - [dagster-aws] New AWS EMR Dagster Pipes client (
dagster_aws.pipes.PipesEMRCLient
) for running and monitoring AWS EMR jobs from Dagster. - [dagster-looker] Pinned the looker-sdk dependency below 24.18.0 to avoid this issue: https://github.com/looker-open-source/sdk-codegen/issues/1518.
Bugfixes
- Fixed an issue which could cause incorrect evaluation results when using self-dependent partition mappings with
AutomationConditions
that operate over dependencies. - [ui] Fixed an issue where the breadcumb on asset pages would flicker nonstop.
- [dagster-embedded-elt] Fixed extraction of metadata for dlt assets whose source and destination identifiers differ.
- [dagster-databricks] Fixed a permissioning gap that existed with the
DatabricksPySparkStepLauncher
, so that permissions are now set correctly for non-admin users. - [dagster-dbt] Fixed an issue where column metadata generated with
fetch_column_metadata
did not work properly for models imported through dbt dependencies.
Documentation
- [dagster-k8s]
DagsterK8sPipesClient.run
now shows up in API docs.
Dagster Plus
- [ui] Fixed a bug in the catalog UI where owners filters were not applied correctly.
- [ui] Fixed width of the column lineage dropdown selector on the asset page.
- [ui] Column lineage now correctly renders when set on asset definition metadata
- [ui] Fixed Settings link on the list of deployments, for users in the legacy navigation flag.
1.8.11 (core) / 0.24.11 (libraries)
New
- [experimental] AutomationCondition.eager() will now only launch runs for missing partitions which become missing after the condition has been added to the asset. This avoids situations in which the eager policy kicks off a large amount of work when added to an asset with many missing historical static/dynamic partitions.
- [experimental] Added a new AutomationCondition.asset_matches() condition, which can apply a condition against an arbitrary asset in the graph.
- [experimental] Added the ability to specify multiple kinds for an asset with the kinds parameter.
- [dagster-github] Added
create_pull_request
method onGithubClient
that enables creating a pull request. - [dagster-github] Added
create_ref
method onGithubClient
that enables creating a new branch. - [dagster-embedded-elt] dlt assets now generate column metadata for child tables.
- [dagster-embedded-elt] dlt assets can now fetch row count metadata with
dlt.run(...).fetch_row_count()
for both partitioned and non-partitioned assets. Thanks @kristianandre! - [dagster-airbyte] relation identifier metadata is now attached to Airbyte assets.
- [dagster-embedded-elt] relation identifier metadata is now attached to sling assets.
- [dagster-embedded-elt] relation identifier metadata is now attached to dlt assets.
Bugfixes
PartitionedConfig
objects can now return aRunConfig
without causing a crash.- Corrected the
AssetIn.__new__
typing for the dagster_type argument. - [dagster-embedded-elt] dlt assets now generate correct column metadata after the first materialization.
- [dagster-embedded-elt] Sling's
fetch_row_count()
method now works for databases returning uppercase column names. Thanks @kristianandre! - [dagster-gcp] Ensure blob download is flushed to temporary file for
GCSFileManager.read
operations. Thanks @ollie-bell!
Dagster Plus
- Fixed a bug in the catalog UI where owners filters were not applied correctly.
1.8.10 (core) / 0.24.10 (libraries)
New
JobDefinition
,@job
, anddefine_asset_job
now take arun_tags
parameter. Ifrun_tags
are defined, they will be attached to all runs of the job, andtags
will not be. Ifrun_tags
is not set, thentags
are attached to all runs of the job (status quo behavior). This change enables the separation of definition-level and run-level tags on jobs.- Then env var
DAGSTER_COMPUTE_LOG_TAIL_WAIT_AFTER_FINISH
can now be used to pause before capturing logs (thanks @HynekBlaha!) - The
kinds
parameter is now available onAssetSpec
. OutputContext
now exposes theAssetSpec
of the asset that is being stored as an output (thanks, @marijncv!)- [experimental] Backfills are incorporated into the Runs page to improve observability and provide a more simplified UI. See the GitHub discussion for more details.
- [ui] The updated navigation is now enabled for all users. You can revert to the legacy navigation via a feature flag. See GitHub discussion for more.
- [ui] Improved performance for loading partition statuses of an asset job.
- [dagster-docker] Run containers launched by the DockerRunLauncher now include dagster/job_name and dagster/run_id labels.
- [dagster-aws] The ECS launcher now automatically retries transient ECS RunTask failures (like capacity placement failures).
Bugfixes
- Changed the log volume for global concurrency blocked runs in the run coordinator to be less spammy.
- [ui] Asset checks are now visible in the run page header when launched from a schedule.
- [ui] Fixed asset group outlines not rendering properly in Safari.
- [ui] Reporting a materialization event now removes the asset from the asset health "Execution failures" list and returns the asset to a green / success state.
- [ui] When setting an
AutomationCondition
on an asset, the label of this condition will now be shown in the sidebar on the Asset Details page. - [ui] Previously, filtering runs by Created date would include runs that had been updated after the lower bound of the requested time range. This has been updated so that only runs created after the lower bound will be included.
- [ui] When using the new experimental navigation flag, added a fix for the automations page for code locations that have schedules but no sensors.
- [ui] Fixed tag wrapping on asset column schema table.
- [ui] Restored object counts on the code location list view.
- [ui] Padding when displaying warnings on unsupported run coordinators has been corrected (thanks @hainenber!)
- [dagster-k8s] Fixed an issue where run termination sometimes did not terminate all step processes when using the k8s_job_executor, if the termination was initiated while it was in the middle of launching a step pod.
Documentation
- Corrections on the Dagster instance concept page (thanks @mheguy!)
- Corrections on the code locations concept page (thanks @tiberiuana!)
- Repeated words removed (thanks @tianzedavid!)
- [dagster-deltalake] Corrections and improvements (thanks @avriiil!)
- [dagster-aws] Added docs for PipesEMRServerlessClient.
- [dagster-cli] A guide on how to validate Dagster definitions using
dagster definitions validate
have been added. - [dagster-databricks] Added docs for using Databricks Pipes with existing clusters.
- [dagster-dbt] Corrected sample sql code (thanks @b-per!)
1.8.9 (core) / 0.24.9 (libraries)
New
AssetSpec
now has awith_io_manager_key
method that returns anAssetSpec
with the appropriate metadata entry to dictate the key for the IO manager used to load it. The deprecation warning forSourceAsset
now references this method.- Added a
max_runtime_seconds
configuration option to run monitoring, allowing you to specify that any run in your Dagster deployment should terminate if it exceeds a certain runtime. Prevoiusly, jobs had to be individually tagged with adagster/max_runtime
tag in order to take advantage of this feature. Jobs and runs can still be tagged in order to override this value for an individual run. - It is now possible to set both
tags
and a customexecution_fn
on aScheduleDefinition
. Scheduletags
are intended to annotate the definition and can be used to search and filter in the UI. They will not be attached to run requests emitted from the schedule if a customexecution_fn
is provided. If no customexecution_fn
is provided, then for back-compatibility the tags will also be automatically attached to run requests emitted from the schedule. SensorDefinition
and all of its variants/decorators now accept atags
parameter. The tags annotate the definition and can be used to search and filter in the UI.- Added the
dagster definitions validate
command to Dagster CLI. This command validates if Dagster definitions are loadable. - [dagster-databricks] Databricks Pipes now allow running tasks in existing clusters.
Bugfixes
- Fixed an issue where calling
build_op_context
in a unit test would sometimes raise aTypeError: signal handler must be signal.SIG_IGN, signal.SIG_DFL, or a callable object
Exception on process shutdown. - [dagster-webserver] Fix an issue where the incorrect sensor/schedule state would appear when using
DefaultScheduleStatus.STOPPED
/DefaultSensorStatus.STOPPED
after performing a reset.
Documentation
- [dagster-pipes] Fixed inconsistencies in the k8s pipes example.
- [dagster-pandas-pyspark] Fixed example in the Spark/Pandas SDA guide.
Dagster Plus
- Fixed an issue where users with Launcher permissions for a particular code location were not able to cancel backfills targeting only assets in that code location.
- Fixed an issue preventing long-running alerts from being sent when there was a quick subsequent run.
1.8.8 (core) / 0.24.8 (libraries)
New
- Added
--partition-range
option todagster asset materialize
CLI. This option only works for assets with single-run Backfill Policies. - Added a new
.without()
method toAutomationCondition.eager()
,AutomationCondition.on_cron()
, andAutomationCondition.on_missing()
which allows sub-conditions to be removed, e.g.AutomationCondition.eager().without(AutomationCondition.in_latest_time_window())
. - Added
AutomationCondition.on_missing()
, which materializes an asset partition as soon as all of its parent partitions are filled in. pyproject.toml
can now load multiple Python modules as individual Code Locations. Thanks, @bdart!- [ui] If a code location has errors, a button will be shown to view the error on any page in the UI.
- [dagster-adls2] The
ADLS2PickleIOManager
now acceptslease_duration
configuration. Thanks, @0xfabioo! - [dagster-embedded-elt] Added an option to fetch row count metadata after running a Sling sync by calling
sling.replicate(...).fetch_row_count()
. - [dagster-fivetran] The dagster-fivetran integration will now automatically pull and attach column schema metadata after each sync.
Bugfixes
- Fixed an issue which could cause errors when using
AutomationCondition.any_downstream_condition()
with downstreamAutoMaterializePolicy
objects. - Fixed an issue where
process_config_and_initialize
did not properly handle processing nested resource config. - [ui] Fixed an issue that would cause some AutomationCondition evaluations to be labeled
DepConditionWrapperCondition
instead of the key that they were evaluated against. - [dagster-webserver] Fixed an issue with code locations appearing in fluctuating incorrect state in deployments with multiple webserver processes.
- [dagster-embedded-elt] Fixed an issue where Sling column lineage did not correctly resolve int the Dagster UI.
- [dagster-k8s] The
wait_for_pod
check now waits until all pods are available, rather than erroneously returning after the first pod becomes available. Thanks @easontm!
Dagster Plus
- Backfill daemon logs are now available in the "Coordinator Logs" tab in a backfill details page.
- Users without proper code location permissions can no longer edit sensor cursors.
1.8.7 (core) / 0.24.7 (libraries)
New
- The
AssetSpec
constructor now raises an error if an invalid group name is provided, instead of an error being raised when constructing theDefinitions
object. dagster/relation_identifier
metadata is now automatically attached to assets which are stored using a DbIOManager.- [ui] Streamlined the code location list view.
- [ui] The “group by” selection on the Timeline Overview page is now part of the query parameters, meaning it will be retained when linked to directly or when navigating between pages.
- [dagster-dbt] When instantiating
DbtCliResource
, theproject_dir
argument will now override theDBT_PROJECT_DIR
environment variable if it exists in the local environment (thanks, @marijncv!). - [dagster-embedded-elt] dlt assets now generate
rows_loaded
metadata (thanks, @kristianandre!). - Added support for pydantic version 1.9.0.
Bugfixes
- Fixed a bug where setting
asset_selection=[]
onRunRequest
objects yielded from sensors usingasset_selection
would select all assets instead of none. - Fixed bug where the tick status filter for batch-fetched graphql sensors was not being respected.
- [examples] Fixed missing assets in
assets_dbt_python
example. - [dagster-airbyte] Updated the op names generated for Airbyte assets to include the full connection ID, avoiding name collisions.
- [dagster-dbt] Fixed issue causing dagster-dbt to be unable to load dbt projects where the adapter did not have a
database
field set (thanks, @dargmuesli!) - [dagster-dbt] Removed a warning about not being able to load the
dbt.adapters.duckdb
module when loading dbt assets without that package installed.
Documentation
- Fixed typo on the automation concepts page (thanks, @oedokumaci!)
Dagster Plus
- You may now wipe specific asset partitions directly from the execution context in user code by calling
DagsterInstance.wipe_asset_partitions
. - Dagster+ users with a "Viewer" role can now create private catalog views.
- Fixed an issue where the default IOManager used by Dagster+ Serverless did not respect setting
allow_missing_partitions
as metadata on a downstream asset.
1.8.6 (core) / 0.24.6 (libraries)
Bugfixes
- Fixed an issue where runs in Dagster+ Serverless that materialized partitioned assets would sometimes fail with an
object has no attribute '_base_path'
error. - [dagster-graphql] Fixed an issue where the
statuses
filter argument to thesensorsOrError
GraphQL field was sometimes ignored when querying GraphQL for multiple sensors at the same time.
1.8.5 (core) / 0.24.5 (libraries)
New
- Updated multi-asset sensor definition to be less likely to timeout queries against the asset history storage.
- Consolidated the
CapturedLogManager
andComputeLogManager
APIs into a single base class. - [ui] Added an option under user settings to clear client side indexeddb caches as an escape hatch for caching related bugs.
- [dagster-aws, dagster-pipes] Added a new
PipesECSClient
to allow Dagster to interface with ECS tasks. - [dagster-dbt] Increased the default timeout when terminating a run that is running a
dbt
subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds. - [dagster-sdf] Increased the default timeout when terminating a run that is running an
sdf
subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds. - [dagster-sdf] Added support for caching and asset selection (Thanks, akbog!)
- [dagster-dlt] Added support for
AutomationCondition
usingDagsterDltTranslator.get_automation_condition()
(Thanks, aksestok!) - [dagster-k8s] Added support for setting
dagsterDaemon.runRetries.retryOnAssetOrOpFailure
to False in the Dagster Helm chart to prevent op retries and run retries from simultaneously firing on the same failure. - [dagster-wandb] Removed usage of deprecated
recursive
parameter (Thanks, chrishiste!)
Bugfixes
- [ui] Fixed a bug where in-progress runs from a backfill could not be terminated from the backfill UI.
- [ui] Fixed a bug that caused an "Asset must be part of at least one job" error when clicking on an external asset in the asset graph UI
- Fixed an issue where viewing run logs with the latest 5.0 release of the watchdog package raised an exception.
- [ui] Fixed issue causing the “filter to group” action in the lineage graph to have no effect.
- [ui] Fixed case sensitivity when searching for partitions in the launchpad.
- [ui] Fixed a bug which would redirect to the events tab for an asset if you loaded the partitions tab directly.
- [ui] Fixed issue causing runs to get skipped when paging through the runs list (Thanks, @HynekBlaha!)
- [ui] Fixed a bug where the asset catalog list view for a particular group would show all assets.
- [dagster-dbt] fix bug where empty newlines in raw dbt logs were not being handled correctly.
- [dagster-k8s, dagster-celery-k8s] Correctly set
dagster/image
label when image is provided fromuser_defined_k8s_config
. (Thanks, @HynekBlaha!) - [dagster-duckdb] Fixed an issue for DuckDB versions older than 1.0.0 where an unsupported configuration option,
custom_user_agent
, was provided by default - [dagster-k8s] Fixed an issue where Kubernetes Pipes failed to create a pod if the op name contained capital or non-alphanumeric containers.
- [dagster-embedded-elt] Fixed an issue where dbt assets downstream of Sling were skipped
Deprecations
- [dagser-aws]: Direct AWS API arguments in
PipesGlueClient.run
have been deprecated and will be removed in1.9.0
. The newparams
argument should be used instead.
Dagster Plus
- Fixed a bug that caused an error when loading the launchpad for a partition, when using Dagster+ with an agent with version below 1.8.2.
- Fixed an issue where terminating a Dagster+ Serverless run wouldn’t forward the termination signal to the job to allow it to cleanly terminate.
1.8.4 (core) / 0.24.4 (libraries)
Bugfixes
- Fixed an issue where viewing run logs with the latest 5.0 release of the watchdog package raised an exception.
- Fixed a bug that caused an "Asset must be part of at least one job" error when clicking on an external asset in the asset graph UI
Dagster Plus
- The default io_manager on Serverless now supports the
allow_missing_partitions
configuration option. - Fixed a bug that caused an error when loading the launchpad for a partition, when using in Dagster+ with an agent with version below 1.8.2
1.8.3 (core) / 0.24.3 (libraries) (YANKED)
This version of Dagster resulted in errors when trying to launch runs that target individual asset partitions)
New
- When different assets within a code location have different
PartitionsDefinition
s, there will no longer be an implicit asset job__ASSET_JOB_...
for eachPartitionsDefinition
; there will just be one with all the assets. This reduces the time it takes to load code locations with assets with many differentPartitionsDefinition
s.
1.8.2 (core) / 0.24.2 (libraries)
New
- [ui] Improved performance of the Automation history view for partitioned assets
- [ui] You can now delete dynamic partitions for an asset from the ui
- [dagster-sdf] Added support for quoted table identifiers (Thanks, @akbog!)
- [dagster-openai] Add additional configuration options for the
OpenAIResource
(Thanks, @chasleslr!) - [dagster-fivetran] Fivetran assets now have relation identifier metadata.
Bugfixes
- [ui] Fixed a collection of broken links pointing to renamed Declarative Automation pages.
- [dagster-dbt] Fixed issue preventing usage of
MultiPartitionMapping
with@dbt_assets
(Thanks, @arookieds!) - [dagster-azure] Fixed issue that would cause an error when configuring an
AzureBlobComputeLogManager
without asecret_key
(Thanks, @ion-elgreco and @HynekBlaha!)
Documentation
- Added API docs for
AutomationCondition
and associated static constructors. - [dagster-deltalake] Corrected some typos in the integration reference (Thanks, @dargmuesli!)
- [dagster-aws] Added API docs for the new
PipesCloudWatchMessageReader
1.8.1 (core) / 0.24.1 (libraries)
New
- If the sensor daemon fails while submitting runs, it will now checkpoint its progress and attempt to submit the remaining runs on the next evaluation.
build_op_context
andbuild_asset_context
now accepts arun_tags
argument.- Nested partially configured resources can now be used outside of
Definitions
. - [ui] Replaced GraphQL Explorer with GraphiQL.
- [ui] The run timeline can now be grouped by job or by automation.
- [ui] For users in the experimental navigation flag, schedules and sensors are now in a single merged automations table.
- [ui] Logs can now be filtered by metadata keys and values.
- [ui] Logs for
RUN_CANCELED
events now display relevant error messages. - [dagster-aws] The new
PipesCloudWatchMessageReader
can consume logs from CloudWatch as pipes messages. - [dagster-aws] Glue jobs launched via pipes can be automatically canceled if Dagster receives a termination signal.
- [dagster-azure]
AzureBlobComputeLogManager
now supports service principals, thanks @ion-elgreco! - [dagster-databricks]
dagster-databricks
now supportsdatabricks-sdk<=0.17.0
. - [dagster-datahub]
dagster-datahub
now allows pydantic versions below 3.0.0, thanks @kevin-longe-unmind! - [dagster-dbt] The
DagsterDbtTranslator
class now supports a modfiying theAutomationCondition
for dbt models by overridingget_automation_condition
. - [dagster-pandera]
dagster-pandera
now supportspolars
. - [dagster-sdf] Table and columns tests can now be used as asset checks.
- [dagster-embedded-elt] Column metadata and lineage can be fetched on Sling assets by chaining the new
replicate(...).fetch_column_metadata()
method. - [dagster-embedded-elt] dlt resource docstrings will now be used to populate asset descriptions, by default.
- [dagster-embedded-elt] dlt assets now generate column metadata.
- [dagster-embedded-elt] dlt transformers now refer to the base resource as upstream asset.
- [dagster-openai]
OpenAIResource
now supportsorganization
,project
andbase_url
for configurting the OpenAI client, thanks @chasleslr! - [dagster-pandas][dagster-pandera][dagster-wandb] These libraries no longer pin
numpy<2
, thanks @judahrand!
Bugfixes
- Fixed a bug for job backfills using backfill policies that materialized multiple partitions in a single run would be launched multiple times.
- Fixed an issue where runs would sometimes move into a FAILURE state rather than a CANCELED state if an error occurred after a run termination request was started.
- [ui] Fixed a bug where an incorrect dialog was shown when canceling a backfill.
- [ui] Fixed the asset page header breadcrumbs for assets with very long key path elements.
- [ui] Fixed the run timeline time markers for users in timezones that have off-hour offsets.
- [ui] Fixed bar chart tooltips to use correct timezone for timestamp display.
- [ui] Fixed an issue introduced in the 1.8.0 release where some jobs created from graph-backed assets were missing the “View as Asset Graph” toggle in the Dagster UI.
Breaking Changes
- [dagster-airbyte]
AirbyteCloudResource
now supportsclient_id
andclient_secret
for authentication - theapi_key
approach is no longer supported. This is motivated by the deprecation of portal.airbyte.com on August 15, 2024.
Deprecations
- [dagster-databricks] Removed deprecated authentication clients provided by
databricks-cli
anddatabricks_api
- [dagster-embedded-elt] Removed deprecated Sling resources
SlingSourceConnection
,SlingTargetConnection
- [dagster-embedded-elt] Removed deprecated Sling resources
SlingSourceConnection
,SlingTargetConnection
- [dagster-embedded-elt] Removed deprecated Sling methods
build_sling_assets
, andsync
Documentation
- The Integrating Snowflake & dbt with Dagster+ Insights guide no longer erroneously references BigQuery, thanks @dnxie12!
1.8.0 (core) / 0.24.0 (libraries)
Major changes since 1.7.0 (core) / 0.22.0 (libraries)
Core definition APIs
- You can now pass
AssetSpec
objects to theassets
argument ofDefinitions
, to let Dagster know about assets without associated materialization functions. This replaces the experimentalexternal_assets_from_specs
API, as well asSourceAsset
s, which are now deprecated. UnlikeSourceAsset
s,AssetSpec
s can be used for non-materializable assets with dependencies on Dagster assets, such as BI dashboards that live downstream of warehouse tables that are orchestrated by Dagster. [docs]. - [Experimental] You can now merge
Definitions
objects together into a single largerDefinitions
object, using the newDefinitions.merge
API (doc). This makes it easier to structure large Dagster projects, as you can construct aDefinitions
object for each sub-domain and then merge them together at the top level.
Partitions and backfills
BackfillPolicy
s assigned to assets are now respected for backfills launched from jobs that target those assets.- You can now wipe materializations for individual asset partitions.
Automation
- [Experimental] You can now add
AutomationCondition
s to your assets to have them automatically executed in response to specific conditions (docs). These serve as a drop-in replacement and improvement over theAutoMaterializePolicy
system, which is being marked as deprecated. - [Experimental] Sensors and schedules can now directly target assets, via the new
target
parameter, instead of needing to construct a job. - [Experimental] The Timeline page can now be grouped by job or automation. When grouped by automation, all runs launched by a sensor responsible for evaluating automation conditions will get bucketed to that sensor in the timeline instead of the "Ad-hoc materializations" row. Enable this by opting in to the
Experimental navigation
feature flag in user settings.
Catalog
- The Asset Details page now prominently displays row count and relation identifier (table name, schema, database), when corresponding asset metadata values are provided. For more information, see the metadata and tags docs.
- Introduced code reference metadata which can be used to open local files in your editor, or files in source control in your browser. Dagster can automatically attach code references to your assets’ Python source. For more information, see the docs.
Data quality and reliability
- [Experimental] Metadata bound checks – The new
build_metadata_bounds_checks
API [doc] enables easily defining asset checks that fail if a numeric asset metadata value falls outside given bounds. - [Experimental] Freshness checks from dbt config - Freshness checks can now be set on dbt assets, straight from dbt. Check out the API docs for build_freshness_checks_from_dbt_assets for more.
Integrations
- Dagster Pipes (
PipesSubprocessClient
) and its integrations with Lambda (PipesLambdaClient
), Kubernetes (PipesK8sClient
), and Databricks (PipesDatabricksClient
) are no longer experimental. - The new
DbtProject
class (docs) makes it simpler to define dbt assets that can be constructed in both development and production.DbtProject.prepare_if_dev()
eliminates boilerplate for local development, and thedagster-dbt project prepare-and-package
CLI can helps pull deps and generate the manifest at build time. - [Experimental] The
dagster-looker
package can be used to define a set of Dagster assets from a Looker project that is defined in LookML and is backed by git. See the GitHub discussion for more details.
Dagster Plus
- Catalog views — In Dagster+, selections into the catalog can now be saved and shared across an organization as catalog views. Catalog views have a name and description, and can be applied to scope the catalog, asset health, and global asset lineage pages against the view’s saved selection.
- Code location history — Dagster+ now stores a history of code location deploys, including the ability to revert to a previously deployed configuration.
Changes since 1.7.16 (core) / 0.22.16 (libraries)
New
-
The target of both schedules and sensors can now be set using an experimental
target
parameter that accepts anAssetSelection
or list of assets. Any assets passed this way will also be included automatically in theassets
list of the containingDefinitions
object. -
ScheduleDefinition
andSensorDefinition
now have atarget
argument that can accept anAssetSelection
. -
You can now wipe materializations for individual asset partitions.
-
AssetSpec
now has apartitions_def
attribute. All theAssetSpec
s provided to a@multi_asset
must have the samepartitions_def
. -
The
assets
argument onmaterialize
now acceptsAssetSpec
s. -
The
assets
argument onDefinitions
now acceptsAssetSpec
s. -
The new
merge
method onDefinitions
enables combining multipleDefinitions
object into a single largerDefinition
s object with their combined contents. -
Runs requested through the Declarative Automation system now have a
dagster/from_automation_condition: true
tag applied to them. -
Changed the run tags query to be more performant. Thanks @egordm!
-
Dagster Pipes and its integrations with Lambda, Kubernetes, and Databricks are no longer experimental.
-
The
Definitions
constructor will no longer raise errors when the provided definitions aren’t mutually resolve-able – e.g. when there are conflicting definitions with the same name, unsatisfied resource dependencies, etc. These errors will still be raised at code location load time. The newDefinitions.validate_loadable
static method also allows performing the validation steps that used to occur in constructor. -
AssetsDefinitions
object provided to aDefinitions
object will now be deduped by reference equality. That is, the following will now work:from dagster import asset, Definitions
@asset
def my_asset(): ...
defs = Definitions(assets=[my_asset, my_asset]) # Deduped into just one AssetsDefinition. -
[dagster-embedded-elt] Adds translator options for dlt integration to override auto materialize policy, group name, owners, and tags
-
[dagster-sdf] Introducing the dagster-sdf integration for data modeling and transformations powered by sdf.
-
[dagster-dbt] Added a new
with_insights()
method which can be used to more easily attach Dagster+ Insights metrics to dbt executions:dbt.cli(...).stream().with_insights()
Bugfixes
- Dagster now raises an error when an op yields an output corresponding to an unselected asset.
- Fixed a bug that caused downstream ops within a graph-backed asset to be skipped when they were downstream of assets within the graph-backed assets that aren’t part of the selection for the current run.
- Fixed a bug where code references did not work properly for self-hosted GitLab instances. Thanks @cooperellidge!
- [ui] When engine events with errors appear in run logs, their metadata entries are now rendered correctly.
- [ui] The asset catalog greeting now uses your first name from your identity provider.
- [ui] The create alert modal now links to the alerting documentation, and links to the documentation have been updated.
- [ui] Fixed an issue introduced in the 1.7.13 release where some asset jobs were only displaying their ops in the Dagster UI instead of their assets.
- Fixed an issue where terminating a run while it was using the Snowflake python connector would sometimes move it into a FAILURE state instead of a CANCELED state.
- Fixed an issue where backfills would sometimes move into a FAILURE state instead of a CANCELED state when the backfill was canceled.
Breaking Changes
- The experimental and deprecated
build_asset_with_blocking_check
has been removed. Use theblocking
argument on@asset_check
instead. - Users with
mypy
andpydantic
1 may now experience a “metaclass conflict” error when usingConfig
. Previously this would occur when using pydantic 2. AutoMaterializeSensorDefinition
has been renamedAutomationConditionSensorDefinition
.- The deprecated methods of the
ComputeLogManager
have been removed. CustomComputeLogManager
implementations must also implement theCapturedLogManager
interface. This will not affect any of the core implementations available in the coredagster
package or the library packages. - By default, an
AutomationConditionSensorDefinition
with the name“default_automation_condition_sensor”
will be constructed for each code location, and will handle evaluating and launching runs for allAutomationConditions
andAutoMaterializePolicies
within that code location. You can restore the previous behavior by setting:in your dagster.yaml file.auto_materialize:
use_sensors: False - [dagster-dbt] Support for
dbt-core==1.6.*
has been removed because the version is now end-of-life. - [dagster-dbt] The following deprecated APIs have been removed:
KeyPrefixDagsterDbtTranslator
has been removed. To modify the asset keys for a set of dbt assets, implementDagsterDbtTranslator.get_asset_key()
instead.- Support for setting freshness policies through dbt metadata on field
+meta.dagster_freshness_policy
has been removed. Use+meta.dagster.freshness_policy
instead. - Support for setting auto-materialize policies through dbt metadata on field
+meta.dagster_auto_materialize_policy
has been removed. Use+meta.dagster.auto_materialize_policy
instead. - Support for
load_assets_from_dbt_project
,load_assets_from_dbt_manifest
, anddbt_cli_resource
has been removed. Use@dbt_assets
,DbtCliResource
, andDbtProject
instead to define how to load dbt assets from a dbt project and to execute them. - Support for rebuilt ops like
dbt_run_op
,dbt_compile_op
, etc has been removed. Use@op
andDbtCliResource
directly to execute dbt commands in an op.
- Properties on
AssetExecutionContext
,OpExecutionContext
, andScheduleExecutionContext
that includedatetime
s now return standard Pythondatetime
objects instead of Pendulum datetimes. The types in the public API for these properties have always beendatetime
and this change should not be breaking in the majority of cases, but Pendulum datetimes include some additional methods that are not present on standard Pythondatetime
s, and any code that was using those methods will need to be updated to either no longer use those methods or transform thedatetime
into a Pendulum datetime. See the 1.8 migration guide for more information and examples. MemoizableIOManager
,VersionStrategy
,SourceHashVersionStrategy
,OpVersionContext
,ResourceVersionContext
, andMEMOIZED_RUN_TAG
, which have been deprecated and experimental since pre-1.0, have been removed.
Deprecations
- The Run Status column of the Backfills page has been removed. This column was only populated for backfills of jobs. To see the run statuses for job backfills, click on the backfill ID to get to the Backfill Details page.
- The experimental
external_assets_from_specs
API has been deprecated. Instead, you can directly passAssetSpec
objects to theassets
argument of theDefinitions
constructor. AutoMaterializePolicy
has been marked as deprecated in favor ofAutomationCondition
, which provides a significantly more flexible and customizable interface for expressing when an asset should be executed. More details on how to migrate yourAutoMaterializePolicies
can be found in the Migration Guide.SourceAsset
has been deprecated. See the major changes section and migration guide for more details.- The
asset_partition_key_for_output
,asset_partition_keys_for_output
, andasset_partition_key_range_for_output
, andasset_partitions_time_window_for_output
methods onOpExecutionContext
have been deprecated. Instead, use the corresponding property:partition_key
,partition_keys
,partition_key_range
, orpartition_time_window
. - The
partitions_def
parameter ondefine_asset_job
is now deprecated. Thepartitions_def
for an asset job is determined from thepartitions_def
attributes on the assets it targets, so this parameter is redundant. - [dagster-shell]
create_shell_command_op
andcreate_shell_script_op
have been marked as deprecated in favor ofPipesSubprocessClient
(see details in Dagster Pipes subprocess reference) - [dagster-airbyte]
load_assets_from_airbyte_project
is now deprecated, because the Octavia CLI that it relies on is an experimental feature that is no longer supported. Usebuild_airbyte_assets
orload_assets_from_airbyte_project
instead.
Documentation
- The Asset Checks concept overview page now includes a table with all the built-in asset checks.
- The Asset Metadata page concept page now includes a table with all the standard “dagster/” metadata keys.
- Fixed a typo in the documentation for
MonthlyPartitionsDefinition
. Thanks @zero_stroke! - Added a new page about Declarative Automation and a guide about customizing automation conditions
- Fixed a link in the Limiting concurrency guide.
Dagster Plus
- In Dagster+, selections into the catalog can now be saved and shared across an organization as catalog views. Catalog views have a name and description, and can be applied to scope the catalog, asset health, and global asset lineage pages against the view’s saved selection.
- In Dagster+ run alerts, if you are running Dagster 1.8 or greater in your user code, you will now receive exception-level information in the alert body.
1.7.16 (core) / 0.23.16 (libraries)
Experimental
- [pipes] PipesGlueClient, an AWS Glue pipes client has been added to
dagster_aws
.
1.7.15 (core) / 0.23.15 (libraries)
New
- [dagster-celery-k8s] Added a
per_step_k8s_config
configuration option to thecelery_k8s_job_executor
, allowing the k8s configuration of individual steps to be configured at run launch time. Thanks @alekseik1! - [dagster-dbt] Deprecated the
log_column_level_metadata
macro in favor of the newwith_column_metadata
API. - [dagster-airbyte] Deprecated
load_assets_from_airbyte_project
as the Octavia CLI has been deprecated.
Bugfixes
- [ui] Fix global search to find matches on very long strings.
- Fixed an issue introduced in the 1.7.14 release where multi-asset sensors would sometimes raise an error about fetching too many event records.
- Fixes an issue introduced in 1.7.13 where type-checkers interpretted the return type of
RunRequest(...)
asNone
- [dagster-aws] Fixed an issue where the
EcsRunLauncher
would sometimes fail to launch runs when theinclude_sidecars
option was set toTrue
. - [dagster-dbt] Fixed an issue where errors would not propagate through deferred metadata fetches.
Dagster Plus
- On June 20, 2024, AWS changed the AWS CloudMap CreateService API to allow resource-level permissions. The Dagster+ ECS Agent uses this API to launch code locations. We’ve updated the Dagster+ ECS Agent CloudFormation template to accommodate this change for new users. Existing users have until October 14, 2024 to add the new permissions and should have already received similar communication directly from AWS.
- Fixed a bug with BigQuery cost tracking in Dagster+ insights, where some runs would fail if there were null values for either
total_byte_billed
ortotal_slot_ms
in the BigQueryINFORMATION_SCHEMA.JOBS
table. - Fixed an issue where code locations that failed to load with extremely large error messages or stack traces would sometimes cause errors with agent heartbeats until the code location was redeployed.
1.7.14 (core) / 0.23.14 (libraries)
New
- [blueprints] When specifying an asset key in
ShellCommandBlueprint
, you can now use slashes as a delimiter to generate anAssetKey
with multiple path components. - [community-controbution][mlflow] The mlflow resource now has a
mlflow_run_id
attribute (Thanks Joe Percivall!) - [community-contribution][mlflow] The mlflow resource will now retry when it fails to fetch the mlflow run ID (Thanks Joe Percivall!)
Bugfixes
- Fixed an issue introduced in the 1.7.13 release where Dagster would fail to load certain definitions when using Python 3.12.4.
- Fixed an issue where in-progress steps would continue running after an unexpected exception caused a run to fail.
- [dagster-dbt] Fixed an issue where column lineage was unable to be built in self-referential incremental models.
- Fixed an issue where
dagster dev
was logging unexpectedly without thegrpcio<1.65.0
pin. - Fixed an issue where a
ContextVar was created in a different context
error was raised when executing an async asset. - [community-contribution]
multi_asset
type-checker fix from @aksestok, thanks! - [community-contribution][ui] Fix to use relative links for manifest/favicon files, thanks @aebrahim!
Documentation
- [community-contribution] Fixed helm repo CLI command typo, thanks @fxd24!
Dagster Plus
- [ui] The deployment settings yaml editor is now on a page with its own URL, instead of within a dialog.
1.7.13 (core) / 0.23.13 (libraries)
New
- The
InputContext
passed to anIOManager
’sload_input
function when invoking theoutput_value
oroutput_for_node
methods onJobExecutionResult
now has the name"dummy_input_name"
instead ofNone
. - [dagster-ui] Asset materializations can now be reported from the dropdown menu in the asset list view.
- [dagster-dbt]
DbtProject
is adopted and no longer experimental. UsingDbtProject
helps achieve a setup where the dbt manifest file and dbt dependencies are available and up-to-date, during development and in production. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dagster_dbt.DbtProject. - [dagster-dbt] The
—use-dbt-project
flag was introduced for the cli commanddagster-dbt project scaffold
. Creating a Dagster project wrapping a dbt project using that flag will include aDbtProject
. - [dagster-ui] The Dagster UI now loads events in batches of 1000 in the run log viewer, instead of batches of 10000. This value can be adjusted by setting the
DAGSTER_UI_EVENT_LOAD_CHUNK_SIZE
environment variable on the Dagster webserver. - Asset backfills will now retry if there is an unexpected exception raised in the middle of the backfill. Previously, they would only retry if there was a problem connecting to the code server while launching runs in the backfill.
- Added the ability to monitor jobs which have failed to start in time with the
RunFailureReason.START_TIMEOUT
run monitoring failure reason. Thanks @jobicarter! - [experimental] Introduced the ability to attach code references to your assets, which allow you to view source code for an asset in your editor or in git source control. For more information, see the code references docs: https://docs.dagster.io/guides/dagster/code-references.
- [ui] Performance improvements to loading the asset overview tab.
- [ui] Performance improvements for rendering gantt charts with 1000’s of ops/steps.
- [dagster-celery] Introduced a new Dagster Celery runner, a more lightweight way to run Dagster jobs without an executor. Thanks, @egordm!
Bugfixes
- Fixed a bug that caused tags added to
ObserveResult
objects to not be stored with the producedAssetObservation
event. - Fixed a bug which could cause
metadata
defined onSourceAssets
to be unavailable when accessed in an IOManager. - For subselections of graph-backed multi-assets, there are some situations where we used to unnecessarily execute some of the non-selected assets. Now, we no longer execute them in those situations. There are also some situations where we would skip execution of some ops that might be needed. More information on the particulars is available here.
- Fixed the
@graph_asset
decorator overload missing anowners
argument, thanks @askvinni! - Fixed behavior of passing custom image config to the K8sRunLauncher, thanks @marchinho11!
- [dagster-dbt] Fixed an issue with emitting column lineage when using BigQuery.
- [dagster-k8s] Added additional retries to
execute_k8s_job
when there was a transient failure while loading logs from the launched job. Thanks @piotrmarczydlo! - [dagster-fivetran] Fixed an issue where the Fivetran connector resource would sometimes hang if there was a networking issue while connecting to the Fivetran API.
- [dagster-aws] Fixed an issue where the EMR step launcher would sometimes fail due to multiple versions of the
dateutil
package being installed in the default EMR python evnrionment. - [ui] The “Create date” column in the runs table now correctly shows the time at which a run was created instead of the time when it started to execute.
- [ui] Fixed dark mode colors in run partitions graphs.
- [auto-materialize] Fixed an issue which could cause errors in the
AutoMaterializeRule.skip_on_parent_missing
rule when a parent asset had itsPartitionsDefinition
changed. - [declarative-automation] Fixed an issue which could cause errors when viewing the evaluation history of assets with
AutomationConditions
. - [declarative-automation] Previously,
AutomationCondition.newly_updated()
would trigger on anyASSET_OBSERVATION
event. Now, it only triggers when the data version on that event changes.
Breaking Changes
- [dagster-dbt] The cli command
dagster-dbt project prepare-for-deployment
has been replaced bydagster-dbt project prepare-and-package
. - [dagster-dbt] During development,
DbtProject
no longer prepares the dbt manifest file and dbt dependencies in its constructor during initialization. This process has been moved toprepare_if_dev()
, that can be called on theDbtProject
instance after initialization. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dagster_dbt.DbtProject.prepare_if_dev.
Deprecations
- Passing
GraphDefinition
as thejob
argument to schedules and sensors is deprecated. Derive a job from theGraphDefinition
usinggraph_def.to_job()
and pass this instead.
Documentation
- Added some additional copy, headings, and other formatting to the dbt quickstart.
- Added information about asset checks to the Testing assets guide.
- Updated
dagster-plus CLI
in the sidenav to correctly bedagster-cloud CLI
. - Thanks to Tim Nordenfur and Dimitar Vanguelov for fixing a few typos!
- Introduced guides to migrate Airflow pipelines to Dagster that leverage the TaskFlow API or are containerized and executed with an operator like the KubernetesPodOperator.
- Fixed instructions on setting secrets in Kubernetes Dagster deployments, thanks @abhinavDhulipala!
Dagster Plus
- A history of code location deploys can now be viewed on the Code Locations tab under the Deployment view. Previously deployed versions can now be restored from history.
- [ui] Various improvements have been made to the asset health dashboard, which is now no longer experimental.
- [ui] Fixed issues in per-event asset insights where bar charts incorrectly displayed events in reverse order, and with UTC timestamps.
- Fixed a recent regression where creating an alert that notifies asset owners that are teams raises an error.
1.7.12 (core)/ 0.23.12 (libraries)
Bugfixes
- [ui] fixes behavior issues with jobs and asset pages introduced in 1.7.11
1.7.11 (core)/ 0.23.11 (libraries)
New
- [ui] Improved performance for loading assets that are part of big asset graphs.
- [ui] Improved performance for loading job backfills that have thousands of partitions
- [ui] The code location page can now be filtered by status
- [agent] K8s and ECS agent main loop writes a sentinel file that can be used for liveness checks.
- [agent][experimental] ECS CloudFormation template with private IP addresses using NAT Gateways, security groups, IAM role separation, tighter permissions requirements, and improved documentation.
- Ephemeral asset jobs are now supported in run status sensors (thanks @the4thamigo-uk)!
Bugfixes
- In
AssetsDefinition
construction, enforce single key per output name - Fixed a bug where freshness checks on assets with both observations and materializations would incorrectly miss a materialization if there’s no observation with
dagster/last_updated_timestamp
. - Fixed a bug with anomaly detection freshness checks where “not enough records” result would cause the sensor to crash loop.
- Fixed a bug that could cause errors in the Asset Daemon if an asset using
AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron()
rule gained a new dependency with a different PartitionsDefinition. - [ui] Fixed an issue that caused the backfill page not to be scrollable.
- [ui] Fixed an issue where filtering by partition on the Runs page wouldn’t work if fetching all of your partitions timed out.
- [dagster-dlt] Fixed bug with dlt integration in which partitioned assets would change the file name when using the filesystem destination.
- [ui] Fixed an issue where an erroring code location would cause multiple toast popups.
- Allow a string to be provided for
source_key_prefix
arg ofload_assets_from_modules
. (thanks @drjlin)! - Added a missing debug level log message when loading partitions with polars (thanks Daniel Gafni)!
- Set postgres timeout via statement, which improves storage-layer compatibility with Amazon RDS (thanks @james lewis)!
- In DBT integration, quote the table identifiers to handle cases where table names require quotes due to special characters. (thanks @alex launi)!
- remove deprecated param usage in dagster-wandb integration (thanks @chris histe)!
- Add missing QUEUED state to DatabricksRunLifeCycleState (thanks @gabor ratky)!
- Fixed a bug with dbt-cloud integration subsetting implementation (thanks @ivan tsarev)!
Breaking Changes
- [dagster-airflow]
load_assets_from_airflow_dag
no longer allows multiple tasks to materialize the same asset.
Documentation
- Added type-hinting to backfills example
- Added syntax highlighting to some examples (thanks @Niko)!
- Fixed broken link (thanks @federico caselli)!
Dagster Plus
- The
dagster-cloud ci init
CLI will now use the--deployment
argument as the base deployment when creating a branch deployment. This base deployment will be used for Change Tracking. - The BigQuery dbt insights wrapper
dbt_with_bigquery_insights
now respects CLI arguments for profile configuration and also selects location / dataset from the profile when available. - [experimental feature] Fixes a recent regression where the UI errored upon attempting to create an insights metric alert.
1.7.10 (core)/ 0.23.10 (libraries)
New
- Performance improvements when rendering the asset graph while runs are in progress.
- A new API
build_freshness_checks_for_dbt_assets
which allows users to parameterize freshness checks entirely within dbt. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dbt-dagster-dbt. - Asset search results now display compute and storage kind icons.
- Asset jobs where the underlying assets have multiple backfill policies will no longer fail at definition time. Instead, the backfill policy for the job will use the minimum
max_partitions_per_run
from the job’s constituent assets. - [dagstermill]
asset_tags
can now be specified when building dagstermill assets - [dagster-embedded-elt] Custom asset tags can be applied to Sling assets via the
DagsterSlingTranslator
- [dagster-embedded-elt] dlt assets now automatically have
dagster/storage_kind
tags attached
Bugfixes
tags
passed toouts
ingraph_multi_asset
now get correctly propagated to the resulting assets.- [ui] Fixed an issue in the where when multiple runs were started at the same time to materialize the same asset, the most recent one was not always shown as in progress in the asset graph in the Dagster UI.
- The “newly updated” auto-materialize rule will now respond to either new observations or materializations for observable assets.
build_metadata_bounds_checks
now no longer errors when targeting metadata keys that have special characters.
Documentation
- The Schedule concept docs got a revamp! Specifically, we:
- Updated the Schedule concept page to be a “jumping off” point for all-things scheduling, including a high-level look at how schedules work, their benefits, and what you need to know before diving in
- Added some basic how-to guides for automating assets and ops using schedules
- Added a reference of schedule-focused examples
- Added dedicated guides for common schedule uses, including creating partitioned schedules, customizing executing timezones, testing, and troubleshooting
Dagster Plus
- [experimental] The backfill daemon can now store logs and display them in the UI for increased visibility into the daemon’s behavior. Please contact Dagster Labs if you are interested in piloting this experimental feature.
- Added a
--read-only
flag to thedagster-cloud ci branch-deployment
CLI command, which returns the current branch deployment name for the current code repository branch without update the status of the branch deployment.
1.7.9 (core) / 0.23.9 (libraries)
New
- Dagster will now display a “storage kind” tag on assets in the UI, similar to the existing compute kind. To set storage kind for an asset, set its
dagster/storage_kind
tag. - You can now set retry policy on dbt assets, to enable coarse-grained retries with delay and jitter. For fine-grained partial retries, we still recommend invoking
dbt retry
within a try/except block to avoid unnecessary, duplicate work. AssetExecutionContext
now exposes ahas_partition_key_range
property.- The
owners
,metadata
,tags
, anddeps
properties onAssetSpec
are no longerOptional
. TheAssetSpec
constructor still acceptsNone
values, which are coerced to empty collections of the relevant type. - The
docker_executor
andk8s_job_executor
now consider at most 1000 events at a time when loading events from the current run to determine which steps should be launched. This value can be tuned by setting theDAGSTER_EXECUTOR_POP_EVENTS_LIMIT
environment variable in the run process. - Added a
dagster/retry_on_asset_or_op_failure
tag that can be added to jobs to override run retry behavior for runs of specific jobs. See the docs for more information. - Improved the sensor produced by
build_sensor_for_freshness_checks
to describe when/why it skips evaluating freshness checks. - A new “Runs” tab on the backfill details page allows you to see list and timeline views of the runs launched by the backfill.
- [dagster-dbt] dbt will now attach relation identifier metadata to asset materializations to indicate where the built model is materialized to.
- [dagster-graphql] The GraphQL Python client will now include the HTTP error code in the exception when a query fails. Thanks @yuvalgimmunai!
Bugfixes
- Fixed sensor logging behavior with the
@multi_asset_sensor
. ScheduleDefinition
now properly supports being passed aRunConfig
object.- When an asset function returns a
MaterializeResult
, but the function has no type annotation, previously, the IO manager would still be invoked with aNone
value. Now, the IO manager is not invoked. - The
AssetSpec
constructor now raises an error if an invalid owner string is passed to it. - When using the
graph_multi_asset
decorator, thecode_version
property onAssetOut
s passed in used to be ignored. Now, they no longer are. - [dagster-deltalake] Fixed GcsConfig import error and type error for partitioned assets (Thanks @thmswt)
- The asset graph and asset catalog now show the materialization status of External assets (when manually reported) rather than showing “Never observed”
Documentation
- The External Assets REST APIs now have their own reference page
- Added details, updated copy, and improved formatting to External Assets REST APIs
Dagster Plus
- The ability to set a custom base deployment when creating a branch deployment has been enabled for all organizations.
- When a code location fails to deploy, the Kubernetes agent now includes additional any warning messages from the underlying replicaset in the failure message to aid with troubleshooting.
- Serverless deployments now support using a requirements.txt with hashes.
- Fixed an issue where the
dagster-cloud job launch
command did not support specifying asset keys with prefixes in the--asset-key
argument. - [catalog UI] Catalog search now allows filtering by type, i.e.
group:
,code location:
,tag:
,owner:
. - New dagster+ accounts will now start with two default alert policies; one to alert if the default free credit budget for your plan is exceeded, and one to alert if a single run goes over 24 hours. These alerts will be sent as emails to the email with which the account was initially created.
1.7.8 (core) / 0.23.8 (libraries)
New
- Backfills created via GQL can have a custom title and description.
Definitions
now has aget_all_asset_specs
method, which allows iterating over properties of the defined assets- [ui] In filter dropdowns, it’s now possible to submit before all the suggestions have been loaded (thanks @bmalehorn!)
- [ui] Performance improvements when loading the Dagster UI for asset graphs with thousands of partition keys.
- [dagster-dbt] Dbt asset checks now emit execution duration and the number of failing rows as metadata
- [dagster-embedded-elt] Added support for partitioning in dlt assets (thanks @edsoncezar16!)
- [dagster-embedded-elt] Added ability to set custom metadata on dlt assets (thanks @edsoncezar16!)
- [dagster-graphql] Added a
terminate_runs
method to the Python GraphQL Client. (thanks @baumann-t!) - [dagster-polars] dagster-polars IO managers now emit dagster/row_count metadata (thanks @danielgafni!)
- [dagster-dbt]
DbtCliInvocation
now has a.get_error()
method that can be useful when usingdbt.cli(..., raise_on_error=False)
.
Bugfixes
- Fix a bug with legacy
DynamicPartitionsDefinition
(usingpartitions_fn
) that caused a crash during job backfills. - [ui] On the asset graph, filtering to one or more code locations via the Filter dropdown now works as expected.
- [ui] On the asset overview page, viewing an asset with no definition in a loaded code location no longer renders a clipped empty state.
Experimental
- The new
build_metadata_bounds_checks
API creates asset checks which verify that numeric metadata values on asset materializations fall within min or max values. See the documentation for more information.
Documentation
- Added details and links to the Schedules and Sensors API documentation
- Removed leftover mention of Dagster Cloud from the Dagster+ Hybrid architecture documentation
Dagster Plus
- Fixed an incompatibility between
build_sensor_for_freshness_checks
and Dagster Plus. This API should now work when used with Dagster Plus. - [ui] Billing / usage charts no longer appear black-on-black in Dagster’s dark mode.
- [ui] The asset catalog is now available for teams plans.
- [ui] Fixed a bug where the alert policy editor would misinterpret the threshold on a long-running job alert.
- [kubernetes] Added a
dagsterCloudAgent.additionalPodSpecConfig
to the Kubernetes agent Helm chart allowing arbitrary pod configuration to be applied to the agent pod. - [ECS] Fixed an issue where the ECS agent would sometimes raise a “Too many concurrent attempts to create a new revision of the specified family” exception when using agent replicas.
1.7.7 (core) / 0.23.7 (libraries)
New
- [ui] Command clicking on nodes in the asset lineage tab will now open them in a separate tab. Same with external asset links in the asset graph.
- Added support for setting a custom job namespace in user code deployments. (thanks @tmatthews0020!)
- Removed warnings due to use of
datetime.utcfromtimestamp
(thanks @dbrtly!) - Custom smtp user can now be used for e-mail alerts (thanks @edsoncezar16!)
- [dagster-dbt] Added support for
dbt-core==1.8.*
. - [dagster-embedded-elt] Failed dlt pipelines are now accurately reflected on the asset materialization (thanks @edsoncezar16!)
Bugfixes
- Fixed spurious errors in logs due to module shadowing.
- Fixed an issue in the Backfill Daemon where if the assets to be materialized had different
BackfillPolicy
s, each asset would get materialized in its own run, rather than grouping assets together into single run. - Fixed an issue that could cause the Asset Daemon to lose information in its cursor about an asset if that asset’s code location was temporarily unavailable.
- [dagster-dbt] Mitigated issues with cli length limits by only listing specific dbt tests as needed when the tests aren’t included via indirect selection, rather than listing all tests.
Documentation
- Markdoc tags can now be used in place of MDX components (thanks @nikomancy)
1.7.6 (core) / 0.23.6 (libraries)
New
- The backfill daemon now has additional logging to document the progression through each tick and why assets are and are not materialized during each evaluation of a backfill.
- Made performance improvements in both calculating and storing data version for assets, especially for assets with a large fan-in.
- Standardized table row count metadata output by various integrations to
dagster/row_count
. - [dagster-aws][community-contribution] Additional parameters can now be passed to the following resources:
CloudwatchLogsHandler
,ECRPublicClient
,SecretsManagerResource
,SSMResource
thanks@jacob-white-simplisafe
! - Added additional frontend telemetry. See https://docs.dagster.io/about/telemetry for more information.
Bugfixes
- Fixed issue that could cause runs to fail if they targeted any assets which had a metadata value of type
TableMetadataValue
,TableSchemaMetadataValue
, orTableColumnLineageMetadataValue
defined. - Fixed an issue which could cause evaluations produced via the Auto-materialize system to not render the “skip”-type rules.
- Backfills of asset jobs now correctly use the
BackfillPolicy
of the underlying assets in the job. - [dagster-databricks][community-contribution]
databricks-sdk
version bumped to0.17.0
, thanks@lamalex
! - [helm][community-contribution] resolved incorrect comments about
dagster code-server start
, thanks@SanjaySiddharth
!
Documentation
- Added section headings to Pipes API references, along with explanatory copy and links to relevant pages
- Added a guide for subletting asset checks
- Add more detailed steps to transition from serverless to hybrid
- [community-contribution] asset selection syntax corrected, thanks
@JonathanLai2004
!
Dagster Plus
- Fixed an issue where Dagster Cloud agents would wait longer than necessary when multiple code locations were timing out during a deployment.
1.7.5 (core) / 0.23.5 (libraries)
New
- The Asset > Checks tab now allows you to view plots of numeric metadata emitted by your checks.
- The Asset > Events tab now supports infinite-scrolling, making it possible to view all historical materialization and observation events.
- When constructing a
MaterializeResult
,ObserveResult
, orOutput
, you can now include tags that will be attached to the correspondingAssetMaterialization
orAssetObservation
event. These tags will be rendered on these events in the UI.
Bugfixes
- Fixed an issue where backfills would sometimes fail if a partition definition was changed in the middle of the backfill.
- Fixed an issue where if the code server became unavailable during the first tick of a backfill, the backfill would stall and be unable to submit runs once the code server became available.
- Fixed an issue where the status of an external asset would not get updated correctly.
- Fixed an issue where run status sensors would sometimes fall behind in deployments with large numbers of runs.
- The descriptions and metadata on the experimental
build_last_update_freshness_checks
andbuild_time_partition_freshness_checks
APIs have been updated to be clearer. - The headers of tables no longer become misaligned when a scrollbar is present in some scenarios.
- The sensor type, instigation type, and backfill status filters on their respective pages are now saved to the URL, so sharing the view or reloading the page preserve your filters.
- Typing a
%
into the asset graph’s query selector no longer crashes the UI. - “Materializing” states on the asset graph animate properly in both light and dark themes.
- Thanks to @lautaro79 for fixing a helm chart issue.
Breaking Changes
- Subclasses of
MetadataValue
have been changed fromNamedTuple
s to Pydantic models.NamedTuple
functionality on these classes was not part of Dagster’s stable public API, but usages relying on their tuple-ness may break. For example: callingjson.dumps
on collections that include them.
Deprecations
- [dagster-dbt] Support for
dbt-core==1.5.*
has been removed, as it has reached end of life in April 2024.
Dagster Plus
- Fixed an issue in the
dagster-cloud
CLI where the--deployment
argument was ignored when theDAGSTER_CLOUD_URL
environment variable was set. - Fixed an issue where
dagster-cloud-cli
package wouldn’t work unless thedagster-cloud
package was installed as well. - A new “budget alerts” feature has launched for users on self-serve plans. This feature will alert you when you hit your credit limit.
- The experimental asset health overview now allows you to group assets by compute kind, tag, and tag value.
- The concurrency and locations pages in settings correctly show Dagster Plus-specific options when experimental navigation is enabled.
1.7.4 (core) / 0.23.4 (libraries)
New
TimeWindowPartitionMapping
now supports thestart_offset
andend_offset
parameters even when the upstreamPartitionsDefinition
is different than the downstreamPartitionsDefinition
. The offset is expressed in units of downstream partitions, soTimeWindowPartitionMapping(start_offset=-1)
between an hourly upstream and a daily downstream would map each downstream partition to 48 upstream partitions – those for the same and preceding day.
Bugfixes
- Fixed an issue where certain exceptions in the Dagster daemon would immediately retry instead of waiting for a fixed interval before retrying.
- Fixed a bug with asset checks in complex asset graphs that include cycles in the underlying nodes.
- Fixed an issue that would cause unnecessary failures on FIPS-enabled systems due to the use of md5 hashes in non-security-related contexts (thanks @jlloyd-widen!)
- Removed
path
metadata fromUPathIOManager
inputs. This eliminates the creation ofASSET_OBSERVATION
events for every input on every step for the default I/O manager. - Added support for defining
owners
on@graph_asset
. - Fixed an issue where having multiple partitions definitions in a location with the same start date but differing end dates could lead to “
DagsterInvalidSubsetError
when trying to launch runs.
Documentation
- Fixed a few issues with broken pages as a result of the Dagster+ rename.
- Renamed a few instances of Dagster Cloud to Dagster+.
- Added a note about external asset + alert incompatibility to the Dagster+ alerting docs.
- Fixed references to outdated apis in freshness checks docs.
Dagster Plus
- When creating a Branch Deployment via GraphQL or the
dagster-cloud branch-deployment
CLI, you can now specify the base deployment. The base deployment will be used for comparing assets for Change Tracking. For example, to set the base deployment to a deployment namedstaging
:dagster-cloud branch-deployment create-or-update --base-deployment-name staging ...
. Note that once a Branch Deployment is created, the base deployment cannot be changed. - Fixed an issue where agents serving many branch deployments simultaneously would sometimes raise a
413: Request Entity Too Large
error when uploading a heartbeat to the Dagster Plus servers.
1.7.3 (core) / 0.23.3 (libraries)
New
@graph_asset
now accepts atags
argument- [ui] For users whose light/dark mode theme setting is set to match their system setting, the theme will update automatically when the system changes modes (e.g. based on time of day), with no page reload required.
- [ui] We have introduced the typefaces Geist and Geist Mono as our new default fonts throughout the Dagster app, with the goal of improving legibility, consistency, and maintainability.
- [ui] [experimental] We have begun experimenting with a new navigation structure for the Dagster UI. The change can be enabled via User Settings.
- [ui] [experimental] Made performance improvements to the Concurrency settings page.
- [dagster-azure] [community-contribution] ADLS2 IOManager supports custom timeout. Thanks @tomas-gajarsky!
- [dagster-fivetran] [community-contribution] It’s now possible to specify destination ids in
load_asset_defs_from_fivetran_instance
. Thanks @lamalex!
Bugfixes
- Fixed an issue where pressing the “Reset sensor status” button in the UI would also reset the sensor’s cursor.
- Fixed a bug that caused input loading time not to be included in the reported step duration.
- Pydantic warnings are no longer raised when importing Dagster with Pydantic 2.0+.
- Fixed an issue which would cause incorrect behavior when auto-materializing partitioned assets based on updates to a parent asset in a different code location.
- Fixed an issue which would cause every tick of the auto-materialize sensor to produce an evaluation for each asset, even if nothing had changed from the previous tick.
- [dagster-dbt] Fixed a bug that could raise
Duplicate check specs
errors with singular tests ingested as asset checks. - [embedded-elt] resolved an issue where subset of resources were not recognized when using
source.with_resources(...)
- [ui] Fixed an issue where a sensor that targeted an invalid set of asset keys could cause the asset catalog to fail to load.
- [ui] Fixed an issue in which runs in the Timeline that should have been considered overlapping were not correctly grouped together, leading to visual bugs.
- [ui] On the asset overview page, job tags no longer render poorly when an asset appears in several jobs.
- [ui] On the asset overview page, hovering over the timestamp tags in the metadata table explains where each entry originated.
- [ui] Right clicking the background of the asset graph now consistently shows a context menu, and the lineage view supports vertical as well as horizontal layout.
Documentation
- Sidebar navigation now appropriately handles command-click and middle-click to open links in a new tab.
- Added a section for asset checks to the Testing guide.
- Added a guide about Column-level lineage for assets.
- Lots of updates to examples to reflect the new opt-in approach to I/O managers.
Dagster+
- [ui] [experimental] A new Overview > Asset Health page provides visibility into failed and missing materializations, check warnings and check errors.
- [ui] You can now share feedback with the Dagster team directly from the app. Open the Help menu in the top nav, then “Share feedback”. Bugs and feature requests are submitted directly to the Dagster team.
- [ui] When editing a team, the list of team members is now virtualized, allowing for the UI to scale better for very large team sizes.
- [ui] Fixed dark mode for billing components.
1.7.2 (core) / 0.23.2 (libraries)
New
- Performance improvements when loading large asset graphs in the Dagster UI.
@asset_check
functions can now be invoked directly for unit testing.dagster-embedded-elt
dlt resourceDagsterDltResource
can now be used from@op
definitions in addition to assets.UPathIOManager.load_partitions
has been added to assist with helpingUpathIOManager
subclasses deal with serialization formats which support partitioning. Thanks@danielgafni
!- [dagster-polars] now supports other data types rather than only string for the partitioning columns. Also
PolarsDeltaIOManager
now supportsMultiPartitionsDefinition
withDeltaLake
native partitioning. Metadata value"partition_by": {"dim_1": "col_1", "dim_2": "col_2"}
should be specified to enable this feature. Thanks@danielgafni
!
Bugfixes
- [dagster-airbyte] Auto materialization policies passed to
load_assets_from_airbyte_instance
andload_assets_from_airbyte_project
will now be properly propagated to the created assets. - Fixed an issue where deleting a run that was intended to materialize a partitioned asset would sometimes leave the status of that asset as “Materializing” in the Dagster UI.
- Fixed an issue with
build_time_partition_freshness_checks
where it would incorrectly intuit that an asset was not fresh in certain cases. - [dagster-k8s] Fix an error on transient ‘none’ responses for pod waiting reasons. Thanks @piotrmarczydlo!
- [dagster-dbt] Failing to build column schema metadata will now result in a warning rather than an error.
- Fixed an issue where incorrect asset keys would cause a backfill to fail loudly.
- Fixed an issue where syncing unmaterialized assets could include source assets.
Breaking Changes
- [dagster-polars]
PolarsDeltaIOManager
no longer supports loading natively partitioned DeltaLake tables as dictionaries. They should be loaded as a singlepl.DataFrame
/pl.LazyFrame
instead.
Documentation
- Renamed
Dagster Cloud
toDagster+
all over the docs. - Added a page about Change Tracking in Dagster+ branch deployments.
- Added a section about user-defined metrics to the Dagster+ Insights docs.
- Added a section about Asset owners to the asset metadata docs.
Dagster Cloud
- Branch deployments now have Change Tracking. Assets in each branch deployment will be compared to the main deployment. New assets and changes to code version, dependencies, partitions definitions, tags, and metadata will be marked in the UI of the branch deployment.
- Pagerduty alerting is now supported with Pro plans. See the documentation for more info.
- Asset metadata is now included in the insights metrics for jobs materializing those assets.
- Per-run Insights are now available on individual assets.
- Previously, the
before_storage_id
/after_storage_id
values in theAssetRecordsFilter
class were ignored. This has been fixed. - Updated the output of
dagster-cloud deployment alert-policies list
to match the format ofsync
. - Fixed an issue where Dagster Cloud agents with many code locations would sometimes leave code servers running after the agent shut down.
1.7.1 (core) / 0.23.1 (libraries)
New
- [dagster-dbt][experimental] A new cli command
dagster-dbt project prepare-for-deployment
has been added in conjunction withDbtProject
for managing the behavior of rebuilding the manifest during development and preparing a pre-built one for production.
Bugfixes
- Fixed an issue with duplicate asset check keys when loading checks from a package.
- A bug with the new
build_last_update_freshness_checks
andbuild_time_partition_freshness_checks
has been fixed where multi_asset checks passed in would not be executable. - [dagster-dbt] Fixed some issues with building column lineage for incremental models, models with implicit column aliases, and models with columns that have multiple dependencies on the same upstream column.
Breaking Changes
- [dagster-dbt] The experimental
DbtArtifacts
class has been replaced byDbtProject
.
Documentation
- Added a dedicated concept page for all things metadata and tags
- Moved asset metadata content to a dedicated concept page: Asset metadata
- Added section headings to the Software-defined Assets API reference, which groups APIs by asset type or use
- Added a guide about user settings in the Dagster UI
- Added
AssetObservation
to the Software-defined Assets API reference - Renamed Dagster Cloud GitHub workflow files to the new, consolidated
dagster-cloud-deploy.yml
- Miscellaneous formatting and copy updates
- [community-contribution] [dagster-embedded-elt] Fixed
get_asset_key
API documentation (thanks @aksestok!) - [community-contribution] Updated Python version in contributing documentation (thanks @piotrmarczydlo!)
- [community-contribution] Typo fix in README (thanks @MiConnell!)
Dagster Cloud
- Fixed a bug where an incorrect value was being emitted for BigQuery bytes billed in Insights.
1.7.0 (core) / 0.23.0 (libraries)
Major Changes since 1.6.0 (core) / 0.22.0 (libraries)
- Asset definitions can now have tags, via the
tags
argument on@asset
,AssetSpec
, andAssetOut
. Tags are meant to be used for organizing, filtering, and searching for assets. - The Asset Details page has been revamped to include an “Overview” tab that centralizes the most important information about the asset – such as current status, description, and columns – in a single place.
- Assets can now be assigned owners.
- Asset checks are now considered generally available and will no longer raise experimental warnings when used.
- Asset checks can now be marked
blocking
, which causes downstream assets in the same run to be skipped if the check fails with ERROR-level severity. - The new
@multi_asset_check
decorator enables defining a single op that executes multiple asset checks. - The new
build_last_updated_freshness_checks
andbuild_time_partition_freshness_checks
APIs allow defining asset checks that error or warn when an asset is overdue for an update. Refer to the Freshness checks guide for more info. - The new
build_column_schema_change_checks
API allows defining asset checks that warn when an asset’s columns have changed since its latest materialization. - In the asset graph UI, the “Upstream data”, “Code version changed”, and “Upstream code version” statuses have been collapsed into a single “Unsynced” status. Clicking on “Unsynced” displays more detailed information.
- I/O managers are now optional. This enhances flexibility for scenarios where they are not necessary. For guidance, see When to use I/O managers.
- Assets with
None
orMaterializeResult
return type annotations won't use I/O managers; dependencies for these assets can be set using thedeps
parameter in the@asset
decorator.
- Assets with
- [dagster-dbt] Dagster’s dbt integration can now be configured to automatically collect metadata about column schema and column lineage.
- [dagster-dbt] dbt tests are now pulled in as Dagster asset checks by default.
- [dagster-dbt] dbt resource tags are now automatically pulled in as Dagster asset tags.
- [dagster-snowflake] [dagster-gcp] The dagster-snowflake and dagster-gcp packages now both expose a
fetch_last_updated_timestamps
API, which makes it straightforward to collect data freshness information in source asset observation functions.
Changes since 1.6.14 (core) / 0.22.14 (libraries)
New
- Metadata attached during asset or op execution can now be accessed in the I/O manager using
OutputContext.output_metadata
. - [experimental] Single-run backfills now support batched inserts of asset materialization events. This is a major performance improvement for large single-run backfills that have database writes as a bottleneck. The feature is off by default and can be enabled by setting the
DAGSTER_EVENT_BATCH_SIZE
environment variable in a code server to an integer (25 recommended, 50 max). It is only currently supported in Dagster Cloud and OSS deployments with a postgres backend. - [ui] The new Asset Details page is now enabled for new users by default. To turn this feature off, you can toggle the feature in the User Settings.
- [ui] Queued runs now display a link to view all the potential reasons why a run might remain queued.
- [ui] Starting a run status sensor with a stale cursor will now warn you in the UI that it will resume from the point that it was paused.
- [asset-checks] Asset checks now support asset names that include
.
, which can occur when checks are ingested from dbt tests. - [dagster-dbt] The env var
DBT_INDIRECT_SELECTION
will no longer be set toempty
when executing dbt tests as asset checks, unless specific asset checks are excluded.dagster-dbt
will no longer explicitly select all dbt tests with the dbt cli, which had caused argument length issues. - [dagster-dbt] Singular tests with a single dependency are now ingested as asset checks.
- [dagster-dbt] Singular tests with multiple dependencies must have the primary dependency must be specified using dbt meta.
{{
config(
meta={
'dagster': {
'ref': {
'name': <ref_name>,
'package': ... # Optional, if included in the ref.
'version': ... # Optional, if included in the ref.
},
}
}
)
}}
...
- [dagster-dbt] Column lineage metadata can now be emitted when invoking dbt. See the documentation for details.
- [experimental][dagster-embedded-elt] Add the data load tool (dlt) integration for easily building and integration dlt ingestion pipelines with Dagster.
- [dagster-dbt][community-contribution] You can now specify a custom schedule name for schedules created with
build_schedule_from_dbt_selection
. Thanks @dragos-pop! - [helm][community-contribution] You can now specify a custom job namespace for your user code deployments. Thanks @tmatthews0020!
- [dagster-polars][community-contribution] Column schema metadata is now integrated using the dagster-specific metadata key in
dagster_polars
. Thanks @danielgafni! - [dagster-datadog][community-contribution] Added
datadog.api
module to theDatadogClient
resource, enabling direct access to API methods. Thanks @shivgupta!
Bugfixes
- Fixed a bug where run status sensors configured to monitor a specific job would trigger for jobs with the same name in other code locations.
- Fixed a bug where multi-line asset check result descriptions were collapsed into a single line.
- Fixed a bug that caused a value to show up under “Target materialization” in the asset check UI even when an asset had had observations but never been materialized.
- Changed typehint of
metadata
argument onmulti_asset
andAssetSpec
toMapping[str, Any]
. - [dagster-snowflake-pandas] Fixed a bug introduced in 0.22.4 where column names were not using quote identifiers correctly. Column names will now be quoted.
- [dagster-aws] Fixed an issue where a race condition where simultaneously materializing the same asset more than once would sometimes raise an Exception when using the
s3_io_manager
. - [ui] Fixed a bug where resizable panels could inadvertently be hidden and never recovered, for instance the right panel on the global asset graph.
- [ui] Fixed a bug where opening a run with an op selection in the Launchpad could lose the op selection setting for the subsequently launched run. The op selection is now correctly preserved.
- [community-contribution] Fixed
dagster-polars
tests by excludingDecimal
types. Thanks @ion-elgreco! - [community-contribution] Fixed a bug where auto-materialize rule evaluation would error on FIPS-compliant machines. Thanks @jlloyd-widen!
- [community-contribution] Fixed an issue where an excessive DeprecationWarning was being issued for a
ScheduleDefinition
passed into theDefinitions
object. Thanks @2Ryan09!
Breaking Changes
- Creating a run with a custom non-UUID
run_id
was previously private and only used for testing. It will now raise an exception. - [community-contribution] Previously, calling
get_partition_keys_in_range
on aMultiPartitionsDefinition
would erroneously return partition keys that were within the one-dimensional range of alphabetically-sorted partition keys for the definition. Now, this method returns the cartesian product of partition keys within each dimension ’s range. Thanks, @mst! - Added
AssetCheckExecutionContext
to replaceAssetExecutionContext
as the type of thecontext
param passed in to@asset_check
functions.@asset_check
was an experimental decorator. - [experimental]
@classmethod
decorators have been removed from dagster-embedded-slt.slingDagsterSlingTranslator
- [dagster-dbt]
@classmethod
decorators have been removed fromDagsterDbtTranslator
. - [dagster-k8s] The default merge behavior when raw kubernetes config is supplied at multiple scopes (for example, at the instance level and for a particluar job) has been changed to be more consistent. Previously, configuration was merged shallowly by default, with fields replacing other fields instead of appending or merging. Now, it is merged deeply by default, with lists appended to each other and dictionaries merged, in order to be more consistent with how kubernetes configuration is combined in all other places. See the docs for more information, including how to restore the previous default merge behavior.
Deprecations
AssetSelection.keys()
has been deprecated. Instead, you can now supply asset key arguments toAssetSelection.assets()
.- Run tag keys with long lengths and certain characters are now deprecated. For consistency with asset tags, run tags keys are expected to only contain alpha-numeric characters, dashes, underscores, and periods. Run tag keys can also contain a prefix section, separated with a slash. The main section and prefix section of a run tag are limited to 63 characters.
AssetExecutionContext
has been simplified. Op-related methods and methods with existing access paths have been marked deprecated. For a full list of deprecated methods see this GitHub Discussion.- The
metadata
property onInputContext
andOutputContext
has been deprecated and renamed todefinition_metadata
. FreshnessPolicy
is now deprecated. For monitoring freshness, use freshness checks instead. If you are usingAutoMaterializePolicy.lazy()
,FreshnessPolicy
is still recommended, and will continue to be supported until an alternative is provided.
Documentation
- Lots of updates to examples to reflect the recent opt-in nature of I/O managers
- Dagster Cloud alert guides have been split up by alert type:
- Added info about asset check-based-alerts to the Dagster Cloud alerting docs
- The Asset checks documentation got a face lift - info about defining and executing asset checks now lives in its own guide
- Added a new guide for using freshness checks to the Asset checks documentation
- Cleaned up the Getting help guide - it now includes a high-level summary of all Dagster support resources, making it easier to skim!
- [community-contribution] Fixed the indentation level of a code snippet in the
dagster-polars
documentation. Thanks @danielgafni!
Dagster Cloud
- The Dagster Cloud agent will now monitor the code servers that it spins to detect whether they have stopped serving requests, and will automatically redeploy the code server if it has stopped responding for an extended period of time.
- New additions and bugfixes in Insights:
- Added per-metric cost estimation. Estimates can be added via the “Insights settings” button, and will appear in the table and chart for that metric.
- Branch deployments are now included in the deployment filter control.
- In the Deployments view, fixed deployment links in the data table.
- Added support for BigQuery cost metrics.
1.6.14 (core) / 0.22.14 (libraries)
Bugfixes
- [dagster-dbt] Fixed some issues with building column lineage metadata.
1.6.13 (core) / 0.22.13 (libraries)
Bugfixes
- Fixed a bug where an asset with a dependency on a subset of the keys of a parent multi-asset could sometimes crash asset job construction.
- Fixed a bug where a Definitions object containing assets having integrated asset checks and multiple partitions definitions could not be loaded.
1.6.12 (core) / 0.22.12 (libraries)
New
AssetCheckResult
now has a textdescription
property. Check evaluation descriptions are shown in the Checks tab on the asset details page.- Introduced
TimestampMetadataValue
. Timestamp metadata values are represented internally as seconds since the Unix epoch. They can be constructed usingMetadataValue.timestamp
. In the UI, they’re rendered in the local timezone, like other timestamps in the UI. AssetSelection.checks
can now acceptAssetCheckKeys
as well asAssetChecksDefinition
.- [community-contribution] Metadata attached to an output at runtime (via either
add_output_metadata
or by passing toOutput
) is now available onHookContext
under theop_output_metadata
property. Thanks @JYoussouf! - [experimental]
@asset
,AssetSpec
, andAssetOut
now accept atags
property. Tags are key-value pairs meant to be used for organizing asset definitions. If"__dagster_no_value"
is set as the value, only the key will be rendered in the UI.AssetSelection.tag
allows selecting assets that have a particular tag. - [experimental] Asset tags can be used in asset CLI selections, e.g.
dagster asset materialize --select tag:department=marketing
- [experimental][dagster-dbt] Tags can now be configured on dbt assets, using
DagsterDbtTranslator.get_tags
. By default, we take the dbt tags configured on your dbt models, seeds, and snapshots. - [dagster-gcp] Added get_gcs_keys sensor helper function.
Bugfixes
- Fixed a bug that prevented external assets with dependencies from displaying properly in Dagster UI.
- Fix a performance regression in loading code locations with large multi-assets.
- [community-contribution] [dagster-databricks] Fix a bug with the
DatabricksJobRunner
that led to an inability to use dagster-databricks with Databricks instance pools. Thanks @smats0n! - [community-contribution] Fixed a bug that caused a crash when external assets had hyphens in their
AssetKey
. Thanks @maxfirman! - [community-contribution] Fix a bug with
load_assets_from_package_module
that would cause a crash when any submodule had the same directory name as a dependency. Thanks @CSRessel! - [community-contribution] Fixed a mypy type error, thanks @parthshyara!
- [community-contribution][dagster-embedded-elt] Fixed an issue where Sling assets would not properly read group and description metadata from replication config, thanks @jvyoralek!
- [community-contribution] Ensured annotations from the helm chart properly propagate to k8s run pods, thanks @maxfirman!
Dagster Cloud
- Fixed an issue in Dagster Cloud Serverless runs where multiple runs simultaneously materializing the same asset would sometimes raise a “Key not found” exception.
- Fixed an issue when using agent replicas where one replica would sporadically remove a code server created by another replica due to a race condition, leading to a “code server not found” or “Deployment not found” exception.
- [experimental] The metadata key for specifying column schema that will be rendered prominently on the new Overview tab of the asset details page has been changed from
"columns"
to"dagster/column_schema"
. Materializations using the old metadata key will no longer result in the Columns section of the tab being filled out. - [ui] Fixed an Insights bug where loading a view filtered to a specific code location would not preserve that filter on pageload.
1.6.11 (core) / 0.22.11 (libraries)
Bugfixes
- Fixed an issue where
dagster dev
or the Dagster UI would display an error when loading jobs created with op or asset selections.
1.6.10 (core) / 0.22.10 (libraries)
New
- Latency improvements to the scheduler when running many simultaneous schedules.
Bugfixes
- The performance of loading the Definitions snapshot from a code server when large
@multi_asset
s are in use has been drastically improved. - The snowflake quickstart example project now renames the “by” column to avoid reserved snowflake names. Thanks @jcampbell!
- The existing group name (if any) for an asset is now retained if
the_asset.with_attributes
is called without providing a group name. Previously, the existing group name was erroneously dropped. Thanks @ion-elgreco! - [dagster-dbt] Fixed an issue where Dagster events could not be streamed from
dbt source freshness
. - [dagster university] Removed redundant use of
MetadataValue
in Essentials course. Thanks @stianthaulow! - [ui] Increased the max number of plots on the asset plots page to 100.
Breaking Changes
- The
tag_keys
argument onDagsterInstance.get_run_tags
is no longer optional. This has been done to remove an easy way of accidentally executing an extremely expensive database operation.
Dagster Cloud
- The maximum number of concurrent runs across all branch deployments is now configurable. This setting can now be set via GraphQL or the CLI.
- [ui] In Insights, fixed display of table rows with zero change in value from the previous time period.
- [ui] Added deployment-level Insights.
- [ui] Fixed an issue causing void invoices to show up as “overdue” on the billing page.
- [experimental] Branch deployments can now indicate the new and modified assets in the branch deployment as compared to the main deployment. To enable this feature, turn on the “Enable experimental branch deployment asset graph diffing” user setting.
1.6.9 (core) / 0.22.9 (libraries)
New
- [ui] When viewing logs for a run, the date for a single log row is now shown in the tooltip on the timestamp. This helps when viewing a run that takes place over more than one date.
- Added suggestions to the error message when selecting asset keys that do not exist as an upstream asset or in an
AssetSelection.
- Improved error messages when trying to materialize a subset of a multi-asset which cannot be subset.
- [dagster-snowflake]
dagster-snowflake
now requiressnowflake-connector-python>=3.4.0
- [embedded-elt]
@sling_assets
accepts an optional name parameter for the underlying op - [dagster-openai]
dagster-openai
library is now available. - [dagster-dbt] Added a new setting on
DagsterDbtTranslatorSettings
calledenable_duplicate_source_asset_keys
that allows users to set duplicate asset keys for their dbt sources. Thanks @hello-world-bfree! - Log messages in the Dagster daemon for unloadable sensors and schedules have been removed.
- [ui] Search now uses a cache that persists across pageloads which should greatly improve search performance for very large orgs.
- [ui] groups/code locations in the asset graph’s sidebar are now sorted alphabetically.
Bugfixes
- Fixed issue where the input/output schemas of configurable IOManagers could be ignored when providing explicit input / output run config.
- Fixed an issue where enum values could not properly have a default value set in a
ConfigurableResource
. - Fixed an issue where graph-backed assets would sometimes lose user-provided descriptions due to a bug in internal copying.
- [auto-materialize] Fixed an issue introduced in 1.6.7 where updates to ExternalAssets would be ignored when using AutoMaterializePolicies which depended on parent updates.
- [asset checks] Fixed a bug with asset checks in step launchers.
- [embedded-elt] Fix a bug when creating a
SlingConnectionResource
where a blank keyword argument would be emitted as an environment variable - [dagster-dbt] Fixed a bug where emitting events from
dbt source freshness
would cause an error. - [ui] Fixed a bug where using the “Terminate all runs” button with filters selected would not apply the filters to the action.
- [ui] Fixed an issue where typing a search query into the search box before the search data was fetched would yield “No results” even after the data was fetched.
Community Contributions
- [docs] fixed typo in embedded-elt.mdx (thanks @cameronmartin)!
- [dagster-databricks] log the url for the run of a databricks job (thanks @smats0n)!
- Fix missing partition property (thanks christeefy)!
- Add op_tags to @observable_source_asset decorator (thanks @maxfirman)!
- [docs] typo in MultiPartitionMapping docs (thanks @dschafer)
- Allow github actions to checkout branch from forked repo for docs changes (ci fix) (thanks hainenber)!
Experimental
- [asset checks] UI performance of asset checks related pages has been improved.
- [dagster-dbt] The class
DbtArtifacts
has been added for managing the behavior of rebuilding the manifest during development but expecting a pre-built one in production.
Documentation
- Added example of writing compute logs to AWS S3 when customizing agent configuration.
- "Hello, Dagster" is now "Dagster Quickstart" with the option to use a Github Codespace to explore Dagster.
- Improved guides and reference to better running multiple isolated agents with separate queues on ECS.
Dagster Cloud
- Microsoft Teams is now supported for alerts. Documentation
- A
send sample alert
button now exists on both the alert policies page and in the alert policies editor to make it easier to debug and configure alerts without having to wait for an event to kick them off.
1.6.8 (core) / 0.22.8 (libraries)
Bugfixes
- [dagster-embedded-elt] Fixed a bug in the
SlingConnectionResource
that raised an error when connecting to a database.
Experimental
- [asset checks]
graph_multi_assets
withcheck_specs
now support subsetting.
1.6.7 (core) / 0.22.7 (libraries)
New
- Added a new
run_retries.retry_on_op_or_asset_failures
setting that can be set to false to make run retries only occur when there is an unexpected failure that crashes the run, allowing run-level retries to co-exist more naturally with op or asset retries. See the docs for more information. dagster dev
now sets the environment variableDAGSTER_IS_DEV_CLI
allowing subprocesses to know that they were launched in a development context.- [ui] The Asset Checks page has been updated to show more information on the page itself rather than in a dialog.
Bugfixes
- [ui] Fixed an issue where the UI disallowed creating a dynamic partition if its name contained the “|” pipe character.
- AssetSpec previously dropped the metadata and code_version fields, resulting in them not being attached to the corresponding asset. This has been fixed.
Experimental
- The new
@multi_observable_source_asset
decorator enables defining a set of assets that can be observed together with the same function. - [dagster-embedded-elt] New Asset Decorator
@sling_assets
and ResourceSlingConnectionResource
have been added for the[dagster-embedded-elt.sling](http://dagster-embedded-elt.sling)
package. Deprecatedbuild_sling_asset
,SlingSourceConnection
andSlingTargetConnection
. - Added support for op-concurrency aware run dequeuing for the
QueuedRunCoordinator
.
Documentation
- Fixed reference documentation for isolated agents in ECS.
- Corrected an example in the Airbyte Cloud documentation.
- Added API links to OSS Helm deployment guide.
- Fixed in-line pragmas showing up in the documentation.
Dagster Cloud
- Alerts now support Microsoft Teams.
- [ECS] Fixed an issue where code locations could be left undeleted.
- [ECS] ECS agents now support setting multiple replicas per code server.
- [Insights] You can now toggle the visibility of a row in the chart by clicking on the dot for the row in the table.
- [Users] Added a new column “Licensed role” that shows the user's most permissive role.
1.6.6 (core) / 0.22.6 (libraries)
New
- Dagster officially supports Python 3.12.
dagster-polars
has been added as an integration. Thanks @danielgafni!- [dagster-dbt]
@dbt_assets
now supports loading projects with semantic models. - [dagster-dbt]
@dbt_assets
now supports loading projects with model versions. - [dagster-dbt]
get_asset_key_for_model
now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok! - [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
- [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.
Bugfixes
- Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
- Fixed an issue with the type annotations on the
@asset
decorator causing a false positive in Pyright strict mode. Thanks @tylershunt! - [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
- [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
- [dagster-k8s] Fixed an issue where setting the
security_context
field on thek8s_job_executor
didn't correctly set the security context on the launched step pods. Thanks @krgn!
Experimental
- Observable source assets can now yield
ObserveResult
s with nodata_version
. - You can now include
FreshnessPolicy
s on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy. - [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.
Documentation
- Updated docs to reflect newly-added support for Python 3.12.
Dagster Cloud
- [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling kubernetes services if the agent was interrupted during the middle of being terminated.
1.6.5 (core) / 0.22.5 (libraries)
New
- Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
- [dagster-k8s] Include k8s pod debug info in run worker failure messages.
- [dagster-dbt] Events emitted by
DbtCliResource
now include metadata from the dbt adapter response. This includes fields likerows_affected
,query_id
from the Snowflake adapter, orbytes_processed
from the BigQuery adapter.
Bugfixes
- A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
- [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the
k8s_job_executor
. - [instigator-tick-logs] Fixed an issue where invoking
context.log.exception
in a sensor or schedule did not properly capture exception information. - [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
- [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.
Experimental
@observable_source_asset
-decorated functions can now return anObserveResult
. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.- [auto-materialize] A new
AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron
class allows you to constructAutoMaterializePolicys
which wait for all parents to be updated after the latest tick of a given cron schedule. - [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.
Documentation
- Fixed an error in our asset checks docs. Thanks @vaharoni!
- Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
- Fixed an issue on the Hello Dagster! guide that prevented it from loading.
- Add specific capabilities of the Airflow integration to the Airflow integration page.
- Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.
1.6.4 (core) / 0.22.4 (libraries)
New
build_schedule_from_partitioned_job
now supports creating a schedule from a static-partitioned job (Thanks@craustin
!)- [dagster-pipes]
PipesK8sClient
will now autodetect the namespace when using in-cluster config. (Thanks@aignas
!) - [dagster-pipes]
PipesK8sClient
can now inject the context in to multiple containers. (Thanks@aignas
!) - [dagster-snowflake] The Snowflake Pandas I/O manager now uses the
write_pandas
method to load Pandas DataFrames in Snowflake. To support this change, the database connector was switched fromSqlDbConnection
toSnowflakeConnection
. - [ui] On the overview sensors page you can now filter sensors by type.
- [dagster-deltalake-polars] Added LazyFrame support (Thanks
@ion-elgreco
!) - [dagster-dbt] When using
@dbt_assets
and multiple dbt resources produce the sameAssetKey
, we now display an exception message that highlights the file paths of the misconfigured dbt resources in your dbt project. - [dagster-k8s] The debug info reported upon failure has been improved to include additional information from the Job. (Thanks
@jblawatt
!) - [dagster-k8s] Changed the Dagster Helm chart to apply
automountServiceAccountToken: false
to the default service account used by the Helm chart, in order to better comply with security policies. (Thanks@MattyKuzyk
!)
Bugfixes
- A unnecessary thread lock has been removed from the sensor daemon. This should improve sensor throughput for users with many sensors who have enabled threading.
- Retry from failure behavior has been improved for cases where dynamic steps were interrupted.
- Previously, when backfilling a set of assets which shared a BackfillPolicy and PartitionsDefinition, but had a non-default partition mapping between them, a run for the downstream asset could be launched at the same time as a separate run for the upstream asset, resulting in inconsistent partition ordering. Now, the downstream asset will only execute after the parents complete. (Thanks
@ruizh22
!) - Previously, asset backfills would raise an exception if the code server became unreachable mid-iteration. Now, the backfill will pause until the next evaluation.
- Fixed a bug that was causing ranged backfills over dynamically partitioned assets to fail.
- [dagster-pipes]
PipesK8sClient
has improved handling for init containers and additional containers. (Thanks@aignas
!) - Fixed the
last_sensor_start_time
property of theSensorEvaluationContext
, which would get cleared on ticks after the first tick after the sensor starts. - [dagster-mysql] Fixed the optional
dagster instance migrate --bigint-migration
, which caused some operational errors on mysql storages. - [dagster-dbt] Fixed a bug introduced in 1.6.3 that caused errors when ingesting asset checks with multiple dependencies.
Deprecations
- The following methods on
AssetExecutionContext
have been marked deprecated, with their suggested replacements in parenthesis:context.op_config
(context.op_execution_context.op_config
)context.node_handle
(context.op_execution_context.node_handle
)context.op_handle
(context.op_execution_context.op_handle
)context.op
(context.op_execution_context.op
)context.get_mapping_key
(context.op_execution_context.get_mapping_key
)context.selected_output_names
(context.op_execution_context.selected_output_names
)context.dagster_run
(context.run
)context.run_id
(context.run.run_id
)context.run_config
(context.run.run_config
)context.run_tags
(context.run.tags
)context.has_tag
(key in context.run.tags
)context.get_tag
(context.run.tags.get(key)
)context.get_op_execution_context
(context.op_execution_context
)context.asset_partition_key_for_output
(context.partition_key
)context.asset_partition_keys_for_output
(context.partition_keys
)context.asset_partitions_time_window_for_output
(context.partition_time_window
)context.asset_partition_key_range_for_output
(context.partition_key_range
)
Experimental
- [asset checks]
@asset_check
now has ablocking
parameter. When this is enabled, if the check fails with severityERROR
then any downstream assets in the same run won’t execute.
Documentation
- The Branch Deployment docs have been updated to reflect support for backfills
- Added Dagster’s maximum supported Python version (3.11) to Dagster University and relevant docs
- Added documentation for recommended partition limits (a maximum of 25K per asset).
- References to the Enterprise plan have been renamed to Pro, to reflect recent plan name changes
- Added syntax example for setting environment variables in PowerShell to our dbt with Dagster tutorial
- [Dagster University] Dagster Essentials to Dagster v1.6, and introduced the usage of
MaterializeResult
- [Dagster University] Fixed a typo in the Dagster University section on adding partitions to an asset (Thanks Brandon Peebles!)
- [Dagster University] Corrected lesson where sensors are covered (Thanks onefloid!)
Dagster Cloud
- Agent tokens can now be locked down to particular deployments. Agents will not be able to run any jobs scheduled for deployments that they are not permitted to access. By default, agent tokens have access to all deployments in an organization. Use the
Edit
button next to an agent token on theTokens
tab inOrg Settings
to configure permissions for a particular token. You must be an Organization Admin to edit agent token permissions.
1.6.3 (core) / 0.22.3 (libraries)
New
- Added support for the 3.0 release of the
pendulum
library, for Python versions 3.9 and higher. - Performance improvements when starting run worker processes or step worker processes for runs in code locations with a large number of jobs.
AllPartitionMapping
now supports mapping to downstream partitions, enabling asset backfills with these dependencies. Thanks @craustin!- [asset checks][experimental]
@asset_check
has new fieldsadditional_deps
andadditional_ins
to allow dependencies on assets other than the asset being checked. - [ui] Asset graph group nodes now show status counts.
- [dagster-snowflake] The Snowflake I/O Manager now has more specific error handling when a table doesn’t exist.
- [ui] [experimental] A new experimental UI for the auto-materialize history of a specific asset has been added. This view can be enabled under your user settings by setting “Use new asset auto-materialize history page”.
- [ui] Command clicking on an asset group will now select or deselect all assets in that group.
- [dagster-k8s] Added the ability to customize resource limits for initContainers used by Dagster system components in the Dagster Helm chart. Thanks @MattyKuzyk!
- [dagster-k8s] Added the ability to specify additional containers and initContainers in code locations in the Helm chart. Thanks @craustin!
- [dagster-k8s] Explicitly listed the set of RBAC permissions used by the agent Helm chart role instead of using a wildcard. Thanks @easontm!
- [dagster-dbt] Support for
dbt-core==1.4.*
is now removed because the version has reached end-of-life.
Bugfixes
- Previously, calling
get_partition_keys_not_in_subset
on aBaseTimeWindowPartitionsSubset
that targeted a partitions definition with no partitions (e.g. a future start date) would raise an error. Now, it returns an empty list. - Fixed issue which could cause invalid runs to be launched if a code location was updated during the course of an AMP evaluation.
- Previously, some asset backfills raised an error when targeting multi-assets with internal asset dependencies. This has been fixed.
- Previously, using the
LocalComputeLogManager
on Windows could result in errors relating to invalid paths. This has been resolved. Thanks @hainenber! - An outdated path in the contribution guide has been updated. Thanks @hainenber!
- [ui] Previously an error was sometimes raised when attempting to create a dynamic partition within a multi-partitioned asset via the UI. This has been fixed.
- [ui] The “Upstream materializations are missing” warning when launching a run has been expanded to warn about failed upstream materializations as well.
- [ui] The community welcome modal now renders properly in dark mode and some elements of Asset and Op graphs have higher contrast in both themes.
- [ui] Fixed dark mode colors for datepicker, error message, and op definition elements.
- [ui] Pressing the arrow keys to navigate op/asset graphs while the layout is loading no longer causes errors.
- [ui] Exporting asset and op graphs to SVG no longer fails when chrome extensions inject additional stylesheets into Dagster’s UI.
- [ui] Dagster now defaults to UTC when the user’s default timezone cannot be identified, rather than crashing with a date formatting error.
- [ui] Fixed an issue in the asset graph sidebar that caused groups to only list their first asset.
- [ui] Fixed an issue where sensors runs would undercount the number of dynamic partition requests added or deleted if there were multiple requests for additions/deletions.
- [docs] Fixed a typo in the “Using Dagster with Delta Lake” guide. Thanks @avriiil!
- [asset checks] Fixed an issue which could cause errors when using asset checks with step launchers.
- [dagster-webserver] A bug preventing WebSocket connections from establishing on python 3.11+ has been fixed.
- [dagster-databricks]
DatabricksJobRunner
now ensures the correctdatabricks-sdk
is installed. Thanks @zyd14! - [dagster-dbt] On run termination, an interrupt signal is now correctly forwarded to any in-progress dbt subprocesses.
- [dagster-dbt] Descriptions for dbt tests ingested as asset checks can now be populated using the
config.meta.description
. Thanks @CapitanHeMo! - [dagster-dbt] Previously, the error message displayed when no dbt profiles information was found would display an incorrect path. This has been fixed. Thanks @zoltanctoth!
- [dagster-k8s]
PipesK8sClient
can now correctly handleload_incluster_config
. Thanks @aignas!
Documentation
- Added a new category to Concepts: Automation. This page provides a high-level overview of the various ways Dagster allows you run data pipelines without manual intervention.
- Moved several concept pages under Concepts > Automation: Schedules, Sensors, Asset Sensors, and Auto-materialize Policies.
Dagster Cloud
- Fixed an issue where configuring the
agent_queue
key in adagster_cloud.yaml
file incorrectly failed to validate when using thedagster-cloud ci init
ordagster-cloud ci check
commands during CI/CD.
1.6.2 (core) / 0.22.2 (libraries)
New
- The warning for unloadable sensors and schedules in the Dagster UI has now been removed.
- When viewing an individual sensor or schedule, we now provide a button to reset the status of the sensor or schedule back to its default status as defined in code.