Fix: Dagster Not Working — Asset Loading, Resource Errors, and Daemon Issues
Quick Answer
How to fix Dagster errors — asset not found in definitions, resource not defined, dagster daemon not running, sensor or schedule not firing, DagsterInvariantViolationError, and asset materialization failing.
The Error
You try to load assets and Dagster can’t find them:
dagster.DagsterInvariantViolationError:
No assets with key 'my_dataset' found in Definitions

Or a resource is referenced but not configured:
dagster.DagsterInvalidDefinitionError:
Resource 'database' is required by asset 'raw_orders' but no resource with that key was provided.

Or schedules and sensors don’t fire:
$ dagster schedule list
# daily_etl - RUNNING
# But nothing has triggered in 24 hours

Or partitioned asset backfills fail with cryptic errors:
Partition 2025-04-09 has no materialization.
Upstream partition required but not found.

Or the UI shows assets as failed but logs reveal no error:
Asset materialization: FAILURE
Last error: <no error details available>Dagster’s asset-centric model is powerful but opinionated — it expects you to think in terms of materialized data (assets) rather than operations (tasks). Resources inject dependencies, schedules and sensors drive execution, and the daemon process coordinates everything. Missing any piece produces errors that don’t look like bugs in your code. This guide covers each failure mode.
Why This Happens
Dagster has three conceptual layers: assets (data that gets materialized), ops (computation units, usually implicit in asset functions), and jobs (ops grouped into runnable graphs). Resources inject external dependencies (databases, APIs, S3 clients) into asset functions via type annotations.
The Dagster daemon is a separate process that runs schedules, sensors, and backfills. The webserver (dagster dev) shows the UI but doesn’t run schedules on its own. Users often start dagster dev expecting everything to work and find that their 9am schedule silently never triggers.
Fix 1: Defining and Registering Assets
# assets.py
from dagster import asset

@asset
def raw_orders():
    import pandas as pd
    return pd.read_csv("orders.csv")

@asset
def processed_orders(raw_orders):
    return raw_orders[raw_orders["amount"] > 0]

# definitions.py
from dagster import Definitions, load_assets_from_modules
from . import assets
defs = Definitions(
    assets=load_assets_from_modules([assets]),
)

Common error — asset not registered:
DagsterInvariantViolationError: No assets with key 'my_asset' found in Definitions

Causes:
- Asset not in any file imported by definitions.py
- Module missing from the load_assets_from_modules call
- @asset decorator missing — the function is just a regular function
Explicit asset list (when auto-discovery gets confusing):
from dagster import Definitions
from myproject.assets import raw_orders, processed_orders, summary
defs = Definitions(
    assets=[raw_orders, processed_orders, summary],
)

Asset groups for organization:
from dagster import asset

@asset(group_name="raw")
def raw_orders(): ...

@asset(group_name="processed")
def clean_orders(raw_orders): ...

@asset(group_name="analytics")
def monthly_revenue(clean_orders): ...

Asset key prefixes for hierarchical organization:
@asset(key_prefix=["database", "raw"])
def orders(): ...
# Asset key: database/raw/orders

Common Mistake: Defining an asset function and running the Dagster UI without updating definitions.py. Dagster only sees assets loaded through Definitions. If you add a new asset file, you must either list its assets explicitly or include the module via load_assets_from_modules — or the UI won’t show it.
Fix 2: Resources and Dependency Injection
Resources inject shared state (database connections, API clients) into asset functions via type annotations.
from dagster import asset, ConfigurableResource
from pydantic import Field

class DatabaseResource(ConfigurableResource):
    connection_string: str = Field(..., description="Postgres connection")

    def execute(self, sql: str):
        import psycopg
        with psycopg.connect(self.connection_string) as conn:
            return conn.execute(sql).fetchall()

@asset
def raw_orders(database: DatabaseResource):
    return database.execute("SELECT * FROM orders")

# definitions.py
from dagster import Definitions, EnvVar
from .assets import raw_orders
from .resources import DatabaseResource
defs = Definitions(
    assets=[raw_orders],
    resources={
        "database": DatabaseResource(
            connection_string=EnvVar("DATABASE_URL"),
        ),
    },
)

EnvVar for runtime configuration — values are resolved when the asset runs, not at import time:
from dagster import EnvVar
resources = {
    "api_key": EnvVar("OPENAI_API_KEY"),
    "database": DatabaseResource(
        connection_string=EnvVar("DATABASE_URL"),
    ),
}
DagsterInvalidDefinitionError: Resource 'database' is required by asset 'raw_orders'
but no resource with that key was provided.

The asset’s parameter name (database in database: DatabaseResource) is the resource key. If definitions.py doesn’t register a resource under that key, the asset can’t run.
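Under the hood this is ordinary dependency injection keyed on parameter name. A minimal stdlib sketch of the pattern — the `inject` helper and the sample function are illustrations, not Dagster APIs:

```python
import inspect

def inject(fn, resources: dict):
    """Call fn, supplying each parameter from the resources dict by name."""
    params = inspect.signature(fn).parameters
    missing = [p for p in params if p not in resources]
    if missing:
        # Mirrors DagsterInvalidDefinitionError: required resource not provided
        raise KeyError(f"Resource(s) {missing} required by {fn.__name__} but not provided")
    return fn(**{p: resources[p] for p in params})

def raw_orders(database):
    return f"SELECT * via {database}"

print(inject(raw_orders, {"database": "postgres-conn"}))
```

Rename the parameter to `db` without registering a resource under the key "db" and the call fails — the same mismatch that triggers the Dagster error above.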
Pro Tip: Use ConfigurableResource (Pydantic-based) for all new code. The older string-based config API is deprecated. ConfigurableResource gives you type safety, IDE autocomplete, and better error messages — plus the ability to use EnvVar for secrets.
Multiple environments — define separate Definitions for dev/prod:
# definitions.py
import os
from dagster import Definitions, EnvVar
from .resources import DatabaseResource

if os.getenv("DAGSTER_ENV") == "production":
    resources = {
        "database": DatabaseResource(connection_string=EnvVar("PROD_DATABASE_URL")),
    }
else:
    resources = {
        "database": DatabaseResource(connection_string="postgresql://localhost/dev"),
    }

defs = Definitions(assets=[...], resources=resources)

Fix 3: Running the Daemon for Schedules and Sensors
dagster dev  # Starts the webserver AND the daemon

dagster dev is for development — it runs both the webserver (port 3000) and the daemon (schedules, sensors). In production, you run them separately:
# Webserver (UI)
dagster-webserver -h 0.0.0.0 -p 3000
# Daemon (background worker for schedules, sensors, backfills)
dagster-daemon run

Verify the daemon is running:

dagster-daemon liveness-check
# Checks for a recent daemon heartbeat

You can also check the Deployment → Daemons tab in the UI for heartbeat status.

If the daemon isn’t running, schedules and sensors silently never fire. The UI shows them as “running” (meaning enabled), but nothing actually triggers.
Schedule definition:
from dagster import schedule, RunRequest
@schedule(
    job_name="daily_etl_job",
    cron_schedule="0 9 * * *",  # 9 AM every day
    execution_timezone="America/New_York",
)
def daily_etl():
    return RunRequest(run_key=None, run_config={})

Register it:
defs = Definitions(
    assets=[...],
    jobs=[daily_etl_job],
    schedules=[daily_etl],
)

Turn it on from the UI or CLI:
dagster schedule start daily_etl

Schedules are off by default — they must be explicitly started before they fire.
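The cron_schedule string is standard five-field cron (minute, hour, day-of-month, month, day-of-week). A minimal matcher that handles only `*` and single numbers — a sketch, not full cron — shows why `0 9 * * *` means 9:00 AM daily:

```python
from datetime import datetime

def cron_matches(expr: str, dt: datetime) -> bool:
    """Check a datetime against a five-field cron expression.
    Supports only '*' and plain numbers (no ranges, lists, or steps)."""
    minute, hour, dom, month, dow = expr.split()
    # Python's weekday(): Monday=0; cron: Sunday=0 — convert accordingly
    actual = [dt.minute, dt.hour, dt.day, dt.month, (dt.weekday() + 1) % 7]
    for field, value in zip([minute, hour, dom, month, dow], actual):
        if field != "*" and int(field) != value:
            return False
    return True

print(cron_matches("0 9 * * *", datetime(2025, 4, 9, 9, 0)))   # True
print(cron_matches("0 9 * * *", datetime(2025, 4, 9, 10, 0)))  # False
```

Real cron (and Dagster's scheduler) also supports ranges, lists, and steps; this sketch only illustrates the field layout.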
Sensors (event-driven triggers):
from dagster import sensor, RunRequest, SensorEvaluationContext
@sensor(job=my_job, minimum_interval_seconds=60)
def file_sensor(context: SensorEvaluationContext):
    import os
    new_files = [f for f in os.listdir("./inbox") if f.endswith(".csv")]
    for filename in new_files:
        yield RunRequest(
            run_key=filename,
            run_config={"ops": {"process_file": {"config": {"filename": filename}}}},
        )

The daemon evaluates each sensor at most once every minimum_interval_seconds. Lower values mean faster reaction but more load on the daemon.
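run_key deduplication means the same filename never launches two runs. The underlying pattern — compare the current listing against stored state and emit only the delta — can be sketched in plain Python (the helper name is illustrative, not a Dagster API):

```python
def new_since_cursor(listing: list[str], cursor: set[str]) -> tuple[list[str], set[str]]:
    """Return files not yet seen, plus the updated cursor state."""
    new_files = sorted(f for f in listing if f not in cursor)
    return new_files, cursor | set(new_files)

cursor: set[str] = set()
new, cursor = new_since_cursor(["a.csv", "b.csv"], cursor)
print(new)  # ['a.csv', 'b.csv']
new, cursor = new_since_cursor(["a.csv", "b.csv", "c.csv"], cursor)
print(new)  # ['c.csv'] — previously seen files are skipped
```

In a real sensor, Dagster persists this state for you: read context.cursor and call context.update_cursor(...) after processing, so restarts don't reprocess old files.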
Fix 4: Partitioned Assets
from dagster import asset, AssetExecutionContext, DailyPartitionsDefinition

daily_partitions = DailyPartitionsDefinition(start_date="2024-01-01")

@asset(partitions_def=daily_partitions)
def daily_revenue(context: AssetExecutionContext):
    date = context.partition_key  # e.g., "2025-04-09"
    return compute_revenue_for_date(date)

Common error with partitioned assets:
Partition 2025-04-09 has no materialization.
Upstream partition required but not found.

Downstream partitioned assets expect their upstream partitions to be materialized first. If you try to compute monthly_summary for April before all April daily partitions exist, it fails.
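Before launching a downstream run, it helps to compute which upstream daily partitions are actually missing for the window. A stdlib sketch of that check (the helper name is illustrative — in Dagster the materialization record lives in the instance, not a plain set):

```python
from datetime import date, timedelta

def missing_daily_partitions(start: date, end: date, materialized: set[str]) -> list[str]:
    """List daily partition keys in [start, end] with no materialization."""
    missing = []
    day = start
    while day <= end:
        key = day.isoformat()  # Dagster daily partition keys are ISO dates
        if key not in materialized:
            missing.append(key)
        day += timedelta(days=1)
    return missing

done = {"2025-04-01", "2025-04-02", "2025-04-04"}
print(missing_daily_partitions(date(2025, 4, 1), date(2025, 4, 5), done))
# ['2025-04-03', '2025-04-05']
```

Any key this returns must be materialized (or backfilled) before the downstream partition can run.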
Materialize upstream first:
# From UI or CLI
dagster asset materialize --select daily_revenue --partition 2025-04-09
# Materialize a range
dagster asset materialize --select daily_revenue --partition-range 2025-04-01:2025-04-09

Multi-dimensional partitions (e.g., date × region):
from dagster import MultiPartitionsDefinition, StaticPartitionsDefinition
region_partitions = StaticPartitionsDefinition(["us", "eu", "asia"])
multi = MultiPartitionsDefinition({
    "date": daily_partitions,
    "region": region_partitions,
})

@asset(partitions_def=multi)
def regional_revenue(context):
    keys = context.partition_key.keys_by_dimension
    date = keys["date"]
    region = keys["region"]
    return compute(date, region)

Backfill — materialize a range of missed partitions:
dagster asset materialize --select my_asset --partition-range 2025-01-01:2025-03-31

In the UI, the “Backfill” button lets you select a partition range visually.
Fix 5: Asset Checks and Data Quality
Asset checks validate data after materialization — similar to dbt tests:
from dagster import asset, asset_check, AssetCheckResult
@asset
def customers():
    import pandas as pd
    return pd.read_csv("customers.csv")

@asset_check(asset=customers)
def not_null_email(customers):
    null_count = customers["email"].isna().sum()
    return AssetCheckResult(
        passed=null_count == 0,
        description=f"{null_count} nulls in email column",
        metadata={"null_count": null_count},
    )

@asset_check(asset=customers, blocking=True)  # Blocks downstream on failure
def unique_customer_id(customers):
    dup_count = customers["id"].duplicated().sum()
    return AssetCheckResult(passed=dup_count == 0, metadata={"duplicates": dup_count})

blocking=True asset checks prevent downstream assets from materializing when the check fails — important for enforcing data quality gates.
Register in Definitions:
defs = Definitions(
    assets=[customers, ...],
    asset_checks=[not_null_email, unique_customer_id],
)

For dbt-based data quality patterns with similar test semantics, see dbt not working.
Fix 6: IO Managers — Where Assets Are Stored
IO managers control how asset outputs are persisted. The default writes to a local pickle file — fine for dev, wrong for production.
from dagster_aws.s3 import S3PickleIOManager, S3Resource
defs = Definitions(
    assets=[...],
    resources={
        "io_manager": S3PickleIOManager(
            s3_resource=S3Resource(),
            s3_bucket="my-dagster-bucket",
        ),
    },
)

Per-asset IO manager:
from dagster import asset
@asset(io_manager_key="snowflake_io_manager")
def my_asset(): ...

@asset(io_manager_key="s3_io_manager")
def other_asset(): ...

defs = Definitions(
    assets=[my_asset, other_asset],
    resources={
        "snowflake_io_manager": SnowflakeIOManager(...),
        "s3_io_manager": S3PickleIOManager(...),
    },
)

Custom IO manager:
from dagster import IOManager, io_manager, OutputContext, InputContext
class ParquetIOManager(IOManager):
    def handle_output(self, context: OutputContext, obj):
        path = f"{'/'.join(context.asset_key.path)}.parquet"
        obj.to_parquet(path)

    def load_input(self, context: InputContext):
        import pandas as pd
        path = f"{'/'.join(context.asset_key.path)}.parquet"
        return pd.read_parquet(path)

@io_manager
def parquet_io_manager():
    return ParquetIOManager()

Common Mistake: Leaving the default FilesystemIOManager in production. This writes pickles to the Dagster server’s local disk. If the server runs in Kubernetes or is restarted, assets are lost. Production deployments should use S3/GCS/Snowflake IO managers.
Fix 7: Jobs and Op-Based Workflows
For workflows that don’t fit the asset model (one-off operations, side effects), use ops and jobs:
from dagster import op, job

@op
def extract() -> list:
    return [1, 2, 3, 4, 5]

@op
def transform(data: list) -> list:
    return [x * 2 for x in data]

@op
def load(data: list):
    print(f"Loaded {len(data)} records")

@job
def etl_job():
    load(transform(extract()))

Asset jobs (combine assets into a job for scheduling):
from dagster import define_asset_job, AssetSelection
daily_etl_job = define_asset_job(
    name="daily_etl",
    selection=AssetSelection.groups("raw", "processed"),
)

defs = Definitions(
    assets=[...],
    jobs=[daily_etl_job],
    schedules=[daily_etl_schedule],
)

Graph composition for reusable sub-pipelines:
from dagster import graph
@graph
def preprocess(raw_data):
    cleaned = clean(raw_data)
    validated = validate(cleaned)
    return validated

@job
def full_pipeline():
    data = extract()
    preprocessed = preprocess(data)
    train_model(preprocessed)

Fix 8: Debugging and Logs
Enable structured logs inside assets:
from dagster import asset

@asset
def my_asset(context):
    logger = context.log  # Built-in structured logger
    logger.info("Starting materialization")
    logger.debug("Detail info")
    try:
        result = do_work()
    except Exception as e:
        logger.error(f"Failed: {e}", exc_info=True)
        raise
    logger.info(f"Completed with {len(result)} rows")
    return result

Materialize metadata for UI display:
from dagster import MaterializeResult, MetadataValue
@asset
def my_asset():
    df = compute_data()
    return MaterializeResult(
        metadata={
            "num_rows": MetadataValue.int(len(df)),
            "preview": MetadataValue.md(df.head().to_markdown()),
            "report": MetadataValue.url("https://example.com/report"),
        },
    )

Run history in the UI shows asset materializations, run steps, logs, and timings. When debugging a failed run, always check the run detail page first.
Command-line run:
# Materialize specific assets
dagster asset materialize --select "my_asset"
# "+" extends the selection along dependencies:
# "my_asset+" — my_asset and all downstream
# "+my_asset" — my_asset and all upstream
# "+my_asset+" — my_asset with all upstream AND downstream

Still Not Working?
Dagster vs Airflow vs Prefect
- Dagster — Asset-centric, strong dev experience, software-defined assets. Best for data products where lineage matters.
- Airflow — Task-centric, mature ecosystem, huge community. See Airflow not working. Best when you need Airflow’s operator ecosystem.
- Prefect — Task-centric with modern Python API. Best for workflows that mix data and non-data tasks.
Integration with dbt
Dagster has first-class dbt integration:
pip install dagster-dbt

from pathlib import Path
from dagster_dbt import DbtCliResource, dbt_assets

@dbt_assets(manifest=Path("path/to/manifest.json"))
def my_dbt_assets(context, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()

For dbt-specific patterns that Dagster orchestrates, see dbt not working.
Production Deployment
Dagster runs in Kubernetes via the dagster-k8s package, on ECS via dagster-ecs, or managed via Dagster Cloud. For database configuration errors that affect deployment, see PostgreSQL connection refused.
Resource Configuration at Runtime
# Access a resource inside an asset
context.resources.database

# Override resources in tests
from dagster import materialize

result = materialize(
    [my_asset],
    resources={
        "database": DatabaseResource(connection_string="sqlite:///test.db"),
    },
)

For testing patterns with pytest fixtures, see pytest fixture not found.
Freshness Checks and SLAs
Dagster can alert when assets become stale (not materialized within a time window):
from dagster import asset, FreshnessPolicy
@asset(
    freshness_policy=FreshnessPolicy(
        maximum_lag_minutes=60,  # Alert if not materialized in 60 minutes
        cron_schedule="0 * * * *",  # Expected at the top of each hour
    ),
)
def hourly_metrics():
    ...

The UI highlights stale assets in red and can route alerts to Slack or email via sensors.
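The staleness rule itself reduces to a time comparison: an asset is stale once the gap since its last materialization exceeds the allowed lag. A sketch of that check (the helper name is illustrative, not a Dagster API):

```python
from datetime import datetime, timedelta

def is_stale(last_materialized: datetime, now: datetime, maximum_lag_minutes: int) -> bool:
    """True when the asset has not been materialized within the allowed lag."""
    return now - last_materialized > timedelta(minutes=maximum_lag_minutes)

last = datetime(2025, 4, 9, 8, 0)
print(is_stale(last, datetime(2025, 4, 9, 8, 45), 60))  # False — within the hour
print(is_stale(last, datetime(2025, 4, 9, 9, 30), 60))  # True — 90 minutes since last run
```

Dagster evaluates the same kind of condition continuously and surfaces the result as the asset's freshness status.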
Code Locations and Separate Workspaces
For multi-team deployments, split Dagster into multiple code locations — each can have its own Python environment and dependencies:
# workspace.yaml
load_from:
  - python_module:
      module_name: team_a.definitions
      working_directory: /path/to/team_a
  - python_module:
      module_name: team_b.definitions
      working_directory: /path/to/team_b

Each code location runs in its own process, so dependency conflicts between teams don’t affect each other. The UI shows assets from all locations in one unified graph.
Asset Selection Syntax Reference
Dagster’s asset selection syntax is powerful but syntax-heavy:
| Syntax | Meaning |
|---|---|
| "my_asset" | Just this asset |
| "my_asset+" | This asset and all downstream |
| "+my_asset" | This asset and all upstream |
| "+my_asset+" | This asset plus both directions |
| "group:my_group" | All assets in the group |
| "tag:my_tag" | All assets with the tag |
| "key_prefix:raw/" | All assets under this prefix |
| "my_asset-other_asset" | Exclude other_asset from the selection |
Combine with AssetSelection in Python for complex filters across jobs and sensors.
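The `+` affixes are easy to mis-read; a small parser makes the three forms explicit (a sketch of the syntax in the table above, not Dagster's own parser):

```python
def parse_selection(sel: str) -> dict:
    """Split a selection like '+my_asset+' into the asset name plus traversal flags."""
    upstream = sel.startswith("+")
    downstream = sel.endswith("+")
    name = sel.strip("+")
    return {"asset": name, "include_upstream": upstream, "include_downstream": downstream}

print(parse_selection("my_asset+"))
# {'asset': 'my_asset', 'include_upstream': False, 'include_downstream': True}
print(parse_selection("+my_asset+"))
# {'asset': 'my_asset', 'include_upstream': True, 'include_downstream': True}
```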
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.