Fix: AWS Lambda Cold Start Timeout and Slow First Invocation

FixDevs

Quick Answer

How to fix AWS Lambda cold start timeouts and slow first invocations — provisioned concurrency, reducing package size, connection reuse, and language-specific optimizations.

The Error

Your Lambda function works fine on subsequent calls but the first invocation after a period of inactivity fails with a timeout or is significantly slow:

Task timed out after 3.00 seconds

Or in CloudWatch Logs:

REPORT RequestId: abc-123  Duration: 28432.45 ms  Billed Duration: 28433 ms
Init Duration: 26891.23 ms

The Init Duration shows the cold start time — in this case, the initialization took 26 seconds before the handler even ran.

Or users report the first API call after quiet periods takes 5–30 seconds while subsequent calls are fast (under 100ms).

Why This Happens

Lambda functions are not always running. When a function has not been invoked recently, AWS deallocates the execution environment. The next invocation triggers a cold start:

  1. AWS provisions a new execution environment (VM/container).
  2. Downloads and extracts the deployment package.
  3. Starts the language runtime (JVM, .NET CLR, Python interpreter, Node.js).
  4. Executes any initialization code outside the handler function.
  5. Then runs your handler.

Steps 1–4 are the cold start. This can take anywhere from 100ms (small Node.js function) to 30+ seconds (large Java function with Spring Boot).

Cold starts are more likely when:

  • The function has not been invoked recently (idle environments are reclaimed after a window AWS does not document, typically minutes).
  • Traffic spikes cause Lambda to provision new instances in parallel.
  • The deployment package is large.
  • The runtime is slow to initialize (Java and .NET are slowest, then Python, then Node.js).
  • The handler initializes heavy resources (DB connections, SDK clients) inside the handler instead of at module level.

Fix 1: Move Initialization Code Outside the Handler

Code outside the handler runs during cold start initialization and is reused across warm invocations. Database connections and SDK clients initialized inside the handler are created on every invocation:

Broken — initializes on every invocation:

// handler.js
exports.handler = async (event) => {
  // This runs on EVERY invocation — creates new connection each time
  const { Pool } = require("pg");
  const pool = new Pool({ connectionString: process.env.DATABASE_URL });

  const result = await pool.query("SELECT * FROM users WHERE id = $1", [event.userId]);
  return result.rows[0];
};

Fixed — initialize once, reuse across invocations:

// handler.js
const { Pool } = require("pg");

// Runs ONCE during cold start — reused by all warm invocations
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 1, // Lambda: keep pool small — each instance handles one request at a time
});

exports.handler = async (event) => {
  // Connection already established — just query
  const result = await pool.query("SELECT * FROM users WHERE id = $1", [event.userId]);
  return result.rows[0];
};

Python example:

import boto3
import os

# Initialize outside handler — runs once per cold start
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["TABLE_NAME"])

def handler(event, context):
    # table is already initialized — no cold start overhead here
    response = table.get_item(Key={"id": event["id"]})
    return response.get("Item")

Pro Tip: Lambda execution environments are reused for subsequent invocations (warm starts). Any state initialized at the module level persists between warm invocations of the same environment. This is why connection reuse works — and also why you must be careful with mutable global state.
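A minimal Python sketch of both sides of that behavior. The `expensive_lookup` helper is a hypothetical stand-in for a slow database or API call:

```python
# Module-level state persists across warm invocations of one environment.
# A cache exploits this; an unbounded accumulator abuses it.

def expensive_lookup(user_id):
    # Hypothetical stand-in for a slow DB or API call
    return {"id": user_id, "name": f"user-{user_id}"}

cache = {}          # good: memoized results survive warm starts
request_log = []    # risky: grows for the lifetime of the environment

def handler(event, context):
    user_id = event["id"]
    if user_id not in cache:
        cache[user_id] = expensive_lookup(user_id)  # only on a cache miss
    request_log.append(user_id)  # state from earlier requests is still here
    return cache[user_id]
```

Repeated warm invocations hit the cache, but `request_log` keeps growing until the environment is recycled; prefer bounded caches over open-ended accumulators.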

Fix 2: Use Provisioned Concurrency for Latency-Critical Functions

Provisioned Concurrency keeps a specified number of execution environments initialized and ready at all times — eliminating cold starts entirely for those instances:

# Set provisioned concurrency on an alias or published version
# ($LATEST is not supported — here "production" is an alias)
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier production \
  --provisioned-concurrent-executions 10

With a published version (recommended):

# Publish a version
aws lambda publish-version --function-name my-api-handler

# Apply provisioned concurrency to the version
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier 1 \
  --provisioned-concurrent-executions 5

Cost: Provisioned concurrency costs money even when the function is not invoked — you pay for the pre-initialized environments. Calculate the trade-off:

  • Without provisioned concurrency: cold starts for some requests, lower base cost.
  • With provisioned concurrency: no cold starts, higher base cost.

For latency-sensitive APIs (user-facing, payment processing), provisioned concurrency is often worth the cost.
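A rough sketch of that trade-off calculation. The rate below is an illustrative placeholder, not a quote; substitute the current Lambda pricing for your region:

```python
# Back-of-envelope monthly cost of provisioned concurrency, ignoring
# invocation charges. The rate is an ILLUSTRATIVE placeholder, not a quote.
PC_RATE_PER_GB_SECOND = 0.0000041667  # assumed $/GB-second

def monthly_provisioned_cost(instances, memory_gb, hours_per_day=24):
    """Cost of keeping `instances` pre-initialized environments warm."""
    seconds_per_month = hours_per_day * 3600 * 30
    return instances * memory_gb * seconds_per_month * PC_RATE_PER_GB_SECOND

# 10 warm 1 GB environments, around the clock
print(f"~${monthly_provisioned_cost(10, 1.0):.0f}/month")
```

Lowering `hours_per_day` models scheduled scaling: running warm capacity only during business hours cuts the base cost proportionally.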

Schedule provisioned concurrency with Application Auto Scaling:

# Scale up during business hours, scale down at night (the alias must first
# be registered as a scalable target with register-scalable-target)
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-api-handler:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name scale-up-morning \
  --schedule "cron(0 8 * * ? *)" \
  --scalable-target-action MinCapacity=10,MaxCapacity=10

Fix 3: Reduce Package Size

Lambda downloads and extracts the deployment package on every cold start. Smaller packages = faster cold starts:

Check your current package size:

# After building your deployment package
du -sh deployment.zip
ls -lh deployment.zip

For Node.js — remove dev dependencies and unused packages:

# Only include production dependencies
npm ci --omit=dev

# See what is taking up space
du -sh node_modules/* | sort -rh | head -20  # Or use webpack-bundle-analyzer

Use Lambda Layers for shared dependencies:

# Create a layer with large shared libraries (e.g., AWS SDK, express)
aws lambda publish-layer-version \
  --layer-name node-dependencies \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs20.x

Layers keep the function package itself small and are versioned independently, but they still count toward the 250 MB unzipped size limit and are extracted into the execution environment at cold start; the main win is smaller, faster function deployments and shared dependency management.

For Python — use slim base images and avoid heavy packages:

# Identify heavy dependencies (pandas, numpy, scipy are large —
# consider alternatives or layers)
pip show pandas numpy scipy

# Vendor Linux-compatible binary wheels that match the Lambda runtime
pip install numpy --platform manylinux2014_x86_64 --only-binary=:all: --target ./package

For Java — use GraalVM native image or Quarkus:

Spring Boot cold starts can exceed 10 seconds on Lambda. Alternatives:

  • Quarkus with native compilation: sub-100ms cold starts.
  • Micronaut or Helidon: faster initialization than Spring.
  • GraalVM native image: compile to native binary, ~10-50ms cold start.
  • AWS Lambda SnapStart: snapshots the initialized state and restores it (originally Java only; now also available for Python and .NET).

Fix 4: Enable Lambda SnapStart (Java)

AWS Lambda SnapStart (available for Java 11+ Corretto runtime) takes a snapshot of the initialized execution environment and restores it on subsequent cold starts:

# SAM template
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java21
    SnapStart:
      ApplyOn: PublishedVersions
    AutoPublishAlias: live

SnapStart reduces Java cold starts from seconds to milliseconds. It works by:

  1. Initializing the function once.
  2. Taking a memory snapshot.
  3. Restoring from snapshot on cold start (much faster than re-initializing).

Note: SnapStart requires runtime hooks (via the open-source CRaC API, which Lambda uses for snapshot lifecycle events) for resources that must be re-established after a restore, such as database connections and random seeds:

import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;

public class DataSourceLifecycle implements Resource {

    public DataSourceLifecycle() {
        // Register this hook with the CRaC global context during init
        Core.getGlobalContext().register(this);
    }

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) throws Exception {
        // Close connections before the snapshot is taken
        dataSource.close();
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) throws Exception {
        // Reconnect after restore
        dataSource.initialize();
    }
}

Fix 5: Keep Functions Warm with Scheduled Pings

For functions that cannot use provisioned concurrency, a scheduled ping every few minutes keeps at least one instance warm:

CloudWatch Events (EventBridge) scheduled rule:

# Ping the function every 5 minutes
aws events put-rule \
  --name lambda-warmer \
  --schedule-expression "rate(5 minutes)"

aws events put-targets \
  --rule lambda-warmer \
  --targets "Id=1,Arn=arn:aws:lambda:us-east-1:123456789:function:my-function"

Handle warm pings in the function:

exports.handler = async (event) => {
  // Skip actual work for warm-up pings
  if (event.source === "aws.events" && event["detail-type"] === "Scheduled Event") {
    console.log("Warm ping — skipping");
    return { statusCode: 200, body: "warm" };
  }

  // Normal handler logic
  return await processRequest(event);
};

Limitation: Pings only keep ONE instance warm. If you have concurrent requests, new instances still cold-start. Provisioned concurrency is the correct solution for predictable concurrency requirements.

Fix 6: Optimize Runtime-Specific Cold Start Performance

Node.js:

// Use ES modules carefully — CJS loads faster in some cases
// Avoid dynamic requires inside the handler
// Use esbuild or webpack to bundle and tree-shake

// esbuild example
// esbuild src/handler.ts --bundle --platform=node --target=node20 --outfile=dist/handler.js

Python:

# Lazy-load heavy modules inside the handler for rarely-used code paths
def handler(event, context):
    if event.get("action") == "generate_report":
        import pandas as pd  # Only loaded when needed
        # ...

Go and Rust: These compile to native binaries with minimal cold starts (~10ms). If cold starts are critical and you have flexibility in language choice, Go or Rust Lambda functions have near-zero initialization overhead.

Fix 7: Measure and Monitor Cold Starts

Find cold starts in CloudWatch Logs Insights:

fields @timestamp, @duration, @initDuration, @billedDuration
| filter ispresent(@initDuration)
| sort @timestamp desc
| limit 100

@initDuration is only present in cold start invocations. This query shows all cold starts with their initialization time.
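Outside of Logs Insights, the same signal can be pulled from exported logs. A minimal sketch that extracts init durations from raw REPORT lines:

```python
import re

# Extract "Init Duration" from Lambda REPORT log lines. Only cold starts
# carry this field, so the result lists every cold start's init time (ms).
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")

def init_durations(log_lines):
    return [float(m.group(1)) for line in log_lines
            if (m := INIT_RE.search(line))]

lines = [
    "REPORT RequestId: abc Duration: 102.10 ms Billed Duration: 103 ms",
    "REPORT RequestId: def Duration: 28432.45 ms Init Duration: 26891.23 ms",
]
print(init_durations(lines))  # only the second line is a cold start
```

Feeding the result into a percentile calculation (p50/p99 init time) gives a quick picture of how bad cold starts actually are for your traffic.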

Set up an alarm for high cold start frequency. Init duration is not published as a built-in CloudWatch metric (it only appears in the REPORT log line), so first create a metric filter that counts cold starts, then alarm on the resulting custom metric:

# Count cold starts by matching the Init Duration field in REPORT lines
aws logs put-metric-filter \
  --log-group-name /aws/lambda/my-function \
  --filter-name cold-start-count \
  --filter-pattern '"Init Duration"' \
  --metric-transformations metricName=ColdStarts,metricNamespace=Custom/Lambda,metricValue=1

# Alarm when cold starts exceed 10 in 5 minutes
aws cloudwatch put-metric-alarm \
  --alarm-name lambda-cold-starts \
  --metric-name ColdStarts \
  --namespace Custom/Lambda \
  --statistic Sum \
  --period 300 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123:my-topic

Still Not Working?

Check VPC configuration. Lambda functions inside a VPC have longer cold starts because they need to set up ENIs (Elastic Network Interfaces). AWS improved VPC cold starts significantly in 2019–2020, but they are still longer than non-VPC functions. Only put Lambda in a VPC if you need access to VPC resources (RDS in private subnet, ElastiCache).

Check memory allocation. Lambda allocates CPU proportional to memory. A function with 128MB gets minimal CPU — increasing memory to 512MB or 1024MB often reduces both cold start and execution time, sometimes resulting in lower overall cost (less billable duration despite more memory cost per ms).
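A toy illustration of why that works, with made-up durations. Lambda bills GB-seconds (memory × duration), and the extra CPU that comes with more memory usually shortens duration:

```python
# Lambda bills GB-seconds: memory_gb * duration_s. Extra memory buys extra
# CPU, so duration often falls faster than memory rises. Numbers are made up.
RATE_PER_GB_SECOND = 0.0000166667  # illustrative on-demand rate, not a quote

def invocation_cost(memory_gb, duration_s):
    return memory_gb * duration_s * RATE_PER_GB_SECOND

small = invocation_cost(0.128, 4.0)  # 128 MB, CPU-starved: 4.0 s
big = invocation_cost(1.0, 0.4)      # 1 GB, much more CPU: 0.4 s
print(big < small)                   # more memory, lower bill
```

The crossover depends entirely on how CPU-bound your function is, which is exactly what the power tuning tool below measures empirically.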

# Use AWS Lambda Power Tuning to find the optimal memory setting
# https://github.com/alexcasalboni/aws-lambda-power-tuning

Check for initialization errors. If your cold start code throws an error, Lambda retries the initialization on every invocation — making every call a cold start. Check CloudWatch Logs for initialization errors.

For Lambda import errors specifically, see Fix: AWS Lambda Import Module Error. For Lambda timeouts unrelated to cold starts, see Fix: AWS Lambda Timeout.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
