Fix: AWS RDS Connection Timed Out from Lambda or EC2

FixDevs

Quick Answer

How to fix AWS RDS connection timeout errors from Lambda functions and EC2 instances: security group configuration, VPC settings, connection pooling, and RDS Proxy setup for Lambda.

The Connection That Times Out Instead of Failing

The cruel thing about an RDS timeout is that it does not fail, it hangs. A wrong password comes back in milliseconds with a clear message. A network problem just sits there until something else gives up, which on Lambda means the whole invocation dies at its timeout with no database error in the logs at all. The first time I chased one of these, I spent an hour reading application code before I accepted the truth: the packet never reached the database. Almost every fix below is about the network path, not the database, and the fastest way to prove it is to test raw TCP connectivity before touching a single line of code.

A Lambda function or EC2 instance fails to connect to RDS with:

Error: connect ETIMEDOUT 10.0.1.45:5432

Or:

SequelizeConnectionError: connect ETIMEDOUT
Error: Connection timeout expired (MySQL: ETIME)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: Connection timed out

Or the Lambda function times out without any database error and simply hangs until its own timeout is reached:

Task timed out after 30.00 seconds

Why the Connection Never Arrives

An RDS connection timeout almost always means the connection attempt is being dropped at the network layer; it is not a database authentication issue. The request never reaches RDS. Common causes:

  • Security group misconfiguration: the RDS security group does not allow inbound traffic from the Lambda or EC2 security group on the database port (5432 for PostgreSQL, 3306 for MySQL).
  • Lambda not attached to the VPC: a Lambda function running outside the VPC cannot reach RDS in a private subnet. Lambda must be configured to run inside the VPC.
  • RDS reachable only from inside the VPC: a Lambda function in the VPC can reach RDS in a private subnet, but internet-routed traffic cannot.
  • Lambda in a public subnet: a VPC-attached Lambda function gets no public IP, so it has no internet access even in a public subnet, and any call it makes to a public endpoint before connecting to the database will time out. RDS is typically in a private subnet; use private subnets for Lambda too.
  • Too many connections: Lambda scales horizontally, and hundreds of concurrent invocations each opening a database connection can exhaust RDS’s connection limit, so new connection attempts are rejected or stall.
  • RDS is stopped or in an unavailable state: check the RDS console.
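Because the causes above split cleanly into "never reached the database" and "reached it and was turned away", it can help to classify the error in code before digging further. The following is a minimal sketch, not a complete mapping: classifyDbError and the category names are hypothetical, and it assumes the driver exposes a code on the error object (Node socket errno strings, PostgreSQL SQLSTATE values as surfaced by pg, or MySQL error names as surfaced by mysql2).

```javascript
// Error codes grouped by the layer that most likely produced them.
const NETWORK_CODES = new Set(['ETIMEDOUT', 'EHOSTUNREACH', 'ENETUNREACH']);
const AUTH_CODES = new Set(['28P01', '28000', 'ER_ACCESS_DENIED_ERROR']);
const CAPACITY_CODES = new Set(['53300', 'ER_CON_COUNT_ERROR']);

// Hypothetical helper: returns a coarse category for a driver error.
function classifyDbError(err) {
  const code = err && err.code ? String(err.code) : '';
  if (NETWORK_CODES.has(code)) return 'network';   // packet never arrived: security group, route, subnet
  if (code === 'ECONNREFUSED') return 'refused';   // host reachable, nothing listening on that port
  if (code === 'ENOTFOUND') return 'dns';          // endpoint name did not resolve
  if (AUTH_CODES.has(code)) return 'auth';         // reached the database, credentials rejected
  if (CAPACITY_CODES.has(code)) return 'capacity'; // reached the database, no free connection slot
  return 'unknown';                                // includes errors that carry no code at all
}
```

Only the 'network' category is the subject of this article; 'auth' and 'capacity' mean the network path is fine and the fix lives elsewhere.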

In Production: Incident Lens

How the incident surfaces. RDS connection timeouts in production usually present as a sudden Lambda invocation duration spike (jobs that normally complete in 200ms now hang for the full 30-second timeout), or as exhausted connection-pool errors from long-running services. The function returns “Task timed out after 30.00 seconds” without ever reaching the database, so application logs are bare: no SQL error, no auth failure, just silence followed by the runtime killing the invocation. EC2 services running PostgreSQL clients show the same ETIMEDOUT at the network layer.

Blast radius. Depends on which layer broke. Security-group misconfig hits 100% of new connections from the affected source. Connection-pool exhaustion only hits new connections; existing pool members keep working, so the visible blast is “p99 latency spikes” on cold paths while warm paths stay fine. RDS instance failover creates a window where every connect-side call times out before the DNS CNAME flips to the standby: typically 60-120 seconds for classic Multi-AZ, and as little as ~35 seconds for the newer Multi-AZ DB cluster (three-instance) deployments. This is bounded but can cascade if your application opens new connections during the window instead of using a pool.

The monitoring signal that catches it. RDS CloudWatch DatabaseConnections against the instance’s max_connections is the leading indicator; alert at 80% utilization. CPUUtilization and FreeableMemory catch resource exhaustion that precedes connection storms. For Lambda specifically, alert on Duration p99 approaching the function timeout (not just on Errors), because a connect-timeout case looks like “Duration = timeout”, not like an error. The Lambda ConcurrentExecutions metric paired with a flat DatabaseConnections count usually means new invocations are timing out at the network layer rather than the database layer.
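The two alert conditions in that paragraph reduce to simple arithmetic, sketched here. evaluateSignals and its field names are hypothetical, the inputs are assumed to be values you have already pulled from CloudWatch, and the 90% duration threshold is my own assumption (only the 80% connection threshold comes from the text).

```javascript
// Hypothetical helper: turns raw metric values into alert names.
function evaluateSignals({ databaseConnections, maxConnections, durationP99Ms, functionTimeoutMs }) {
  const alerts = [];
  const utilization = databaseConnections / maxConnections;

  // Leading indicator: connection count approaching the instance limit.
  if (utilization >= 0.8) alerts.push('connections-near-limit');

  // A connect timeout shows up as Duration close to the timeout, not as an error.
  // The 0.9 factor is an assumed margin; tune it to your own latency profile.
  if (durationP99Ms >= 0.9 * functionTimeoutMs) alerts.push('duration-near-timeout');

  return { utilization, alerts };
}
```

With 85 of 100 connections in use and a healthy 200ms p99, only 'connections-near-limit' fires; with a p99 of 29.5 seconds against a 30-second timeout and a low connection count, only 'duration-near-timeout' fires, which is the network-layer pattern described above.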

Recovery sequence. First, confirm whether the issue is RDS-side or network-side. Open the RDS console: if the status is anything but available, the instance itself is the problem and recovery is “wait for AWS.” If it is available, drop into the RDS Performance Insights view: high “wait/CPU” with a normal connection count means the database is choking on slow queries; a high connection count means clients are leaking. Second, if it is network-side, attempt connectivity from an EC2 host in the same subnet with nc -zv (see the network checks at the end of this article); if that succeeds, the issue is the Lambda or remote subnet’s security group or route table. Third, the rollback is usually “scale up the database connection limit temporarily” via aws rds modify-db-parameter-group or an instance class change, then fix the underlying connection leak in code on a follow-up deploy.
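The recovery sequence is a short decision tree, and writing it down as one keeps the order honest under pressure. This is only a sketch: triage and its return labels are hypothetical, the 0.8 threshold is borrowed from the monitoring paragraph, and the final 'database-network' branch is my own inference for the case where even a host in the same subnet cannot connect.

```javascript
// Hypothetical helper: returns which layer to investigate first.
function triage({ instanceStatus, databaseConnections, maxConnections, probeFromSameSubnet }) {
  if (instanceStatus !== 'available') {
    return 'instance';          // RDS itself is the problem: wait for it to come back
  }
  if (databaseConnections / maxConnections >= 0.8) {
    return 'connection-leak';   // clients are holding or leaking connections
  }
  if (probeFromSameSubnet === 'open') {
    return 'caller-network';    // database reachable: check the Lambda's security group and route table
  }
  return 'database-network';    // unreachable even from a nearby host: check the RDS security group
}
```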

Postmortem-style preventive. The durable controls: (1) RDS Proxy in front of any Lambda-RDS path so pooling is centralized; (2) connectionTimeoutMillis set to a value shorter than the Lambda timeout (e.g., 5s) so the function fails fast with a clear error instead of timing out silently; (3) a daily synthetic Lambda that connects and runs SELECT 1 against every RDS instance from every Lambda subnet, which catches security-group drift before users do; (4) CloudWatch alarm on DatabaseConnections / max_connections > 0.8 paired with autoscaling for read replicas; (5) server-side idle-connection cleanup tuned to the right knob, which is not a parameter called connection_timeout. On PostgreSQL use idle_in_transaction_session_timeout (and idle_session_timeout on PG14+) plus tcp_keepalives_idle; on MySQL use wait_timeout/interactive_timeout. These reclaim sessions left behind by clients that vanished without closing, which is exactly what a timed-out Lambda does.

Fix 1: Configure Security Groups Correctly

The most common cause. The RDS security group must explicitly allow inbound traffic from your Lambda or EC2:

Check the RDS security group:

  1. Go to AWS Console → RDS → Databases → click your DB instance.
  2. Under “Connectivity & security”, find the VPC security group.
  3. Click the security group → Inbound rules.
  4. Verify there is a rule allowing the database port from your compute resource.

Correct inbound rule on the RDS security group:

Type        Protocol  Port  Source
Custom TCP  TCP       5432  sg-xxxxxxxxx (Lambda’s security group)
Custom TCP  TCP       3306  sg-xxxxxxxxx (EC2’s security group)

Using AWS CLI to add the rule:

# Allow Lambda's security group (sg-lambda-id) to reach RDS on port 5432
aws ec2 authorize-security-group-ingress \
  --group-id sg-rds-id \
  --protocol tcp \
  --port 5432 \
  --source-group sg-lambda-id \
  --region us-east-1

One habit that has saved me a lot of grief: reference the security group as the source, never a hardcoded IP range, for anything in the same VPC. When an instance is replaced or Lambda scales, a security-group reference keeps working automatically, whereas an IP allowlist silently goes stale and you are back debugging timeouts that look like a network outage.

Do not use 0.0.0.0/0 as the source for RDS inbound rules. This opens your database to the entire internet. Always restrict to specific security groups or CIDR ranges.
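Both rules above (allow the caller's security group, never allow 0.0.0.0/0) can be checked mechanically against the JSON that aws ec2 describe-security-groups returns. The sketch below assumes you pass in one element of the SecurityGroups array from that output; auditRdsIngress and coversPort are hypothetical helper names, and the check deliberately ignores IPv6 ranges and prefix lists.

```javascript
// True when an IpPermissions entry applies to the given TCP port.
function coversPort(permission, port) {
  if (permission.IpProtocol === '-1') return true; // "-1" means all protocols and ports
  return permission.IpProtocol === 'tcp'
    && permission.FromPort <= port
    && port <= permission.ToPort;
}

// Hypothetical helper: inspects the RDS security group's inbound rules.
function auditRdsIngress(rdsGroup, sourceGroupId, port) {
  const matching = (rdsGroup.IpPermissions || []).filter((p) => coversPort(p, port));

  // Is the caller's security group referenced as a source on this port?
  const allowsSource = matching.some((p) =>
    (p.UserIdGroupPairs || []).some((pair) => pair.GroupId === sourceGroupId));

  // Is the port open to the whole IPv4 internet?
  const openToInternet = matching.some((p) =>
    (p.IpRanges || []).some((range) => range.CidrIp === '0.0.0.0/0'));

  return { allowsSource, openToInternet };
}
```

A result of allowsSource: false does not prove the traffic is blocked, since a CIDR rule could still cover the caller, but it does tell you the recommended security-group reference is missing.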

Fix 2: Place Lambda in the Same VPC as RDS

Lambda functions are not in your VPC by default. To reach RDS in a private subnet, Lambda must be configured to run inside the VPC:

Configure VPC in the Lambda console:

  1. Lambda → Functions → your function → Configuration → VPC.
  2. Click Edit.
  3. Select the same VPC as your RDS instance.
  4. Select private subnets in at least 2 AZs for availability (matching the RDS AZs reduces cross-AZ latency but is not required for connectivity).
  5. Select or create a security group for the Lambda function.
  6. Save.

Using AWS CLI:

aws lambda update-function-configuration \
  --function-name my-function \
  --vpc-config SubnetIds=subnet-private-1a,subnet-private-1b,SecurityGroupIds=sg-lambda-id

Using AWS CDK (TypeScript):

import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

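// fromLookup requires the stack to have an explicit env (account and region).
const vpc = ec2.Vpc.fromLookup(this, 'VPC', { vpcId: 'vpc-xxxxxxxx' });
// Import the existing Lambda security group (placeholder ID) so it can be referenced below.
const lambdaSecurityGroup = ec2.SecurityGroup.fromSecurityGroupId(this, 'LambdaSG', 'sg-xxxxxxxx');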

const lambdaFn = new lambda.Function(this, 'MyFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('src'),
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  securityGroups: [lambdaSecurityGroup],
});

Using Terraform:

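resource "aws_lambda_function" "my_function" {
  function_name = "my-function"
  handler       = "index.handler"
  runtime       = "nodejs20.x"
  filename      = "lambda.zip" # placeholder deployment package
  # Placeholder role; it needs the AWSLambdaVPCAccessExecutionRole managed policy to create ENIs.
  role          = aws_iam_role.lambda.arn

  vpc_config {
    subnet_ids         = [aws_subnet.private_1a.id, aws_subnet.private_1b.id]
    security_group_ids = [aws_security_group.lambda.id]
  }
}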

A note on cold starts, because the old advice is everywhere: VPC placement used to add several seconds to every cold start, and you will still find articles quoting “1 to 10 seconds.” That has not been true since late 2019. AWS moved VPC Lambdas to shared Hyperplane ENIs, which are created once when you configure or update the function’s VPC settings, not per invocation. Today the VPC penalty at cold start is typically under 100ms. Do not avoid putting Lambda in a VPC out of cold-start fear; that fear is six years out of date.

Fix 3: Fix Lambda Internet Access After VPC Placement

When Lambda is placed in a VPC with private subnets, it loses internet access. If your function also needs to reach anything outside the VPC (AWS service endpoints such as S3 or DynamoDB, or third-party APIs), you need one of:

Option A: NAT Gateway (for internet access):

Lambda (private subnet) → NAT Gateway (public subnet) → Internet Gateway → Internet

# Create NAT Gateway in a public subnet
aws ec2 create-nat-gateway \
  --subnet-id subnet-public-1a \
  --allocation-id eipalloc-xxxxxxxxx

# Update private subnet route table to use NAT Gateway for 0.0.0.0/0
aws ec2 create-route \
  --route-table-id rtb-private \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-xxxxxxxxx

Option B: VPC Endpoints (for AWS services, no NAT needed):

For accessing AWS services (S3, DynamoDB, Secrets Manager, SSM) without internet:

# Create VPC endpoint for S3 (Gateway endpoint — free)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxxx \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-private

# Create VPC endpoint for Secrets Manager (Interface endpoint; costs money)
# sg-endpoint-id is a placeholder; that group must allow inbound HTTPS (443) from sg-lambda-id
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.secretsmanager \
  --subnet-ids subnet-private-1a subnet-private-1b \
  --security-group-ids sg-endpoint-id
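To tell which of these situations a given Lambda subnet is in, look at where its route table sends 0.0.0.0/0. The sketch below assumes the input is the Routes array from aws ec2 describe-route-tables for the subnet's route table; defaultRouteTarget and its return labels are hypothetical.

```javascript
// Hypothetical helper: classifies where a route table sends internet-bound traffic.
function defaultRouteTarget(routes) {
  const def = routes.find((r) => r.DestinationCidrBlock === '0.0.0.0/0');
  if (!def) return 'none';                          // no default route: only VPC endpoints will work
  if (def.State === 'blackhole') return 'blackhole'; // target was deleted (e.g. a removed NAT gateway)
  if (def.NatGatewayId) return 'nat';               // private subnet with NAT: Lambda has internet access
  if (def.GatewayId && def.GatewayId.startsWith('igw-')) {
    return 'igw';                                   // public subnet: Lambda has no public IP, so no internet
  }
  return 'other';
}
```

For a Lambda subnet, 'nat' is the result you want when the function calls public endpoints; 'igw' means the function was placed in a public subnet and those calls will time out.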

Fix 4: Fix Connection Pool Exhaustion

Lambda scales to hundreds of concurrent invocations. Each invocation opening its own database connection quickly exhausts RDS’s connection limit:

Check RDS connection limits:

-- PostgreSQL
SHOW max_connections;
SELECT count(*) FROM pg_stat_activity;

-- MySQL
SHOW VARIABLES LIKE 'max_connections';
SHOW STATUS LIKE 'Threads_connected';
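Once you know max_connections, the ceiling on safe Lambda concurrency is a single division. This is a back-of-the-envelope sketch under stated assumptions: connectionBudget is a hypothetical helper, every execution environment is assumed to hold at most poolMaxPerEnvironment connections, and reservedForAdmin (default 5, my own choice) is headroom kept free for migrations and manual sessions.

```javascript
// Hypothetical helper: how many concurrent Lambda environments fit under max_connections.
function connectionBudget({ maxConnections, reservedForAdmin = 5, poolMaxPerEnvironment = 1 }) {
  // Connections left for the application after keeping some headroom.
  const usable = Math.max(0, maxConnections - reservedForAdmin);
  // Each execution environment can hold up to poolMaxPerEnvironment connections.
  return Math.floor(usable / poolMaxPerEnvironment);
}
```

If the function can scale past that number, either cap it with reserved concurrency or put RDS Proxy in between, as described next.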

Fix: use RDS Proxy (recommended for Lambda):

RDS Proxy pools and reuses database connections across Lambda invocations:

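# Create RDS Proxy via CLI (account IDs are 12-digit placeholders)
aws rds create-db-proxy \
  --db-proxy-name my-rds-proxy \
  --engine-family POSTGRESQL \
  --auth '[{"AuthScheme":"SECRETS","SecretArn":"arn:aws:secretsmanager:us-east-1:123456789012:secret:rds-credentials","IAMAuth":"DISABLED"}]' \
  --role-arn arn:aws:iam::123456789012:role/rds-proxy-role \
  --vpc-subnet-ids subnet-private-1a subnet-private-1b \
  --vpc-security-group-ids sg-rds-proxy-id

# Register the database as the proxy's target (my-db is a placeholder instance identifier)
# The RDS security group must also allow inbound 5432 from sg-rds-proxy-id
aws rds register-db-proxy-targets \
  --db-proxy-name my-rds-proxy \
  --db-instance-identifiers my-db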

After creating the proxy, update Lambda to connect to the proxy endpoint instead of the RDS endpoint:

// Lambda — connect to RDS Proxy endpoint
const { Pool } = require('pg');

const pool = new Pool({
  host: process.env.DB_PROXY_ENDPOINT, // e.g., my-rds-proxy.proxy-xxxx.us-east-1.rds.amazonaws.com
  port: 5432,
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  max: 1,      // Lambda should use 1 connection per invocation — Proxy handles pooling
  ssl: { rejectUnauthorized: true },
});

exports.handler = async (event) => {
  const client = await pool.connect();
  try {
    const result = await client.query('SELECT * FROM users LIMIT 10');
    return { statusCode: 200, body: JSON.stringify(result.rows) };
  } finally {
    client.release();
  }
};

Fix 5: Reuse Database Connections Across Lambda Invocations

Lambda reuses execution environments for warm invocations. Initialize the database connection outside the handler to reuse it:

// index.js — connection initialized once, reused across warm invocations
const { Pool } = require('pg');

// Outside the handler — initialized once per execution environment
let pool;

function getPool() {
  if (!pool) {
    pool = new Pool({
      host: process.env.DB_HOST,
      port: 5432,
      database: process.env.DB_NAME,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD,
      max: 1,
      idleTimeoutMillis: 30000,
      connectionTimeoutMillis: 5000, // Fail fast instead of hanging
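      // NOTE: rejectUnauthorized: false disables certificate verification.
      // In production, set it to true and supply the RDS CA bundle via `ca`.
      ssl: { rejectUnauthorized: false },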
    });
  }
  return pool;
}

exports.handler = async (event) => {
  const client = await getPool().connect();
  try {
    const result = await client.query('SELECT NOW()');
    return { statusCode: 200, body: JSON.stringify(result.rows) };
  } finally {
    client.release(); // Release back to pool — not close()
  }
};

The bug I have personally introduced here more than once: calling pool.end() or client.end() inside the handler. It feels tidy, but it tears down the connection after every invocation, so the next warm invocation has to reconnect from scratch and you lose the entire benefit of connection reuse. Call client.release() instead, which hands the connection back to the pool.

Fix 6: Store and Retrieve RDS Credentials Securely

Using hardcoded credentials or environment variables in plaintext is a security risk. Use AWS Secrets Manager:

const { SecretsManagerClient, GetSecretValueCommand } = require('@aws-sdk/client-secrets-manager');
const { Pool } = require('pg');

const secretsClient = new SecretsManagerClient({ region: 'us-east-1' });
let pool;

async function getPool() {
  if (pool) return pool;

  const response = await secretsClient.send(
    new GetSecretValueCommand({ SecretId: process.env.DB_SECRET_ARN })
  );

  const { username, password, host, port, dbname } = JSON.parse(response.SecretString);

  pool = new Pool({
    host, port, database: dbname, user: username, password,
    max: 1,
    connectionTimeoutMillis: 5000, // Fail fast instead of hanging until the Lambda timeout
  });
  return pool;
}

exports.handler = async (event) => {
  const p = await getPool();
  const client = await p.connect();
  try {
    const result = await client.query('SELECT * FROM orders LIMIT 5');
    return { statusCode: 200, body: JSON.stringify(result.rows) };
  } finally {
    client.release();
  }
};
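A malformed secret is easy to mistake for a connection problem, because a missing host or port produces an unhelpful driver error. One way to guard against that is to validate the secret before building the pool. The sketch below assumes the secret uses the same keys as the example above (username, password, host, port, dbname); poolConfigFromSecret is a hypothetical helper, and the returned object is meant to be passed to new Pool(...).

```javascript
const REQUIRED_KEYS = ['username', 'password', 'host', 'port', 'dbname'];

// Hypothetical helper: parse and validate a Secrets Manager SecretString.
function poolConfigFromSecret(secretString) {
  const secret = JSON.parse(secretString);
  const missing = REQUIRED_KEYS.filter((key) => secret[key] === undefined || secret[key] === '');
  if (missing.length > 0) {
    // Name the missing keys, but never echo secret values into logs.
    throw new Error(`Secret is missing required keys: ${missing.join(', ')}`);
  }
  return {
    host: secret.host,
    port: Number(secret.port),      // the port may be stored as a string
    database: secret.dbname,
    user: secret.username,
    password: secret.password,
    max: 1,                         // one connection per execution environment
    connectionTimeoutMillis: 5000,  // fail fast instead of hanging
  };
}
```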

Network Checks I Run When Nothing Connects

Test connectivity from within the VPC. SSH into an EC2 instance in the same VPC and subnet as Lambda, then test the RDS connection:

# Test TCP connectivity (no database client needed)
nc -zv your-rds-endpoint.rds.amazonaws.com 5432

# Or use telnet
telnet your-rds-endpoint.rds.amazonaws.com 5432

# If this times out, the issue is network/security group — not the application
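When there is no EC2 host handy, the same raw TCP check can be run from inside the Lambda function itself, using only Node's built-in net module. This is a minimal sketch: probeTcp and its status labels are hypothetical, and it only tests whether a TCP handshake completes, nothing about the database protocol or credentials.

```javascript
const net = require('node:net');

// Hypothetical helper: resolves with 'open', 'refused', 'timeout', or 'error'.
function probeTcp(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const started = Date.now();
    const socket = net.connect({ host, port });

    const finish = (status, detail) => {
      clearTimeout(timer);
      socket.destroy();
      resolve({ status, detail, elapsedMs: Date.now() - started });
    };

    // No answer at all within the budget: the packet is being dropped somewhere.
    const timer = setTimeout(() => finish('timeout', 'no response'), timeoutMs);

    socket.on('connect', () => finish('open'));
    socket.on('error', (err) =>
      finish(err.code === 'ECONNREFUSED' ? 'refused' : 'error', err.code));
  });
}
```

Calling await probeTcp(process.env.DB_HOST, 5432) at the top of a throwaway handler separates the cases quickly: 'timeout' points at security groups or routing, while 'open' means the network path is fine and the problem is further up the stack.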

Check RDS status. A stopped or failing RDS instance rejects all connections:

aws rds describe-db-instances \
  --db-instance-identifier my-db \
  --query 'DBInstances[0].DBInstanceStatus'
# Should return "available"

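Check the RDS subnet group. Confirm that the DB subnet group’s subnets belong to the same VPC as your Lambda subnets (or to a peered VPC with routes in place), and that no network ACL blocks the database port between them. Matching AZs is not required for connectivity: the VPC’s local route covers every subnet in the VPC.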

Increase the Lambda timeout. If RDS is under heavy load, connections can take several seconds to establish. The Lambda default timeout is 3 seconds; increase it to at least 30 seconds for database workloads:

aws lambda update-function-configuration \
  --function-name my-function \
  --timeout 30

Set a connection timeout in your client. Without a connection timeout, a blocked connection causes Lambda to hang until its function timeout:

// Always set a connection timeout shorter than the Lambda timeout
const pool = new Pool({
  connectionTimeoutMillis: 5000, // Fail after 5 seconds instead of hanging
  // ...
});
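A fixed 5 seconds only helps if the function has more than 5 seconds left. One option is to derive the connect timeout from the time remaining in the invocation, which the Lambda context object reports through getRemainingTimeInMillis(). In this sketch, connectTimeoutFor and its option names are hypothetical, and the cap, reserve, and floor defaults are assumptions to tune.

```javascript
// Hypothetical helper: pick a connect timeout that always fits inside the invocation.
function connectTimeoutFor(context, { capMs = 5000, reserveMs = 1000, floorMs = 250 } = {}) {
  const remaining = context.getRemainingTimeInMillis();
  // Leave reserveMs for logging and error handling, never exceed capMs,
  // and never go below floorMs so the attempt still has a chance.
  return Math.max(floorMs, Math.min(capMs, remaining - reserveMs));
}
```

The result can be passed as connectionTimeoutMillis when the pool is first created inside the handler, so the database error is logged before the runtime kills the invocation.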

For related AWS issues, see Fix: AWS Lambda Timeout, Fix: AWS EC2 SSH Connection Refused, Fix: AWS Unable to Locate Credentials, and Fix: MySQL Too Many Connections.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.