AWS Lambda SnapStart cuts Java cold starts from several seconds to a few hundred milliseconds (about 6 seconds down to roughly 200ms in the benchmark below), and you enable it through configuration rather than a rewrite. It works by snapshotting the initialized execution environment after your init phase completes, then restoring from that snapshot instead of re-initializing from scratch on every cold start. The one caveat is that state captured in the snapshot, such as seeded random generators and open connections, may need small restore hooks. This is one of the biggest Lambda performance improvements AWS has shipped in years, and many Java teams still haven't adopted it.
⚡ TL;DR: Enable SnapStart on your Java 11, 17, or 21 Lambda with one setting in your SAM or CloudFormation template. Lambda snapshots your post-init JVM state, and cold starts restore from that snapshot in a few hundred milliseconds instead of re-running your full init. Watch out for uniqueness issues: seeded random generators, timestamps, and network connections captured in the snapshot need special handling with CRaC runtime hooks (the org.crac Resource interface).
How Lambda SnapStart actually works
Normal Lambda cold start for Java: JVM boots (500ms) → class loading (1–3s) → Spring context / framework init (2–5s) → first request handler runs. Total: roughly 4–8 seconds. SnapStart replaces almost all of that work with a snapshot restore.
# SnapStart lifecycle:
# 1. Lambda runs your init code normally (once, when you publish a function version)
# 2. After init completes, Lambda freezes the JVM and takes a Firecracker microVM snapshot of its memory and disk state
# 3. Snapshot is encrypted and stored in S3
# 4. Cold start = restore snapshot from S3 → thaw JVM → run handler
# 5. Thaw takes 100-200ms vs 4-8s for full init
# Enable in SAM template:
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions # Required: only works on versions, not $LATEST
      AutoPublishAlias: live # SAM automatically publishes a version and creates alias
The uniqueness problem — what breaks after restore
// Problems with naive SnapStart adoption:
// 1. Randomness: generator state and values created during init are frozen into the snapshot,
//    so every restored environment replays them (java.util.Random, UUIDs generated at init)
//    Note: AWS describes java.security.SecureRandom as safe across restores, so prefer it
import java.util.Random;
// BAD: state captured at init → every restored environment produces the same sequence
private static final Random rng = new Random(); // State frozen in snapshot
// 2. Network connections: open sockets captured in snapshot are dead after restore
// DB connection pools, HTTP clients with keepalive, Redis connections all break
// 3. Timestamps: a System.currentTimeMillis() value stored in a field during init is stale
//    after restore; read the clock inside the handler instead
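// FIX: Implement CRaC (Coordinated Restore at Checkpoint) runtime hooks via the org.crac Resource interface:
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import java.security.SecureRandom;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import org.crac.Core;
import org.crac.Resource;
// Event types swapped to the aws-lambda-java-events classes so the example compiles
public class MyHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent>, Resource {
    private static SecureRandom rng;
    private static Connection dbConn;
    public MyHandler() {
        // Register the instance Lambda actually uses, so the hooks stay strongly referenced
        Core.getGlobalContext().register(this);
    }
    @Override
    public void beforeCheckpoint(org.crac.Context<? extends Resource> context) throws Exception {
        // Called BEFORE snapshot is taken
        // Close all connections, save state
        if (dbConn != null) dbConn.close();
        System.out.println("Before checkpoint: closing connections");
    }
    @Override
    public void afterRestore(org.crac.Context<? extends Resource> context) throws Exception {
        // Called AFTER restore from snapshot
        // Re-initialize anything that breaks across checkpoint
        rng = new SecureRandom(); // Fresh generator after restore
        dbConn = createNewConnection(); // Fresh connection after restore
        System.out.println("After restore: re-initialized connections");
    }
    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        // Application logic goes here; rng and dbConn are ready once afterRestore has run
        return new APIGatewayProxyResponseEvent().withStatusCode(200);
    }
    // Hypothetical helper: assumes the JDBC URL is supplied in a DB_URL environment variable
    private static Connection createNewConnection() throws SQLException {
        return DriverManager.getConnection(System.getenv("DB_URL"));
    }
}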
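The repeated-values problem can be reproduced without Lambda at all. The sketch below is only a simulation: it stands in for a snapshot with a fixed seed (the constant `SEED_CAPTURED_AT_INIT` and the class name are hypothetical), then shows that two "restored" generators built from the same captured state hand out the same first value, while an ID generated per invocation does not repeat.

```java
import java.util.Random;
import java.util.UUID;

// Simulates two execution environments restored from one snapshot.
// The "snapshot" here is just a seed captured during init; a real snapshot
// copies the whole JVM heap, which has the same effect on java.util.Random.
class SnapshotUniquenessDemo {

    // Hypothetical stand-in for generator state frozen at init time.
    static final long SEED_CAPTURED_AT_INIT = 42L;

    // Each simulated restore rebuilds the generator from the same captured state.
    static int firstValueAfterRestore() {
        Random restored = new Random(SEED_CAPTURED_AT_INIT);
        return restored.nextInt();
    }

    // Safer pattern: create unique values inside the handler, once per invocation.
    static String requestIdPerInvocation() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        int envA = firstValueAfterRestore();
        int envB = firstValueAfterRestore();
        System.out.println("Same value in both environments: " + (envA == envB));
        System.out.println("Per-invocation IDs differ: "
                + !requestIdPerInvocation().equals(requestIdPerInvocation()));
    }
}
```

The design point is where the value is created, not which class creates it: anything computed during init is shared by every environment restored from that snapshot, so unique values belong in the handler or in an afterRestore hook.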
Real benchmark: SnapStart vs cold start vs Provisioned Concurrency
# Measured on Spring Boot 3 Lambda, 512MB, us-east-1, Java 21
# Without SnapStart:
# Init duration: 5,840ms
# Cold start p50: 6,200ms
# Cold start p99: 8,900ms
# Monthly cost (100 cold starts/day): $0.00 extra (init is free)
# With SnapStart:
# Init duration: 5,840ms (same — runs once at deploy)
# Restore duration: 180ms
# Cold start p50: 210ms
# Cold start p99: 380ms
# Improvement: 96% reduction in cold start latency
# With Provisioned Concurrency (no cold starts at all):
# Cold start: 0ms (always warm)
# Extra cost: ~$14/month per 1 PC unit (1 always-warm instance)
# Verdict:
# SnapStart removes ~96% of cold start latency at $0 extra cost
# Use Provisioned Concurrency only when even a ~200ms restore is too slow for your latency target
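The 96% figure can be checked directly from the p50 and p99 numbers reported in the benchmark. This is a small arithmetic sketch; the class and method names are made up for illustration, and the inputs are the article's own measurements.

```java
import java.util.Locale;

// Recomputes the cold start latency reduction from the benchmark figures.
class ColdStartReduction {

    // Percentage of latency removed when going from beforeMs to afterMs.
    static double reductionPercent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    public static void main(String[] args) {
        // Without SnapStart vs with SnapStart, Spring Boot 3 at 512MB
        System.out.println(String.format(Locale.ROOT, "p50 reduction: %.1f%%", reductionPercent(6200, 210)));
        System.out.println(String.format(Locale.ROOT, "p99 reduction: %.1f%%", reductionPercent(8900, 380)));
    }
}
```

This works out to about 96.6% at p50 and 95.7% at p99, so the headline number holds at the median and is slightly lower in the tail.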
SnapStart with Quarkus and Micronaut (better than Spring)
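// Quarkus with SnapStart: even better results because Quarkus moves more framework
// work to build time (JVM mode only; native images run on a custom runtime, which SnapStart does not support)
// pom.xml dependency for the Quarkus Lambda extension, which includes SnapStart support: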
// <dependency>
// <groupId>io.quarkus</groupId>
// <artifactId>quarkus-amazon-lambda</artifactId>
// </dependency>
// Quarkus + SnapStart benchmark (512MB, Java 21):
// Init: 1,200ms (vs 5,800ms Spring, roughly 5x less init work)
// Restore: 140ms
// p99 cold start: 180ms
// Micronaut benchmark (512MB, Java 21):
// Init: 800ms
// Restore: 110ms
// p99 cold start: 150ms
// If you're starting a new Lambda: Micronaut > Quarkus > Spring for SnapStart performance
// If you're already on Spring: SnapStart still saves 96% — don't rewrite just for this
SnapStart limitations you must know
- ✅ Supported runtimes: Java 11, Java 17, Java 21. AWS has since added Python 3.12+ and .NET 8+; Node.js and custom runtimes are not supported
- ✅ Only works on published versions and aliases, not on $LATEST
- ✅ Free for Java: no additional cost beyond normal Lambda pricing
- ⚠️ Snapshot stored encrypted in S3, which adds ~200ms to first restore after deployment
- ⚠️ Must handle uniqueness issues with CRaC hooks (random, connections, timestamps)
- ⚠️ Not available in all regions, so check AWS docs for current availability
- ❌ Does NOT work with Lambda@Edge or functions using $LATEST
- ❌ Concurrent restores each get their own snapshot copy; memory is not shared
Production CloudFormation configuration
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: my-java-api
      Handler: com.example.Handler::handleRequest
      Runtime: java21
      MemorySize: 1024 # More memory = faster restore (more CPU allocated)
      Timeout: 30
      SnapStart:
        ApplyOn: PublishedVersions
      AutoPublishAlias: live
      Environment:
        Variables:
          JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"
          # TieredStopAtLevel=1 limits JIT to the fast C1 compiler and skips C2 optimization
          # Less compilation work during init, so the snapshot is ready sooner
          # Trade-off: faster init and first requests, lower peak throughput once the environment is warm
  # Point API Gateway at the alias, not $LATEST
  ApiGateway:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
SnapStart is the most impactful free optimization available for Java Lambdas. Combine it with the general Lambda cold start guide for a complete optimization strategy. For monitoring your SnapStart restore times, the CloudWatch Insights queries include a restore duration query that distinguishes @initDuration from restore events. Official reference: AWS Lambda SnapStart documentation.
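If you export function logs and want to pull restore times out yourself, a regex over the REPORT lines is usually enough. The sketch below assumes the REPORT line carries a `Restore Duration: <n> ms` field next to a `Billed Restore Duration` field; the sample line is illustrative rather than captured from a real function, and the class name is hypothetical.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the restore time from a Lambda REPORT log line, if present.
class RestoreDurationParser {

    // Matches "Restore Duration: <n> ms" but not "Billed Restore Duration: <n> ms".
    private static final Pattern RESTORE =
            Pattern.compile("(?<!Billed )Restore Duration: ([0-9]+(?:\\.[0-9]+)?) ms");

    static OptionalDouble restoreMillis(String reportLine) {
        Matcher m = RESTORE.matcher(reportLine);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Illustrative line only; field order and values are assumptions.
        String sample = "REPORT RequestId: example-id\tDuration: 12.30 ms\tBilled Duration: 13 ms\t"
                + "Memory Size: 512 MB\tMax Memory Used: 160 MB\t"
                + "Restore Duration: 180.00 ms\tBilled Restore Duration: 45 ms";
        System.out.println(restoreMillis(sample));
    }
}
```

Lines from warm invocations have no restore field, so the method returns an empty value for them and they can be filtered out before computing percentiles.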
