Lambda Function URLs with Streaming: Replace API Gateway for 90% of Use Cases

Lambda Function URLs are direct HTTPS endpoints for your Lambda functions that cost nothing extra — and most teams are still paying $3.50/million for API Gateway instead. Added in 2022 and upgraded with response streaming in 2023, Function URLs now handle authentication, CORS, and streaming responses natively. For 90% of serverless API use cases, they completely replace API Gateway.

TL;DR: Lambda Function URLs provide a dedicated HTTPS endpoint per function at no additional cost. Response streaming lets you start sending data before your handler returns — critical for LLM/AI APIs, large file streaming, and real-time data. Use API Gateway only when you need request validation, usage plans, or custom domain routing across multiple Lambdas.

Function URL vs API Gateway — cost comparison

# API Gateway REST API pricing:
# $3.50 per million requests
# $0.09/GB data transfer out
# + Cache costs if enabled

# API Gateway HTTP API pricing (cheaper):
# $1.00 per million requests
# + $0.09/GB data transfer

# Lambda Function URL pricing:
# $0.00 — completely free
# Only pay for Lambda execution time

# At 10M requests/month:
# API Gateway REST: $35/month
# API Gateway HTTP: $10/month
# Function URL:     $0/month (just Lambda costs)

# When Function URL makes sense:
# - Single-purpose functions (one function = one endpoint)
# - Internal microservices
# - Webhooks
# - Background job triggers
# - Streaming API responses (API Gateway doesn't support streaming)
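The break-even arithmetic above is easy to check. A quick sketch, using only the per-request list prices (data transfer and Lambda execution are excluded since both options pay those equally):

```javascript
// Monthly request-routing cost at the list prices above.
// Data transfer and Lambda execution excluded: both options pay those.
const PRICE_PER_MILLION = { rest: 3.5, http: 1.0, functionUrl: 0.0 };

function monthlyCost(requestsPerMonth, tier) {
  return (requestsPerMonth / 1_000_000) * PRICE_PER_MILLION[tier];
}

console.log(monthlyCost(10_000_000, 'rest'));        // 35
console.log(monthlyCost(10_000_000, 'http'));        // 10
console.log(monthlyCost(10_000_000, 'functionUrl')); // 0
```

At low volume the difference is noise; at tens of millions of requests it pays for real infrastructure elsewhere.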

Setting up a Function URL with streaming

// Node.js 20 Lambda with response streaming
// package.json: { "type": "module" }

// awslambda.streamifyResponse is the key — wraps handler to enable streaming
// awslambda is a global provided by the managed Node.js runtime; no import is needed
export const handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    // Set response metadata BEFORE streaming body
    const httpResponseMetadata = {
      statusCode: 200,
      headers: {
        'Content-Type': 'text/plain',
        'X-Custom-Header': 'value',
      },
    };

    // Pipe metadata then body
    responseStream = awslambda.HttpResponseStream.from(
      responseStream,
      httpResponseMetadata
    );

    // Stream chunks as they're ready — client receives each chunk immediately
    const chunks = ['Hello ', 'world ', 'from ', 'Lambda ', 'streaming!'];
    for (const chunk of chunks) {
      responseStream.write(chunk);
      await new Promise(r => setTimeout(r, 100)); // Simulate delay between chunks
    }

    responseStream.end();
  }
);

// SAM template:
Resources:
  StreamingFn:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: nodejs20.x
      Architectures: [arm64]
      FunctionUrlConfig:
        AuthType: NONE        # Or AWS_IAM for private endpoints
        InvokeMode: RESPONSE_STREAM  # BUFFERED (default) or RESPONSE_STREAM
        Cors:
          AllowOrigins: ['https://yourdomain.com']
          AllowMethods: [GET, POST]

Real-world use case: streaming LLM responses

// Stream Claude/GPT responses directly from Lambda to browser
// No buffering — first tokens appear in <500ms

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

export const handler = awslambda.streamifyResponse(
  async (event, responseStream, _context) => {
    const body = JSON.parse(event.body || '{}');
    const userMessage = body.message;

    responseStream = awslambda.HttpResponseStream.from(responseStream, {
      statusCode: 200,
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
      },
    });

    // Stream from Anthropic API directly to client
    const stream = client.messages.stream({
      model: 'claude-opus-4-5',
      max_tokens: 1024,
      messages: [{ role: 'user', content: userMessage }],
    });

    for await (const chunk of stream) {
      if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
        // Send SSE event to browser
        responseStream.write('data: ' + JSON.stringify({ text: chunk.delta.text }) + '\n\n');
      }
    }

    responseStream.write('data: [DONE]\n\n');
    responseStream.end();
  }
);
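The `data:` frames written by the handler above are plain Server-Sent Events text: a `data:` line terminated by a blank line. A tiny framing helper keeps that invariant in one place (`sseEvent` is an illustrative name, not part of any SDK):

```javascript
// Each SSE event is "data: <payload>" followed by a blank line;
// the blank line is what terminates the event on the client side.
// sseEvent is an illustrative helper, not part of any SDK.
const sseEvent = (payload) =>
  'data: ' + (typeof payload === 'string' ? payload : JSON.stringify(payload)) + '\n\n';

console.log(JSON.stringify(sseEvent({ text: 'Hello' }))); // "data: {\"text\":\"Hello\"}\n\n"
console.log(JSON.stringify(sseEvent('[DONE]')));          // "data: [DONE]\n\n"
```

In the handler this collapses the two write sites to `responseStream.write(sseEvent({ text: chunk.delta.text }))` and `responseStream.write(sseEvent('[DONE]'))`, so a missed trailing blank line can never silently merge two events.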

// Client-side (browser):
const response = await fetch(FUNCTION_URL, {
  method: 'POST',
  body: JSON.stringify({ message: 'Explain async/await' }),
});

// Read chunks as they arrive — first words appear in <500ms
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true })); // Each SSE frame as it lands
}

Authentication — AWS_IAM for private endpoints

// Private Function URL with IAM auth — only signed requests allowed
// Use for internal service-to-service calls

# SAM:
FunctionUrlConfig:
  AuthType: AWS_IAM            # Requires SigV4-signed requests
  InvokeMode: RESPONSE_STREAM

// Calling from another Lambda or service:
import { fromNodeProviderChain } from '@aws-sdk/credential-providers';

// Use fetch with SigV4 signing for Function URL IAM auth:
import { SignatureV4 } from '@smithy/signature-v4';
import { Sha256 } from '@aws-crypto/sha256-js';

const signer = new SignatureV4({
  credentials: fromNodeProviderChain(),
  region: 'us-east-1',
  service: 'lambda',
  sha256: Sha256,
});

const url = new URL(FUNCTION_URL);
const signed = await signer.sign({
  method: 'POST',
  hostname: url.hostname,
  path: url.pathname,
  headers: { 'content-type': 'application/json', host: url.hostname },
  body: JSON.stringify({ message: 'hello' }),
});

const response = await fetch(FUNCTION_URL, {
  method: signed.method,
  headers: signed.headers,
  body: JSON.stringify({ message: 'hello' }),
});

When to still use API Gateway

  • ✅ Use Function URL when: one function = one endpoint, webhook receivers, streaming responses, internal services
  • ✅ Use Function URL when: you want zero infrastructure overhead and lowest cost
  • ⚠️ Use API Gateway when: routing multiple functions under one domain (/users, /orders)
  • ⚠️ Use API Gateway when: you need request validation, throttling per-route, usage plans
  • ⚠️ Use API Gateway when: you need WAF integration or custom authorizers shared across routes
  • ℹ️ If you do need API Gateway: HTTP API + Lambda is still cheaper than REST API for non-streaming cases

Function URL streaming is the most natural fit for LLM-powered APIs — combine with Lambda cold start optimization so the first streamed token arrives fast. For monitoring streaming function performance, the CloudWatch Insights guide covers duration percentile queries that reflect streaming latency accurately. Official reference: Lambda response streaming documentation.


