AWS Lambda Cost Monitoring: Understanding and Optimizing Serverless Spend

Lambda's pricing model is elegantly simple: $0.20 per million requests plus $0.0000166667 per GB-second of execution time (the x86 rate; Arm-based functions are billed at a lower rate). For most workloads, this is genuinely cheap: Lambda is cost-effective for spiky or low-volume workloads that would otherwise sit on mostly idle EC2 capacity. But production Lambda architectures with high invocation rates, long-running functions, and high memory allocations can generate significant bills that aren't obvious from the pricing model alone.
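To make the pricing concrete, here is a minimal sketch of the arithmetic using the two rates quoted above. The workload figures (100 million invocations, 800 ms average duration, 1024 MB) are hypothetical, and the free tier is ignored.

```python
REQUEST_PRICE_PER_MILLION = 0.20   # USD, rate quoted above
GB_SECOND_PRICE = 0.0000166667     # USD per GB-second, rate quoted above


def monthly_lambda_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> dict:
    """Estimate monthly request and duration charges, ignoring the free tier."""
    request_cost = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    # GB-seconds = invocations x seconds per invocation x memory in GB
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    duration_cost = gb_seconds * GB_SECOND_PRICE
    return {
        "request_cost": round(request_cost, 2),
        "duration_cost": round(duration_cost, 2),
        "total": round(request_cost + duration_cost, 2),
    }


# Hypothetical workload: 100M invocations/month, 800 ms average, 1024 MB.
estimate = monthly_lambda_cost(100_000_000, 800, 1024)
print(estimate)
```

In this example the request charge is about $20 while the duration charge is over $1,300, which is why duration is usually the first place to look.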

Lambda cost optimization differs from EC2 optimization because the levers are different. You can't right-size the instance type — you control memory allocation (which determines CPU allocation) and execution duration. You optimize by allocating appropriate memory, eliminating unnecessary invocations, and ensuring functions complete efficiently.

Understanding Your Lambda Costs

Start with Cost Explorer filtered to the Lambda service. Break down by usage type to see the split between request charges and duration charges. Most Lambda workloads are dominated by duration charges rather than request charges — understanding this breakdown guides optimization effort.
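The same breakdown can be pulled programmatically. The sketch below uses the Cost Explorer `get_cost_and_usage` API through boto3, filtered to the "AWS Lambda" service and grouped by usage type. It assumes boto3 is installed and credentials with Cost Explorer access are configured; the function names are illustrative, and pagination is left out for brevity.

```python
from datetime import date


def lambda_usage_type_query(start: date, end: date) -> dict:
    """Build get_cost_and_usage parameters: Lambda spend grouped by usage type."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["AWS Lambda"]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    }


def fetch_lambda_breakdown(start: date, end: date) -> dict:
    """Return {usage_type: cost_usd}. Needs boto3 and Cost Explorer permissions."""
    import boto3  # imported lazily so the query builder works without boto3

    ce = boto3.client("ce")
    response = ce.get_cost_and_usage(**lambda_usage_type_query(start, end))
    totals: dict = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            usage_type = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[usage_type] = totals.get(usage_type, 0.0) + amount
    return totals
```

Usage types containing "Request" are request charges, and those containing "GB-Second" are duration charges; the exact prefixes vary by region.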

For functions with high duration costs, CloudWatch metrics show average duration per function. High-duration functions are either doing necessary work slowly (optimization opportunity), doing unnecessary work (architectural opportunity), or waiting on downstream services (latency issue, not Lambda-specific). Distinguish between these before optimizing.

AWS Cost and Usage Reports (CUR) provide function-level cost attribution. When resource IDs are enabled for the report, CUR includes the Lambda function ARN in the resource field for relevant line items, allowing per-function cost breakdown with Athena queries. For large Lambda deployments, this function-level visibility is essential: aggregate Lambda costs hide which specific functions are expensive.
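As a rough sketch of such a query, the helper below builds Athena SQL that ranks functions by unblended cost. It assumes the standard CUR Athena integration (columns such as `line_item_resource_id` and string partitions `year` and `month`); the table name `cur_db.lambda_cur` is hypothetical, so substitute your own.

```python
import re


def per_function_cost_sql(table: str, year: int, month: int, limit: int = 20) -> str:
    """Build an Athena query ranking Lambda functions by unblended cost."""
    # Accept only plain "table" or "database.table" identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?", table):
        raise ValueError(f"unexpected table name: {table!r}")
    return f"""
SELECT line_item_resource_id AS function_arn,
       SUM(line_item_unblended_cost) AS cost_usd
FROM {table}
WHERE line_item_product_code = 'AWSLambda'
  AND line_item_resource_id <> ''
  AND year = '{int(year)}' AND month = '{int(month)}'
GROUP BY line_item_resource_id
ORDER BY cost_usd DESC
LIMIT {int(limit)}
""".strip()


# Hypothetical table name; use the one created for your CUR export.
print(per_function_cost_sql("cur_db.lambda_cur", 2024, 1))
```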

Memory Allocation and the CPU Relationship

Lambda memory allocation is unique in that it determines CPU allocation too. More memory means proportionally more CPU. A function with 256 MB gets half the CPU of the same function with 512 MB. This means allocating more memory can actually reduce cost if the additional CPU allows the function to finish faster — reducing duration enough to offset the higher per-second cost.
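A small worked example shows how this can play out. The durations below are made-up measurements for a CPU-bound function, not benchmarks; the point is only that cost depends on memory multiplied by duration.

```python
GB_SECOND_PRICE = 0.0000166667  # USD per GB-second, rate quoted earlier


def duration_cost_per_million(memory_mb: int, duration_ms: float) -> float:
    """Duration charge in USD for one million invocations."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * GB_SECOND_PRICE * 1_000_000


# Hypothetical measured average durations (ms) at each memory size (MB).
measurements = {256: 1200, 512: 520, 1024: 300}

costs = {mem: round(duration_cost_per_million(mem, ms), 2)
         for mem, ms in measurements.items()}
cheapest = min(costs, key=costs.get)
print(costs, "cheapest:", cheapest)
```

Here 512 MB is cheapest because doubling memory from 256 MB more than halved the duration, while the step to 1024 MB did not.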

AWS Lambda Power Tuning is an open-source tool (available in the AWS Serverless Application Repository) that tests your function across memory sizes and plots the cost and performance tradeoff. Run it against your production function code with representative input and it outputs the optimal memory size for your performance and cost requirements. Functions that are currently too small often benefit from more memory; functions that are too large often can save cost by downsizing.
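For orientation, the tool runs as a Step Functions state machine that takes a JSON input. The sketch below builds an input of the shape described in the project's README; the ARN and payload are placeholders, and the field names should be checked against the version you deploy.

```python
import json

# Execution input for the Power Tuning state machine (values are placeholders).
power_tuning_input = {
    "lambdaARN": "arn:aws:lambda:us-east-1:123456789012:function:example-function",
    "powerValues": [128, 256, 512, 1024, 2048],  # memory sizes to test, in MB
    "num": 50,                                   # invocations per memory size
    "payload": {"orderId": "example"},           # representative test event
    "parallelInvocation": True,
    "strategy": "cost",                          # optimize for cost
}

print(json.dumps(power_tuning_input, indent=2))
```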

Identifying Expensive Invocation Patterns

Some invocation patterns generate disproportionate Lambda costs:

Polling-based triggers with small batches: for SQS event sources, Lambda long-polls the queue on your behalf, and empty receives are billed as SQS requests rather than Lambda invocations. What drives Lambda cost is invoking the function for only one or a few messages at a time. Configure the SQS event source mapping with a larger batch size and an appropriate batch window, so that Lambda accumulates messages and processes batches instead of individual messages, reducing request costs and often reducing total execution time per record processed.

Synchronous invocation chains: Lambda functions that invoke other Lambda functions synchronously can create deep invocation chains where each layer waits for the next. Every function in the chain keeps running, and keeps being billed, for the full duration of everything downstream of it. Redesign deep synchronous chains to use asynchronous invocations or event-driven patterns where functions don't wait for downstream results.

High-frequency scheduled functions: a Lambda function triggered every minute by EventBridge Scheduler or a similar mechanism runs about 43,200 times a month whether or not there is work to do. Each execution is cheap, but the charges add up across many functions and environments. Evaluate whether high-frequency polling can be replaced by event-driven triggers that only fire when there's actual work to do.
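The invocation counts behind the batching and scheduling patterns above are easy to estimate. This is a back-of-the-envelope sketch: the message volume is hypothetical, and the batching figure is a best case that assumes every batch is full.

```python
import math

REQUEST_PRICE_PER_MILLION = 0.20  # USD, rate quoted earlier


def scheduled_invocations_per_month(interval_seconds: int, days: int = 30) -> int:
    """Invocations generated by a fixed-rate schedule over one month."""
    return days * 24 * 3600 // interval_seconds


def invocations_for_messages(messages: int, batch_size: int) -> int:
    """Best-case invocation count when every batch is full."""
    return math.ceil(messages / batch_size)


def request_cost(invocations: int) -> float:
    """Request charge in USD, ignoring the free tier."""
    return invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION


print(scheduled_invocations_per_month(60))  # one function on a 1-minute schedule

# Hypothetical queue: 50M messages/month, batch size 1 versus 10.
for batch_size in (1, 10):
    n = invocations_for_messages(50_000_000, batch_size)
    print(batch_size, n, round(request_cost(n), 2))
```

Request charges are only part of the saving: fewer invocations also means less per-invocation overhead counted toward duration.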

Provisioned Concurrency Cost Management

Provisioned concurrency eliminates cold starts but carries an ongoing cost for every hour it is configured, regardless of function invocation rate. Review your provisioned concurrency configuration quarterly. If provisioned concurrency utilization (the ProvisionedConcurrencyUtilization metric in CloudWatch) is consistently below 50%, you've over-provisioned, so scale down the provisioned count. Use Application Auto Scaling for provisioned concurrency to scale based on actual invocation patterns rather than maintaining fixed capacity.
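To put numbers on this, the sketch below estimates the fixed monthly charge for a provisioned concurrency setting and applies the 50% rule to a list of utilization samples. The per-GB-second rate is an assumption based on published x86 pricing in us-east-1, so check the current price for your region; the utilization samples are hypothetical values on a 0 to 1 scale.

```python
# Assumed rate: USD per GB-second of configured provisioned concurrency.
PROVISIONED_PRICE_PER_GB_SECOND = 0.0000041667


def provisioned_monthly_cost(instances: int, memory_mb: int, hours: float = 730) -> float:
    """Fixed charge for keeping `instances` environments provisioned all month."""
    gb_seconds = instances * (memory_mb / 1024) * hours * 3600
    return gb_seconds * PROVISIONED_PRICE_PER_GB_SECOND


def is_over_provisioned(utilization_samples: list, threshold: float = 0.5) -> bool:
    """True when every sample is below the threshold (consistently under 50%)."""
    return bool(utilization_samples) and max(utilization_samples) < threshold


# Hypothetical: 10 provisioned environments at 1024 MB.
print(round(provisioned_monthly_cost(10, 1024), 2))
print(is_over_provisioned([0.20, 0.35, 0.40]))
```

Using the maximum sample is a deliberately conservative reading of "consistently below"; a percentile would be a reasonable alternative.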

Lambda vs. Fargate vs. EC2 Cost Comparison

Lambda is cost-effective for workloads with significant idle time and highly variable traffic. For continuously running, high-throughput workloads, EC2 or Fargate may be cheaper. A Lambda function that is busy continuously (near 100% utilization) costs more per unit of compute than an equivalent EC2 instance running the same workload.

The break-even point depends on the specific workload, but roughly: workloads that keep compute busy more than 40-50% of the time over an extended period often cost less on EC2 with Reserved Instances. Workloads with significant idle periods (nights, weekends, low-traffic hours) benefit from Lambda's pay-per-invocation model. Use the AWS Pricing Calculator (or a spreadsheet with your actual metrics) to evaluate migration decisions quantitatively.
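A spreadsheet-style version of that comparison fits in a few lines. This sketch compares Lambda duration charges against a fixed monthly EC2 cost and ignores request charges, data transfer, and operational overhead. The $40/month EC2 figure is invented for illustration and assumes the instance can handle the same load as one 2048 MB execution environment.

```python
GB_SECOND_PRICE = 0.0000166667   # USD per GB-second, rate quoted earlier
SECONDS_PER_MONTH = 730 * 3600   # average month


def lambda_cost_at_utilization(utilization: float, memory_mb: int,
                               concurrency: int = 1) -> float:
    """Duration charge for keeping `concurrency` executions busy for a
    given fraction of the month."""
    gb_seconds = utilization * SECONDS_PER_MONTH * (memory_mb / 1024) * concurrency
    return gb_seconds * GB_SECOND_PRICE


def break_even_utilization(ec2_monthly_cost: float, memory_mb: int,
                           concurrency: int = 1) -> float:
    """Utilization above which the fixed EC2 cost is lower than Lambda."""
    full_month = lambda_cost_at_utilization(1.0, memory_mb, concurrency)
    return min(1.0, ec2_monthly_cost / full_month)


# Hypothetical: $40/month instance versus a 2048 MB function.
break_even = break_even_utilization(40.0, 2048)
print(f"{break_even:.0%}")
```

With these invented inputs the break-even lands in the mid-40% range; your own prices and memory sizes will move it.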

FAQ

Does Lambda free tier apply to production accounts?

The Lambda always-free tier — 1 million free requests and 400,000 GB-seconds per month — applies to all accounts permanently, not just during the 12-month new account period. For small Lambda workloads, you may never pay for Lambda at all. Monitor your monthly usage in Cost Explorer and compare against the free tier limits to understand when you start paying.
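Checking usage against those limits is simple arithmetic. The sketch below subtracts the free tier from a month's usage and prices the remainder at the rates quoted earlier; the example workload is hypothetical.

```python
FREE_REQUESTS = 1_000_000         # free requests per month
FREE_GB_SECONDS = 400_000         # free GB-seconds per month
REQUEST_PRICE_PER_MILLION = 0.20  # USD
GB_SECOND_PRICE = 0.0000166667    # USD per GB-second


def billable_after_free_tier(requests: int, gb_seconds: float) -> tuple:
    """Return (billable_requests, billable_gb_seconds, cost_usd)."""
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    cost = (billable_requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
            + billable_gb_seconds * GB_SECOND_PRICE)
    return billable_requests, billable_gb_seconds, round(cost, 2)


# Hypothetical: 3M requests at 128 MB and 200 ms each = 75,000 GB-seconds.
gb_seconds_used = 3_000_000 * 0.2 * (128 / 1024)
print(billable_after_free_tier(3_000_000, gb_seconds_used))
```

In this example duration stays inside the free tier and only 2 million requests are billable, for a bill of about $0.40.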

How does Lambda pricing work for container image functions?

Container image Lambda functions are priced identically to ZIP deployment functions: per request and per GB-second. There's no additional Lambda charge for using container images instead of ZIP deployments. Storing the image in ECR incurs standard ECR storage fees ($0.10/GB/month), which is usually a minor addition to the bill.

Can I use Savings Plans for Lambda?

Yes. Compute Savings Plans cover Lambda duration costs, but not request charges. For Lambda-heavy workloads with predictable sustained usage, a Compute Savings Plan can reduce duration costs by up to 17% on 1-year plans. The savings are applied automatically to eligible duration usage within your Compute Savings Plan commitment.
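As a quick illustration of the duration-only coverage, the sketch below applies the maximum 17% discount to hypothetical monthly charges. It is a simplification that assumes the commitment covers all duration usage.

```python
def cost_with_savings_plan(request_cost: float, duration_cost: float,
                           discount: float = 0.17) -> float:
    """Apply a Savings Plan discount to duration charges only."""
    return round(request_cost + duration_cost * (1 - discount), 2)


# Hypothetical month: $20.00 in requests, $1,333.34 in duration.
discounted = cost_with_savings_plan(20.00, 1333.34)
print(discounted)
```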

Written by Viktor B.

Co-founder & CEO

Co-founder & CEO of Vigilare. Works on turning the AWS signals that predict account enforcement — billing anomalies, IAM drift, GuardDuty findings, SES reputation, and CloudTrail activity — into a single risk score teams can act on before AWS does.