What You Need to Know Before Implementing CloudFront

Lessons Learned Hands On

Peter Njihia
6 min read · Nov 23, 2019

What is CloudFront?

CloudFront is a Content Delivery Network (CDN) offering from Amazon. It was launched over 10 years ago so it’s fully mature and global.

Here are some key stats as of October 2019:

  1. Over 185 edge locations (why are most locations airport codes???)
  2. 11 regional edge caches
  3. Presence in 77 cities across 37 countries in every continent

Why/Who should read this?

Anyone involved or interested in running workloads in the cloud would benefit from reading this: product managers who want low latency for their applications, DevOps teams that want to learn how best to run CDN workloads, and software engineers who want to architect their software to take advantage of CDNs.

CloudFront Pricing

Volume largely dictates what you pay when using CloudFront. Here’s a breakdown:

  • Regional Data Transfer Out to Internet (from edge locations to the viewers/end users): In the US and Europe, it starts out at $0.085/GB and can go as low as $0.02/GB as volume grows. So if you are pushing a TB a month, expect to pay $85/month.
  • Regional Data Transfer Out to Origin (From edge locations to your origin): A standard $0.02/GB. No charges for data from AWS origins (AWS EC2, Load Balancers) to CloudFront edge locations.
  • Number of requests: For every 10,000 requests, expect to pay $0.010 for HTTPS and $0.0075 for HTTP. For HTTPS, think of it as $1 per 1 million requests.
  • Invalidations (clearing cache at edge locations, for instance, during content updates): The first 1,000 invalidation paths each month are free, counted at the ACCOUNT level, not per distribution. It's $0.005 per path after that.
  • Price Classes: As with everything else in AWS, pricing varies from region to region. With CloudFront, it's likely that some of your users are in a region with high data transfer prices. You can exclude those edge locations, at the expense of increased latency, to lower your transfer costs.

For more information on pricing: CloudFront Pricing
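
To make the arithmetic above concrete, here is a rough sketch of a monthly estimate. It assumes the flat first-tier rates quoted in this post (US/Europe, late 2019) and ignores volume discounts and price classes; the function name and parameters are made up for illustration, and the rates will have drifted since.

```python
# Rates quoted in this post (US/Europe, first pricing tier, late 2019).
# Treat them as assumptions and check the pricing page for current values.
TRANSFER_OUT_PER_GB = 0.085         # edge locations -> viewers
TRANSFER_TO_ORIGIN_PER_GB = 0.02    # edge locations -> origin
HTTPS_PER_10K_REQUESTS = 0.010
HTTP_PER_10K_REQUESTS = 0.0075
FREE_INVALIDATION_PATHS = 1000      # per month, per account
PRICE_PER_INVALIDATION_PATH = 0.005


def estimate_monthly_cost(gb_out, https_requests=0, http_requests=0,
                          gb_to_origin=0.0, invalidation_paths=0):
    """Rough monthly estimate in USD using flat first-tier rates."""
    transfer = (gb_out * TRANSFER_OUT_PER_GB
                + gb_to_origin * TRANSFER_TO_ORIGIN_PER_GB)
    requests = ((https_requests / 10_000) * HTTPS_PER_10K_REQUESTS
                + (http_requests / 10_000) * HTTP_PER_10K_REQUESTS)
    # Only the paths beyond the free monthly allowance are billed.
    billable_paths = max(0, invalidation_paths - FREE_INVALIDATION_PATHS)
    invalidations = billable_paths * PRICE_PER_INVALIDATION_PATH
    return round(transfer + requests + invalidations, 2)


if __name__ == "__main__":
    # 1 TB (taken as 1,000 GB) out, 5 million HTTPS requests,
    # and 1,200 invalidation paths in the month.
    print(estimate_monthly_cost(1000, https_requests=5_000_000,
                                invalidation_paths=1200))
```

In this example, data transfer dominates the bill, so that is usually the first number to watch.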

Benefits of CloudFront

If you've read this far, I hope the benefits of CloudFront have started to emerge :). Here's a wrap-up:

  1. Better user experience as a result of reduced response times. Content is served from a location close to the user, regardless of origin. For non-cacheable items, it’s still possible to see improved response times since traffic from edge locations to the origin (AWS Regions) is via a high speed backbone, and not hops across the internet.
  2. Reduced server load: With more resources served at edge locations, web servers at the origin serve less, shorter queues, less scaling/instances → save ca$h, and more room to run non-cacheable workloads.
  3. Well integrated with other services: Cache S3 objects in addition to web servers/ALBs, compute at edge locations with Lambda@Edge, define WAF rules in front of the CDN. You could even create streaming distributions.
  4. Highly programmable, a plus to your DevOps practice: helps in transitions, consistency, visibility and more.

Implementing CloudFront

CloudFront is extremely easy to get started with. But before we walk through the process, let's understand the key components:

  • Distributions: This is the core resource that defines your CDN. It can either be Web-based or Real Time Messaging Protocol (RTMP) based for streaming media. For the purpose of this blog, we’ll focus on Web.
  • Origins: This defines the source of the objects you intend to cache. It can be S3, ALB, ELB, or EC2; in essence, it can be any web-hosted resource reachable from the internet. You can have multiple origins per distribution, as long as you define path-based behaviors that say which origin serves what traffic.
  • Behaviors: This tells CloudFront what to do when a request is received: Do we cache it? For how long? If multiple origins, which origin should serve this request? Which HTTP methods and protocols are allowed? You must provide a default behavior with an option to add additional ones, based on path.
  • Aliases: This is a list of all domains that resolve to the CDN. This is in addition to creating CNAME records in Route 53 that map to the CDN domain ({id}.cloudfront.net). Be sure to have both steps in place before testing.
  • Viewer Certificates: SSL certs to use between end-user/viewer and CloudFront.
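
To illustrate how path-based behaviors route requests to origins, here is a minimal Python sketch. It assumes the matching rules as I understand them: behaviors are checked in the order listed, `*` matches any run of characters, `?` matches exactly one, and the default behavior catches everything else. The `AssetsBucket` origin and the patterns are hypothetical; this models the idea and is not CloudFront's actual implementation.

```python
import re


def pattern_to_regex(path_pattern):
    """Translate a path pattern: '*' is any run of characters, '?' exactly one."""
    parts = []
    for ch in path_pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def pick_origin(request_path, behaviors, default_origin):
    """Return the origin id of the first behavior whose pattern matches."""
    for path_pattern, origin_id in behaviors:  # order matters: first match wins
        if pattern_to_regex(path_pattern).fullmatch(request_path):
            return origin_id
    return default_origin  # the mandatory default behavior


# Hypothetical behaviors: static assets from S3, everything else from the ALB.
BEHAVIORS = [
    ("/static/*", "AssetsBucket"),
    ("/images/*.jpg", "AssetsBucket"),
]

if __name__ == "__main__":
    for path in ("/static/app.css", "/images/logo.jpg", "/api/orders"):
        print(path, "->", pick_origin(path, BEHAVIORS, default_origin="WebTierALB"))
```

Because the first match wins, list the more specific patterns ahead of the broader ones.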

While you can start provisioning in the console, the best route is to automate via CloudFormation. It only takes one resource, AWS::CloudFront::Distribution, so it's also very straightforward. Here's a sample template:

AWSTemplateFormatVersion: "2010-09-09"
Description: Creates CloudFront distribution

Resources:
  MyCloudFrontDistribution:
    Type: 'AWS::CloudFront::Distribution'
    Properties:
      DistributionConfig:
        # Aliases is a list; append every new domain here
        Aliases:
          - cdn.my-sample-app.com
        Comment: MyCDN
        IPV6Enabled: true
        Enabled: true
        Origins:
          - OriginPath: ''
            CustomOriginConfig:
              OriginSSLProtocols:
                - TLSv1.2
              OriginProtocolPolicy: https-only
            Id: WebTierALB
            DomainName: web.my-sample-app.com
        ViewerCertificate:
          SslSupportMethod: sni-only
          # The certificate must be provisioned in us-east-1
          AcmCertificateArn: arn:aws:acm:region:account-id:certificate/cert-id
          MinimumProtocolVersion: TLSv1.2_2018
        DefaultCacheBehavior:
          AllowedMethods:
            - HEAD
            - DELETE
            - POST
            - GET
            - OPTIONS
            - PUT
            - PATCH
          CachedMethods:
            - HEAD
            - GET
          Compress: true
          TargetOriginId: WebTierALB
          ForwardedValues:
            # Forwarding all headers passes every request through to the origin
            Headers:
              - '*'
            Cookies:
              Forward: all
            QueryStringCacheKeys: []
            QueryString: true
          SmoothStreaming: false
          MinTTL: 3600
          MaxTTL: 86400
          DefaultTTL: 21600
          ViewerProtocolPolicy: redirect-to-https
Outputs:
  MyCloudFrontDistribution:
    Value: !GetAtt MyCloudFrontDistribution.DomainName

Surprises and how to get around them

Here are some things worth noting:

  1. Aliases (Alternate Domain Names) are limited to 100 URLs. This is plenty for most applications, but for apps that separate entities at the domain level it can be a challenge: think of a multi-tenant app that uses {customer}.my-app.com schemes. Always remember to append every new domain to the Aliases in your automation. Adding it manually is NOT recommended, as subsequent automation updates will wipe out the manual settings. You can use a wildcard, but you can't use the same wildcard in another distribution.
  2. TLS Settings: Unlike ALBs, CDN does not upgrade TLS protocols from clients using older TLS versions. You have to test connections from all client sources/devices, especially connections from non-browser-based clients. This may be an edge case, but I’ve come across an issue where CloudFront wasn’t responding with the effective TLS protocol configured (can be TLS 1, 1.1 or 1.2). This forced the client to assume the lowest protocol resulting in requests being rejected. Test all your scenarios!
  3. CloudFront updates take quite a while to take effect, so give them time before running any conclusive tests. The typical range I've seen is 25–35 minutes. I've also seen it extend over an hour, but that's a rare occurrence.
  4. Cache settings: You need to provide path-based patterns, and I highly recommend using wildcards to minimize the number of records to create. Why? It's easier to manage, and you pay for every invalidation request. Many paths will increase your cost, so segment wisely.
  5. The ACM certificate you use with CloudFront MUST BE provisioned in the us-east-1 region, regardless of where your origin is. This trips up a lot of people, since it's not intuitive to provision ACM in one region while you primarily work in a different one. So plan ahead and provision certs in us-east-1 for every region where you have a presence or need a CDN. See more here: CloudFront SSL ACM certificates in Regions Outside N. Virginia
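
Two of these surprises (the alias limit and the certificate region) are easy to catch before a deployment. The sketch below is one way to do that in a pipeline step. The helper names, the tenant list and the account ID are hypothetical, and the limit of 100 is simply the figure quoted above.

```python
MAX_ALIASES = 100                 # limit quoted in this post; treat as an assumption
REQUIRED_CERT_REGION = "us-east-1"


def cert_region(acm_arn):
    """Extract the region from arn:aws:acm:region:account-id:certificate/cert-id."""
    parts = acm_arn.split(":")
    if len(parts) < 6 or parts[2] != "acm":
        raise ValueError(f"not an ACM certificate ARN: {acm_arn}")
    return parts[3]


def check_distribution(aliases, acm_arn):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(aliases) > MAX_ALIASES:
        problems.append(f"{len(aliases)} aliases exceeds the limit of {MAX_ALIASES}")
    duplicates = sorted({a for a in aliases if aliases.count(a) > 1})
    if duplicates:
        problems.append(f"duplicate aliases: {', '.join(duplicates)}")
    region = cert_region(acm_arn)
    if region != REQUIRED_CERT_REGION:
        problems.append(f"certificate is in {region}, expected {REQUIRED_CERT_REGION}")
    return problems


if __name__ == "__main__":
    # Hypothetical multi-tenant setup using the {customer}.my-app.com scheme.
    tenants = ["acme", "globex", "initech"]
    aliases = [f"{tenant}.my-app.com" for tenant in tenants]
    arn = "arn:aws:acm:eu-west-1:123456789012:certificate/example-id"
    for problem in check_distribution(aliases, arn):
        print("WARNING:", problem)
```

Generating the alias list from the tenant list in the automation also avoids the manual edits that later updates would wipe out.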

Reports & Analytics

CloudFront offers a few different ways to view CDN Metrics:

Monitoring: Gives a summary of requests processed, data transferred and error rates. CloudFront charges based on data transfers, so it’s important to review these metrics.

Cache Statistics: This is a fairly important one, as it shows you hit and miss rates. You want more hits and fewer misses.
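
If you export those numbers, the hit ratio is simple to track over time. A minimal sketch, assuming you only have raw hit and miss counts (it ignores errors and any other result types the report may show):

```python
def cache_hit_ratio(hits, misses):
    """Fraction of requests served from the edge cache (0.0 when there is no traffic)."""
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total


if __name__ == "__main__":
    # Hypothetical counts for one day of traffic.
    print(f"{cache_hit_ratio(9_000, 1_000):.0%}")
```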

Popular Objects: This is a ranked list of the most popular requests, including their cache behavior.

Usage: Gives you stats on data transferred between your viewers and CDN and between CDN and origin.

Viewers: Gives you info about your users, e.g.:

  1. What devices they are using
  2. What browsers they are using
  3. Where they are located
  4. What OS they are running

Getting Started

Consider deploying the CDN without caching, and work in the caching rules later. In fact, continuous reviews and tweaks are necessary for optimal performance. This approach allows you to deliver small, low-risk increments, with more chances to inject value. Consider creating separate automation that can be deployed independently of your core application.


Peter Njihia

I'm a Cloud Architect/SRA/DevSecOps Engineer helping folks build and run in the cloud efficiently.