In September, I worked on improving Pika’s image performance. I’ve had a long career now (25 years 😭) doing mostly web-programming tasks, yet somehow I’ve never set up a CDN myself. I suppose the "management years" right as my prior organization was getting bigger contributed to missing out on that experience. In any case, the work was overdue on Pika and it was time to tackle it.
Through a bit of help from online articles and online friends, I’ve gotten it mostly figured out. Here is Pika’s setup.
The tools
Since we started Good Enough with lots of AWS credits, Amazon has got us a bit locked in with their services. And since, remember, I have no past experience setting these things up, well, I tallied-ho with Amazon’s CloudFront for the CDN and S3, which we were already using, for storage. Through this process I had a lot of “grass is greener” feelings toward Cloudflare and Cloudflare R2, but I’ll save that dalliance for another day.
I started thinking about the many background jobs I was going to need to orchestrate for creating the various tuned images (resizing, removing Exif data, compression, etc). Through that research I ran into John Nunemaker’s “Imgproxy is Amazing” blog post. I reached out to confirm that he is still using imgproxy, and, boy howdy, is he ever. Thanks to Nunes for sharing many details about how he has configured both imgproxy and CloudFront!
The flow
When someone’s browser requests an uncached image from a Pika blog post, here’s how that request flows through all of these systems:
        ┌────────────┐
        │            │
        │   Reader   │
        │  requests  │
        │   image    │
        │            │
        └───┬────────┘
            │    ▲
            │    │
            ▼    │
┌────────────────┴───────────┐
│                            │
│  Regional CloudFront node  │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │  CloudFront      │
      │  caches in       │
      │  regions and     │
      │  at shield       │
      ▼                  │
┌────────────────────────┴───┐
│                            │
│  CloudFront Origin Shield  │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │  imgproxy strips │
      │  Exif, resizes,  │
      │  and compresses  │
      ▼                  │
┌────────────────────────┴───┐
│                            │
│          imgproxy          │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │                  │
      │                  │
      │                  │
      ▼                  │
┌────────────────────────┴───┐
│                            │
│             S3             │
│                            │
└────────────────────────────┘
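To make the layering concrete, here is a toy Ruby model of that lookup chain. It is only a sketch: the class names are made up for illustration, the "caches" are plain hashes, and nothing here talks to CloudFront, imgproxy, or S3. What it shows is why the Origin Shield matters: a second regional node can miss without imgproxy having to process the image again.

```ruby
# Toy model of the lookup chain: edge cache -> Origin Shield -> imgproxy.
# Every class and name here is illustrative, not a real CloudFront or imgproxy API.
class CacheLayer
  attr_reader :name, :misses

  def initialize(name, upstream)
    @name = name
    @upstream = upstream # anything that responds to #fetch(path)
    @store = {}
    @misses = 0
  end

  # Return the cached body, or ask the next layer up and remember the answer.
  def fetch(path)
    @store.fetch(path) do
      @misses += 1
      @store[path] = @upstream.fetch(path)
    end
  end
end

# Stand-in for imgproxy reading from S3 and processing the image.
class FakeOrigin
  attr_reader :renders

  def initialize
    @renders = 0
  end

  def fetch(path)
    @renders += 1
    "processed bytes for #{path}"
  end
end

origin = FakeOrigin.new
shield = CacheLayer.new("origin shield", origin)
edge_a = CacheLayer.new("edge A", shield)
edge_b = CacheLayer.new("edge B", shield)

edge_a.fetch("/avatar.jpg") # misses everywhere, so the origin does the work
edge_a.fetch("/avatar.jpg") # served from edge A
edge_b.fetch("/avatar.jpg") # misses at edge B, but the shield already has it

puts "imgproxy renders: #{origin.renders}" # prints 1
puts "shield misses:    #{shield.misses}"  # prints 1
```

Three requests across two regions cost a single render at the origin in this model, which is the behavior the shield layer is there to provide.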
The configuration details (as of today)
Let’s start one step above Rails with the imgproxy setup.
imgproxy
We deploy our services at Render.com. This is the full contents of the Dockerfile we use to deploy our Pika imgproxy web service instance:
FROM ghcr.io/imgproxy/imgproxy:latest
To configure imgproxy I am using environment variables to the max. Here are the environment variables I’m currently using:
- IMGPROXY_TTL=30758400: Feeling pretty confident here and setting the TTL to roughly 1 year (356 days, to be exact). Attaching images to rich text fields in Rails should never really re-use an existing image or its URLs, making cache invalidation happen as a matter of course.
- IMGPROXY_FALLBACK_IMAGE_DATA=R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7: This is a 1x1 transparent GIF fallback image in case imgproxy cannot retrieve the requested image.
- IMGPROXY_FALLBACK_IMAGE_TTL=120: I set the TTL for the fallback image to be much less than our system TTL set above. I don’t want system hiccups to lead to broken images. Well, not for more than 2 minutes, anyway!
- IMGPROXY_FORMAT_QUALITY=jpeg=90,png=90,webp=79,avif=63,jxl=77: Setting mild compression for all images. I am very cautious about over-compressing anything in Pika, and the default compression of 80 was too extreme for me. The webp, avif, and jxl formats are not currently used in Pika, but I added them here to match the defaults that imgproxy uses for IMGPROXY_FORMAT_QUALITY. The gif format is also not being used, as you'll see below.
- IMGPROXY_STRIP_COLOR_PROFILE=false: Related to the above, I want Pika to be as color-accurate as possible.
- IMGPROXY_MAX_SRC_RESOLUTION=75: Did you know there is such a thing as image bombs? Neither did I! imgproxy can protect you from them. The value is in megapixels.
- IMGPROXY_ALLOW_SECURITY_OPTIONS=true: This is required to allow the use of the IMGPROXY_MAX_SRC_RESOLUTION envar.
- IMGPROXY_USE_S3=true: This allows imgproxy to grab images directly from S3. Very clever as it saves a trip through our Rails servers! You will also need to set up the following envars: IMGPROXY_S3_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY. The downside with this technique is that the URLs no longer end with image extensions like .jpg, which has caused some problems with third-party services. I do wonder if it has been worth saving that trip through our Rails servers. 🤔
- IMGPROXY_ALLOW_ORIGIN=https://pika.page: I’m actually not sure if this is needed since we never hit our Rails app when loading an image.
- IMGPROXY_USE_LAST_MODIFIED=true: Given what I wrote about TTL above, I don’t think this is necessary, but it just feels right.
- IMGPROXY_SENTRY_DSN: Set this to enable error reporting to Sentry.
- IMGPROXY_TIMEOUT=15: I’m not sure why I increased this from the default of 10.
- IMGPROXY_READ_REQUEST_TIMEOUT=15: Ditto.
- IMGPROXY_KEY & IMGPROXY_SALT need to be set as well, of course.
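Two of these values are easy to get subtly wrong, so here is a small Ruby sketch for sanity-checking them before they go into envars. It uses only the standard library. The variable names are my own; the Base64 string and the TTL are the ones listed above.

```ruby
# Decode the Base64 fallback value and read the GIF header:
# a 6-byte signature, then width and height as little-endian 16-bit integers.
fallback_image_data = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"

bytes = fallback_image_data.unpack1("m")
signature, width, height = bytes.unpack("a6vv")

puts "#{signature} #{width}x#{height}" # prints "GIF89a 1x1"

# A TTL is easier to audit when it is computed rather than typed.
seconds_per_day = 24 * 60 * 60

puts 365 * seconds_per_day        # prints 31536000
puts 30_758_400 / seconds_per_day # prints 356
```

The first check confirms the fallback really is a 1x1 GIF. The second shows the difference between a 365-day TTL and the value in use, which is close enough to a year for caching purposes.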
CloudFront
Here’s how we have CloudFront configured. I’m only mentioning the settings that we changed from the default.
Main Distribution settings:
- Alternate domain name: cdn.u.pika.page
- Custom SSL certificate: Requested through the interface that CloudFront offers inline
Origin:
- Origin domain: u.pika.page
- Enable origin shield: Yes, setting the Origin Shield region to be the best match for our other server locations
Behaviors:
- Compress objects automatically: No
- Allowed HTTP methods: GET, HEAD, OPTIONS
- Cache HTTP methods: checked OPTIONS
- Cache key and origin requests: checked Legacy cache settings (I don’t love that we are on this Legacy option, but I could never get the other option to work)
Logging: Added a log destination because I’m not sure how you troubleshoot without it!
DNS
Here’s how I have DNS configured for CloudFront and imgproxy:
- Our CDN requests go to cdn.u.pika.page (yes, I can already tell that that should have been cdn1.u.pika.page)
- The CDN requests our imgproxy origin at u.pika.page (yes, I should have gone with u1.pika.page)
- At dnsimple I pointed u.pika.page to our imgproxy origin according to Render’s instructions
- I also added a CNAME record to point cdn.u.pika.page to the Distribution domain name provided by CloudFront
- As mentioned above, the SSL certificate for cdn.u.pika.page was acquired via CloudFront’s interface, which required a DNS record to be set up at dnsimple during setup for certificate validation
The Rails setup
Pika is configured to upload images to S3. This is a pretty straightforward setup that is written about in many other places.
I’m using the imgproxy gem to help build URLs for images.
(There is also an imgproxy-rails gem, but it didn’t play well with our setup.)
Here’s our imgproxy.yml configuration file:
default: &default
  # ERB tags are needed here; without them YAML reads the Ruby as a plain string
  key: <%= Rails.application.credentials.dig(:imgproxy, :key) %>
  salt: <%= Rails.application.credentials.dig(:imgproxy, :salt) %>

development:
  <<: *default
  endpoint: <%= ENV['IMGPROXY_FREE_CDN'] %>

test:

production:
  <<: *default
  endpoint: <%= ENV['IMGPROXY_FREE_CDN'] %>
  use_s3_urls: true
The IMGPROXY_FREE_CDN envar is set to https://cdn.u.pika.page, which is actually the CloudFront CDN URL.
Also note use_s3_urls: true for the production environment.
This ensures the URLs generated by the imgproxy gem point imgproxy to S3 directly.
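If you are curious what the key and salt are for, here is a minimal Ruby sketch of imgproxy’s URL signing scheme as I understand it from the imgproxy documentation: the signature is an HMAC-SHA256 of the salt plus the path, encoded as URL-safe Base64 without padding. This is not the gem’s code, and the gem will emit its own set of processing options. The helper name, the key and salt values, and the bucket and object key below are all made up for the example.

```ruby
require "openssl"

# Hypothetical hex-encoded values; real ones live in your credentials and
# must match IMGPROXY_KEY and IMGPROXY_SALT on the imgproxy server.
key_hex  = "736563726574" # "secret" in hex
salt_hex = "68656c6c6f"   # "hello" in hex

# Hypothetical helper: returns "/<signature><path>" for an imgproxy path.
def sign_imgproxy_path(path, key_hex:, salt_hex:)
  key  = [key_hex].pack("H*")
  salt = [salt_hex].pack("H*")
  digest = OpenSSL::HMAC.digest("SHA256", key, salt + path.b)
  # URL-safe Base64 without padding
  signature = [digest].pack("m0").tr("+/", "-_").delete("=")
  "/#{signature}#{path}"
end

# Resize to fit within 1800x1400, reading the source straight from S3.
# The bucket and object key are invented for this example.
path = "/rs:fit:1800:1400/plain/s3://example-bucket/abc123"
signed = sign_imgproxy_path(path, key_hex: key_hex, salt_hex: salt_hex)

puts "https://cdn.u.pika.page#{signed}"
```

Because the signature covers the whole path, nobody can change the dimensions or the source in a URL without knowing the key and salt, which is what keeps strangers from using your imgproxy instance as a free image resizer.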
The simplest images we serve are site avatars, which can be used in the headings of a blog as well as social share images.
Rendering the imgproxy/CDN URL is pretty easy for this example.
Here’s what we have in our User model:
has_one_attached :avatar

def avatar_url(variant = :small)
  variant_options =
    case variant
    when :small
      { height: "100", width: "100" }
    when :medium
      { height: "300", width: "300" }
    end

  avatar.imgproxy_url(variant_options)
end
Rich text is a whole different beast in Rails.
In our case, we have already heavily overridden the _blob.html.erb file, and our CDN updates fit right in there.
Along the way I decided not to serve GIF files from imgproxy, so you’ll see some reference to that in the code as well.
Processing animated images can get complicated, and I decided to leave that thinking for another day.
Further, for local development I wanted to support accessing a local imgproxy instance, but not break if it isn’t available.
So you’ll see mention of an imgproxy? method, which is supported by inclusion of this module in ApplicationHelper and User:
module ImgproxyDetector
  def imgproxy?
    return @imgproxy if defined?(@imgproxy)

    @imgproxy =
      Rails.env.production? ||
      (Rails.env.development? && Rails.application.config_for(:imgproxy).endpoint.present?)
  end
end
Here’s the simplified imgproxy/CDN-related code from our _blob.html.erb file:
<figure class="attachment attachment--<%= blob.representable? ? "preview" : "file" %> attachment--<%= blob.filename.extension %>">
  <% if blob.representable? %>
    <%
      if blob.content_type == 'image/gif' # don't use imgproxy URLs for GIFs in case they are animated
        img_src_url = url_for(blob)
      else
        if imgproxy?
          img_src_url = blob.imgproxy_url(height: "1400", width: "1800")
        else
          # resize_to_limit takes [width, height], so this matches the imgproxy call above
          img_src_url = url_for(blob.variant(resize_to_limit: [1800, 1400], saver: { quality: 90 }))
        end
      end
    %>
    <%= image_tag img_src_url %>
  <% end %>
  <figcaption class="attachment__caption">
    <% if caption = blob.try(:caption) %>
      <%= caption %>
    <% else %>
      <span class="attachment__name"><%= blob.filename %></span>
      <span class="attachment__size"><%= number_to_human_size blob.byte_size %></span>
    <% end %>
  </figcaption>
</figure>
imgproxy itself is much more performant than a Rails server, but you can’t get around the fact that image processing is a resource-heavy process. In order to avoid flooding our imgproxy server with an unpredictable number of requests the first time an image-heavy post is loaded, I decided that it would be best to warm the cache as soon as possible. So in the end I wasn’t able to avoid background jobs in our image processing stack. When a new post is created or has its images edited, a background job is created to query the CDN URL for each blob in the post. I’ll leave this code as an exercise for the reader.
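For a head start on that exercise, here is one possible shape for the warming step in plain Ruby. To be clear, this is not Pika’s job: the class name, the injectable fetcher, and the return value are illustrative choices, and in a Rails app you would presumably call something like this from a background job with the CDN URLs for a post’s blobs. The fetcher is injectable so the example can run without touching the network.

```ruby
require "net/http"
require "uri"

# Requests each URL once so the CDN (and imgproxy behind it) does the
# expensive work before a reader arrives. All names here are hypothetical.
class CdnCacheWarmer
  # Default fetcher: a plain GET that returns the HTTP status as an integer.
  DEFAULT_FETCHER = lambda do |url|
    Net::HTTP.get_response(URI(url)).code.to_i
  end

  def initialize(fetcher: DEFAULT_FETCHER)
    @fetcher = fetcher
  end

  # Returns a hash of url => HTTP status, or the error raised for that URL.
  # One failing image should not stop the rest from being warmed.
  def warm(urls)
    urls.uniq.to_h do |url|
      [url, @fetcher.call(url)]
    rescue StandardError => e
      [url, e]
    end
  end
end

# Offline usage with a stub fetcher (the URLs are placeholders).
stub = ->(url) { url.end_with?("missing") ? raise(IOError, "boom") : 200 }
results = CdnCacheWarmer.new(fetcher: stub).warm(
  ["https://cdn.example/a.jpg", "https://cdn.example/a.jpg", "https://cdn.example/missing"]
)

results.each { |url, outcome| puts "#{url}: #{outcome.inspect}" }
```

Duplicate URLs are collapsed before fetching, and errors are collected rather than raised, so a job built on this could decide for itself whether a failed warm-up is worth a retry.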
Above you may remember that I mentioned the security concern of image bombing. While imgproxy protects us from that, I wanted to avoid folks uploading such images in the first place. So I added a validation to check image resolutions, which means I also didn’t manage to avoid doing any image processing on our Rails server. 😅 Here is a simplified version of how I do that for rich text image attachments:
# post.rb
has_rich_text :body
validate -> { acceptable_image_attachments(:body) }

def acceptable_image_attachments(attr)
  return true if self.send(attr).body.blank?

  self.send(attr).body.attachables.each do |attachment|
    next unless attachment.is_a?(ActiveStorage::Blob)

    if image_resolution_over_limit?(attachment)
      errors.add(attr, image_resolution_error_message_for(attachment.filename))
    end
  end
end

def image_resolution_over_limit?(blob)
  width, height = blob_dimensions(blob)
  (width.to_f * height.to_f) / 1_000_000.0 > Rails.application.config.x.image_resolution_limit.to_f
end

def blob_dimensions(blob)
  width = blob.metadata["width"]
  height = blob.metadata["height"]

  if width.nil? || height.nil?
    blob.analyze
    width = blob.metadata["width"]
    height = blob.metadata["height"]
  end

  [width, height]
end

# application.rb
config.x.image_resolution_limit = 75 # in megapixels
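To see the arithmetic on its own, here is the same megapixel calculation pulled out of Rails as a tiny Ruby sketch. The megapixels helper and the sample dimensions are made up for illustration; the formula and the limit of 75 come from the code above.

```ruby
# Same arithmetic as image_resolution_over_limit?, outside of Rails.
def megapixels(width, height)
  (width.to_f * height.to_f) / 1_000_000.0
end

limit = 75 # megapixels, matching the config above

[[6000, 4000], [12_000, 9000], [nil, nil]].each do |width, height|
  mp = megapixels(width, height)
  verdict = mp > limit ? "rejected" : "accepted"
  puts "#{width.inspect}x#{height.inspect}: #{mp} MP, #{verdict}"
end
```

A 6000x4000 photo comes out at 24.0 MP and passes, while 12000x9000 is 108.0 MP and gets rejected. Note the last case: when the dimensions are missing, nil.to_f is 0.0, so the image counts as 0.0 MP and passes this check.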
Local testing is pretty easy once you get it all set up. Well, if you’re familiar with Docker. (I’m really not, but I got it set up, and doing that setup is another exercise I’ll leave to you, dear reader.) Our test code does not use imgproxy, but our development environment sure can. As mentioned above, we have a repo for Pika’s imgproxy that is a very simple Dockerfile.
- I have Docker and OrbStack installed locally to make things work
- dotenv is installed to manage my local envars
- In my .env file I have IMGPROXY_FREE_CDN = "http://localhost:7777"
- I have foreman installed to handle Procfile applications
- Then I run foreman start -f Procfile_imgproxy.dev
Here's my Procfile_imgproxy.dev file, which is in my main Rails app:
imgproxy: docker run --rm --name pika-imgproxy -p 7777:8080 --add-host=pika.test:host-gateway -e IMGPROXY_ENABLE_INSECURE_MODE=true -e IMGPROXY_ALLOW_PRIVATE_NETWORKS=true -e IMGPROXY_ALLOW_LOOPBACK_NETWORKS=true -e IMGPROXY_ALLOW_ORIGIN=http://pika.test -e IMGPROXY_FALLBACK_IMAGE_DATA=R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7 -e IMGPROXY_FALLBACK_IMAGE_TTL=120 -e IMGPROXY_FORMAT_QUALITY=jpeg=90,png=90,webp=79,avif=63,jxl=77 -e IMGPROXY_STRIP_COLOR_PROFILE=false -e IMGPROXY_TTL=604800 -e IMGPROXY_USE_LAST_MODIFIED=true ghcr.io/imgproxy/imgproxy:latest
With this all running, you can see imgproxy in action in your local development environment!
The future
We’re hoping to ride with this setup for quite a while. Down the road we’ll probably look into tuning GIFs, and I may look into ways to implement WebP and AVIF while still keeping colors and performance to our liking. During implementation I did not have good luck making those formats work well.
And since I’m an admittedly novice CDN implementor, maybe others will read this blog post and have some ideas about how I could improve this setup. Happy to hear them!