Fixing Let’s Encrypt SSL Certificate Renewal on a Server: A Step-by-Step Guide

Background

We recently provisioned two new staging environments (staging1.mydomain.ca and staging2.mydomain.ca) that mirror our production Rails infrastructure. Production uses a Cloudflare load balancer in front of multiple origin servers running nginx, with a cron-driven script that renews Let’s Encrypt certificates across all origins.

When we added the staging environments, the existing renew_certificate.sh cron script was never set up on them, so the certificates expired. This post documents every error we hit while fixing that, and how we resolved each one.


The Renewal Architecture

Before diving in, it’s worth understanding how SSL renewal works in this setup:

Cloudflare (DNS + Load Balancer)
  └─ nginx (origin server)
       ├─ /apps/mydomain/current   ← Rails app lives here
       └─ /etc/letsencrypt/live/   ← Certs live here

The renewal script (scripts/cron/renew_certificate.sh) does the following:

  1. Fetches the load balancer pool config from the Cloudflare API
  2. Disables all origin servers in the pool except the GCP instance (takes servers out of rotation)
  3. Turns off Cloudflare’s “Always Use HTTPS” setting (allows HTTP for the ACME challenge)
  4. Runs sudo certbot renew locally
  5. Copies the new cert to all other origin servers via SCP and SSH
  6. Re-enables all origin servers in the Cloudflare pool
  7. Re-enables “Always Use HTTPS”
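
The ordering matters: steps 6 and 7 undo steps 2 and 3, so they should run even if certbot fails in between. Here is a minimal sketch of that control flow. All helper names (disable_other_origins, set_always_https, and so on) are placeholders invented for illustration, not functions from the real renew_certificate.sh, and each one only prints the step it stands for.

```shell
#!/usr/bin/env bash
# Sketch only: every helper is a hypothetical stand-in that prints its step
# instead of calling the Cloudflare API, certbot, or scp.
fetch_pool_config()     { echo "1 fetch pool config"; }
disable_other_origins() { echo "2 disable other origins"; }
set_always_https()      { echo "always-use-https $1"; }
renew_locally()         { echo "4 certbot renew"; return "${RENEW_STATUS:-0}"; }
copy_cert_to_origins()  { echo "5 copy cert to other origins"; }
enable_all_origins()    { echo "6 enable all origins"; }

run_renewal() (
  # The function body is a subshell, so this EXIT trap fires when the
  # function finishes, whether the renewal succeeded or not.
  trap 'enable_all_origins; set_always_https on' EXIT
  fetch_pool_config
  disable_other_origins
  set_always_https off
  renew_locally || exit 1
  copy_cert_to_origins
)
```

Calling run_renewal with RENEW_STATUS=1 simulates a certbot failure: step 5 is skipped, but the pool and the HTTPS setting are still restored.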

The problem: this script was never added to the cron on staging1/staging2, so the certs expired.


First Attempt: Running the Renewal Script Manually

SSH’d into staging2 and ran:

bash /apps/mydomain/current/scripts/cron/renew_certificate.sh

Error #1: RSpec / webmock LoadError

An error occurred while loading spec_helper. - Did you mean?
rspec ./spec/helper.rb
Failure/Error: require 'webmock/rspec'
LoadError:
cannot load such file -- webmock/rspec

What happened: The script calls bundle exec rake google_chat:send_message[...] to send failure notifications to Google Chat. On staging, test gems like webmock aren’t installed in the bundle, so the rake task blew up loading the Rails environment.

Lesson: This is a notification side-effect, not the core renewal logic. But it masked the real error.
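
One way to stop a broken notifier from hiding the underlying failure is to print the failing command's own output first and treat the notification as best-effort. This is a minimal sketch; run_and_report and NOTIFY_CMD are hypothetical names introduced here, since the internals of the real script are not shown in this post.

```shell
#!/usr/bin/env bash
# run_and_report CMD [ARGS...]
# Runs CMD. On failure, prints CMD's own output to stderr *before* trying to
# notify, and always returns CMD's exit status, even if the notifier breaks.
# NOTIFY_CMD is a hypothetical hook (for example a rake task or curl wrapper).
run_and_report() {
  local output status
  output=$("$@" 2>&1)
  status=$?
  if (( status != 0 )); then
    printf 'command failed (exit %d): %s\n%s\n' "$status" "$*" "$output" >&2
    # Best effort: a failing notifier only adds a warning.
    ${NOTIFY_CMD:-true} "renewal failed on ${HOSTNAME:-this host}" >/dev/null 2>&1 \
      || echo "warning: notification itself failed" >&2
  fi
  return "$status"
}
```

With this shape, a LoadError inside the notifier shows up as a one-line warning after the real certbot error instead of replacing it.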

Error #2: certbot failing because port 80 was in use

After isolating the issue, running sudo certbot renew directly gave:

Renewing an existing certificate for staging2.mydomain.ca and www.staging2.mydomain.ca
Failed to renew certificate staging2.mydomain.ca with error: Could not bind TCP port 80
because it is already in use by another process on this system (such as a web server).
Please stop the program in question and then try again.

What happened: The original certificate was issued using certbot’s standalone authenticator, which spins up its own HTTP server on port 80 to answer the ACME challenge. Since nginx was already running on port 80, the renewal failed.

There was also a second certificate lineage (staging2.mydomain.ca-0001), created earlier with sudo certbot --nginx -d staging2.mydomain.ca. That cert was valid, but having two lineages for the same domain left both certbot’s state and the nginx config inconsistent.
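
The authenticator a certificate will use at renewal time is recorded in its file under /etc/letsencrypt/renewal/. A small helper to read it, as a sketch: renewal_authenticator is a name made up here, and it assumes only the "authenticator = ..." line that certbot writes in the [renewalparams] section.

```shell
# renewal_authenticator CONF_FILE
# Prints the authenticator recorded in a certbot renewal config.
renewal_authenticator() {
  local conf="$1"
  sed -n 's/^[[:space:]]*authenticator[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1
}

# Example (run as root if the files are not readable by your user):
#   for f in /etc/letsencrypt/renewal/*.conf; do
#     echo "$f: $(renewal_authenticator "$f")"
#   done
```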


Inspecting the Damage

sudo certbot certificates

Output:

Renewal configuration file /etc/letsencrypt/renewal/staging2.mydomain.ca.conf produced
an unexpected error: expected /etc/letsencrypt/live/staging2.mydomain.ca-0001/cert.pem
to be a symlink. Skipping.
The following renewal configurations were invalid:
/etc/letsencrypt/renewal/staging2.mydomain.ca.conf
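
Certbot expects each file in a live/ directory to be a symlink into the matching archive/ directory, which is what the error above complains about. A quick way to see which entries are regular files instead, as a sketch (non_symlink_certs is a hypothetical helper name):

```shell
# non_symlink_certs LIVE_DIR
# Prints every standard cert file in LIVE_DIR that exists but is NOT a symlink.
non_symlink_certs() {
  local live_dir="$1" name
  for name in cert.pem chain.pem fullchain.pem privkey.pem; do
    if [[ -e "$live_dir/$name" && ! -L "$live_dir/$name" ]]; then
      echo "$live_dir/$name"
    fi
  done
}

# Example (as root, since live/ is typically readable only by root):
#   non_symlink_certs /etc/letsencrypt/live/staging2.mydomain.ca-0001
```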

The nginx config at /etc/nginx/sites-enabled/mydomain was also a mess — certbot had injected its own server block for the HTTP→HTTPS redirect, and the two 443 server blocks were pointing to different cert paths:

# Certbot-injected block (unwanted)
server {
    if ($host = staging2.mydomain.ca) {
        return 301 https://$host$request_uri;
    } # managed by Certbot
    ...
}

# Redirect server pointing to -0001 certs (also unwanted)
server {
    server_name staging2.mydomain.ca;
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/staging2.mydomain.ca-0001/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/staging2.mydomain.ca-0001/privkey.pem; # managed by Certbot
    ...
}

# Main www server pointing to original path
server {
    server_name www.staging2.mydomain.ca;
    listen 443 ssl http2;
    ssl_certificate /etc/letsencrypt/live/staging2.mydomain.ca/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/staging2.mydomain.ca/privkey.pem;
    ...
}
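
A quick way to spot this kind of mismatch is to list the distinct certificate directories a config file references. A sketch, with cert_dirs_in_config as a made-up helper name:

```shell
# cert_dirs_in_config NGINX_CONF
# Prints each directory referenced by ssl_certificate / ssl_certificate_key
# directives, once, sorted.
cert_dirs_in_config() {
  local conf="$1"
  # Capture the path up to the last "/", dropping the file name, the
  # trailing ";" and any comment after it.
  sed -n -E 's|^[[:space:]]*ssl_certificate(_key)?[[:space:]]+([^;[:space:]]+)/[^/;]+;.*$|\2|p' "$conf" \
    | sort -u
}

# Example:
#   cert_dirs_in_config /etc/nginx/sites-enabled/mydomain
```

More than one line of output for a single site means its server blocks disagree about which certificate to serve.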

The Fix

Step 1: Remove all broken certbot state

sudo rm -f /etc/letsencrypt/renewal/staging2.mydomain.ca.conf
sudo rm -f /etc/letsencrypt/renewal/staging2.mydomain.ca-0001.conf
sudo rm -rf /etc/letsencrypt/live/staging2.mydomain.ca
sudo rm -rf /etc/letsencrypt/live/staging2.mydomain.ca-0001
sudo rm -rf /etc/letsencrypt/archive/staging2.mydomain.ca
sudo rm -rf /etc/letsencrypt/archive/staging2.mydomain.ca-0001

Step 2: Stop nginx and get a fresh cert with standalone authenticator

sudo service nginx stop
sudo certbot certonly --standalone -d staging2.mydomain.ca -d www.staging2.mydomain.ca
sudo service nginx start

This gave us a clean, single certificate at /etc/letsencrypt/live/staging2.mydomain.ca/.

Step 3: Clean up the nginx config

Removed the certbot-injected if ($host = ...) server block, and updated both 443 server blocks to point to the same cert path:

ssl_certificate /etc/letsencrypt/live/staging2.mydomain.ca/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/staging2.mydomain.ca/privkey.pem;

Reloaded nginx:

sudo service nginx reload

The site was live again with a valid cert.
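
To confirm the new certificate’s validity window rather than trusting the browser, openssl can check the file on disk. This is a sketch: expires_within is a hypothetical helper built on openssl x509 -checkend, which exits non-zero when the certificate expires within the given number of seconds.

```shell
# expires_within CERT_FILE DAYS
# Succeeds (exit 0) if the certificate in CERT_FILE expires within DAYS days.
# Note: an unreadable or invalid file also counts as "expiring".
expires_within() {
  local cert="$1" days="$2"
  ! openssl x509 -checkend "$(( days * 86400 ))" -noout -in "$cert" >/dev/null 2>&1
}

# Example (as root):
#   expires_within /etc/letsencrypt/live/staging2.mydomain.ca/fullchain.pem 30 \
#     && echo "renew soon"
#
# To look at the certificate nginx is actually serving:
#   echo | openssl s_client -connect staging2.mydomain.ca:443 \
#       -servername staging2.mydomain.ca 2>/dev/null | openssl x509 -noout -dates
```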


Making Future Renewals Work Without Stopping nginx

The next problem: the renewal config still used the standalone authenticator, so every automated renewal would fail for as long as nginx was holding port 80, which in normal operation is always.

The fix is to switch to the webroot authenticator. Our nginx config already had an ACME challenge location block:

location ^~ /.well-known/acme-challenge/ {
    root /apps/certbot;
    default_type "text/plain";
    allow all;
}

This means certbot can write a challenge file to /apps/certbot and nginx will serve it over HTTP — no need to stop nginx.
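
Before asking certbot to use the webroot, it is worth checking that a file dropped there really is reachable over plain HTTP. A minimal sketch of such a probe follows; probe_webroot is a hypothetical helper, and the path it writes to follows from the root directive in the location block above.

```shell
# probe_webroot WEBROOT HOST
# Writes a throwaway token under WEBROOT/.well-known/acme-challenge/, fetches
# it over plain HTTP from HOST, removes it, and succeeds only if the body
# that came back matches the token.
probe_webroot() {
  local webroot="$1" host="$2"
  local dir="$webroot/.well-known/acme-challenge"
  local token="probe-$$-$RANDOM"
  local body
  mkdir -p "$dir" || return 1
  printf '%s' "$token" > "$dir/$token" || return 1
  body=$(curl -fsS --max-time 10 \
    "http://$host/.well-known/acme-challenge/$token" 2>/dev/null) || body=""
  rm -f "$dir/$token"
  [[ "$body" == "$token" ]]
}

# Example:
#   probe_webroot /apps/certbot staging2.mydomain.ca && echo "webroot is served"
```

A redirect or a 404 makes the probe fail, which is the cue to check the location block (or the “Always Use HTTPS” setting) before running certbot.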

Attempt 1: Manually edit the renewal config

Edited /etc/letsencrypt/renewal/staging2.mydomain.ca.conf:

[renewalparams]
authenticator = webroot
server = https://acme-v02.api.letsencrypt.org/directory
key_type = ecdsa
[[webroot]]
staging2.mydomain.ca = /apps/certbot
www.staging2.mydomain.ca = /apps/certbot

Dry run:

sudo certbot renew --dry-run

Error #3: webroot mapping not found

Failed to renew certificate staging2.mydomain.ca with error: Missing command line flag or
config entry for this setting:
Input the webroot for staging2.mydomain.ca:

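The config looked correct, but certbot was still asking for the webroot interactively. The likely cause is the section name: renewal files that certbot generates itself store this mapping under [[webroot_map]] (alongside a webroot_path entry), so a hand-written [[webroot]] section is probably just ignored. Either way, hand-converting a standalone config to webroot is easy to get subtly wrong.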

Attempt 2: Delete and re-issue with webroot from the start (this worked)

sudo certbot delete --cert-name staging2.mydomain.ca
sudo mkdir -p /apps/certbot
sudo certbot certonly --webroot -w /apps/certbot \
-d staging2.mydomain.ca \
-d www.staging2.mydomain.ca

This time certbot generated the renewal config correctly itself. Dry run:

sudo certbot renew --dry-run
Simulating renewal of an existing certificate for staging2.mydomain.ca and www.staging2.mydomain.ca
Congratulations, all simulated renewals succeeded:
/etc/letsencrypt/live/staging2.mydomain.ca/fullchain.pem (success)
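
To see what certbot recorded, the renewal file can be inspected directly. Certbot-generated webroot configs normally keep the per-domain mapping in a [[webroot_map]] section. Treat that layout as an assumption and compare it with your own file. The helper name webroot_map_entries is made up for this sketch.

```shell
# webroot_map_entries CONF_FILE
# Prints the "domain = path" lines found in the [[webroot_map]] section.
webroot_map_entries() {
  local conf="$1"
  awk '
    /^[[:space:]]*\[\[webroot_map\]\]/ { in_map = 1; next }  # section starts
    /^[[:space:]]*\[/                  { in_map = 0 }        # any other section ends it
    in_map && /=/                      { print }
  ' "$conf"
}

# Example:
#   sudo cat /etc/letsencrypt/renewal/staging2.mydomain.ca.conf > /tmp/renewal.conf
#   webroot_map_entries /tmp/renewal.conf
```

Every domain on the certificate should appear here with the webroot path you passed via -w.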

Key Lessons

  1. Never run certbot --nginx on a server where you manage the nginx config manually. It injects its own server blocks and creates confusing duplicate certs with -0001 suffixes.
  2. Standalone vs webroot authenticator: Standalone is simpler to set up initially but requires stopping nginx. Webroot is the right choice for servers where nginx runs continuously — as long as you have the ACME challenge location block configured.
  3. Manually editing certbot renewal configs is fragile. Let certbot generate the renewal config by passing the correct authenticator flags at issuance time.
  4. certbot renew --dry-run is your best friend. Always confirm future renewals will work before leaving the server. Discovering a broken renewal config 2 days before expiry is stressful.
  5. Let’s Encrypt ACME server outages are real but brief. If dry-run fails with “The service is down for maintenance”, check https://letsencrypt.status.io/ and retry in a few hours.

A Clean Auto-Renewal Script for nginx + webroot

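Here’s a self-contained script you can drop into any server using this stack. It handles renewal and the nginx reload, and sends a notification when a certificate is renewed or anything fails. It assumes the webroot authenticator is already configured in the certbot renewal config, and that the user running it can write to the log file and use sudo without a password prompt.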

#!/bin/bash
# Save as /usr/local/bin/certbot-renew.sh and call it from cron.
# Requires: certbot, nginx, curl (for the Slack/Google Chat webhook)
set -euo pipefail

CERT_NAME="${CERT_NAME:-}"            # e.g. staging2.mydomain.ca; empty = all certs
NOTIFY_WEBHOOK="${NOTIFY_WEBHOOK:-}"  # Slack or Google Chat webhook URL
ACME_WEBROOT="${ACME_WEBROOT:-/apps/certbot}"
LOG_FILE="/var/log/certbot-renew.log"

# Timestamp each line when it is written, not once at startup
log() {
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

# Escape backslashes, double quotes and newlines so that multi-line certbot
# output can be embedded in a JSON string
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk '{ printf "%s%s", sep, $0; sep = "\\n" }'
}

notify() {
  local message="$1"
  log "NOTIFY: $message"
  if [[ -n "$NOTIFY_WEBHOOK" ]]; then
    curl -s -X POST "$NOTIFY_WEBHOOK" \
      -H "Content-Type: application/json" \
      -d "{\"text\": \"$(json_escape "$message")\"}" \
      >> "$LOG_FILE" 2>&1 || true
  fi
}

# Ensure webroot directory exists
mkdir -p "$ACME_WEBROOT"

log "Starting certificate renewal..."

# Attempt renewal. No --quiet here: the check below needs certbot's output.
RENEW_OUTPUT=$(sudo certbot renew \
  --non-interactive \
  ${CERT_NAME:+--cert-name "$CERT_NAME"} \
  2>&1) || {
  notify "SSL RENEWAL FAILED on $(hostname): $RENEW_OUTPUT"
  log "ERROR: $RENEW_OUTPUT"
  exit 1
}

# certbot exits 0 even if nothing was due, so look for its success message
# (the exact wording may differ between certbot versions)
if echo "$RENEW_OUTPUT" | grep -q "Congratulations"; then
  log "Certificate renewed. Reloading nginx..."
  sudo service nginx reload || {
    notify "SSL RENEWAL WARNING on $(hostname): cert renewed but nginx reload failed!"
    exit 1
  }
  notify "SSL cert successfully renewed on $(hostname)"
  log "Done."
else
  log "No certificates due for renewal. Nothing to do."
fi
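
An alternative to grepping certbot’s output is certbot’s own --deploy-hook option, which runs a command only after a certificate has actually been renewed. A sketch of that variant follows; renew_with_deploy_hook is a hypothetical wrapper name, and the reload command mirrors the one used in the script.

```shell
# renew_with_deploy_hook [EXTRA_CERTBOT_ARGS...]
# Lets certbot decide when to reload nginx: the hook runs only for
# certificates that were really renewed.
renew_with_deploy_hook() {
  # Extra arguments (for example --cert-name NAME) are passed through.
  sudo certbot renew \
    --non-interactive \
    --deploy-hook "service nginx reload" \
    "$@"
}

# Example:
#   renew_with_deploy_hook --cert-name staging2.mydomain.ca
```

This removes the dependency on the wording of certbot’s success message, at the cost of moving the reload out of the script’s own error handling.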

Usage

# Make the script executable
sudo chmod +x /usr/local/bin/certbot-renew.sh
# Set environment variables and run (quote the URL so the shell leaves ? and & alone)
CERT_NAME=staging2.mydomain.ca \
NOTIFY_WEBHOOK='https://chat.googleapis.com/v1/spaces/.../messages?key=...' \
/usr/local/bin/certbot-renew.sh

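Cron entry (runs twice daily, the frequency the Certbot documentation recommends). The user field (deployer) means this line belongs in a file under /etc/cron.d/ or in /etc/crontab; drop that field if you install it with crontab -e. Any % character in the webhook URL must be written as \% in a cron line.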

0 3,15 * * * deployer CERT_NAME=staging2.mydomain.ca NOTIFY_WEBHOOK=https://... /usr/local/bin/certbot-renew.sh

Running twice daily means that if one attempt fails because of a transient ACME server issue, the next attempt 12 hours later gets another chance, leaving plenty of time before expiry.


Summary

Problem | Root Cause | Fix
--------|------------|----
certbot failed to bind port 80 | standalone authenticator conflicted with nginx | Switch to webroot authenticator
Duplicate -0001 cert created | Ran certbot --nginx after standalone cert existed | Delete all cert state, re-issue cleanly
nginx serving expired cert | Mixed cert paths after certbot injected its own config | Manually fix nginx config to consistent paths
Manual webroot config edit didn’t work | Certbot’s conf format is fragile when converted manually | Delete and re-issue with --webroot flag from scratch

Happy Debugging!



Author: Abhilash

Hi, I’m Abhilash! A seasoned web developer with 15 years of experience specializing in Ruby and Ruby on Rails. Since 2010, I’ve built scalable, robust web applications and worked with frameworks like Angular, Sinatra, Laravel, Node.js, Vue and React. Passionate about clean, maintainable code and continuous learning, I share insights, tutorials, and experiences here. Let’s explore the ever-evolving world of web development together!
