What breaks when an SSL cert expires (and how to catch it before your client does)

Field guide to silent SSL expiry: what fails, what alerts you do and don't get, and how to catch the 30/14/3-day windows reliably.

You'd think an SSL certificate expiring would be a noisy event. It isn't. Browsers go silent for 89 days, and on day 90 every visitor sees an interstitial that effectively says don't trust this site. By the time the agency hears about it, three things have already happened:

  1. Visitors are bouncing.
  2. The client tried to log in to email or admin, got the warning, and assumed they were hacked.
  3. Someone googled how to fix it, copy-pasted a Stack Overflow answer, and is now poking your DNS.

This is a field guide to catching that — both the renewal failure and its weirder downstream effects.

What "expired SSL" actually looks like in the wild

Three categories of failure get lumped under one label.

1. The cert hit notAfter and Let's Encrypt didn't renew

Most common. Let's Encrypt issues 90-day certs and a renewal script (certbot, traefik, caddy, the host's auto-magic) is supposed to refresh them at day 60. When that breaks, you find out at day 90.

Reasons it breaks:

  • Disk full → renewal script can't write the new cert.
  • ACME challenge port (80) closed by a new firewall rule.
  • DNS provider rate-limit kicked in for _acme-challenge TXT records.
  • The cron daemon that ran certbot got disabled during a server "cleanup".
  • An OS upgrade replaced the renewal binary with an incompatible version.

None of these throw a noisy error. The renewal log fills with red, but nobody reads renewal logs.
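Some of these causes can be probed before the renewal quietly fails. The following is a minimal sketch, assuming certbot with its default /etc/letsencrypt directory and a POSIX shell. The helper name `pct_used` and the 90% cutoff are illustrative choices, not part of any tool.

```sh
# pct_used: read `df -P` output on stdin and print the use% of the
# first filesystem listed, as a bare integer (hypothetical helper).
pct_used() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example wiring, left commented out because it needs a real host and
# usually root:
#
#   used=$(df -P /etc/letsencrypt | pct_used)
#   [ "$used" -lt 90 ] || echo "disk ${used}% full: renewal may fail to write"
#   certbot renew --dry-run || echo "dry run failed: check port 80, DNS, the log"
```

Running the dry run from the same scheduler that runs the real renewal also confirms that the scheduled job itself is still alive.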

2. Wildcard or SAN mismatch after a domain change

Cert was issued for acme.com + www.acme.com. Marketing added app.acme.com and pointed it at the same server. The cert is valid but doesn't cover the new hostname, so clients reject it: Chrome shows NET::ERR_CERT_COMMON_NAME_INVALID, and a Node.js client reports ERR_TLS_CERT_ALTNAME_INVALID. To a non-technical client it looks identical to "expired".
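Whether a hostname is covered can be checked against the cert's SAN list. This is a rough sketch in POSIX shell; `san_covers` is a hypothetical helper, and it handles only DNS entries and the single-label wildcard rule, not every corner of certificate name matching.

```sh
# san_covers SAN_LIST HOSTNAME
# Return 0 if HOSTNAME is covered by SAN_LIST, 1 otherwise.
# SAN_LIST is text as printed by `openssl x509 -noout -ext subjectAltName`,
# e.g. "DNS:acme.com, DNS:www.acme.com". Non-DNS entries are ignored.
san_covers() {
    host=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    # One lowercased DNS name per line, "DNS:" prefix and padding removed.
    entries=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr ',' '\n' |
        sed -n -e 's/[[:space:]]*$//' -e 's/^[[:space:]]*dns://p')
    while IFS= read -r entry; do
        case "$entry" in
            \*.*)
                # A wildcard stands in for exactly one left-most label.
                suffix=${entry#??}
                [ "$host" != "${host#*.}" ] && [ "${host#*.}" = "$suffix" ] && return 0
                ;;
            *)
                [ "$host" = "$entry" ] && return 0
                ;;
        esac
    done <<EOF
$entries
EOF
    return 1
}

# Usage (needs network access and OpenSSL 1.1.1 or newer for -ext):
#   sans=$(echo | openssl s_client -connect app.acme.com:443 \
#              -servername app.acme.com 2>/dev/null |
#          openssl x509 -noout -ext subjectAltName)
#   san_covers "$sans" app.acme.com || echo "app.acme.com is not on the cert"
```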

3. Chain of trust broke

Cert is fine. The intermediate cert wasn't installed (or got purged during a server move). Some browsers cache intermediates they have already seen elsewhere, so it works on your laptop and breaks on the client's phone. The bug is undetectable if you're only checking from one place.
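One way to take browser caching out of the picture is to count what the server itself sends. A small sketch, assuming the "Certificate chain" listing that `openssl s_client` prints; `served_cert_count` is a hypothetical helper name.

```sh
# served_cert_count: read `openssl s_client` output on stdin and print how
# many certificates appear in its "Certificate chain" listing. Each cert
# shows up as a line like " 0 s:CN = acme.com".
served_cert_count() {
    grep -c -E '^ *[0-9]+ s:' || true
}

# Usage (needs network access):
#   echo | openssl s_client -connect acme.com:443 -servername acme.com \
#       2>/dev/null | served_cert_count
```

For a typical publicly trusted cert, a count of 1 suggests the server is sending only the leaf and leaving the intermediate out.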

What "expired SSL" doesn't look like

A lot of the alerting tools out there only check the leaf cert's notAfter. They miss:

  • Hostname mismatch (case 2 above).
  • Self-signed cert that someone temporarily swapped in for testing and forgot to revert.
  • Untrusted root because the issuer was distrusted or its root expired (rare, but it happens: see the Symantec distrust in 2018 and the expiry of Let's Encrypt's old DST Root CA X3 chain in 2021).
  • Wrong chain order that breaks Java clients but works in Chrome.

A real check needs to complete the full TLS handshake, verify the chain, and match the hostname. Each of those three steps catches a different bug, and all of them surface as "the site is broken" to the client.
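With plain OpenSSL, the handshake, the chain check and the hostname check can be told apart by the verify return code. This is a sketch only: `verify_code` and `classify_verify` are hypothetical helpers, the category names are made up for illustration, and the `-verify_hostname` option assumes a reasonably recent OpenSSL.

```sh
# verify_code: read `openssl s_client` output on stdin and print the first
# "Verify return code" number (prints nothing if the handshake never
# reached verification).
verify_code() {
    sed -n 's/.*Verify return code: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# classify_verify CODE: map an OpenSSL verify code to a failure category.
classify_verify() {
    case "$1" in
        0)      echo ok ;;
        10)     echo expired ;;
        62)     echo hostname_mismatch ;;
        18|19)  echo self_signed ;;
        20|21)  echo untrusted_or_incomplete_chain ;;
        "")     echo handshake_failed ;;
        *)      echo "other:$1" ;;
    esac
}

# Usage (needs network access):
#   out=$(echo | openssl s_client -connect acme.com:443 -servername acme.com \
#             -verify_hostname acme.com 2>/dev/null)
#   classify_verify "$(printf '%s\n' "$out" | verify_code)"
```

Keeping the categories separate means an expired leaf, a wrong hostname and a missing intermediate each get their own alert instead of one generic "SSL problem".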

What a sensible alert schedule looks like

Three thresholds. Anything else is noise.

| Days left | Severity | What to do |
|-----------|----------|------------|
| ≤ 30 | warn | Email the agency owner. Renewal should have already happened; investigate why it hasn't. |
| ≤ 14 | warn | Page someone. The renewal pipeline is definitely broken; manual intervention needed. |
| ≤ 3 | critical | Email + Slack + on-call SMS. This is going to expire on a weekend, in a foreign timezone, while you're at a wedding. |

Outside those three, you're either spamming people with notifications they ignore or missing the window entirely.
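The table maps onto a few lines of shell. A minimal sketch; the function name `alert_level` and its output format are illustrative.

```sh
# alert_level DAYS_LEFT: print the alert level for a cert, following the
# 30/14/3 thresholds in the table. Negative values (already expired)
# fall into the critical band.
alert_level() {
    if   [ "$1" -le 3 ];  then echo "critical:3"
    elif [ "$1" -le 14 ]; then echo "warn:14"
    elif [ "$1" -le 30 ]; then echo "warn:30"
    else                       echo "ok"
    fi
}
```

For example, `alert_level 12` prints `warn:14` and `alert_level 45` prints `ok`.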

How Pleenx does it

We do a real TLS handshake every day for every monitored domain. Not just notAfter — we capture:

  • days_left (from valid_to)
  • authorized (full chain validation, including hostname match)
  • issuer (so you spot a CA change)
  • subjectaltname (so you spot a SAN mismatch the moment it lands)

If the handshake fails entirely — connection refused, timeout, self-signed — that's a separate critical alert. Mixing those into "SSL expired" is a category error that costs people money.

The dashboard shows you the raw days-remaining counter. The cron writes the same data into a checks table you can query. When the count crosses 30 → 14 → 3, an alert fires once per crossing — not every day. That's the difference between a useful tool and an inbox full of yellow squares everyone learns to ignore.
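The "once per crossing" behaviour can be sketched by comparing yesterday's reading with today's. This illustrates the idea and is not Pleenx's actual implementation; `band` and `crossed` are hypothetical names.

```sh
# band DAYS_LEFT: print the tightest threshold the value is inside
# (30, 14 or 3), or 0 when more than 30 days remain.
band() {
    if   [ "$1" -le 3 ];  then echo 3
    elif [ "$1" -le 14 ]; then echo 14
    elif [ "$1" -le 30 ]; then echo 30
    else                       echo 0
    fi
}

# crossed PREV_DAYS CURR_DAYS: print the threshold that was newly crossed
# between two checks, or nothing when no alert is due.
crossed() {
    prev=$(band "$1")
    curr=$(band "$2")
    # Alert only when today sits in a tighter band than yesterday.
    if [ "$curr" -ne 0 ] && { [ "$prev" -eq 0 ] || [ "$curr" -lt "$prev" ]; }; then
        echo "$curr"
    fi
}
```

`crossed 15 14` prints `14`, while `crossed 14 13` prints nothing, so a cert stuck at 13 days does not re-alert every morning. A renewal that pushes the count back above 30 resets the state, and a very first check can pass a large previous value such as 9999.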

What to do tonight

Open a terminal and run:

echo | openssl s_client -connect acme.com:443 -servername acme.com 2>/dev/null | openssl x509 -noout -dates

Replace acme.com with your client's domain. You'll get notBefore and notAfter. If notAfter is less than 30 days out and you don't know why renewal hasn't fired, you have homework.
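To turn notAfter into a number you can compare against 30, something like the sketch below works. It assumes GNU date, because the `-d` parsing differs on BSD and macOS, and `days_left` is a hypothetical helper name.

```sh
# days_left NOTAFTER_LINE [NOW_EPOCH]
# Print whole days between now and a line such as
#   notAfter=Jun  1 12:00:00 2030 GMT
# NOW_EPOCH defaults to the current time. Partial days are truncated.
# Assumes GNU date for the -d option.
days_left() {
    end=$(date -u -d "${1#notAfter=}" +%s) || return 1
    now=${2:-$(date -u +%s)}
    echo $(( (end - now) / 86400 ))
}

# Usage (needs network access):
#   line=$(echo | openssl s_client -connect acme.com:443 -servername acme.com \
#              2>/dev/null | openssl x509 -noout -enddate)
#   [ "$(days_left "$line")" -ge 30 ] || echo "under 30 days: check renewal"
```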

Or just add the domain to Pleenx and go to bed.

Written by Nikolay Spangelets, CTO · founder

Building Pleenx from the engineering side. 10+ years shipping production web platforms; broke enough things in production to know which monitoring is actually useful.
