Speckle Certificate Errors in NGINX Ingress Controller

Hi all, I have a domain which is set in the values file, and I have restarted all the services (everything in the speckle namespace is running as expected, apart from an S3 compatibility error). I cannot browse to the service; I get a "page cannot be reached" error. The following might be related: I’m getting these two errors on the NGINX ingress controller:

W0323 09:59:26.584702 9 backend_ssl.go:47] Error obtaining X.509 certificate: no object matching key “speckle/server-tls” in local store
W0323 09:59:26.680911 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate

cert-manager appears to be running and its status is OK, see below. Has anyone experienced this and can point me in the right direction? Thank you for your help.

Status:
  Acme:
    Last Registered Email:  myemailaddress
    Uri:                    https://acme-staging-v02.api.letsencrypt.org/acme/acct/90843713
  Conditions:
    Last Transition Time:  2023-03-01T15:47:07Z
    Message:               The ACME account was registered with the ACME server
    Observed Generation:   3
    Reason:                ACMEAccountRegistered
    Status:                True
    Type:                  Ready
Events:

NGINX ingress controller pod logs below:

W0323 09:59:25.251691 9 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0323 09:59:25.251801 9 main.go:209] “Creating API client” host=“https://10.0.0.1:443
I0323 09:59:25.275831 9 main.go:253] “Running in Kubernetes cluster” major=“1” minor=“24” git=“v1.24.9” state=“clean” commit=“57fbbcc2804848b95cad5519f5ec9d6355430db9” platform=“linux/amd64”
I0323 09:59:25.421535 9 main.go:104] “SSL fake certificate created” file=“/etc/ingress-controller/ssl/default-fake-certificate.pem”
I0323 09:59:25.463861 9 ssl.go:533] “loading tls certificate” path=“/usr/local/certificates/cert” key=“/usr/local/certificates/key”
I0323 09:59:25.478000 9 nginx.go:261] “Starting NGINX Ingress controller”
I0323 09:59:25.484624 9 event.go:285] Event(v1.ObjectReference{Kind:“ConfigMap”, Namespace:“ingress-nginx”, Name:“ingress-nginx-controller”, UID:“1ea7b853-1c97-471b-8d88-cddb9c43fea2”, APIVersion:“v1”, ResourceVersion:“2725364”, FieldPath:“”}): type: ‘Normal’ reason: ‘CREATE’ ConfigMap ingress-nginx/ingress-nginx-controller
I0323 09:59:26.584394 9 store.go:433] “Found valid IngressClass” ingress=“speckle/speckle-server” ingressclass=“nginx”
I0323 09:59:26.584589 9 event.go:285] Event(v1.ObjectReference{Kind:“Ingress”, Namespace:“speckle”, Name:“speckle-server”, UID:“adb25bac-449c-49f7-95ad-5088863957eb”, APIVersion:“networking.k8s.io/v1”, ResourceVersion:“10553424”, FieldPath:“”}): type: ‘Normal’ reason: ‘Sync’ Scheduled for sync
W0323 09:59:26.584702 9 backend_ssl.go:47] Error obtaining X.509 certificate: no object matching key “speckle/server-tls” in local store
I0323 09:59:26.679437 9 nginx.go:304] “Starting NGINX process”
I0323 09:59:26.679516 9 leaderelection.go:248] attempting to acquire leader lease ingress-nginx/ingress-nginx-leader…
I0323 09:59:26.680615 9 nginx.go:324] “Starting validation webhook” address=“:8443” certPath=“/usr/local/certificates/cert” keyPath=“/usr/local/certificates/key”
W0323 09:59:26.680911 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
I0323 09:59:26.681042 9 controller.go:188] “Configuration changes detected, backend reload required”
I0323 09:59:26.687162 9 status.go:84] “New leader elected” identity=“ingress-nginx-controller-856c787cd8-t5lst”
I0323 09:59:26.762216 9 controller.go:205] “Backend successfully reloaded”
I0323 09:59:26.762420 9 controller.go:216] “Initial sync, sleeping for 1 second”
I0323 09:59:26.762513 9 event.go:285] Event(v1.ObjectReference{Kind:“Pod”, Namespace:“ingress-nginx”, Name:“ingress-nginx-controller-744dc75f66-zdcc9”, UID:“c837b337-1782-402c-a63d-a32fccd5534b”, APIVersion:“v1”, ResourceVersion:“10556560”, FieldPath:“”}): type: ‘Normal’ reason: ‘RELOAD’ NGINX reload triggered due to a change in configuration
W0323 09:59:30.513353 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 09:59:45.021352 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 09:59:48.355721 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 09:59:51.691366 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 09:59:57.283160 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:00.617107 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:03.950868 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:07.283411 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:10.617597 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:18.354720 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:21.691188 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:00:25.022302 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
I0323 10:00:40.236471 9 status.go:84] “New leader elected” identity=“ingress-nginx-controller-744dc75f66-d62d8”
W0323 10:02:44.697652 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:02:48.031370 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate
W0323 10:02:51.364431 9 controller.go:1384] Error getting SSL certificate “speckle/server-tls”: local SSL certificate speckle/server-tls was not found. Using default certificate

It can sometimes take a moment or two for cert-manager and Let’s Encrypt (or whichever provider you are using) to process the certificate.

In the meantime, you can check the status of what is happening by inspecting some of the resources we’d expect cert-manager to create.

Firstly, cert-manager will create a temporary ingress for handling the challenge that Let’s Encrypt will eventually send. You can view this with (replacing XX with your namespace name):

kubectl get ingress --namespace XX

As well as the speckle-server ingress, you may also see an ingress whose name starts with cm-acme-http-solver-. This indicates that cert-manager is waiting on Let’s Encrypt to issue its challenge. You may also see a similarly named Kubernetes Service if you list those.
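
As a rough sketch of that check, the helper below lists any solver ingress or service in one go. The function name list_acme_solvers is made up for illustration, and it assumes a POSIX shell (bash, WSL or similar) with kubectl on the PATH; under PowerShell the underlying kubectl command is the same but the function wrapper would differ.

```shell
# Hypothetical helper: list cert-manager HTTP-01 solver resources in a namespace.
# Prints the matching lines, or "none" if no solver ingress/service exists.
list_acme_solvers() {
  ns="$1"
  # With several resource types, kubectl prefixes each name with its kind,
  # e.g. service/cm-acme-http-solver-XXXXX.
  found="$(kubectl get ingress,service --namespace "$ns" --no-headers 2>/dev/null \
    | grep 'cm-acme-http-solver' || true)"
  if [ -n "$found" ]; then
    printf '%s\n' "$found"
  else
    echo "none"
  fi
}
```

Calling list_acme_solvers XX (with your namespace) and getting "none" would mean either that the challenge has finished or that it was never created.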

You can also inspect the status of the CertificateRequest and the Certificate. This will indicate whether the request is approved or denied, and whether the certificate is ready. It may also describe what the current status is:

kubectl get certificaterequests --namespace XX
kubectl get certificates --namespace XX

Once the certificate is ready, cert-manager will automatically generate the secret server-tls which is being expected by the ingress, and which it is currently logging error messages about.
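
If you would rather not poll by hand, something like the following may help. It is only a sketch: wait_for_certificate is a hypothetical name, the 120s default timeout is an arbitrary assumption, and it relies on the Certificate exposing a Ready condition (which matches the READY column that kubectl get certificates prints).

```shell
# Hypothetical helper: block until the Certificate reports Ready, or time out.
# Usage: wait_for_certificate <namespace> [certificate-name] [timeout]
wait_for_certificate() {
  ns="$1"
  name="${2:-server-tls}"
  timeout="${3:-120s}"
  if kubectl wait --for=condition=Ready "certificate/$name" \
      --namespace "$ns" --timeout="$timeout" >/dev/null 2>&1; then
    echo "ready"
  else
    echo "not-ready"
  fi
}
```

For example, wait_for_certificate XX server-tls 300s prints "ready" once the certificate has been issued, or "not-ready" if five minutes pass without that happening.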

The provider we are using is letsencrypt-prod.

The only thing that does not look correct is that the READY status is False here:

PS C:\Users\shiangj> kubectl get certificates --namespace speckle
NAME         READY   SECRET       AGE
server-tls   False   server-tls   3h34m

The others look ok

PS C:\Users\shiangj> kubectl get certificaterequests --namespace speckle
NAME               APPROVED   DENIED   READY   ISSUER             REQUESTOR                                         AGE
server-tls-7t52c   True                False   letsencrypt-prod   system:serviceaccount:cert-manager:cert-manager   33m

PS C:\Users\shiangj> kubectl get ingress --namespace speckle
NAME             CLASS   HOSTS                   ADDRESS      PORTS     AGE
speckle-server   nginx   ldd-region.domain.com   ip_address   80, 443   3h35m

We can also resolve the domain to the IP address:

PS C:\Users\shiangj> resolve-dnsname ldd-region.domain.com

Name                    Type   TTL   Section   IPAddress
ldd-region.domain.com   A      630   Answer    ip_address

The IP address above is the external IP of the ingress-nginx-controller on AKS.

Are you able to reach the path exposed by the cm-acme-http-solver-XXXX ingress that cert-manager has created?

It should be something like ldd-region.domain.com/.well-known/acme-challenge/XXXXXXXXXXX, where XXXXX is a random string.

You should be able to find the path by getting the ingress, and then describing it.

kubectl get ingress --namespace XX
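
To make that concrete, here is a small sketch of describing the solver ingress and then requesting the challenge path from outside the cluster. The function names challenge_url and probe_challenge are hypothetical, the token is a placeholder you would copy from the describe output, and it assumes curl and a POSIX shell are available.

```shell
# First find the path on the solver ingress (name suffix is a placeholder):
#   kubectl describe ingress cm-acme-http-solver-XXXX --namespace XX

# Build the HTTP-01 challenge URL for a host and token.
challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Request the URL and print only the HTTP status code.
# curl reports 000 when no HTTP response was received (e.g. a connect timeout).
probe_challenge() {
  curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}' "$(challenge_url "$1" "$2")" || true
}
```

Running probe_challenge ldd-region.domain.com XXXXXXXXXXX from a machine outside the cluster approximates what Let’s Encrypt sees. A 200 suggests the path is reachable, while 000 would point towards a network or firewall problem in front of the ingress.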

Hi Iain, that service looks familiar, but for some reason it is no longer there. I am wondering why, and trying to work out what could have removed it. Below is an old extract from my console:

NAME                                TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
service/cm-acme-http-solver-sqp4f   NodePort   10.0.222.29                 8089:30055/TCP   11s

It looks like it has been replaced by the file import service... hmm.

PS C:\Users\shiangj> kubectl get all --namespace speckle
NAME                                              READY   STATUS    RESTARTS   AGE
pod/speckle-fileimport-service-7f58674bd5-tjb46   1/1     Running   0          6h6m
pod/speckle-frontend-656f98f5cb-m4jjl             1/1     Running   0          6h6m
pod/speckle-monitoring-799bdd7f56-4ddtl           1/1     Running   0          6h6m
pod/speckle-preview-service-54fccd98cc-sgmjs      1/1     Running   0          6h6m
pod/speckle-server-6d68ddccff-bxn2z               1/1     Running   0          6h6m
pod/speckle-webhook-service-6869cccf69-xws25      1/1     Running   0          6h6m

If it’s no longer there, then it may indicate that the Certificate has now been published and may be ready?

Can you check the Certificate and CertificateRequest status again, and also list the Secrets to see if server-tls has been created?
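
A minimal sketch of doing those checks in one step is below. The helper name tls_status is invented for this example, and it assumes the Certificate and the Secret share the same name (server-tls here, as in the output earlier in the thread).

```shell
# Hypothetical helper: summarise Certificate readiness and Secret presence.
# Usage: tls_status <namespace> [name]
tls_status() {
  ns="$1"
  name="${2:-server-tls}"
  # Read the status of the Ready condition from the Certificate, if any.
  ready="$(kubectl get certificate "$name" --namespace "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null || true)"
  # The Secret is only created once the certificate has been issued.
  if kubectl get secret "$name" --namespace "$ns" >/dev/null 2>&1; then
    secret="present"
  else
    secret="missing"
  fi
  echo "certificate Ready=${ready:-unknown} secret=$secret"
}
```

An output such as "certificate Ready=False secret=missing" would be consistent with the ingress controller still falling back to its default certificate.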

The pods you are showing look typical for a Speckle deployment. I think this issue is purely with the Ingress and cert-manager/Let’s Encrypt.

The acmesolver crashes with the following error:

I0323 16:03:58.456780 1 solver.go:87] cert-manager/acmesolver “msg”=“got successful challenge request, writing key” “base_path”=“/.well-known/acme-challenge” “host”=“domain.com” “path”=“/.well-known/acme-challenge/token” “token”=“token
Error: http: Server closed
Usage:
  acmesolver [flags]

Flags:
      --domain string     the domain name to verify
  -h, --help              help for acmesolver
      --key string        the challenge key to respond with
      --listen-port int   the port number to listen on for connections (default 8089)
      --token string      the challenge token to verify against

http: Server closed

From your log message and this similar issue, it may be an Azure networking problem with your particular configuration: "Letsencrypt acme cert challenges no longer working on AKS (nginx-ingress + cert-manager + clusterissuer + letsencrypt)", Issue #2955 in Azure/AKS on GitHub.

You may also be able to get more information about the current status of the challenge:

kubectl get challenges --namespace XX

and then:

kubectl describe challenge/XXXXX --namespace XX
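
If there are several challenges, a short summary can be quicker to read than the full describe output. The sketch below prints the name, state and reason of every challenge that is not yet valid. challenge_problems is a hypothetical name, and the field paths assume the state and reason fields that cert-manager records in the Challenge status.

```shell
# Hypothetical helper: list challenges that are not in the "valid" state,
# as tab-separated name, state and reason.
challenge_problems() {
  kubectl get challenges --namespace "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.state}{"\t"}{.status.reason}{"\n"}{end}' \
    | awk -F '\t' '$2 != "valid"'
}
```

The reason column usually carries the same message that appears at the bottom of kubectl describe, so it is a reasonable first thing to read.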

The bottom of the describe output for the challenge shows a "malformed: unable to update challenge" error.

Browsing to the URL in the output, I get: "Timeout during connect (likely firewall problem)", "status": 400

Like you said, this might be a problem at our end.
