Speckle on AKS connect ECONNREFUSED 127.0.0.1:5432

Hi All,
Please may I have your assistance with getting speckle up and running on Azure Kubernetes. The status of the pods are as follows
NAME READY STATUS RESTARTS AGE
pod/speckle-frontend-7945f8648-4tvfb 1/1 Running 0 4d21h
pod/speckle-monitoring-5c59c9c78d-cgmvx 1/1 Running 0 4d21h
pod/speckle-preview-service-85576b6797-vg7w7 1/1 Running 1105 (8m25s ago) 4d21h
pod/speckle-server-7f65d48d55-zh4nj 0/1 CrashLoopBackOff 1128 (5m4s ago) 4d
pod/speckle-webhook-service-6f69bf9fbf-qzl4l 0/1 CrashLoopBackOff 872 (2m20s ago) 4d

When I start a pod in debug mode

PS C:\Users\shiangj> kubectl debug node/aks-agentpool-31857045-vmss000002 -it --image=speckle/speckle-webhook-service:2.12.5-alpha.27056 --env=ā€œREDIS_URL=my_azure_redisu_rl.windows.net:6380,password=my_redis_password,ssl=True,abortConnect=Falseā€ --env=ā€œPOSTGRES_URL=ā€œpsql host=postgres_host_name.postgres.database.azure.com port=5432 dbname=speckle user=postgres@postgres_host_name password=my_postgres_password sslmode=requireā€ā€
Creating debugging pod node-debugger-aks-agentpool-31857045-vmss000002-m89dq with container debugger on node aks-agentpool-31857045-vmss000002.
If you donā€™t see a command prompt, try pressing enter.
{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-13T16:39:40.160Zā€,ā€œcomponentā€:ā€œwebhook-serviceā€,ā€œerrā€:{ā€œtypeā€:ā€œErrorā€,ā€œmessageā€:ā€œconnect ECONNREFUSED 127.0.0.1:5432ā€,ā€œstackā€:ā€œError: connect ECONNREFUSED 127.0.0.1:5432\n at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)ā€,ā€œerrnoā€:-111,ā€œcodeā€:ā€œECONNREFUSEDā€,ā€œsyscallā€:ā€œconnectā€,ā€œaddressā€:ā€œ127.0.0.1ā€,ā€œportā€:5432},ā€œmsgā€:ā€œError executing taskā€}
Error from server (NotFound): pods ā€œhost=pg-ldd-dev-uksouth.postgres.database.azure.comā€ not found

PS C:\Users\shiangj> kubectl debug node/aks-agentpool-31857045-vmss000002 -it --image=speckle/speckle-server:2.12.5-alpha.27056 --env=ā€œREDIS_URL=speckle-lld.redis.cache.windows.net:6380,password=my_redis_password=,ssl=True,abortConnect=Falseā€ --env=ā€œPOSTGRES_URL=ā€œpsql host=pg-ldd-dev-uksouth.postgres.database.azure.com port=5432 dbname=speckle user=postgres@pg-ldd-dev-uksouth password=my_postgres_password sslmode=requireā€ā€

Creating debugging pod node-debugger-aks-agentpool-31857045-vmss000002-vfp82 with container debugger on node aks-agentpool-31857045-vmss000002.
If you donā€™t see a command prompt, try pressing enter.
{ā€œlevelā€:ā€œinfoā€,ā€œtimeā€:ā€œ2023-03-13T16:51:41.335Zā€,ā€œphaseā€:ā€œdb-startupā€,ā€œmsgā€:ā€œLoaded knex conf for productionā€}
{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-13T16:51:41.818Zā€,ā€œerrā€:{ā€œtypeā€:ā€œErrorā€,ā€œmessageā€:ā€œconnect ECONNREFUSED 127.0.0.1:5432ā€,ā€œstackā€:ā€œError: connect ECONNREFUSED 127.0.0.1:5432\n at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)ā€,ā€œerrnoā€:-111,ā€œcodeā€:ā€œECONNREFUSEDā€,ā€œsyscallā€:ā€œconnectā€,ā€œaddressā€:ā€œ127.0.0.1ā€,ā€œportā€:5432},ā€œmsgā€:ā€œconnect ECONNREFUSED 127.0.0.1:5432ā€}
{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-13T16:51:48.133Zā€,ā€œcomponentā€:ā€œredisā€,ā€œerrā€:{ā€œtypeā€:ā€œErrorā€,ā€œmessageā€:ā€œread ECONNRESETā€,ā€œstackā€:ā€œError: read ECONNRESET\n at TCP.onStreamRead (node:internal/stream_base_commons:217:20)ā€,ā€œerrnoā€:-104,ā€œcodeā€:ā€œECONNRESETā€,ā€œsyscallā€:ā€œreadā€},ā€œmsgā€:ā€œRedis encountered an error.ā€}

/speckle-server/packages/server/dist/modules/shared/redis/redis.js:18
throw new errors_1.EnvironmentResourceError(ā€˜Redis encountered an error.ā€™, err); //FIXME backoff and retry?
^
EnvironmentResourceError: Redis encountered an error.: read ECONNRESET
at EventEmitter. (/speckle-server/packages/server/dist/modules/shared/redis/redis.js:18:23)
at EventEmitter.emit (node:events:513:28)
at EventEmitter.emit (node:domain:489:12)
at EventEmitter.silentEmit (/speckle-server/node_modules/ioredis/built/Redis.js:460:30)
at Socket. (/speckle-server/node_modules/ioredis/built/redis/event_handler.js:189:14)
at Object.onceWrapper (node:events:628:26)
at Socket.emit (node:events:525:35)
at Socket.emit (node:domain:489:12)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3) {
jse_shortmsg: ā€˜Redis encountered an error.ā€™,
jse_cause: Error: read ECONNRESET
at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
errno: -104,
code: ā€˜ECONNRESETā€™,
syscall: ā€˜readā€™
},
jse_info: { code: ā€˜ENVIRONMENT_RESOURCE_ERRORā€™ },
cause: [Function: ve_cause]
}

Node.js v18.14.1
Error from server (NotFound): pods ā€œhost=pg-ldd-dev-uksouth.postgres.database.azure.comā€ not found

The strange thing is its trying to connect using local host (127.0.0.1) when Iā€™ve clearly specified the use of the host name. Does this mean my connection string is not the right type for Speckle? Iā€™ve used the same string to connect from a client successfully. Or is it because it cannot resolve the address and by default it will use localhost?
Iā€™m also getting an issue with connecting to Redis (Azure Redis Cache), my feeling is these are related.
Any help would be greatly appreciated
Kind Regards
John

Hi @shiangoli - the POSTGRES_URL and REDIS_URL should be in URI (RFC 3986) format, e.g. postgresql://user:pass@host:port/database?sslmode=require
More info here:

We should improve our documentation, as itā€™s confusing that Postgres has two formats.

Iain

I updated our documentation for the Helm Chart recently, which includes an example in the description for the parameters db.connectionString.secretName and redis.connectionString.secretName. Hopefully this makes it clearer.
https://specklesystems.github.io/helm/#postgres-database
https://specklesystems.github.io/helm/#redis-store

Hi Iain, thank you for sharing the correct way for configuring the connection to postgres. Unfortunately cannot get it to work
I consistently get the following error

Please check the Username and retry connection. The Username should be in username@hostname format.",ā€œstackā€:ā€œerror: Invalid Username specified. Please check the Username and retry connection. The Username should be in username@hostname format.\n at Parser.parseErrorMessage (/speckle-server/node_modules/pg-protocol/dist/parser.js:287:98)\n at Parser.handlePacket (/speckle-server/node_modules/pg-protocol/dist/parser.js:126:29)\n at Parser.parse (/speckle-server/node_modules/pg-protocol/dist/parser.js:39:38)\n at TLSSocket. (/speckle-server/node_modules/pg-protocol/dist/index.js:11:42)\n at TLSSocket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:324:12)\n at readableAddChunk (node:internal/streams/readable:297:9)\n at Readable.push (node:internal/streams/readable:234:10)\n at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)ā€,ā€œlengthā€:185,ā€œnameā€:ā€œerrorā€,ā€œseverityā€:ā€œFATALā€,ā€œcodeā€:ā€œ28000ā€,ā€œfileā€:ā€œauth.cā€,ā€œlineā€:ā€œ481ā€,ā€œroutineā€:ā€œClientAuthenticationā€},ā€œmsgā€:ā€œError executing taskā€}

Iā€™ve tried all manner of combinations and below are some examples but the error is the same

postgresql://postgres@pg-ldd-dev-uksouth.postgres.database.azure.com:5432/speckle?password=password&sslmode=require
postgresql://pg-ldd-dev-uksouth.postgres.database.azure.com:5432/speckle?password=password&user=postgres@pg-ldd-dev-uksouth&sslmode=require
postgresql://pg-ldd-dev-uksouth.postgres.database.azure.com:5432/speckle?password=password&user=postgres@pg-ldd-dev-uksouth.postgres.database.azure.com&sslmode=require

It seems to complain about the format of the username
The username is certainly postgres
the host is pg-ldd-dev-uksouth.postgres.database.azure.com
Iā€™ve tried the username without the full extension (postgres@pg-ldd-dev-uksouth) of the host as illustrated above as well as postgres@pg-ldd-dev-uksouth.postgres.database.azure.com and with the 1st method. Is there something in Speckle that does not support the full name?

That one fooled me too last week ā€¦ Microsoft :slight_smile:

I went in with this for the postgres url ā€¦ but with docker and not AKS:
Url exactly like what @iainsproat mentioned

postgres://username:password@ressourcename.postgres.database.azure.com/databasename?sslmode=require

My setting for the Username:

The database name:

Hope this helps you on your matter

1 Like

Another problem with Microsoft is that for some reason the username provided by Azure is not url compliant, and is in the format user@host.

But that @ is misinterpreted by postgres (and presumably all other url parsing libraries!) as the @ separating the password from the host in the postgres connection string, and this results in the error you experienced.

To workaround Azureā€™s requirements, we have to url encode the @ within the Azure username, which becomes %40, and the azure username becomes user%40host

So the connection string in full becomes:
postgresql://user%40host:password@host:port/database?sslmode=require

Iā€™ve also experienced Azure producing non-compliant passwords, which may include @ or other symbols that are misinterpreted by postgres. Again, you will need to URL encode your password with the relevant symbol: URL encoding - Wikipedia

Hope this helps!

Thank you Iain
I tried using the following connection string following your suggestion
postgresql://postgres%40pg-ldd-dev-uksouth.postgres.database.azure.com:password%40pg-ldd-dev-uksouth.postgres.database.azure.com:5432/speckle?sslmode=require

Error message is as follows, which is different :slight_smile:

{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-16T12:56:00.804Zā€,ā€œcomponentā€:ā€œwebhook-serviceā€,ā€œerrā€:{ā€œtypeā€:ā€œErrorā€,ā€œmessageā€:ā€œgetaddrinfo ENOTFOUND postgresā€,ā€œstackā€:ā€œError: getaddrinfo ENOTFOUND postgres\n at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)ā€,ā€œerrnoā€:-3008,ā€œcodeā€:ā€œENOTFOUNDā€,ā€œsyscallā€:ā€œgetaddrinfoā€,ā€œhostnameā€:ā€œpostgresā€},ā€œmsgā€:ā€œError executing taskā€}

Its as if it doesnā€™t recognise the address
Also my password does not contain non alphanumeric characters
Any thoughts?

Thank you

Almost, the @ between password and the host needs to be left as an @, as itā€™s actually part of the url syntax and not content!

postgresql://postgres%40pg-ldd-dev-uksouth.postgres.database.azure.com:*password*@pg-ldd-dev-uksouth.postgres.database.azure.com:5432/speckle?sslmode=require

Hi Iain, Woo hoo :slight_smile: we have a connection!!, however what I have noticed it expects objects in the database please see below and wondering I need to deploy a schema which I didnā€™t see in the instructions. Or Iā€™m looking in the wrong place?

{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-16T16:34:25.967Zā€,ā€œcomponentā€:ā€œwebhook-serviceā€,ā€œerrā€:{ā€œtypeā€:ā€œDatabaseErrorā€,ā€œmessageā€:ā€œ\n UPDATE webhooks_events\n SET \n "status" = 1,\n "lastUpdate" = NOW()\n FROM (\n SELECT "id" FROM webhooks_events\n WHERE "status" = 0\n ORDER BY "lastUpdate" ASC\n LIMIT 1\n ) as task\n WHERE webhooks_events."id" = task."id"\n RETURNING webhooks_events."id"\n - relation "webhooks_events" does not existā€,ā€œstackā€:ā€œerror: \n UPDATE webhooks_events\n SET \n "status" = 1,\n "lastUpdate" = NOW()\n FROM (\n SELECT "id" FROM webhooks_events\n WHERE "status" = 0\n ORDER BY "lastUpdate" ASC\n LIMIT 1\n ) as task\n WHERE webhooks_events."id" = task."id"\n RETURNING webhooks_events."id"\n - relation "webhooks_events" does not exist\n at Parser.parseErrorMessage (/speckle-server/packages/webhook-service/node_modules/pg-protocol/dist/parser.js:287:98)\n at Parser.handlePacket (/speckle-server/packages/webhook-service/node_modules/pg-protocol/dist/parser.js:126:29)\n at Parser.parse (/speckle-server/packages/webhook-service/node_modules/pg-protocol/dist/parser.js:39:38)\n at TLSSocket. (/speckle-server/packages/webhook-service/node_modules/pg-protocol/dist/index.js:11:42)\n at TLSSocket.emit (node:events:513:28)\n at addChunk (node:internal/streams/readable:324:12)\n at readableAddChunk (node:internal/streams/readable:297:9)\n at Readable.push (node:internal/streams/readable:234:10)\n at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)ā€,ā€œlengthā€:114,ā€œnameā€:ā€œerrorā€,ā€œseverityā€:ā€œERRORā€,ā€œcodeā€:ā€œ42P01ā€,ā€œpositionā€:ā€œ13ā€,ā€œfileā€:ā€œparse_relation.cā€,ā€œlineā€:ā€œ1180ā€,ā€œroutineā€:ā€œparserOpenTableā€},ā€œmsgā€:ā€œError executing taskā€}

Great news!

This new error is strange, as Speckle should create the entire database schema automatically.
Iā€™m wondering if it was a race-condition on startup, where the webhook service started before the server had created the schema in the database.

Can you try restarting the webhook-service deployment, and does the error re-occur?

kubectl rollout restart deployment/speckle-webhook-service --namespace your_namespace

(your deployment name and namespace name will/may vary slightly from the above example).

1 Like

Hi Iain, unfortunately that made no difference and wondering the error with connecting to redis is probably preventing the deployment of the schema. So I think I need to resolve this 1st

redis_url=ā€œredis://speckle-lld:my_password%40speckle-lld.redis.cache.windows.net:6380/0ā€

PS C:\Users\shiangj> kubectl logs pod/speckle-server-5d86cb95bc-v5j89
{ā€œlevelā€:ā€œinfoā€,ā€œtimeā€:ā€œ2023-03-17T08:47:07.446Zā€,ā€œphaseā€:ā€œdb-startupā€,ā€œmsgā€:ā€œLoaded knex conf for productionā€}
{ā€œlevelā€:ā€œerrorā€,ā€œtimeā€:ā€œ2023-03-17T08:47:08.048Zā€,ā€œcomponentā€:ā€œredisā€,ā€œerrā€:{ā€œtypeā€:ā€œErrorā€,ā€œmessageā€:ā€œgetaddrinfo ENOTFOUND speckle-lldā€,ā€œstackā€:ā€œError: getaddrinfo ENOTFOUND speckle-lld\n at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)ā€,ā€œerrnoā€:-3008,ā€œcodeā€:ā€œENOTFOUNDā€,ā€œsyscallā€:ā€œgetaddrinfoā€,ā€œhostnameā€:ā€œspeckle-lldā€},ā€œmsgā€:ā€œRedis encountered an error.ā€}

/speckle-server/packages/server/dist/modules/shared/redis/redis.js:18
throw new errors_1.EnvironmentResourceError(ā€˜Redis encountered an error.ā€™, err); //FIXME backoff and retry?
^
EnvironmentResourceError: Redis encountered an error.: getaddrinfo ENOTFOUND speckle-lld
at EventEmitter. (/speckle-server/packages/server/dist/modules/shared/redis/redis.js:18:23)
at EventEmitter.emit (node:events:513:28)
at EventEmitter.emit (node:domain:489:12)
at EventEmitter.silentEmit (/speckle-server/node_modules/ioredis/built/Redis.js:460:30)
at Socket. (/speckle-server/node_modules/ioredis/built/redis/event_handler.js:189:14)
at Object.onceWrapper (node:events:628:26)
at Socket.emit (node:events:525:35)
at Socket.emit (node:domain:489:12)
at emitErrorNT (node:internal/streams/destroy:151:8)
at emitErrorCloseNT (node:internal/streams/destroy:116:3) {
jse_shortmsg: ā€˜Redis encountered an error.ā€™,
jse_cause: Error: getaddrinfo ENOTFOUND speckle-lld
at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26) {
errno: -3008,
code: ā€˜ENOTFOUNDā€™,
syscall: ā€˜getaddrinfoā€™,
hostname: ā€˜speckle-lldā€™
},
jse_info: { code: ā€˜ENVIRONMENT_RESOURCE_ERRORā€™ },
cause: [Function: ve_cause]
}

Node.js v18.14.1

I think it should be redis://speckle-lld:my_password@speckle-lld.redis.cache.windows.net:6380/0. As with postgres, the url encoding of the symbol is only required if the @ is within the username or password.

The Redis misconfiguration shouldnā€™t have caused an issue with Postgres though. If you havenā€™t added any data to Speckle, Iā€™d suggest stopping Speckle (kubectl scale deployment/speckle-server ---namespace=your_namespace --replicas=0) , wiping/deleting the entire database and then restarting it (kubectl scale deployment/speckle-server ---namespace=your_namespace --replicas=1). Speckle will then re-apply the database schema.

Itā€™s also possible to clean the single database migrations table, if you donā€™t want to delete the entire database, and that will also cause the database schema migrations to be applied again. But Iā€™d go for the entire database clean if you have the opportunity.