Speckle Server Knex Error

Hello dear Specklers,

Our server has trouble getting started every morning :wink:. Joking aside … this is something that did not happen 1-2 months ago. A reboot of the VM (which runs 24/7) usually does the trick.

The server is set up with the latest docker-compose script and server version.

Brief info:

  • Our VM auto-updates
  • Our database is managed … auto-updates + backups
  • Our server is behind an additional nginx reverse proxy on a separate machine
    … which could potentially break the “flow” … but since previous versions of the server did not have this issue, I assume the server side itself is fine.

UPDATE:

  • Our test server, which is a different machine in a completely different configuration (no auto-updates, all DBs and the reverse proxy on the same machine, no backup process), has this issue too. We usually don’t pay much attention to it, and my colleague only works with it irregularly.

Below is some of the output from our server logs … some of the errors. I can provide a few more lines in a private chat, in case it helps. It is not an urgent issue, as we can reboot, but it would still be interesting to find a solution.

Best,
Alex

{"level":"info","time":"2023-10-25T04:37:37.881Z","req":{"id":"017ccf8e-d67a-42e6-9cd6-5883b365b96b","method":"GET","path":"/graphql","headers":{"content-type":"application/json","host":"127.0.0.1:3000","connection":"close","x-request-id":"017ccf8e-d67a-42e6-9cd6-5883b365b96b"}},"res":{"statusCode":null,"headers":{"x-request-id":"017ccf8e-d67a-42e6-9cd6-5883b365b96b","access-control-allow-origin":"*","x-ratelimit-remaining":499}},"responseTime":2967,"msg":"request aborted"}
{"level":"error","time":"2023-10-25T04:37:42.669Z","req":{"id":"7c31d04e-5fb4-47ee-9432-866807aecb6b","method":"GET","path":"/graphql","headers":{"content-type":"application/json","host":"127.0.0.1:3000","connection":"close","x-request-id":"7c31d04e-5fb4-47ee-9432-866807aecb6b"}},"authContext":{"auth":false},"req":{"id":"7c31d04e-5fb4-47ee-9432-866807aecb6b","method":"GET","headers":{"content-type":"application/json","host":"127.0.0.1:3000","connection":"close","x-request-id":"7c31d04e-5fb4-47ee-9432-866807aecb6b"}},"authContext":{"auth":false},"component":"graphql","graphql_operation_kind":"query","graphql_query":"{serverInfo{version}}","graphql_operation_value":"GQL query serverInfo","grqphql_operation_name":"GQL serverInfo","actionName":"query serverInfo","query-path":"serverInfo","err":{"type":"GraphQLError","message":"Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?","stack":"KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?\n    at Client_PG.acquireConnection (/speckle-server/node_modules/knex/lib/client.js:312:26)\n    at async Runner.ensureConnection (/speckle-server/node_modules/knex/lib/execution/runner.js:287:28)\n    at async Runner.run (/speckle-server/node_modules/knex/lib/execution/runner.js:30:19)\n    at async getServerInfo (/speckle-server/packages/server/dist/modules/core/services/generic.js:11:28)\n    at async Object.serverInfo (/speckle-server/packages/server/dist/modules/core/graph/resolvers/server.js:10:20)","locations":[{"line":1,"column":2}],"path":["serverInfo"]},"msg":"graphql error"}
{"level":"info","time":"2023-10-25T04:37:42.669Z","req":{"id":"7c31d04e-5fb4-47ee-9432-866807aecb6b","method":"GET","path":"/graphql","headers":{"content-type":"application/json","host":"127.0.0.1:3000","connection":"close","x-request-id":"7c31d04e-5fb4-47ee-9432-866807aecb6b"}},"authContext":{"auth":false},"req":{"id":"7c31d04e-5fb4-47ee-9432-866807aecb6b","method":"GET","headers":{"content-type":"application/json","host":"127.0.0.1:3000","connection":"close","x-request-id":"7c31d04e-5fb4-47ee-9432-866807aecb6b"}},"authContext":{"auth":false},"component":"graphql","graphql_operation_kind":"query","graphql_query":"{serverInfo{version}}","graphql_operation_value":"GQL query serverInfo","grqphql_operation_name":"GQL serverInfo","actionName":"query serverInfo","msg":"graphql response"}
{"level":"info","time":"2023-10-25T04:37:45.013Z","req":{"id":"5a449920-f165-493a-8df7-83de418ef6e0","method":"GET","path":"/graphql","headers":{"connection":"upgrade","host":"speckle-server:3000","x-forwarded-proto":"https","sec-websocket-key":"6RBkWfQHU5jcDPfCMniYYg==","sec-websocket-version":"13","sec-websocket-protocol":"graphql-ws","x-request-id":"5a449920-f165-493a-8df7-83de418ef6e0","x-speckle-client-ip":"85.178.159.255"}},"res":{"statusCode":400,"headers":{"x-request-id":"5a449920-f165-493a-8df7-83de418ef6e0","access-control-allow-origin":"*","x-ratelimit-remaining":499,"content-type":"text/html; charset=utf-8","content-length":"341","etag":"W/\"155-i0kS0ZxJHXUiOj0lodWvBLh17Z8\""}},"responseTime":5,"msg":"request completed"}

The “Knex: Timeout acquiring a connection. The pool is probably full.” message is probably the key here. Can you check the db server’s connection limit? Sometimes processes get stuck and keep sockets open to the db, thus blocking extra connections. There is a timeout of sorts, but afaik it is generous (maybe too generous? or maybe we don’t set one by default?).
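
For context on where that timeout lives: in knex, the pool size and the acquire timeout are set when the knex instance is created. The sketch below only shows knex’s standard options with its documented defaults, as an illustration rather than Speckle’s actual configuration; once every connection in the pool is checked out and never released, any further query fails with exactly the KnexTimeoutError seen in the logs above.

```typescript
import knex from 'knex'

// Illustrative knex setup showing where the pool limits live.
// Values are knex's documented defaults, not necessarily what
// speckle-server configures; the connection string is a placeholder.
const db = knex({
  client: 'pg',
  connection: process.env.POSTGRES_URL,
  pool: {
    min: 2,
    max: 10 // if all 10 connections are held and never released, new queries queue up
  },
  // How long a query waits for a free connection before throwing
  // KnexTimeoutError ("Timeout acquiring a connection. The pool is probably full.")
  acquireConnectionTimeout: 60000 // ms
})
```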

You could also check whether there are any long-running or idle queries.
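
A minimal sketch of both checks (current connections vs. the server-wide limit, and sessions that have been stuck in one state for a while), run with knex against the same Postgres instance the Speckle server uses. The connection string is a placeholder; use whatever credentials you normally connect with:

```typescript
import knex from 'knex'

// Placeholder connection string; point it at the database the Speckle server uses.
const db = knex({ client: 'pg', connection: process.env.POSTGRES_URL })

async function diagnose() {
  // 1. How close are we to the server-wide connection limit?
  const limit = await db.raw(`SHOW max_connections`)
  const count = await db.raw(`SELECT count(*) AS connections FROM pg_stat_activity`)
  console.log('max_connections:', limit.rows[0].max_connections)
  console.log('current connections:', count.rows[0].connections)

  // 2. Any sessions stuck in the same state (active, idle, idle in transaction)
  //    for more than five minutes?
  const stuck = await db.raw(`
    SELECT pid, usename, application_name, state,
           now() - state_change AS in_state_for,
           left(query, 80)      AS last_query
    FROM pg_stat_activity
    WHERE now() - state_change > interval '5 minutes'
    ORDER BY in_state_for DESC
  `)
  console.table(stuck.rows)

  await db.destroy()
}

diagnose().catch(console.error)
```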

Adding a connection pooler like PgBouncer could help, but again this depends on your setup.


The server itself can also provide some extra debugging information. It reports how many free connections it believes are available as Prometheus metrics, which are viewable at <serverurl>:3000/metrics. You can point a Prometheus-compatible metrics aggregator at this URL to scrape the metrics.

The Prometheus metric to look at is speckle_server_knex_free. If you have Grafana or similar, you can then plot it with a query such as avg_over_time(speckle_server_knex_free{namespace=~"speckle-xyz", app="server"}[2m]).

Another metric to look at is speckle_server_knex_pending, which will show how many queries are pending with knex.
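
If you don’t have a Prometheus/Grafana stack to hand, a quick way to watch those two metrics is to poll the /metrics endpoint directly. A minimal sketch, assuming Node 18+ (for the global fetch) and that the endpoint is reachable at the URL below from wherever you run it (adjust it to your <serverurl>):

```typescript
// Poll the Speckle server's Prometheus endpoint and print the knex pool metrics.
// The URL is a placeholder; substitute your own <serverurl>.
const METRICS_URL = 'http://localhost:3000/metrics'

async function printKnexMetrics(): Promise<void> {
  const res = await fetch(METRICS_URL)
  const body = await res.text()
  // Prometheus text format: one "<metric_name>{labels} <value>" per line.
  const wanted = ['speckle_server_knex_free', 'speckle_server_knex_pending']
  for (const line of body.split('\n')) {
    if (wanted.some((name) => line.startsWith(name))) {
      console.log(new Date().toISOString(), line)
    }
  }
}

// Print the pool metrics once, then every 30 seconds.
printKnexMetrics().catch(console.error)
setInterval(() => printKnexMetrics().catch(console.error), 30_000)
```

If the free-connection count drops to zero (or the pending count keeps climbing) in the run-up to the morning failures, that points at the pool-exhaustion scenario described above.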


I’ve also been having this issue. Restarting the VM works, but I’m thinking of abstracting the database to make it easier to manage.

I activated PgBouncer (mentioned by @dimitrie), essentially throwing a punch blindfolded :sweat_smile:, as it felt like the option with the smallest time investment. It did not help on my Azure managed database. Over the coming weeks I will switch to a Kubernetes setup so that I get FE2 :heart_eyes:, and with that I will also take a look at the metrics mentioned by @iainsproat.

When I find something out, I will let you know @imanwarsame.


@AlexHofbeck - I have started a PR which should enable Frontend-2 on Docker Compose environments, though it requires a bit of testing.

You may wish to try this instead of migrating to Kubernetes to get fe2: feat(frontend-2): deploy fe2 as default on docker-compose environments by iainsproat · Pull Request #1880 · specklesystems/speckle-server · GitHub

Accompanying changes to the documentation can be found here: chore(frontend-2): documentation to deploy fe2 as default by iainsproat · Pull Request #192 · specklesystems/speckle-docs · GitHub


Thanks very much, Iain :slight_smile:. Will take a look at it.


@iainsproat: Let me know when you have released it and I will set it up … super enthusiastic about it.


We’ve merged the changes into speckle-server; we now have to release a new version before the documentation and updated Docker Compose files will work.

Keep an eye on Releases · specklesystems/speckle-server · GitHub for when that happens. You can also subscribe to the speckle-server GitHub repository and configure notifications to alert you: Configuring notifications - GitHub Docs

Iain
