Speckle's PostgreSQL database not decreasing in size when models are deleted

adrii · 14 November 2024 09:53

Hi everyone! I have some specific questions about Speckle’s PostgreSQL setup.

I have a local deployment of the Speckle stack, and I’ve been trying to query geometrical data directly from PostgreSQL. So far, I’ve been exploring the tables and found that Speckle object data apparently resides within a table named objects in the database.

However, I noticed something strange: whenever I upload a new model, the number of rows in the objects table increases (as expected), but when I delete model(s) from Speckle, the number of rows does not decrease. This gives the impression that data from deleted models remains in the objects table. If left unchecked, this would cause the database size to keep increasing. Below are some screenshots:

Before upload:

After upload:

After deleting the model from Speckle (the objects table size stayed the same):

So my question would be: is this an expected behaviour? If not, what can I do to fix it?

Thank you in advance!

iainsproat · 14 November 2024 10:18

Hi @adrii

Welcome to the community; this is a great question.

This is by design; objects will only be deleted when the project is deleted.

We do this to minimise the amount of data that is sent when multiple models share common components, which is a common situation in architectural options studies. We therefore do not want the deletion of one model to remove data used by another model.

We trade off some database storage size in order to improve send performance. I hope this design decision makes sense.
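The trade-off described above can be illustrated with a toy content-addressed store. This is only a sketch under stated assumptions, not Speckle's implementation: it assumes each object is identified by a hash of its content, and the `ObjectStore` class, the `send` method and the sample objects are all hypothetical names made up for the example.

```python
from __future__ import annotations

import hashlib
import json


def object_id(obj: dict) -> str:
    """Derive an ID from the object's content, so identical objects share an ID."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ObjectStore:
    """Hypothetical stand-in for a project-wide objects table keyed by content hash."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def send(self, objects: list[dict]) -> tuple[list[str], int]:
        """Store the objects; return their IDs and how many had to be uploaded."""
        ids: list[str] = []
        uploaded = 0
        for obj in objects:
            oid = object_id(obj)
            if oid not in self.rows:  # already stored, so nothing to transfer
                self.rows[oid] = obj
                uploaded += 1
            ids.append(oid)
        return ids, uploaded


store = ObjectStore()
wall = {"type": "Wall", "length": 5.0}
door = {"type": "Door", "width": 0.9}
window = {"type": "Window", "width": 1.2}

option_a, uploaded_a = store.send([wall, door])    # both objects are new
option_b, uploaded_b = store.send([wall, window])  # the wall is reused
print(uploaded_a, uploaded_b, len(store.rows))
```

In this sketch the second model uploads only one object because the wall is already stored. Deleting the rows of the first model outright would also remove the wall that the second model still refers to, which is the situation the design avoids.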

Finally, depending on how your Postgres database is configured, it may take a while for the cascading deletion of a project to complete and for the resulting compaction of the data to be reflected in the database size.
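The gap between deleting rows and seeing the storage shrink can be reproduced in miniature. The sketch below uses SQLite from the Python standard library purely as a stand-in, because it needs no server; the `projects` and `objects` tables and their columns are assumptions for the example, not Speckle's schema. In PostgreSQL the analogous behaviour is that deleted rows leave dead tuples behind, a plain `VACUUM` (or autovacuum) makes that space reusable, and `VACUUM FULL` rewrites the table so the space is returned to the operating system.

```python
import os
import sqlite3
import tempfile


def page_count(conn: sqlite3.Connection) -> int:
    """Number of pages currently allocated to the database file."""
    return conn.execute("PRAGMA page_count").fetchone()[0]


def vacuum_demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        # isolation_level=None gives autocommit, so VACUUM is not run inside a transaction.
        conn = sqlite3.connect(os.path.join(tmp, "demo.db"), isolation_level=None)
        try:
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
            conn.execute("CREATE TABLE projects (id TEXT PRIMARY KEY)")
            conn.execute(
                "CREATE TABLE objects ("
                " id TEXT PRIMARY KEY,"
                " project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,"
                " data TEXT)"
            )
            conn.execute("INSERT INTO projects VALUES ('p1')")
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO objects VALUES (?, 'p1', ?)",
                [(f"obj{i}", "x" * 2000) for i in range(500)],
            )
            conn.execute("COMMIT")
            pages_loaded = page_count(conn)

            # Deleting the project cascades to its objects, but the file keeps its pages.
            conn.execute("DELETE FROM projects WHERE id = 'p1'")
            rows_after_delete = conn.execute("SELECT count(*) FROM objects").fetchone()[0]
            pages_after_delete = page_count(conn)

            # Compaction is a separate step that rebuilds the file without the free pages.
            conn.execute("VACUUM")
            pages_after_vacuum = page_count(conn)
        finally:
            conn.close()
    return {
        "pages_loaded": pages_loaded,
        "rows_after_delete": rows_after_delete,
        "pages_after_delete": pages_after_delete,
        "pages_after_vacuum": pages_after_vacuum,
    }


result = vacuum_demo()
print(result)
```

The row count drops to zero as soon as the project is deleted, while the page count only falls after the explicit compaction step.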

Iain

adrii · 14 November 2024 10:38

Hi Iain. First, thank you for your reply!

“This is by design; objects will only be deleted when the project is deleted.”

That’s actually very clear, and it does make sense.

However, practically, it’s a bit surprising… For example, suppose I have designed a simple house in Revit; using the connector, what I usually do is upload the house to a specific model within a specific project. Later on, whenever I make changes, I’ll upload it again to that specific model as a different version. So, while I understand the importance of sharing components, I assumed it would apply between versions, not models.

Returning to your example on architectural studies, let’s say I’ve designed several models and ultimately decide on just one model. I’d delete the other models from Speckle and keep only the one I want. It’s surprising that the data for the other models would remain (even if there were likely no shared components).

Is there an option in Speckle’s configuration to enable deletion of objects linked to a model when that model is deleted, provided the objects are not shared with other models?
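To make the request concrete, here is a minimal sketch of the check such an option would need to perform. It is hypothetical and does not describe an existing Speckle feature: it assumes the set of object IDs used by each model is already known, and the function name and sample data are invented for the example.

```python
from __future__ import annotations


def objects_safe_to_delete(models: dict[str, set[str]], model_to_delete: str) -> set[str]:
    """Return the object IDs used by `model_to_delete` and by no other model."""
    candidates = models[model_to_delete]
    still_used: set[str] = set()
    for name, ids in models.items():
        if name != model_to_delete:
            still_used |= ids
    return candidates - still_used


# Hypothetical project with three design options.
models = {
    "option-a": {"wall", "door"},
    "option-b": {"wall", "window"},
    "option-c": {"roof"},
}

removable = objects_safe_to_delete(models, "option-b")
print(sorted(removable))
```

Here only the window could be removed along with option-b, because the wall is still used by option-a. In a real deployment the per-model sets would have to be computed across every version of every remaining model, which is part of the cost of such a cleanup.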

iainsproat · 14 November 2024 11:11

Sharing objects across the whole project, rather than only between versions of one model, makes any future sends much, much faster. We’ve heard many times that send performance is very, very important to our users, so we’ve optimised for it. Storage space has generally been cheap and available, so we’ve traded it to achieve better performance on other criteria.

As for a configuration option to delete a model’s unshared objects: at the moment, no. Do you have specific constraints on database storage size, or perhaps a use case which is creating a prolific number of models? I’d like to understand a little more about what you would like to achieve, and why. This will help us to understand the problem space and develop options.

Iain

adrii · 14 November 2024 11:30

“Do you have specific constraints on database storage size, or perhaps a use case which is creating a prolific number of models?”

Yes, as I mentioned before, I’d conduct studies (both architectural and engineering), so I’d create many models with numerous versions within the same project. I can’t go into specifics right now, but imagine a construction project with several (candidate) buildings as models and various versions of them.

“Storage space has generally been cheap and available”

Yes, but not in my case currently, as I’m hosting the Speckle deployment locally on my own server, so storage space is actually quite limited. Also, as I noted in my first post, I’ve been querying geometrical data directly from the database by accessing the objects table. Having additional rows (including object data for models that are no longer present) means I’d need to perform extra checks, spend more processing time, and so on.
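One form the extra checks could take is a reachability filter: start from the root objects of the versions that still exist and keep only the rows that can be reached from them. The sketch below is an assumption-laden illustration, not a description of how Speckle stores references. It assumes the rows have already been loaded into a mapping from object ID to child IDs, and `reachable_ids`, `rows` and the sample IDs are hypothetical.

```python
from collections import deque


def reachable_ids(rows: dict, roots: list) -> set:
    """Breadth-first walk from the root objects through their child references."""
    seen = set()
    queue = deque(roots)
    while queue:
        oid = queue.popleft()
        if oid in seen or oid not in rows:
            continue  # already visited, or the reference points at a missing row
        seen.add(oid)
        queue.extend(rows[oid])
    return seen


# Hypothetical table contents: object ID -> IDs of the objects it references.
rows = {
    "house-v2": ["wall", "roof"],
    "wall": ["brick"],
    "roof": [],
    "brick": [],
    "old-garage": ["garage-door"],  # belonged to a model that has been deleted
    "garage-door": [],
}

live = reachable_ids(rows, roots=["house-v2"])
orphaned = set(rows) - live
print(sorted(live), sorted(orphaned))
```

Rows outside the live set are the ones a direct query would need to skip.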

But anyway, I think I’ve gotten a clear answer to my original question. Sorry for branching out. Thank you for your replies!