What was the argument for postgres?

Hey folks at the speckle team! :slight_smile:

I am wondering, mainly out of curiosity and trying to challenge our own architectural decisions: Why did you choose SQL / PostgreSQL as the basis of data management on the server?

  1. The speckle objects themselves are mainly nested jsons, right?
    • using mongoDB objects could be queried a lot more flexibly and into depth
    • I saw, a while ago in the graphQL query explorer, there is the ability to query the data on a commit which evaluated the JSON field on the postgres by some query (which is not database native for SQL, right?).
  2. relationships between data are fairly simple
    • i.e. commits relate to each other and contain objects
    • git, after all, uses and object based storage implementation as well

Looking forward to hear your arguments :slight_smile:

And as always,
Keep up the great work of making this AEC a bit more nerdy :slight_smile:

3 Likes

Hey @maxmodugen!

This is an interesting topic and I think we’re all happy to start some convo around it. Mostly @dimitrie can give you the OG reasons.

But here are some pro PostgreSQL points:

  • While the Speckle base objects are primarily structured as nested models, our object model abstraction kit and (de)composition API-s add a lot of extra optimizations, that are geared towards solving the 3D data communications.
  • Detached attributes are stored as separate objects to make all of our de-duplication logic work.
  • These above make it so, that most real world project data would not fit the single document model or size constraint (16 MB iirc ), that MongoDB has and we would need our own custom logic on top to be able to query data in a meaningful way.
  • The JSON(b) support of PostgreSQL is actually a lot more powerful than most ppl give it credit for.
  • Thanks to detaching we can also do some quite parallelized object data streaming from the server to clients instead of reading and serving 1 big document.
  • Extra benefit, that our clients can keep a local cache of objects using SQLite in the same structure as the server does so we can leverage that, to reduce the amount of data that we need to constantly transfer over the wire.

I’m sure there are more pros and are many cons to this approach too. I’m keen to hear them from you :slight_smile:

4 Likes

Hi @gjedlicska ,

thanks for this neat summary!
I did not have the decomposition in mind, but it makes a lot of sense. Also the other reasons are quite reasonable.

Don’t have anything just yet that could “disprove” your point or enhance the discussion :slight_smile:

Best,
Max

Thanks @gjedlicska - I honestly don’t have much to add besides some colour/context!

2.0 was started with performance optimisations in mind as well as more freedom to structure data arbitrarily. Main 1.0 limitation was that everything needed to be a flat list of objects - hence the 2.0 decomposable tree approach.

I chose Postgres for the Speckle Server because it’s fast, stable, and generally an amazing piece of reliable tech that’s been around for ages. All cloud providers have it as a vanilla DBaaS offering, so self-hosting your own instance shouldn’t be a problem or more costly than it needs to be, even at large scales. It’s a shark, not a dinosaur :sunglasses:

I do want to emphasise one thing though - PostgreSQL is the db of choice for the Speckle Server. Transports can write to/from any other DB or medium.

PS: I’m also a big (aspiring) fan of https://boringtechnology.club/

2 Likes