Core 2.0: Decomposition API

dimitrie · 22 August 2020 12:13

Preamble

Good morning everyone in the @Speckle_Insider! The previous post covered the new base Speckle object. This one covers an equally important concern, namely how Speckle deals with the structure and composition of design data.

Blobs and Trees

In 1.0, Speckle focused mostly on efficiently storing flat lists of objects - geometry and metadata. These objects could have any amount of nesting in and of themselves, but this would not be reflected in the storage layer (they would, ultimately, be “json blobs” grouped together under a stream).

This scenario worked well, as it’s quite similar to the way most CAD applications expose “scene” information. Nevertheless, as you can well imagine, design data is relational: built elements link to each other. How do they relate to each other? Well, it depends: there’s no canonical way of structuring data.

For example, a Site object may contain one or more Building objects. These, in turn, may contain one or more Levels, and each of these Levels may contain some Wall, Floor and Beam elements. Or, those Building objects may each individually reference the same Site. The Building object then holds all the Walls and Beams directly, and each of them, rather than being grouped under a Level, individually references a specific Level.

In order to solve this, I’ve taken a good look at how git, the ubiquitous version control system, works. Among its object types, git has blobs and trees. Blobs represent file contents, and an important characteristic is that they are immutable - just like objects are in Speckle. Trees represent folders, and they are used to track the hierarchy of the folder structure of your repository.

Spoilers 🤫

If you think this analogy hints at more, you’re right: Speckle is becoming a fully fledged version control system for design data. Keep an eye out!

Unlike Git, which deals with data structured on your hard drive, Speckle deals with data structured in memory. In Speckle, an object is, at the same time, both a blob (data) and a tree (its subcomponents). There are two separate, yet interwoven, problems (and their corollaries) that I’ll now address:

1. How to decompose an object? (Corollary: how to recompose an object?)
2. How to serialise, transport and store that decomposed object? (Corollary: how to retrieve and deserialise an object?) - this will be tackled in a follow-up post!

Decomposing

There is no right or wrong way to structure design data. Different applications operate with different models, different use-cases require different hierarchies, different disciplines and professionals operate with differently structured ontologies.

Speckle now has a mechanism by which, in the process of specifying an object - either via a strongly typed Kit, or via dynamic properties - developers (and end-users) can decide what gets decomposed and what doesn’t.

Strongly Typed Detachment

In the case of strongly typed properties, it works by adding the Speckle-specific attribute [Detach] to the properties you want to store as references, rather than within the object itself.

Let’s take an imaginary example:

public class Building : Base { 
  [Detach] // this attribute tells Speckle to store the value of the Site separately. 
  public Site Site { get; set; }
  public List<Level> Levels { get; set; } 
  public Owner Owner { get; set; } // Owner is another imaginary class; not detached, so it is stored inline
}

public class Level : Base {
  public double height { get; set; } = 3.2; 
  public double baseElevation { get; set; } = 0;
	
  [Detach]
  public List<Base> Elements { get; set; } // The actual walls, floors, columns, etc.
}

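// Define a site globally
var mySite = new Site();

// Two buildings (declared here so the snippet is complete)
var buildingA = new Building();
var buildingB = new Building();

// Reference the same site in both buildings.
buildingA.Site = mySite;
buildingB.Site = mySite;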

When an instance of the Building class gets saved, the Site will be stored separately. If two Buildings share the same Site, it will be stored only once, and each instance of the Building class will hold a reference to the same Site.
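To make the "stored only once" part concrete, here is a minimal sketch of the idea, not Speckle's actual implementation. It assumes a hypothetical in-memory store keyed by a hash of the serialised content; the names DetachSketch, Store, Save and the referencedId field are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical stand-in for a storage layer: content hash -> serialised object.
public static class DetachSketch
{
    public static readonly Dictionary<string, string> Store = new Dictionary<string, string>();

    // Saves a serialised object under the hash of its content and returns that hash.
    public static string Save(string serialised)
    {
        using (var sha = SHA256.Create())
        {
            var bytes = sha.ComputeHash(Encoding.UTF8.GetBytes(serialised));
            var id = BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
            Store[id] = serialised; // identical content always lands on the same key
            return id;
        }
    }

    public static void Main()
    {
        var siteJson = "{\"type\":\"Site\",\"name\":\"Plot 7\"}";

        // The detached Site is saved on its own; saving it again yields the same id.
        var siteIdFromA = Save(siteJson);
        var siteIdFromB = Save(siteJson);

        // Each Building only carries a reference (assumed field name) to the Site.
        Save("{\"type\":\"Building\",\"name\":\"A\",\"Site\":{\"referencedId\":\"" + siteIdFromA + "\"}}");
        Save("{\"type\":\"Building\",\"name\":\"B\",\"Site\":{\"referencedId\":\"" + siteIdFromB + "\"}}");

        Console.WriteLine(siteIdFromA == siteIdFromB); // True
        Console.WriteLine(Store.Count);                // 3: one Site, two Buildings
    }
}
```

Keying by content hash is just one way to get this deduplication; the point is only that both Buildings end up pointing at a single stored Site.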

In our example, each Level can hold, in a detachable property, all its walls, beams, ducts, pipes, furniture, and other built elements. Consequently, each of these elements will be individually accessible from the storage layer, and will maintain topological unity. Why is this important? Let’s imagine a building with a series of levels separated by floor slabs. Which level does the slab between two levels belong to: the bottom one, or the top one?

Dynamic Detachment

Let’s illustrate this through an example that also demonstrates how dynamically added properties can be detached. We’ll assume that we dynamically set topSlab and bottomSlab properties on each level in our imaginary object model:

// We're grossly simplifying in this example. Here are our two building levels:
var level_1 = new Level();
var level_2 = new Level();

// The philosophical slab instance. Does it belong to level 1 or level 2? 
var slab_between_1_and_2 = new Slab(); 

// Well, it belongs to both! Notice the "@" character at the beginning of
// the dynamic property assignment - it's the Speckle convention for "detaching"
// dynamically added properties. 
level_1["@topSlab"] = slab_between_1_and_2;
level_2["@bottomSlab"] = slab_between_1_and_2;

Because we’ve prepended the property name with an “@” symbol, Speckle will now “detach” it. Consequently, once stored, both level one and two will now hold a reference to a single slab! Beyond storage efficiency - the slab is only stored once - this approach gives us the freedom to structure data however is best and, simultaneously, query, slice and dice it however we need to.
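As a rough illustration of the convention, the sketch below uses a hypothetical DynamicBag class (not Speckle's Base) that stores dynamic properties in a dictionary and tells detached from inline ones purely by the “@” prefix. It also shows that both levels hold the very same slab instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an object with dynamic properties.
public class DynamicBag
{
    private readonly Dictionary<string, object> _props = new Dictionary<string, object>();

    public object this[string key]
    {
        get => _props[key];
        set => _props[key] = value;
    }

    // Property names starting with "@" are the ones a serialiser would detach.
    public List<string> DetachedKeys() =>
        _props.Keys.Where(k => k.StartsWith("@", StringComparison.Ordinal)).ToList();

    // Everything else would be stored inline with the parent.
    public List<string> InlineKeys() =>
        _props.Keys.Where(k => !k.StartsWith("@", StringComparison.Ordinal)).ToList();
}

public static class DynamicDetachSketch
{
    public static void Main()
    {
        var slab = new object(); // placeholder for the shared slab
        var level1 = new DynamicBag();
        var level2 = new DynamicBag();

        level1["name"] = "Level 1";
        level1["@topSlab"] = slab;
        level2["@bottomSlab"] = slab;

        Console.WriteLine(string.Join(",", level1.DetachedKeys())); // @topSlab
        Console.WriteLine(string.Join(",", level1.InlineKeys()));   // name
        Console.WriteLine(ReferenceEquals(level1["@topSlab"], level2["@bottomSlab"])); // True
    }
}
```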

For example, let’s assume you need to do a quantity takeoff:

- For each individual level: specify the total surface area, including walls, ceilings, etc. Solution: simply retrieve each level individually!
- For the whole building: calculate the total volume of concrete that goes into the slabs. Solution: just query the building object for all the slabs!

Similarly, imagine you’ve written a script that does environmental analysis on a given level. The level can now be retrieved in its entirety - without juggling around to bring in the level above it too, just to get the ceiling out. Once the results are calculated, they can be stored in a detachable property on that specific level (e.g., level_1["@solarComfortMeshWithColours"] = analysisResultMesh;). Later down the line in your workflow, one could query for all the “solarComfortMeshWithColours” objects individually for each level, or, if needed, for the whole building.

Recomposing

Recomposition of an object happens within the deserialisation process, and it’s tightly integrated with the transport layer. The process, on the surface, is deceptively simple. When you ask Speckle to receive a specific object, the process is as follows:

1. Speckle retrieves said object’s “blob” (actually a JSON string representation of it),
2. Next up, Speckle retrieves the entire reference tree of this object,
3. Speckle proceeds to deserialise and re-compose the parent object, inserting the actual referenced objects in place of the references.
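The steps above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the actual deserialiser: parsed objects are plain dictionaries, the Reference class and the store are hypothetical, and references are resolved one by one during recursion rather than by fetching the whole reference tree up front.

```csharp
using System;
using System.Collections.Generic;

public static class RecomposeSketch
{
    // Hypothetical placeholder left in a parent where a detached child used to be.
    public sealed class Reference
    {
        public string ReferencedId { get; }
        public Reference(string referencedId) { ReferencedId = referencedId; }
    }

    // Rebuilds the object stored under `id`, swapping every Reference for the real child.
    public static Dictionary<string, object> Recompose(
        string id, IDictionary<string, Dictionary<string, object>> store)
    {
        var result = new Dictionary<string, object>();
        foreach (var pair in store[id])
        {
            if (pair.Value is Reference reference)
            {
                result[pair.Key] = Recompose(reference.ReferencedId, store);
            }
            else
            {
                result[pair.Key] = pair.Value;
            }
        }
        return result;
    }

    public static void Main()
    {
        var store = new Dictionary<string, Dictionary<string, object>>
        {
            ["slab-1"] = new Dictionary<string, object> { ["area"] = 120.0 },
            ["level-1"] = new Dictionary<string, object>
            {
                ["name"] = "Level 1",
                ["@topSlab"] = new Reference("slab-1")
            }
        };

        var level = Recompose("level-1", store);
        var slab = (Dictionary<string, object>)level["@topSlab"];
        Console.WriteLine(slab["area"]); // 120
    }
}
```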

Whether or not this sounds complicated, the exposed API is actually rather simple, and the end result is that a decomposed object, when received back, will be identical to the original one - with all its parts in place.

Conclusion & What’s Next

End users and developers can now, in Speckle 2.0, productively control the way they structure their design data through the decomposition mechanism. From the point of view of the future Speckle connectors, this will enable us to expose object model flexibility in a more elegant way than before to end users. As developers, Speckle gives you another powerful API on top of which you can scaffold your digital automation workflows.

More importantly, this allows you to store arbitrary data structures (scaffolded on top of a Base) with Speckle, without paying any penalties: Speckle deals equally well with flat and nested data.

So, what’s next?

So far, I have mentioned a “transport layer” and “storage layer” quite a few times. These demarcations underpin yet another important part of the new Speckle 2.0, and they control how and where design data is being stored and retrieved (“transported”). They’re tackled in the following post - so keep your eyes peeled.

dirksliepenbeek · 24 August 2020 09:29

Hi @dimitrie!

Looks very interesting, I very much like the idea of the “Spoilers” and was wondering how literally I should take that? Is the idea that the user can very clearly see the changes made in the building data (like in code in git)? And how about different branches of the data, any thoughts on how that will be implemented?

HughGrovesArup · 24 August 2020 10:02

@dimitrie this separation seems to be very similar to reference types / value types. What has driven the choice for kit developers to demarcate which objects are ‘detached’ (by reference)? What would be the unintended consequences of having every non-primitive property be ‘detached’?

dimitrie · 24 August 2020 11:57

Good questions from both (@HughGrovesArup & @dirksliepenbeek), I’m happy you’re asking them. I’ll take them one by one:

You will always be able to diff between two “states”/“revisions”/“commits”; how you actually display that “diff” is a big question - is it in the online 3d viewer? Or a host application, like Rhino or Revit? Is it purely analytical, like it is now?

(the stream version tab, for anyone wondering)

Branches actually now exist in the Server. They won’t be rolled out initially as a user-facing feature (everything will default to ‘master’), but they’re there.

Regarding how literally you should take the spoiler analogy: obviously with a grain of salt. Speckle 2.0 “commits”/“revisions”/“whatever they will be called” will be something like a broken Merkle tree, as we can’t always guarantee previous state. What we’re aiming for in 2.0 is to provide the necessary control, data structures & APIs to scaffold future workflows that can rely on a properly done versioning system (rather than just stream children, as per 1.0).

This is a very good shout. When we write up the docs, this will be front and center! It’s mostly a question of the use cases that you design the kit for, and the correctness you want to have in data representation.
The best example that comes to mind here is the topological vs. “get-it-done” structural object models, of which I’m sure you’re much more aware than I am.

With 2.0, for example, the point where one beam ends and another beam starts is actually one point - if the endPoint and startPoint props are marked as detachable. You probably still won’t want to do it, though, as I think topology is much better handled at an application level.

A more relevant and common example is a displayObject property: a simple structural beam could have a complex swept mesh representation that is much heavier than the original object. In this case, it totally makes sense to have it detached.

Having everything non-primitive detached will also make it more difficult to query and aggregate data, as the recombination of objects does have an extra cost.

On this topic, there’s another good “best practice” that comes before the detaching concerns:

// costly serialisation polyline
public class Polyline : Base { 
  public List<Point> Points { get; set; }
}

// much cheaper serialisation polyline class that still maintains 
// the convenience of accessing "points" as a separate type.
public class Polyline : Base { 
  [JsonIgnore]
  public List<Point> Points = new List<Point>(); // this one doesn't get serialised

  public List<double> Vertices // this one does! 
  {
    get => Points.SelectMany(pt => new List<double>() { pt.X, pt.Y, pt.Z }).ToList();
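    set
    {
      Points.Clear(); // avoid accumulating points if Vertices is set more than once
      for (int i = 0; i + 2 < value.Count; i += 3) // stop before any incomplete trailing triple
      {
        Points.Add(new Point(value[i], value[i + 1], value[i + 2]));
      }
    }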
  }
}
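For anyone wanting to try the flat-list idea without a serialiser, here is a small self-contained round trip. It is only a sketch: the Point class is a made-up minimal one, the Polyline does not inherit from Base, and the [JsonIgnore] attribute is left out because no JSON library is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal hypothetical point type, just enough for the example.
public class Point
{
    public double X, Y, Z;
    public Point(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public class Polyline
{
    // Convenience view; a serialiser would be told to skip this.
    public List<Point> Points = new List<Point>();

    // Flat x, y, z, x, y, z, ... list; this is what would actually be serialised.
    public List<double> Vertices
    {
        get => Points.SelectMany(pt => new[] { pt.X, pt.Y, pt.Z }).ToList();
        set
        {
            Points.Clear();
            for (int i = 0; i + 2 < value.Count; i += 3)
            {
                Points.Add(new Point(value[i], value[i + 1], value[i + 2]));
            }
        }
    }
}

public static class PolylineSketch
{
    public static void Main()
    {
        var line = new Polyline();
        line.Points.Add(new Point(0, 0, 0));
        line.Points.Add(new Point(1, 2, 3));

        var flat = line.Vertices;
        Console.WriteLine(string.Join(",", flat)); // 0,0,0,1,2,3

        // Rebuilding from the flat list restores the points.
        var copy = new Polyline { Vertices = flat };
        Console.WriteLine(copy.Points.Count); // 2
    }
}
```

The saving comes from writing one list of numbers instead of one nested object per point.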

chris.welch · 24 August 2020 20:39

This seems like a really intuitive way of storing topology along with your code (I rambled about it here so I won’t repeat myself). Very excited to have a go at this!

Also we were just talking in the office about source control and where speckle might fit in with that, so a proper git-like structure sounds very intriguing!

dimitrie · 25 August 2020 06:55

I’m not sure proper, as in “what we as devs think of when we mention source control”, applies to 2.0. Something like that does apply though.

The gist of it is that with Speckle 2.0 you can assemble your data in whatever structure you want (let’s call it a commit) - see above - then store it (push it, or send it) via one or more transports. E.g.,

var myCommit = new Base();
myCommit["@layerOne"] = etc;
myCommit["@buildingSite"] = etc; // whatever data structure makes sense!

The myCommit object above can be seen as your project’s folder structure, if we want to use git-like terms. Then you can “save” that object using one or more transports (which don’t care whether they’re local or remote):

var myCommitId = Operations.Send( myCommit, [ transports ] )

Once that’s done, you can “flag” that object’s id as a commit, associate it with a branch, etc. But ok, now I’m spilling the beans too soon. The post on transports is coming up, hopefully today, and the alpha release of core and server too.

chris.welch · 25 August 2020 22:02

Happy to wait! Don’t rush a good thing