Overwriting Streams

vasshaug · 12 December 2020 11:50

Hello there. A workflow question that I’m embarrassed I haven’t asked before:
Am I supposed to expect to be able to overwrite a stream without having access to the file that created it? A scenario would be like this:

Bob makes a Property Line in Rhino and sends it to a Stream.
Claire receives the Property Line Stream and sees a mistake. She downloads/bakes Bob’s lines, edits them and wants to send them back to the same Stream so the rest of the team can enjoy correct data.
Claire realizes this is not possible and weeps.

Perhaps this is not how Speckle Streams are intended to work? Or am I missing something?

As far as I can see, this is the case in Revit, Rhino, Dynamo and Grasshopper alike.

dimitrie · 12 December 2020 14:38

Heya @vasshaug! Unfortunately in 1.0 that’s a big limitation. The good news is that, in 2.0, we’ve changed the models around, and it’s no longer the case.

You can:

“commit” to a stream from multiple places;
multiple users can “commit” to a stream;
commits are pegged to separate branches, if you want to;
etc.

Basically we’re supporting the workflow you’ve mentioned 100%!

teocomi · 12 December 2020 16:04

To expand even further, in v1 streams are “one way only” by design, so in your scenario @vasshaug Claire would just have to create a new stream and send that…
This approach works well for most cases, but a consequence is that a lot of “throwaway” streams are generated.

That’s why in v2 we made the decision to treat streams differently, a bit like repositories with branches and commits like Dim described.
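As a rough illustration of the "repositories with branches and commits" idea, here is a toy Python model. It is only a sketch: the class names, the fields and the default branch name "main" are assumptions made for this example and are not Speckle's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    author: str
    message: str
    source_app: str  # e.g. "Rhino" or "Revit" (illustrative only)


@dataclass
class Stream:
    name: str
    # Branch name -> ordered list of commits. "main" is an assumed default.
    branches: dict[str, list[Commit]] = field(
        default_factory=lambda: {"main": []}
    )

    def commit(self, author: str, message: str, source_app: str,
               branch: str = "main") -> Commit:
        """Append a commit to a branch, creating the branch if needed."""
        new_commit = Commit(author, message, source_app)
        self.branches.setdefault(branch, []).append(new_commit)
        return new_commit

    def latest(self, branch: str = "main"):
        """Return the newest commit on a branch, or None if it is empty."""
        commits = self.branches.get(branch, [])
        return commits[-1] if commits else None


property_line = Stream("Property Line")
property_line.commit("Bob", "Initial property line", "Rhino")
# Claire fixes the mistake and commits to the same stream from her own file.
property_line.commit("Claire", "Fix north boundary", "Rhino")
# An alternative lives on its own branch without touching "main".
property_line.commit("Claire", "Try wider setback", "Grasshopper",
                     branch="option-b")
```

In this model nobody "owns" the stream: anyone can add a commit, and the team reads whatever `latest()` returns for the branch they follow.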

vasshaug · 12 December 2020 19:02

Thank you @dimitrie and @teocomi
Sounds like v2 is something to look forward to:)

iltabe · 19 December 2020 03:38

Hi guys!
Something I wanted to ask and I didn’t dare during the community meeting two days ago (here will get a broader audience though ;):

Assuming a stream would contain 2 objects:

Object A { Geometry; Name; Attribute1 }
Object B { Geometry; Name; Attribute2 }

When sending back a new commit on the same stream where Claire changes ObjectB.Attribute2 but the rest stays the same, does the new commit push back both objects or only the modified attribute? Is Bob then able to “visualize” what changed from his initial commit after Claire’s push?
Rephrasing: do you intend the commits/branches to work as a versioning system?
Do you plan to develop a web interface that lets the end user see that, for instance as a table? If so, it would be nice to pick commit xyz and commit zyx and see what changed between the two.

Thank you!
Gianluca

teocomi · 19 December 2020 11:42

Hey @iltabe , good question!

When you create a Commit and push data to it, you are sending the new data in its entirety; we don’t do any sort of partial updates!
But the good news is that every object being transferred is hashed, so if ObjectA doesn’t change at all, it’s only persisted once in the DB, and it’s also only sent/received once when you push/pull the object multiple times (since it gets cached locally on your machine).
This, if I’m correct, is exactly how git does things and @dimitrie will be happy to get more technical if you want!
In 1.0 we already had a very basic versioning system and UI, and it’ll definitely make a come back in 2.0, at some point…
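The "snapshot plus hashing" behaviour described above can be sketched with a toy content-addressed store. This is not Speckle's implementation: the SHA-256 over canonical JSON, the `ObjectStore` class and the `writes` counter are all assumptions chosen to make the idea visible.

```python
import hashlib
import json


def object_id(obj: dict) -> str:
    """Hash the object's content; identical content gives an identical id."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ObjectStore:
    """Toy content-addressed store standing in for the server DB."""

    def __init__(self) -> None:
        self.objects = {}  # id -> object
        self.writes = 0    # how many objects actually had to be persisted

    def commit(self, objs: list) -> list:
        """Take a full snapshot and return the ids that make up the commit."""
        ids = []
        for obj in objs:
            oid = object_id(obj)
            if oid not in self.objects:  # already-known content is skipped
                self.objects[oid] = obj
                self.writes += 1
            ids.append(oid)
        return ids


store = ObjectStore()
object_a = {"name": "A", "geometry": [0, 0, 1, 1], "attribute1": "x"}
object_b = {"name": "B", "geometry": [2, 2, 3, 3], "attribute2": "y"}

commit_1 = store.commit([object_a, object_b])
# Only ObjectB.Attribute2 changes, yet the whole snapshot is committed again.
commit_2 = store.commit([object_a, {**object_b, "attribute2": "z"}])
```

Both commits list two objects, but the store only ever persists three: ObjectA is shared between the commits, and only the changed ObjectB is written a second time.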

teocomi · 21 December 2020 10:16

Relevant article regarding what I mentioned, that commits are snapshots, not diffs.

iltabe · 21 December 2020 11:40

Thanks @teocomi for the answer and the posted article. A useful read!
I didn’t take it for granted that you would fully follow the git way of doing it. (Are you actually using git for real?)
In my imagination I was seeing it more as a database, where you could query a property, change it and push it back to the db. Or maybe it was just a wish to have that kind of granularity

In my daily practice, I deal with a lot (hundreds) of objects that belong to the same class. These objects contain other objects, geometries (mainly breps, curves, planes) and of course basic data types (mainly strings). Usually, when I make a change, I modify all of them at the same time. So, if I edit just a string attribute of my objects, each hash will differ from the “tree” version of itself, and that makes the new objects eligible to be pushed.
In my case, that would end up pushing all the objects every time I want to snapshot a version of them.

As a practical example, say I have to perform a facade tessellation. My model has different object types: the single facade element, the frame behind, connections. The beauty of hashing the objects is that only the single facade elements get pushed/pulled, and not the frame if it hasn’t changed. And that’s already good.
But, thinking about speed here, that means I push/pull a lot of unnecessary information every time. Currently a binary serialization of my objects takes around 3-5 MB each. That includes the serialization of properties that are themselves objects (i.e. element.MyFrame), which will be optimized by Dimitrie’s dynamic references (the “@” ones).

My intent is to try out Speckle 2.0 as my primary way of saving the model. I would like to retrieve it, change attributes on objects, and push it back in order to track every change I make (pretty similar to how I version my code: I don’t want to wait until my class is shiny and polished, I want to commit many times per hour…).
Ultimately, when my project is nearing the end, I want to be 100% sure I don’t modify any attributes by mistake. That’s why a diff UI would be super handy: once committed, I can double check the result of my push, verifying that I didn’t edit any other attribute by mistake.

Sorry for the long text!

dimitrie · 21 December 2020 14:43

Hey @iltabe, we like long texts - thanks for your insights; this is really helpful for us. I’ll try and reply - hopefully coherent!

You have that level of granularity; what happens though, given the fact that objects are immutable in Speckle, is that your new object gets a new id (hash); and so does its parent, and the parent’s parent if any. The object’s applicationId though stays the same, and this is how we actually manage to “edit” existing elements in a Revit file, if the Revit API allows, of course.
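To make the "new id for the object, its parent, and the parent's parent" point concrete, here is a small sketch. It assumes parents store the content hash of their child (a Merkle-style reference) and that ids are truncated SHA-256 hashes of canonical JSON; the helper names and the facade/element/panel objects are invented for illustration and do not reflect how Speckle actually serializes or hashes objects.

```python
import hashlib
import json


def content_id(obj: dict) -> str:
    """Content hash of an object; children are assumed to be stored as ids."""
    canonical = json.dumps(obj, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def build_facade(panel_note: str) -> dict:
    """Build panel -> element -> facade, each parent referencing its child's id."""
    panel = {"applicationId": "panel-1", "note": panel_note}
    element = {"applicationId": "element-1", "panel": content_id(panel)}
    facade = {"applicationId": "facade-1", "element": content_id(element)}
    return {"panel": panel, "element": element, "facade": facade}


before = build_facade("draft")
after = build_facade("approved")  # only a string on the innermost object changes

ids_before = {name: content_id(obj) for name, obj in before.items()}
ids_after = {name: content_id(obj) for name, obj in after.items()}

for name in ("panel", "element", "facade"):
    print(name, before[name]["applicationId"],
          ids_before[name], "->", ids_after[name])
```

Every id in the chain changes because each parent's content includes its child's id, while each applicationId is untouched and can still be used to match the "same" logical object across versions.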

Ultimately changing element properties is easy - the main limitation is the integration with the host software when bringing those changes in (hint: it’s all very limited!).

Diffing is again a difficult subject to broach, but actually thinking things through based on what you said, it can be done nicely.

The brute force approach would be to diff against the whole commit structure - ie, across potentially 100k+ objects. This can be done, but it’s not sustainable and I believe quite meaningless.

The brainwave which I got after you described your case is that we can actually do it on objects with applicationIds only. This is super cool, because:

you’d be able to see, side by side, how a given object has changed!
it would also give you a more concise classic diff (added/removed/common), though this could also be done based on Speckle ids (hashes).
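A diff keyed on applicationId, as proposed above, could look roughly like the following. This is only a sketch of the idea: commits are modelled as flat lists of dicts, and the function name and the shape of the result are assumptions, not an existing Speckle feature.

```python
def diff_by_application_id(old: list, new: list) -> dict:
    """Compare two snapshots, matching objects by their applicationId."""
    old_by_id = {o["applicationId"]: o for o in old}
    new_by_id = {o["applicationId"]: o for o in new}

    added = sorted(new_by_id.keys() - old_by_id.keys())
    removed = sorted(old_by_id.keys() - new_by_id.keys())

    changed = {}
    for app_id in sorted(old_by_id.keys() & new_by_id.keys()):
        a, b = old_by_id[app_id], new_by_id[app_id]
        # Side-by-side view: property -> (old value, new value)
        fields = {
            key: (a.get(key), b.get(key))
            for key in sorted(a.keys() | b.keys())
            if a.get(key) != b.get(key)
        }
        if fields:
            changed[app_id] = fields

    return {"added": added, "removed": removed, "changed": changed}


commit_old = [
    {"applicationId": "A", "name": "Object A", "attribute1": 1},
    {"applicationId": "B", "name": "Object B", "attribute2": "draft"},
]
commit_new = [
    {"applicationId": "A", "name": "Object A", "attribute1": 1},
    {"applicationId": "B", "name": "Object B", "attribute2": "final"},
    {"applicationId": "C", "name": "Object C"},
]

result = diff_by_application_id(commit_old, commit_new)
```

Here `result["changed"]` holds only object B with its old and new `attribute2`, which is the kind of check that would catch an attribute edited by mistake.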

Not using git, but I’ve been heavily inspired. Speckle doesn’t have a demarcation between a tree and a blob, like git does; instead each object is simultaneously a tree (of references) and a blob. This is because git has a nice tree structure to operate from right from the start - your project’s folder structure; whereas Speckle needs to work with however the authoring software keeps data structured.

But don’t worry, you can query a given commit in a classic way - ie, “give me all the objects of this type with this property bigger than X”. It’s the query param for the objects type in case you have a server around to play with:

This is documented in the tests only at the moment: speckle-server/modules/core/tests/objects.spec.js at 692c0b02827e31023f1c4ab027e0ef6373802ece · specklesystems/speckle-server · GitHub
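The kind of question quoted above ("all the objects of this type with this property bigger than X") can be illustrated locally with a plain in-memory filter. This sketch deliberately does not reproduce the server's query parameter, whose exact shape is only documented in the linked tests; the `type` field, the operator table and the function name are all hypothetical.

```python
import operator

# Comparison operators supported by this toy filter.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def query_objects(objects: list, object_type: str,
                  field: str, op: str, value) -> list:
    """Return objects of the given type whose field satisfies the comparison."""
    compare = OPERATORS[op]
    return [
        obj for obj in objects
        if obj.get("type") == object_type
        and field in obj
        and compare(obj[field], value)
    ]


objects = [
    {"type": "FacadeElement", "name": "E1", "area": 4.2},
    {"type": "FacadeElement", "name": "E2", "area": 1.1},
    {"type": "Frame", "name": "F1", "area": 9.0},
]

large_elements = query_objects(objects, "FacadeElement", "area", ">", 2.0)
```

On a real server the filtering would happen in the database rather than in client memory; the sketch only shows what the result set means.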

iltabe · 22 December 2020 14:30

Thanks @dimitrie for the insight!

> You have that level of granularity; what happens though, given the fact that objects are immutable in Speckle, is that your new object gets a new id (hash); and so does its parent, and the parent’s parent if any. The object’s applicationId though stays the same, and this is how we actually manage to “edit” existing elements in a Revit file, if the Revit API allows, of course.

Ok, I got the concept of “immutable objects”. I work with mutable objects instead, which I create before knowing what they will look like. The objects are born simple and get more and more complex over the stages of the project. You could think of it as LOD 100/200…
But it doesn’t matter if my mutable objects are translated into new immutable objects every time, as long as we can diff them and there is no loss of speed in reading/writing.
Unfortunately I’m not fully sure what you mean by applicationId. I suppose it’s a GUID that relates the object to the object in the host app?

> Ultimately changing element properties is easy - the main limitation is the integration with the host software when bringing those changes in (hint: it’s all very limited!).

At the moment my host software is my own code and rhino.

Anyway, it seems there is a high chance that all this could work. I’ll come back as soon as I have time to put everything together and try it out!

dimitrie · 22 December 2020 15:05

Exactly! It’s an id that doesn’t change, so it can refer to one “logical” object - even if its “speckle id” changes (the immutable part). We set these where we can, e.g. Rhino and Revit; Grasshopper and Dynamo don’t have one but you can set it yourself.

The workflow you describe is totally doable. I imagine it would work like this: you create a wrapper base object with its singular, non-changing, applicationId of “AAA”.

At first, this base object contains just a simple box (generic placeholder), ie myObject["placeholder"] = Box; (and myObject["applicationId"] = "AAA"). As the design/product evolves, you keep refining it and you evolve its container as well (e.g., myObject["leftPanel"] = Surface; myObject["innerLattice"] = Mesh; etc.). As long as the applicationId stays AAA, we’ll be able to diff between the historical states of that object throughout a stream’s history.
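The evolving wrapper object can be sketched as follows. Plain dicts stand in for the base object, strings such as "Box" stand in for real geometry, and the content hash is an assumed stand-in for the Speckle id; only the property names and the applicationId "AAA" come from the description above.

```python
import hashlib
import json


def content_id(obj: dict) -> str:
    """Assumed stand-in for the immutable id: a hash of the object's content."""
    canonical = json.dumps(obj, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Each entry is one commit: a full snapshot of the stream at that moment.
history = [
    [{"applicationId": "AAA", "placeholder": "Box"}],
    [{"applicationId": "AAA", "leftPanel": "Surface"}],
    [{"applicationId": "AAA", "leftPanel": "Surface", "innerLattice": "Mesh"}],
]


def trace(commits: list, application_id: str) -> list:
    """Follow one logical object through the commits via its applicationId."""
    states = []
    for index, snapshot in enumerate(commits):
        for obj in snapshot:
            if obj.get("applicationId") == application_id:
                props = sorted(k for k in obj if k != "applicationId")
                states.append((index, content_id(obj), props))
    return states


states = trace(history, "AAA")
for index, oid, props in states:
    print(f"commit {index}: id={oid} properties={props}")
```

The hash differs in every commit, but the stable applicationId is enough to line up the three states and compare them.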

Not sure this makes 100% sense, but there you go. Anyway, keep the thoughts coming!

iltabe · 22 December 2020 15:16

Yes, it absolutely makes sense!
Also because applicationId can then be assigned by me and doesn’t have to reflect a real host id. (as you said, maybe there is none).

Thanks again! Looking forward to stress-testing it with thousands of objects!

I wish a great special Christmas time to all @SpeckleTeam!
Gianluca

teocomi · 22 December 2020 15:42

This makes me realize we’ve never really explained how our applicationId works and the magic behind it… A blog post & docs on this are due!