Automate data management

Hi,
I am starting to dive into Speckle Automate and its systems.
My function is supposed to enrich multiple elements with additional metadata.
Given its additive nature I am able to do this either via an annotation or via a new model version.

From the documentation it seems that creating a new version is the recommended approach.

Later on I want to use the generated data in a custom Speckle viewer.

New Model version
Unfortunately I get a GraphQLException:

specklepy.logging.exceptions.GraphQLException: GraphQLException: Failed to execute the GraphQL model request. Errors: [{'message': 'Failed to find branch with id new_matching.', ....

I am calling the “create_new_version_in_project”

 automate_context.create_new_version_in_project(version_root_object, "new_matching", "message?")

It seems a bit weird that it complains about not finding the very thing I want to create. If I pass the existing model instead, it complains that the target model can’t be the model that triggered the automation.
I am on specklepy 3.0.0, running the automation locally as a test function.

Annotations
Intuitively I would have preferred to add info to the objects instead of creating a duplicate of the whole thing. It seems a bit wasteful if I’m just associating additive metadata, but it depends on how the backend handles versions anyway, so I will go with the recommended new-model-version approach if possible. (Insights welcome :wink: )
Regardless: with annotations I was able to associate the data with the correct elements, but I was a bit confused about how the automation data is stored. If I query the data via GraphQL there are multiple lists, and I would have expected them to contain data from previous runs. My goal here is to reuse computations if the input data didn’t change. In “model/automationsStatus/automationRuns”, for example, I would have expected multiple entries listing the automation runs. Instead I get two entries, with the second not having any data. I would have expected previous runs either here or in the “functionRuns” list further down.
The result data is wiped when a new run is requested, so it is not possible to retrieve a previous run’s annotations in a new automation run.
At least with my approach. Maybe you can give me some hints where to look?

TLDR:

  • Creating a new model version from Python Automate fails
  • Is it possible to get automation annotation outputs from previous runs?

I’ll split the answer in two, firstly, “The Right Way”: (or, “There is no spoon”)

:a_button_blood_type: Adding context data to the same model version

Adding data alongside the triggering model version (via object-level key-value pairs or automateContext.results) is an excellent approach for contextual metadata—like check results, visualisation hints, or review data.

  • Benefits:
    • No geometry duplication.
    • Tightly tied to the original version—perfect for “data about the data.”
    • Easy to visualise in viewers (e.g. toggling enriched info for a single version).
  • Trade-offs:
    • Can make it harder to track enriched “snapshots” of the dataset if you want historical views of how the enrichment evolved.
    • Typically ephemeral—annotations and context data don’t create a new “version of truth.”
      Consuming this data alongside version data is possible; it’s queryable via GraphQL in Power BI and SDKs, but the process isn’t straightforward.
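As a rough sketch of route A (this is my own illustration, not the canonical pattern: the enrichment values are placeholders, and the exact keyword names of the attach_*_to_objects helpers may differ between SDK releases, so lean on your IDE’s hints), attaching info-level results with metadata to the triggering version’s objects looks roughly like this:

from speckle_automate import AutomationContext


def automate_function(automate_context: AutomationContext) -> None:
    # Receive the root object of the version that triggered the run.
    version_root = automate_context.receive_version()

    # Hypothetical enrichment step: pick the elements you care about and
    # compute extra metadata for them (placeholder logic).
    elements = getattr(version_root, "elements", []) or []
    for element in elements:
        metadata = {"enrichedScore": 0.87}  # placeholder value

        # Attach the metadata as an info-level result on the object, so it is
        # stored as an annotation against the triggering version rather than
        # as a new version. Keyword names here are indicative; check the
        # attach_*_to_objects signatures in your specklepy/automate version.
        automate_context.attach_info_to_objects(
            category="Enrichment",
            affected_objects=element,
            message="Additional metadata computed by the automation.",
            metadata=metadata,
        )

    automate_context.mark_run_success("Enriched all elements.")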

:b_button_blood_type: Creating a new model version (or branch) as an augmented package

Creating a new version (or, better yet, a new model) is preferable for self-contained augmented datasets—such as a calculated package, enriched geometry, or a permanent record of analysis.

  • Benefits:
    • Clear lineage—each version is a clear snapshot in time.
    • Easier for downstream workflows or sharing.
    • Avoids cluttering the original version with lots of extra data.
  • Trade-offs:
    • Adds a new version to your data history—can be seen as duplication if the enrichment is minor.
    • Automation triggers: Be careful to avoid infinite loops if your enrichment triggers itself!

:person_shrugging: No single “right” approach

There’s no recommended “single” answer—it depends on:

  • How permanent or significant the enrichment is.
  • Whether it’s data about the version or truly a new version of truth.
  • Whether you want to keep everything tightly bound to the original dataset or treat the enrichment as a separate package.

Both approaches are valid—Speckle’s flexibility lets you choose what’s best for your workflow. Please let me know if you’d like help developing the decision criteria for your project.

Here’s a separate reply just for the error you’re seeing (that’s a legacy backend error sneaking through the new SDK … something we should patch):

GraphQLException: Failed to find branch with id new_matching.

This happens because of a mismatch between what you’re passing and what the create_new_version_in_project function expects.

What the create_new_version_in_project method expects:

def create_new_version_in_project(
    self, root_object: Base, model_id: str, version_message: str = ""
) -> Version:
    """
    Save a base model to a new version of the project.

    Args:
        root_object (Base): The Speckle base object for the new version.
        model_id (str): Id of model to create the new version on.
        version_message (str): The message for the new version.
    """
  • First argument: the root_object (your enriched data).
  • Second argument: the model ID—not a branch name.
  • It always creates the new version in the default branch (usually main) of that model.

The actual error

You’re passing "new_matching" as the second argument, thinking it’s a branch name. But the backend tries to find a model with that ID and fails—hence the error.

Fixing it

Pass the actual model ID:

automate_context.create_new_version_in_project(
    root_object, "c7eaxxxxxx", "My enrichment"
)

Creating a new model if it doesn’t exist

If you want to create a new model first (like if "new_matching" was meant to be a new model, not a branch), there’s a companion method:

def create_new_model_in_project(
    self, model_name: str, model_description: Optional[str] = None
) -> Model:
    input = CreateModelInput(
        name=model_name,
        description=model_description,
        project_id=self.automation_run_data.project_id,
    )

    return self.speckle_client.model.create(input)

You can wrap this in a try/except to create the model if it’s not found.

Searching for a model by name (like branches in v2)

If you want to find a model by its name (similar to how branches worked in v2), you can use a simple helper:

from typing import Optional

# Import paths assume specklepy 3.x; adjust if your release differs.
from specklepy.core.api.inputs.project_inputs import ProjectModelsFilter
from specklepy.core.api.models import Model


def find_model_by_name(
    client, project_id: str, target_model_name: str
) -> Optional[Model]:
    # Ask the server for models whose names match the search term.
    models_filter = ProjectModelsFilter(search=target_model_name)
    models_page = client.model.get_models(
        project_id,
        models_limit=10,  # the API narrows down the search
        models_filter=models_filter,
    )

    # Return the exact name match, if any.
    for model in models_page.items:
        if model.name == target_model_name:
            return model
    return None

I use this exact pattern in the Data Shield premade Automation, although it’s not yet updated to specklepy 3—but it’s still valid for this scenario!
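Putting the helper and create_new_model_in_project together, a hypothetical find-or-create flow inside an automate function might look like the sketch below (the model name, description, and version message are placeholders of mine):

# Hypothetical find-or-create flow; "enriched_matching" and the messages are
# placeholders, and version_root_object is your enriched Base object.
target_name = "enriched_matching"

model = find_model_by_name(
    automate_context.speckle_client,
    automate_context.automation_run_data.project_id,
    target_name,
)
if model is None:
    # No model with that name yet, so create one first.
    model = automate_context.create_new_model_in_project(
        model_name=target_name,
        model_description="Enriched data produced by the automation.",
    )

# Now create the new version on that model's id (not on a branch name).
automate_context.create_new_version_in_project(
    version_root_object, model.id, "Enriched metadata"
)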

specklepy 3 doubles down on type hinting and method docstrings, so leverage your IDE’s hints while we get the documentation caught up. We launched the user docs last week, and the developer docs reboot is in full swing.

Let me know if you’d like me to tweak this example further or create a complete snippet for your enrichment function!

Thanks for the swift reply Jonathon!
That’s a lot of valuable info. I understand the system a lot better now.

I will test both approaches and see what works best for our use case.
Regarding creating a new model version, I think I’ve found the culprit:

I tried calling it with the current model ID as you suggested, but got
“ValueError: The target model: bc2054472a cannot match the model that triggered this automation: bc2054472a”

It seems that the automate context is not yet updated to work with the v3 GraphQL API, at least not completely. Lines 165-175 raise an error when calling the endpoint with the same model ID. I am guessing it’s a leftover artifact from the previous usage? If I skip those lines, the model version is created and everything runs fine.

Thanks a lot!

That specific error is not a hangover but a deliberate effort to prevent the infinite loop issue.

Rather than allowing a triggered automation on a model to create a version that then triggers itself again, this will prevent that action.

It is not impossible to circumvent, but at that point you are doing so deliberately, and we trust that you take mitigating the loop risk seriously.


The workaround is to use core API methods rather than the Automate “wrappers”, which include this gatekeeper error.
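If you do go around the guard, the sketch below shows roughly what the core-API route could look like. This is an assumption-heavy illustration, not a confirmed recipe: it assumes specklepy 3’s operations.send, ServerTransport, CreateVersionInput, and client.version.create are available with these shapes, so check your release before relying on it, and the loop risk is then yours to manage.

from specklepy.api import operations
from specklepy.core.api.inputs.version_inputs import CreateVersionInput
from specklepy.objects.base import Base
from specklepy.transports.server import ServerTransport


def create_version_on_triggering_model(automate_context, root_object: Base, message: str):
    # Rough sketch: send the root object and create the version via the core
    # client, bypassing the Automate wrapper's same-model guard. You must add
    # your own safeguard (e.g. checking the version message or source) so the
    # new version does not re-trigger the automation forever.
    run_data = automate_context.automation_run_data
    client = automate_context.speckle_client

    transport = ServerTransport(run_data.project_id, client)
    object_id = operations.send(root_object, [transport])

    return client.version.create(
        CreateVersionInput(
            object_id=object_id,
            model_id=run_data.triggers[0].payload.model_id,
            project_id=run_data.project_id,
            message=message,
        )
    )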

Ah I get it now!
Putting it in the context of the CFD simulation example: the automation generates a secondary model that contains the simulation results, and the “create_new_version_in_project” function is used to create and update that model. It’s not primarily meant to update the triggering model itself, since then the infinite version-update trigger becomes an issue.

My initial mixup came from the understanding: “I am running an automation on a model to update this model”.

Now I see it more as: automations can create different artifacts, whether annotations, files, or new models.

If I really need to update the same model, I would need to somehow track that a version is the product of an earlier automation run, to prevent the loop.

Thanks for the clarifications!
I got everything that I need for now :+1:


When you are making new models, there is another helper function that is handy

def set_context_view(
        self,
        # f"{model_id}@{version_id} or {model_id} "
        resource_ids: Optional[List[str]] = None,
        include_source_model_version: bool = True,
    ) -> None

This allows an automation’s resulting model to be viewed side by side, in a federated view, alongside the original. Very pertinent to analysis mesh results.
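For example (a sketch of mine: results_root, results_model_id, and the message are placeholders for whatever your automation produced), after writing the results to a secondary model you could set the context view so the run’s result page federates it with the triggering version:

# Create the results version on the secondary model (placeholders).
new_version = automate_context.create_new_version_in_project(
    results_root, results_model_id, "CFD results"
)

# Federate the new results version with the model version that triggered
# the run, so both show up side by side in the automation's context view.
automate_context.set_context_view(
    resource_ids=[f"{results_model_id}@{new_version.id}"],
    include_source_model_version=True,
)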
