Setting the stage
AEC projects can be mapped out into “pipelines”:
Input data → process → data → … → data → process → Deliverables
- A pipeline starts with input data (project requirements, drawings, GIS data, Ground Investigation data, etc.)
- A pipeline ends with the deliverables (BIMs, drawings, reports, presentations, etc.)
- Processes can have multiple Datasets as inputs and/or outputs.
- A pipeline CANNOT start or end with a Process.
- Data CANNOT directly flow into Data.
- Processes CANNOT flow into other Processes.
- The whole project can be seen as one big process, but this huge process can be mapped out into smaller and smaller data → process → data → … → data → process → data pipelines. In software engineering this is called refactoring.
- Of course, some parts of a pipeline are iterative, and others might be “messy”. AEC projects don’t have “unidirectional data flow”, and never will, but that is fine.
- Naturally, we’re going to put all our datasets on Speckle, which allows us to swap out cumbersome manual processes for scripts incrementally, one process at a time.
- A script will then mature over the course of one or more projects, and will be split (i.e. refactored) into multiple smaller scripts such that each script serves a single purpose (this is called the Single Responsibility Principle in software engineering).
- Because our scripts only serve a single purpose, they are more maintainable and become reusable on other projects.
- Once a script is mature, it can become a Speckle automation.
- Once we have a library of modular, single-purpose scripts on Speckle Automate, we can start chaining them into automated pipelines, as sketched below.
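To make that concrete, here is a minimal sketch in plain Python. Everything in it (the function names, the survey-point structure) is a hypothetical placeholder; the point is only that each process is a single-purpose function with data flowing in and out, so the pipeline is just the chain between them:

```python
# A minimal sketch of single-purpose processes chained into a pipeline.
# All names and data structures are hypothetical placeholders.

def clean_survey_points(raw_points: list[dict]) -> list[dict]:
    """Process 1: drop survey points that are missing coordinates."""
    return [p for p in raw_points if "x" in p and "y" in p and "z" in p]

def derive_ground_levels(points: list[dict]) -> dict[str, float]:
    """Process 2: reduce the cleaned points to min/max ground levels."""
    levels = [p["z"] for p in points]
    return {"min_level": min(levels), "max_level": max(levels)}

# The pipeline is just the chain:
# raw data → process → clean data → process → deliverable data
raw = [{"x": 0.0, "y": 0.0, "z": 1.2}, {"y": 4.0}, {"x": 3.0, "y": 1.0, "z": 2.8}]
summary = derive_ground_levels(clean_survey_points(raw))
print(summary)  # {'min_level': 1.2, 'max_level': 2.8}
```

On a real project, the raw input would be received from a Speckle model and the summary sent back, so each step stays an independently swappable process.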
Data engineering / science pipelines
In data engineering / science it is common to talk about data processing pipelines. One open source Python library I like for building reproducible, maintainable, and modular pipelines is kedro [2]:

> Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
I especially like kedro because of kedro-viz, which is fantastic for visualizing such data processing pipelines. See this demo:
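To give a flavour of what kedro-viz visualizes, here is a minimal runnable pipeline, a sketch assuming kedro 0.19’s Python API (the dataset and function names are made up for illustration):

```python
# A two-node kedro pipeline: data → process → data → process → data.
from kedro.io import DataCatalog, MemoryDataset
from kedro.pipeline import node, pipeline
from kedro.runner import SequentialRunner

def clean_points(raw_points):
    """Process 1: keep only points that have full coordinates."""
    return [p for p in raw_points if "x" in p and "y" in p and "z" in p]

def summarise_levels(points):
    """Process 2: reduce the cleaned points to min/max levels."""
    levels = [p["z"] for p in points]
    return {"min_level": min(levels), "max_level": max(levels)}

# Nodes are wired together purely by dataset name.
demo_pipeline = pipeline([
    node(clean_points, inputs="raw_points", outputs="cleaned_points"),
    node(summarise_levels, inputs="cleaned_points", outputs="level_summary"),
])

catalog = DataCatalog({
    "raw_points": MemoryDataset([{"x": 0.0, "y": 0.0, "z": 1.2},
                                 {"x": 3.0, "y": 1.0, "z": 2.8}]),
})

# Free outputs (here: level_summary) are returned by the runner.
result = SequentialRunner().run(demo_pipeline, catalog)
print(result["level_summary"])  # {'min_level': 1.2, 'max_level': 2.8}
```

In a full kedro project, `kedro viz` renders exactly this node/dataset graph interactively.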
What do you think?
I think it could be very useful to be able to visualize AEC project pipelines, and the parts of such pipelines that are automated with Speckle Automate.
I discussed this stuff with @KatherineC earlier this week at SpeckleCon, and thought it would be worth bringing to the attention of the Speckle Community and @Automatons.
Credit where credit is due
Apart from kedro, @RamonvanderHeijden, Evan Levelle, and Martin Riese (2015) must also be credited for proposing this concept under the name of “Building Information Generation” [1].
Sources:
1. Van Der Heijden, R., Levelle, E., & Riese, M. (2015). Parametric building information generation for design and construction. In Computational Ecologies: Design in the Anthropocene – 35th Annual Conference of the Association for Computer Aided Design in Architecture (pp. 417–429).
2. Alam, S., Chan, N. L., Couto, L., Dada, Y., Danov, I., Datta, D., DeBold, T., Gundaniya, J., Honoré-Rougé, Y., Kaiser, S., Kanchwala, R., Katiyar, A., Pilla, R. K., Nguyen, H., Cano Rodríguez, J. L., Schwarzmann, J., Sorokin, D., Theisen, M., Zabłocki, M., & Brugman, S. (2024). Kedro (Version 0.19.9) [Computer software]. https://github.com/kedro-org/kedro