A new project attempts to tame the plant model wilderness by creating a dedicated modeling platform that supports collaborative and distributed model design, reproducibility, and dissemination.
During the last few decades, siloed teams have developed models using different programming languages, with different degrees of modularity and interoperability. As plant scientists rush to meet growing yield demands in the face of climate change, advanced technologies in molecular biology, biochemistry and high-performance computing provide an unprecedented opportunity to create models that guide rapid advances in plant breeding. Progress will also require research to move beyond modelling at single scales to integrative, multiscale modelling that takes full advantage of our understanding of molecular mechanisms and the wealth of genome-wide data generated over the last three decades. However, the ability to build integrative multiscale models is currently impeded by the difficulty of exchanging, re-using and combining models and simulation tools between teams (or even within a team), despite the existence of dedicated modelling platforms created for this purpose.
Dedicated modelling platforms have been in existence for the last 25 years, allowing users to create, execute, and interact with models and visualize their output (e.g., V-Laby, GroIMP, L-Py, AmapSim, AMAPmod, Capsis). Newer platforms also facilitate the integration and interoperability of heterogeneous models and data structures (e.g., OpenAlea and Yggdrasil).
In a new paper published in in silico Plants, Dr. Frédéric Boudon, Researcher in Plant Modelling and Computer Science at UMR AGAP Institut at the University of Montpellier, and colleagues present a new user-friendly virtual modeling environment based on Jupyter notebooks. Their approach tackles several problems commonly encountered by plant modelers, including reproducibility, reuse, modularity, collaboration, and maintenance.
According to Boudon, “the use of Jupyter notebook makes our platform unique because its ability to create modelling narratives makes it possible to give users access to the different steps of the modelling pipeline in a clear, documented and shareable way. We also use a standard representation of multidimensional arrays to represent plant properties, which improves the efficiency of modeling and the coding process because it does not require custom codes to extract, transform and visualize data. Those features are provided out-of-the-box by the Python scientific stack, minimizing the maintenance burden.”
The Jupyter-based modeling environment makes reproducible, reusable, collaborative and distributed model design possible. The notebook format supports clear specification of processes and documentation to create the simulation narrative of a modeling scenario. Because the hypotheses of the model and the actual parameter values are stated explicitly in the notebook, the information remains accessible to future users, and collaborators can test and modify the model. The inclusion of the conda package management system makes it possible to clearly specify software dependencies. In addition, the environment allows models to be developed remotely, so users do not need extensive computational resources of their own. This further facilitates collaborative and distributed model design and implementation.
Increasing model modularity is possible due to the inclusion of xarray-simlab, a Python library for organizing and executing simulations. The library provides a framework to compose complex computational models from sets of reusable sub-models or modules. A collection of sub-models can be combined to form a model, and their computational ordering is entirely deduced from process dependencies. This modularity lets users run simulations for a subset of processes only or even define alternative processes to replace predefined ones.
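To make the idea of dependency-driven ordering concrete, here is a minimal sketch using only the Python standard library (Python 3.9 or later). It does not use xarray-simlab's API and is not the authors' code: the `Process` class, the helper functions, and the toy processes with their coefficients are all hypothetical, chosen only to illustrate how an execution order can be deduced from declared inputs and outputs and how one process can be swapped for an alternative.

```python
from graphlib import TopologicalSorter


class Process:
    """Hypothetical sub-model: declares the variables it reads and writes."""

    def __init__(self, name, inputs, outputs, run):
        self.name = name
        self.inputs = inputs      # variable names this process needs
        self.outputs = outputs    # variable names this process produces
        self.run = run            # callable: state dict -> dict of new values


def order_processes(processes):
    """Deduce an execution order from input/output dependencies."""
    # Map each variable to the process that produces it.
    producer = {var: p.name for p in processes for var in p.outputs}
    # A process depends on whichever processes produce its inputs.
    graph = {
        p.name: {producer[v] for v in p.inputs if v in producer}
        for p in processes
    }
    return list(TopologicalSorter(graph).static_order())


def run_model(processes, initial_state):
    """Run every process once, in dependency order, and return the state."""
    by_name = {p.name: p for p in processes}
    state = dict(initial_state)
    for name in order_processes(processes):
        state.update(by_name[name].run(state))
    return state


# Toy processes with made-up coefficients (not taken from V-Mango).
growth = Process(
    "growth", ["temperature"], ["leaf_area"],
    lambda s: {"leaf_area": 0.5 * s["temperature"]},
)
photosynthesis = Process(
    "photosynthesis", ["leaf_area", "light"], ["carbon"],
    lambda s: {"carbon": s["leaf_area"] * s["light"]},
)
fruiting = Process(
    "fruiting", ["carbon"], ["fruit_mass"],
    lambda s: {"fruit_mass": 0.25 * s["carbon"]},
)

# The list is deliberately out of order; the order is deduced, not given.
model = [fruiting, photosynthesis, growth]
order = order_processes(model)
result = run_model(model, {"temperature": 20.0, "light": 2.0})

# Replace the predefined growth process with an alternative one.
constant_growth = Process(
    "growth", [], ["leaf_area"], lambda s: {"leaf_area": 4.0}
)
alt_model = [constant_growth if p.name == "growth" else p for p in model]
alt_result = run_model(alt_model, {"light": 2.0})

print(order)                     # ['growth', 'photosynthesis', 'fruiting']
print(result["fruit_mass"])      # 5.0
print(alt_result["fruit_mass"])  # 2.0
```

Running only a subset of processes works the same way in this sketch: passing `[growth, photosynthesis]` to `run_model` yields a state with `carbon` but no `fruit_mass`. A real framework such as xarray-simlab adds much more on top of this idea, including time stepping and array-based storage of variables.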
To illustrate the use of the new modelling environment, the authors redesigned V-Mango, an existing model of mango tree development and fruit production, and reorganized its code.
“We chose V-Mango because it was a complex model that could benefit from redesign and reorganization of code. The model was composed of processes implemented as simple functions or L-system rules with no way to distinguish them from each other. In addition, the interaction between sub-models was restricted by use of different language technologies,” explains Boudon.
The functionality of xarray-simlab in the Jupyter environment allowed the authors to redesign V-Mango and reorganize its code with little difficulty. This consisted of defining processes and their inputs and outputs, then assigning the corresponding model logic to each process (see old vs. new workflow).
Maintenance problems are reduced with the Jupyter platform because features are provided out-of-the-box by the Python scientific stack. This reduces the need for difficult-to-maintain custom code to extract, transform and visualize data.
The authors encourage others to try out the open-source platform themselves.
READ THE ARTICLE:
Jan Vaillant, Isabelle Grechi, Frédéric Normand, Frédéric Boudon, Towards virtual modeling environments for functional structural plant models based on Jupyter notebooks: Application to the modeling of mango tree growth and development, in silico Plants, 2021, diab040, https://doi.org/10.1093/insilicoplants/diab040
This manuscript is part of in silico Plants’ Functional Structural Plant Model special issue.
pgljupyter is available at https://github.com/fredboudon/plantgl-jupyter/ and vmango-lab at https://github.com/fredboudon/vmango-lab, each with installation instructions. All examples in section 3 of the article are available as notebooks in a demo repository at https://github.com/fredboudon/plantgl-jupyter/blob/isp2022/examples and can be inspected with nbviewer and reproduced either locally or on a binder instance. The notebooks described in section 4 are available at https://github.com/fredboudon/vmango-lab-demo/tree/isp2022.