Creating visually realistic plant images to train neural networks

An interactive method for creating and calibrating developmental plant models expressed using L-systems

This WIWAM robot consists of a conveyor network that brings the plants to weighing-watering stations and imaging cabins, harboring a range of non-destructive camera systems.

Plants continuously move on a conveyor belt past sensors that collect large volumes of data, including images that can be used to extract plant traits as the plants grow. High-throughput phenotyping like this helps plant breeders determine which features and genomic characteristics are most critical to plant improvement.

Although the systems for obtaining these images and data are complicated, interpreting the vast quantities of images is a bigger challenge.

Computer vision algorithms that use artificial neural networks and deep learning to recognize and quantify relevant aspects of crop plants show promise in meeting this challenge. However, these neural networks must be trained on large sets of annotated images in which the architectural features of interest are labelled, and such image sets are time-consuming and expensive to obtain.

The use of annotated synthetic images of plants provides a feasible alternative.

In a new article published in in silico Plants, Mikolaj Cieslak, Senior Research Associate at the University of Calgary, and colleagues present a new modelling process that provides a practically unlimited number of annotated images reflecting individual variation of plants to train neural networks. The paper presents the vegetative development of maize (Zea mays L.) and both the vegetative and flowering development of canola (Brassica napus L.) as examples.

The modelling process for each species was divided into two stages: (1) the construction of an L-system, capturing the essential elements of the plant species of interest qualitatively, and (2) model calibration to a set of photographs of reference plants.
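To give a rough sense of what stage (1) involves, here is a minimal sketch of L-system string rewriting in Python. It is not the authors' model, which was written for dedicated plant modelling software. The symbols (`A` for an apex, `I` for an internode, `[L]` for a lateral leaf) and the single rule are hypothetical choices made only for illustration.

```python
def rewrite(axiom: str, rules: dict[str, str], steps: int) -> str:
    """Apply the production rules to every symbol in parallel, `steps` times."""
    current = axiom
    for _ in range(steps):
        # Symbols without a rule are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Hypothetical toy rule: at each step the apex produces an internode,
# a lateral leaf (in brackets), and a new apex.
RULES = {"A": "I[L]A"}

for step in range(4):
    print(step, rewrite("A", RULES, step))
```

Each derivation step adds one internode and one leaf below the apex, so the string grows the way a simple unbranched shoot would. A parametric L-system, as used in the paper, additionally attaches numerical parameters to each symbol.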

“For both species, we used parametric L-systems to create a simple, descriptive model of development. The L-system model is organized around the concept of positional information, which means that the key quantitative aspects of the target plant form, such as the distribution of branches, leaves and reproductive organs, are expressed as intuitive, easy to manipulate functions of position on their supporting axes. Developmental processes are simulated by multiplying functions of positional information by functions of time,” the authors explain.
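The idea of multiplying a function of position by a function of time can be sketched in a few lines of Python. The specific curves below (a sine profile along the stem and a smoothstep growth curve) and all numerical values are assumptions chosen for illustration, not the functions used in the paper.

```python
import math


def final_leaf_length(position: float, max_length: float = 60.0) -> float:
    """Hypothetical positional profile: leaf length (cm) at maturity as a
    function of normalized position on the stem (0 = base, 1 = tip)."""
    return max_length * math.sin(math.pi * position)


def growth_fraction(age: float, duration: float = 10.0) -> float:
    """Hypothetical time profile: fraction of final size reached by an organ
    of the given age (days), using a smoothstep curve clamped to [0, 1]."""
    t = min(max(age / duration, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def leaf_length(position: float, age: float) -> float:
    """Current leaf length: positional function multiplied by time function."""
    return final_leaf_length(position) * growth_fraction(age)


# Leaves at three positions on the stem, each observed at three ages.
for position in (0.25, 0.5, 0.75):
    print(position, [round(leaf_length(position, age), 1) for age in (0, 5, 10)])
```

Separating the two factors means the mature shape of the plant can be adjusted through the positional function alone, while the time function controls only how each organ approaches that final size.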

Calibration was based on aligning the model with a reference plant using a graphical interface. The reference plants can represent a specific developmental stage or a sequence of stages. The models can be calibrated to capture genetic diversity, the influence of the environment (e.g. water limitation), and/or individual variation of plants. 

Calibration of a maize model.

Once calibrated, the model can generate a practically unlimited number of annotated images of synthetic plants by randomizing its parameters using normally distributed random variables (see Figure 1). The calibrated models can be used to visualize individual plants at different developmental stages (see top of Figure 2) or be assembled into models of entire plots (see bottom of Figure 2).
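The randomization step might look something like the following Python sketch, in which each calibrated parameter becomes the mean of a normal distribution. The parameter names, their values, and the choice of a standard deviation proportional to the mean are all hypothetical; the article does not say how the spread of each distribution was chosen.

```python
import random


def randomize(calibrated: dict[str, float], rel_sd: float,
              rng: random.Random) -> dict[str, float]:
    """Draw each parameter from a normal distribution centred on its
    calibrated value, with standard deviation rel_sd * |value| (an assumption)."""
    return {
        name: rng.gauss(value, abs(value) * rel_sd)
        for name, value in calibrated.items()
    }


# Hypothetical calibrated parameters for one reference plant.
calibrated = {
    "internode_length_cm": 4.0,
    "branch_angle_deg": 35.0,
    "leaf_length_cm": 12.0,
}

rng = random.Random(42)  # fixed seed so the synthetic population is reproducible
population = [randomize(calibrated, 0.1, rng) for _ in range(5)]

for plant in population:
    print({name: round(value, 2) for name, value in plant.items()})
```

Each parameter set would then drive one run of the plant model. Because the model itself generates every organ, the label for each part of the rendered image is known without manual annotation.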

Figure 1: Sample canola plants with contrasting architectures (top) and their calibrated models (bottom).
Figure 2: Top – Simulated stages of the development of an individual plant (days after seeding). Bottom – model of a canola plot.

Cieslak adds: “The synthetic annotated data will help in training neural networks to identify semantic plant traits in image-based phenotyping tasks. Our next step is to extract phenotypic traits from the maize and canola datasets. However, the usefulness of the calibrated models goes beyond annotations for training neural networks. The models provide a quantitative estimate of the architectural parameters of these plants over time without direct measurements (a very time-consuming process). They can also provide a basis for construction of more comprehensive models, incorporating functional aspects of a plant’s development.”


Mikolaj Cieslak, Nazifa Khan, Pascal Ferraro, Raju Soolanayakanahally, Stephen J Robinson, Isobel Parkin, Ian McQuillan, Przemyslaw Prusinkiewicz, L-system models for image-based phenomics: case studies of maize and canola, in silico Plants, 2021, diab039.


The models were implemented using the Virtual Laboratory 4.5.1 plant modeling software running on macOS High Sierra v.10.13.6, and are available at the Algorithmic Botany website.

Rachel Shekar

Rachel (she/her) is a Founding and Managing Editor of in silico Plants. She has a Master’s Degree in Plant Biology from the University of Illinois. She has over 15 years of academic journal editorial experience, including the founding of GCB Bioenergy and the management of Global Change Biology. Rachel has overseen the social media development that has been a major part of promotion of both journals.
