In the face of a growing global population, plant breeding is being used as a sustainable tool for improving food security. A wide range of high-throughput omics technologies has been developed and applied in plant breeding to accelerate crop improvement and to develop new varieties with higher yield and greater resilience to climate change, pests, and diseases. These technologies generate large amounts of data that can be exploited to manipulate the key plant characteristics important for crop improvement. Plant breeders have therefore come to rely on high-performance computing, bioinformatics tools, and Artificial Intelligence (AI), particularly Machine Learning (ML) methods, to analyse this vast and complex data efficiently. Machine Learning is a sub-field of AI concerned with developing models and algorithms that learn from data and make predictions or decisions without being explicitly programmed. The Machine Learning process consists of data collection, data preparation, data wrangling, data analysis, model training, model testing, and model deployment. Machine Learning is classified into supervised learning (which works with labelled data), unsupervised learning (which works with unlabelled data), and reinforcement learning (which learns by interacting with an environment). The Machine Learning algorithms most commonly used in plant breeding are Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), k-means clustering, and Ensemble Learning algorithms [1].
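The Machine Learning process listed above can be illustrated with a minimal supervised-learning sketch in plain Python. Everything in it is an assumption made for illustration: the data are synthetic, the trait (plant height), the yield relationship, and all function names are hypothetical, and a one-feature least-squares line stands in for the more capable algorithms named above.

```python
import random
import statistics

random.seed(42)

# 1. Data collection: synthetic (hypothetical) records of one trait and yield.
#    Real data would come from field trials or omics platforms.
def collect_data(n=100):
    data = []
    for _ in range(n):
        plant_height = random.uniform(50.0, 150.0)                # cm, made-up range
        grain_yield = 0.04 * plant_height + random.gauss(0, 0.3)  # t/ha, made-up relation
        data.append((plant_height, grain_yield))
    return data

# 2. Data preparation: hold out a test set before any fitting takes place.
def split(data, test_fraction=0.25):
    shuffled = data[:]
    random.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# 3. Model training: ordinary least squares for a single feature.
def fit_line(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# 4. Model testing: coefficient of determination on unseen data.
def r_squared(ys, preds):
    my = statistics.mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

data = collect_data()
train, test = split(data)

# Scaling statistics are learned from the training set only.
height_mean = statistics.mean(x for x, _ in train)
height_std = statistics.pstdev([x for x, _ in train])

def scale(plant_height):
    return (plant_height - height_mean) / height_std

slope, intercept = fit_line([scale(x) for x, _ in train], [y for _, y in train])

# 5. Deployment: a function that scores a new, unseen plant.
def predict(plant_height):
    return slope * scale(plant_height) + intercept

test_r2 = r_squared([y for _, y in test], [predict(x) for x, _ in test])
print(f"test R^2 = {test_r2:.3f}")
```

The labelled yield values make this a supervised problem. Dropping the labels and grouping plants by trait similarity alone, for example with k-means clustering, would be the unsupervised counterpart.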
In soybean, yield was predicted from hyperspectral reflectance using an Ensemble Stacking algorithm that combined SVM, ANN, and RF. Ensemble Stacking achieved a prediction accuracy of 91.03%, with Random Forest, which had the highest individual prediction accuracy (83.34%), used as the meta-classifier. These results demonstrated that Ensemble Learning can be used to identify high-yielding soybean varieties during early growth stages [2]. In tobacco, Artificial Neural Network based optimization of plant traits to increase quality and quantity simultaneously showed that an ANN with one hidden layer of twelve neurons was appropriate for predicting and optimizing the output. The results demonstrated that precise optimization of the tested regressors could simultaneously increase the potential quantity (by about 3%) and quality (1.8% nicotine, 1.5% Cl, and 4.75% K contents) of cured tobacco leaf [3].
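The idea behind the ensemble stacking used in the soybean study can be sketched as follows: several base learners are fitted, their out-of-fold predictions become the input features of a meta-learner, and the meta-learner produces the final prediction. This is not the study's implementation. To stay dependency-free it assumes synthetic data, treats yield as a regression target, and substitutes simple stand-ins (two single-feature least-squares lines and a k-nearest-neighbour model as base learners, with k-nearest-neighbour as the meta-learner) for the SVM, ANN, and RF of the study. All class and function names are hypothetical.

```python
import math
import random
import statistics

random.seed(0)

def make_data(n):
    """Synthetic stand-in for (reflectance features, yield) pairs."""
    rows = []
    for _ in range(n):
        x = (random.uniform(0, 1), random.uniform(0, 1))
        y = 3.0 * x[0] + 2.0 * x[1] ** 2 + random.gauss(0, 0.1)
        rows.append((x, y))
    return rows

class FeatureLine:
    """Base learner: least-squares line on a single feature."""
    def __init__(self, j):
        self.j = j
    def fit(self, X, y):
        xs = [row[self.j] for row in X]
        mx, my = statistics.mean(xs), statistics.mean(y)
        sxx = sum((v - mx) ** 2 for v in xs)
        self.slope = sum((v - mx) * (t - my) for v, t in zip(xs, y)) / sxx
        self.intercept = my - self.slope * mx
        return self
    def predict(self, X):
        return [self.slope * row[self.j] + self.intercept for row in X]

class KNN:
    """k-nearest-neighbour regressor, used as base learner and as meta-learner."""
    def __init__(self, k):
        self.k = k
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self
    def predict(self, X):
        out = []
        for q in X:
            pairs = sorted(zip(self.X, self.y), key=lambda p: math.dist(p[0], q))
            out.append(statistics.mean([t for _, t in pairs[: self.k]]))
        return out

def make_bases():
    return [FeatureLine(0), FeatureLine(1), KNN(5)]

def fit_stack(X, y, n_folds=5):
    n = len(X)
    oof = [None] * n  # out-of-fold base predictions, one tuple per training row
    for f in range(n_folds):
        hold = [i for i in range(n) if i % n_folds == f]
        rest = [i for i in range(n) if i % n_folds != f]
        models = [m.fit([X[i] for i in rest], [y[i] for i in rest]) for m in make_bases()]
        preds = [m.predict([X[i] for i in hold]) for m in models]
        for pos, i in enumerate(hold):
            oof[i] = tuple(p[pos] for p in preds)
    meta = KNN(5).fit(oof, y)                         # meta-learner sees only base predictions
    bases = [m.fit(X, y) for m in make_bases()]       # refit base learners on all training data
    return bases, meta

def predict_stack(bases, meta, X):
    preds = [m.predict(X) for m in bases]
    return meta.predict(list(zip(*preds)))

def rmse(y, p):
    return math.sqrt(statistics.mean([(a - b) ** 2 for a, b in zip(y, p)]))

rows = make_data(120)
X = [x for x, _ in rows]
y = [t for _, t in rows]
X_train, y_train, X_test, y_test = X[:90], y[:90], X[90:], y[90:]

bases, meta = fit_stack(X_train, y_train)
stacked = predict_stack(bases, meta, X_test)
baseline = [statistics.mean(y_train)] * len(y_test)   # mean-only reference predictor
print(f"stacked RMSE:  {rmse(y_test, stacked):.3f}")
print(f"baseline RMSE: {rmse(y_test, baseline):.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for rows the base learners have already seen, is the design choice that keeps the meta-learner from simply trusting whichever base learner overfits the most.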
Machine Learning provides a great opportunity to make plant breeding more efficient and predictable. ML can be employed at almost every step of plant breeding, from selecting appropriate parental lines for crosses to evaluating the performance of advanced breeding lines across several environments. ML algorithms will equip breeders with efficient and effective tools to accelerate the development of new plant varieties and improve the efficiency of the breeding process, both of which are important for tackling future challenges in the era of climate change [4].
References:
1. ALI, J., ANUMALLA, M., MURUGAIYAN, V. AND LI, Z., 2021, Machine Learning in plant science and plant breeding. Theor Appl Genet., 13(3):1427-1442.
2. MOHSEN, T.K., GREGORIO, G.B., DALISAY, T.U., DIAZ, M.G.Q., CH, B. AND SWAMY, B.M., 2022, Application of machine learning algorithm in plant breeding: predicting yield from hyperspectral reflectance in soybean. Sci Rep., 12(1):1881-1886.
3. SALEHZADEH, H., GHOLIPOOR, M., ARUMUGAM PILLAI, M.P., ARUMUGACHAMY, S. AND VANNIARAJAN, C., 2020, Optimizing plant traits to increase yield quality and quantity in tobacco using artificial neural network. Front Plant Sci., 11(2):591-599.
4. YUN, S., ALI, J., ZHOU, S., REN, G., XIE, H., PENG, S., MA, L. AND YUAN, D., 2022, Machine learning bridges omics and plant breeding. Mol Plant., 15(1):9-26.