Smart Agriculture – Paddy Yield Prediction for Sri Lanka

Using Stacked Ensemble Learning

PEER-REVIEWED & PUBLISHED
View Full Publication on SciForum

Published in the 5th International Electronic Conference on Agronomy (IECAG 2025)

Published: 2025

Domain: Agricultural Data Science

Accuracy: R² = 0.9986, NRMSE = 0.76%

Abstract

Paddy (Oryza sativa) is a staple food that feeds over 50% of the global population. Sri Lanka has cultivated paddy for centuries, and with the population growing rapidly, forecasting paddy yield has become essential for food security and agricultural planning.

This research aims to predict district-wise paddy yield in Sri Lanka using openly available data: CHIRPS 2.0, the NASA POWER APIs, pH and salinity maps from the Sri Lankan Rice Research and Development Institute, and paddy statistics from the Department of Census and Statistics – Sri Lanka for local agro zones.

Methodology

Data from 2004 to 2024 were collected for the three major agro-climatic zones and 25 administrative districts, covering the two harvesting seasons, "Yala" and "Maha". The target variable is the total paddy production per district in metric tons (MT), ranging from 185 MT (Mannar, 2006 Yala) to 530,356 MT (Anuradhapura, 2019-2020 Maha).

Using the crop calendar template from the Department of Agriculture – Sri Lanka, end-to-end crop harvesting simulations were constructed. By combining these simulations with climate variables, soil properties, and historical yield records, we created 12 heterogeneous datasets.
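One way to read the step above is that each season is split into growth-stage windows and the climate series is summarised per window. The sketch below illustrates that idea only. The stage names, stage durations, sowing date, and rainfall values are all hypothetical and are not taken from the Department of Agriculture template or from the study's data.

```python
from datetime import date, timedelta

# Hypothetical crop calendar: (stage name, duration in days).
# These stages and durations are illustrative assumptions only.
CROP_CALENDAR = [
    ("nursery", 21),
    ("vegetative", 45),
    ("reproductive", 35),
    ("ripening", 30),
]


def stage_windows(sowing_date, calendar=CROP_CALENDAR):
    """Return (stage, first_day, last_day) tuples laid end to end from sowing."""
    windows = []
    start = sowing_date
    for stage, days in calendar:
        end = start + timedelta(days=days - 1)
        windows.append((stage, start, end))
        start = end + timedelta(days=1)
    return windows


def rainfall_by_stage(daily_rain_mm, sowing_date, calendar=CROP_CALENDAR):
    """Sum daily rainfall (a dict of date -> mm) inside each stage window."""
    totals = {}
    for stage, start, end in stage_windows(sowing_date, calendar):
        totals[stage] = sum(
            mm for day, mm in daily_rain_mm.items() if start <= day <= end
        )
    return totals


# Made-up input: a constant 2 mm/day for 200 days, with an assumed sowing date.
sowing = date(2019, 10, 15)
daily = {date(2019, 10, 1) + timedelta(days=i): 2.0 for i in range(200)}
features = rainfall_by_stage(daily, sowing)
print(features)
```

Each district-season pair would then contribute one row of stage-level features, which can be joined with soil properties and historical records.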

These datasets were used to train 12 base models. The base models' out-of-fold predictions fed two meta models and, finally, a stacked meta model.
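The layered design can be sketched at a smaller scale. The example below uses three base models instead of 12 and plain ridge regression at every level, because the source does not name the model types. The synthetic data, the feature subsets standing in for heterogeneous datasets, the fold count, and the `alpha` values are all assumptions made for illustration. The part that carries over is the out-of-fold mechanism: each row is predicted by a model that never saw that row during training, so the next level is trained on honest predictions.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalised intercept."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def predict_ridge(weights, X):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return Xb @ weights


def out_of_fold_predictions(X, y, alpha, n_folds=5, seed=0):
    """Predict every row with a model fitted on the other folds only."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    oof = np.empty(len(y))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        weights = fit_ridge(X[train], y[train], alpha)
        oof[held_out] = predict_ridge(weights, X[held_out])
    return oof


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (not the study's data).
rng = np.random.default_rng(42)
n_rows = 200
X = rng.normal(size=(n_rows, 6))
true_coefs = np.array([3.0, -2.0, 1.5, 2.5, -1.0, 2.0])
y = X @ true_coefs + rng.normal(scale=0.1, size=n_rows)

# Each "dataset" is a different view of the features (an assumption).
feature_views = [[0, 1], [2, 3], [4, 5]]

# Level 0: one base model per view, producing out-of-fold predictions.
base_oof = np.column_stack(
    [out_of_fold_predictions(X[:, cols], y, alpha=1.0) for cols in feature_views]
)

# Level 1: two meta models trained on the base out-of-fold predictions.
meta_oof = np.column_stack(
    [out_of_fold_predictions(base_oof, y, alpha=a, seed=1) for a in (1.0, 10.0)]
)

# Level 2: a final stacked model that combines the two meta models.
stacked_oof = out_of_fold_predictions(meta_oof, y, alpha=1.0, seed=2)

base_rmse = [rmse(y, base_oof[:, j]) for j in range(base_oof.shape[1])]
stacked_rmse = rmse(y, stacked_oof)
print("base RMSE:", [round(v, 3) for v in base_rmse])
print("stacked RMSE:", round(stacked_rmse, 3))
```

In this toy setup each base model sees only part of the signal, so the stacked prediction should have a lower error than any single base model. How much stacking helps on real data depends on how complementary the base models are.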

Key Results

R² Score: 0.9986
RMSE: 3,535 MT
NRMSE: 0.76%

The final model predicted district-level seasonal paddy yield with an R² of 0.9986 and an NRMSE of 0.76%, demonstrating the potential of combining open data with machine learning for sustainable agriculture in Sri Lanka.
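For readers unfamiliar with the three reported metrics, the sketch below shows how they are commonly computed. The source does not state how NRMSE was normalised. Dividing RMSE by the range of observed values is one common convention and is an assumption here (dividing by the mean is another). The sample numbers are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (e.g. MT)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(y_true, y_pred):
    """RMSE as a percentage of the observed range (assumed normalisation)."""
    span = max(y_true) - min(y_true)
    return 100.0 * rmse(y_true, y_pred) / span


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up production values in MT, for illustration only.
observed = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3100.0, 3900.0]

print(rmse(observed, predicted))           # 100.0
print(nrmse_percent(observed, predicted))  # about 3.33
print(r2_score(observed, predicted))       # about 0.992
```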

Research Collaboration

Ms. Achinthi Premasiri (BSc, University of Jaffna; MPhil, University of Ruhuna)

Joint research combining expertise in agricultural science and machine learning