E-ISSN 2667-6540


This work is licensed under the Creative Commons Attribution 4.0 International License.

Multivariate Dissection Of Seed Yield Determinants In Bread Wheat: Integrating Path Analysis, K-Means Clustering And Machine-Learning Approaches [IJABES]
IJABES. 2025; 7(2): 41-57 | DOI: 10.5505/ijabes.2025.41275

Multivariate Dissection Of Seed Yield Determinants In Bread Wheat: Integrating Path Analysis, K-Means Clustering And Machine-Learning Approaches

Nazife Gözde Ayter Arpacıoğlu, Zekiye Budak Başçiftçi, Murat Olgun
Eskişehir Osmangazi University, Faculty of Agriculture, Department of Field Crops

This study evaluated twelve commercial wheat varieties using major agronomic and quality-related yield components: seed yield (Seed Y), heading date (Heading D), plant height (Plant H), seed number per spike (Seed N/Sp), seed weight per spike (Seed W/Sp), thousand seed weight (Thou SW), and test weight (Test W). Descriptive statistics revealed considerable phenotypic variation among genotypes, indicating strong genetic diversity in both yield potential and kernel quality characteristics. Correlation analysis showed that Seed N/Sp and Seed W/Sp were positively and strongly associated with Seed Y, supporting the widely recognized “seed number × seed weight” principle of wheat yield formation. Path analysis identified Seed W/Sp and Seed N/Sp as the most influential components, with the highest direct effects on yield. Thousand seed weight and test weight contributed mainly through indirect effects, highlighting their secondary but supportive roles in shaping final productivity. Plant height also exhibited a meaningful direct effect, pointing to the contribution of biomass production and source–sink relationships to seed filling. K-means clustering (k = 3) separated the twelve varieties into high-yield/high-quality, low–moderate-yield, and morphologically distinct late-heading groups. High-performing varieties such as Yunus, Tosunbey, Bezostaja, and Nacibey clustered together owing to superior seed number, seed weight, and kernel density traits. Random Forest regression, although limited in predictive accuracy by the small dataset, reinforced the central importance of Seed N/Sp and Seed W/Sp as key predictors of yield variability. Overall, the integrated statistical and multivariate framework indicates that seed yield in bread wheat is predominantly governed by seed number and seed weight, while kernel quality traits provide additional indirect support. These results offer valuable insights for breeding programs aiming to identify superior parents and design ideotypes with improved yield performance.

Keywords: Bread wheat, Triticum aestivum L., path analysis, K-means clustering, Random Forest, agronomic traits
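
The distinction the abstract draws between direct and indirect effects can be illustrated with a minimal sketch of classical path analysis. It assumes the usual formulation in which the direct effects p solve R_xx p = r_xy, so that each trait's correlation with yield splits into its own direct effect plus indirect effects routed through the other traits. The correlation values below are hypothetical and chosen only for illustration; they are not the study's data, and the function names are likewise assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def path_coefficients(r_xx, r_xy):
    """Return direct effects and the matrix of indirect effects.

    indirect[i][j] is the effect of trait i on yield routed through trait j.
    """
    direct = solve_linear(r_xx, r_xy)
    n = len(direct)
    indirect = [
        [r_xx[i][j] * direct[j] if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return direct, indirect


# Hypothetical correlations (NOT from the study), for illustration only.
traits = ["Seed N/Sp", "Seed W/Sp", "Thou SW"]
r_xx = [
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
r_xy = [0.80, 0.85, 0.45]  # hypothetical correlation of each trait with Seed Y

direct, indirect = path_coefficients(r_xx, r_xy)
r_squared = sum(p * r for p, r in zip(direct, r_xy))
residual = math.sqrt(1.0 - r_squared)  # effect of unexplained factors

for i, name in enumerate(traits):
    print(f"{name}: direct={direct[i]:.3f}, "
          f"total indirect={sum(indirect[i]):.3f}, r={r_xy[i]:.2f}")
print(f"R^2={r_squared:.3f}, residual effect={residual:.3f}")
```

In this made-up example the third trait has a small direct effect but a larger indirect one, which is the kind of pattern the abstract reports for thousand seed weight and test weight.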


Corresponding Author: Nazife Gözde Ayter Arpacıoğlu, Türkiye
Manuscript Language: English