From Few Images to High Accuracy: Augmentation and Embedding Methods for Date Fruit Ripeness

Article type: Research article

Abstract

Manual date harvesting and sorting remain labor-intensive and error-prone, particularly when distinguishing intermediate ripeness stages such as Rotab. We present an image-based classification pipeline for the Berhi cultivar that assigns fruit to three ripeness stages (Khalal, Rotab, and Tamar) using compact deep networks and training strategies suited to small datasets. Rather than relying on generative or adversarial methods, our approach emphasizes (i) careful augmentation (classical transforms, automated policies, and sample mixing), (ii) transfer learning and self-supervised pretraining, and (iii) embedding- and metric-learning alternatives, with ensembles and test-time augmentation used as optional boosters of accuracy and robustness. On a 150-image dataset (50 images per class) evaluated with 5-fold cross-validation, a ResNet18 baseline reaches about 95% average accuracy. Automated augmentation combined with MixUp/CutMix improves accuracy to 97%, and self-supervised pretraining combined with advanced augmentation and ensembling attains peak performance near 98%. Improvements are most pronounced for the visually ambiguous Rotab class. We also report practical robustness measures (accuracy under common corruptions, geometric stability, and calibration), which show that augmentation and pretraining substantially increase stability under realistic input variability. These results indicate that, for small and visually subtle datasets, augmentation and pretraining, rather than synthetic data generation, offer a pragmatic path to high accuracy and robust behavior.
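The sample-mixing idea named in the abstract (MixUp) can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes images flattened to lists of floats, one-hot labels over the three ripeness classes, and a mixing weight drawn from a Beta(alpha, alpha) distribution. The helper names, the pixel values, and alpha = 0.4 are illustrative assumptions.

```python
import random

# Class names come from the abstract; everything else here is a hypothetical toy setup.
CLASSES = ["Khalal", "Rotab", "Tamar"]


def one_hot(label, num_classes):
    """Return a one-hot label vector as a list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha); alpha is an assumed hyperparameter."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their labels: lam * a + (1 - lam) * b, element-wise."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


rng = random.Random(0)           # fixed seed so the sketch is repeatable
lam = sample_lambda(0.4, rng)    # assumed alpha, not taken from the paper
x_mix, y_mix = mixup_pair(
    [0.9, 0.8, 0.2], one_hot(0, len(CLASSES)),   # toy "Khalal" sample
    [0.3, 0.2, 0.1], one_hot(2, len(CLASSES)),   # toy "Tamar" sample
    lam,
)
print(lam, x_mix, y_mix)
```

The blended label keeps the same proportions as the blended image, so the classifier is trained toward soft targets instead of hard class boundaries. This is one plausible reason sample mixing would help a visually intermediate class such as Rotab.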

Highlights

  • Evaluates a set of practical models and embedding/metric-learning alternatives on a three-way ripeness classification task
  • Presents automated augmentation policies combined with sample mixing
  • Reports deployment-relevant robustness metrics (average accuracy under common corruptions, geometric stability under small transforms, and calibration measured by expected calibration error, ECE)
  • Shows that augmentation combined with pretraining substantially improves stability
  • Provides a practical recipe for small agricultural imaging datasets
  • Favors augmentation and pretraining over synthetic-image generation
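
The calibration metric listed in the highlights (ECE) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code. It assumes equal-width confidence bins, ten bins by default, and inputs given as each prediction's top-class confidence plus a flag for whether that prediction was correct. The function name and the example values are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-size-weighted mean of |accuracy - mean confidence| over equal-width bins.

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the prediction matched the label
    n_bins:      assumed number of bins (10 is a common choice)
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    # Assign each prediction to a bin [k/n_bins, (k+1)/n_bins); 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: two confident and right, two uncertain with one wrong.
toy_ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [True, True, True, False])
print(toy_ece)
```

A lower value means the reported confidences track the observed accuracy more closely, which matters when a sorting line uses a confidence threshold to route uncertain fruit for manual inspection.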

Authors

  • Raziyeh Pourdarbani 1
  • Omid Daliran 2
  • Sajad Sabzi 3
1 Department of Biosystem Engineering, University of Mohaghegh Ardabili
2 Department of Computer Engineering, Sharif University of Technology, Tehran 14588-89694, Iran.
3 Department of Biosystems Engineering, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran.

Keywords

  • Date fruit
  • Ripeness
  • Deep learning
  • Self-supervised learning
  • Metric learning
  • Robustness
 
Volume 13, Issue 3
Ordibehesht 1405
Pages 271-280
  • Received: 04 Aban 1404
  • Revised: 22 Aban 1404
  • Accepted: 01 Azar 1404
  • First published: 01 Azar 1404
  • Published: 01 Ordibehesht 1405