As part of the ongoing graduation project defenses for the AI and Data Science program (2025–2026), IUTT presented a distinguished research project titled "Arabic Document Layout Analysis Across Hierarchical Levels: Paragraphs, Lines, and Words Using a Modified U-Net."
The project aims to develop an intelligent system that segments Arabic documents at three hierarchical levels (paragraphs, lines, and words) in order to improve the performance of Optical Character Recognition (OCR) systems. The team built a modified U-Net model trained with a hybrid loss function that combines Dice loss and binary cross-entropy, addressing challenges such as overlapping text and the connected letterforms of Arabic script.
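The hybrid loss described above can be illustrated with a minimal sketch. It assumes the two terms are combined as a weighted sum over flattened per-pixel probabilities; the function names, the `dice_weight` parameter, and the smoothing constants are illustrative assumptions and are not taken from the team's implementation.

```python
import math

def dice_loss(probs, targets, smooth=1.0):
    """Dice loss: 1 - 2|P.T| / (|P| + |T|), smoothed to avoid division by zero."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    # Clip probabilities so that log() never receives 0.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(clipped, targets)]
    return sum(losses) / len(losses)

def hybrid_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice loss and BCE (the weighting scheme is an assumption)."""
    return (dice_weight * dice_loss(probs, targets)
            + (1.0 - dice_weight) * bce_loss(probs, targets))
```

In this kind of combination, the Dice term rewards overlap between the predicted and true text regions regardless of how small they are, while the cross-entropy term penalizes each pixel individually, which is one common motivation for pairing the two in segmentation tasks.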
The model achieved Intersection over Union (IoU) scores of 0.896 for line segmentation and 0.900 for word segmentation, outperforming several previous works. The project also made an original scientific contribution: a new, manually annotated word-level dataset of 7,881 images, paving the way for more robust digital document processing solutions.
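For readers unfamiliar with the metric, IoU measures how much a predicted segmentation mask overlaps the ground-truth mask. The sketch below is a generic illustration on flat binary masks, not the project's evaluation code; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def iou(pred_mask, true_mask):
    """Intersection over Union of two binary masks given as flat 0/1 sequences."""
    pairs = list(zip(pred_mask, true_mask))
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

pred = [1, 1, 0, 0, 1]
true = [1, 0, 0, 1, 1]
print(iou(pred, true))  # 2 shared pixels out of 4 in the union -> 0.5
```

A score of 0.900 therefore means that the overlap between predicted and annotated regions covers 90% of their combined area.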
Project Members: Hisham Al-Dhabhani, Al-Qassam Al-Saidi, Ali Al-Shahari, Anas Al-Aghbari, Nawar Al-Azazi.
Supervision: Dr. Amin Shayae, Mr. Mohammed Al-Qumasi.
Internal Defense Committee: Dr. Hamzah Jamel, Dr. Ayman Al-Sabri, Prof. Dr. Fadhl Ba-Alawi.
External Defense Committee: Prof. Dr. Ahmed Sultan Al-Hajami, Assoc. Prof. Dr. Malik Al-Jabri.