Received 08.10.2024, Revised 15.01.2025, Accepted 26.02.2025
The purpose of the study was to analyse and generalise modern methods for recognising UML diagrams in images. The main focus was on automated extraction of text and graphic elements so that models can be reproduced in text formats. The research methodology consisted of an analysis of 23 scientific publications available in open sources, covering existing approaches to recognising UML diagrams in images. The analysis showed that modern recognition methods achieve more than 90% accuracy. The advantages, limitations, and effectiveness of classical computer vision algorithms, machine learning, and deep neural networks were investigated. Deep neural networks provided the best results in classification, while classical algorithms remain effective for interpreting and extracting elements of UML diagrams. The main research directions in UML diagram recognition were found to be the classification of UML diagram types, and the interpretation and conversion of UML images to text formats. The main problems identified were poor image quality, limited training data, and format variability. Possible areas of further research are presented, such as creating large annotated sets of UML diagrams to improve accuracy, and generalising modern approaches to support recognition of more diagram types. The findings will contribute to improving the automation of work with UML diagrams and provide an understanding of the current state of the information technology and software development industry, opening up new prospects for development.
image recognition; computer vision; machine learning; deep learning; automation
[1] Axt, M. (2023). Transformation of sketchy UML class diagrams into formal PlantUML models. Retrieved from https://www.diva-portal.org/smash/record.jsf?pid=diva2:1786365&dswid=-4498.
[2] Baraban, M., Baraban, S., & Garmash, V. (2021). Development of an advanced web application with a convolutional neural network for image recognition. Information Technologies and Computer Engineering, 18(1), 7-14. doi: 10.31649/19999941-2021-50-1-7-14.
[3] Bergström, G., Hujainah, F., Ho-Quang, T., Jolak, R., Rukmono, S.A., Nurwidyantoro, A., & Chaudron, M.R.V. (2022). Evaluating the layout quality of UML class diagrams using machine learning. Journal of Systems and Software, 192, article number 111413. doi: 10.1016/j.jss.2022.111413.
[4] Chen, F., Zhang, L., Lian, X., & Niu, N. (2022). Automatically recognizing the semantic elements from UML class diagram images. Journal of Systems and Software, 193, article number 111431. doi: 10.1016/j.jss.2022.111431.
[5] Conrardy, A., & Cabot, J. (2024). From image to UML: First results of image-based UML diagram generation using LLMs. doi: 10.48550/arXiv.2404.11376.
[6] De-Wyse, T., Renaux, E., & Mennesson, J. (2018). Using sketch recognition for capturing developer’s mental models. In Proceedings of the ACM/IEEE 21st international conference on model driven engineering languages and systems (pp. 23-28). Copenhagen: IEEE.
[7] Gosala, B., Chowdhuri, S.R., Singh, J., Gupta, M., & Mishra, A. (2021). Automatic classification of UML class diagrams using deep learning technique: Convolutional neural network. Applied Sciences, 11(9), article number 4267. doi: 10.3390/app11094267.
[8] Hebig, R., Ho-Quang, T., Robles, G., Fernandez, M.A., & Chaudron, M.R.V. (2016). The quest for open source projects that use UML: Mining GitHub. In Proceedings of the ACM/IEEE 19th international conference on model driven engineering languages and systems (pp. 173-183). New York: Association for Computing Machinery. doi: 10.1145/2976767.2976778.
[9] Hjaltason, J., & Samúelsson, I. (2014). Automatic classification of UML class diagrams through image feature extraction and machine learning. Sweden: University of Gothenburg and Chalmers University of Technology.
[10] Ho-Quang, T., Chaudron, M.R.V., Karasneh, B., & Osman, M. (2014). Automatic classification of UML class diagrams from images. In Proceedings of the 21st Asia-Pacific software engineering conference (pp. 422-429). Jeju: IEEE. doi: 10.1109/APSEC.2014.65.
[11] Jha, A., Dave, M., & Madan, S. (2019). Comparison of binary class and multi-class classifier using different data mining classification techniques. In Proceedings of international conference on advancements in computing & management (ICACM) 2019 (pp. 894-903). Rochester: SSRN. doi: 10.2139/ssrn.3464211.
[12] Karasneh, B., & Chaudron, M.R.V. (2013). Extracting UML models from images. In Proceedings of the 2013 5th international conference on computer science and information technology (pp. 134-137). Amman: IEEE. doi: 10.1109/CSIT.2013.6588776.
[13] Koenig, A., Allaert, B., & Renaux, E. (2023). NEURAL-UML: Intelligent recognition system of structural elements in UML class diagram. In Proceedings of the 5th workshop on artificial intelligence and model-driven engineering (pp. 605-613). Västerås: IEEE. doi: 10.1109/MODELS-C59198.2023.00099.
[14] Lank, E., Thorley, J.S., & Chen, S.J. (2000). An interactive system for recognizing hand drawn UML diagrams. In Proceedings of the IBM center for advanced studies conference (CASCON) (pp. 1-15). Mississauga: DBLP. doi: 10.1145/782034.782041.
[15] Moreno, V., Génova, G., Alejandres, M., & Fraga, A. (2020). Automatic classification of web images as UML static diagrams using machine learning techniques. Applied Sciences, 10(7), article number 2406. doi: 10.3390/app10072406.
[16] Munialo, S.W., Muketha, G.M., & Omieno, K.K. (2020). Automated feature extraction from UML images to measure SOA size. International Journal of Recent Technology and Engineering, 9(2), 1132-1136. doi: 10.35940/ijrte.B4131.079220.
[17] Osman, M.H., Ho-Quang, T., & Chaudron, M.R.V. (2018). An automated approach for classifying reverse-engineered and forward-engineered UML class diagrams. In Proceedings of the 44th EUROMICRO conference on software engineering and advanced applications (pp. 123-130). Prague: IEEE. doi: 10.1109/SEAA.2018.00070.
[18] Ott, J., Atchison, A., & Linstead, E. (2019). Exploring the applicability of low-shot learning in mining software repositories. Journal of Big Data, 6, article number 35. doi: 10.1186/s40537-019-0198-z.
[19] Rashid, S. (2019). Automatic classification of UML sequence diagrams from images. Sweden: University of Gothenburg and Chalmers University of Technology.
[20] Shcherban, S., Liang, P., Li, Z., & Yang, C. (2021a). Multiclass classification of four types of UML diagrams from images using deep learning. In Proceedings of the 33rd international conference on software engineering and knowledge engineering. Pittsburgh: SEKE. doi: 10.18293/SEKE2021-185.
[21] Shcherban, S., Liang, P., Li, Z., & Yang, C. (2021b). Multiclass classification of UML diagrams from images using deep learning. International Journal of Software Engineering and Knowledge Engineering, 31(11), 1683-1698. doi: 10.1142/S0218194021400179.
[22] Wang, L., Song, T., Song, H.-N., & Zhang, S. (2022). Research on design pattern detection method based on UML model with extended image information and deep learning. Applied Sciences, 12(17), article number 8718. doi: 10.3390/app12178718.
[23] Hammond, T., & Davis, R. (2006). Tahuti: A geometrical sketch recognition system for UML class diagrams. In Proceedings of the 2006 working conference on Advanced visual interfaces (pp. 372-375). New York: ACM. doi: 10.1145/1185657.1185786.