Bill of Material Extraction from Engineering Drawings Using AI
Usama, Syed Muhammad (2025)
Tuotantotalouden DI-ohjelma - Master's Programme in Industrial Engineering and Management
Johtamisen ja talouden tiedekunta - Faculty of Management and Business
Date of approval
2025-12-22
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-2025122212041
Abstract
Technical drawings are central to engineering and manufacturing because they convey component-level details about products throughout the entire product life cycle. A key part of a technical drawing is the Bill of Materials (BOM), usually provided as an embedded table listing the name, quantity, and hierarchy of each individual part. However, when BOMs exist only in scanned or raster form, i.e. as images (possibly with non-standard layouts, poor image quality, and text in multiple languages), extracting the data automatically is very difficult.
This thesis proposes a modular, fully automatic AI-based approach for extracting all necessary information from a full BOM, developed following the Design Science Research (DSR) methodology. The approach uses YOLOv11 to detect tables and GPT-4o to interpret the detected tables semantically, and it produces structured output that can serve as input to a variety of downstream processes. Performance was evaluated per component using common metrics (e.g. mAP, F1-score) and end to end on over 100 technical drawings obtained from various engineering sources. The approach detected the BOM successfully at a rate of >95% and showed good generalizability.
In addition to the evaluation, this thesis describes a Streamlit application designed for direct use by practitioners and engineers, which demonstrates the usability and relevance of the proposed solution. The solution still has limitations, such as sensitivity to skewed tables and to multilingual content. Nevertheless, it significantly reduces the manual effort required to obtain and process the information contained in engineering documents, and it contributes to the digital transformation of engineering documentation through deep learning techniques and vision-language architectures.
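The modular detect-then-interpret structure described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the names `Detection`, `crop`, `extract_bom`, `fake_detector`, and `fake_interpreter` are hypothetical, the image is a toy nested list, and the two stub stages merely stand in for a table detector such as YOLOv11 and a vision-language model such as GPT-4o. The confidence threshold of 0.5 is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Image = List[List[int]]           # toy stand-in for a raster drawing
BomRow = Dict[str, str]           # one structured BOM line


@dataclass
class Detection:
    box: Box
    score: float                  # detector confidence in [0, 1]


def crop(image: Image, box: Box) -> Image:
    """Cut the detected table region out of the drawing."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def extract_bom(
    image: Image,
    detect_tables: Callable[[Image], List[Detection]],
    interpret_table: Callable[[Image], List[BomRow]],
    min_score: float = 0.5,       # assumed threshold, not from the thesis
) -> List[BomRow]:
    """Run detection, drop low-confidence boxes, interpret each table crop."""
    rows: List[BomRow] = []
    for det in detect_tables(image):
        if det.score < min_score:
            continue
        rows.extend(interpret_table(crop(image, det.box)))
    return rows


# Hypothetical stubs: a real system would call a trained detector and a
# vision-language model here.
def fake_detector(image: Image) -> List[Detection]:
    return [Detection((2, 1, 6, 4), 0.97), Detection((0, 0, 2, 2), 0.20)]


def fake_interpreter(table: Image) -> List[BomRow]:
    size = f"{len(table[0])}x{len(table)}"
    return [{"item": "1", "name": "Bracket", "quantity": "2", "crop_size": size}]


drawing: Image = [[0] * 8 for _ in range(5)]   # 8 px wide, 5 px tall
bom_rows = extract_bom(drawing, fake_detector, fake_interpreter)
```

Because each stage is passed in as a plain callable, either one can be replaced (for example by a different detector or language model) without touching `extract_bom`, which is one way to read the "modular" claim above.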
