Researchers at the University of Alberta have made significant strides in developing a question-and-answer (QA) system designed to make building code inquiries more efficient. The project addresses traditional manual querying, which is labor-intensive and prone to error. At the core of their solution is a Retrieval-Augmented Generation (RAG) framework that integrates advanced retrieval methods with state-of-the-art large language models (LLMs).
A robust QA system must deliver precise answers to user queries, and that takes more than raw data retrieval. RAG combines a retriever, which extracts relevant passages from extensive documentation, with a language model that interprets the retrieved content and generates an accurate response. The combination is designed to streamline the process and relieve the inefficiencies that have long plagued manual lookup.
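To make the pattern concrete, the minimal Python sketch below shows the general shape of a retrieve-then-generate loop. The TF-IDF retriever and the `llm` callable are illustrative stand-ins, not the components used in the study.

```python
# Minimal RAG sketch: a stand-in retriever plus a generic language-model call.
# This only illustrates the retrieve-then-generate pattern described above;
# it does not reflect the study's actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def retrieve(query, documents, top_k=3):
    """Return the top_k passages most similar to the query (TF-IDF stand-in)."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(documents + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    best = scores.argsort()[::-1][:top_k]
    return [documents[i] for i in best]


def answer(query, documents, llm):
    """Build a prompt from retrieved passages and ask the language model."""
    context = "\n\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)  # `llm` is any callable mapping prompt text to an answer
```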
As the researchers explored RAG's capabilities, they encountered distinct challenges on both sides of the pipeline. There is no one-size-fits-all retrieval method: different retrievers perform at different levels, each with its own advantages and drawbacks. Language models, despite their power, often suffer from hallucinations, producing output that reads as accurate but is factually wrong. Fine-tuning is therefore essential to align these models with the specific language and requirements of the building code domain.
In their investigation, the team, including Mohammad Aqib, Dr. Qipei Mei, and Prof. Ying Hei Chui, assessed the performance of multiple retrievers. Their systematic evaluation placed Elasticsearch (ES) ahead of the competing methods. The evaluation also showed that retrieving the top three to five documents provided sufficient context for a user's query, yielding consistently high BERT F1 scores, a standard measure of answer quality.
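For readers curious what top-k retrieval looks like in practice, the sketch below queries an Elasticsearch index for the five most relevant passages. The index name, field name, query text, and connection details are assumptions for illustration and are not taken from the paper.

```python
# Hypothetical top-k retrieval against an Elasticsearch index of code passages.
# Index name, field name, query text, and connection details are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

response = es.search(
    index="building_codes",  # hypothetical index of building code clauses
    query={"match": {"text": "minimum stair width requirements"}},
    size=5,  # the study found the top three to five documents gave sufficient context
)
passages = [hit["_source"]["text"] for hit in response["hits"]["hits"]]
for rank, passage in enumerate(passages, start=1):
    print(f"{rank}. {passage[:80]}...")
```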
Further gains came from fine-tuning the LLMs, a step crucial for capturing the linguistic subtleties of building code language. The researchers experimented with models ranging from one billion to twenty-four billion parameters. Among them, Llama-3.1-8B was the standout performer, achieving a 6.83% relative improvement in BERT F1 score over its pre-trained baseline. The result underscores both the adaptability of language models and the value of aligning them with specialized domains.
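As an illustration of what domain-specific fine-tuning can involve, the sketch below attaches LoRA adapters to a causal language model using the Hugging Face transformers and peft libraries. The model name, data file, and hyperparameters are assumptions for demonstration and do not reproduce the configuration reported in the paper.

```python
# Illustrative LoRA fine-tuning sketch (transformers + peft + datasets).
# Model, data file, and hyperparameters are assumptions, not the paper's setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "meta-llama/Llama-3.1-8B"  # gated model; requires access approval
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach low-rank adapters so only a small fraction of the weights is trained.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hypothetical JSONL file of {"text": "question + relevant clause + answer"} records.
dataset = load_dataset("json", data_files="building_code_qa.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                      batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama31-8b-buildingcode",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=3,
                           learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After training, the adapted model's answers can be compared against reference answers with a metric such as BERTScore F1, the measure the study reports.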
The findings, which have drawn attention in the academic community and beyond, show that pairing an effective retrieval strategy with a carefully fine-tuned language model can significantly improve the accuracy and relevance of answers on building code compliance, and they illustrate the potential of AI-driven solutions to modernize traditional practice.
Looking ahead, the research team emphasizes the need for a fully integrated, end-to-end RAG framework validated against curated datasets; such validation is essential if the system is to operate reliably in real-world conditions. As Aqib noted in discussing future work, the ongoing challenge is to refine these systems further, using continued domain-specific fine-tuning to bring their QA capabilities in line with leading commercial models such as GPT-4.
Moreover, the implications of their research extend into various sectors that rely heavily on compliance with building codes and standards. The cross-disciplinary impact is evident, as this system can serve not only to assist engineers and architects but also to bridge the gap between complex code regulations and their real-world applications, promoting a more straightforward approach to compliance and legal accountability.
The research paper titled “Fine-tuning large language models and evaluating retrieval methods for improved question answering on building codes,” which has been published in Smart Construction, represents a significant contribution to the burgeoning field of AI in construction. As AI technology continues to evolve, studies like this illuminate the path forward by tackling the existing limitations and enhancing the practical application of artificial intelligence in highly technical fields.
In conclusion, this work is a noteworthy development in artificial intelligence, particularly in improving the precision and efficiency of QA systems for building codes. Integrating advanced retrieval techniques with fine-tuned language models points toward technology-driven tools that could reshape how professionals interact with regulatory frameworks, and it will be worth watching how these developments support compliance and operational efficiency across domains.
Subject of Research: Not applicable
Article Title: Fine-tuning large language models and evaluating retrieval methods for improved question answering on building codes
News Publication Date: 27-Aug-2025
Web References: DOI
References: Aqib M, Hamza M, Mei Q, Chui Y. Fine-tuning large language models and evaluating retrieval methods for improved question answering on building codes. Smart Constr. 2025(3):0021.
Image Credits: Mohammad Aqib, Qipei Mei, Ying Hei Chui/ University of Alberta, Edmonton, Alberta, Canada; Mohd Hamza/ Aligarh Muslim University, Aligarh, Uttar Pradesh, India
Keywords: Artificial intelligence, retrieval-augmented generation, question answering systems, large language models, building codes, compliance technology.
Tags: building code inquiry systems, challenges in information retrieval, enhancing question answering efficiency, improving accuracy in building code queries, innovative solutions in construction regulations, integrating retrieval methods with language models, large language models in QA, manual querying methodologies, optimizing QA systems for building codes, performance evaluation of retrieval systems, precise answer generation in QA, retrieval-augmented generation framework