Chatbot Uses for Building Design

13 March 2023

Intro

There is a wave of people who suggest chatbots, and AI in general, will replace jobs, from entry-level secretarial and customer service work to computer programming itself. I do not doubt some jobs will be lost, but I believe more strongly that existing jobs will be streamlined and new positions will be created. Chatbots and AI are most useful as tools for people to use; they cannot do anything by themselves, and certainly not without help. That could of course change: a year ago nobody thought ChatGPT, or any of the other bots Microsoft or Meta have released, would be possible, or at the very least this widespread. But for now, let us benchmark the current state of things.

Idea

I presume that chatbots will be helpful for people in the building construction industry for a few reasons. Meeting notes can be recorded, transcribed, and fed to an AI, streamlining the process of capturing the important action items and organizing them. A human will still have to proofread what it spits out and correct errors, the bot is only so good, but it will be faster than a person starting from scratch. On the design side, I presume a bot would be useful for asking quick questions about code, and that is what I want to determine with this post. My plan is to train a bot, ask it and ChatGPT the same questions, and validate the responses. From that, I hope to find out whether a specifically trained bot is more useful than a generalist bot, and whether either of them is correct or helpful.
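As a rough illustration of that meeting-notes workflow, here is a minimal sketch using the OpenAI Python client as it exists at the time of writing. The model choice and prompt wording are my own assumptions, not any particular product's implementation.

```python
# Minimal sketch: feed a meeting transcript to a chat model and ask for
# action items. Assumes the openai package (early-2023 ChatCompletion API)
# and an API key in the environment; the prompt wording is illustrative.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def extract_action_items(transcript: str) -> str:
    """Return the model's numbered list of action items from a transcript."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "List the action items from this meeting transcript "
                        "as a numbered list, with an owner for each item."},
            {"role": "user", "content": transcript},
        ],
    )
    # A human still has to proofread this output before it goes anywhere.
    return response["choices"][0]["message"]["content"]
```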

Method

I have neither the technical know-how nor the resources to train a chatbot myself. There are methods that use Google Colab notebooks or Hugging Face Spaces to train a model, but I have not been able to make them work. I did, however, find a website where I can feed in a PDF and it will train a bot to pull information from it. That website is chatbase.co.

Training the Bot

For the training set I am using Chapter 7 of NFPA 101, 2018 edition, pages 123-208. Chapter 7 is mostly text, which is the only type of data the bot can learn from; the few images it contains are ignored by the bot. Being mostly text, or text that explains accompanying images, makes it a good dataset to use. I had to limit it to these specific pages of the code because there is a character limit on the website's free trial. That should be sufficient, since I will only be asking questions about the contents of what I specifically feed it. If it can handle this, it would likely be able to handle a bigger dataset, but for now I will start small.
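For anyone curious how the data prep might look, here is a sketch of pulling just those pages out of the PDF and checking the character count. It uses the pypdf library, and the limit value is a placeholder since I do not know the site's exact number.

```python
# Sketch: extract pages 123-208 from the NFPA 101 PDF and count characters
# against the free-trial limit. Uses pypdf; CHAR_LIMIT is a placeholder,
# not chatbase.co's actual figure.
from pypdf import PdfReader, PdfWriter

CHAR_LIMIT = 400_000  # placeholder; the real free-trial limit may differ

reader = PdfReader("nfpa101_2018.pdf")
writer = PdfWriter()

chars = 0
for i in range(122, 208):  # PDF pages are 0-indexed: printed 123-208
    page = reader.pages[i]
    chars += len(page.extract_text() or "")
    writer.add_page(page)

print(f"Pages 123-208 contain {chars} characters "
      f"({'within' if chars <= CHAR_LIMIT else 'over'} the limit).")
with open("nfpa101_ch7.pdf", "wb") as f:
    writer.write(f)
```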

Testing Methodology

There are a few different kinds of questions that I think are the most useful to ask a bot: definitions, explanations, clarifications, and follow-ups.

For the sake of this experiment I will not be ‘prompt engineering’, so I will ask it questions in the same style as if I were quickly asking a colleague.

I will also be asking ChatGPT the same questions. ChatGPT is a generalist model trained on a vast training set. chatbase.co bots are presumably ChatGPT as well, but with the information I give them, in this case sections of Chapter 7 of NFPA 101, parsed in as context when answering. I'll refer to the new bot as Custom Bot and to ChatGPT as ChatGPT.

Questions

Definitions

Explanations

Clarifications

Results

First Impressions

ChatGPT is much more long-winded and far less concise with its answers. It feels like a student trying to reach a word limit in an essay. If I had asked for shorter answers in the prompt, I am sure it would have given them, but that is not what I did.

Custom Bot is much more concise, though this may be for economic reasons. The website's owner may be injecting something along the lines of "keep the answer short and concise" at the end of whatever I type in; this is just a guess, and it would be happening without my knowledge. Each generated word costs money, so the owner would be incentivized to do this, especially for people on the free trial, like myself. Also, from using the website I think I can see exactly what is happening. The NFPA code is not being retrained into the model as part of a training set; it is just being parsed by the bot and reinserted with my prompt as context for the generated answer. I think this is true because below the chat section the site copies lines from the NFPA PDF as "sources", see image. This would be an inexpensive way to "train a bot" without actually doing any training. In layman's terms, it is just doing an advanced search through the submitted PDF for keywords and using the results to get a better response.

Sources shown at the bottom of chatbase.co website after answering a prompt
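To make that guess concrete, here is a minimal sketch of the retrieve-and-reinsert approach I suspect is happening. None of this is chatbase.co's actual code, and the brevity instruction at the end is exactly the kind of hidden injection I am speculating about.

```python
# Speculative sketch of "training" by retrieval: no model weights change;
# the PDF text is just chunked, keyword-searched, and pasted into the
# prompt as context. The retrieved chunks are what would show as "sources".
import re

def chunk(text: str, size: int = 800) -> list[str]:
    """Split the source document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by keyword overlap with the question; keep the best k."""
    words = set(re.findall(r"\w+", question.lower()))
    return sorted(
        chunks,
        key=lambda c: len(words & set(re.findall(r"\w+", c.lower()))),
        reverse=True,
    )[:k]

def build_prompt(question: str, document: str) -> str:
    """Assemble the final prompt sent to the underlying generalist model."""
    context = "\n---\n".join(top_chunks(question, chunk(document)))
    return (
        "Answer using only the context below. Keep the answer short "
        "and concise.\n"  # the hidden instruction I suspect is injected
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```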

Accuracy

1. ChatGPT mentioned the ADA as well as NFPA 101; Custom Bot only referenced NFPA 101.
2. Both were adequate responses, with ChatGPT's being better.
3. Neither answer was great; I intended for them to reference Table 7.3.1.2, Occupant Load Factor. Custom Bot did not even try to answer. ChatGPT gave an answer using IBC values from Table 1004.5 that was correct in a way, but not all the values it gave made sense.
4. Both were correct, but ChatGPT's answer was better.
5. Custom Bot just cites the definition, not where I would actually find the information. ChatGPT gives incorrect values from the IBC, or other values that make no sense without more context. Either way, neither bot told me where to find the information specifically or asked for clarification given a somewhat vague prompt; both just spit something out without questioning the vagueness. It is like asking someone what their favorite fast food is and being told "the #4 meal."
6. Both give technically correct answers but then fail when trying to explain with examples. ChatGPT compares a small office to a high-occupancy building, but its wording comparing the lengths is inconsistent. Custom Bot states that nursing home occupants may have limited mobility and may have longer travel distances to exits; if not all exits are accessible this may be true, but in most nursing homes most exits are accessible and travel distances are, by necessity, shorter. Either way, both lack the context and good examples needed to explain why. If either bot had skipped the examples, its answer would have been good enough, but both gave poor ones.
7. ChatGPT's references are in the right areas of the code but are incorrect. Custom Bot cites a table that does not exist.
8. Both gave good reasons; there is no one correct answer, but both were fine. Both used accessibility, flow, security, and occupancy type as considerations.
9. I intended for an answer from NFPA 101 Table 7.3.1.2. I do not know where ChatGPT got 15 ft²/person, and Custom Bot said I was correct but cited the 2018 IBC; I could not find that value in the IBC. Neither gave a good answer.
10. Asked about travel distance: both stated that adding sprinklers does not "necessarily" or "generally" change allowable travel distance, which is a half-truth because context is needed. ChatGPT, however, went on to give the definite, correct answer, that yes, sprinklers can change the allowable distance, and cited the IBC. Custom Bot never gave a definite answer, drifted into egress width, and stated that answers can vary.
11. Both stated that scissor staircases are safe if properly constructed and gave examples, albeit more architectural ones.

Total: ChatGPT 6/10, Custom Bot 5/10. Both gave correct answers when asked for broad generalities. Neither gave good specific answers or good "real life" examples in any explanation. ChatGPT was very long-winded with its answers, and this sometimes hurt clarity.

Conclusion

Do not use chatbots for questions about code that are any more specific than definitions, and even with definitions, make sure the bot is given enough context. For generalities, like questions 4, 8, and 11, they gave correct answers that made sense. But those questions did not have single "correct" answers; they more or less asked for ideas and considerations around a problem. For anything specific, anything asking for a reference or to check a number, do not even bother. I tried to find where the numbers they spit out came from, and most of the time I could not.

Do not trust chatbots; they will convince you of falsehoods very easily, and if you know where the right answer lives, you are better off just looking in the code yourself and checking. It took as much time, or more, to validate answers from either chatbot as it would have taken to google the question or search a codebook myself.

Many people come to these chatbots with low expectations. They ask questions, are surprised by the answers, rate the bot above those expectations, and give it praise. I asked general questions with broad answers and specific questions with specific correct answers. The bots answered the former OK and the latter terribly. Their answers need to be double-checked on everything, because a nonzero percentage of the time they just spit out nonsense, and I know this from experience.

Appendix

Transcript of my conversation with ChatGPT and Custom Bot