SAI Notes #08: LLM-Based Chatbots to Query Your Private Knowledge Base
Building a Chatbot to Query Your Internal Knowledge Base Using Large Language Models
This week in SAI Notes:
- Why you can't use commercially available LLM APIs out of the box to query your internal knowledge base.
- Different ways you could try to implement the use case.
- An easy way to build an LLM-based chatbot that queries your internal knowledge base using context retrieval outside of the LLM.
New use cases for Large Language Models are being discovered daily. One of the most popular I keep hearing about is document summarization and knowledge extraction.
Today we look at what the architecture of a chatbot that leverages LLMs to query your internal/private knowledge base could look like, and how a Vector Database facilitates its success.
The article assumes that you are already familiar with what an LLM is. What is important here is the simplest definition: you provide context and a query in the form of a prompt, and the LLM returns an answer that you can use for downstream work.
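To make this concrete, here is a minimal sketch of that prompt-in, answer-out interaction using the OpenAI Python client; the model name and prompt are placeholders, not recommendations:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Provide context/query as a prompt; the LLM returns an answer
# you can use for downstream work.
response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": "Explain what a vector database is in one sentence."}],
)
print(response.choices[0].message.content)
```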
Let’s return to our use case: a chatbot that can answer questions using only the knowledge in your internal systems - think Notion, Confluence, product documentation, PDF documents, etc.
Why can't we just use a commercial LLM API, like ChatGPT powered by GPT-4, to answer the question directly?
- Commercial LLMs have been trained on a large corpus of data available on the internet. That data contains far more context irrelevant to your question than you might like.
- The data contained in your internal systems of interest might not have been used when training the LLM: it might be too recent and unavailable at training time, or it might be private and not publicly accessible on the internet.
- Currently available LLMs have been shown to hallucinate, inventing data that is not true.
The next approach one could think of:
- Formulate the question/query as part of a meta-prompt.
- Pass the entire knowledge base from your internal documents together with the question in the prompt.
- Via the same meta-prompt, instruct the LLM to answer using only the previously provided knowledge base (sketched in the code below).
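A minimal sketch of this prompt-stuffing approach, again using the OpenAI Python client; `load_knowledge_base` is a hypothetical helper, and the file path and question are assumptions:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def load_knowledge_base() -> str:
    # Hypothetical helper: all internal documents exported into one file.
    with open("knowledge_base.txt") as f:
        return f.read()

question = "What is our refund policy for annual plans?"

# Meta-prompt: instruct the model to answer ONLY from the provided knowledge base.
meta_prompt = (
    "Answer the question using ONLY the knowledge base below. "
    "If the answer is not in the knowledge base, say you do not know.\n\n"
    f"Knowledge base:\n{load_knowledge_base()}\n\n"
    f"Question: {question}"
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": meta_prompt}],
)
print(response.choices[0].message.content)
```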
What are the problems with this approach:
- There is a token limit on how much information you can pass to an LLM in a prompt; it varies across the commercially available LLMs (see the token-counting sketch after this list).
- Even if you could fit the entire knowledge base into the prompt, commercial LLM API providers charge you for every token you pass in.
- Additionally, there is as yet no way to cache the knowledge base on the LLM side so that you could pass the relevant data corpus only once.
- Consequently, you would need to pass it every time you ask a question, which would inflate the cost of API use tremendously.
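To see how quickly these limits bite, here is a small sketch using OpenAI's tiktoken library to count the tokens in a knowledge base and estimate the per-question prompt cost; the file path, context window, and price are illustrative assumptions, not current numbers:

```python
import tiktoken

# Illustrative assumptions: an 8k-token context window and
# $0.03 per 1k input tokens (GPT-4-era pricing).
CONTEXT_WINDOW = 8_192
PRICE_PER_1K_INPUT_TOKENS = 0.03

with open("knowledge_base.txt") as f:  # hypothetical KB export
    knowledge_base = f.read()

encoding = tiktoken.encoding_for_model("gpt-4")
n_tokens = len(encoding.encode(knowledge_base))

print(f"Knowledge base size: {n_tokens} tokens")
print(f"Fits in one prompt: {n_tokens < CONTEXT_WINDOW}")
# Since nothing is cached on the LLM side, you pay this for EVERY question.
print(f"Prompt cost per question: ${n_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS:.2f}")
```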
One more approach could be:
- Use an OSS model.
- Fine-tune it on the corpus of your internal data.
- Use the fine-tuned version for your chatbot (a minimal sketch follows below).
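For illustration, here is a minimal sketch of what such a fine-tuning run could look like with Hugging Face Transformers; the model name, file path, and hyperparameters are all assumptions, and a realistic run would need GPU infrastructure and careful data preparation:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder OSS causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical export of the internal knowledge base: one document per line.
dataset = load_dataset("text", data_files={"train": "internal_docs.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="kb-finetuned",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    # mlm=False -> standard causal language modeling objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("kb-finetuned")  # serve this checkpoint behind your chatbot
```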
What are the problems with this approach:
- OSS models are generally less accurate than commercial solutions.
- You will need specialized talent and a large amount of time to solve the fine-tuning challenge internally.
- Fine-tuning does not solve the hallucination challenge of LLMs.
- Hosting the LLM in-house will be costly and require dedicated talent as well.
- The input size limit is only partially solved; passing the knowledge corpus every time you ask a question would still be too expensive.
The original article: https://www.newsletter.swirlai.com/p/sai-notes-08-llm-based-chatbots-to