Create a Custom ChatGPT Assistant with Your Documents

We are going to build a custom ChatGPT using your own document library, taking advantage of the latest developments in large language models (LLMs) such as OpenAI's GPT-3 and GPT-4.

There are two approaches to building a question-answering system with an LLM.

1. Fine-tune GPT

If you fine-tune GPT, its answers are not limited to your data: the model draws on your data plus everything GPT was originally trained on. It can therefore easily drift out of context and give answers that are not relevant to your documents. This approach also requires retraining the model whenever you add new data.

2. Semantic search + ChatGPT LLM

This approach is a better fit for a question-answering system because it gives context-specific answers and makes it easy to update your data with new information. Semantic search first finds the passages in your documents that are most relevant to a question, and the LLM then answers using those passages as context. Fine-tuning GPT, in contrast, requires retraining the model.

We are going to use Approach #2 here.
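
To make the idea behind Approach #2 concrete, here is a minimal sketch of the retrieve-then-prompt flow in plain Python. It is only an illustration under stated assumptions: the embed function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the names embed, cosine, retrieve and build_prompt are hypothetical helpers, not part of any library used later.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k document chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM (assumed prompt wording)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical document chunks, for illustration only.
chunks = [
    "Refunds are processed within 14 days of the return request.",
    "Our office is closed on public holidays.",
    "Support is available by email from Monday to Friday.",
]
question = "How long do refunds take?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Only the retrieved chunk reaches the prompt, which is what keeps the answers tied to your documents. Adding new information means adding new chunks to the store, with no retraining involved.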

The following libraries and frameworks will be used to build our question-answering system.

LangChain: A powerful library for developing LLM-based applications.
Pinecone: Serves as the vector database for storing your embedding vectors and performing semantic search.
Streamlit: Used to build and deploy the app.
OpenAI: OpenAI's library for accessing its LLMs.