QA and Chat over Documents
Introduction
QA is one of Langchain's main applications. Anything that involves finding an answer in a sea of words can be referred to as question answering (QA). Although this simple function may seem limited, it can actually be used to build many things. QA is related to extraction, analysis of structured data, and summarization, all of which deal with processing data with an LLM. Use cases: Use cases | 🦜️🔗 Langchain.
The question is: what makes QA special and different from the other use cases? I believe QA allows for fast answers to questions. This makes sense when you take a look at the workflow of QA:
load > split > embed > store > retrieve > output
QA process
Load
Loading means reading the data into "Document" objects. "Document" is a Langchain class that contains the main content and metadata. You can load documents from different sources using Langchain's document_loader tools, supported by many useful integrations: Integrations | 🦜️🔗 Langchain.
For example, I am using a website loader tool to load text from a website.
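Here is a minimal sketch of that, using Langchain's WebBaseLoader (the URL is just a placeholder):

```python
from langchain.document_loaders import WebBaseLoader

# Load the page content into a list of Document objects
# (the URL here is only an example)
loader = WebBaseLoader("https://example.com/some-article")
docs = loader.load()
```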
Split
Split the loaded documents into smaller chunks, so each chunk fits in the model's context window and can be retrieved more precisely.
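A quick sketch with RecursiveCharacterTextSplitter (the chunk sizes are arbitrary example values, tune them for your data):

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Break each Document into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
splits = splitter.split_documents(docs)
```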
Embed
Embed the documents into vectors. To embed text means to convert a string into a vector, a numerical representation of that string. This step essentially puts a tag on each document, which can later be used to retrieve documents based on the characteristics of their content (text).
I am not familiar with how it works on the inside, but embedding algorithms/models are able to generate features for a text automatically.
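As a rough sketch (assuming an OpenAI API key is configured in the environment), embedding a string looks like this:

```python
from langchain.embeddings import OpenAIEmbeddings

# Convert a piece of text into its vector representation
embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("What is Langchain?")
print(len(vector))  # dimensionality of the embedding
```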
Store
Store the embedded chunks in a vector store, so they can be searched by similarity later.
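A sketch using Chroma as the vector store (any vector store supported by Langchain would work the same way):

```python
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

# Embed the split documents and keep the vectors in a Chroma store
vectorstore = Chroma.from_documents(splits, OpenAIEmbeddings())
```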
Retrieve
Retrieve the chunks most relevant to the question, by comparing the question's embedding against the stored vectors.
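Continuing from the store above, a minimal sketch:

```python
# Turn the store into a retriever and fetch the most relevant chunks
retriever = vectorstore.as_retriever()
relevant_docs = retriever.get_relevant_documents("What is Langchain used for?")
```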
Output
Feed the question and the retrieved chunks to an LLM, which generates the final answer.
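One way to sketch this last step by hand, stuffing the retrieved chunks into a prompt (the prompt wording here is my own, not Langchain's):

```python
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
context = "\n\n".join(d.page_content for d in relevant_docs)
answer = llm.predict(
    f"Answer using only this context:\n{context}\n\n"
    "Question: What is Langchain used for?"
)
print(answer)
```

In practice, Langchain's chains (below) handle this step for you.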
Langchain functions
Langchain offers a few ways to do the above process, with different levels of abstraction. Here are the pipelines that can get an answer to a question (see the sketches after the list):
- load > VectorstoreIndexCreator (langchain.indexes) > answer
- load > split > store > RetrievalQA (langchain.chains) > answer
- load > split > store > retrieve > load_qa_chain (langchain.chains.question_answering) > answer
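Rough sketches of the three levels, reusing the loader and vectorstore from earlier (chain_type="stuff" is one common choice):

```python
from langchain.chat_models import ChatOpenAI
from langchain.indexes import VectorstoreIndexCreator
from langchain.chains import RetrievalQA
from langchain.chains.question_answering import load_qa_chain

question = "What is Langchain used for?"
llm = ChatOpenAI(temperature=0)

# 1. Highest abstraction: the index creator handles split/embed/store/retrieve
index = VectorstoreIndexCreator().from_loaders([loader])
print(index.query(question, llm=llm))

# 2. Middle: you manage the store, RetrievalQA handles retrieval + answering
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.run(question))

# 3. Lowest: you retrieve the documents yourself, the chain only answers
chain = load_qa_chain(llm, chain_type="stuff")
docs = vectorstore.similarity_search(question)
print(chain.run(input_documents=docs, question=question))
```

The trade-off is control: the one-liner is convenient for quick experiments, while the lower-level chains let you swap in your own splitter, store, or retrieval logic.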