
HANA Vector Engine and LangChain



The popular LangChain framework makes it easy to build powerful AI applications. For many of these scenarios, it is essential to use a high-performance vector store.
With HANA Vector Engine, the enterprise-grade HANA database, which is known for its outstanding performance, enters the field of vector stores. With the LangChain integration for HANA Vector Engine, it is now easier than ever to build highly scalable AI applications.
This blog will guide you through six easy steps that show how to build a chat-based application using RAG (Retrieval-Augmented Generation) techniques together with HANA Vector Engine in LangChain.

Building a Demo Application

The demo application lets end users ask questions about technical information that is spread across many pages of a website. As an example, we use the content of SAP’s Cloud Application Programming Model (CAP) documentation (https://cap.cloud.sap).

Step 1: Loading the Content of the Website for Further Processing

LangChain provides a very convenient way to load the content of a website into “Document” objects of LangChain. The main prerequisite is that the website has a “SiteMap” defined. Most professional websites do have a sitemap, and so does the CAP documentation. Thus, loading all pages of the website is just a matter of one line, using LangChain’s “SitemapLoader”:

from langchain_community.document_loaders import SitemapLoader

# Load every page listed in the sitemap into LangChain Document objects
documents = SitemapLoader("https://cap.cloud.sap/docs/sitemap.xml").load()

Step 2: Splitting the Loaded Documents

Search results via vector embeddings tend to degrade if the embedded text is too large. How to determine the ideal sizes is beyond the scope of this blog post. In general, it is a best practice to split the texts into small parts (e.g. 2000 characters). Again, with LangChain, this is a matter of calling “split_documents” on an instance of a “TextSplitter”. In our example we use the “RecursiveCharacterTextSplitter”:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split each document into chunks of at most 2000 characters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
splits = text_splitter.split_documents(documents)

Step 3: Creating Embedding Vectors and Loading them into HANA

For each of the split document parts, a vector embedding is created and then inserted into HANA along with the text. The vectors are not computed by HANA itself; instead, we pass a reference to an embedding model, which is called to create the vectors. In our example we use an embedding model from OpenAI via the LangChain interface of SAP’s “Generative AI Hub”. This process is triggered by calling the standard vector-store interface “from_documents”, which is also available for HANA:

import gen_ai_hub.proxy.langchain
from langchain_community.vectorstores.hanavector import HanaDB

# Embed all splits and store texts, metadata, and vectors in a HANA table
vectordb = HanaDB.from_documents(
    documents=splits,
    embedding=gen_ai_hub.proxy.langchain.OpenAIEmbeddings(),
    connection=connection,
    table_name="CAP_EMBEDDINGS",
)

The “connection” parameter is a connection to a HANA instance, created with the standard HANA Python client library “hdbcli”. The “table_name” refers to the relational table that is used to store the vectors and the texts. Other vector stores use the term “collection” for storing vector data. As HANA is a relational database, it stores vector data in a table, where one of the columns holds the embedding vectors.
The call to “from_documents” first creates embedding vectors for all split document parts by calling the embedding model. Then the data is inserted into HANA. Depending on the amount of text, such a call may take several minutes, but almost all of that time is spent in the embedding model. Storing the vectors in HANA is a very fast operation.
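
For completeness, here is a minimal sketch of how such a connection can be opened with “hdbcli”; the host, port, and credentials are placeholders that you would replace with the details of your own HANA instance:

from hdbcli import dbapi

# Placeholder connection details (assumption: replace with your own HANA instance)
connection = dbapi.connect(
    address="<your-hana-host>",
    port=443,
    user="<your-user>",
    password="<your-password>",
    autocommit=True,
)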

Step 4: Defining a Prompt for a Large Language Model

A prompt definition is simply a text with some “variables”. LangChain will fill in values for these variables and pass the resulting text to the LLM for processing:

from langchain.prompts import PromptTemplate

prompt_template = '''
You are an expert of the SAP Cloud Application Programming Model. You are provided multiple context items that are related to the question to answer.
Use the following pieces of context to answer the question at the end.
```
{context}
```
Question: {question}
'''
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

The variable “context” will contain the best-matching text parts from the vector search based on the given question. And obviously, the variable “question” will contain the question to be answered.
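
To see what the LLM actually receives, you can render the template yourself. This is purely illustrative, since the retrieval chain built in the next step fills in these variables automatically:

# Illustrative only: the chain performs this substitution itself
filled_prompt = PROMPT.format(
    context="CAP allows configuring CORS in the server setup ...",
    question="How can I use CORS in CAP?",
)
print(filled_prompt)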

Step 5: Creating a Retrieval Chain Using an LLM

This step puts together all the pieces that were created in the previous steps. We use GPT-4 as the LLM, which is easily accessible via SAP Generative AI Hub’s LangChain integration. The main magic is done in LangChain’s implementation of “ConversationalRetrievalChain”:

from langchain.chains import ConversationalRetrievalChain

# Combine the LLM, the HANA-backed retriever, and the custom prompt into one chain
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=gen_ai_hub.proxy.langchain.ChatOpenAI(proxy_model_name="gpt-4"),
    retriever=vectordb.as_retriever(search_kwargs={"k": 20}),
    combine_docs_chain_kwargs={"prompt": PROMPT},
)

In addition to the LLM, the chain uses the “vectordb” instance that was created in step 3. By calling “as_retriever”, we instruct LangChain to retrieve the 20 most similar document splits (parameter “k”) when performing a similarity search of the entered question against the split documents. The texts found via the similarity search will then be inserted as “context” into the prompt.
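
The retriever is a thin wrapper around the vector store’s similarity search. If you want to inspect what it would return for a given question, you can call the search directly on the “vectordb” instance; a minimal sketch:

# Directly query the HANA vector store; returns the k best-matching Document objects
docs = vectordb.similarity_search("How can I use CORS in CAP?", k=20)
for doc in docs:
    print(doc.page_content[:100])  # print the first 100 characters of each match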

Step 6: Asking a Question About the Content in the Vectors

Finally, we can ask questions (typically entered by end-users) about the content that we have vectorized. We pass in the question, and after some time (again, it’s the LLM that uses the time, not the super-fast HANA Vector Engine) we get back the answer as text that we can display to the user:

# The chain expects a chat history; pass an empty list for the first question
result = qa_chain.invoke({"question": "How can I use CORS in CAP?", "chat_history": []})
answer = result["answer"]
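
Since the chain is conversational, follow-up questions can build on earlier turns. A sketch of a second call, assuming we track the history as (question, answer) pairs:

# Carry the previous turn along so the chain can resolve follow-up references
chat_history = [("How can I use CORS in CAP?", answer)]
followup = qa_chain.invoke(
    {"question": "Can you summarize that in one sentence?", "chat_history": chat_history}
)
print(followup["answer"])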

That’s it 😀. Six simple steps to create a RAG-based AI application with HANA Vector Engine.

Conclusion and Outlook

The LangChain integration of HANA Vector Engine combines the widely used LangChain framework with the power and speed of the enterprise-grade HANA database. The current integration covers the Python version of LangChain. Further enhancements and more integration options are already on the way. Stay tuned for further announcements and blog posts.
