By: Levi Masonde | Updated: 2024-09-02
Problem
You are a software developer witnessing the AI revolution, just like everyone else. Most developers are at a crossroads: do you join the revolution, or hope it's a phase that won't affect you and your work in the long run? With the introduction of technologies like ChatGPT (now powered by GPT-4o) and industry-leading companies incorporating AI into their businesses, it is safe to say we have reached a point of no return: AI will only grow in popularity and influence across all industries.
So, how do you get involved with AI as a developer? Where do you start?
Solution
Natural language AIs like ChatGPT are powered by Large Language Models (LLMs). You can find an overview of this topic in my previous article. As important as theory and reading about concepts are for a developer, learning is far more effective when you get your hands dirty doing practical work with new technologies.
In this tutorial, we will build an LLM application using LangChain to show you how to start implementing AI in your applications. We will create a question-answer chatbot using retrieval augmented generation (RAG) and web scraping techniques.
Prerequisites
- Visual Studio Code
- Python
- LangChain
What are LLMs?
LLMs are natural language machine learning models built with neural networks that are pre-trained on trillions of tokens (units of text data). Before a prompt is sent to the model, a tokenization method splits its sentences into tokens: groups of characters. You can learn more about LLMs in this article: Large Language Models (LLMs) to train artificial intelligence (AI) tools such as ChatGPT.
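To make tokenization concrete, here is a minimal sketch using the tiktoken library (an assumption on my part; it is not part of this tutorial's stack and would need pip install tiktoken). It encodes a sentence into token IDs and then shows the character group behind each ID:

import tiktoken

# Encode a sentence into token IDs using a common tokenizer
encoding = tiktoken.get_encoding("cl100k_base")
token_ids = encoding.encode("LLMs split sentences into tokens.")

print(token_ids)                                  # a list of integers
print([encoding.decode([t]) for t in token_ids])  # the character groups behind each ID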
What is LangChain and How It Works
LLMs, by default, have been trained on a great number of topics based on the internet's historical data. If you want to build an AI application that uses private data, or data made available after the AI's training cutoff, you must feed the model the relevant data yourself. The process of retrieving the appropriate information and inserting it into the model prompt is known as retrieval augmented generation (RAG). We will use this technique to enhance our AI Q&A later in this tutorial.
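Before reaching for any framework, the core RAG idea fits in a few lines of plain Python. This is only an illustrative sketch: the context string below is a hypothetical stand-in for what a real retriever would return.

# A hand-rolled illustration of the RAG pattern: retrieved text is
# pasted into the prompt before it ever reaches the model.
retrieved_context = "MSSQLTips.com was started in 2006."  # stand-in for retriever output
user_question = "Who created MSSQLTips?"

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context: {retrieved_context}\n\n"
    f"Question: {user_question}"
)
print(prompt)  # this augmented prompt is what the LLM actually sees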
LangChain is a Python framework built to enable developers to feed custom data to LLMs and to interact with LLMs in the following ways:
- Chains: Creates a chain of operations within a workflow. LangChain enables you to link actions like calling APIs, querying multiple LLMs, or storing data into one operation (see the sketch after this list).
- Agents: Makes decisions based on the output of the LLM.
- Memory: Maintains the state and history of context across different interactions.
- Indexes: Uses tools that manage and create indexes to be queried by LLMs using vector stores.
- Retrievers: Fetches relevant information from various sources, such as databases, APIs, or document collections, based on queries generated by LLMs. This is the building block behind retrieval augmented generation.
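To give a first taste of chaining, here is a minimal sketch that pipes a prompt template into the ChatCohere model used later in this tutorial and parses the reply into a string. It assumes langchain-cohere is installed and a COHERE_API_KEY environment variable is set (both are covered below):

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_cohere import ChatCohere

# Prompt template -> chat model -> string parser, linked into one chain
prompt = ChatPromptTemplate.from_template("Summarize this topic in one sentence: {topic}")
llm = ChatCohere()  # reads COHERE_API_KEY from the environment
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "retrieval augmented generation"}))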
Text Embedding Models and Vector Stores
Text embedding is a way to represent pieces of text as arrays of numbers. This transformation is essential for Natural Language Processing because computers work with numeric representations far better than with raw text. Once transformed, each text sits at a specific coordinate in a vector space, where similar texts end up close to each other.
Vector stores are responsible for storing vectors and running vector searches for you. First, you get data from your source and embed it into the vector store. When you send a query, the query is embedded in turn, and the vector store searches for similar vectors, sending the results back as the answer:
(Diagram: source data is embedded into the vector store; a query is then embedded and matched against the stored vectors. Source: https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/)
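As a minimal sketch of that flow, the snippet below uses the Cohere embeddings and FAISS vector store that also appear later in this tutorial. It assumes faiss-cpu is installed (pip install faiss-cpu) and COHERE_API_KEY is set; the embedding model name is an assumption:

from langchain_cohere import CohereEmbeddings
from langchain_community.vectorstores import FAISS

# Embed a few sample texts and store their vectors
embeddings = CohereEmbeddings(model="embed-english-v3.0")  # model name is an assumption
store = FAISS.from_texts(
    [
        "LangChain links LLM calls into chains.",
        "FAISS runs fast similarity search over vectors.",
        "RAG inserts retrieved documents into the prompt.",
    ],
    embedding=embeddings,
)

# The query is embedded, and the closest stored vectors come back as documents
for doc in store.similarity_search("How do I search vectors?", k=2):
    print(doc.page_content)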
Large Language Model Providers
Different LLM providers in the market mainly focus on bridging the gap between established LLMs and your custom data to create AI solutions specific to your needs. Essentially, you can adapt a model to your data without building an entire LLM from scratch. You can use licensed models, like OpenAI's, which give you access to their APIs, or open-source models, like GPT-Neo, which give you full access to the model itself.
Creating an LLM Application
In this section, you will create the base for a Q&A application: you send a question to the LLM and get an answer back. As discussed earlier, you can use the RAG technique to enhance the answers from your LLM by feeding it custom data.
First, you must install LangChain on your environment. To do this, open your terminal on Visual Studio Code and run this pip command:
pip install langchain
Next, use the following pip command for the dotenv module:
pip install python-dotenv
Then, create a .env file and add the following code:
LANGCHAIN_API_KEY=<Your_API_Key>
Keep in mind that you might have to add your API keys to your system's environment variables.
Then, create a new file named LLM_App.py and add the following code to check whether your environment picks up your variables:
from dotenv import load_dotenv
import os

load_dotenv()  # This loads environment variables from the .env file

tracing = os.getenv('LANGCHAIN_TRACING_V2')
api_key = os.getenv('LANGCHAIN_API_KEY')

print(f'Tracing Enabled: {tracing}')
print(f'API Key: {api_key}')
When you run this code, you should see output showing your tracing setting and API key:
Next, let's add Cohere as our LLM provider. Log in to Cohere, copy the default API key, and add it to your .env file:
COHERE_API_KEY=<Your_API_Key>
Once done, install the LangChain Cohere package using this pip command:
pip install -U langchain-cohere
Now, let's write code for a Q&A application with no custom external data, using Cohere's default RAG retriever. In your LLM_app.py file, add the following code:
from dotenv import load_dotenv
import os
from langchain_cohere import CohereRagRetriever, ChatCohere

# Load environment variables from the .env file
load_dotenv()

# Retrieve environment variables
tracing = os.getenv('LANGCHAIN_TRACING_V2')
langchain_api_key = os.getenv('LANGCHAIN_API_KEY')
cohere_api_key = os.getenv('COHERE_API_KEY')

# Check if environment variables are loaded properly
if not cohere_api_key:
    raise ValueError("COHERE_API_KEY environment variable not set.")

# User query we will use for the generation
user_query = "what is MSSQLTIPS and who created it"

# Load the Cohere chat model
cohere_chat_model = ChatCohere(cohere_api_key=cohere_api_key)

# Create the Cohere RAG retriever using the chat model
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[])

# Get the generation from the RAG retriever
docs = rag.invoke(user_query)

# Print the retrieved documents (all but the last element)
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")

# The final element holds the generated answer
answer = docs[-1].page_content
print(answer)
Running this code prints an answer to your terminal:
The results seem great. But, if we run the script again, the results come back different, and incorrect this time. By default, the model samples its next token probabilistically, so repeated runs can produce different outputs; when those outputs contain fabricated or wrong information, it is called AI "hallucination."
One way to make your LLM's results more consistent is to set the temperature parameter to 0 where you declare your LLM in the LLM_app.py file:
cohere_chat_model = ChatCohere(cohere_api_key=cohere_api_key, temperature=0)
With the temperature at 0, the model always picks its most likely next token, so your results barely change between runs. Keep in mind that this makes answers repeatable, not necessarily correct.
Using Web Scrapers with LLMs
Another way to increase the accuracy of your LLM's answers is to declare your own data sources, so that your LLM answers questions based mainly on the data you provide. Using a tool like Apify, you can create an automated web scraping job that integrates with your LLM application, letting you choose a web data source for your LLM queries.
To avoid mixing the two scripts prematurely, create another Python file named ApifyQA_app.py and add the following code. Note that ApifyWrapper relies on the apify-client package (pip install apify-client) and expects an APIFY_API_TOKEN environment variable, so add your Apify token to the .env file as well:
from dotenv import load_dotenv
import os
from langchain.indexes import VectorstoreIndexCreator
from langchain_community.docstore.document import Document
from langchain_community.utilities import ApifyWrapper
from langchain_cohere import CohereEmbeddings, ChatCohere

# Load environment variables from the .env file
load_dotenv()

# Retrieve environment variables
cohere_api_key = os.getenv('COHERE_API_KEY')

# Check if environment variables are loaded properly
if not cohere_api_key:
    raise ValueError("COHERE_API_KEY environment variable not set.")

# Initialize the Apify client wrapper
apify = ApifyWrapper()

# Map each crawled dataset item to a Document object
def dataset_mapping_function(item):
    return Document(
        page_content=item.get("text", ""),
        metadata={"source": item.get("url", "")}
    )

# Start URL for the crawler (using a full URL instead of a relative path)
start_url = "https://www.mssqltips.com/about/#:~:text=About%20MSSQLTips%20MSSQLTips.com%20was%20started%20in%202006%20by,a%20different%20approach%20to%20learning%20about%20SQL%20Server."

# Call the Apify actor to crawl the website content
loader = apify.call_actor(
    actor_id="apify/website-content-crawler",
    run_input={"startUrls": [{"url": start_url}]},
    dataset_mapping_function=dataset_mapping_function,
)

# Define the embedding function using Cohere
embedding_function = CohereEmbeddings(cohere_api_key=cohere_api_key)

# Create the vector store index from the crawled documents
index_creator = VectorstoreIndexCreator(embedding=embedding_function)
index = index_creator.from_loaders([loader])

# Define the Cohere LLM
llm = ChatCohere(temperature=0)

# Query the vector store index
query = "what is MSSQLTIPS and who created it"
result = index.query(query, llm=llm)
print(result)
Running the same query we used in the previous script now returns a result that reflects the website's content:
As you can see, the results are heavily influenced by the data source we feed our LLM.
This tutorial covered an LLM that uses a default RAG technique to pull data from the web, which gives it broad general knowledge but not precise knowledge, leaving it prone to hallucinations. We then added a web scraping retriever to the LLM, grounding its outputs in controlled, precise content.
Conclusion
LLMs played a huge role in pushing AI into the spotlight, especially today, as most companies eventually want custom AI systems. Building an AI system from scratch is only feasible for companies with deep pockets; most will have to settle for existing LLM models and customize them to their organization's requirements.
This tutorial taught you the basic concepts of how LLM applications are built using pre-existing LLM models and Python's LangChain module, and how to feed the application your own web data.
Next Steps
- Learn how to use LangChain Prompt Templates with OpenAI LLMs.
- You can also learn more about how LLMs work.
- Learn how to Create AI Models with T-SQL to Buy or Sell Financial Securities.
- Learn more about Artificial Intelligence Features in Power BI for Report Development.
- Learn how Large Language Models (LLMs) are used to train artificial intelligence (AI) tools such as ChatGPT.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.