RAG with Oracle AI Vector Search and OCI Generative AI: Python and PL/SQL Approaches

In this episode of the Oracle University Podcast, hosts Lois Houston and Nikita Abraham are joined by Brent Dayley, Senior Principal APEX & Apps Dev Instructor. Together, they explore how to implement Retrieval Augmented Generation (RAG) using Oracle AI Vector Search and OCI Generative AI. Brent walks listeners through the similarities and differences between building RAG workflows with Python and PL/SQL, offering practical insights into embedding creation, semantic search, and prompt engineering within Oracle's technology stack.

Oracle AI Vector Search Deep Dive: https://mylearn.oracle.com/ou/course/oracle-ai-vector-search-deep-dive/144706/
Oracle University Learning Community: https://education.oracle.com/ou-community
LinkedIn: https://www.linkedin.com/showcase/oracle-university/
X: https://x.com/Oracle_Edu

Special thanks to Arijit Ghosh, Anna Hulkower, Kris-Ann Nansen, and the OU Studio Team for helping us create this episode.

Please note, this episode was recorded before Oracle AI Database 26ai replaced Oracle Database 23ai. However, all concepts and features discussed remain fully relevant to the latest release.

--------------------------------------------

Episode Transcript:

00:00

Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started!

00:26

Lois: Hello and welcome to another episode of the Oracle University Podcast! I'm Lois Houston, Director of Communications and Adoption Programs with Customer Success Services, and with me is Nikita Abraham, Team Lead for Editorial Services with Oracle University.

Nikita: Hi everyone! If you joined us last week, you'll remember we explored AI Vector Search and how Retrieval Augmented Generation, or RAG, empowers large language models by surfacing relevant business content for smarter, more context-aware answers.

Lois: That's right, Niki. We also looked at how unstructured data gets transformed into embeddings, how these vectors power semantic search, and how Oracle Database 23ai is uniquely designed to support these advanced AI workflows.

Nikita: Today, we're building on that foundation with an exciting double feature. We'll start with an introduction to OCI Generative AI Service and how you can use it with Python, and then dive into Retrieval Augmented Generation with Oracle AI Vector Search and the OCI Gen AI service using PL/SQL.

01:32

Lois: And to walk us through these topics, we're delighted to welcome back Brent Dayley, Senior Principal APEX & Apps Dev Instructor. Brent, it's great to have you. So, tell us, how does the OCI Generative AI service use Oracle AI Vector Search?

Brent: So OCI Generative AI service allows us to take user questions and augment those using external data from outside of the large language model that allows us to return augmented content.

We would leverage Oracle AI Vector Search in order to retrieve contextually relevant information. And we would create prompts that have some sort of a meaning to help guide the user to input the appropriate types of questions. And this allows us to retrieve the data using a large language model.

02:27

Nikita: What are the typical steps for implementing a RAG workflow using the OCI Generative AI service in Python?

Brent: We would load the document. Transform the document to text. And then split the text into chunks.

So if you're talking about maybe a PDF that contains chapters, we might split the different chapters into individual chunks. We would then set up Oracle AI Vector Search and insert the embedding vectors. We would build the prompt to query the document. And then we would invoke the chain.
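As a rough illustration of the chapter-based splitting Brent describes, here is a minimal Python sketch. It assumes the document has already been converted to plain text and that chapters start on lines such as "Chapter 1"; the function name and the heading pattern are hypothetical and are not taken from the course material.

```python
import re

def split_into_chunks(text, heading_pattern=r"^Chapter \d+"):
    """Split plain text into chunks, starting a new chunk at each heading line.

    heading_pattern is an assumed convention; adjust it to the document.
    """
    chunks = []
    current = []
    for line in text.splitlines():
        # A heading closes the chunk collected so far and starts a new one.
        if re.match(heading_pattern, line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that ended up empty (for example, leading blank lines).
    return [chunk for chunk in chunks if chunk]

sample = "Chapter 1\nIntro text.\nChapter 2\nMore text."
chunks = split_into_chunks(sample)
```

Text that appears before the first heading is kept as its own chunk, so nothing from the source document is silently dropped.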

So first, you would load the text sources from a file. Open a terminal window and connect to your compute instance. And launch ipython to allow interactive work.

IPython lets you enter commands interactively, one step at a time. You might load the source file called FAQs.

Next, you would load the FAQ chunks into the Vector Database. You would create a connection and connect to your database. And then create the table. And then you would vectorize the text chunks and then encode the text chunks. And then insert the chunks and vectors into the database.
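The load step might be prepared along these lines. This is only a sketch: the table name `faqs`, its columns, and the `toy_embed` function are hypothetical stand-ins (the episode does not name the encoder or the schema), and the database calls appear only as comments because they need a live connection. It assumes a driver that can bind a float array to a vector column.

```python
import array
import hashlib

# Hypothetical table layout; the VECTOR column holds one embedding per chunk.
CREATE_SQL = "CREATE TABLE faqs (id NUMBER PRIMARY KEY, payload CLOB, vec VECTOR)"
INSERT_SQL = "INSERT INTO faqs (id, payload, vec) VALUES (:1, :2, :3)"

def toy_embed(text, dims=8):
    """Stand-in for a real embedding model: deterministic but NOT semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]

def build_rows(chunks, embed=toy_embed):
    """Pair each chunk with an id and its vector, ready for a bulk insert."""
    rows = []
    for chunk_id, chunk in enumerate(chunks, start=1):
        # 32-bit float arrays are a common in-memory form for vector binds.
        vector = array.array("f", embed(chunk))
        rows.append((chunk_id, chunk, vector))
    return rows

rows = build_rows(["How do I reset my password?", "Where is my invoice?"])

# With an open connection (assumed, not shown), the rows could be sent with:
#   cursor.execute(CREATE_SQL)
#   cursor.executemany(INSERT_SQL, rows)
#   connection.commit()
```

In a real pipeline `toy_embed` would be replaced by the encoder used to vectorize the chunks, and the same encoder would later be applied to the question.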

Next, you would vectorize the question. Define the SQL script ordering the results by the calculated score. Define the question. Write the retrieval code. And then execute the code. Finally, you would print the results.
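The retrieval step can be sketched as follows. The SQL text assumes the hypothetical `faqs` table with `payload` and `vec` columns and is shown only as a string; the pure-Python functions mirror the same idea (score each chunk by cosine distance to the question vector, order by score, keep the closest few) so the logic can be tried without a database.

```python
import math

# Assumed query shape: smaller cosine distance means a closer match.
TOP_K_SQL = """
SELECT payload, VECTOR_DISTANCE(vec, :qvec, COSINE) AS score
FROM faqs
ORDER BY score
FETCH FIRST 3 ROWS ONLY
"""

def cosine_distance(a, b):
    """Return 1 - cosine similarity; both vectors are assumed to be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(question_vec, rows, k=3):
    """rows is a list of (text, vector) pairs; return the k closest as (score, text)."""
    scored = [(cosine_distance(question_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]

rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
results = top_k([1.0, 0.0], rows, k=2)
for score, text in results:
    print(f"{score:.4f}  {text}")
```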

Then we would create the large language model prompt and call the generative AI LLM. We would ensure that our prompt does not exceed the maximum context length of the model, and then define the prompt content.
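One way to keep the prompt within the model's limit is to add retrieved chunks only while a size budget allows. The sketch below uses a character count as a crude proxy for the model's token limit, which is an assumption; the function name, the `max_chars` parameter, and the prompt wording are all hypothetical.

```python
def build_prompt(question, chunks, max_chars=4000):
    """Assemble a prompt from retrieved chunks without exceeding max_chars.

    Chunks are expected in relevance order; lower-ranked ones are dropped first.
    """
    header = "Answer the question using only the context below.\n\nContext:\n"
    footer = f"\n\nQuestion: {question}\nAnswer:"
    budget = max_chars - len(header) - len(footer)
    kept = []
    for chunk in chunks:
        cost = len(chunk) + 1  # +1 for the newline that joins chunks
        if cost > budget:
            break
        kept.append(chunk)
        budget -= cost
    return header + "\n".join(kept) + footer

prompt = build_prompt("Q?", ["a" * 50, "b" * 50, "c" * 50], max_chars=200)
```

If the header and question alone exceed the budget, no chunks are added and the prompt will still be longer than `max_chars`, so a real application would also need to check the final length.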

We would then initialize the OCI client and then make the call.

04:47

Here's some exciting news! Oracle University has training to help your teams unlock Redwood—the next-gen design system for Fusion Cloud Applications. Learn how Redwood improves your user experience and discover how to personalize your Fusion investment using Visual Builder Studio. Whatever your role, visit mylearn.oracle.com and check out these courses today!

05:12

Nikita: Thanks, Brent. That gives us a nice overview of how Python can be leveraged with OCI Generative AI. Now, how would you compare working with Python for building RAG applications to using PL/SQL? Can you walk us through the high-level process for building a RAG solution in this environment?

Brent: First, we would want to load the document. Next, we would transform the document into plain text. After that, we would take that text and split it into meaningful chunks. Next, we would go ahead and set up Oracle AI Vector Search and insert the embedding vectors. We would then build the prompt so that we can query the document. And then we would invoke all of those previous steps as our chain.

06:04

Lois: OK, and can we take a closer look at each of these steps?

Brent: Step 1, text extraction and preparation. So, let's imagine we have some sort of document that we want to use as the augmented information. We would load that document. Next, we would transform the document to text. And we have a function in the DBMS_VECTOR_CHAIN package called UTL_TO_TEXT. And this is used to extract plain text from the loaded documents.

Next, we would want to split the text into meaningful chunks. The DBMS_VECTOR_CHAIN package has another function called UTL_TO_CHUNKS that allows us to divide the extracted text into smaller, more manageable pieces, which we call chunks.
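To make the idea of chunking concrete, here is a small Python sketch that splits text into word-based chunks with an optional overlap. It illustrates the concept only and is not how the database package is implemented; the function and parameter names are hypothetical.

```python
def chunk_by_words(text, max_words=100, overlap=0):
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so context is not cut abruptly.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks

text = " ".join(str(i) for i in range(10))
pieces = chunk_by_words(text, max_words=4, overlap=1)
```

Smaller chunks tend to give more precise matches, while overlap reduces the chance that a sentence relevant to a question is split across two chunks.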

07:02

Nikita: Once we have our text chunks ready, what's the next step to make our data searchable and useful for the large language model?

Brent: Step number 2, we would want to go ahead and use embedding models in order to create our vectors. We would load multiple ONNX models into the database. And the reason we would do this is because models with a greater number of dimensions usually produce higher quality vector embeddings.

So you might want to load multiple different ONNX models into the database so that you can generate embeddings from each of the models, and then compare those vector embeddings using those different models. You would create vector embeddings using PL/SQL packages.
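Comparing models could look something like the sketch below: rank the same chunks against the same question under each model, then measure how much the top results agree. The vectors here are hand-made placeholders for two hypothetical models with different dimension counts; in practice they would come from the embedding models loaded into the database.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_chunks(query_vec, chunk_vecs):
    """Return chunk indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in chunk_vecs]
    return sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)

def top_k_overlap(ranking_a, ranking_b, k):
    """Fraction of the top-k chunks that both rankings have in common."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k

# Placeholder embeddings of the same three chunks under two hypothetical models.
model_a_query, model_a_chunks = [1, 0], [[1, 0], [0, 1], [1, 1]]
model_b_query, model_b_chunks = [1, 0, 0], [[0, 1, 0], [1, 0, 0], [1, 1, 0]]

ranking_a = rank_chunks(model_a_query, model_a_chunks)
ranking_b = rank_chunks(model_b_query, model_b_chunks)
agreement = top_k_overlap(ranking_a, ranking_b, k=2)
```

A low overlap does not say which model is better; it only flags questions where the models disagree and a human review of the retrieved chunks is worthwhile.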

07:55

Lois: After embeddings are created, how does the solution find the most relevant content in response to a user's question?

Brent: Step 3, we would then go and do a similarity search so that we can return a response. We would select the text chunks that have the relevant information for the input user question based on vector search. This allows for integrating with Oracle's Gen AI Large Language Model Service to generate responses. The process ensures that the large language model generates contextually appropriate and relevant answers for those users' queries.

Now, step 4 is to build the prompt, and I want to stress the importance of large language model prompt engineering. What this will do is to carefully craft input queries or instructions so that we can get more accurate and desirable outputs from the large language model.

This allows developers to guide the LLM's behavior and tailor its responses to specific requirements. This is what we call LLM Prompt Engineering. And it allows us, as I was saying, to craft input queries or instructions so that we can create more accurate and desirable outputs.

Next, we would use an example interactive RAG application that uses the Streamlit framework in order to create a user-friendly interface. This interface will allow us to upload documents, pose the question, and receive relevant answers generated by the underlying RAG pipeline within the database.

In the final step, we will have an input prompt that asks us to ask a question about the PDF. We will then type in some sort of a question relative to the PDF content. And then we would retrieve the returned data based on the input question.

10:11

Nikita: Brent, thank you for walking us through both the Python and PL/SQL approaches for building RAG solutions with Oracle Generative AI. If you'd like to dive deeper into these topics, don't forget to visit mylearn.oracle.com and look for the Oracle AI Vector Search Deep Dive course. Until next time, this is Nikita Abraham…

Lois: And Lois Houston, signing off!

10:33

That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
