What Is Vector Indexing? Everything You Need To Know

Large language models (LLMs) like ChatGPT and Claude, which can answer questions in a conversational way, have turned the world upside down. Learning and inquiring through chat has changed the way we interact with computers.

As LLMs continue to evolve and improve, organizations are looking for ways to bring their proprietary data and knowledge into LLM-like interfaces while still keeping that data secure. One popular method for combining the power of LLM chat interfaces with proprietary data is a Retrieval Augmented Generation (RAG) architecture.


At a high level, a RAG application combines an existing LLM with a large database of enterprise data. A query to the RAG application is first run against the database to retrieve relevant content, and that content is then passed to the LLM along with the query, so the response reflects the customer’s data.

One drawback of a RAG database can be the impact on international teams. Many vector databases are deployed in a single region, meaning a request may travel halfway around the world and back before the tool can respond, which adds noticeable latency. A distributed database with nodes in multiple regions can reduce this latency by serving each request from a nearby node.

YugabyteDB is a distributed PostgreSQL database integrated with `pgvector`, allowing teams to build distributed vector index databases around the world for fast, powerful RAG applications. 

Why is Vector Search Important?

Many RAG applications are built by placing institutional knowledge into a vector database. Let’s remember what a vector is. If you recall your high school math, you might remember vectors like (2,2), which represents a point on the x,y plane at x=2 and y=2.

The vectors in a vector database are similar to this 2D vector, except each one is much bigger. Where the (x,y) notation in our graph has two entries, these vectors may have hundreds or thousands of entries.

Why so large? In a vector search, text and images are converted into vectors (one-dimensional arrays of numbers, often called embeddings), and it can take many values to describe the meaning of the content mathematically. So, the RAG database is a database of vectors that represent text.

How Does the Question We Type on the Screen Get Compared to Vectors?

First, your query is converted into a vector. Then, a search is completed by comparing your query vector to the ones in the database using a similarity measure such as cosine similarity, and the response is built from the vectors most similar to your query. For example, if there are three vectors representing the words see, sea, and eyeball, the similarity between see and eyeball will likely be highest, as these two terms are semantically related, whereas sea is not as connected to either term.
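To make the comparison concrete, here is a minimal sketch that scores the three words against each other. It assumes cosine similarity as the measure, and the three-dimensional vectors are hand-picked for illustration only; real embeddings come from a model and have far more entries.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings, chosen by hand for this example.
embeddings = {
    "see": [0.9, 0.1, 0.3],
    "eyeball": [0.8, 0.2, 0.4],
    "sea": [0.1, 0.9, 0.2],
}

query = embeddings["see"]
for word in ("eyeball", "sea"):
    score = cosine_similarity(query, embeddings[word])
    print(word, round(score, 3))
```

With these made-up values, eyeball scores much closer to 1.0 than sea does, which is the behavior the example above describes.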

Vector search is commonly used in RAG applications and recommendation engines to find similarities between the query and the vectors stored in the RAG database.
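As a rough sketch of how vector search feeds a RAG application, the snippet below retrieves the best-matching passages and places them into a prompt. Everything here is an assumption made for illustration: the passages, the three-dimensional vectors, the dot-product scoring, and the function names `retrieve` and `build_prompt`. A real system would get its vectors from an embedding model and send the finished prompt to an LLM.

```python
def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, store, k=2):
    """Return the text of the k passages whose vectors score highest."""
    ranked = sorted(store, key=lambda item: dot(query_vec, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical store: passages paired with hand-made vectors.
store = [
    {"text": "Refunds are processed within 5 business days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Our office is closed on public holidays.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Refund requests require an order number.", "vector": [0.8, 0.0, 0.2]},
]

question = "How long do refunds take?"
query_vec = [1.0, 0.0, 0.1]  # stand-in for the output of an embedding model
prompt = build_prompt(question, retrieve(query_vec, store, k=2))
print(prompt)  # this string would then be sent to the LLM
```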

What is Vector Indexing?

With millions (and maybe billions) of vectors stored in a vector database, finding similarities through traditional brute-force searching, which compares the query against every stored vector, can be slow. To speed up searches, a common solution is to create a vector index of the database.
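For reference, a brute-force search can be sketched in a few lines. This is an illustrative example on random data, assuming Euclidean distance; the function names are made up for this sketch.

```python
import math
import random

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(query, vectors, k=3):
    """Return the indices of the k closest vectors by checking every vector."""
    scored = [(euclidean(query, v), i) for i, v in enumerate(vectors)]
    scored.sort()
    return [i for _, i in scored[:k]]

random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(1000)]
query = vectors[42]  # query with a stored vector so the best match is known
nearest = brute_force_search(query, vectors, k=3)
print(nearest)  # the first index is 42, since its distance to itself is 0
```

Every query performs one distance calculation per stored vector, so the cost grows linearly with the size of the database.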

Vector indexes are created as a part of a vector database and can be built using common libraries available for languages such as C++, Python, and R. Vector indexes speed up vector search by using mathematical approximations, rather than exhaustive exact comparisons, to find the best semantic results in the dataset.

The most common approaches use approximate nearest neighbor searches, and there are a number of common algorithms that can be used, including Hierarchical Navigable Small World (HNSW), Product Quantization (PQ), and Inverted File Index (IVF).

What is an Inverted File Index?

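An Inverted File Index (IVF) groups the stored vectors into clusters, each represented by a central point called a centroid, and keeps a list of the vectors that belong to each cluster. When a search is performed, the query is first compared to the centroids, and then only the vectors in the most pertinent clusters are searched. Because most of the database is skipped, results come back quickly, and accuracy stays high as long as the true nearest neighbors fall inside the clusters that are searched. Using an IVF index is an elegant way to balance the speed and accuracy of RAG search results.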
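The cluster-then-search idea can be sketched as follows. This is a simplified illustration, not how any particular database implements IVF: it assumes a basic k-means for clustering, random data, and hypothetical names such as `build_ivf`, `ivf_search`, and `n_probe` (the number of clusters searched per query).

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroids(vec, centroids, n):
    """Indices of the n centroids closest to vec."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
    return order[:n]

def assign(vectors, centroids):
    """Build the inverted lists: for each cluster, the ids of its member vectors."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest_centroids(v, centroids, 1)[0]].append(i)
    return lists

def build_ivf(vectors, n_clusters, iterations=5, seed=0):
    """Cluster the vectors with a basic k-means; return (centroids, inverted lists)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_clusters)]
    dim = len(vectors[0])
    for _ in range(iterations):
        lists = assign(vectors, centroids)
        for c, members in enumerate(lists):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [
                    sum(vectors[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, k=3, n_probe=2):
    """Search only the n_probe clusters whose centroids are closest to the query."""
    probed = nearest_centroids(query, centroids, n_probe)
    candidates = [i for c in probed for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k], len(candidates)

rng = random.Random(1)
vectors = [[rng.random() for _ in range(8)] for _ in range(1000)]
centroids, lists = build_ivf(vectors, n_clusters=10)
query = vectors[7]
top, compared = ivf_search(query, vectors, centroids, lists, k=3, n_probe=2)
print(top, compared)  # compared is the number of vectors actually checked
```

The results are approximate: a true nearest neighbor that sits in a cluster that was not probed will be missed. Raising `n_probe` checks more clusters, which improves accuracy at the cost of speed.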

What is the Difference Between a Vector Database and a Vector Index?

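A vector database is the system that stores the vectors and answers queries against them. A vector index is a core component of that database: a data structure that enables fast similarity searches over large collections of vectors, for example by clustering similar vectors. The clusters greatly reduce the number of comparisons that must be made for each query. This tradeoff, giving up a small amount of exactness for far fewer comparisons, speeds up querying of the data stored in the vector database.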
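A back-of-the-envelope calculation shows how large the reduction in comparisons can be. The numbers below are illustrative assumptions, not measurements: one million vectors split into 1,000 equally sized clusters, with 10 clusters searched per query.

```python
def ivf_comparisons(n_vectors, n_clusters, n_probe):
    """Estimated distance calculations per query, assuming equal-sized clusters."""
    centroid_checks = n_clusters  # compare the query to every centroid
    vector_checks = n_probe * (n_vectors // n_clusters)  # then scan the probed clusters
    return centroid_checks + vector_checks

n_vectors = 1_000_000
brute_force = n_vectors  # brute force compares against every vector
clustered = ivf_comparisons(n_vectors, n_clusters=1000, n_probe=10)
print(brute_force, clustered)  # 1000000 11000
```

Under these assumptions, the clustered search performs roughly 1% of the comparisons that a brute-force search would.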

Most vector-capable databases, including YugabyteDB, offer a vector index as part of their vector database offering. This helps queries to the vector database return highly relevant results quickly, even on large datasets.

YugabyteDB adds even more speed to RAG searches by providing an enterprise-grade distributed database. Deploying the same database in multiple global locations further reduces latency in connecting and communicating with the database. 

Combining geo-distributed, low-latency access with the power of vector databases and the speed of vector index searches makes YugabyteDB a compelling option for international teams looking to deploy enterprise RAG infrastructure. If you are interested in trying YugabyteDB for your data storage and RAG applications, you can sign up for free.