Riding the RAG Trail: Access, Permissions and Context

Ophir Dror

Bar Lanyado

August 5, 2024

What is RAG?

Retrieval-Augmented Generation (RAG) is a method that extends the capabilities of Large Language Models (LLMs) by connecting them to external data sources. It retrieves data or documents relevant to a specific query or task and passes them to the LLM as additional context. As a result, organizations can draw on their internal documents to make the LLM's output significantly more relevant and to reduce one of the biggest challenges with LLMs today: hallucinations.

RAG allows LLMs, which are trained on vast datasets and have billions of parameters, to access up-to-date or specialized knowledge without retraining or fine-tuning the model. It is particularly valuable for applications such as support chatbots and Q&A systems that require timely and accurate information. By combining the strengths of LLMs with the reliability of external knowledge bases, RAG helps keep generated responses fast, coherent, and grounded in authoritative, current data.

How Does RAG Work?

There are three common steps in a RAG-based architecture:

Data Indexing: The data is loaded, split into small chunks, and stored in an index. Usually this is done by embedding the chunks and storing them as vectors in a vector database. Popular options that support vector search include Pinecone, Elasticsearch, Redis, Qdrant, and Chroma.
Query Input and Retrieval: When a query (the user's question) enters the RAG system, it is embedded, and the resulting query vector is compared with the vectors already stored in the database. Similarity metrics such as cosine similarity or Euclidean distance identify the closest matches, and the most relevant documents are retrieved.
Contextual Integration: The retrieved documents, which provide the necessary context, are then passed to a generative model alongside the user's query. This additional context helps the model produce responses that are both coherent and contextually accurate.
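
To make the three steps concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production pipeline: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, the index is an in-memory list rather than a vector database, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents, chunk_size=12):
    """Step 1: split each document into word chunks and store (chunk, vector)."""
    index = []
    for doc in documents:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            index.append((chunk, embed(chunk)))
    return index

def retrieve(index, query, k=2):
    """Step 2: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(query, chunks):
    """Step 3: place the retrieved chunks in the prompt as context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical sample documents.
docs = [
    "The VPN client must be updated every quarter by the IT team.",
    "Expense reports are due on the fifth business day of each month.",
]
index = build_index(docs)
question = "When are expense reports due?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the LLM
```

In a real deployment, `embed` would call an embedding model, `build_index` and `retrieve` would be replaced by a vector database's insert and query operations, and the resulting prompt would be sent to the LLM.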

RAG Access Control Problem

RAG is a strong option for building LLM-based applications on our own data without the need for training or fine-tuning. However, there is a significant drawback that could hinder its use in organizations or applications: RAG does not natively support access control, and implementing it on most vector databases is not straightforward.

When indexing documents in a vector database without storing additional metadata, any user query will be compared against all vectors in the database. The most relevant documents will then be retrieved and used to generate an answer.

This means that if a user asks a question about a topic they shouldn't have access to, they could still receive an answer if the relevant data exists in the database, even without any injection or bypass techniques.
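
The leak is easy to reproduce in a toy setting. The sketch below assumes a hypothetical two-document index with hand-picked 2-D vectors in place of real embeddings. Because the index stores no access metadata, the retrieval function has nothing to check the caller against.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical index: (text, vector) pairs with no access metadata.
# The vectors are hand-picked: axis 0 loosely means "salary", axis 1 "travel".
index = [
    ("Executive salary bands (HR only)", (0.9, 0.1)),
    ("Travel policy: book flights 14 days ahead", (0.1, 0.9)),
]

def retrieve(query_vector, k=1):
    """Rank every stored vector against the query; no permission check exists."""
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A user outside HR asks a salary question; the HR document is the closest match.
leaked = retrieve((1.0, 0.0))
```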

Ensuring Secure Data Access with RAG

Today, there are two main strategies for securing access and permissions when building a RAG system:

Separate Instances

One way to ensure secure access is by creating separate instances for different data types or user roles. For example:

Finance Instance: Accessible only to finance team members.
HR Instance: Restricted to HR personnel.
General Instance: Available to all users for accessing general information.

This approach allows the application to direct queries to the appropriate instance based on the user's role, ensuring that sensitive data remains protected within its designated instance.
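
One way to picture the routing is the following sketch. The role names, instance names, and documents are hypothetical, each instance is a plain Python list standing in for a separate vector store, and a substring match stands in for vector search. It also assumes that every role can reach the general instance in addition to its own.

```python
# Hypothetical mapping from a user's role to the instances it may query.
ROLE_INSTANCES = {
    "finance": ["finance", "general"],
    "hr": ["hr", "general"],
}
DEFAULT_INSTANCES = ["general"]  # any other role only reaches general data

# Each "instance" is a plain list standing in for a separate vector store.
INSTANCES = {
    "finance": ["Q2 budget forecast"],
    "hr": ["Parental leave case notes"],
    "general": ["Office opening hours"],
}

def instances_for(role):
    """Return the instance names a role is routed to."""
    return ROLE_INSTANCES.get(role, DEFAULT_INSTANCES)

def search(role, term):
    """Search only the routed instances (substring match stands in for vector search)."""
    hits = []
    for name in instances_for(role):
        hits += [doc for doc in INSTANCES[name] if term.lower() in doc.lower()]
    return hits

finance_hits = search("finance", "budget")  # finance user finds the forecast
hr_hits = search("hr", "budget")            # HR user never touches the finance instance
```

Note that even this small example has to query more than one instance per user, which hints at the integration overhead that comes with the approach.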

However, this approach introduces new problems:

Increased Complexity: Managing multiple instances can complicate system architecture, requiring more effort to maintain and update.
Data Duplication: There's a risk of data duplication across instances, leading to increased storage costs and potential inconsistencies.
Integration Challenges: Ensuring seamless integration between instances can be challenging, especially when users need access to multiple data types simultaneously.
Resource Allocation: Running multiple instances may demand more computational resources, impacting overall system performance and efficiency.

Document-Level Access Control

Another method involves adding metadata attributes to each document during indexing, specifying the roles or users authorized to access that document. When a user queries the RAG system, the search is limited to documents they have permission to access, based on their role or user ID.
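
As a rough sketch of the idea, assuming hypothetical role names and an in-memory list in place of a vector database, each chunk below carries an `allowed_roles` attribute, and the search only considers chunks whose roles overlap with the caller's. A substring match stands in for similarity search.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    # Access metadata attached at indexing time (hypothetical role names).
    allowed_roles: set = field(default_factory=set)

INDEX = [
    Chunk("Payroll export schedule", {"finance"}),
    Chunk("Interview feedback guidelines", {"hr"}),
    Chunk("Payroll FAQ for all employees", {"finance", "hr", "employee"}),
]

def search(query, user_roles):
    """Apply the permission filter first, then match only the permitted chunks."""
    permitted = [c for c in INDEX if c.allowed_roles & set(user_roles)]
    return [c.text for c in permitted if query.lower() in c.text.lower()]

employee_view = search("payroll", {"employee"})  # only the general FAQ
finance_view = search("payroll", {"finance"})    # the schedule and the FAQ
```

Several vector databases expose metadata filters in their query APIs, which lets the filter run inside the database rather than in application code. The exact syntax depends on the product.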

This method also has its own drawbacks:

Query Performance: Filtering queries based on access control metadata can slow down query performance, especially with large datasets.
Implementation Complexity: Setting up and managing fine-grained access control requires meticulous planning and can be error-prone if not implemented correctly.
Maintenance Burden: Keeping access control metadata up-to-date requires ongoing effort, particularly as users’ roles and permissions change over time.

👉 While both approaches aim to enhance data security, each comes with its own set of challenges that need to be carefully managed.

OK, so are we ready to go?

Not really.

In many organizations, it's common to have files on shared drives or storage systems with "Everyone" permissions or other overly broad access roles. This isn't a new problem introduced by LLMs and RAG, but these systems can amplify it and make it even harder to manage.

If a file with "Everyone" permission is buried deep within a shared drive and a user isn't aware of its existence, the risk of them accessing it is relatively low. With a RAG system, however, a user could unintentionally reach content they were never meant to see simply by asking a question. If the answer lies in one of these broadly accessible files, the system can retrieve and present the information without the user ever needing to know where the file is located.

To address this issue, we need to implement an additional layer of security:

Introducing Context-Based Access Control (CBAC)

While traditional access control mechanisms manage document and data access based on permissions, the unpredictable world of LLMs and the evolving architecture of RAG call for a new solution that oversees the actual data being requested and received.

Context-Based Access Control (CBAC) is a new feature from Lasso Security designed to change the way companies enforce access control and security policies.

The Need for CBAC

CBAC brings a new perspective to the world of LLMs and RAG by focusing on the context of both the request and the response and comparing it against a few parameters that describe the user's expected behavior. This approach goes beyond structured data to understand the nuances of context and support secure data handling in many challenging GenAI use cases. The feature gives admins fine-grained control and helps them ensure safe usage of RAG without the overhead of building, maintaining, and updating multiple systems indefinitely.

CBAC addresses these challenges by providing a context-based approach to data access. It enables organizations to:

Precisely Manage Access: Ensure that only authorized users can access specific pieces of information based on the context of their request.
Prevent Unauthorized Information Exposure: Block sensitive information from being retrieved and displayed to users who shouldn’t see it, even if they have broader permissions.
Handle Nuanced Data: Manage documents that contain both relevant and out-of-scope information by evaluating the context of each request.
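
To give a feel for what checking context on both sides of an exchange could look like, here is a deliberately simplified sketch. It is not Lasso Security's implementation, which this post does not describe: topic detection is reduced to a hypothetical keyword lookup, and the roles, topics, and function names are invented for illustration.

```python
import re

# Illustrative only: hypothetical topics, keywords, and per-role policies.
TOPIC_KEYWORDS = {
    "compensation": {"salary", "bonus", "payroll"},
    "it_support": {"vpn", "password", "laptop"},
}
ALLOWED_TOPICS = {
    "engineer": {"it_support"},
    "hr": {"it_support", "compensation"},
}

def topics_in(text):
    """Detect topics by keyword lookup (a stand-in for real context analysis)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords}

def is_allowed(role, text):
    """True if every topic detected in the text is permitted for the role."""
    return topics_in(text) <= ALLOWED_TOPICS.get(role, set())

def guarded_answer(role, question, generate):
    """Check the request context, call the model, then check the response context."""
    if not is_allowed(role, question):
        return "Request blocked by policy."
    answer = generate(question)
    if not is_allowed(role, answer):
        return "Response withheld by policy."
    return answer

# A harmless question whose retrieved answer mixes in out-of-scope content.
mixed = lambda q: "Laptop pickup is Monday. Starting bonus is paid in March."
result = guarded_answer("engineer", "Summarize the onboarding doc", mixed)
```

Here the engineer's question passes the request check, but the generated answer touches a topic outside the role's scope, so the response is withheld. This mirrors the case of a document that contains both relevant and out-of-scope information.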

By implementing CBAC on top of the previous alternatives (separate instances and document-level access control), organizations can strengthen their security and control and start using RAG as part of their enterprise stack without the overhead and drawbacks of each method.

This approach rethinks data protection: it aims to ensure that only the right information reaches the right users at the right moment, setting a new standard for access management and permissions.