Grounding Your Salesforce Agent With Real-World Data (RAG, Chunking, Data Library & More!)
If Part 1 was about understanding what Agentforce is, Part 2 is all about understanding how your agent becomes smart, trustworthy, and actually useful in the real world.
And the secret is Grounding.
(Yes, the dramatic capital G is intentional.)
Let's dive in.
What Is Grounding? (And Why Your Agent Needs It)
Grounding = connecting your AI agent to trusted, authoritative data so it answers based on facts, not imagination.
When you ask an agent a question like:
"What is the refund policy for our subscription product?"
It shouldn't hallucinate. It should look at:
- Your internal Knowledge Articles
- Your Pricing policies
- Your Product documentation
- Your CRM records
- Your Product database, etc.
That is grounding.
It tells the LLM:
"Use THIS data only. Stay within THIS reality."
The Building Blocks of an Agent
Even a perfectly grounded agent needs the right internal structure. Salesforce defines three essential elements that make up an agent:
1. Topics
Define what the agent is responsible for
Example: "Refund Requests", "Appointment Scheduling", "Order Status"
2. Instructions
Tell the agent how to behave, what to avoid, and what rules to follow
Example: "Always verify customer identity before sharing account details."
3. Actions
Specific things the agent can perform
Examples:
- Create a Case
- Update an Order
- Fetch Customer Details
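To make the three building blocks concrete, here is a minimal Python sketch of how a topic, its instructions, and its actions relate to each other. The `Topic` and `Action` classes, the `create_case` helper, and the sample order number are illustrative assumptions, not the Agentforce API; in Salesforce these pieces are configured in the platform rather than written by hand.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """A specific thing the agent can do (hypothetical structure)."""
    name: str
    run: Callable[..., dict]


@dataclass
class Topic:
    """A job the agent is responsible for, with its rules and actions."""
    name: str
    instructions: List[str] = field(default_factory=list)
    actions: Dict[str, Action] = field(default_factory=dict)

    def add_action(self, action: Action) -> None:
        self.actions[action.name] = action


def create_case(subject: str) -> dict:
    # Stand-in for a real "Create a Case" action; returns a plain dict.
    return {"type": "Case", "subject": subject, "status": "New"}


refunds = Topic(
    name="Refund Requests",
    instructions=["Always verify customer identity before sharing account details."],
)
refunds.add_action(Action(name="Create a Case", run=create_case))

# Sample call with a made-up subject line.
case = refunds.actions["Create a Case"].run(subject="Refund for order 1001")
print(case)
```

The topic scopes the job, the instructions travel with it, and each action is just a named callable the agent is allowed to invoke.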
Connect Actions to Data with Four Mechanisms
Grounding isn't just about finding the right information. Your agent must also know how to use that information when performing real actions.
In Agentforce, this connection happens through four powerful data-access mechanisms. Each mechanism tells the agent where the data lives and how it should be retrieved or modified.
These mechanisms act like different "doors" through which the agent can reach your business data, depending on what the task requires.

1. Grounded Actions: When your data is stored natively in Salesforce
Use Grounded Actions when the agent needs to work directly with Salesforce data you already trust, such as:
- Accounts
- Contacts
- Leads
- Cases
- Opportunities
- Custom objects
Grounded Actions allow the agent to read and write this data safely, using the platform's built-in permissions and security model.
Perfect for CRM-centric tasks like:
- "Update the case priority."
- "Create a follow-up task."
- "Find all opportunities closing this month."
Because the agent uses real Salesforce objects, its decisions stay grounded in accurate, structured information.
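As a rough illustration of "Find all opportunities closing this month", the sketch below filters structured records and applies a visibility rule before returning anything. The in-memory list, the `owner` check, and the function name are assumptions made for the example; on the platform, the real permission model does this work for you.

```python
from datetime import date

# Sample in-memory records standing in for Opportunity rows (made-up data).
OPPORTUNITIES = [
    {"name": "Acme renewal", "close_date": date(2025, 6, 12), "owner": "ana"},
    {"name": "Globex upsell", "close_date": date(2025, 7, 3), "owner": "ben"},
    {"name": "Initech pilot", "close_date": date(2025, 6, 28), "owner": "ana"},
]


def opportunities_closing_in(records, year, month, visible_to):
    """Return names of records closing in the given month that the user may see."""
    return [
        r["name"]
        for r in records
        if r["close_date"].year == year
        and r["close_date"].month == month
        and r["owner"] == visible_to  # crude stand-in for platform permissions
    ]


print(opportunities_closing_in(OPPORTUNITIES, 2025, 6, visible_to="ana"))
```

Because the answer comes from a field-level filter over real records, there is nothing for the model to guess.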
2. Data Graph: When you need connected, contextual information
Sometimes data lives across many related objects. That's where the Data Graph comes in.
A Data Graph gives your agent a relationship-aware view of your Salesforce data. You define a "graph" of objects and their connections, for example:
- Customer → Orders → Order Line Items → Products
Your agent can then reason across the entire graph as a single interconnected dataset.
Useful for:
- Customer 360 tasks
- Order history analyses
- Eligibility checks
- Product recommendations
The Data Graph works best when decisions depend on multiple objects connected through relationships.
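A small sketch of what "reasoning across the graph" can look like, using nested Python dictionaries to stand in for the Customer → Orders → Order Line Items → Products chain. The field names and sample records are assumptions for illustration only; an actual Data Graph is defined in Salesforce, not in code like this.

```python
# Made-up sample graph: Customer -> Orders -> Order Line Items -> Products.
customer = {
    "name": "Dana",
    "orders": [
        {"id": "O-1", "line_items": [
            {"product": {"sku": "KB-01", "name": "Keyboard"}, "quantity": 1},
            {"product": {"sku": "MS-02", "name": "Mouse"}, "quantity": 2},
        ]},
        {"id": "O-2", "line_items": [
            {"product": {"sku": "KB-01", "name": "Keyboard"}, "quantity": 3},
        ]},
    ],
}


def units_per_product(customer_node):
    """Walk the whole graph and total the units bought per product SKU."""
    totals = {}
    for order in customer_node["orders"]:
        for item in order["line_items"]:
            sku = item["product"]["sku"]
            totals[sku] = totals.get(sku, 0) + item["quantity"]
    return totals


print(units_per_product(customer))
```

A single traversal answers an order-history question that would otherwise need separate lookups on three objects.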
3. Actions on CRM and External Systems: When data lives beyond Salesforce
Businesses don't live in one system, and neither should your agent.
This mechanism allows your Agentforce agent to interact with:
- External APIs
- Integration platforms
- Back-office applications
- Custom REST endpoints
Examples:
- Fetching shipment tracking from a logistics system
- Pulling credit score from a partner API
- Checking inventory in a warehouse system
This expands your agent's capabilities far beyond CRM and ensures it has access to real-time operational data, even if it lives outside Salesforce.
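One way to picture an external-system action is as a thin wrapper that calls out, handles the "not found" case, and hands the agent a short, factual sentence. In this sketch the `fetch` function is injected and replaced by a stub dictionary so it runs without a network; the tracking IDs, field names, and `make_tracking_action` helper are all hypothetical.

```python
from typing import Callable, Dict


def make_tracking_action(fetch: Callable[[str], Dict]) -> Callable[[str], str]:
    """Wrap an external lookup so the agent gets a short, readable answer.

    `fetch` is injected so the sketch runs offline; in practice it would
    call the logistics system's API.
    """
    def track_shipment(tracking_id: str) -> str:
        try:
            data = fetch(tracking_id)
        except KeyError:
            return f"No shipment found for {tracking_id}."
        return f"Shipment {tracking_id} is {data['status']} (ETA {data['eta']})."
    return track_shipment


# Stub standing in for the external system (made-up data).
FAKE_LOGISTICS = {"TRK123": {"status": "in transit", "eta": "2025-06-14"}}
track = make_tracking_action(lambda tid: FAKE_LOGISTICS[tid])

print(track("TRK123"))
print(track("TRK999"))
```

Keeping the external call behind one function also makes it easy to swap the stub for a real client later.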
4. RAG: The Heart of Grounding
Retrieval-Augmented Generation (RAG) means the agent:
- Receives a user query
- Retrieves relevant, real-world data
- Uses that data to generate grounded, factual output
LLMs don't know your business.
RAG lets them pull knowledge from YOUR data before generating an answer.
Structured vs. Unstructured Data in RAG
RAG can ground on three types of data:
1. Structured Data
Highly organized. Searchable by fields.
Examples:
- Salesforce Objects (Lead, Case, Product, Contract)
- Database tables
- CSVs
Great for:
- precise lookups
- numerical or identifier-based queries
Example:
"What is the warranty period for product XYZ123?"
A simple CRM lookup might be enough.
2. Unstructured Data
Humans love writing. Machines don't love parsing it.
Examples:
- PDFs
- Policy documents
- Web pages
- Meeting transcripts
- User manuals
- Knowledge articles
This is where LLMs shine, but only if you help them access the right parts.
3. Semi-Structured
A mix.
Examples:
- JSON
- XML
- Chat logs
- Formatted docs
Most organizations have tons of unstructured content lying around, and it's rich with answers. RAG makes unstructured data searchable, relevant, and safe to use inside an AI workflow.
Introducing Agentforce Data Library
(Where Chunking, Indexing & Retrieval Live)
Agentforce uses the Agentforce Data Library (ADL) to ingest, transform, index, and prepare your data for retrieval.
Think of ADL as the "data brain" behind your agent.
How Data Library Works (The Real Magic)
Let's break it down into digestible steps.
1. Chunking: Breaking Large Content Into Smart Pieces
Handing an LLM a whole 40-page PDF and hoping it picks out the relevant part is slow, costly, and unreliable.
So ADL automatically chops your documents into smaller, meaningful "chunks."
Example:
- A 20-page Refund Policy PDF → 200 chunks
- A product manual → 100 chunks
Each chunk becomes a small searchable unit.
This makes retrieval fast, accurate, and context-rich.
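To show the idea of chunking, here is a minimal sketch that splits text into fixed-size word windows with a small overlap. This is one common, simple strategy chosen for illustration; it is not a description of how ADL chunks internally, and the `chunk_text` name and its default sizes are assumptions.

```python
from typing import List


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word windows of max_words, each sharing `overlap` words
    with the previous window so text cut at a boundary keeps some context."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 120-word "document" so the chunk boundaries are easy to see.
policy = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(policy, max_words=50, overlap=10)
print(len(chunks))
```

The overlap is the design choice worth noticing: without it, a rule that straddles two chunks could be split so that neither chunk contains the full answer.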
2. Indexing: Creating a High-Speed Search Layer
After chunking, ADL builds a vector index.
In simple terms:
- Each chunk becomes an embedding (mathematical representation of meaning)
- These embeddings are placed in an index
- When the agent gets a question, it finds the most similar chunks
This is the backbone of RAG.
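The three steps above can be sketched with a toy embedding. Here each chunk is turned into a word-count vector and compared with cosine similarity; real systems use a learned embedding model and a dedicated vector store, so treat `embed`, `cosine`, and the sample policy sentences as assumptions for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. Real systems use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Made-up chunks standing in for pieces of policy documents.
chunks = [
    "refunds are issued within 30 days of purchase",
    "the warranty covers manufacturing defects for one year",
    "enterprise customers may cancel with 60 days notice",
]
index = [(chunk, embed(chunk)) for chunk in chunks]  # the "vector index"

query = embed("how many days for refunds")
best = max(index, key=lambda entry: cosine(query, entry[1]))
print(best[0])
```

The index is built once; each question then only costs one embedding plus a similarity search.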
3. Retriever: The Engine That Finds Relevant Chunks
The retriever is what actually searches the index.
When a user asks:
"What are the cancellation rules for Enterprise Customers?"
The retriever fetches:
- Enterprise contract policies
- SLA docs
- Pricing schedules
- Relevant knowledge articles
These chunks are sent to the LLM along with the prompt template.
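Here is a minimal sketch of that hand-off: rank chunks against the question, keep the top few, and drop them into a prompt template. The template wording, the word-overlap scoring, and the sample chunks are assumptions for illustration; Agentforce provides its own retrievers and prompt templates.

```python
import re

PROMPT_TEMPLATE = """Answer using ONLY the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}"""


def words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Rank chunks by how many question words they share; return the top k."""
    q_words = words(question)
    scored = [(len(q_words & words(c)), c) for c in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [c for score, c in ranked[:k] if score > 0]


def build_prompt(question: str, chunks: list, k: int = 2) -> str:
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Made-up chunks for illustration.
chunks = [
    "Enterprise customers may cancel with 60 days written notice.",
    "Standard plans renew monthly.",
    "Cancellation fees are waived for enterprise customers in the first year.",
]
question = "What are the cancellation rules for Enterprise Customers?"
prompt = build_prompt(question, chunks)
print(prompt)
```

Only the retrieved chunks reach the model, which is what keeps the answer inside "THIS reality".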

4. Setup-Time vs Run-Time: What Happens When?
Setup-Time (When You Configure ADL):
- You add data sources (files, knowledge articles, objects)
- ADL creates a Data Stream
- Chunking happens
- Indexing happens
- Retriever is prepared
- Metadata + mappings are generated
- You reference the retriever in your agent's design

Run-Time (When the Agent Is Live):
- User asks a question
- Retriever searches the index
- Most relevant chunks are selected
- Prompt template is filled with these chunks
- LLM generates grounded response
- Agent returns accurate, policy-compliant output
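The split between the two phases can be sketched as a class with a one-time `setup` method and a per-question `answer` method. Everything here is a simplified assumption: the class name, the sentence-level chunking, the word-overlap search, and the stub `echo_llm` that stands in for a real model so the example runs offline.

```python
import re
from typing import Callable, List


class GroundedAgentSketch:
    """Separates one-time setup (chunk + index) from per-question run-time work."""

    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm    # any function that maps a prompt to text
        self.index = []   # filled at setup time: (chunk, word set) pairs

    def setup(self, documents: List[str]) -> None:
        # Setup time: split each document into sentence chunks and index them.
        for doc in documents:
            for chunk in re.split(r"(?<=[.!?])\s+", doc.strip()):
                if chunk:
                    self.index.append((chunk, set(re.findall(r"\w+", chunk.lower()))))

    def answer(self, question: str, k: int = 1) -> str:
        # Run time: search the index, fill the prompt, call the LLM.
        q = set(re.findall(r"\w+", question.lower()))
        ranked = sorted(self.index, key=lambda e: len(q & e[1]), reverse=True)
        context = " ".join(chunk for chunk, _ in ranked[:k])
        return self.llm(f"Context: {context}\nQuestion: {question}")


def echo_llm(prompt: str) -> str:
    # Stub LLM that returns the context line, so the sketch runs offline.
    return prompt.splitlines()[0]


agent = GroundedAgentSketch(echo_llm)
# Made-up policy text for illustration.
agent.setup(["Model Z has a 30-day refund window. Premium users get a 2-year warranty."])
print(agent.answer("What is the refund window for Model Z?"))
```

The expensive work (chunking and indexing) happens once in `setup`, so each live question only pays for a search and one model call.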

A Practical Example: Making a "Refund & Warranty Support Agent"
Imagine you upload:
- 3 Warranty policy PDFs
- 50 Knowledge articles
- A troubleshooting guide
- A CSV of product models
ADL will:
- Chunk PDFs → 700 chunks
- Chunk support documents → 300 chunks
- Create embedding index
- Build retriever
- Allow agent to pull relevant blocks at runtime
Then your agent can answer:
- "What's the refund window for Model Z?"
- "Do premium users get extended warranty?"
- "Can I return a product without invoice?"
And it can answer accurately, because it uses YOUR content.