Configure Data Sources in Gen AI Builder

Who is this for

This set of guides is for platform users who build Knowledge Bases and AI applications with Gen AI Builder, including developers, AI engineers, content managers, and system integrators.

If you are responsible for bringing external or internal content into your AI workflows (from documents, web pages, cloud storage, or proprietary systems), this set of guides will show you how.

What you will accomplish

You will learn how to configure and manage different types of Data Sources in Gen AI Builder. These sources provide the raw content that populates Libraries and Knowledge Bases for your AI applications.

Why use Data Sources

  • The quality and relevance of your AI applications depend heavily on the content you ingest.
  • Gen AI Builder allows you to connect to a variety of content sources and manage the ingestion pipeline securely and efficiently.
  • Data Sources with scheduled refresh help keep your Knowledge Bases fresh, accurate, and aligned with your organization’s needs.

About Data Sources in Gen AI Builder

Data Sources are the first step in bringing content into the system. The content from Data Sources flows into Libraries, and then into Knowledge Bases, where it is indexed and made available for AI workloads.
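
The flow described above (Data Source to Library to Knowledge Base) can be pictured with a small, purely illustrative Python sketch. Every class and method name here is hypothetical and invented for this illustration; none of it is the Gen AI Builder API. A simple keyword index stands in for the real indexing step, which is not specified on this page.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical stand-in for a configured Data Source."""
    name: str
    documents: list[str]

    def fetch(self) -> list[str]:
        # A real source would pull from a web page, S3 bucket, Confluence space, etc.
        return list(self.documents)


@dataclass
class Library:
    """Hypothetical Library: gathers raw content from one or more sources."""
    name: str
    sources: list[DataSource] = field(default_factory=list)

    def collect(self) -> list[str]:
        return [doc for source in self.sources for doc in source.fetch()]


@dataclass
class KnowledgeBase:
    """Hypothetical Knowledge Base: indexes Library content for lookup."""
    name: str
    documents: list[str] = field(default_factory=list)
    index: dict[str, set[int]] = field(default_factory=dict)

    def ingest(self, library: Library) -> None:
        for doc in library.collect():
            doc_id = len(self.documents)
            self.documents.append(doc)
            # Keyword index as a placeholder for the real indexing step.
            for word in doc.lower().split():
                self.index.setdefault(word, set()).add(doc_id)

    def search(self, word: str) -> list[str]:
        ids = sorted(self.index.get(word.lower(), set()))
        return [self.documents[i] for i in ids]


# Example: two sources feed one Library, which feeds one Knowledge Base.
wiki = DataSource("wiki", ["Reset your password from the login page"])
bucket = DataSource("bucket", ["Quarterly report for the sales team"])
library = Library("support-content", [wiki, bucket])
kb = KnowledgeBase("support-kb")
kb.ingest(library)
print(kb.search("password"))
```

The point of the sketch is the direction of the flow: sources only supply content, the Library aggregates it, and the Knowledge Base is the only layer that indexes and answers queries.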

Common types of Data Sources include:

  • Web pages
  • Data Lakes
  • Amazon S3 buckets
  • Google Drive folders
  • Atlassian Confluence spaces
  • Custom systems via PG.AI Structures

Key features:

  • Secure connection to data
  • Scheduled refresh to keep content up to date
  • Optional transformation with PG.AI Structures
  • Integration with downstream Libraries and Knowledge Bases

For foundational concepts, see:

For guidance on using this content in AI applications:

To see how Data Sources are used within Hybrid Manager deployments, visit:

How-to Guides

Use the following guides to configure each type of Data Source:

Each guide provides:

  • Who it is for
  • What you will accomplish
  • Why use this source
  • Detailed configuration steps
  • Key considerations and best practices
  • Troubleshooting tips
  • Example scenarios

Next steps

Once you have configured Data Sources, proceed to:

  • Configure Libraries to organize your content.
  • Build Knowledge Bases to enable semantic search and RAG applications.
  • Connect your Knowledge Bases to AI agents and workflows using Gen AI Builder and Structures.

For more advanced integrations, see:


By thoughtfully configuring Data Sources, you will lay a strong foundation for high-quality, context-aware AI applications in your organization.

