Configure an Atlassian Confluence data source

Who is this for

Platform users who want to ingest content from Atlassian Confluence into Gen AI Builder for use in Knowledge Bases and Retrieval-Augmented Generation (RAG) workflows. Typical users include developers, technical writers, and knowledge managers who maintain institutional knowledge in Confluence.

What you will accomplish

You will configure an Atlassian Confluence data source in Gen AI Builder to ingest pages from one or more Confluence spaces or URLs. The ingested content will then be available for transformation and indexing into Libraries and Knowledge Bases.

Why use a Confluence data source

  • Confluence is a common platform for internal technical documentation, policies, engineering notes, and project knowledge.
  • Bringing Confluence content into Gen AI Builder makes this valuable institutional knowledge searchable and usable in AI workflows.
  • Targeted ingestion of selected spaces or pages gives you fine-grained control over what is ingested.

Complexity and time to complete

  • Complexity: Moderate. You need to obtain an Atlassian API Token and understand your Confluence space structure.
  • Estimated time: 10–15 minutes if the API token has already been generated.

Key considerations

Data relevance and quality

  • Ingest trusted spaces or specific pages with high-quality documentation.
  • Ensure content is well structured and uses semantic page layouts to improve AI processing.

Scope of ingestion

  • Use space-specific or page-specific URLs when possible to avoid ingesting overly broad or irrelevant content.
  • If you want to ingest multiple spaces, configure multiple data source entries.
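To check what scope a URL implies before you enter it, a small helper like the following may be useful. It is only a sketch: the function name describe_scope is hypothetical, and the path patterns /wiki/spaces/<KEY> and /wiki/spaces/<KEY>/pages/<ID>/... are assumptions based on typical Confluence Cloud URLs, not something Gen AI Builder is documented to parse this way.

```python
from urllib.parse import urlparse


def describe_scope(url: str) -> dict:
    """Classify a Confluence URL as site-wide, space-specific, or page-specific.

    Assumes Confluence Cloud style paths (hypothetical helper, not a product API).
    """
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]

    space_key = None
    page_id = None
    if "spaces" in segments:
        i = segments.index("spaces")
        # The segment after "spaces" is the space key, for example DEVKB.
        if i + 1 < len(segments):
            space_key = segments[i + 1]
        # A page URL continues with "pages/<id>" after the space key.
        if segments[i + 2 : i + 3] == ["pages"] and i + 3 < len(segments):
            page_id = segments[i + 3]

    if page_id:
        scope = "page"
    elif space_key:
        scope = "space"
    else:
        scope = "site"

    return {
        "site": f"{parsed.scheme}://{parsed.netloc}",
        "space_key": space_key,
        "page_id": page_id,
        "scope": scope,
    }
```

A result with scope "site" is a hint that the data source may pull in more content than intended, so a narrower URL is worth considering.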

Credentials security

  • Use an API Token with read-only access when possible.
  • Do not use administrator-level tokens unless necessary.

How to configure an Atlassian Confluence data source

  1. In the interface where you manage data sources, select + Add New Data Source.
  2. Choose Confluence from the available data source types.
  3. Configure the following fields:
       • Name: Provide a clear and unique name. Example: Engineering Wiki Confluence.
       • Description (optional): Add descriptive context for future reference.
  4. Connect to Confluence:
       • Confluence Site URL: Enter your Confluence site URL. Example: https://mycompany.atlassian.net or https://mycompany.atlassian.net/wiki/spaces/DEVKB
       • API Token: Enter a valid Atlassian API Token.
       • Atlassian Email: Enter the email address associated with the API token.
  5. (Optional) Configure advanced options:
       • Scheduled refresh: Enable this option to automatically refresh the data on a schedule. Provide a cron expression. Example: 0 2 * * * (daily at 2 AM).
       • Transform your data: Enable this option to apply a PG.AI Structure to transform the data during ingestion. Select an existing Structure from the list.
  6. Select Create to add the Confluence data source.
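
Because a mistyped cron expression is easy to miss, it can help to sanity-check the value before entering it in the Scheduled refresh field. The sketch below assumes the standard five-field cron format with numeric values only; whether Gen AI Builder also accepts names such as MON or additional fields is not stated here, and validate_cron is a hypothetical helper.

```python
# Field names with their inclusive numeric ranges in standard five-field cron.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 7),  # 0 and 7 are both commonly treated as Sunday
]


def validate_cron(expression: str) -> dict:
    """Return a field-name to value mapping, or raise ValueError if malformed."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")

    for part, (name, low, high) in zip(parts, CRON_FIELDS):
        for item in part.split(","):
            # An optional "/step" suffix must be a positive integer.
            base, slash, step = item.partition("/")
            if slash and not (step.isdigit() and int(step) > 0):
                raise ValueError(f"{name}: bad step in {item!r}")
            if base == "*":
                continue
            # Otherwise expect a single number or an ascending "a-b" range.
            bounds = base.split("-")
            if len(bounds) > 2 or not all(b.isdigit() for b in bounds):
                raise ValueError(f"{name}: cannot parse {item!r}")
            numbers = [int(b) for b in bounds]
            if numbers != sorted(numbers) or not all(low <= n <= high for n in numbers):
                raise ValueError(f"{name}: {item!r} is outside {low}-{high}")

    return {name: part for part, (name, _, _) in zip(parts, CRON_FIELDS)}
```

For the example above, validate_cron("0 2 * * *") maps minute to 0 and hour to 2 with the remaining fields left as wildcards, which corresponds to a daily run at 2 AM.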

Supported content

  • Textual data from Confluence pages.
  • Content from supported macros (if rendered as text).
  • Content from page hierarchies.

For more about how content is indexed and retrieved, see Embeddings explained.

Managing and refreshing the data source

Once created, the Confluence data source can be viewed and managed through the Data Sources interface.

Actions available:

  • Edit: Modify the data source configuration.
  • Refresh: Manually trigger a data ingestion job.
  • Delete: Remove the data source.

You can also review the data job history to monitor ingestion performance and troubleshoot issues.

Troubleshooting

Authentication failure

  • Verify that the Confluence Site URL, API Token, and Atlassian Email are correct.
  • Confirm that the API Token has sufficient read permissions.
  • Check the data job history for specific errors.
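
One way to rule out a bad token or email is to test the credentials directly against Confluence, outside Gen AI Builder. The sketch below assumes a Confluence Cloud site, HTTP Basic authentication with the Atlassian email and API token, and the Confluence Cloud REST endpoint /wiki/rest/api/user/current; the function names are hypothetical, and Gen AI Builder may authenticate differently internally.

```python
import base64
import urllib.error
import urllib.request


def build_request(site_url: str, email: str, api_token: str) -> urllib.request.Request:
    """Build an authenticated request for the current-user endpoint (assumed)."""
    # Reduce a space or page URL to the site root, e.g. https://mycompany.atlassian.net
    base = site_url.split("/wiki")[0].rstrip("/")
    credentials = f"{email}:{api_token}".encode("utf-8")
    header = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        f"{base}/wiki/rest/api/user/current",
        headers={"Authorization": header, "Accept": "application/json"},
    )


def check_credentials(site_url: str, email: str, api_token: str) -> int:
    """Return the HTTP status code Confluence gives for these credentials."""
    request = build_request(site_url, email, api_token)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 401 usually points at the email or token; 403 at permissions.
        return err.code
```

Calling check_credentials with the same three values you entered in the data source form makes a live request; a 200 status suggests the credentials themselves are valid and the problem lies elsewhere.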

Space or page not found

  • Double-check the Confluence URL and verify that your account has access to the target spaces or pages.
  • If using a page-specific URL, confirm that the page is not restricted.
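
If you probe the target URL yourself with an HTTP client, the status code can narrow down which of the checks above applies. The mapping below is a rough sketch based on general HTTP semantics; the diagnose helper is hypothetical, and the errors shown in the Gen AI Builder data job history may be worded differently.

```python
def diagnose(status: int) -> str:
    """Translate an HTTP status from Confluence into a likely next step."""
    if 200 <= status < 300:
        return "ok: the account can read this resource"
    if status == 401:
        return "authentication: re-check the Atlassian email and API token"
    if status == 403:
        return "permissions: the account lacks access to this space or page"
    if status == 404:
        # Restricted content is often reported as missing rather than forbidden.
        return "not found: verify the URL and check for page restrictions"
    return f"unexpected status {status}: check the data job history"
```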

Example scenario

You want to index content from the "Developer Knowledge Base" space in your company's Confluence instance.

Example configuration:

  • Name: Developer KB Confluence Space
  • Confluence Site URL: https://mycompany.atlassian.net/wiki/spaces/DEVKB
  • API Token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  • Atlassian Email: developer@example.com
  • Scheduled refresh: 0 2 * * * (daily at 2 AM)
  • Transform your data: Apply a Structure to strip headers and footers and extract core content.
