Configure a Google Drive data source

Who is this for

Platform users who want to ingest files stored in Google Drive into Gen AI Builder for use in Knowledge Bases and Retrieval-Augmented Generation (RAG) workflows. Typical users include developers, data engineers, and business stakeholders who collaborate on project documentation within Google Workspace.

What you will accomplish

You will configure a Google Drive data source in Gen AI Builder to ingest files from selected Google Drive folders. The content will then be available for transformation and indexing into Libraries and Knowledge Bases.

Why use a Google Drive data source

  • Many project documents live in collaborative Google Drive folders.
  • Google Docs, Slides, and Sheets are frequently used for dynamic team content that is not available elsewhere.
  • Directly connecting to Google Drive allows you to capture this valuable knowledge without manual export steps.

Complexity and time to complete

  • Complexity: Moderate. You must complete an OAuth flow and understand your Google Drive folder structure.
  • Estimated time: 5–10 minutes.

Key considerations

Data relevance and quality

  • Ingest curated folders that contain useful, trusted documents.
  • Well-written content in Google Docs and well-formatted Google Slides or Sheets contribute to more effective AI responses.

Scope of ingestion

  • Select folders carefully during the OAuth authorization flow.
  • Avoid ingesting broad folders (such as an entire My Drive or a Shared Drive root) unless that is intentional.

Permissions scope

  • Review the permissions requested during the OAuth process.
  • If using Google Workspace, confirm that your organization allows third-party app access as required.

How to configure a Google Drive data source

  1. In the interface where you manage data sources, select + Add New Data Source.
  2. Choose Google Drive from the available data source types.
  3. Configure the following fields:
       • Name: Provide a clear and unique name. Example: Project Alpha GDrive Docs.
       • Description (optional): Add descriptive context for future reference.
  4. Connect to Google Drive:
       • Select Connect With Google to initiate the OAuth 2.0 authorization flow.
       • Sign in to your Google Account and grant the requested permissions.
       • Follow the on-screen prompts to select the files or folders to ingest.
  5. (Optional) Configure advanced options:
       • Scheduled refresh: Enable this option to automatically refresh the data on a schedule.
           • Provide a cron expression. Example: 0 2 * * * (daily at 2 AM).
       • Transform your data: Enable this option to apply a PG.AI Structure to transform the data during ingestion.
           • Select an existing Structure from the list.
  6. Select Create to add the Google Drive data source.
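
The scheduled refresh field takes a cron expression, so it can help to check the expression before pasting it into the form. The following is a minimal sketch that assumes the standard five-field cron layout (minute, hour, day of month, month, day of week). The function name parse_cron is hypothetical and is not part of Gen AI Builder, and which cron extensions the platform accepts is not covered here.

```python
# Field names and allowed numeric ranges for a standard five-field cron expression.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 7),  # 0 and 7 both commonly mean Sunday
]


def parse_cron(expression: str) -> dict[str, str]:
    """Split a five-field cron expression and check plain numeric fields.

    Only single integers are range-checked. Wildcards, lists, ranges, and
    steps (for example "*", "1,15", "1-5", "*/10") are passed through as-is.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")

    parsed = {}
    for (name, low, high), value in zip(CRON_FIELDS, parts):
        if value.isdigit() and not low <= int(value) <= high:
            raise ValueError(f"{name} value {value} is outside {low}-{high}")
        parsed[name] = value
    return parsed


# The example from the steps above: minute 0, hour 2, every day.
print(parse_cron("0 2 * * *"))
```

Reading the fields by name makes it easier to confirm that 0 2 * * * means minute 0 of hour 2 on every day, which matches the "daily at 2 AM" description.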

Supported file types

  • Google Workspace files: Docs, Sheets, Slides (converted to text).
  • PDF
  • CSV
  • Markdown
  • Most text-based file types.

For more about how content is indexed and retrieved from these files, see Embeddings explained.
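
Before selecting a folder, you may want to estimate which of its files fall into the supported types listed above. The sketch below classifies a file by its Google Drive MIME type or by its file extension. The Workspace MIME types are the ones Google Drive reports for native Docs, Sheets, and Slides. The extension table and the classify function are assumptions made for illustration and do not reflect the platform's actual detection logic.

```python
from pathlib import PurePosixPath
from typing import Optional

# MIME types Google Drive reports for native Workspace files.
WORKSPACE_MIME_TYPES = {
    "application/vnd.google-apps.document": "Google Docs",
    "application/vnd.google-apps.spreadsheet": "Google Sheets",
    "application/vnd.google-apps.presentation": "Google Slides",
}

# Assumed extensions for the other listed types; ".txt" stands in for
# "most text-based file types" and is only an illustration.
TEXT_EXTENSIONS = {
    ".pdf": "PDF",
    ".csv": "CSV",
    ".md": "Markdown",
    ".txt": "Plain text",
}


def classify(name: str, mime_type: str = "") -> Optional[str]:
    """Return a label for a likely supported file, or None if unsure."""
    if mime_type in WORKSPACE_MIME_TYPES:
        return WORKSPACE_MIME_TYPES[mime_type]
    return TEXT_EXTENSIONS.get(PurePosixPath(name).suffix.lower())


# Hypothetical folder listing as (file name, MIME type) pairs.
files = [
    ("Roadmap", "application/vnd.google-apps.document"),
    ("budget.csv", "text/csv"),
    ("logo.png", "image/png"),
]
for file_name, mime in files:
    print(file_name, "->", classify(file_name, mime))
```

A result of None only means this sketch does not recognize the file; it does not prove the platform will reject it.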

Managing and refreshing the data source

Once created, the Google Drive data source can be viewed and managed through the Data Sources interface.

Actions available:

  • Edit: Modify the data source configuration.
  • Refresh: Manually trigger a data ingestion job.
  • Delete: Remove the data source.

You can also review the data job history to monitor ingestion performance and troubleshoot issues.

Troubleshooting

Authentication failed

  • Confirm that you are signed in to the intended Google Account, then retry the OAuth flow.
  • Verify that your organization allows third-party app access.
  • Check browser settings and pop-up blockers if the OAuth flow does not complete.

Files or folders not visible

  • Verify that you have access to the target files or folders in Google Drive.
  • If using a Shared Drive, ensure that the selected files are visible and accessible to your account.

Example scenario

You want to index project planning documents stored in a shared Google Drive folder.

Example configuration:

  • Name: Project Plans & Presentations GDrive
  • OAuth flow: Completed successfully.
  • Selected folder: Active Projects Q2
  • Scheduled refresh: 0 2 * * * (daily at 2 AM)
  • Transform your data: Apply a Structure to summarize Google Docs content and extract key sections.
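
The example configuration can also be written down as a small record and checked against the key considerations before you fill in the form. This is a sketch only: DriveSourcePlan, its field names, the BROAD_FOLDERS list, and the Structure name are hypothetical and exist purely for planning on your side. They are not a Gen AI Builder API.

```python
from dataclasses import dataclass

# Folder names treated as "too broad" in this sketch (an assumption,
# not a rule enforced by the platform).
BROAD_FOLDERS = {"my drive", "shared drives"}


@dataclass
class DriveSourcePlan:
    """Hypothetical local record of the values you plan to enter."""

    name: str
    selected_folder: str
    refresh_cron: str = ""  # empty means no scheduled refresh
    structure: str = ""     # empty means no transformation

    def problems(self) -> list[str]:
        """Return readable issues; an empty list means none were found."""
        found = []
        if not self.name.strip():
            found.append("name is empty")
        if self.selected_folder.strip().lower() in BROAD_FOLDERS:
            found.append("selected folder is a drive root; pick a narrower folder")
        if self.refresh_cron and len(self.refresh_cron.split()) != 5:
            found.append("cron expression should have five fields")
        return found


plan = DriveSourcePlan(
    name="Project Plans & Presentations GDrive",
    selected_folder="Active Projects Q2",
    refresh_cron="0 2 * * *",
    structure="Summarize docs",  # hypothetical Structure name
)
print(plan.problems())  # prints []
```

Keeping the checks in one place makes it straightforward to add your own team's rules, such as a naming convention for data sources.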
