Known issues v1.9

This page lists the known issues and limitations in this Analytics Accelerator release, with workarounds where they're available. These issues are tracked and planned for resolution in a future release.

Maintenance

  • The function pgaa.launch_task() supports multiple maintenance operations for Delta Lake tables (such as z-order, vacuum, and purge), but for Iceberg tables it currently supports only compaction. For other Iceberg maintenance routines, use pgaa.spark_sql().
  • The function pgaa.execute_compaction() doesn't support the settings parameter. For more granular control over compaction behavior, use pgaa.launch_task().

Spark

  • Integration with Apache Spark is supported for:
    • Read-only queries on Parquet files in S3-compatible object storage or a shared POSIX filesystem.
    • Read-only queries for Iceberg tables in Iceberg REST catalogs.
  • GPU acceleration with Apache Spark is only supported for read-only queries on Parquet files in S3-compatible object storage or a shared POSIX filesystem.
  • The Iceberg Spark library version 1.9.2 has a known issue when Spark reads from Iceberg tables: under certain conditions, equality deletes can be skipped during concurrent executions. The current workaround is to disable the Spark application setting spark.sql.iceberg.executor-cache.enabled in your spark-defaults.conf file. Disabling this cache ensures data consistency by correctly processing all deletes, but it may reduce performance for high-volume read workloads.
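
As a minimal sketch of the workaround above, the fragment below shows the line you might add to spark-defaults.conf. It assumes the file's usual whitespace-separated key/value format and that setting the property to false is what disables the cache. If you don't manage the defaults file, passing the same property with --conf at submit time should have the same effect.

```properties
# spark-defaults.conf
# Workaround: disable the Iceberg executor cache so equality deletes aren't skipped
spark.sql.iceberg.executor-cache.enabled   false
```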

Object storage

  • Integration with Amazon S3 Tables is currently not supported for PGD-integrated features (tiered tables, replication, and offloading). For PGD-managed lifecycles, use either a storage location or an Iceberg REST-compliant catalog.

WarehousePG

  • CREATE TABLE AS SELECT (CTAS) to object storage isn't supported in WarehousePG installations.
  • PGAA on WarehousePG supports only the standard Postgres planner, not the Orca optimizer.
  • PGAA tables in WarehousePG are read-only and don't support write operations. The data must already exist in object storage.
  • Dynamic Iceberg catalog sync via pgaa.attach_catalog() isn't supported. Use pgaa.import_catalog() to manually import catalog metadata instead.
  • pgaa.launch_task() requires the pgcrypto extension. If you see the error "function gen_random_uuid() does not exist", run CREATE EXTENSION pgcrypto; on the coordinator before using maintenance functions.
  • Restoring a PGAA database with gprestore is only supported into a new, empty database. Restoring into the original database or an existing database with PGAA tables isn't supported.
  • The gprestore option --create-db isn't supported for PGAA tables. You must manually create the database and the PGAA and PGFS extensions before running gprestore. See Backing up and restoring for details.
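
The manual setup steps described in this list can be sketched as the following psql session. This is an illustration under stated assumptions, not a verified procedure: the database name restored_db is hypothetical, and the lowercase extension names pgaa and pgfs, along with the order in which they're created, are assumptions. See Backing up and restoring for the authoritative steps.

```sql
-- Before using maintenance functions such as pgaa.launch_task(),
-- run this on the coordinator:
CREATE EXTENSION pgcrypto;

-- Before running gprestore, create a new, empty target database.
-- "restored_db" is a hypothetical name.
CREATE DATABASE restored_db;

-- Connect to the new database (\c is a psql meta-command),
-- then create the extensions that gprestore won't create for you.
-- The extension names and their creation order are assumptions.
\c restored_db
CREATE EXTENSION pgfs;
CREATE EXTENSION pgaa;
```

After these steps, run gprestore against the new database without the --create-db option.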