AI Accelerator - Pipelines 4.0.0 release notes

Released: 5 May 2025

This release introduces a completely re-written processing back-end for the knowledge base pipeline. Retrievers are renamed to knowledge bases and some types and functions are also renamed for consistency. The preparer pipeline now supports OCR and can output its results to a volume. As always, bug fixes and other enhancements to existing functionality are also included.

Highlights

  • New processing back-end for the knowledge base pipeline supporting: background processing, batch processing, full iterative sync for tables and volumes.
  • Added OCR support to the preparer pipeline.
  • Added the ability to write preparer pipeline output to a volume.
  • Renaming of retrievers to knowledge bases.
  • Renaming of preparer functions for consistency.
  • Renaming of "processing mode" in pipelines config to "auto processing" for consistency.
  • Added arm64 support for the AI Accelerator on Debian 12 and RHEL 9.
  • Added support for writing preparer output to object storage volumes with configurable TLS and custom certificate validation.

Enhancements

DescriptionAddresses
Renamed retrievers to knowledge bases and renamed preparer functions.

The retriever pipeline has been renamed to the knowledge base pipeline. The functions aidb.create_retriever_for_table() and aidb.create_retriever_for_volume() have been renamed to aidb.create_knowledge_base_for_table() and aidb.create_knowledge_base_for_volume(), respectively. Also renamed is the aidb.delete_retriever function (to aidb.delete_knowledge_base()) and the aidb.RetrieverSourceDataFormat enum has been renamed to aidb.PipelineDataFormat. For consistency, the preparer pipeline functions have also been renamed. The aidb.create_preparer_for_table() function has been renamed to aidb.create_table_preparer() and aidb.create_preparer_for_volume() has been renamed to aidb.create_volume_preparer(). The aidb.retrievers view is now the aidb.knowledge_bases view. Aliases for the aidb.knowledge_bases and aidb.preparers views have also been added as aidb.kbs and aidb.preps respectively. The older functions and view names are still available but marked as deprecated. They will be removed in a future release.

Renamed "processing mode" in pipelines config to "auto processing" for consistency

The setting for "auto processing" in pipeline creation, config and overview functions has been re-named from processing_mode to auto_processing. We did this to have consistency with the name of the feature itself which is called auto-processing.

Added the ability to write preparer pipeline output to a volume.

Preparer pipelines now support AIDB volumes (e.g. an S3 bucket using PGFS) as output destination. Previously, only Postgres tables were supported. Volumes as input source were already supported together with tables, so now all 4 source/destination combinations are possible.

Added arm64 support for the AI Accelerator on Debian 12 and RHEL 9.

The AI Accelerator now supports arm64 architecture on Debian 12 and RHEL 9. This includes support for the arm64 architecture in the aidb packages.

The aidb.rerank_text() function now returns results as a rich table

The aidb.rerank_text() function used to rerank a list of texts based on a ranking query, now returns its results in a table, one row per text. The table includes columns for the raw logit_score, the text itself and the index from the input. This makes it much easier to use the result in larger SQL query e.g. by joining with other data or sorting the rows by logit_score.

Improved debugging capabilities for AI models accessed over HTTP APIs

AIDB now prints rich debugging output for any errors encountered when using external models via API. This includes NIM models and OpenAI API compatible models. The HTTP status code, error response content, and request content are logged to help users debug issues like incorrect endpoints, incorrect model configuration or incorrect credentials, among others.

Added TLS configuration support for models using HTTP APIs.

Models that connect to external HTTP endpoints can now be configured with TLS settings via the tls_config object. This supports disabling certificate verification using insecure_skip_verify for development, or specifying a custom certificate authority using the ca_path field. These settings allow secure integration with services using self-signed or private CA certificates.

Added PerformOcr operation to Preparer.

Preparer pipelines can now utilize a PerformOcr operation that leverages the PaddleOCR model with the NVIDIA NIM Image OCR API.


Could this page be better? Report a problem or suggest an addition!