GPU Recommendations for Default NIM Models v1.3.2
Overview
Within Hybrid Manager, two components are the primary consumers of AI models:
- PG.AI Knowledge Base (AIDB Postgres extension) for creating and maintaining AI Knowledge Bases.
- PG.AI GenAI Builder (containerized Griptape) for building agentic AI assistants.
Default NIM Models
| Model type | NIM model | NVIDIA NIM documented resource requirements |
|---|---|---|
| Text completion | llama-3.3-70b-instruct | 4 × L40S |
| Text embeddings | arctic-embed-l | 1 × L40S |
| Image embeddings | nvclip | 1 × L40S |
| OCR | paddleocr | 1 × L40S |
| Text reranking | llama-3.2-nv-rerankqa-1b-v2 | 1 × L40S |
Minimum GPU Requirement
The default models above require 4 + 1 + 1 + 1 + 1 = 8 GPUs in total, so running all of them concurrently requires a minimum of 8 × L40S GPUs.
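The total can be checked by summing the per-model GPU counts from the table. The following is a minimal sketch, not part of Hybrid Manager. The dictionary name `GPUS_PER_MODEL` and the helper `total_gpus` are hypothetical names chosen for illustration. The GPU counts are copied from the table above.

```python
# GPU counts per default NIM model, copied from the "Default NIM Models" table.
# The names below are illustrative and not part of any EDB or NVIDIA API.
GPUS_PER_MODEL = {
    "llama-3.3-70b-instruct": 4,       # text completion
    "arctic-embed-l": 1,               # text embeddings
    "nvclip": 1,                       # image embeddings
    "paddleocr": 1,                    # OCR
    "llama-3.2-nv-rerankqa-1b-v2": 1,  # text reranking
}


def total_gpus(requirements: dict[str, int]) -> int:
    """Return the GPU count needed to run every listed model at the same time."""
    return sum(requirements.values())


if __name__ == "__main__":
    print(f"Minimum GPUs for all default models: {total_gpus(GPUS_PER_MODEL)}")
```

If only a subset of the default models is deployed, removing the unused entries from the dictionary gives the reduced total for that subset.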
Cloud Mappings
- AWS EKS: recommend a node group with 2 × `g6e.12xlarge` nodes.
- GCP GKE: recommend a node pool with 2 × `a2-highgpu-4g` nodes.
Note: GCP does not offer L40S GPUs. The recommended A2 nodes with A100 GPUs are supported and documented for the NIM models listed above.
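The node counts in the cloud mappings follow from dividing the required GPU count by the GPUs available per node and rounding up. The sketch below illustrates that arithmetic. It assumes 4 GPUs per node for both `g6e.12xlarge` and `a2-highgpu-4g`, which is taken from the cloud providers' published instance specifications rather than from this page, and it assumes one A100 stands in for one L40S. The names `GPUS_PER_NODE` and `nodes_needed` are hypothetical.

```python
import math

# Assumed GPUs per node, based on the cloud providers' published instance
# specifications (not stated on this page). Verify before relying on them.
GPUS_PER_NODE = {
    "g6e.12xlarge": 4,   # AWS, L40S GPUs
    "a2-highgpu-4g": 4,  # GCP, A100 GPUs
}


def nodes_needed(required_gpus: int, gpus_per_node: int) -> int:
    """Return the smallest node count that provides at least required_gpus."""
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    return math.ceil(required_gpus / gpus_per_node)


if __name__ == "__main__":
    required = 8  # minimum GPU requirement for the default models
    for node_type, per_node in GPUS_PER_NODE.items():
        print(f"{node_type}: {nodes_needed(required, per_node)} node(s)")
```

This only counts GPUs. It does not account for how individual models are scheduled onto nodes, for example that the text completion model needs its 4 GPUs on a single node.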