# GPU Recommendations for Default NIM Models (Innovation Release)
## Overview
Within Hybrid Manager, there are two primary consumers of AI models:
- PG.AI Knowledge Base (AIDB Postgres extension) for creating and maintaining AI Knowledge Bases.
- PG.AI GenAI Builder (containerized Griptape) for building agentic AI assistants.
## Default NIM Models
| Model type | NIM model | NVIDIA NIM documented resource requirements |
|---|---|---|
| Text completion | llama-3.3-70b-instruct | 4 × L40S |
| Text embeddings | arctic-embed-l | 1 × L40S |
| Image embeddings | nvclip | 1 × L40S |
| OCR | paddleocr | 1 × L40S |
| Text reranking | llama-3.2-nv-rerankqa-1b-v2 | 1 × L40S |
## Minimum GPU Requirement
Based on the default models above (4 + 1 + 1 + 1 + 1 GPUs), the minimum capacity needed to run them all concurrently is 8 × L40S GPUs.
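To confirm that the cluster actually exposes this capacity, you can query the allocatable `nvidia.com/gpu` resource on each node. This is a minimal check, and it assumes the NVIDIA device plugin (or GPU Operator) is already installed so that GPUs are advertised to Kubernetes; node names and counts will differ in your environment.

```shell
# List each node with its allocatable GPU count as advertised by the
# NVIDIA device plugin. The total across GPU nodes should be at least 8
# to run the default NIM models concurrently.
kubectl get nodes \
  -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```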
## Cloud Mappings
- AWS EKS: a node group with 2 × `g6e.12xlarge` nodes (4 × L40S GPUs each) is recommended.
- GCP GKE: a node pool with 2 × `a2-highgpu-4g` nodes (4 × A100 GPUs each) is recommended.

Example provisioning commands for both providers are sketched after the note below.
Note: GCP does not offer L40S GPUs. The recommended A2 nodes with A100 GPUs are supported and documented for the NIM models listed above.
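The commands below illustrate one way to provision the recommended capacity. They are sketches rather than required steps: the cluster name, node group/pool name, and zone (`my-cluster`, `nim-gpu`, `us-central1-a`) are placeholders, and you may prefer Terraform or your organization's standard tooling instead.

```shell
# AWS EKS: add a managed node group of 2 × g6e.12xlarge (4 × L40S GPUs each).
eksctl create nodegroup \
  --cluster my-cluster \
  --name nim-gpu \
  --node-type g6e.12xlarge \
  --nodes 2

# GCP GKE: add a node pool of 2 × a2-highgpu-4g (4 × A100 GPUs each;
# A2 machine types come with the GPUs attached, so no accelerator flag is needed).
gcloud container node-pools create nim-gpu \
  --cluster my-cluster \
  --zone us-central1-a \
  --machine-type a2-highgpu-4g \
  --num-nodes 2
```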