AI Factory generic concepts

AI Factory builds on modern AI concepts, architectures, and technologies to enable advanced data-driven and intelligent applications.

This page explains key industry concepts that form the foundation of AI Factory’s design and capabilities:

  • AI and ML concepts
  • Vector and semantic search
  • Retrieval-augmented generation (RAG)
  • AI for databases
  • AI for infrastructure
  • Hybrid and Sovereign AI architectures

To see how these concepts are implemented in AI Factory and Hybrid Manager, start with the AI Factory Concepts and AI Factory 101 Path pages.


Before you start

You’ll get the most out of this section if you have:

  • A basic understanding of AI / ML workflows
  • Familiarity with vector databases and semantic search
  • Awareness of AI pipelines and LLM architectures
  • Helpful, but not required: experience with the AI Factory Concepts and AI Factory 101 Path pages

Core AI concepts

Machine learning (ML)

ML enables systems to learn from data and improve performance on specific tasks without explicit programming.

Common learning types:

  • Supervised learning — Predict outcomes from labeled data (ex: sales forecasting; see the sketch below)
  • Unsupervised learning — Find patterns in unlabeled data (ex: customer segmentation)
  • Reinforcement learning — Learn through interaction and reward feedback (ex: game-playing agents)

Infrastructure needs:

  • Scalable storage for training data
  • Compute for model training (CPUs, GPUs, TPUs)
  • Optimized serving infrastructure for inference
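As a concrete illustration of supervised learning, here is a minimal sketch that fits a regression model on labeled historical data and then forecasts a future value. It uses scikit-learn, and the data is illustrative only:

```python
# A minimal supervised-learning sketch: fit a regression model on
# labeled historical data, then predict a future value.
# Assumes scikit-learn is installed; the data is illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Labeled training data: (month index) -> (sales)
X_train = np.array([[1], [2], [3], [4], [5]])
y_train = np.array([100.0, 120.0, 138.0, 161.0, 179.0])

model = LinearRegression()
model.fit(X_train, y_train)

# Inference: forecast sales for month 6
print(model.predict(np.array([[6]])))
```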

Deep learning (DL) & neural networks

DL uses deep neural networks with multiple layers to extract and learn complex patterns.

Strengths:

  • Handles unstructured data (images, text, audio, video)
  • Powers advanced applications such as image recognition, speech recognition, text generation

Infrastructure needs:

  • High-performance compute (GPUs or TPUs)
  • Optimized data pipelines and model serving infrastructure
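For a sense of what "multiple layers" means in practice, here is a minimal sketch of a small feed-forward network in PyTorch. Real deep-learning workloads use far larger models and accelerated hardware:

```python
# A minimal deep-learning sketch: a small feed-forward network in
# PyTorch. Stacked layers with non-linear activations are what let
# deep networks learn complex patterns.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

x = torch.randn(8, 4)  # a batch of 8 examples, 4 features each
y = model(x)           # forward pass: one output per example
print(y.shape)         # torch.Size([8, 1])
```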

Natural language processing (NLP)

NLP applies AI to understand and generate human language.

Common use cases:

  • Text classification
  • Sentiment analysis
  • Machine translation
  • Conversational AI
  • Intelligent document processing
  • Content summarization

Infrastructure needs:

  • Large model architectures (Transformers)
  • Low-latency serving infrastructure for real-time interaction
  • Data pipelines to feed and fine-tune models
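As a quick illustration, the sketch below runs sentiment analysis with a pretrained Transformer model. It assumes the Hugging Face transformers library is installed; the first call downloads a default pretrained model:

```python
# A minimal NLP sketch: sentiment analysis with a pretrained
# Transformer model via the Hugging Face transformers library.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("The new release fixed every issue we reported.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```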

Large language models (LLMs)

LLMs are advanced NLP models based on deep learning architectures such as Transformers.

Key characteristics:

  • Trained on massive text corpora
  • Generate coherent, context-aware text
  • Support chatbots, content generation, code assistants, advanced Q&A

LLMs are the foundation for modern Gen AI applications.

Infrastructure needs:

  • Enormous compute and storage for training
  • GPU-optimized serving for real-time inference
  • Scalable infrastructure for multi-user workloads
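From an application's point of view, real-time LLM inference is typically an HTTP call to a serving endpoint. The sketch below assumes a hypothetical OpenAI-compatible endpoint URL and model name; adapt both to whatever your serving layer exposes:

```python
# A minimal sketch of calling a served LLM over HTTP. The endpoint
# URL, model name, and payload shape are illustrative assumptions
# based on the common OpenAI-compatible API convention.
import requests

resp = requests.post(
    "http://models.example.internal/v1/chat/completions",  # hypothetical URL
    json={
        "model": "my-llm",  # hypothetical deployed model name
        "messages": [
            {"role": "user", "content": "Summarize pgvector in one sentence."}
        ],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```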

Embeddings and vector representations

Embeddings are dense vector representations of data, capturing semantic meaning in numerical form.

Applications:

  • Semantic search
  • Personalization and recommendations
  • Anomaly detection
  • RAG (retrieval-augmented generation)

How embeddings are used:

  • Convert text, images, audio, video, or structured data into vectors
  • Store in vector databases or Postgres with pgvector
  • Perform similarity search using approximate nearest neighbor (ANN) algorithms
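A minimal sketch of this flow, assuming the sentence-transformers library and one of its public models: encode text into vectors, then compare them by cosine similarity:

```python
# A minimal embeddings sketch: encode text into dense vectors and
# compare them by cosine similarity.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode([
    "How do I reset my password?",
    "Steps to recover account access",
    "Quarterly revenue grew 12%",
])

# Semantically similar sentences score higher than unrelated ones
print(cos_sim(vectors[0], vectors[1]))  # high similarity
print(cos_sim(vectors[0], vectors[2]))  # low similarity
```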

Vector databases

Vector databases are optimized to store and search embeddings.

Key features:

  • Fast vector similarity search
  • High-dimensional vector storage
  • Scale to billions of vectors
  • Support for hybrid search (vector + keyword / filter); see the sketch at the end of this section

Common use cases:

  • Semantic search engines
  • Personalized recommendations
  • Knowledge assistants with grounding
  • Fraud detection and anomaly detection
  • Enhanced Gen AI pipelines (RAG)
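A minimal hybrid-search sketch against Postgres with pgvector, combining vector similarity with a keyword/metadata filter in a single query. Table and column names are illustrative assumptions:

```python
# A minimal hybrid-search sketch: vector similarity plus a metadata
# filter in one SQL query against Postgres with pgvector.
# Assumes psycopg 3 and an existing `documents` table (illustrative).
import psycopg

with psycopg.connect("dbname=appdb") as conn:
    rows = conn.execute(
        """
        SELECT id, title
        FROM documents
        WHERE category = %s               -- keyword/metadata filter
        ORDER BY embedding <-> %s::vector -- vector similarity (L2 distance)
        LIMIT 5
        """,
        ("support", "[0.1, 0.2, 0.3]"),
    ).fetchall()
    print(rows)
```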

Retrieval-augmented generation (RAG)

RAG combines vector search with LLMs to generate grounded responses.

How it works:

  1. User query → converted to embedding
  2. Vector search retrieves relevant documents / content
  3. Retrieved context is injected into LLM prompt
  4. LLM generates a response using both its model knowledge and retrieved context
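A minimal sketch of this four-step flow, assuming a pgvector-backed documents table, the sentence-transformers library, and a placeholder call_llm() client standing in for whatever model-serving layer you use:

```python
# A minimal RAG sketch: embed the query, retrieve similar documents
# from Postgres/pgvector, and inject them into the LLM prompt.
import psycopg
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def call_llm(prompt: str) -> str:
    # Placeholder: swap in your model-serving client here
    raise NotImplementedError

def answer(question: str) -> str:
    # 1. Convert the user query to an embedding
    qvec = embedder.encode(question).tolist()

    # 2. Vector search retrieves the most relevant documents
    with psycopg.connect("dbname=appdb") as conn:
        docs = conn.execute(
            "SELECT body FROM documents "
            "ORDER BY embedding <-> %s::vector LIMIT 3",
            (str(qvec),),
        ).fetchall()

    # 3. Inject the retrieved context into the LLM prompt
    context = "\n".join(body for (body,) in docs)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"

    # 4. The LLM generates a grounded response
    return call_llm(prompt)
```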

Benefits:

  • More accurate and grounded AI responses
  • Real-time, dynamic knowledge integration
  • Support for domain-specific and compliance-aware AI applications

RAG is a core architecture pattern supported by AI Factory.


AI for databases

Intelligent database management

Using AI and ML to optimize database operations:

  • Automated performance tuning
  • Intelligent query optimization
  • Proactive resource management
  • Predictive maintenance
  • Self-healing capabilities

Goal: autonomous, highly optimized database infrastructure.


In-database machine learning

Perform ML model training and inference inside the database, reducing data movement.

Benefits:

  • Faster insights and response times
  • Simplified architecture
  • Lower latency for in-database applications
  • Real-time prediction capabilities
  • Streamlined MLOps workflows

Example: invoking ML models from SQL in Postgres.
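A minimal sketch of what that looks like from a client, assuming a hypothetical predict_churn() SQL function has already been created in the database (for example, by an in-database ML extension). Table and column names are illustrative too:

```python
# A minimal in-database ML sketch: scoring happens inside Postgres,
# so no training data or features leave the database.
# predict_churn() is a hypothetical SQL function; adapt to your setup.
import psycopg

with psycopg.connect("dbname=appdb") as conn:
    rows = conn.execute(
        """
        SELECT customer_id,
               predict_churn(tenure_months, monthly_spend) AS churn_risk
        FROM customers
        ORDER BY churn_risk DESC
        LIMIT 10
        """
    ).fetchall()
    print(rows)
```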


Vector databases in Postgres

AI Factory leverages the pgvector extension in Postgres:

  • Native vector type and operations
  • Integration with AI pipelines and embedding models
  • Full integration with Hybrid Manager and Model Serving
  • Semantic search and RAG pipelines on your Postgres data

This brings AI capabilities closer to your core data.
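A minimal end-to-end sketch, assuming the pgvector extension is available and using illustrative table names and dimensions: enable the extension, store embeddings, build an ANN index, and run a nearest-neighbor query:

```python
# A minimal pgvector sketch: native vector storage and similarity
# search in Postgres. Assumes psycopg 3 and pgvector installed;
# names and dimensions are illustrative.
import psycopg

# The connection context manager commits on clean exit
with psycopg.connect("dbname=appdb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id bigserial PRIMARY KEY, embedding vector(3))"
    )
    conn.execute(
        "INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[2,3,4]')"
    )

    # ANN index for fast approximate search (pgvector 0.5+)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS items_embedding_idx "
        "ON items USING hnsw (embedding vector_l2_ops)"
    )

    # Nearest neighbors to a query vector, by L2 distance
    rows = conn.execute(
        "SELECT id FROM items ORDER BY embedding <-> '[1,2,2]' LIMIT 2"
    ).fetchall()
    print(rows)
```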


AI for infrastructure

AI-accelerated hardware (GPUs, TPUs)

AI workloads depend heavily on specialized hardware:

  • GPUs — General-purpose acceleration for training and inference
  • TPUs — ASICs optimized for deep learning
  • Other accelerators — FPGAs, inference chips

Benefits:

  • Orders-of-magnitude speedup for model training
  • High throughput for production inference
  • Efficient resource utilization in cloud-native AI clusters
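In practice, frameworks let applications target an accelerator when one is present and fall back to CPU otherwise. A minimal PyTorch sketch:

```python
# A minimal accelerator-selection sketch in PyTorch: use a GPU when
# available, otherwise fall back to CPU.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
y = x @ x  # this matrix multiply runs on the selected device
print(device, y.shape)
```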

AI Factory integrates optimized GPU serving via KServe.


Leveraging cloud compute for AI workloads

Modern AI workloads rely on cloud compute to scale:

  • GPU / TPU instances on demand
  • Managed Kubernetes services for training and serving (ex: Hybrid Manager)
  • Optimized AI stacks (CUDA, PyTorch, TensorFlow, JAX)
  • Distributed model training with advanced networking
  • Hybrid and multi-cloud architectures

Goals:

  • Maximize flexibility and performance
  • Minimize cost
  • Enable portability across cloud providers and environments

AI Factory supports hybrid cloud and Sovereign AI patterns natively.


Hybrid and Sovereign AI

Modern enterprises increasingly require Sovereign AI:

  • Data remains in your infrastructure and control
  • AI pipelines leverage your trusted knowledge and content
  • Model serving runs in your infrastructure (ex: Hybrid Manager + KServe)
  • Governance, observability, and compliance are built in

AI Factory enables:

  • AI in Postgres
  • Pipelines and Knowledge Bases with your data
  • Model Serving in your Kubernetes cluster
  • Full visibility and control via Hybrid Manager dashboards

Sovereign AI is a first-class design goal for EDB AI Factory.



Next steps

To see how these concepts power EDB AI Factory, continue with the AI Factory Concepts and Hybrid Manager documentation.


