AI Factory generic concepts

AI Factory builds on modern AI concepts, architectures, and technologies to enable advanced data-driven and intelligent applications.

This page explains key industry concepts that form the foundation of AI Factory’s design and capabilities:

  • AI and ML concepts
  • Vector and semantic search
  • Retrieval-augmented generation (RAG)
  • AI for databases
  • AI for infrastructure
  • Hybrid and Sovereign AI architectures

To see how these concepts are implemented in AI Factory and Hybrid Manager, start with the AI Factory Concepts and AI Factory 101 Path pages.


Before you start

You’ll get the most out of this section if you have:

  • A basic understanding of AI / ML workflows
  • Familiarity with vector databases and semantic search
  • Awareness of AI pipelines and LLM architectures
  • Helpful, but not required: experience with the AI Factory Concepts and AI Factory 101 Path pages

Core AI concepts

Machine learning (ML)

ML enables systems to learn from data and improve performance on specific tasks without explicit programming.

Common learning types:

  • Supervised learning — Predict outcomes from labeled data (ex: sales forecasting; see the sketch below)
  • Unsupervised learning — Find patterns in unlabeled data (ex: customer segmentation)
  • Reinforcement learning — Learn through interaction and reward feedback (ex: game-playing agents)

Infrastructure needs:

  • Scalable storage for training data
  • Compute for model training (CPUs, GPUs, TPUs)
  • Optimized serving infrastructure for inference
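As a concrete illustration of supervised learning, here is a minimal sketch that fits a regression model on labeled historical data and then forecasts a future value. It uses scikit-learn, and the data is illustrative only:

```python
# A minimal supervised-learning sketch: fit a regression model on
# labeled historical data, then predict a future value.
# Assumes scikit-learn is installed; the data is illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Labeled training data: (month index) -> (sales)
X_train = np.array([[1], [2], [3], [4], [5]])
y_train = np.array([100.0, 120.0, 138.0, 161.0, 179.0])

model = LinearRegression()
model.fit(X_train, y_train)

# Inference: forecast sales for month 6
print(model.predict(np.array([[6]])))
```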

Deep learning (DL) & neural networks

DL uses deep neural networks with multiple layers to extract and learn complex patterns.

Strengths:

  • Handles unstructured data (images, text, audio, video)
  • Powers advanced applications such as image recognition, speech recognition, text generation

Infrastructure needs:

  • High-performance compute (GPUs or TPUs)
  • Optimized data pipelines and model serving infrastructure
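For a sense of what "multiple layers" means in practice, here is a minimal sketch of a small feed-forward network in PyTorch. Real deep-learning workloads use far larger models and accelerated hardware:

```python
# A minimal deep-learning sketch: a small feed-forward network in
# PyTorch. Stacked layers with non-linear activations are what let
# deep networks learn complex patterns.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

x = torch.randn(8, 4)  # a batch of 8 examples, 4 features each
y = model(x)           # forward pass: one output per example
print(y.shape)         # torch.Size([8, 1])
```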

Natural language processing (NLP)

NLP applies AI to understand and generate human language.

Common use cases:

  • Text classification
  • Sentiment analysis
  • Machine translation
  • Conversational AI
  • Intelligent document processing
  • Content summarization

Infrastructure needs:

  • Large model architectures (Transformers)
  • Low-latency serving infrastructure for real-time interaction
  • Data pipelines to feed and fine-tune models
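As a quick illustration, the sketch below runs sentiment analysis with a pretrained Transformer model. It assumes the Hugging Face transformers library is installed; the first call downloads a default pretrained model:

```python
# A minimal NLP sketch: sentiment analysis with a pretrained
# Transformer model via the Hugging Face transformers library.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("The new release fixed every issue we reported.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```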

Large language models (LLMs)

LLMs are advanced NLP models based on deep learning architectures such as Transformers.

Key characteristics:

  • Trained on massive text corpora
  • Generate coherent, context-aware text
  • Support chatbots, content generation, code assistants, advanced Q&A

LLMs are the foundation for modern Gen AI applications.

Infrastructure needs:

  • Enormous compute and storage for training
  • GPU-optimized serving for real-time inference
  • Scalable infrastructure for multi-user workloads
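From an application's point of view, real-time LLM inference is typically an HTTP call to a serving endpoint. The sketch below assumes a hypothetical OpenAI-compatible endpoint URL and model name; adapt both to whatever your serving layer exposes:

```python
# A minimal sketch of calling a served LLM over HTTP. The endpoint
# URL, model name, and payload shape are illustrative assumptions
# based on the common OpenAI-compatible API convention.
import requests

resp = requests.post(
    "http://models.example.internal/v1/chat/completions",  # hypothetical URL
    json={
        "model": "my-llm",  # hypothetical deployed model name
        "messages": [
            {"role": "user", "content": "Summarize pgvector in one sentence."}
        ],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```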

Embeddings and vector representations

Embeddings are dense vector representations of data, capturing semantic meaning in numerical form.

Applications:

  • Semantic search
  • Personalization and recommendations
  • Anomaly detection
  • RAG (retrieval-augmented generation)

How embeddings are used:

  • Convert text, images, audio, video, or structured data into vectors
  • Store in vector databases or Postgres with pgvector
  • Perform similarity search using approximate nearest neighbor (ANN) algorithms
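A minimal sketch of this flow, assuming the sentence-transformers library and one of its public models: encode text into vectors, then compare them by cosine similarity:

```python
# A minimal embeddings sketch: encode text into dense vectors and
# compare them by cosine similarity.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode([
    "How do I reset my password?",
    "Steps to recover account access",
    "Quarterly revenue grew 12%",
])

# Semantically similar sentences score higher than unrelated ones
print(cos_sim(vectors[0], vectors[1]))  # high similarity
print(cos_sim(vectors[0], vectors[2]))  # low similarity
```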

Vector databases

Vector databases are optimized to store and search embeddings.

Key features:

  • Fast vector similarity search
  • High-dimensional vector storage
  • Scale to billions of vectors
  • Support for hybrid search (vector + keyword / filter); see the sketch at the end of this section

Common use cases:

  • Semantic search engines
  • Personalized recommendations
  • Knowledge assistants with grounding
  • Fraud detection and anomaly detection
  • Enhanced Gen AI pipelines (RAG)
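A minimal hybrid-search sketch against Postgres with pgvector, combining vector similarity with a keyword/metadata filter in a single query. Table and column names are illustrative assumptions:

```python
# A minimal hybrid-search sketch: vector similarity plus a metadata
# filter in one SQL query against Postgres with pgvector.
# Assumes psycopg 3 and an existing `documents` table (illustrative).
import psycopg

with psycopg.connect("dbname=appdb") as conn:
    rows = conn.execute(
        """
        SELECT id, title
        FROM documents
        WHERE category = %s               -- keyword/metadata filter
        ORDER BY embedding <-> %s::vector -- vector similarity (L2 distance)
        LIMIT 5
        """,
        ("support", "[0.1, 0.2, 0.3]"),
    ).fetchall()
    print(rows)
```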

Retrieval-augmented generation (RAG)

RAG combines vector search with LLMs to generate grounded responses.

How it works:

  1. User query → converted to embedding
  2. Vector search retrieves relevant documents / content
  3. Retrieved context is injected into LLM prompt
  4. LLM generates a response using both its model knowledge and retrieved context
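A minimal sketch of this four-step flow, assuming a pgvector-backed documents table, the sentence-transformers library, and a placeholder call_llm() client standing in for whatever model-serving layer you use:

```python
# A minimal RAG sketch: embed the query, retrieve similar documents
# from Postgres/pgvector, and inject them into the LLM prompt.
import psycopg
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def call_llm(prompt: str) -> str:
    # Placeholder: swap in your model-serving client here
    raise NotImplementedError

def answer(question: str) -> str:
    # 1. Convert the user query to an embedding
    qvec = embedder.encode(question).tolist()

    # 2. Vector search retrieves the most relevant documents
    with psycopg.connect("dbname=appdb") as conn:
        docs = conn.execute(
            "SELECT body FROM documents "
            "ORDER BY embedding <-> %s::vector LIMIT 3",
            (str(qvec),),
        ).fetchall()

    # 3. Inject the retrieved context into the LLM prompt
    context = "\n".join(body for (body,) in docs)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"

    # 4. The LLM generates a grounded response
    return call_llm(prompt)
```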

Benefits:

  • More accurate and grounded AI responses
  • Real-time, dynamic knowledge integration
  • Support for domain-specific and compliance-aware AI applications

RAG is a core architecture pattern supported by AI Factory.


AI for databases

Intelligent database management

Using AI and ML to optimize database operations:

  • Automated performance tuning
  • Intelligent query optimization
  • Proactive resource management
  • Predictive maintenance
  • Self-healing capabilities

Goal: autonomous, highly optimized database infrastructure.


In-database machine learning

Perform ML model training and inference inside the database, reducing data movement.

Benefits:

  • Faster insights and response times
  • Simplified architecture
  • Lower latency for in-database applications
  • Real-time prediction capabilities
  • Streamlined MLOps workflows

Example: invoking ML models from SQL in Postgres.
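A minimal sketch of what that looks like from a client, assuming a hypothetical predict_churn() SQL function has already been created in the database (for example, by an in-database ML extension). Table and column names are illustrative too:

```python
# A minimal in-database ML sketch: scoring happens inside Postgres,
# so no training data or features leave the database.
# predict_churn() is a hypothetical SQL function; adapt to your setup.
import psycopg

with psycopg.connect("dbname=appdb") as conn:
    rows = conn.execute(
        """
        SELECT customer_id,
               predict_churn(tenure_months, monthly_spend) AS churn_risk
        FROM customers
        ORDER BY churn_risk DESC
        LIMIT 10
        """
    ).fetchall()
    print(rows)
```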


Vector databases in Postgres

AI Factory leverages the pgvector extension in Postgres:

  • Native vector type and operations
  • Integration with AI pipelines and embedding models
  • Full integration with Hybrid Manager and Model Serving
  • Semantic search and RAG pipelines on your Postgres data

This brings AI capabilities closer to your core data.
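A minimal end-to-end sketch, assuming the pgvector extension is available and using illustrative table names and dimensions: enable the extension, store embeddings, build an ANN index, and run a nearest-neighbor query:

```python
# A minimal pgvector sketch: native vector storage and similarity
# search in Postgres. Assumes psycopg 3 and pgvector installed;
# names and dimensions are illustrative.
import psycopg

# The connection context manager commits on clean exit
with psycopg.connect("dbname=appdb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id bigserial PRIMARY KEY, embedding vector(3))"
    )
    conn.execute(
        "INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[2,3,4]')"
    )

    # ANN index for fast approximate search (pgvector 0.5+)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS items_embedding_idx "
        "ON items USING hnsw (embedding vector_l2_ops)"
    )

    # Nearest neighbors to a query vector, by L2 distance
    rows = conn.execute(
        "SELECT id FROM items ORDER BY embedding <-> '[1,2,2]' LIMIT 2"
    ).fetchall()
    print(rows)
```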


AI for infrastructure

AI-accelerated hardware (GPUs, TPUs)

AI workloads depend heavily on specialized hardware:

  • GPUs — General-purpose acceleration for training and inference
  • TPUs — ASICs optimized for deep learning
  • Other accelerators — FPGAs, inference chips

Benefits:

  • Orders-of-magnitude speedup for model training
  • High throughput for production inference
  • Efficient resource utilization in cloud-native AI clusters
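In practice, frameworks let applications target an accelerator when one is present and fall back to CPU otherwise. A minimal PyTorch sketch:

```python
# A minimal accelerator-selection sketch in PyTorch: use a GPU when
# available, otherwise fall back to CPU.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
y = x @ x  # this matrix multiply runs on the selected device
print(device, y.shape)
```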

AI Factory integrates optimized GPU serving via KServe.


Leveraging cloud compute for AI workloads

Modern AI workloads rely on cloud compute to scale:

  • GPU / TPU instances on demand
  • Managed Kubernetes services for training and serving (ex: Hybrid Manager)
  • Optimized AI stacks (CUDA, PyTorch, TensorFlow, JAX)
  • Distributed model training with advanced networking
  • Hybrid and multi-cloud architectures

Goals:

  • Maximize flexibility and performance
  • Minimize cost
  • Enable portability across cloud providers and environments

AI Factory supports hybrid cloud and Sovereign AI patterns natively.


Hybrid and Sovereign AI

Modern enterprises increasingly require Sovereign AI:

  • Data remains in your infrastructure and control
  • AI pipelines leverage your trusted knowledge and content
  • Model serving runs in your infrastructure (ex: Hybrid Manager + KServe)
  • Governance, observability, and compliance are built in

AI Factory enables:

  • AI in Postgres
  • Pipelines and Knowledge Bases with your data
  • Model Serving in your Kubernetes cluster
  • Full visibility and control via Hybrid Manager dashboards

Sovereign AI is a first-class design goal for EDB AI Factory.



Next steps

To see how these concepts power EDB AI Factory, continue with the AI Factory Concepts and Hybrid Manager documentation.


