AI Factory generic concepts
AI Factory builds on modern AI concepts, architectures, and technologies to enable advanced data-driven and intelligent applications.
This page explains key industry concepts that form the foundation of AI Factory’s design and capabilities:
- AI and ML concepts
- Vector and semantic search
- Retrieval-augmented generation (RAG)
- AI for databases
- AI for infrastructure
- Hybrid and Sovereign AI architectures
To see how these concepts are implemented in AI Factory and Hybrid Manager, start with AI Factory Concepts.
Before you start
You’ll get the most out of this section if you have:
- A basic understanding of AI / ML workflows
- Familiarity with vector databases and semantic search
- Awareness of AI pipelines and LLM architectures
- Helpful: familiarity with AI Factory Concepts and the AI Factory 101 Path
Core AI concepts
Machine learning (ML)
ML enables systems to learn from data and improve performance on specific tasks without explicit programming.
Common learning types:
- Supervised learning — Learn from labeled data to predict outcomes (ex: sales forecasting)
- Unsupervised learning — Find patterns in unlabeled data (ex: customer segmentation)
- Reinforcement learning — Learn through interaction and reward feedback (ex: game-playing agents)
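For example, supervised learning fits a model to labeled examples and then predicts on unseen inputs. A minimal sketch, assuming scikit-learn is installed and using made-up sales figures:

```python
# Minimal supervised-learning sketch (assumes scikit-learn is installed):
# fit a regression model on labeled historical data, then predict new values.
from sklearn.linear_model import LinearRegression

# Hypothetical labeled data: monthly ad spend (feature) -> sales (label)
X = [[1000], [2000], [3000], [4000]]
y = [15000, 24000, 36000, 45000]

model = LinearRegression()
model.fit(X, y)                 # "learn from data" without explicit rules
print(model.predict([[2500]]))  # predict sales for an unseen spend level
```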
Infrastructure needs:
- Scalable storage for training data
- Compute for model training (CPUs, GPUs, TPUs)
- Optimized serving infrastructure for inference
Deep learning (DL) & neural networks
DL uses neural networks with many stacked layers to extract and learn complex patterns.
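As an illustrative sketch (assuming PyTorch is installed), a deep network is a stack of layers with nonlinear activations between them; the depth is what lets it learn hierarchical features:

```python
# Tiny deep-network sketch: stacked layers with nonlinear activations
# let the model learn increasingly abstract patterns.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128),   # input layer (e.g. a flattened 28x28 image)
    nn.ReLU(),
    nn.Linear(128, 64),    # hidden layer extracts higher-level features
    nn.ReLU(),
    nn.Linear(64, 10),     # output layer: 10 class scores
)

logits = model(torch.randn(1, 784))  # forward pass on a random "image"
print(logits.shape)                  # torch.Size([1, 10])
```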
Strengths:
- Handles unstructured data (images, text, audio, video)
- Powers advanced applications such as image recognition, speech recognition, text generation
Infrastructure needs:
- High-performance compute (GPUs or TPUs)
- Optimized data pipelines and model serving infrastructure
Natural language processing (NLP)
NLP applies AI to understand and generate human language.
Common use cases:
- Text classification
- Sentiment analysis
- Machine translation
- Conversational AI
- Intelligent document processing
- Content summarization
Infrastructure needs:
- Large model architectures (Transformers)
- Low-latency serving infrastructure for real-time interaction
- Data pipelines to feed and fine-tune models
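As a quick illustration of Transformer-based NLP in practice, here is a hedged sketch using the Hugging Face transformers library (assumed installed; the default model is downloaded on first use):

```python
# Sentiment analysis with a pretrained Transformer (assumes the Hugging
# Face `transformers` library is installed).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("The upgrade went smoothly and performance improved."))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```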
Large language models (LLMs)
LLMs are advanced NLP models based on deep learning architectures such as Transformers.
Key characteristics:
- Trained on massive text corpora
- Generate coherent, context-aware text
- Support chatbots, content generation, code assistants, advanced Q&A
LLMs are the foundation for modern Gen AI applications.
Infrastructure needs:
- Enormous compute and storage for training
- GPU-optimized serving for real-time inference
- Scalable infrastructure for multi-user workloads
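For a feel of LLM inference, here is a minimal local text-generation sketch (assumes the Hugging Face transformers library is installed; the gpt2 model name is illustrative, not an AI Factory default):

```python
# Local LLM text generation sketch; model weights download on first run.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Postgres is", max_new_tokens=20)
print(out[0]["generated_text"])
```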
Embeddings and vector representations
Embeddings are dense vector representations of data, capturing semantic meaning in numerical form.
Applications:
- Semantic search
- Personalization and recommendations
- Anomaly detection
- RAG (retrieval-augmented generation)
How embeddings are used:
- Convert text, images, audio, video, or structured data into vectors
- Store in vector databases or Postgres with pgvector
- Perform similarity search using approximate nearest neighbor (ANN) algorithms
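Similarity between embeddings is typically measured with cosine similarity. A minimal sketch with NumPy and made-up low-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import numpy as np

# Hypothetical 4-dimensional embeddings for a query and a document.
query_vec = np.array([0.12, 0.87, 0.33, 0.05])
doc_vec   = np.array([0.10, 0.80, 0.40, 0.02])

# Cosine similarity: 1.0 means the vectors point in the same direction,
# i.e. the inputs are semantically very close.
similarity = query_vec @ doc_vec / (
    np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)
)
print(f"cosine similarity: {similarity:.3f}")
```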
Vector databases and semantic search
Vector databases are databases optimized to store and search embeddings.
Key features:
- Fast vector similarity search
- High-dimensional vector storage
- Scale to billions of vectors
- Support for hybrid search (vector + keyword / filter)
Common use cases:
- Semantic search engines
- Personalized recommendations
- Knowledge assistants with grounding
- Fraud detection and anomaly detection
- Enhanced Gen AI pipelines (RAG)
Retrieval-augmented generation (RAG)
RAG combines vector search with LLMs to generate grounded responses.
How it works:
- The user query is converted to an embedding
- Vector search retrieves relevant documents and content
- The retrieved context is injected into the LLM prompt
- The LLM generates a response using both its trained knowledge and the retrieved context
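A minimal end-to-end sketch of this flow, with the embedding model, vector store, and LLM all stubbed out (none of these names come from AI Factory; they are placeholders for real components):

```python
# RAG flow sketch with stubbed components: embed(), vector_search(), and
# llm_generate() stand in for a real embedding model, vector store, and LLM.
import math

CORPUS = {
    "Postgres supports the pgvector extension.": [0.9, 0.1],
    "KServe serves models on Kubernetes.": [0.2, 0.8],
}

def embed(text: str) -> list[float]:
    # Toy embedding; a real system calls an embedding model here.
    return [0.9, 0.1] if "postgres" in text.lower() else [0.2, 0.8]

def vector_search(query_vec, top_k=1):
    # Rank documents by cosine similarity (nearest-neighbor search).
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    ranked = sorted(CORPUS, key=lambda d: cos(CORPUS[d], query_vec), reverse=True)
    return ranked[:top_k]

def llm_generate(prompt: str) -> str:
    # Placeholder for a real LLM call.
    return f"[LLM answer grounded in]: {prompt}"

query = "Which extension gives Postgres vector search?"
context = "\n".join(vector_search(embed(query)))          # retrieve
prompt = f"Context:\n{context}\n\nQuestion: {query}"      # inject context
print(llm_generate(prompt))                               # grounded answer
```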
Benefits:
- More accurate and grounded AI responses
- Real-time, dynamic knowledge integration
- Support for domain-specific and compliance-aware AI applications
RAG is a core architecture pattern supported by AI Factory.
AI for databases
Intelligent database management
Using AI and ML to optimize database operations:
- Automated performance tuning
- Intelligent query optimization
- Proactive resource management
- Predictive maintenance
- Self-healing capabilities
Goal: autonomous, highly optimized database infrastructure.
In-database machine learning
Perform ML model training and inference inside the database, reducing data movement.
Benefits:
- Faster insights and response times
- Simplified architecture
- Lower latency for in-database applications
- Real-time prediction capabilities
- Streamlined MLOps workflows
Example: invoking ML models from SQL in Postgres.
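The exact SQL syntax depends on the extension in use; the sketch below is purely hypothetical, with an invented predict() function and customers table, to show the shape of in-database inference from Python:

```python
# Hypothetical illustration of in-database inference: calling a model from
# SQL so the data never leaves Postgres. The predict() function and the
# customers table are made up; actual syntax depends on your extension.
import psycopg

with psycopg.connect("postgresql://localhost/mydb") as conn:
    rows = conn.execute(
        "SELECT id, predict('churn_model', features) AS churn_risk "
        "FROM customers LIMIT 10"
    ).fetchall()
    print(rows)
```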
Vector databases in Postgres
AI Factory leverages the pgvector extension in Postgres:
- Native vector type and operations
- Integration with AI pipelines and embedding models
- Full integration with Hybrid Manager and Model Serving
- Semantic search and RAG pipelines on your Postgres data
This brings AI capabilities closer to your core data.
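A hedged sketch of pgvector in action from Python (assumes the pgvector extension is installed with sufficient privileges and the psycopg driver is available; the connection string, table, and 3-dimensional vectors are illustrative):

```python
# pgvector sketch: store embeddings in Postgres and run similarity search.
import psycopg

with psycopg.connect("postgresql://localhost/mydb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(3)   -- real embeddings use many more dimensions
        )
    """)
    conn.execute(
        "INSERT INTO docs (content, embedding) VALUES (%s, %s)",
        ("hello world", "[0.1, 0.2, 0.3]"),
    )
    # <-> is pgvector's L2-distance operator; ORDER BY ... LIMIT performs
    # nearest-neighbor search. The ILIKE filter shows hybrid search
    # (vector + keyword) in one query.
    rows = conn.execute(
        "SELECT content FROM docs "
        "WHERE content ILIKE %s "
        "ORDER BY embedding <-> %s LIMIT 5",
        ("%hello%", "[0.1, 0.2, 0.3]"),
    ).fetchall()
    print(rows)
```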
AI for infrastructure
AI-accelerated hardware (GPUs, TPUs)
AI workloads depend heavily on specialized hardware:
- GPUs — General-purpose acceleration for training and inference
- TPUs — ASICs optimized for deep learning
- Other accelerators — FPGAs, inference chips
Benefits:
- Orders-of-magnitude speedup for model training
- High throughput for production inference
- Efficient resource utilization in cloud-native AI clusters
AI Factory integrates optimized GPU serving via KServe.
Leveraging cloud compute for AI workloads
Modern AI workloads rely on cloud compute to scale:
- GPU / TPU instances on demand
- Managed Kubernetes services for training and serving (ex: Hybrid Manager)
- Optimized AI stacks (CUDA, PyTorch, TensorFlow, JAX)
- Distributed model training with advanced networking
- Hybrid and multi-cloud architectures
Goals:
- Maximize flexibility and performance
- Minimize cost
- Enable portability across cloud providers and environments
AI Factory supports hybrid cloud and Sovereign AI patterns natively.
Hybrid and Sovereign AI
Modern enterprises increasingly require Sovereign AI:
- Data remains in your infrastructure and control
- AI pipelines leverage your trusted knowledge and content
- Model serving runs in your infrastructure (ex: Hybrid Manager + KServe)
- Governance, observability, and compliance are built in
AI Factory enables:
- AI in Postgres
- Pipelines and Knowledge Bases with your data
- Model Serving in your Kubernetes cluster
- Full visibility and control via Hybrid Manager dashboards
Sovereign AI is a first-class design goal for EDB AI Factory.
Related concepts
- Vectorized query engines
- Data lakehouse architectures
- Separation of storage and compute
- AI Factory Concepts
Next steps
To see how these concepts power EDB AI Factory, continue with AI Factory Concepts and the AI Factory 101 Path.