🕉

Architecture Overview

High-level design

┌─────────────────────────────────────────────────┐
│                   User                           │
│         (Vercel — chat UI)                       │
└──────────────────┬──────────────────────────────┘
                   │  POST /api/chat
                   ▼
┌─────────────────────────────────────────────────┐
│          API Layer (Render — free tier)          │
│   ┌─────────────────────────────────────────┐   │
│   │         FastAPI (Python)                │   │
│   │  - Rate limiter (30/min/IP)             │   │
│   │  - CORS locked to frontend domain       │   │
│   │  - Request validation                   │   │
│   └────────────────┬────────────────────────┘   │
│   ┌────────────────┴────────────────────────┐   │
│   │      Web Search (DDG → Google)          │   │
│   │  Retrieves public info from the web     │   │
│   └────────────────┬────────────────────────┘   │
│   ┌────────────────┴────────────────────────┐   │
│   │    Context + Prompt + Question           │   │
│   │    sent to Sarvam AI for synthesis       │   │
│   └────────────────┬────────────────────────┘   │
└──────────────────┬──────────────────────────────┘
                   │  Sarvam AI API
                   ▼
┌─────────────────────────────────────────────────┐
│            Sarvam AI (sarvam-105b)               │
│        64K context, hosted by Sarvam AI          │
└─────────────────────────────────────────────────┘

Key components

1. Frontend (`veda-guru-ai-ui`)

Static site deployed on Vercel (free tier)
Vanilla HTML/CSS/JS with marked.js for markdown rendering
Responsive design with Vedic-themed styling
Suggestion chips for quick queries

2. API Service (`veda-guru-ai-api`)

FastAPI (Python) deployed on Render (free tier)
Single endpoint: POST /api/chat
Rate limited: 30 requests/minute per IP via slowapi
CORS: Restricted to the frontend domain only
Request timeout: 120s (handles Render cold starts)

3. Web Search

Primary: DuckDuckGo via duckduckgo-search library
Fallback: Google via googlesearch-python
No API keys needed for either
Search results are formatted as context for the LLM

4. LLM — Sarvam AI

Model: sarvam-105b (128K context window)
Context window: 128K tokens
Auth: api-subscription-key header
Endpoint: POST https://api.sarvam.ai/v1/chat/completions

Data flow (query)

User: "What does Rig Veda say about truth?"

POST /api/chat { message: "..." }
Check rate limit → reject if over quota
Search web (DuckDuckGo → Google fallback)
Format results + question into prompt
Send prompt to Sarvam AI /v1/chat/completions
Parse response, attach source URLs
Return { reply, sources } to frontend
Frontend renders markdown + source toggle

Security

Rate limiting: 30 requests/minute per IP
CORS: Only the frontend domain is allowed
LLM key: SARVAM_API_KEY stored as Render env var, never in code
No PII: No user accounts, no data stored
Cold start: Render free tier sleeps after 15min idle — first request may take 30-60s

Repositories

Repo	Purpose	URL
`veda-guru-ai-docs`	Documentation site	GitHub
`veda-guru-ai-api`	FastAPI chatbot backend	GitHub
`veda-guru-ai-ui`	Chat frontend	GitHub

Architecture Overview

High-level design

Key components

1. Frontend (veda-guru-ai-ui)

2. API Service (veda-guru-ai-api)

3. Web Search

4. LLM — Sarvam AI

Data flow (query)

Security

Repositories

1. Frontend (`veda-guru-ai-ui`)

2. API Service (`veda-guru-ai-api`)