Backend AI Engineer

Dhanesh Vashisth

LangGraph · FastAPI · Python · Kafka · Kubernetes

I build production-grade AI systems — from RAG pipelines and multi-agent architectures to event-driven microservices at scale. Comfortable taking a model from prototype to a containerised, observable service running in Kubernetes.

Open to opportunities
🐍
Python
Async, type-safe, production-grade. Primary language for all backend and AI work.
FastAPI
High-performance ASGI APIs. Dependency injection, OAuth2, background tasks, WebSockets.
🔗
LangChain / LangGraph
Multi-step agents, stateful graphs, tool-calling. Moving beyond basic chains to real agentic loops.
🔗
LangSmith / Observability
Tracing, monitoring, and debugging agent pipelines. Evaluation and optimization of LLM workflows.
📚
RAG Pipelines
Embedding, chunking strategy, vector retrieval, reranking, and evaluation. End-to-end.
📨
Kafka
Event streaming for decoupled microservices. Producers, consumers, partitioning, offset management.
☸️
Kubernetes
Deploying and managing containerised workloads. Deployments, services, config maps, resource limits.
🐳
Docker
Multi-stage builds, compose for local dev, and production-ready images for K8s deployment.
🔧
Git & CI/CD
GitHub Actions for automated testing, building, and deployment pipelines.
AI / LLM: LangGraph, LangChain, RAG · Embeddings, OpenAI / Ollama
API Layer: FastAPI, Pydantic v2, JWT / OAuth2, WebSockets
Messaging: Kafka, Confluent, aiokafka, Dead-letter queues
Infra: Kubernetes, Docker, Helm charts, GitHub Actions
Data: PostgreSQL, Redis, Chroma / FAISS, SQLAlchemy
Observability: Structured logs, Prometheus, Grafana, Health probes
/ multi-tenant-ecommerce-RAG system

A production-ready RAG (Retrieval-Augmented Generation) system serving three e-commerce platforms (Amazon, Flipkart, Myntra) from a single deployment. Tenant-aware vector search in Qdrant switches context automatically based on the caller's API key; the system achieved a 40% reduction in support response time. Documents are ingested in real time via Kafka streaming, with PostgreSQL for metadata management and Redis for caching. All services are containerised with Docker so they can be scaled independently.

# architecture flow
User Query ──▶ FastAPI /chat ──▶ LangGraph Supervisor
                                   ├──▶ RAG Agent (vector retrieval)
                                   ├──▶ Tool Agent (web search, calc)
                                   └──▶ Summary Agent (condensation)
VectorDB: Chroma   Memory: Redis   LLM: GPT-4o-mini / Ollama
LangGraph FastAPI RAG Python Redis Chroma
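The tenant-aware search described for this project can be sketched roughly as follows. This is a minimal illustration, not the project's code: the API keys, the `Doc` type, and the in-memory list standing in for the vector store are all hypothetical, and no real Qdrant client calls are shown. It only demonstrates the idea of resolving a tenant from the API key and restricting retrieval to that tenant's documents.

```python
import math
from dataclasses import dataclass

# Hypothetical API-key-to-tenant table; a real deployment would load this
# from a secure store rather than hard-coding it.
API_KEY_TO_TENANT = {
    "key-amazon": "amazon",
    "key-flipkart": "flipkart",
    "key-myntra": "myntra",
}


@dataclass(frozen=True)
class Doc:
    tenant: str
    text: str
    vector: tuple[float, ...]


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def resolve_tenant(api_key: str) -> str:
    """Map an API key to its tenant, rejecting unknown keys."""
    try:
        return API_KEY_TO_TENANT[api_key]
    except KeyError:
        raise PermissionError("unknown API key") from None


def search(api_key: str, query_vector, docs, top_k: int = 2):
    """Return the top_k most similar documents belonging to the caller's tenant."""
    tenant = resolve_tenant(api_key)
    # Filter by tenant first, so other tenants' documents are never ranked.
    candidates = [d for d in docs if d.tenant == tenant]
    ranked = sorted(candidates, key=lambda d: cosine(query_vector, d.vector), reverse=True)
    return ranked[:top_k]


# Toy corpus with two-dimensional vectors in place of real embeddings.
DOCS = [
    Doc("amazon", "Amazon return policy", (1.0, 0.0)),
    Doc("amazon", "Amazon shipping times", (0.0, 1.0)),
    Doc("flipkart", "Flipkart return policy", (1.0, 0.0)),
]
```

Filtering before ranking means a query from one tenant cannot surface another tenant's content, even when the vectors are identical. A vector database would typically apply the same restriction as a payload or metadata filter at query time.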
/ code-review-agent

Production-grade multi-agent code review system built on LangGraph. Submitted code is analyzed in parallel by specialized agents (bug detection, quality, and security), then synthesized into a structured report with scoring. SHA256 caching eliminates duplicate LLM calls. Fully containerised, with a FastAPI backend and PostgreSQL persistence.

# agent graph topology
POST /review
  ──▶ Orchestrator: validate + cache check
  ──▶ Parallel Agents (LangGraph + GPT-4o-mini)
        ├──▶ Bug Detector      ──▶ Summarizer
        ├──▶ Quality Checker   ──▶ Score + Report
        └──▶ Security Checker  ──▶ PostgreSQL
Cache: SHA256   Retry: tenacity   Docker: Compose
LangGraph FastAPI LangChain Docker Python PostgreSQL
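One way the SHA256 cache and the parallel fan-out could fit together is sketched below, using only the standard library. The agent names, the in-memory `_cache` dict, and the placeholder `run_agent` coroutine are assumptions for illustration; the real project runs LLM-backed agents through LangGraph and persists to PostgreSQL, neither of which is shown here.

```python
import asyncio
import hashlib

# Hypothetical agent names mirroring the three reviewers in the topology above.
AGENTS = ("bug_detector", "quality_checker", "security_checker")

_cache: dict[str, dict] = {}  # report cache keyed by SHA256 of the submitted code
agent_calls: list[str] = []   # records every agent invocation, to make cache hits visible


async def run_agent(name: str, code: str) -> str:
    """Placeholder for an LLM-backed agent; returns a canned finding."""
    agent_calls.append(name)
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{name}: reviewed {len(code)} chars"


async def review(code: str) -> dict:
    """Return a cached report for identical code, else run all agents concurrently."""
    key = hashlib.sha256(code.encode("utf-8")).hexdigest()
    if key in _cache:
        return _cache[key]  # identical submission: no agent is invoked
    findings = await asyncio.gather(*(run_agent(name, code) for name in AGENTS))
    report = {"sha256": key, "findings": list(findings)}
    _cache[key] = report
    return report
```

Hashing the exact submitted text means only byte-identical resubmissions hit the cache; any edit, even whitespace, produces a new key and a fresh review.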
/ containerised-fastapi-microservice

Production-ready FastAPI service template with structured logging, Prometheus metrics, health/readiness probes, and a multi-stage Dockerfile. Demonstrates how to run a Python ASGI app correctly in Kubernetes — resource limits, liveness probes, config from environment, secrets from K8s Secrets.

# k8s deployment excerpt
resources:
  limits:   { cpu: "500m", memory: "512Mi" }
  requests: { cpu: "100m", memory: "128Mi" }
livenessProbe:
  httpGet: { path: /health, port: 8000 }
  initialDelaySeconds: 10
  periodSeconds: 15
env: [ { name: DATABASE_URL, valueFrom: secretKeyRef } ]
Kubernetes Docker FastAPI Python Prometheus
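The application side of that probe could look something like the sketch below. To stay dependency-free it uses a bare ASGI callable as a stand-in for a FastAPI route, so it is not the template's actual code. The `/health` path, port 8000, and `DATABASE_URL` come from the excerpt above; the `PORT` variable, the response payloads, and the `call` helper are assumptions added for illustration.

```python
import asyncio
import json
import os


def load_settings(env=os.environ) -> dict:
    """Read configuration from environment variables (PORT is a hypothetical extra)."""
    return {
        "database_url": env.get("DATABASE_URL", ""),
        "port": int(env.get("PORT", "8000")),
    }


async def app(scope, receive, send):
    """Bare ASGI app that answers the liveness probe path and 404s everything else."""
    if scope["type"] != "http":
        return
    if scope["path"] == "/health":
        status, payload = 200, {"status": "ok"}
    else:
        status, payload = 404, {"detail": "not found"}
    body = json.dumps(payload).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call(path: str):
    """Drive the app in-process, without a server, and return (status, parsed JSON)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent[0]["status"], json.loads(sent[1]["body"])
```

Keeping the liveness handler free of database or network calls is a common choice: a slow dependency then cannot cause Kubernetes to restart an otherwise healthy pod. Dependency checks usually belong in a separate readiness endpoint.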
2025 — Present
Backend AI Engineer
Excel Tech Corp (TX, USA) 
Built LangGraph-based multi-agent systems for document Q&A and workflow automation, reducing manual review time significantly.
Designed RAG pipelines with hybrid retrieval (BM25 + vector), improving answer relevance on domain-specific corpora.
Deployed FastAPI services to Kubernetes with proper resource management, health probes, and Prometheus metrics.
Integrated Kafka for async document ingestion — decoupled processing from the API layer, supporting bursty traffic patterns.
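Hybrid retrieval needs some way to merge the BM25 ranking with the vector ranking. The sketch below uses reciprocal rank fusion (RRF), which is an assumption: the fusion method actually used is not stated here, and the document IDs and the constant `k=60` are illustrative only.

```python
def reciprocal_rank_fusion(rankings, k: int = 60):
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Ties are broken alphabetically by ID.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Because RRF works on ranks rather than raw scores, it avoids having to normalise BM25 scores against cosine similarities. Documents found by both retrievers rise above those found by only one.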
2022 — 2025
Python Backend Developer
RTS (Startup) 
Built and maintained REST APIs with FastAPI and PostgreSQL, handling authentication, background tasks, and data validation with Pydantic v2.
Containerised services with Docker and set up CI/CD pipelines using GitHub Actions for automated testing and deployment.
Introduced structured logging and basic observability, making production incidents 40% faster to diagnose.

Open to backend AI engineering roles, contract work, and interesting problems in production AI systems.