Backend AI Engineer

Dhanesh Vashisth

LangGraph · FastAPI · Python · Kafka · Kubernetes

I build production-grade AI systems — from RAG pipelines and multi-agent architectures to event-driven microservices at scale. Comfortable taking a model from prototype to a containerised, observable service running in Kubernetes.

Open to opportunities

Core Stack

🐍

Python

Async, type-safe, production-grade. Primary language for all backend and AI work.

⚡

FastAPI

High-performance ASGI APIs. Dependency injection, OAuth2, background tasks, WebSockets.

🔗

LangChain / LangGraph

Multi-step agents, stateful graphs, tool-calling. Moving beyond basic chains to real agentic loops.

🔗

LangSmith | Observability

Tracing, monitoring, debugging, and evaluation of LLM workflows and agent pipelines.

📚

RAG Pipelines

Embedding, chunking strategy, vector retrieval, reranking, and evaluation. End-to-end.

📨

Kafka

Event streaming for decoupled microservices. Producers, consumers, partitioning, offset management.

☸️

Kubernetes

Deploying and managing containerised workloads. Deployments, services, config maps, resource limits.

🐳

Docker

Multi-stage builds, compose for local dev, and production-ready images for K8s deployment.

🔧

Git & CI/CD

GitHub Actions for automated testing, building, and deployment pipelines.

Stack Matrix

AI / LLM

LangGraph
LangChain
RAG · Embeddings
OpenAI / Ollama

API Layer

FastAPI
Pydantic v2
JWT / OAuth2
WebSockets

Messaging

Kafka
Confluent
aiokafka
Dead-letter queues

Infra

Kubernetes
Docker
Helm charts
GitHub Actions

Data

PostgreSQL
Redis
Chroma / FAISS
SQLAlchemy

Observability

Structured logs
Prometheus
Grafana
Health probes

Projects

/ multi-tenant-ecommerce-RAG system

GitHub

A production-ready RAG (Retrieval-Augmented Generation) system serving three e-commerce platforms (Amazon, Flipkart, Myntra) from a single deployment. Tenant-aware vector search in Qdrant switches context automatically based on the caller's API key, which reduced support response time by 40%. The system ingests documents in real time through Kafka streaming, keeps metadata in PostgreSQL, caches with Redis, and is containerized with Docker for scalability.

# architecture flow
User Query ──▶ FastAPI /chat ──▶ LangGraph Supervisor
                                   ├──▶ RAG Agent (vector retrieval)
                                   ├──▶ Tool Agent (web search, calc)
                                   └──▶ Summary Agent (condensation)
VectorDB: Chroma │ Memory: Redis │ LLM: GPT-4o-mini / Ollama
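To illustrate the API-key-based context switching described above, here is a minimal sketch in plain Python. It is not the project's code: the key table, the `TenantContext` type, the `docs_<tenant>` collection naming, and the toy substring search (standing in for a filtered vector query) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical key-to-tenant table; a real deployment would load this
# from a database or secret store rather than hard-coding it.
API_KEY_TO_TENANT = {
    "key-amazon-demo": "amazon",
    "key-flipkart-demo": "flipkart",
    "key-myntra-demo": "myntra",
}


@dataclass(frozen=True)
class TenantContext:
    tenant_id: str
    collection: str  # hypothetical per-tenant collection name


def resolve_tenant(api_key: str) -> TenantContext:
    """Map an API key to its tenant, rejecting unknown keys."""
    tenant_id = API_KEY_TO_TENANT.get(api_key)
    if tenant_id is None:
        raise PermissionError("unknown API key")
    return TenantContext(tenant_id=tenant_id, collection=f"docs_{tenant_id}")


def search(ctx: TenantContext, query: str, documents: list[dict]) -> list[dict]:
    """Toy retrieval: return only the caller's documents that mention the query.

    The tenant check here plays the role of a payload filter on a vector search.
    """
    q = query.lower()
    return [
        d for d in documents
        if d["tenant_id"] == ctx.tenant_id and q in d["text"].lower()
    ]
```

The point of the design is that the tenant is derived from the credential, never from a request parameter, so one tenant cannot ask for another tenant's documents.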

/ code-review-agent

GitHub

Production-grade multi-agent code review system built on LangGraph. Submitted code is analyzed in parallel by specialized agents — bug detection, quality, and security — then synthesized into a structured report with scoring. SHA256 caching eliminates duplicate LLM calls. Fully containerised with FastAPI backend and PostgreSQL persistence.

# agent graph topology
POST /review ──▶ Orchestrator: validate + cache check ──▶ Parallel Agents (LangGraph + GPT-4o-mini)
  ├──▶ Bug Detector      ──▶ Summarizer
  ├──▶ Quality Checker   ──▶ Score + Report
  └──▶ Security Checker  ──▶ PostgreSQL
Cache: SHA256 │ Retry: tenacity │ Docker: Compose
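The cache-then-fan-out idea can be sketched with the standard library alone. This is an illustrative approximation, not the project's implementation: the stub `run_agent` replaces real LLM calls, the in-memory dict stands in for the PostgreSQL-backed cache, and `asyncio.gather` stands in for LangGraph's parallel branches. All function and variable names are hypothetical.

```python
import asyncio
import hashlib

_cache: dict[str, dict] = {}   # in-memory stand-in for a persistent cache
stats = {"agent_calls": 0}     # counts stub agent invocations, for demonstration


def code_fingerprint(code: str) -> str:
    """SHA256 hex digest of the submitted code, used as the cache key."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


async def run_agent(name: str, code: str) -> tuple[str, str]:
    """Stub agent; a real one would call an LLM here."""
    stats["agent_calls"] += 1
    await asyncio.sleep(0)  # yield control, as a network call would
    return name, f"{name}: reviewed {len(code)} chars"


async def review(code: str) -> dict:
    """Return a cached report if this exact code was seen, else run all agents."""
    key = code_fingerprint(code)
    if key in _cache:
        return _cache[key]
    results = await asyncio.gather(
        run_agent("bugs", code),
        run_agent("quality", code),
        run_agent("security", code),
    )
    report = dict(results)
    _cache[key] = report
    return report
```

Calling `asyncio.run(review(source))` twice with identical `source` runs the three stub agents only once; the second call is served from the cache.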

/ containerised-fastapi-microservice

GitHub

Production-ready FastAPI service template with structured logging, Prometheus metrics, health/readiness probes, and a multi-stage Dockerfile. Demonstrates how to run a Python ASGI app correctly in Kubernetes — resource limits, liveness probes, config from environment, secrets from K8s Secrets.

# k8s deployment excerpt
resources:
  limits: { cpu: "500m", memory: "512Mi" }
  requests: { cpu: "100m", memory: "128Mi" }
livenessProbe:
  httpGet: { path: /health, port: 8000 }
  initialDelaySeconds: 10
  periodSeconds: 15
env: [ { name: DATABASE_URL, valueFrom: secretKeyRef } ]
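On the application side, "config from environment" might look like the sketch below. It uses only the standard library; the `Settings` class, `load_settings`, `health_payload`, and the `LOG_LEVEL` and `PORT` variables are assumptions for illustration (only `DATABASE_URL`, `/health`, and port 8000 come from the excerpt above). The template itself may well use a settings library instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str        # injected from a K8s Secret via the environment
    log_level: str = "INFO"  # hypothetical optional setting
    port: int = 8000


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment, failing fast if a required value is missing."""
    env = os.environ if env is None else env
    try:
        database_url = env["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL is not set") from None
    return Settings(
        database_url=database_url,
        log_level=env.get("LOG_LEVEL", "INFO"),
        port=int(env.get("PORT", "8000")),
    )


def health_payload(settings: Settings) -> dict:
    """Body a /health handler might return; it deliberately omits secrets."""
    return {"status": "ok", "port": settings.port}
```

Failing at startup when `DATABASE_URL` is absent means a misconfigured pod never reports healthy, so the problem surfaces during rollout rather than on the first request.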

Experience

2025 — Present

Backend AI Engineer

Excel Tech Corp (TX, USA)

Built LangGraph-based multi-agent systems for document Q&A and workflow automation, reducing manual review time significantly.

Designed RAG pipelines with hybrid retrieval (BM25 + vector), improving answer relevance on domain-specific corpora.

Deployed FastAPI services to Kubernetes with proper resource management, health probes, and Prometheus metrics.

Integrated Kafka for async document ingestion — decoupled processing from the API layer, supporting bursty traffic patterns.

2022 — 2025

Python Backend Developer

RTS (Startup)

Built and maintained REST APIs with FastAPI and PostgreSQL, handling authentication, background tasks, and data validation with Pydantic v2.

Containerised services with Docker and set up CI/CD pipelines using GitHub Actions for automated testing and deployment.

Introduced structured logging and basic observability, cutting the time to diagnose production incidents by 40%.

Contact

Open to backend AI engineering roles, contract work, and interesting problems in production AI systems.

dhaneshvashisth@gmail.com
github.com/dhaneshvashisth
linkedin.com/in/dhaneshvashisth