I'm Srinivas Vara, Director of Data Engineering & AI Enablement at a data company, based in Hyderabad. I also lead AI initiatives and AI literacy across the India Solution Center, driving adoption of AI-assisted engineering at scale.
This site is where I keep what I'm learning and experimenting with. Twenty-plus years of experience across data, B2B eCommerce, healthcare platforms, and network management. Same instincts as ever; new tools.
Two parallel course series, both anchored on a working UCC Lien Risk Intelligence platform — nothing toy, every module produces a real component.
22 courses across foundation, build, AI tooling, agentic AI, project management, and governance. The methodology side — how to integrate AI coding assistants into a working SDLC.
11 strategic modules — business case, ROI models, MCP servers, governance frameworks, and a 90-day adoption roadmap. The executive view of AI-SDLC, no engineering background required. Built for PMs and technology leaders.
22 modules + 5 capstones. From the LLM mental model through tools, memory, planning, guardrails, observability, and deployment — ending in a full production agent system.
A focused deep dive on Claude Code — skills, plugins, hooks, slash commands, MCP integration, sub-agents, and headless CI mode. The tooling track for engineers who want to get one coding agent earning its keep at production scale.
Strongest right now in data engineering on GCP, AI tooling adoption, and cloud architecture. Long history with Java/Spring, SAP Commerce, and AWS / Azure platforms.
In the data engineering organisation I lead, we ingest raw public-records feeds and transform them into canonical and best-view datasets that feed the credit-risk and compliance products, including real-time data delivery APIs.
Raw source feeds land in a bronze layer with full lineage and schema-drift handling. They're cleaned, conformed, and de-duplicated into a silver layer, then aggregated into the gold "best-view" datasets that downstream APIs and scoring models depend on. Telemetry runs alongside the pipeline rather than after it.
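The bronze, silver, and gold flow can be sketched in plain Python. This is a minimal illustration, not the production pipeline: the record fields (`filing_id`, `debtor_name`, `amount`, `source`), the function names, and the "best-view" rule (first record per filing wins, totals per debtor) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical raw feed: field names and values are illustrative only.
RAW_FEED = [
    {"filing_id": "F-1", "debtor_name": " acme corp ", "amount": "1000", "source": "state_a"},
    {"filing_id": "F-1", "debtor_name": "ACME CORP", "amount": "1000", "source": "state_a"},
    {"filing_id": "F-2", "debtor_name": "Acme Corp", "amount": "250", "source": "state_b"},
    {"filing_id": "F-3", "debtor_name": "Globex", "amount": "n/a", "source": "state_b"},
]


def to_bronze(raw_records):
    """Land records untouched, adding lineage (source plus position in the feed)."""
    return [
        {"payload": rec, "lineage": {"source": rec.get("source"), "offset": i}}
        for i, rec in enumerate(raw_records)
    ]


def to_silver(bronze):
    """Clean, conform, and de-duplicate on filing_id; skip rows that fail parsing."""
    seen, silver = set(), []
    for row in bronze:
        rec = row["payload"]
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # a real pipeline would quarantine this row rather than drop it
        if rec["filing_id"] in seen:
            continue  # duplicate filing: keep the first occurrence
        seen.add(rec["filing_id"])
        silver.append({
            "filing_id": rec["filing_id"],
            "debtor_name": rec["debtor_name"].strip().upper(),
            "amount": amount,
            "lineage": row["lineage"],
        })
    return silver


def to_gold(silver):
    """Aggregate to one best-view row per debtor."""
    totals = defaultdict(lambda: {"filings": 0, "total_amount": 0.0})
    for rec in silver:
        view = totals[rec["debtor_name"]]
        view["filings"] += 1
        view["total_amount"] += rec["amount"]
    return dict(totals)


gold = to_gold(to_silver(to_bronze(RAW_FEED)))
```

Each layer is a pure function of the one before it, so lineage captured in bronze can travel with a record into silver without the gold consumers ever seeing the raw payload.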
Public-records pipeline — live flow
Airflow on Cloud Composer drives the DAGs; Pub/Sub carries the events between stages; Dataproc and BigQuery do the work. GKE hosts the slim service surfaces that need to react in real time.
Composer · Pub/Sub · Dataproc · BigQuery · GKE
Government and shipping sources change schemas without warning. We detect drift at the bronze layer, capture changes through CDC into silver, and keep contracts stable for downstream gold consumers.
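As a rough illustration of drift detection at the bronze layer, the check below compares one incoming record with an expected schema. The contract, the field names, and the function names are hypothetical; a real implementation would likely compare schemas per batch and per source.

```python
# Hypothetical contract for one source; field names and types are illustrative only.
EXPECTED_SCHEMA = {"filing_id": str, "debtor_name": str, "amount": float}


def detect_drift(record, expected=EXPECTED_SCHEMA):
    """Compare one incoming record against the expected schema.

    Returns a dict listing added fields, missing fields, and fields whose
    value type no longer matches the contract.
    """
    incoming = set(record)
    contract = set(expected)
    return {
        "added": sorted(incoming - contract),
        "missing": sorted(contract - incoming),
        "type_changed": sorted(
            name for name in incoming & contract
            if not isinstance(record[name], expected[name])
        ),
    }


def has_drift(report):
    """True when any category in the drift report is non-empty."""
    return any(report.values())


# A source renames `amount` to `lien_amount` without warning.
report = detect_drift({"filing_id": "F-9", "debtor_name": "ACME CORP", "lien_amount": 1000.0})
```

Reporting drift as data, rather than raising immediately, leaves the decision to quarantine, map, or fail with the pipeline.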
CDC · Drift detection · Contracts
Quality checks run as part of ingestion, not after it. Health metrics, source-level SLAs, and auditable data let downstream teams trust what they're consuming, and let us catch regressions before they reach the API.
Quality · SLOs · Lineage · Alerting
AI-SDLC is the practice of integrating AI coding assistants, spec-driven planning, and automated quality gates into every phase of software development. It's a methodology, not a tool — you don't buy it, you adopt it. The difference between “giving developers ChatGPT” and AI-SDLC is governance: spec contracts, proposal reviews, and CI gates that make AI-generated code correct, secure, and consistent.
Two ways in — the technical course series for engineers and architects, and the executive playbook for PMs and leadership. Same domain, same methodology, different audience.
22 courses across foundation, build, AI tooling, agentic AI, project management, and governance — the full methodology, working examples included. Anchored on a real UCC Lien Risk Intelligence platform; every course produces an actual component.
11 strategic modules. Business case, ROI models, MCP servers, governance frameworks (ISO 42001, NIST AI RMF), and a 90-day adoption roadmap. No engineering background required. Built for PMs, managers, and technology leaders running the rollout.
Software teams using AI coding assistants report ~26% more tasks completed per sprint (GitHub / Microsoft / Accenture field study, 2024), with controlled benchmarks showing up to 55% faster completion on focused coding tasks. Industry estimates project 40% fewer defects and 2× feature throughput — but those gains only materialise with the right process framework. AI-SDLC is that framework. Gartner projects 75% of enterprise engineers will use AI coding assistants by 2028; more recent estimates put it at 90%. The question stopped being whether to adopt — it's whether to adopt with discipline or with chaos.
AI-SDLC stands on three interdependent capabilities. Take any one away and the other two collapse into noise: AI agents without specs produce drift; specs without agents are paperwork; agents and specs without MCP work blind.
Claude Code, Gemini Code Assist, Cursor — pair programmers that never sleep. They generate code, write tests, review PRs, and explain decisions. The developer-productivity pillar.
OpenSpec or equivalent. Before any AI writes code, the team writes a spec. The discipline: no code without a proposal, no proposal without a spec. The quality-and-governance pillar.
Model Context Protocol gives agents secure access to the tools developers would otherwise reach for manually — Jira, GitHub, Slack, databases, cloud consoles. Without MCP, AI works blind. The context-and-integration pillar.
AI-SDLC isn't about building agents — it's about engineers using off-the-shelf coding assistants (Claude Code, Gemini Code Assist, Cursor) inside an SDLC that stays governable. Setting expectations is half the battle: the line between “the assistant does it” and “the developer does it” is what makes the whole methodology work.
This is precisely why MCP servers and OpenSpec exist — they give the assistant the context it needs and the guardrails it must follow.
Specs first, code second. OpenSpec (the @fission-ai/openspec npm package) maintains a spec.md as the source of truth for canonical fields — names, types, business meaning — enforced across every repo. The flow: AI drafts a proposal with /opsx:propose, a human reviews it, /opsx:apply lands the changes, and CI runs /opsx:verify to catch any code that drifts from the spec. Cross-service drift — the API saying debtorName, the database debtor_name, the frontend debtorNameNorm — doesn't survive into integration.
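To make the drift check concrete, here is a minimal sketch of what a spec-compliance gate could look like. It is not how OpenSpec implements `/opsx:verify`. The dictionary-based spec, the layer names, and `find_spec_drift` are assumptions for illustration, and the rule is deliberately strict: any field name that is not canonical counts as drift.

```python
# Hypothetical canonical spec: one entry per field, keyed by its canonical name.
CANONICAL_FIELDS = {"debtorName": "string", "filingNumber": "string"}

# Field names as each layer currently declares them (illustrative only).
LAYER_FIELDS = {
    "api": ["debtorName", "filingNumber"],
    "database": ["debtor_name", "filingNumber"],
    "frontend": ["debtorNameNorm", "filingNumber"],
}


def find_spec_drift(layer_fields, canonical=CANONICAL_FIELDS):
    """Return {layer: [field names]} for names that are not in the canonical spec."""
    drift = {}
    for layer, fields in layer_fields.items():
        unknown = [name for name in fields if name not in canonical]
        if unknown:
            drift[layer] = unknown
    return drift


drift = find_spec_drift(LAYER_FIELDS)

# A CI gate would fail the build when drift is non-empty.
ci_should_fail = bool(drift)
```

A team that wants snake_case in the database would record that mapping in the spec itself instead of loosening the check.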
OpenSpec · spec.md · /opsx:verify · canonical fields
Gemini Code Assist for in-IDE completions and Agent Mode multi-file refactors; Claude Code for read / write / verify loops over the codebase. Both reason over the same project context — CLAUDE.md, GEMINI.md, the spec, the platform state — using the RCTF prompt framework, so output stays consistent across engineers and across phases.
Gemini Code Assist · Claude Code · RCTF · context files
The platform's REST surface (searchFilings, calculateRiskScore, getEntityProfile) is exposed behind an MCP server so any agent on any client calls it through the same contract. The IDE assistants and the headless review agents use the same tool schemas. That is cheaper than re-integrating against every new tool that shows up.
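The idea of a shared tool contract can be sketched without any SDK. MCP describes each tool with a name, a description, and a JSON Schema under `inputSchema`; everything else here is an assumption for illustration, including the argument names, the `call_tool` dispatcher, and the stub handlers standing in for the REST calls.

```python
# Tool definitions in the shape MCP uses for tool listings. Argument names are assumptions.
TOOLS = {
    "searchFilings": {
        "description": "Search UCC filings by debtor name.",
        "inputSchema": {
            "type": "object",
            "properties": {"debtorName": {"type": "string"}},
            "required": ["debtorName"],
        },
    },
    "getEntityProfile": {
        "description": "Fetch the profile for one entity.",
        "inputSchema": {
            "type": "object",
            "properties": {"entityId": {"type": "string"}},
            "required": ["entityId"],
        },
    },
}


def call_tool(name, arguments, handlers):
    """Check required arguments against the schema, then dispatch to the handler."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    required = TOOLS[name]["inputSchema"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"{name} is missing required arguments: {missing}")
    return handlers[name](**arguments)


# Stub handlers: a real server would call the platform's REST endpoints here.
HANDLERS = {
    "searchFilings": lambda debtorName: {"query": debtorName, "filings": []},
    "getEntityProfile": lambda entityId: {"entityId": entityId},
}

result = call_tool("searchFilings", {"debtorName": "Acme"}, HANDLERS)
```

Because every client goes through the same schema, a change to an argument name is made once, in the tool definition, rather than in each integration.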
MCP servers · tool schemas · Spring Boot REST · OAuth2
Bias audits on risk scores, explainability for adverse-action decisions, audit trails that satisfy EU AI Act requirements for credit systems, and adoption metrics that go into the same dashboards as the rest of the pipeline. Compliance is a phase of the loop, not a sign-off at the end.
bias audits · adverse-action trails · EU AI Act · adoption metrics
Two failure modes are common enough to deserve their own names. Both are governance problems, not model problems.
Ten developers each using AI independently will generate ten slightly different interpretations of the same business rule. The API ends up with debtorName, the database with debtor_name, the frontend with debtorNameNorm — integration breaks at the seam. Spec-first development is how you unify those interpretations before they get to the seam.
Accepting AI output without reading it, because it looks plausible. This is the most common failure mode in AI-assisted development: it produces security vulnerabilities, subtly wrong business logic, and technical debt that compiles and passes tests but doesn't actually meet the requirements. Prevention: review standards that require reviewers to understand AI-generated code (not just check that the tests pass), automated spec-compliance checks in CI, and a team culture where “I asked AI and accepted its output” is not an acceptable answer.
An AI agent isn't a chatbot. It's your code using an LLM as a decision-making brain, calling tools, and looping until the task is done. Not autonomous AI running on its own. Not “ChatGPT with extra steps.” A program where the reasoning lives in the model and the structure lives in the code — tools you define, guardrails you set, stop criteria you write down.
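A minimal sketch of that split between model and code, assuming nothing about any particular LLM SDK: `decide` is a hypothetical stand-in for the model call, and the action format (`{"tool": ..., "args": ...}` or `{"final": ...}`) is invented for the example.

```python
def run_agent(task, decide, tools, max_steps=5):
    """Minimal agent loop: ask the model for the next action, run it, repeat.

    `decide` stands in for the LLM call. It receives the task and the history
    of (tool, args, result) steps and returns either
    {"tool": name, "args": {...}} or {"final": answer}.
    """
    history = []
    for _ in range(max_steps):  # stop criterion: hard cap on steps
        action = decide(task, history)
        if "final" in action:  # stop criterion: the model says it is done
            return action["final"], history
        name = action["tool"]
        if name not in tools:  # guardrail: only tools you defined can run
            raise ValueError(f"model requested unknown tool: {name}")
        result = tools[name](**action["args"])
        history.append((name, action["args"], result))
    raise RuntimeError("agent hit max_steps without finishing")


# Scripted stand-in for the model so the sketch runs without an API key.
def scripted_decide(task, history):
    if not history:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": f"The answer is {history[-1][2]}"}


answer, steps = run_agent("What is 2 + 3?", scripted_decide, {"add": lambda a, b: a + b})
```

The loop, the tool whitelist, and the step cap are ordinary code. Only `decide` would be replaced by a real model call.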
Building production agents is what I spend most of my evenings and weekends on. The course below is the artefact — 22 modules and 5 capstones, anchored on a UCC Lien Risk Intelligence platform.
22 modules + 5 capstones. From the LLM mental model through tools, memory, planning, guardrails, observability, and deployment — ending in a full production agent system.
A focused deep dive on a single coding agent. Skills, plugins, hooks, slash commands, MCP integration, sub-agents, headless CI mode — the tooling track for engineers who want to get Claude Code earning its keep at production scale.
Take any business question — say, “Is Acme Corporation likely to become delinquent on secured loans in the next 12 months?” You can solve it three ways with the same data and the same ML model:
Pickle a RandomForest classifier. You compute the six features by hand, hand them to predict_delinquency(), get back a probability and a label. Fast (milliseconds), reproducible — and totally inert. No data fetch, no explanation, no follow-up.
Same pickle, wrapped in a REST endpoint. POST {"company_name":"Acme"}, the server runs a hardcoded query, returns rigid JSON. Better — auto-fetches data. Still inflexible: ILIKE 'Acme' misses filings under ACME CORP and ACME CORP DBA ROADRUNNER SUPPLIES.
The same pickle is now one tool among three — search_filings, predict_delinquency, get_filing_details. The agent reasons about name variations on its own, drills into the riskiest filing, calls the ML model, and writes a narrative report citing actual filing numbers.
The infrastructure is identical in all three. The ML model doesn't move. What changes is who decides what to do once a request arrives.
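The name-matching gap in the REST version is easy to reproduce. The sketch below mimics a wildcard-free ILIKE as a case-insensitive exact match and contrasts it with a simple normalised match. It is a deterministic illustration of the problem, not a description of how the agent reasons; the suffix list and function names are assumptions, and the third filing name is made up as a non-matching control.

```python
import re

FILING_NAMES = ["ACME CORP", "ACME CORP DBA ROADRUNNER SUPPLIES", "Apex Holdings"]

# Suffixes stripped during normalisation; this list is an assumption, not a standard.
SUFFIXES = {"CORP", "CORPORATION", "INC", "LLC", "CO"}


def exact_ilike(query, names):
    """Mimic ILIKE with no wildcards: a case-insensitive exact match."""
    return [name for name in names if name.lower() == query.lower()]


def normalise(name):
    """Upper-case, drop anything after DBA, strip punctuation and corporate suffixes."""
    head = re.split(r"\bDBA\b", name.upper())[0]
    tokens = re.findall(r"[A-Z0-9]+", head)
    return " ".join(token for token in tokens if token not in SUFFIXES)


def variant_match(query, names):
    """Match names whose normalised form equals the normalised query."""
    target = normalise(query)
    return [name for name in names if normalise(name) == target]


rigid = exact_ilike("Acme", FILING_NAMES)
flexible = variant_match("Acme", FILING_NAMES)
```

Here `rigid` is empty while `flexible` finds both ACME filings. Hand-written rules like these cover the variations someone anticipated; the agent version is aimed at the ones nobody wrote a rule for.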
The cleanest way to see what an agent adds is as three layers. Two of them are the same as the FastAPI version:
search_filings() · predict_delinquency() · the ML model
same as before
Most APIs in five years will still be FastAPI. Most ML models will still be pickle, ONNX, or TensorFlow files. The change is Layer 3 — the reasoning that decides how to use the tools and models below it. Agents don't replace ML; they put a reasoning layer on top.
Every production agent, no matter how simple or complex, is built from the same seven components. Take any one away and the agent is incomplete.
The LLM. Reads input, reasons, decides what to do next. Without it: static rules and pattern matching.
APIs, databases, files. The agent's hands. Without them: it can only respond from training data.
Conversation history, RAG. Without it: the agent forgets between turns and asks “who are you?” every message.
Decompose tasks, decide execution order. Without it: the agent only handles one-step requests.
Validate inputs, check outputs, escalate when needed. Without them: every wrong call is a production incident.
Observability — logs, traces, telemetry. Without them: you can't debug and you can't trust the output.
Where the agent runs — container, function, server. Without it: it's a script on someone's laptop.
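Of the seven components, guardrails are the easiest to show in a few lines. The wrapper below validates inputs, checks the output, and escalates borderline cases to a human. It is a sketch under assumptions: `guarded_predict`, the review band, and the decision labels are hypothetical, and `predict` is any callable that returns a probability.

```python
import math


def guarded_predict(features, predict, expected_count=6, review_band=(0.4, 0.6)):
    """Wrap a model call with an input check, an output check, and escalation."""
    # Input guardrail: the right number of features, all finite numbers.
    if len(features) != expected_count:
        raise ValueError(f"expected {expected_count} features, got {len(features)}")
    if not all(isinstance(x, (int, float)) and math.isfinite(x) for x in features):
        raise ValueError("features must be finite numbers")

    probability = predict(features)

    # Output guardrail: a probability outside [0, 1] means something upstream is broken.
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"model returned an invalid probability: {probability}")

    # Escalation: borderline scores go to a human instead of being auto-decided.
    low, high = review_band
    if low <= probability <= high:
        return {"probability": probability, "decision": "needs_human_review"}
    decision = "delinquent" if probability > high else "not_delinquent"
    return {"probability": probability, "decision": decision}


outcome = guarded_predict([1, 2, 3, 4, 5, 6], lambda features: 0.9)
```

Failing loudly on bad inputs and outputs keeps a wrong call from turning silently into a decision.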
Building an agent isn't a single checkpoint — it's a sequence with concrete artefacts at each step.
Agents are slower and more expensive than scripts. Use them where the path through the problem is open-ended — multi-step reasoning, name variations, judgement calls. Don't use them for:
Working examples of all of the above — including the seven-block stack wired into a real system — are in the course. Capstone 1 is a one-tool agent. Capstone 5 is the full production system: planning, memory, guardrails, human oversight, model routing, eval suite, deployment.