name: Nikhil Anand
role: DevOps Intern @ Telangana Police Academy
education: B.Tech Data Science | SNIST | Diploma in Cloud Computing & Big Data
focus: [ DevOps, MLOps, LLMOps, On-Prem Infrastructure ]
- Deploying, automating, and monitoring internal government platforms on on-premise infrastructure
- Designing and implementing CI/CD pipelines for reproducible, zero-downtime deployments
- Building observability stacks with Prometheus, Grafana, and ELK for production systems
- Engineering containerized workloads with Docker and Kubernetes
- Focused on high availability, reliability engineering, and scalable AI systems
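An observability stack like the one described above typically starts from a Prometheus scrape configuration. The sketch below is illustrative only; the job names, targets, and ports are placeholders, not details from my actual setup:

```yaml
# prometheus.yml — minimal scrape config (targets are placeholders)
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]      # node_exporter host metrics
  - job_name: "app"
    metrics_path: /metrics
    static_configs:
      - targets: ["app.internal:8000"]   # application metrics endpoint
```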
- DevOps & Infrastructure
- CI/CD & Version Control
- Cloud Platforms
- Monitoring & Observability
- Programming & Frameworks
- AI / ML
- Databases & Data Tools
| Domain | Technologies |
|---|---|
| Operating Systems | |
| Containers & Orchestration | |
| CI/CD | |
| Web & Reverse Proxy | |
| Monitoring | |
| Log Management | |
| IaC & Config Mgmt | |
| Version Control | |
| Focus Area | Description |
|---|---|
| LLM Deployment Architecture | Designing serving pipelines for large language models with efficient batching and resource allocation |
| Model Serving Pipelines | FastAPI + Docker + Kubernetes for scalable, containerized model inference endpoints |
| Experiment Tracking | MLflow for reproducible ML experiments, model versioning, and artifact management |
| Vector Databases | Building retrieval-augmented generation (RAG) pipelines with vector search infrastructure |
| GPU-aware Infrastructure | Optimizing compute scheduling and resource allocation for training and inference workloads |
| Observability for ML Systems | Monitoring model drift, latency, throughput, and system health in production ML pipelines |
| Reproducible ML Pipelines | End-to-end pipeline automation from data ingestion to model deployment with full lineage tracking |
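The reproducible-pipelines idea in the table above can be sketched with a deterministic lineage ID: hash the training data and the run configuration together, so the same inputs always map to the same run identifier. This is a minimal illustration, not my production code; the function name `lineage_id` is my own:

```python
import hashlib
import json

def lineage_id(data_bytes: bytes, config: dict) -> str:
    """Deterministic run ID: same data + same config => same ID."""
    h = hashlib.sha256()
    h.update(data_bytes)
    # Canonical JSON so dict key order does not change the hash
    h.update(json.dumps(config, sort_keys=True).encode())
    return h.hexdigest()[:12]

run_a = lineage_id(b"train.csv contents", {"lr": 0.01, "epochs": 5})
run_b = lineage_id(b"train.csv contents", {"epochs": 5, "lr": 0.01})
assert run_a == run_b  # key order is irrelevant
```

Tagging every model artifact with such an ID makes it traceable from data to deployment, which is the point of the lineage-tracking row above.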
To enable the snake animation, add a GitHub Actions workflow to your profile repo at `.github/workflows/snake.yml`.
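A commonly used setup is based on the Platane/snk action. The version tags and branch names below are assumptions; check the action's documentation before using it:

```yaml
# .github/workflows/snake.yml — regenerate the contribution snake daily
name: generate snake animation
on:
  schedule:
    - cron: "0 0 * * *"
  workflow_dispatch:
permissions:
  contents: write
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: Platane/snk/svg-only@v3
        with:
          github_user_name: ${{ github.repository_owner }}
          outputs: dist/github-snake.svg
      - uses: crazy-max/ghaction-github-pages@v3
        with:
          target_branch: output
          build_dir: dist
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

The README then embeds the generated SVG from the `output` branch.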
DevOps Automation ██████████████████████░░ 90% CI/CD · GitOps · IaC
On-Prem Infra ████████████████████░░░░ 80% Government Platforms
AI/ML Pipelines ████████████████░░░░░░░░ 65% Training · Serving · Monitoring
LLMOps ██████████████░░░░░░░░░░ 55% Model Lifecycle · Deployment
Reliability Eng. ████████████████████░░░░ 80% HA · Observability · SRE
Infra as Code ████████████████████░░░░ 80% Terraform · Ansible · GitOps
- Scaling DevOps automation across CI/CD, provisioning, and configuration management
- Optimizing on-prem infrastructure for government platforms with zero-downtime deployment patterns
- Building AI/ML deployment pipelines — from experiment tracking to production model serving
- Engineering LLMOps workflows for model lifecycle management, prompt versioning, and inference scaling
- Embedding Infrastructure as Code mindset across every layer of the stack
- Designing reliability-first architecture with observability, alerting, and failure recovery baked in
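One zero-downtime pattern from the list above can be sketched as an Ansible rolling update: update one host at a time and abort on the first failure, so a bad release never takes down the whole fleet. Host group, paths, and service names here are placeholders:

```yaml
# rolling_update.yml — one host at a time, fail fast (names are placeholders)
- hosts: app_servers
  serial: 1                  # update a single host per batch
  max_fail_percentage: 0     # abort the play on any failure
  tasks:
    - name: Deploy new release
      ansible.builtin.unarchive:
        src: release.tar.gz
        dest: /opt/app
    - name: Restart application service
      ansible.builtin.service:
        name: app
        state: restarted
```

Pairing `serial` with a load-balancer health check keeps traffic flowing to healthy hosts throughout the rollout.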
More About My Engineering Philosophy
"Production is sacred. Every deployment should be reproducible, every system observable, every failure recoverable."
- I believe infrastructure must be codified, version-controlled, and peer-reviewed — no manual changes in production, ever.
- Observability is not optional. If you can't measure it, you can't improve it. Metrics, logs, and traces are first-class citizens in every system I build.
- Automation exists to eliminate toil, not to replace understanding. I automate deliberately and document the why behind every pipeline.
- Reliability engineering starts at design time, not incident time. I architect for failure because failures are inevitable — downtime is not.
- The best infrastructure is invisible: engineers ship features instead of fighting deployments.
Infrastructure Philosophy
"Treat infrastructure as a product — with users, SLAs, and continuous improvement."
- Automation over manual processes. If a task is done more than once, it should be scripted. If it's done more than twice, it should be a pipeline.
- Observability before scaling. Instrument first, optimize second. You cannot scale what you cannot see.
- Reproducibility in ML pipelines. Every experiment must be versioned, every model traceable from data to deployment.
- Infrastructure as a product. Internal platforms deserve the same rigor as customer-facing services — documentation, testing, and iteration.
- Reliability as a feature. Uptime is not luck. It is engineered through redundancy, graceful degradation, and relentless testing.

