name: Nikhil Anand
role: DevOps Intern @ Telangana Police Academy
education: B.Tech Data Science | SNIST | Diploma in Cloud Computing & Big Data
focus: [ DevOps, MLOps, LLMOps, On-Prem Infrastructure ]
- Deploying, automating, and monitoring internal government platforms on on-premise infrastructure
- Designing and implementing CI/CD pipelines for reproducible, zero-downtime deployments
- Building observability stacks with Prometheus, Grafana, and ELK for production systems
- Engineering containerized workloads with Docker and Kubernetes
- Focused on high availability, reliability engineering, and scalable AI systems
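An observability stack like the one described above typically starts from a Prometheus scrape configuration. The sketch below is illustrative only; the job names, targets, and ports are placeholders, not details from my actual setup:

```yaml
# prometheus.yml — minimal scrape config (targets are placeholders)
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]      # node_exporter host metrics
  - job_name: "app"
    metrics_path: /metrics
    static_configs:
      - targets: ["app.internal:8000"]   # application metrics endpoint
```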
- DevOps & Infrastructure
- CI/CD & Version Control
- Cloud Platforms
- Monitoring & Observability
- Programming & Frameworks
- AI / ML
- Databases & Data Tools
| Domain | Technologies |
|---|---|
| Operating Systems | |
| Containers & Orchestration | |
| CI/CD | |
| Web & Reverse Proxy | |
| Monitoring | |
| Log Management | |
| IaC & Config Mgmt | |
| Version Control | |
| Focus Area | Description |
|---|---|
| LLM Deployment Architecture | Designing serving pipelines for large language models with efficient batching and resource allocation |
| Model Serving Pipelines | FastAPI + Docker + Kubernetes for scalable, containerized model inference endpoints |
| Experiment Tracking | MLflow for reproducible ML experiments, model versioning, and artifact management |
| Vector Databases | Building retrieval-augmented generation (RAG) pipelines with vector search infrastructure |
| GPU-aware Infrastructure | Optimizing compute scheduling and resource allocation for training and inference workloads |
| Observability for ML Systems | Monitoring model drift, latency, throughput, and system health in production ML pipelines |
| Reproducible ML Pipelines | End-to-end pipeline automation from data ingestion to model deployment with full lineage tracking |
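The reproducible-pipelines idea in the table above can be sketched with a deterministic lineage ID: hash the training data and the run configuration together, so the same inputs always map to the same run identifier. This is a minimal illustration, not my production code; the function name `lineage_id` is my own:

```python
import hashlib
import json

def lineage_id(data_bytes: bytes, config: dict) -> str:
    """Deterministic run ID: same data + same config => same ID."""
    h = hashlib.sha256()
    h.update(data_bytes)
    # Canonical JSON so dict key order does not change the hash
    h.update(json.dumps(config, sort_keys=True).encode())
    return h.hexdigest()[:12]

run_a = lineage_id(b"train.csv contents", {"lr": 0.01, "epochs": 5})
run_b = lineage_id(b"train.csv contents", {"epochs": 5, "lr": 0.01})
assert run_a == run_b  # key order is irrelevant
```

Tagging every model artifact with such an ID makes it traceable from data to deployment, which is the point of the lineage-tracking row above.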
To enable the snake animation, add a GitHub Actions workflow to your profile repo at `.github/workflows/snake.yml`.
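A commonly used setup is based on the Platane/snk action. The version tags and branch names below are assumptions; check the action's documentation before using it:

```yaml
# .github/workflows/snake.yml — regenerate the contribution snake daily
name: generate snake animation
on:
  schedule:
    - cron: "0 0 * * *"
  workflow_dispatch:
permissions:
  contents: write
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: Platane/snk/svg-only@v3
        with:
          github_user_name: ${{ github.repository_owner }}
          outputs: dist/github-snake.svg
      - uses: crazy-max/ghaction-github-pages@v3
        with:
          target_branch: output
          build_dir: dist
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

The README then embeds the generated SVG from the `output` branch.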
DevOps Automation ██████████████████████░░ 90% CI/CD · GitOps · IaC
On-Prem Infra ████████████████████░░░░ 80% Government Platforms
AI/ML Pipelines ████████████████░░░░░░░░ 65% Training · Serving · Monitoring
LLMOps ██████████████░░░░░░░░░░ 55% Model Lifecycle · Deployment
Reliability Eng. ████████████████████░░░░ 80% HA · Observability · SRE
Infra as Code ████████████████████░░░░ 80% Terraform · Ansible · GitOps
- Scaling DevOps automation across CI/CD, provisioning, and configuration management
- Optimizing on-prem infrastructure for government platforms with zero-downtime deployment patterns
- Building AI/ML deployment pipelines — from experiment tracking to production model serving
- Engineering LLMOps workflows for model lifecycle management, prompt versioning, and inference scaling
- Embedding Infrastructure as Code mindset across every layer of the stack
- Designing reliability-first architecture with observability, alerting, and failure recovery baked in
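One zero-downtime pattern from the list above can be sketched as an Ansible rolling update: update one host at a time and abort on the first failure, so a bad release never takes down the whole fleet. Host group, paths, and service names here are placeholders:

```yaml
# rolling_update.yml — one host at a time, fail fast (names are placeholders)
- hosts: app_servers
  serial: 1                  # update a single host per batch
  max_fail_percentage: 0     # abort the play on any failure
  tasks:
    - name: Deploy new release
      ansible.builtin.unarchive:
        src: release.tar.gz
        dest: /opt/app
    - name: Restart application service
      ansible.builtin.service:
        name: app
        state: restarted
```

Pairing `serial` with a load-balancer health check keeps traffic flowing to healthy hosts throughout the rollout.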
More About My Engineering Philosophy
"Production is sacred. Every deployment should be reproducible, every system observable, every failure recoverable."
- I believe infrastructure must be codified, version-controlled, and peer-reviewed — no manual changes in production, ever.
- Observability is not optional. If you can't measure it, you can't improve it. Metrics, logs, and traces are first-class citizens in every system I build.
- Automation exists to eliminate toil, not to replace understanding. I automate deliberately and document the why behind every pipeline.
- Reliability engineering starts at design time, not incident time. I architect for failure because failures are inevitable — downtime is not.
- The best infrastructure is invisible: engineers ship features instead of fighting deployments.
Infrastructure Philosophy
"Treat infrastructure as a product — with users, SLAs, and continuous improvement."
- Automation over manual processes. If a task is done more than once, it should be scripted. If it's done more than twice, it should be a pipeline.
- Observability before scaling. Instrument first, optimize second. You cannot scale what you cannot see.
- Reproducibility in ML pipelines. Every experiment must be versioned, every model traceable from data to deployment.
- Infrastructure as a product. Internal platforms deserve the same rigor as customer-facing services — documentation, testing, and iteration.
- Reliability as a feature. Uptime is not luck. It is engineered through redundancy, graceful degradation, and relentless testing.

