Timur Djigkaev - Platform Engineer & AI Infrastructure
Open to DevOps Engineer · SRE · Platform Engineer - US Remote or Hybrid


Five years building cloud infrastructure, Kubernetes platforms, and ML pipelines on AWS.

I build and operate production infrastructure for engineering and ML teams: Kubernetes clusters on EKS, GitOps delivery with ArgoCD and Terraform, CI/CD pipelines, and the platform layer for running ML training jobs and inference workloads.

5+ Years in Production
40% Cloud Cost Reduction

Core Expertise

Platform Engineering
Design and operate internal developer platforms on Kubernetes and AWS. GitOps-first delivery with ArgoCD and Helm. Self-service infrastructure provisioning via Terraform modules.
AI & ML Infrastructure
Kubernetes-native ML platforms: GPU node pools, Kubeflow and Argo Workflows for training pipelines, autoscaling inference endpoints, MLflow model registry, and drift monitoring.
Security-Hardened CI/CD & SRE
Multi-stage pipelines with container scanning (Trivy), OPA policy gates, and secrets management via HashiCorp Vault. SLO/SLA engineering backed by Prometheus, Grafana, and OpenTelemetry. Least-privilege IAM with Terraform modules and RBAC enforcement.

Projects

Multi-Tenant EKS Developer Platform
EKS · Karpenter · ArgoCD · Helm · OPA/Gatekeeper · Terraform
Problem
40+ engineers sharing one EC2-based environment - no isolation, no cost visibility, 3-4 hour deployment cycles with frequent cross-team conflicts.
What I Built
Migrated to EKS with Karpenter for cost-optimized spot/on-demand node provisioning. Built a Helm chart library for standardized workload packaging. Deployed ArgoCD for GitOps delivery with environment promotion gates. OPA Gatekeeper policies enforce namespace isolation and security guardrails. Terraform modules enable self-service environment provisioning in under 5 minutes.
22 min avg deploy time (was 4 hr)
32% cluster cost reduction
ML Training & Inference Platform on Kubernetes
Kubeflow · MLflow · SageMaker · EKS GPU · Argo Workflows · Prometheus
Problem
Data science team deployed models manually over SSH. No versioning, rollback, or monitoring, so production model degradation went undetected for days.
What I Built
Kubeflow pipelines for reproducible training jobs on GPU node pools with scale-to-zero. MLflow for experiment tracking and model registry with staging/canary promotion gates. Automated CI/CD pipeline for model deployment. Grafana dashboards for inference latency, throughput, and drift detection.
85% faster model deployment
$0 idle training cost
Cloud Cost Engineering Program - $18k/mo Saved
Infracost · Kubecost · AWS Cost Explorer · Terraform · Datadog
Problem
$45k/month AWS bill growing 20% month-over-month. No per-team cost visibility. $12k/month in idle and oversized resources identified during initial audit.
What I Built
Integrated Infracost into CI/CD for pre-merge cost estimates, blocking costly changes before they merge. Deployed Kubecost for per-namespace cost attribution, piped into Grafana team dashboards. Automated Reserved Instance and Savings Plan purchasing. Rightsized RDS instances, moved all non-prod environments to a spot fleet, and applied S3 lifecycle policies across all buckets.
40% monthly AWS cost reduction
$18k saved per month
3+ costly PRs blocked/week

Technical Depth

Cloud Platform
EKS, ECS, Lambda, RDS, ElastiCache, VPC, IAM, SageMaker, CloudWatch, Cost Explorer, S3, CloudFront, Route 53, GuardDuty, Organizations
Kubernetes
Helm, ArgoCD, Karpenter, KEDA, Istio, Cilium, OPA/Gatekeeper, Falco, Trivy, RBAC, network policies, multi-tenant cluster design
IaC & CI/CD
Terraform, Terragrunt, Ansible, Pulumi, GitHub Actions, Jenkins, ArgoCD, Infracost
ML Platform
Kubeflow, MLflow, SageMaker Pipelines, Argo Workflows, GPU node scheduling, model registry, drift monitoring
Observability
Prometheus, Grafana, OpenTelemetry, Datadog, PagerDuty, Jaeger, distributed tracing, SLO/SLA design
Security
HashiCorp Vault, SOPS, AWS Secrets Manager, IAM hardening, Trivy, Falco, OPA Gatekeeper, SOC 2 controls

Certifications
