Senior DevOps & Cloud Management Engineer

Join the ETP Growth Journey

At ETP Group, we deliver the next generation of AI-powered, cloud-native SaaS platforms that are transforming retail and e-Commerce operations across Asia Pacific. As we empower brands with the agility, intelligence, and innovation they need to grow in a dynamic market, we grow too. To do that, we need the right people on board.

We’re always on the lookout for passionate professionals who are smart, self-motivated, and eager to make a real impact. If you love solving challenges, working with cutting-edge technology, and being part of a collaborative and fast-paced environment, ETP is the place for you. Here, you’ll find more than just a job. You’ll find the right opportunity to shape the future of unified commerce, whilst growing your career alongside a team that values innovation, ownership, and excellence.

Ready to be part of our success story? Email your resume to careers@etpgroup.com. Please include the position you’re applying for, a recent photograph, current and expected compensation, educational qualifications, work experience, and contact details.

Company Description

ETP Group is an AI-first SaaS company serving the Retail and e-Commerce industries across Asia Pacific. With 37 years of trust in the market, it supports 500+ brands in 17 countries through enterprise-grade platforms. ETP’s cloud-native solutions, ETP Unify and Ordazzle, cover POS, CRM, Inventory, Promotions, PIM, OMS, WMS, LMS, and seamless marketplace integration. For large-format retail, ETP V5 offers a hybrid omni-channel suite.

Built on secure, scalable M.A.C.H architecture, ETP delivers frictionless, personalized experiences across channels. Its intuitive, asset-light platforms accelerate cloud transformation, reduce IT overhead, and help retailers enhance CX, drive growth, and lead in a fast-evolving commerce environment.

Here is a glimpse of what we do: http://www.etpgroup.com/Videos.html

Experience required
5+ years
Location
Mumbai
Role type
Full Time

We are looking for a highly skilled Kubernetes Platform Engineer with 5+ years of hands-on experience in designing, implementing, and managing Kubernetes-based environments on cloud platforms — specifically Amazon EKS and Google Kubernetes Engine (GKE).

The ideal candidate will build and operate scalable, secure, and highly resilient Kubernetes platforms supporting large-scale microservices architectures. This role requires close collaboration with DevOps, Development, Security, and SRE teams to ensure high availability, performance, and operational excellence.

The candidate should be adaptable, open to handling diverse cloud and DevOps initiatives, and willing to work in rotational shifts (morning/day/evening/night).

Key Responsibilities

Kubernetes Platform Engineering

  • Design, deploy, and manage Kubernetes clusters on AWS and GCP
  • Set up and manage:
    • EKS and/or GKE clusters
    • Node groups and autoscaling policies
    • Cluster networking and ingress controllers
  • Implement namespace segregation and resource quotas
  • Manage cluster upgrades, patching, and lifecycle management

Infrastructure Provisioning & Automation

  • Provision infrastructure using Infrastructure-as-Code tools:
    • Terraform (preferred)
    • CloudFormation
  • Automate cluster provisioning and environment setup
  • Implement automated scaling using:
    • Cluster Autoscaler
    • Horizontal Pod Autoscaler
    • Vertical Pod Autoscaler (preferred)

Cloud Platform Management (AWS & GCP)

Hands-on experience managing:

AWS:

  • EC2, EKS, VPC, IAM, ELB (ALB/NLB), EBS, CloudWatch

GCP:

  • GKE, Compute Engine, VPC, IAM, Load Balancing

Responsibilities include configuring and maintaining:

  • VPCs, subnets, route tables
  • Load balancers (ALB, NLB, GCP Load Balancer)
  • Security groups and firewall rules
  • IAM roles, service accounts, and access policies

Microservices Platform Operations

  • Support large-scale microservices deployments
  • Optimize resource utilization and cluster performance
  • Troubleshoot pod, node, networking, and application issues
  • Manage rolling deployments and zero-downtime upgrades

Observability & Monitoring

  • Implement and manage monitoring tools:
    • Prometheus
    • Grafana
    • GCP Operations Suite
  • Configure dashboards, alerts, and incident monitoring
  • Perform root cause analysis and incident resolution

Security & Compliance

  • Implement Kubernetes security best practices:
    • RBAC
    • Network policies
    • Pod security standards
  • Configure secrets management:
    • AWS Secrets Manager
    • GCP Secret Manager
  • Ensure compliance with enterprise security standards

CI/CD & DevOps Integration

  • Integrate Kubernetes with CI/CD pipelines:
    • Jenkins
    • GitLab CI / GitHub Actions
  • Support container build and deployment workflows
  • Manage container registries (ECR, GCR, Artifact Registry)
  • Work with Docker and Helm

Disaster Recovery & High Availability

  • Design and implement HA Kubernetes architectures
  • Implement backup and recovery strategies
  • Participate in DR drills and recovery validation

Required Technical Skills

Kubernetes & Containerization

  • 5+ years of hands-on Kubernetes experience
  • Strong experience with EKS and/or GKE
  • Experience managing production clusters
  • Strong knowledge of:
    • Pods, Deployments, StatefulSets
    • Services, Ingress
    • ConfigMaps, Secrets

Infrastructure as Code

  • Terraform (preferred)
  • CloudFormation

Monitoring & Logging

  • Prometheus
  • Grafana
  • ELK Stack or cloud-native logging

Preferred Skills

  • Experience supporting environments with 100+ microservices
  • Experience with Service Mesh (Istio, Linkerd)
  • Experience with Kafka, Redis, or Solr environments
  • Multi-environment setup (Dev, QA, Prod)
  • Production incident handling (SRE practices)

Soft Skills

  • Strong troubleshooting and analytical skills
  • Ability to work in production-critical environments
  • Strong documentation capability
  • Good communication and collaboration skills
  • Flexible and adaptable mindset

Preferred Certifications

One or more of:

  • Certified Kubernetes Administrator (CKA)
  • AWS Certified Solutions Architect
  • AWS Certified DevOps Engineer
  • Google Professional Cloud DevOps Engineer

First 90 Days – Success Indicators

  • Set up and manage Kubernetes environments
  • Improve cluster reliability and scalability
  • Implement monitoring and alerting frameworks
  • Support application deployments
  • Optimize infrastructure performance