SRE Production DevOps Engineer
Digital Engineering; LPB
- Type: Full Time
- Location(s): Hyderabad TS IN 26
- Job Posting Start Date: 2026-03-25
- Job ID: R5036316
Job Description Summary
The Production DevOps Engineer serves as a critical link in the "Middle-Mile" of software delivery for GE Vernova’s Grid Software SaaS products. This role is responsible for ensuring that software moves from development to production environments through a standardized, secure, and highly observable path. You will own the Change Management Process, serving as a primary authority for production deployments to ensure that new SaaS product versions do not compromise the stability of global energy grid operations. This position requires a strong technical background in automation and a disciplined approach to release safety in a 24/7 operational environment. The engineer works independently and is seen as a Technical Leader. The role demonstrates a deep understanding of concurrent software development and its effect on build management and on releasing builds across versions and environments.
Job Description
Roles and Responsibilities
Day 0: Pipeline Implementation & Standardization
Golden Path Execution: Maintain and improve standardized CI/CD pipelines using GitHub Actions and ArgoCD, ensuring all product teams follow the established "Golden Path" to avoid bespoke, non-standard deployment utilities.
Policy Enforcement: Implement and manage automated "quality gates" within the delivery pipeline to verify that every release meets security and architectural standards before reaching production.
Provisioning Support: Assist the SaaS Cloud Engineers in automating the provisioning of highly secure, resilient cloud infrastructure for customers.
Day 1: Release Authority & Deployment Management
Change Control Authority: Review and provide final approval for production deployment requests, ensuring all pre-release criteria—such as performance testing and security scanning—are satisfied.
Progressive Delivery: Execute advanced rollout strategies, including Canary and Blue/Green deployments on Kubernetes, to minimize the "blast radius" of changes.
Validation: Perform automated verification and acceptance testing post-deployment to confirm service health and trigger automated rollbacks if necessary.
Day 2: Operational Support & Optimization
24/7 Follow-the-Sun Support: Participate in global on-call rotations, ensuring a seamless transition of operational responsibility between time zones through standardized handover protocols.
Incident & Root Cause Analysis: Support high-severity incident response and participate in blameless Root Cause Analysis (RCA) to identify and fix systemic deployment risks.
FinOps & Capacity: Track and report on cloud resource consumption for CI/CD infrastructure, assisting in cost-optimization efforts and right-sizing production workloads.
Manage key deliverables and mentor junior team members.
Contribute to driving initiatives such as defining standards and processes to ensure quality.
Develop and enhance the test infrastructure and continuous integration framework used across teams.
Learn new build and release techniques and methodologies, and train the team in them.
Partner with and provide direction to fellow team members to diagnose bugs and formulate solutions.
Technical Requirements
CI/CD & GitOps: Hands-on experience with Jenkins, Artifactory, GitHub Actions and ArgoCD for automated software delivery.
Container Orchestration: Proficiency in managing workloads on Kubernetes, specifically with EKS clusters.
Automation Tools: Strong skills in Ansible and Terraform for configuration management and infrastructure-as-code.
Cloud Platform: Solid understanding of AWS cloud services (VPC, IAM, EKS, RDS, S3, MSK, etc.) in a production setting.
Observability: Experience using Prometheus, Grafana, Splunk, Datadog or Dynatrace to monitor deployment health and system performance.
Experience & Qualifications
Professional Background: 5+ years of experience in DevOps, SRE, or Release Engineering roles for cloud-native SaaS applications.
Overall Experience: 8+ Years.
Operational Discipline: Proven ability to manage production changes and troubleshoot under pressure in a high-stakes environment.
Compliance Awareness: Familiarity with regulated industries and security frameworks such as NERC CIP, SOC2, ISO 27001, and IEC 62443 is highly preferred.
Communication: Strong ability to document technical procedures and communicate clearly with stakeholders during global shift handovers.
Key Performance Indicators (KPIs)
System Availability: Help maintain 99.99% availability of mission-critical grid SaaS products.
Customer Onboarding Speed: Contribution towards the 4-hour SLA target.
Change Failure Rate: Maintaining a low rate of failed production deployments through improved quality gates.
Mean Time to Recover (MTTR): Ensuring fast restoration of service through automated rollbacks and diligent execution of runbooks.
Toil Reduction: Automating repetitive manual tasks to ensure at least 50% of time is spent on engineering improvements.
Education Qualification
Bachelor's Degree in Computer Science or “STEM” Majors (Science, Technology, Engineering and Math) with advanced experience.
Business Acumen:
• Strong problem-solving abilities and the ability to articulate specific technical topics or assignments
• Experience in building scalable and highly available distributed systems
• Skilled in breaking down problems and estimating time for development tasks
• Evangelizes how our technology solves customer problems from a technology and business perspective
Leadership:
• Demonstrates clarity of thinking to work through limited information and vague problem definitions
• Influences through others; builds direct and "behind the scenes" support for ideas
• Proactively identifies and removes project obstacles or barriers on behalf of the team
• Shares knowledge, power, and credit, establishing trust, credibility, and goodwill
Personal Attributes:
• Able to work under minimal supervision
• Excellent communication skills and the ability to interface with senior leadership with confidence and clarity
• Skilled in providing oversight and mentoring team members. Shows ability to effectively delegate work.
• Applies values, business strategy, policies, precedent, and experience to make complex decisions in ambiguity and with uncertain consequences.
Additional Information
Relocation Assistance Provided: Yes