Site Reliability Engineer - VP

Deutsche Bank · Posted 3 months ago

Location

MH, IN

Required Skills

Python, Kubernetes, Excel

About the Role

Job Title: Site Reliability Engineer - VP
Location: Pune, India
Role Description

• We are seeking a Site Reliability Engineer – Observability to build and scale our enterprise observability capability. This role focuses on instrumentation, monitoring, and telemetry platforms to provide end-to-end visibility across services.

• Own and drive enterprise-wide reliability governance, ensuring systems operate with consistent SLO standards, strong production controls, and audit-ready processes. Act as the central control tower for reliability across all platforms.

What we’ll offer you

As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:

• Best-in-class leave policy

• Gender-neutral parental leave

• 100% reimbursement under childcare assistance benefit (gender neutral)

• Sponsorship for industry-relevant certifications and education

• Employee Assistance Program for you and your family members

• Comprehensive Hospitalization Insurance for you and your dependents

• Accident and Term life Insurance

• Complimentary health screening for employees aged 35 and above

Your key responsibilities

Reliability Governance

• Define and own enterprise SLO/SLI framework aligned to service criticality

• Establish and enforce error budget governance policies

• Standardize reliability KPIs and reporting

Production Controls

• Define PRR / Production Certification (PRC) standards

• Observability coverage (metrics, logs, traces)

• Alert quality (actionable, low-noise)

• Runbooks & recovery readiness

• Govern release readiness across teams

Incident Governance

• Own incident management framework (severity, escalation, response)

• Define RCA standards, SLAs, and quality benchmarks

• Ensure traceability (alert → incident → RCA → remediation)

• Oversee major incidents and systemic risks

Risk & Audit Alignment

• Drive adoption of SRE practices across engineering

• Provide frameworks, playbooks, and guidance

• Conduct reliability reviews with leadership

Skills & Experience

• Strong SRE / production engineering leadership experience

• Expertise in SLOs, error budgets, incident governance, observability

• Experience with distributed systems, cloud, and Kubernetes

• Strong understanding of risk, audit, and compliance (financial services preferred)

• Own and enforce reliability as a governed, measurable, and audit-ready capability across the enterprise.

Your skills and experience

• Strong understanding of how to correlate metrics, logs, and traces

• Programming and platforms: Python, Linux

• Familiarity with monitoring tools.

How we’ll support you

• Training and development to help you excel in your career

• Coaching and support from experts in your team

• A culture of continuous learning to aid progression

• A range of flexible benefits that you can tailor to suit your needs

About us and our teams

Please visit our company website for further information:

https://www.db.com/company/company.html

We at DWS are committed to creating a diverse and inclusive workplace, one that embraces dialogue and diverse views, and treats everyone fairly to drive a high-performance culture. The value we create for our clients and investors is based on our ability to bring together various perspectives from all over the world and from different backgrounds. It is our experience that teams perform better and deliver improved outcomes when they are able to incorporate a wide range of perspectives. We call this #ConnectingTheDots.
