Verisk

Site Reliability Engineer

Posted 2 Days Ago
Hybrid
Hyderabad, Telangana
Mid level
Responsible for designing and operating resilient systems, improving reliability through automation, managing multi-region architectures, and ensuring high availability and disaster recovery processes.
The summary above was generated by AI

We’re a small engineering team building and operating production services that must stay up and available across multiple regions, even when things go wrong. We’re looking for a pragmatic Site Reliability Engineer who can design, build, and operate resilient systems without unnecessary complexity.

This role is hands-on and collaborative: you’ll work closely with application engineers to make reliability a shared responsibility, not a gate.

Responsibilities

Multi-Region Reliability & Availability (Primary Focus)

  • Design and operate multi-region architectures (active/active or active/passive)
  • Implement and improve automated failover and traffic routing
  • Identify and eliminate single points of failure
  • Ensure regional isolation and graceful degradation when dependencies fail

High Availability & Disaster Recovery

  • Define realistic availability goals and failure scenarios
  • Design and test backup and restore processes
  • Own disaster recovery plans and validate them through regular testing
  • Help the team understand RTO/RPO trade-offs

Observability & Incident Response

  • Build and maintain clear, actionable observability (metrics, logs, traces)
  • Create alerts that detect real problems without noise
  • Participate in on-call and help improve incident response
  • Lead or contribute to blameless postmortems and follow-up fixes

Automation & Operations

  • Reduce manual operational work through automation
  • Improve deployment safety (rollbacks, health checks, canaries where appropriate)
  • Manage infrastructure using infrastructure as code
  • Design systems that recover automatically whenever possible

Performance & Capacity

  • Monitor performance and saturation across regions
  • Help with capacity planning and load testing
  • Balance reliability, performance, and cost

Qualifications

  • Experience operating production systems with real availability requirements
  • Hands-on experience with cloud infrastructure and distributed systems
  • Strong understanding of: 
    • High availability patterns
    • Failure modes in distributed systems
    • Multi-region trade-offs
  • Comfortable being hands-on: debugging, automating, improving systems
  • Pragmatic mindset — you know when simple is better than perfect
  • Clear communicator who works well in a small, collaborative team

Core Technical Requirements

Cloud & Infrastructure

  • Strong expertise in Amazon Web Services (multi-region architecture)
  • Experience designing Active-Active / Active-Passive deployments
  • Disaster Recovery planning (RTO/RPO)
  • Advanced knowledge of VPC networking, IAM, Route 53, Load Balancers, and EKS
  • Infrastructure as Code (e.g., Terraform)

Containers & Orchestration

  • Advanced experience with Kubernetes (EKS preferred)
  • Strong knowledge of Docker
  • Experience managing scalable, highly available containerized workloads

API Management

  • Hands-on experience with Kong
  • API gateway configuration, authentication, rate limiting, and high availability design

Monitoring & Observability (Expert Level)

  • Advanced knowledge of Splunk
  • Strong expertise in Dynatrace
  • Experience defining SLIs/SLOs, alerting strategies, and root cause analysis
  • Incident management and production troubleshooting

Nice to Have

  • Experience with global PostgreSQL architectures (cross-region replication, failover, performance tuning)
  • Experience with Azure DevOps CI/CD pipelines
  • Working knowledge of C#
  • Strong Linux administration and troubleshooting skills

Key Competencies

  • Designing and operating highly available, resilient systems
  • Automation-first mindset
  • Deep production troubleshooting skills
  • Strong collaboration and communication abilities

About Us

Our People, Our Culture

For more than 50 years, Verisk has helped property and casualty insurers make smarter decisions about risk through AI-powered risk modeling, advanced analytics, and technology solutions spanning the entire policy lifecycle. We are a leading strategic data, analytics, and technology partner to the global insurance industry, guided by core values of learning, caring, and results while maintaining the highest ethical standards as stewards of the industry's most comprehensive datasets. Learn more about Verisk and what we are doing within the insurance industry.

When you join Verisk, you become part of a diverse global team of over 7,500 professionals in more than 30 countries, doing work that matters. We're certified by Great Place to Work, reflecting our commitment to inclusivity, employee engagement, and wellbeing. At Verisk, your growth is a priority: we offer professional development, tuition benefits, and a supportive, flexible workplace culture.

Our Culture: Explore our inclusive, people-first culture that fosters innovation, collaboration, and belonging.

Awards & Recognition: See why Verisk is consistently recognized as a Great Place to Work™ around the world.

Our Businesses: Learn about the diverse industries we serve — from insurance and energy to financial services and beyond.

Life at Verisk: Discover what it’s like to work at Verisk through employee stories, team highlights, and culture moments.

Careers at Verisk: Join a global team of problem-solvers and innovators doing meaningful work that’s shaping the future of industries. Whether you're just starting out or looking to take your career to the next level, Verisk offers growth, purpose, and a people-first culture.

Let’s build something meaningful together!

Verisk Analytics is an equal opportunity employer.

All members of the Verisk Analytics family of companies are equal opportunity employers. We consider all qualified applicants for employment without regard to race, religion, color, national origin, citizenship, sex, gender identity and/or expression, sexual orientation, veteran's status, age or disability. Verisk’s minimum hiring age is 18 except in countries with a higher age limit subject to applicable law.

https://www.verisk.com/company/careers/

Unsolicited resumes sent to Verisk, including unsolicited resumes sent to a Verisk business mailing address, fax machine, or email address, or directly to Verisk employees, will be considered Verisk property. Verisk will NOT pay a fee for any placement resulting from the receipt of an unsolicited resume.

Verisk Employee Privacy Notice

Top Skills

Amazon Web Services
Azure DevOps
Docker
Dynatrace
Kong
Kubernetes
Linux
Postgres
Splunk
Terraform

Verisk Hyderabad, Telangana, IND Office

2nd Floor, Block C, DivyaSree Omega Kondapur, Hyderabad, TS, India, 500081
