The Site Reliability Engineer will support multi-tier Java applications, develop CI/CD capabilities, manage incidents, and ensure high availability in a hybrid-cloud environment.
- Experience supporting Java (J2EE/Spring Boot) multi-tier applications with complex upstream and downstream interactions, including expertise in understanding the application request flow and analysing application logs to investigate and troubleshoot issues and application breakages.
- Ability to work in a dynamic environment, self-organise, and plan and prioritise work where multiple issues compete for attention.
- Contribute to developing and implementing automated CI/CD capability for our application.
- Contribute to our continuous improvement and continuous delivery while increasing the maturity of DevOps practices.
- Take part in discussions and provide input on designing a fully automated, robust and secure infrastructure.
- Collaborate closely with other internal SRE and Dev teams and business users on investigation, testing and deployments.
- Handle Release Management, including raising Change Requests and scheduling the implementation of fixes and enhancements.
- Work effectively in collaboration with different teams, whether local or remote.
- Work towards high availability of our applications by putting the right observability in place.
- Support our production environment with strong skills in performance tuning, end-to-end troubleshooting and networking fundamentals.
- Willingness to work in rotational shifts/on-call rosters as part of 24x7 teams supporting critical applications.
Requirements
- Minimum 5-7 years’ experience as a Site Reliability Engineer supporting different applications and application infrastructure on hybrid-cloud platforms with a mix of on-prem and AWS/GCP.
- Ability to support Java (J2EE/Spring Boot) or .NET applications, manage incidents, support application recovery, and drive root cause analysis, management communication and client relationship management in partnership with Infrastructure Service Support team members.
- Ensure all production changes are made in accordance with life-cycle methodology and risk guidelines.
- Application support and deployment of releases, patches and fixes on the platform.
- Analyse application performance, perform tuning and ensure high availability and stability of the platform.
- Knowledge of Batch Processing systems and tools
- Knowledge of Unix/Linux systems, containerization and container orchestration platforms (e.g., Docker, Cloud Foundry, OpenShift, Kubernetes).
- Strong scripting skills and the ability to automate manual tasks that can easily be converted to a script, using shell, Python or PowerShell.
- Familiarity with observability tools such as Grafana, Kibana and AppDynamics.
- Experience with AWS/GCP public cloud services.
- Hands-on experience with any of the CI/CD tools such as Jenkins, CircleCI or GitHub Actions, and the ability to understand and define different deployment strategies.
- Hands-on experience with Git, including managing deployments and branching within Git.
Top Skills
.NET
AppDynamics
AWS
CircleCI
Cloud Foundry
Docker
GCP
Git
GitHub Actions
Grafana
J2EE
Java
Jenkins
Kibana
Kubernetes
Linux
OpenShift
PowerShell
Python
Spring Boot
Unix


