Job Description
Summary
You will play an integral part in leading Gemini's engineering teams toward modern DevOps practices, both by developing and providing modern automation and operational tooling and by working cross-functionally across Gemini's engineering teams to influence and shape our development practices and culture.
Responsibilities
- Running ongoing performance evaluations and improvements for Gemini systems
- Creating production-readiness scorecards to evaluate the health of systems pre-launch
- Implementing and teaching best practices for monitoring, alerting, and automated resolution
- Defining SLIs and SLOs with Engineering teams
- Educating and guiding Engineering teams on reliability and resiliency best practices, such as statelessness, chaos testing, and blue/green deployments
- Building operational tooling and automations
Qualifications
- 2+ years using monitoring, alerting, and automation tooling to understand and remediate performance and health issues in systems at scale
- Good knowledge of cloud technology providers such as AWS, GCP, or Azure
- Experience in a code-first environment, developing automated solutions to solve support and operational issues
- Experience working with containerization and orchestration tools such as Nomad, EKS (Kubernetes), Docker, etc.
- Experience working with Configuration Management tools such as Ansible, Chef, or Puppet
- Experience writing scripts or CLI tools that help increase Developer Productivity
- Experience working with Engineering teams to implement best-practice technical solutions
- Experience working in a code-driven, automation-first public cloud infrastructure (Terraform)
It Pays to Work Here
The Compensation & Benefits Package For This Role Includes
- Competitive base salary
- Benefits
- Discretionary annual bonus
Skills
- Communication Skills
- Software Engineering