Job Description
Summary
Your Impact
- Build and orchestrate large, distributed infrastructure with a focus on automation
- Shape the resilience, efficiency, and scalability of our services
- Partner with engineers from across the company to help troubleshoot issues, deploy solutions, and accelerate their velocity
- Innovate and enhance the infrastructure platform team’s product offerings to increase self-service, improve cost optimization, and reduce toil
- Provide technical leadership and mentoring to your team and others
- Champion best practices in reliability, security, and cloud infrastructure to help cultivate a culture of high operational standards
Requirements
- At least 8 years of relevant professional experience. You have probably worked on a DevOps, infrastructure, SRE, and/or platform team before
- Ability to develop software outside of the scope of typical infrastructure requirements and configurations
- Experience in deploying or extending an internal developer platform to reduce cognitive burden and increase efficiency of other engineering teams
- Have led large cross-team initiatives and can demonstrate a successful track record with quantifiable metrics that impact the business
- Practical experience in shell scripting and demonstrable skills in at least one higher-level language
- Excellent understanding of Linux
- Expert knowledge in all aspects of designing, deploying, and supporting large real-time systems
- Experience with monitoring, logging, alerting, and end-to-end tracing is a plus
- Experience with distributed systems and container orchestration
- Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews
- Familiar with most tools from our stack (see below)
Desired Qualifications
- Experience running any infrastructure in the blockchain/web3 space is a plus
- Ability to scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Experience with internal, ephemeral test and development environments
- Experience implementing self-service tooling
- Experience developing strategies for using a Software Bill of Materials (SBOM), artifact signing, and verification
- Passion for security
- Experience setting team priorities (OKRs) and aligning the business processes required to get a product or service from ideation to production (PRD, RFC, etc.)
- Experience working remotely in a distributed team
- A strong desire to grow and challenge yourself. We would expect you to constantly find ways to improve and automate services to reduce toil
- Excitement for blockchain, Web 3.0, and similar decentralized technologies
Our Stack
- We adhere to the GitOps approach to infrastructure and state management. Self-service and automation through our internal developer platform are paramount.
- Some of the tools and services we use daily or almost daily are: AWS, Terraform/Terragrunt, Kubernetes, Argo CD, GitHub Actions, and Grafana.
- We expect you to be comfortable with many of these tools or have a strong understanding of the fundamental concepts the tools are applied to.
Skills
- AWS
- Communication Skills
- Development
- Software Engineering
- Team Collaboration