Job Description

Summary

We are looking for a Senior Site Reliability Engineer (SRE 3) to join the Launch team at P2P.org.

This team is responsible for bringing new blockchain networks into production—from initial design and deployment to ensuring they are stable, observable, and production-ready. You will work at the intersection of infrastructure, protocol engineering, and operations, helping us scale to support dozens of networks reliably.

This is a hands-on engineering role. You will design, build, and operate infrastructure for new networks, working closely with the protocol, infrastructure, and security teams. You will take ownership of launches end-to-end, ensuring that what we ship is reliable, repeatable, and aligned with our platform standards.

The right person for this role is execution-focused, systems-minded, and pragmatic. You are comfortable operating in ambiguity, but you also bring structure—turning unclear requirements into working systems. You care about reliability, automation, and doing things properly, but you also know how to move fast when needed.

You will

Network Launch & Operations

  1. Lead the end-to-end launch of new blockchain networks—from testnet to mainnet
  2. Design and implement deployment architectures for validators, full nodes, RPCs, and supporting services
  3. Ensure all new networks meet production readiness standards—monitoring, alerting, backups, failover, and security
  4. Collaborate with protocol teams to understand network-specific requirements, risks, and failure modes
  5. Create repeatable launch patterns and runbooks to reduce time-to-market for new networks

Infrastructure & Reliability Engineering

  1. Build and operate infrastructure across cloud and bare-metal environments
  2. Improve automation and standardisation of deployments using Terraform, Helm, and internal tooling
  3. Contribute to the internal platform by aligning launches with existing Kubernetes, observability, and delivery standards
  4. Implement high-availability and fault-tolerant setups for validator infrastructure
  5. Continuously improve SLOs, SLIs, and alerting for newly launched networks

Observability & Incident Response

  1. Ensure all services are fully observable—metrics, logs, and traces
  2. Define and implement alerts that are actionable and low-noise
  3. Participate in on-call rotations and incident response
  4. Lead or contribute to post-incident reviews, focusing on systemic improvements
  5. Proactively identify and fix reliability risks before they impact production

Security & Best Practices

  1. Apply security best practices to all deployments—secrets management, access control, and network isolation
  2. Ensure compliance with internal standards and contribute to SOC 2-aligned practices
  3. Support secure key management practices for validator infrastructure

Collaboration & Ownership

  1. Work closely with Infrastructure, Core Networks, and Security teams
  2. Take ownership of deliverables, from design to production
  3. Contribute to documentation, runbooks, and knowledge sharing
  4. Support and mentor more junior engineers when needed

You have

  1. 5+ years of experience in SRE, DevOps, or infrastructure engineering
  2. Strong experience operating production systems at scale
  3. Hands-on experience with:
     • Kubernetes (deployment, troubleshooting, operations)
     • Terraform (infrastructure as code)
     • Linux systems and networking fundamentals
  4. Experience with at least one cloud provider (GCP preferred; AWS, Azure, or OCI)
  5. Experience with observability tooling (Prometheus, Grafana, Loki, or similar)
  6. Familiarity with CI/CD systems and GitOps workflows (e.g., ArgoCD)
  7. Solid scripting or programming skills (Go, Python, or similar)
  8. Experience working in distributed systems or high-availability environments
  9. Strong debugging and problem-solving skills under pressure
  10. Good communication skills and ability to work across teams (English B2 minimum)

Nice to have

  1. Experience with blockchain infrastructure (validators, RPC nodes, staking systems)
  2. Experience with bare-metal environments
  3. Experience with distributed tracing or advanced observability setups
  4. Exposure to security and compliance frameworks (SOC 2, ISO 27001)

Skills
  • AWS
  • Communication Skills
  • Development
  • Python
  • Software Engineering
  • Team Collaboration
© 2026 cryptojobs.com. All rights reserved.