Job Description
Summary
The team
Join our Trading Technology team at the forefront of revolutionizing financial technology and help build the internet of money!
As a key contributor, you'll work independently and collaborate with stakeholders outside the formal management structure to ensure the seamless operation, support, and security of our Core Trading production infrastructure. From monitoring environments to managing releases with HashiCorp Nomad and implementing robust metrics, alerts, and monitoring systems, you'll play a crucial role in the team's success. Your expertise in improving Developer Tooling, building Docker images, and managing CI pipelines will contribute to the automation of quality testing, while your analytical skills will be essential in identifying and mitigating potential risks of downtime. Join us in shaping the future of financial technology and be part of a team that has played a critical role in scaling Kraken's trading infrastructure globally.
This role is fully remote. We are specifically looking for candidates in the EST time zone (+/- 1 hour) to cover current needs.
The opportunity
- Work highly independently, with multiple stakeholders outside of the formal management structure
- Take responsibility for the operation, support, and security of production infrastructure for Core Trading Services
- Monitor and support Staging and Production environments
- Manage releases using HashiCorp Nomad
- Implement robust metrics, alerts and monitoring of Trading infrastructure
- Improve Developer Tooling, help build Docker images, and manage our Continuous Integration (CI) pipelines for automating quality testing
- Analyze potential risks of downtime and develop systems that eliminate them
- Support a fully distributed team operating across numerous time zones
Skills you should HODL
- 4+ years of experience in an SRE or similar role (e.g. DevOps)
- Experience with high performance, low latency distributed systems (particularly financial)
- Experience with HashiCorp Consul, Nomad, and Vault, including Vault's PKI features
- Experience with monitoring/alerting (primarily with Prometheus/Grafana) and knowledge of best practices in the area
- Experience with Bash, Python, YAML, and configuration and secret management
- Experience with distributed systems and related technologies such as gRPC and Kafka
- Experience configuring Continuous Integration (CI)
- Understanding of Unix/Linux operating systems and shell scripting
- Understanding of DNS, SSL/TLS, and how end-to-end security and trust are established for traffic on IP networks
- Understanding of networking concepts such as TCP/IP and UDP
- Experience with logging, monitoring, and tracing tools, e.g. CloudWatch and Elasticsearch/Kibana (ELK)
Nice to haves
- Familiarity with the FIX protocol
- Experience with WebSockets and real-time market data feeds
- Experience with Terraform, Kubernetes, and Helm charts
- Understanding of the digital currency trading market
Skills
- Development
- Networking
- Python
- Software Engineering

