Senior DevOps Engineer (Network Specialist) at BitMEX | Hong Kong | Full-Time | cryptojobs.com | Best Platform for the Latest Web3 and Blockchain Jobs

Senior DevOps Engineer (Network Specialist)

BitMEX

Onsite

Hong Kong

Full-Time

Engineering

Summary

Role Overview

As a member of the Platform Engineering team, you will be responsible for managing and supporting the infrastructure that drives our platform. The reliability and scalability of our technology are key to our success, and this position will work with our development and security teams to help design highly available and fault-tolerant systems.

In particular, you will focus on monitoring and optimizing our network performance to support the low-latency, high-throughput operation of our trading exchange.

Key Responsibilities

Continuously improve the resiliency, throughput and latency profiles of our trading systems by working hand in hand with our trading technology teams
Manage and support our AWS cloud infrastructure, EC2 instances and physical servers
Develop and manage infrastructure as code (IaC) to ensure consistency of our infrastructure
Ensure security hardening of our OS builds and configurations
Manage and maintain configuration management tooling to ensure consistency
Integrate our stack with Kubernetes
Ensure SRE best practices for the design and operation of the stack
Design, implement and test disaster recovery capabilities to ensure our business can continue to operate in the event of a technology failure
Participate in an on-call rota for escalations

Qualifications

Theoretical and practical networking knowledge, including but not limited to unicast and multicast routing protocols, the Linux kernel's TCP stack implementation, congestion avoidance/control (e.g. BBR), traffic control, network simulation, AWS VPC / TGW and Kubernetes VPC CNI. DPDK experience is a plus.
Professional experience with kernel troubleshooting: strace, bpftrace, perf profiling/tracing, navigating / reading / building the relevant kernel code.
Professional experience with userland monitoring (e.g. Thanos/Prometheus/Alertmanager), logging (e.g. Splunk/Loki), alerting, troubleshooting, profiling/tracing, etc.
Strong practical AWS knowledge, with at least 5 years of SRE / DevOps experience supporting and managing Linux-based systems. A computer science or engineering degree is preferred; a strong understanding of fundamental computer science principles is required.
Familiarity with Kubernetes / Ansible / Chef, and with one or more programming languages: Python, Golang, C, Node.js.
