Job Description
Summary
The Department: Data
At Gemini, our Data Team is the engine that powers insight, innovation, and trust across the company. We bring together world-class data engineers, platform engineers, machine learning engineers, analytics engineers, and data scientists — all working in harmony to transform raw information into secure, reliable, and actionable intelligence. From building scalable pipelines and platforms, to enabling cutting-edge machine learning, to ensuring governance and cost efficiency, we deliver the foundation for smarter decisions and breakthrough products. We thrive at the intersection of crypto, technology, and finance, and we’re united by a shared mission: to unlock the full potential of Gemini’s data to drive growth, efficiency, and customer impact.
The Role: Staff Data Engineer
The Data team is responsible for designing and operating the data infrastructure that powers insight, reporting, analytics, and machine learning across the business. As a Staff Data Engineer, you will lead architectural initiatives, mentor others, and build high-scale systems that impact the entire organization. You will partner closely with product, analytics, ML, finance, operations, and engineering teams to move, transform, and model data reliably, with observability, resilience, and agility.
This role requires in-person attendance twice a week at either our San Francisco, CA or New York City, NY office.
Responsibilities:
- Lead the architecture, design, and implementation of data infrastructure and pipelines, spanning both batch and real-time/streaming workloads
- Build and maintain scalable, efficient, and reliable ETL/ELT pipelines using languages and frameworks such as Python, SQL, Spark, Flink, Beam, or equivalents
- Work on real-time or near-real-time data solutions (e.g. CDC, streaming, micro-batch) for use cases that require timely data
- Partner with data scientists, ML engineers, analysts, and product teams to understand data requirements, define SLAs, and deliver coherent data products that others can self-serve
- Establish data quality, validation, observability, and monitoring frameworks (data auditing, alerting, anomaly detection, data lineage)
- Investigate and resolve complex production issues: root cause analysis, performance bottlenecks, data integrity, fault tolerance
- Mentor and guide junior and mid-level data engineers: lead code reviews and design reviews, and evangelize best practices
- Stay up to date on new tools, technologies, and patterns in the data and cloud space, bringing proposals and proof-of-concepts when appropriate
- Document data flows, data dictionaries, architecture patterns, and operational runbooks
Minimum Qualifications:
- 8+ years of experience in data engineering (or similar) roles
- Strong experience in ETL/ELT pipeline design, implementation, and optimization
- Deep expertise in Python and SQL, writing production-quality, maintainable, testable code
- Experience with large-scale data warehouses (e.g. Databricks, BigQuery, Snowflake)
- Solid grounding in software engineering fundamentals, data structures, and systems thinking
- Hands-on experience in data modeling (dimensional modeling, normalization, schema design)
- Experience building systems with real-time or streaming data (e.g. Kafka, Kinesis, Flink, Spark Streaming), and familiarity with CDC frameworks
- Experience with orchestration / workflow frameworks (e.g. Airflow)
- Familiarity with data governance, lineage, metadata, cataloging, and data quality practices
- Strong cross-functional communication skills; ability to translate between technical and non-technical stakeholders
- Proven experience in mentoring, leading design discussions, and influencing data-engineering best practices across teams
Preferred Qualifications:
- Experience with crypto, financial services, trading, markets, or exchange systems
- Experience with blockchain, crypto, Web3 data — e.g. blocks, transactions, contract calls, token transfers, UTXO/account models, on-chain indexing, chain APIs, etc.
- Experience with infrastructure as code, containerization, and CI/CD pipelines
- Hands-on experience managing and optimizing Databricks on AWS
It Pays to Work Here
The compensation & benefits package for this role includes:
- Competitive starting pay
- A discretionary annual bonus
- Long-term incentive in the form of a new hire equity grant
- Comprehensive health plans
- 401(k) with company matching
- Paid Parental Leave
- Flexible time off
Salary Range:
The base salary range for this role is between $168,000 and $240,000 in the States of New York, California, and Washington. This range does not include our discretionary bonus or equity package. When determining a candidate's compensation, we consider a number of factors, including skill set, experience, job scope, and current market data.
Skills
- AWS
- Communication Skills
- Development
- Leadership
- Software Engineering
- Team Collaboration

