Job Description
Summary
You'll lead a cross-functional pod that spans the full stack, from C++ inference engines to JavaScript applications. Your responsibility is to ensure that local AI capabilities ship reliably and perform well across devices. You'll balance hands-on technical work with team coordination, guiding foundation and middleware engineers toward shared goals.
This role is ideal for someone who understands both the low-level challenges of edge AI and the product-facing needs of app developers, and wants to drive the delivery of cohesive, production-ready local AI systems.
Responsibilities
- Deploy machine learning models to edge devices using llama.cpp, ggml, and ONNX
- Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
- Integrate AI features into existing products, enriching them with the latest advancements in machine learning
- Manage a cross-functional team (pod) of middleware (JS), foundation (C++), QA, and documentation engineers to produce high-quality deliverables
- Regularly assess, both qualitatively and quantitatively, our position in the market relative to similar products and platforms
- Leverage the expertise of technical architects to ensure robust architectural choices and code quality
- Ensure stable releases by following precise internal release processes
Job requirements
- Excellent programming skills in C++
- Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures
- Good understanding of deep learning concepts and model architectures
- Experience with transformers and LLMs
- Demonstrated ability to rapidly assimilate new technologies and techniques
- Experience managing a small, specialized, cross-functional team (pod) of 3-5 people
- A genuine passion for building good products that improve people's lives
- A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D
Bonus points if:
- You have extensive experience with JavaScript/TypeScript
- You have experience with AWS, containerization platforms, orchestration, and automated testing suites (Maestro, Appium)
- You understand the difficulties, nuances, and importance of peer-to-peer (P2P) technology
- You have worked with MLC, TVM or similar frameworks
- You have experience with Vulkan or CUDA
- You have productionized models
Skills
- C++
- Communication Skills
- Development
- Team Collaboration
- TypeScript

