Role
Location: Zürich, Switzerland
Seniority: 5+ years (systems, compilers, or embedded)
Join the core software team that turns ComputeRAM® into something developers love to use. You will design and ship a software integration kit, build the software library components that sit between applications and hardware, develop compiler passes that lower NN/DSP ops to our primitives, and own a solid benchmarking pipeline. You will work closely with silicon and equally closely with users, moving from prototype to production with clean code, clear documentation, and measurable speed and energy gains.
- Design and implement embedded software libraries and low-level runtime for ComputeRAM-enabled platforms.
- Develop and maintain the compiler path (MLIR/LLVM passes, code generation, kernels) that maps AI and DSP primitives and related operations to our hardware.
- Develop and refine a benchmarking and profiling framework that incorporates reproducible tests, dashboards, and regression gates.
- Strengthen build, test, and CI so releases are predictable and artifacts are easy to consume.
- Collaborate with hardware, architecture, and customer-facing teams; write precise specs and documentation; turn feedback into roadmap items.
- A production-ready driver + runtime stack for at least one MCU target and one accelerator-class target.
- A working compiler path with visible wins in latency and energy on representative models, documented end-to-end.
- A stable benchmark suite with automated reports and performance guards integrated in CI.
- Developer-quality docs, examples, and reference projects that make first use smooth for partners.
- 5+ years building low-level software or compilers; strong C++ and Python; you have shipped production code.
- Hands-on experience with embedded systems and compiler design.
- Solid systems understanding: memory and concurrency fundamentals.
- Comfortable reading hardware datasheets and working at the HW/SW boundary.
- Evidence of performance work (profiling, tracing, optimization) on embedded or accelerator targets.
- Clear writing, good documentation habits, and a collaborative approach.
- Experience deploying deep-learning workloads on edge devices; familiarity with TensorFlow Lite for Microcontrollers, TVM, or IREE.
- HPC exposure (DirectML, OpenCL, CUDA) or DSP algorithm implementations.
- CI/CD depth and packaging for developer kits, using GitHub Actions or a similar tool.
- Quantization and fixed-point experience for edge inference.
Send your CV and a short note (2–3 paragraphs) covering a design you owned, your toughest bug and how you solved it, and a brief introduction to what you like to do and how you see yourself as an engineer.
- 30-min intro with HR (role/context)
- First technical deep dive
- Second technical deep dive
- (Optional) Third technical deep dive
- Systems/product conversation with management