Remote-first (must be based within CET ±3 hours) | Full-time
About Arkiv
Blockchain data is hard to use. Developers store data on-chain, then immediately need indexers, subgraphs, and custom infrastructure just to query it back. We think that's broken.
Arkiv is fixing it at the protocol layer - a blockchain with queryable storage built in. SQL queries against on-chain data, no external indexing required. It's a different architecture, and it doesn't exist anywhere else yet.
We're a small, focused team preparing for mainnet launch. If you want to run foundational Web3 infrastructure and own the experience developers have when they interact with Arkiv, this is the moment.
The Role
You have deep experience in blockchain infrastructure and have been responsible for production workloads. What matters most is that you have run systems where downtime has real consequences.
You will own everything that runs Arkiv in production and everything users interact with: L3 infrastructure, monitoring, the website, dashboards, Bridge UI, explorer, and developer onboarding.
Developers, users, and partners are why we exist — their success is our success.
This is a hands-on leadership role. You will build and own the infrastructure that keeps Arkiv running - and you'll own the surfaces where developers experience it: the explorer, the Bridge UI, the docs, the getting started guides. You'll debug production issues, design monitoring systems, and be on-call. You will also shape how we operate, mentor engineers, and help define what the platform team looks like as we grow.
Our philosophy: We own production. When it breaks, we learn. Our goal is fewer pages, faster recovery, and systems that let us sleep.
You collaborate closely with the CTO on reliability and infrastructure strategy, and with the Protocol team, which builds what you operate. You lead a small team today, with a mandate to grow and shape it.
You'll also be the technical face of Arkiv for prospective partners — supporting pilots, answering integration questions, and making sure their first experience with Arkiv is excellent. For deep protocol questions, you'll pull in the CTO or Protocol team — but day-to-day partner enablement is yours.
What You'll Own
The Head of Platform will be responsible for several critical domains across the production environment and user-facing applications.
Domain: Scope
L3 Infrastructure: Kubernetes clusters, deployments, networking, secrets management
Monitoring & Reliability: Prometheus, Grafana, alerting, incident response, runbooks, SLOs
Website & Bridge UI: Public website, token bridge application
Explorer & Dashboards: Blockscout deployment, internal ops dashboards, network statistics
Developer Onboarding: Docs website, getting started guides, faucet
Vendor Management: External L2 ops provider relationship, SLAs, escalations
Partner Pilots: Demo environments, pre-sales technical support, integration assistance for prospective partners
Team: Hiring, mentoring, growing engineers to senior level
Who You Are
Character and drive matter more than your CV.
You are genuinely curious - but your curiosity has a specific flavor: you need to understand how things fail. Not just well enough to deploy, but well enough to debug while others are asleep and to make sure the same failure doesn't happen again.
You are hands-on. You've led teams, but you haven't stopped doing the work. You believe the best infrastructure leaders stay close to production. You can run a planning meeting, but you're equally comfortable in a terminal session during an incident.
You are data-driven. SLOs, MTTR, incident frequency - you measure what matters and use data to prioritize. "I think it's reliable" isn't good enough; you want to know.
You collaborate well. You're demanding and you have high standards - but you're also approachable and, ideally, have a sense of humor. People want to work with you, not just for you. You make the team better by being in it.
You see beyond your lane. When something is broken or could be better, you notice - even if it's not strictly your responsibility. Production issues don't respect org charts, and neither do you.
You have an inner drive to improve. The current state is never good enough. You're always looking for what's more reliable, more observable, more automated.
You stay calm under pressure. Incidents happen. You set the tone: clear thinking, structured response, blameless postmortems that drive improvement.
You've done this before. You have a track record of running demanding production systems - ideally at scale, ideally with real consequences for downtime. You've grown engineers. You've been responsible for both the systems and the people operating them.
What We Expect
Stay hands-on: This is not a "meetings and dashboards" role - you debug, deploy, and build
Own production: Reliability is your responsibility; you set SLOs and hold the team to them
Lead by example: Set the standard for operational excellence, quality, and collaboration
Mentor and grow: Help engineers reach senior level; give direct, useful feedback
Partner with Protocol: Work closely with the Protocol team, which builds what you operate - shared ownership of reliability and success
Shape the team: Help define roles, hiring priorities, and team structure as we scale
Technical Requirements
We are looking for a leader with significant hands-on technical depth, as detailed in the table below.
Requirement: What We're Looking For
Infrastructure: Significant experience (5+ years typical) running production systems; Kubernetes across different environments (cloud providers and bare metal); Infrastructure-as-Code (Terraform, Ansible, or similar) - you've built and operated infrastructure at scale, regardless of where it runs
Blockchain: Prior blockchain/node experience is a prerequisite
CI/CD & GitOps: Real-world experience building and operating deployment pipelines; GitOps tools (e.g. ArgoCD, Flux); fast, safe releases are second nature to you
Coding: Proficiency in Go, Python, or TypeScript; you can write tools, not just configure them
Observability: Deep experience with monitoring, alerting, and incident response - you design systems that tell you when something's wrong before users notice
SRE mindset: You think in SLOs, measure MTTR, and treat reliability as a craft, not a checkbox
Software development: Solid engineering background - you can review code, guide technical decisions, and collaborate meaningfully with developers on your team
Leadership: Led and grown a team (3+ engineers); your exact title is secondary - what matters is that you've been responsible for people, not just systems
Strong Plus:
Track record in open source projects or significant internal platform/tooling work
Incident commander experience
Vendor/SLA management experience
Experience with bare metal Kubernetes deployments or cost-optimized infrastructure setups
We know this list is demanding - covering all of it is rare. If you're a strong technical leader and most of these requirements resonate, we encourage you to apply.
How We Work
Ownership: See a problem, own it - follow through or escalate.
Direct feedback: We challenge ideas openly and say what we mean.
Ship fast, learn faster: Simple solutions, quick iterations, mistakes are data.
Stay close to users: Decisions grounded in real signals from customers and the larger Web3 community, not assumptions.
Location & Compensation
Remote-first (must be based within CET ±3 hours)
Competitive salary, commensurate with experience
Arkiv is building infrastructure for a more open internet. If this sounds like you, we'd love to talk.