Senior Storage & Data Engineer

ETH Zürich - Lugano, TI


Job details

Part-time | Workload: 80-100% | 100%
4 hours ago

Qualifications

CI/CD
Terraform
S3
GitLab
Python

Full job description

Senior Storage & Data Engineer

80%-100%, Lugano, fixed-term

The Swiss National Supercomputing Centre (CSCS) develops and operates a high-performance computing and data research infrastructure that supports world-class science in Switzerland. Its user laboratory is available to domestic and international researchers in academia, industry, and the business sector. The centre is operated by ETH Zurich and has offices at its data centre in Lugano and in Zurich.

For this position, the work location is either Lugano or Zurich. The contract is for two years.

Project background

Storing petabytes is the easy part. The hard part is everything between the moment data lands on disk and the moment a researcher — or a training job — can actually trust it, find it, and use it.
Our parallel filesystems and object stores already move data fast. What they don't do on their own is tell a scientist where a dataset came from, which transformations produced it, whether it's the version that backed last quarter's published result, or how to feed it to a DataLoader without saturating the I/O subsystem. That gap — between raw bytes and usable, traceable, reproducible data — is where this role lives.
You'll work at both ends: the storage layer (throughput, integrity, tiering at multi-petabyte scale) and the data layer above it (lineage, provenance, discoverability, access patterns). If you've ever been annoyed that "the data is on the cluster" gets treated as the end of the job rather than the start of it, read on.

Job description

Bridge ingestion and use. Design the pipelines and metadata that turn ingested data into something findable and consumable — catalogs, schemas, and access layers that match how training jobs and simulations actually read, not just where bytes sit.
Make data traceable. Build lineage and provenance so any dataset, checkpoint, or result can be traced back to its inputs and transformations. Reproducibility is a first-class requirement here, not a retrofit.
Tune for the workload. Optimise parallel filesystems (Lustre, GPFS) and object storage for the concurrency, small-file, and large-checkpoint patterns of distributed GPU training and HPC simulation.
Operate at scale, safely. Design and run multi-petabyte storage with the integrity and availability scientific work depends on — erasure coding, redundancy, hot-to-archival tiering.
Automate everything. Deploy and scale storage and data services as code. Snowflake infrastructure doesn't survive at this scale.
Make it observable. Instrument storage health, capacity trends, and pipeline performance so problems surface before users feel them.
Translate. Turn real access patterns from domain scientists and ML engineers into technical requirements — and push back when a request would quietly break something downstream.

This two-year opening is part of a project in the weather and climate domain, aimed at understanding and mitigating the impact of climate change.

The initial two-year contract may be extended or even become permanent.

Profile

A technical degree (CS, engineering) or equivalent experience that demonstrates the same depth.
Solid storage grounding: filesystems, block and object storage, performance tuning, redundancy (RAID, erasure coding).
Python, and comfort automating infrastructure (Ansible, Terraform, or similar).
A working understanding of how ML and scientific workloads consume data — billions of small files, large checkpoints, sharding — and why naive layouts fall over.
A point of view on data lineage, provenance, or reproducibility — and ideally tooling you've used to enforce it.

What helps you stand out

Hands-on parallel filesystems (Lustre, Spectrum Scale/GPFS) or distributed storage (Ceph, VAST).
Scientific data formats — HDF5, Zarr, Parquet — and opinions on when each earns its place.
Object storage (S3) interfaced with ML frameworks (PyTorch, TensorFlow).
Orchestration (Kubernetes, Argo) and data-movement tooling.
Data versioning / cataloguing (e.g. DVC, lakeFS, a metadata catalog) and familiarity with FAIR data principles.
CI/CD and provisioning: GitLab CI, HashiCorp Vault, MAAS.

We don't expect every box ticked. Depth in storage or data engineering, plus the curiosity to grow into the other, matters more than a complete checklist.

What you get

Hardware and scale you won't find in enterprise IT — and problems with no vendor playbook.
Work that directly enables published science and frontier-scale model training.
Room to shape how data is managed, not just maintained, in an environment that takes it seriously.

Our core values as guiding principles:

Curiosity: You enjoy learning and understanding systems deeply
Openness: You collaborate effectively and value different perspectives
Courage: You are willing to tackle difficult or unfamiliar problems
Supportive: You help colleagues and users succeed
Integrity: You act responsibly, reliably, and transparently

Workplace

We offer

We are committed to building a diverse and inclusive engineering team and particularly encourage applications from groups underrepresented in tech. If you are technically adept, curious, and eager to grow, we want to hear from you.

Your job with impact: Become part of ETH Zurich, which not only supports your professional development, but also actively contributes to positive change in society.
You can expect numerous benefits, such as public transport season tickets and car sharing, a wide range of sports offered by the ASVZ, childcare and attractive pension benefits.
You can look forward to an exciting working environment, cultural diversity and attractive offers and benefits.
We value the diversity of our team and, to further enhance the diversity of our workforce, we particularly encourage women to apply.

We value diversity and sustainability

In line with our values, ETH Zurich encourages an inclusive culture. We promote equality of opportunity, value diversity and nurture a working and learning environment in which the rights and dignity of all our staff and students are respected. Visit our Equal Opportunities and Diversity website to find out how we ensure a fair and open environment that allows everyone to grow and flourish. Sustainability is a core value for us – we are consistently working towards a climate-neutral future.

Curious? So are we.

Please include the following documents with your application:

Motivation letter in PDF format
CV in PDF format
Relevant certificates and diplomas in PDF format

We look forward to receiving your online application, including a letter of motivation, CV, diplomas, and employment certificates. Please address your application to Mrs Stephanie Frequente, CSCS Human Resources, Via Trevano 131, 6900 Lugano.

Please note that we exclusively accept applications submitted through our online application portal. Applications via email or postal services will not be considered.

Further information about CSCS can be found on our website. Questions regarding the position should be directed to Pim Witlox, [email protected] (no applications).

For recruitment services, the GTC of ETH Zurich apply.
