Magic is hiring a

Distributed Compute Engineer

Job Overview

Posted 6 days ago
Full Time
San Francisco, CA, USA
90000

Roles & Responsibilities

About the role: As a distributed systems engineer for compute, you will build the stack and systems that enable 1T+ parameter model training and efficient inference on Magic’s GPU clusters.

What you might work on:

Develop and maintain the software stack to support large-scale, highly available AI training and inference infrastructure
Implement and optimize systems for data processing and inference using technologies like Ray, Redis, message queues (Kafka), distributed communication libraries (gRPC, ZeroMQ), and HPC technologies
Orchestrate fine-grained data movement using Rust, C++ and NCCL or UCX
Design and manage high-performance storage and caching solutions to support data-intensive applications
Build with an eye towards fault-tolerance, performance and observability
Hack on the internals of deep learning frameworks (PyTorch, JAX) in a distributed setting
Troubleshoot and resolve complex issues across GPU resources, networking, OS, drivers, and cloud environments. Automate fault detection and recovery processes

Skills Required

Machine Learning
Python

Find similar jobs for Engineer

Process Automation Engineer

ArcelorMittal

💼 Full Time
🌍 Sestao
💰 34000

⏰ 10 hours ago

Engineer, Validation

SiFive

💼 Full Time
🌍 Bengaluru
💰 19000

⏰ 10 hours ago

Tech Innovation - R&D Software Engineer

Infomineo

💼 Full Time
🌍 Cairo
💰 84000

⏰ 10 hours ago

Algorithm Engineer

Taboola

💼 Full Time
🌍 Taipei
💰 19000

⏰ 10 hours ago

Senior Software Test Engineer

LVIS

💼 Full Time
🌍 Seocho-gu
💰 28000

⏰ 10 hours ago
