[Remote] Research Scientist Intern (ML System) - 2026 Start (PhD)
Note: This is a remote position open to candidates in the USA. ByteDance is a global technology company known for innovative products such as TikTok and CapCut. It is seeking a Research Scientist Intern to develop and optimize machine learning frameworks, with a focus on large-scale distributed systems and GPU performance optimization.
Responsibilities
- Develop and optimize frameworks for LLM training, inference, and reinforcement learning (RL)
- Work closely with model researchers to scale LLM training and reinforcement learning to the next level
- Optimize GPU and CUDA performance to build an industry-leading, high-performance engine for LLM training, inference, and RL
Skills
- Currently pursuing a PhD in computer science, automation, electronics engineering or a related technical discipline
- Proficient in algorithms and data structures, familiar with Python
- Understanding of the basic principles of deep learning algorithms, familiarity with the basic architecture of neural networks, and understanding of deep learning training frameworks such as PyTorch
- Proficient in GPU high-performance computing optimization with CUDA, with an in-depth understanding of computer architecture; familiar with parallel computing optimization, memory access optimization, low-bit computing, etc.
- Familiar with FSDP, DeepSpeed, JAX SPMD, Megatron-LM, verl, TensorRT-LLM, ORCA, vLLM, SGLang, etc.
- Knowledge of LLMs; experience accelerating and optimizing LLMs is preferred
Benefits
- Interns have day-one access to health insurance
- Life insurance
- Wellbeing benefits and more
- Interns also receive 10 paid holidays per year
- Paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half)
- Interns who are not working 100% remote may also be eligible for a housing allowance