Senior Software Engineer - Together Cloud Platform
Company: Together AI
Location: San Francisco
Posted on: April 2, 2026
Job Description:
About the Role
Together AI is building the AI Acceleration Cloud, an end-to-end
platform for the full generative AI lifecycle, combining the fastest
LLM inference engine with state-of-the-art AI cloud infrastructure.
As a Senior Backend Engineer, you will play a key role in building
the next-generation AI cloud platform – a highly available, global,
blazing-fast cloud infrastructure that virtualizes cutting-edge ML
hardware (GB200s/GB300s, BlueField DPUs) and equips state-of-the-art
ML practitioners with self-serve AI cloud services, such as
on-demand managed Kubernetes and Slurm clusters. This platform
serves both our internal StaaS products (inference, fine-tuning) and
our external cloud customers, spanning dozens of data centers across
the world.

Some of what you’ll work on:
- Work on a distributed GPU scheduling system for the on-demand
  clusters product, Instant Clusters.
- Build out a global management plane for managing our data center
  compute, networking, and storage.
- Design and build new customer-facing cloud platform services,
  delivering killer enterprise AI cloud features.

Responsibilities
- Identify, design, and develop foundational backend services that
  power Together’s cloud platform
- Analyze and improve the robustness and scalability of existing
  distributed systems, APIs, databases, and infrastructure
- Partner with product teams to understand functional requirements
  and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable software and IaC for
  both new and existing systems
- Conduct design and code reviews, create developer documentation,
  and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents
  when necessary

Requirements
- 5 years of demonstrated experience in building large-scale,
  fault-tolerant distributed systems and API microservices
- Experience designing, analyzing, and improving the efficiency,
  scalability, and stability of various system resources
- Excellent communication skills – able to write clear design docs
  and work effectively with both technical and non-technical team
  members
- Demonstrated experience with building and operating
  high-performance and/or globally distributed microservice
  architectures across one or more cloud providers (AWS, Azure, GCP)
- Strong systems knowledge across compute, networking, and storage,
  including concurrency, memory management, performant I/O, and scale
- Experience developing against and managing a relational database,
  such as PostgreSQL
- Expert-level programmer in one or more programming languages
  (Golang preferred)
- Proficiency in version control practices and integrating IaC with
  CI/CD pipelines
- Experience with Kubernetes and containers preferred
- Experience building and operating data infrastructure (Kinesis,
  Airflow, Kafka, etc.) a plus
- Bachelor’s or Master’s degree in Computer Science, Computer
  Engineering, or a related technical field, or equivalent practical
  experience

About Together AI
Together AI is a
research-driven artificial intelligence company. We believe open
and transparent AI systems will drive innovation and create the
best outcomes for society, and together we are on a mission to
significantly lower the cost of modern AI systems by co-designing
software, hardware, algorithms, and models. We have contributed to
leading open-source research, models, and datasets to advance the
frontier of AI, and our team has been behind technological
advancements such as FlashAttention, Hyena, FlexGen, and RedPajama.
We invite you to join a passionate group of researchers on our
journey to build the next-generation AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance,
and other benefits, as well as flexibility in terms of remote work.
The US base salary range for this full-time position is
$160,000 - $230,000, plus equity and benefits. Our salary ranges are
determined by location, level, and role. Individual compensation
will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer
equal employment opportunity to everyone regardless of race, color,
ancestry, religion, sex, national origin, sexual orientation, age,
citizenship, marital status, disability, gender identity, veteran
status, and more.
Please see our privacy policy at
https://www.together.ai/privacy