
Principal Software Engineer – LLM Serving (Cloud AI)

SpringCube

Full time - Senior Engineer

IT Services & Consulting

United States, San Francisco - California

Published 2 days ago

Salary: Disclosed upon interview


Job Description

The SpringCube team curated the following job opportunity to help you in your job search. Explore the position below to find your next career move.

Company Overview
A leading technology company in the semiconductor and mobile communications space is seeking a Principal Software Engineer to join its Cloud AI team. The organization focuses on developing advanced software solutions for AI inference acceleration, machine learning, and large-scale commercial deployment across cutting-edge hardware platforms.

Summary
The Principal Software Engineer will design, develop, and optimize software for large language models (LLMs) and AI inference frameworks. This role requires deep technical expertise across software engineering, system performance, and AI model deployment, combined with strong collaboration skills for cross-functional projects.

Responsibilities

  • Design, develop, and maintain software solutions for inference acceleration and LLM serving frameworks such as vLLM
  • Execute, analyze, and optimize neural network performance on multicore systems
  • Develop high-performance software using C++ and Python
  • Analyze software/hardware performance, including multi-core architecture fundamentals (cores, caches, memory, bus, PCIe)
  • Apply performance modeling techniques for SoC and multi-core processor architectures
  • Collaborate with cross-functional teams from R&D to commercial deployment
  • Contribute to machine learning accelerators and related software optimization
  • Maintain awareness of advances in LLMs, multi-modal reasoning models, and neural network operators
  • Provide technical guidance and mentoring to team members

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field with 8+ years of relevant experience, OR
    Master’s degree with 7+ years, OR PhD with 6+ years of experience in Hardware, Software, or Systems Engineering
  • Proven experience planning, managing, and delivering large-scale commercial software projects
  • Strong development skills in PyTorch and understanding of LLMs, multi-modal, and reasoning models
  • Experience in executing, analyzing, and optimizing neural networks
  • Strong proficiency in C++ and Python
  • Knowledge of multi-core processor architecture and SoC fundamentals
  • Experience with performance modeling of SoC architectures
  • Excellent written and verbal communication skills and the ability to work well in a team
  • Experience with machine learning accelerators and understanding of neural network mathematical operations (linear algebra, math libraries) is highly desired

Disclaimer
SpringCube curates tech job listings from various company websites to support tech professionals globally.

1. No Endorsement: Job ads on SpringCube do not imply endorsement of their authenticity or quality.
2. No Client Relationship: This company is not a client of SpringCube unless stated.
3. To Apply: Click the “Apply” button to be redirected to the hiring company’s application page for this job.
4. No Liability: SpringCube is not liable for inaccuracies.