Senior AI Performance Architect

SpringCube

Full-time - Senior Engineer

IT Services & Consulting

United States, San Francisco - California

Published 2 days ago

Salary: Disclosed upon interview

Job Description

The SpringCube team curated the following job opportunity to help you in your job search. Explore the position below to find your next career move.

Company Overview
A leading global technology company is driving innovation in AI across mobile, edge, and data center platforms. The organization focuses on power-efficient AI solutions for devices, servers, and large-scale AI training systems, combining expertise in software, hardware accelerators, and system-on-chip (SoC) design to deliver cutting-edge AI performance.

Summary
The Senior AI Performance Architect will contribute to AI accelerator architecture, optimizing performance, scalability, and power efficiency for AI training and inference systems. This role requires deep expertise in hardware design, AI workloads, and system-level co-design to meet the demands of large-scale AI models and applications.

Responsibilities

  • Analyze trends in ML network design through customer engagements and research to guide software and hardware architecture decisions
  • Collaborate with customers to define hardware requirements for AI training systems
  • Evaluate current GPU and accelerator architectures and propose enhancements for efficient AI model training
  • Architect and design flexible computational blocks supporting various datatypes and precision levels (32/16/8/4/2/1-bit)
  • Design memory technologies and subsystems optimized for capacity, bandwidth, and compute-in/near-memory operations
  • Develop scale-out and scale-up architectures, including switches, NoCs, and co-design with communication collectives
  • Optimize architectures for power efficiency
  • Perform competitive analysis and HW/SW co-design to support GenAI and LLM requirements
  • Define performance models and conduct pre-silicon performance predictions for ML training workloads
  • Analyze performance, area, and power trade-offs for future hardware and software ML algorithms, including SoC components

Requirements

  • Master’s degree in Computer Science, Engineering, Information Systems, or related field
  • 3+ years of hardware engineering experience defining architectures for GPUs or AI accelerators used in model training
  • Deep knowledge of NVIDIA/AMD GPU capabilities and architectures
  • Understanding of LLM architectures and their hardware requirements

Preferred Skills and Experience

  • Strong background in computer architecture, digital circuits, and hardware simulation
  • Knowledge of communication protocols and NoC designs used in AI systems
  • Familiarity with memory technologies for AI workloads
  • Experience modeling hardware and workloads to extract performance and power estimates
  • High-level hardware modeling experience
  • Knowledge of AI training systems such as NVIDIA DGX and NVL72
  • Experience with distributed training frameworks (DeepSpeed, FSDP) for LLMs
  • Proficiency in front-end ML frameworks (TensorFlow, PyTorch) for training models
  • Strong coding skills in C++ and Python
  • Excellent communication, analytical, problem-solving, and debugging skills
  • Ability to adapt and learn in fast-changing environments
  • Understanding of various ML model classes (CNN, RNN, etc.)

Disclaimer
SpringCube curates tech job listings from various company websites to support tech professionals globally.

1. No Endorsement: Job ads on SpringCube do not imply endorsement of their authenticity or quality.
2. No Client Relationship: This company is not a client of SpringCube unless stated.
3. To Apply: Click the “Apply” button to be redirected to the hiring company’s application page for this job.
4. No Liability: SpringCube is not liable for inaccuracies.