Senior Distributed Systems Engineer

Job updated 10 months ago

Job Description

NVIDIA Omniverse™ Cloud is a platform-as-a-service (PaaS) that provides developers and enterprises with a full-stack cloud environment to design, develop, and deploy industrial Omniverse applications. The Omniverse Infrastructure organization develops the hardware and software systems that power Omniverse Cloud.

What you will be doing:

  • Architect, design, build, and optimize distributed systems.
  • Drive end-to-end Omniverse platform optimization, from the hardware level to the application and service levels.
  • Develop infrastructure and microservices to support Omniverse users and developers in the deployment of a wide range of workloads.
  • Address challenges related to compute, networking, and storage resource utilization in a heterogeneous computing environment.
  • Collaborate with multiple Omniverse product teams to understand customer storage and compute requirements and build supporting infrastructure.
  • Collaborate across org boundaries with a diverse set of engineers.
  • Adapt and/or develop performance modeling and analysis tools to identify and optimize performance bottlenecks in Omniverse workloads and drive future system designs.
  • Multitask effectively in a dynamic environment.

Requirements

What we need to see:

  • Master's or PhD in Computer Science or a related field (or equivalent experience).
  • 5+ years of hands-on software engineering experience building large-scale distributed, fault-tolerant systems and services.
  • Strong systems programming skills, including multi-threading, concurrency, caching, and batching.
  • Proficiency in C, C++, and Python. Experience with cloud infrastructure platforms like AWS, Azure, and Google Cloud.
  • Solid technical foundation and a deep understanding of cloud technologies, distributed systems, and microservices architecture.
  • Excellent interpersonal skills and the ability to work successfully with multi-functional teams, principals, and architects across organizational boundaries and geographies.
  • Understanding of virtualization and containerization technologies like Docker, Kubernetes, VMware, KVM, etc.

Ways to stand out from the crowd:

  • Hands-on experience in performance optimization and benchmarking on large-scale distributed systems.
  • Experience in developing large-scale distributed applications and services on supercomputing and/or cloud environments.
  • Experience with NVIDIA GPUs, HPC storage, networking, and cloud computing.
  • In-depth understanding of storage systems, Linux file systems, and RDMA networking.
  • Share references to your code contributions.
NVIDIA
Semiconductor
5001 - people

About us

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics and ignited the era of modern AI. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

