What you do at AMD changes everything
At AMD, we push the boundaries of what is possible. We believe in changing the world for the better by driving innovation in high-performance computing, graphics, and visualization technologies - building blocks for gaming, immersive platforms, and the data center.
Developing great technology takes more than talent: it takes amazing people who understand collaboration and respect, and who will go the "extra mile" to achieve unthinkable results. It takes people who have the passion and desire to disrupt the status quo, push boundaries, deliver innovation, and change the world. If you have this type of passion, we invite you to take a look at the opportunities available to come join our team.
Senior HPC ML Applications Engineer - GPU & CPU
The Role:
We are looking for our next team member to join our growing HPC Data Center GPU (DCGPU) team to enable and optimize HPC applications and to provide performance and systems expertise to our internal partners and customers, from before first silicon through production of systems and solutions based on our EPYC processors and Instinct (MI) GPU accelerators.
A 'hands-on' role working independently and with other AMD engineers to tackle technical HPC functional and performance issues, collaborating with our customer-facing organizations, our internal R&D and other key engineering groups. Working across a variety of partners on the bring-up, design, debug and performance of the world's largest HPC systems, making a significant impact at a global level, including working with the 'Mega Datacenters' and HPC cloud providers. Growing the success and market penetration of the AMD GPU as it applies to HPC.
The Person:
Very strong solution-oriented mindset
Expertise in HPC application performance testing and debug on CPU and/or GPU
Strong technical ownership and ability to lead technical relationships with both customers and HPC partners
Ability to independently prioritize opportunities to deliver results on time
Proven success establishing relationships internally and across a network of customers and partners
Excellent verbal and written communication skills
Key Responsibilities:
Seek maximum HPC performance while achieving the highest quality on AMD EPYC plus Instinct systems through a combination of performance optimization, HPC workload debug and characterization, compilers, math libraries and lower-level AMD-internal toolsets
Feed back performance bottlenecks and functional issues to the relevant engineering groups during bring-up to improve quality and performance
Partner with our internal development and validation teams, supporting them with a deeper level of HPC application and system-level expertise
Attend and lead high-value technical HPC discussions to present the overall AMD GPU proposition and its application to HPC
Technically own and resolve customer and partner issues; submit JIRA tickets and drive resolution
Collaborate on future architectures, functional validation and performance testing
Attend internal working groups to resolve engineering issues; contribute to the debug and testing of unreleased GPU-based solutions and their readiness for HPC workloads
Document and publish system health and performance results, as well as the procedures you have developed and their automation
Preferred Experience:
Proven HPC application experience balanced with partner or customer-facing experience
HPC application functional bring-up, triage, performance profiling, monitoring tools, and software performance optimization
Expertise building large codes from source with appropriately linked math libraries and optimized compiler flags, working with different compilers and MPI libraries
Understanding of system-level hardware and how its configuration affects performance, such as InfiniBand and shared parallel filesystems
Proven understanding of baseline testing of synthetic codes: HPL, STREAM, DGEMM, HPCG, HPCC
Linux administration; understanding setup for HPC middleware
Nice to Haves:
Experience working on very large codes, such as weather codes, and the associated tuning for greater scalability
Any experience understanding/inspecting/writing assembly
Understanding of memory and cache hierarchy and methods to query performance/latency at each level
Understanding HPC dataflow down to the register level
Benefits offered are described here.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies or fee-based recruitment services. AMD and its subsidiaries are equal opportunity employers. We consider candidates regardless of age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status. Please click here for more information.