SW Low Level (LLM) - Grenoble
France - Grenoble
Permanent
We are looking for a Low-Level Software Engineer to join a Processing In Memory (PIM) project, carrying out SW
exploration and performance modeling for Large Language Model (LLM) inference.
What you will be involved in:
A project exploring the use of emerging PIM technology to accelerate LLM inference
Advanced research topics – targeting the next generation of edge devices and inference acceleration boards
An R&D team in the Grenoble research institute, focused on key technologies for future systems and on
promoting their use to improve product competitiveness.
Work areas include: deep LLM analysis with a focus on the decode phase and KV cache, performance
profiling on innovative inference accelerators, optimization of LLM execution on a PIM-based
inference accelerator, and interaction with the NPU and host CPU
Application areas: Large Language Model inference acceleration for edge devices and servers
This will support your start in our team:
3+ years of experience as a software engineer
Good experience and background in performance profiling and low-level libraries/drivers
Good experience in C/C++ and Python for modeling and performance profiling
Knowledge of LLM inference would be a plus
Strong learning ability to quickly master new areas of knowledge
A strong history of thinking creatively and communicating effectively in a global team environment