SW Low Level (LLM) - Grenoble

France - Grenoble
Permanent
We are looking for a Low Level Software Engineer to join a Processing In Memory (PIM) project, carrying out SW
exploration and performance modeling for Large Language Model (LLM) inference.

What will you be involved in?
A project exploring the use of emerging PIM technology to accelerate LLM inference
Advanced research topics targeting the next generation of edge devices and inference acceleration boards
An R&D team in the Grenoble research institute, focused on key technologies for future systems and on
promoting their use to improve product competitiveness
Work areas include: deep LLM analysis with a focus on the decode phase and the KV cache, performance
profiling on innovative inference accelerators, optimization of LLM execution on a PIM-based
inference accelerator, and interaction with the NPU and host CPU
Application areas: Large Language Model inference acceleration for edge devices and servers

This will support your start in our team:
3+ years of experience as a software engineer
Good experience and background in performance profiling and low-level libraries/drivers
Good experience in C/C++ and Python for modeling and performance profiling
Knowledge of LLM inference would be a plus
Strong learning ability to quickly master new areas of knowledge
A strong history of thinking creatively and communicating effectively in a global team environment