SW Low Level (LLM) - Grenoble
France - Grenoble
Permanent
We are looking for a Low-Level Software Engineer to join a Processing In Memory (PIM) project, carrying out SW
exploration and performance modeling for Large Language Model (LLM) inference.
What you will be involved in:
A project exploring the use of emerging PIM technology to accelerate LLM inference
Advanced research topics – targeting the next generation of edge devices and inference acceleration boards
An R&D team in the Grenoble research institute, focused on key technologies for future systems and on
promoting their use to improve product competitiveness.
Work areas include: deep LLM analysis with a focus on the decode phase and KV cache, performance
profiling on innovative inference accelerators, optimization of LLM execution on a PIM-based
inference accelerator, and interaction with the NPU and host CPU
Application areas: Large Language Model inference acceleration for edge devices and servers
This will support your start in our team:
3+ years of experience as a software engineer
Good experience and background in performance profiling and low-level libraries/drivers
Good experience in C/C++ and Python for modeling and performance profiling
Knowledge of LLM inference would be a plus
Strong learning ability to quickly master new areas of knowledge
A strong history of thinking creatively and communicating effectively in a global team environment