LLM inference Jobs

Preferred Networks

LLM Inference Optimization Engineer

Posted 1 month, 2 weeks ago

Preferred Networks is an AI company based in Tokyo working across the stack. They design in-house chips (MN-Core series) and train LLMs (PLaMo series). They are hiring two roles: (1) MN-Core LLM Serving Engine Engineer to build software infrastructure to serv…

Roles

LLM Inference Optimization Engineer, MN-Core LLM Serving Engine Engineer

Tech stack

PLaMo, vLLM, MN-Core, CuPy, LLM inference, Optuna, Hugging Face, Chainer, MN-Core L1000

Location

Tokyo, Japan, Remote in Japan

Compensation

Visa and relocation support provided.

Freight Brokerage

Founding Backend Architect (Python/PostgreSQL)

Posted 2 months, 1 week ago

We're a freight brokerage in NJ building a self-hosted, AI-first platform to replace our legacy CRM/TMS. The AI compute infrastructure and data foundation are already in place. We need the architect who turns it into a production operating system. This is a s…

Roles

Founding Backend Architect (Python/PostgreSQL)

Tech stack

email processing, Postgres, document ingestion, FastAPI, Python, Linux, event-driven pipelines, vLLM, REST APIs, AI, Microsoft Graph API, CRM migration, LLM inference, Neo4j, Docker

Location

NJ, Remote (US, EST hours)

Compensation

$150K + benefits; 1-week paid onsite onboarding in NJ; equity conversation open once mutual fit is established.

NVIDIA

Engineering Manager

Posted 7 months, 2 weeks ago

NVIDIA | vLLM + SGLang | Deep Learning Inference | Remote (North America preferred) Hi everyone — I’m Akbar, Senior Manager of Deep Learning Inference Software at NVIDIA. I lead our engineering efforts around vLLM and SGLang, two of the most widely used open-…

Roles

Engineering Manager, DL Performance Software Engineer - LLM Inference, Deep Learning Inference, Inference, Senior Deep Learning Software Engineer

Tech stack

compiler/runtime, kernel fusion, runtime optimizations, vLLM, scheduling optimizations, SGLang, Blackwell, continuous integration, GPUs, LLM inference, distributed serving, Hopper

Location

Remote (North America preferred), Santa Clara, CA

iGent AI

Full-Stack

Posted 7 months, 2 weeks ago

Building coding agent systems and an agentic cloud. Small senior team (ex-DeepMind, OpenAI, Microsoft Research, Amazon, Cambridge University; multiple PhDs). Work includes distributed systems, OS/sandboxing, ML and LLM inference/post-training, long-context te…

Roles

Full-Stack, Backend / Eng Lead, LLM inference & post-training, Agent Infrastructure, DevRel & GTM

Tech stack

ML, OS sandboxing, OSS/community tooling, long context, performance optimization, vLLM, backend, full-stack, LLM inference, orchestration, observability, post-training, distributed systems, filesystems, cloud sandboxes, RL/RLVR

Location

London, UK

Compensation

Big Tech salary + options

iGent AI

Full-Stack

Posted 9 months, 2 weeks ago

Hi! Co-founder of iGent AI here. We build agentic systems for large-scale, industrial projects, and an agentic cloud. Proof: our agent built a Redis-compatible database in Rust, with no human coding, in 70 man-hours. Why join: Small, senior team: ex-DeepMind,…

Roles

Full-Stack, DevRel & GTM, Staff Engineer, LLM inference & post-training, Agent Infrastructure, Senior Engineer, Backend, Eng Lead

Tech stack

ML, OS sandboxing, Rust, LLM inference, distributed systems

Location

London, UK

Compensation

Big Tech salary + options.

Proton

AI Engineer

Posted 9 months, 2 weeks ago

We recently launched Lumo, a privacy-friendly ChatGPT alternative, by the makers of Proton Mail & Proton VPN. We are looking for curious and talented people to help us grow Lumo with any of these skills: ML engineering, LLM inference, GPU infra & devops, Fron…

Roles

AI Engineer

Tech stack

ML engineering, LLM inference, GPU infra, devops, Frontend (TypeScript, React), Backend (Rust, PHP)

Location

Geneva, Paris, London, Barcelona, Prague, Taipei, EU Remote

LLM inference Jobs - May 2026
