LLM Inference Jobs - May 2026
-
Preferred Networks
LLM Inference Optimization Engineer
Posted 1 month, 2 weeks ago
Preferred Networks is an AI company based in Tokyo working across the stack. They design in-house chips (MN-Core series) and train LLMs (PLaMo series). They are hiring two roles: (1) MN-Core LLM Serving Engine Engineer to build software infrastructure to serv…
Location
Tokyo, Japan, Remote in Japan
Compensation
Visa and relocation support provided.
-
Freight Brokerage
Founding Backend Architect (Python/PostgreSQL)
Posted 2 months, 1 week ago
We're a freight brokerage in NJ building a self-hosted, AI-first platform to replace our legacy CRM/TMS. The AI compute infrastructure and data foundation are already in place. We need the architect who turns it into a production operating system. This is a s…
Location
NJ, Remote (US, EST hours)
Compensation
$150K + benefits; 1-week paid onsite onboarding in NJ; equity conversation open once mutual fit is established.
-
NVIDIA
Engineering Manager
Posted 7 months, 2 weeks ago
NVIDIA | vLLM + SGLang | Deep Learning Inference | Remote (North America preferred) Hi everyone — I’m Akbar, Senior Manager of Deep Learning Inference Software at NVIDIA. I lead our engineering efforts around vLLM and SGLang, two of the most widely used open-…
Location
Remote (North America preferred), Santa Clara, CA
-
iGent AI
Full-Stack
Posted 7 months, 2 weeks ago
Building coding agent systems and an agentic cloud. Small senior team (ex-DeepMind, OpenAI, Microsoft Research, Amazon, Cambridge University; multiple PhDs). Work includes distributed systems, OS/sandboxing, ML and LLM inference/post-training, long-context te…
Location
London, UK
Compensation
Big Tech salary + options
-
iGent AI
Full-Stack
Posted 9 months, 2 weeks ago
Hi! Co-founder of iGent AI here. We build agentic systems for large-scale, industrial projects, and an agentic cloud. Proof: our agent built a Redis-compatible database in Rust, with no human coding, in 70 man-hours. Why join: Small, senior team: ex-DeepMind,…
Location
London, UK
Compensation
Big Tech salary + options.
-
Proton
AI Engineer
Posted 9 months, 2 weeks ago
We recently launched Lumo, a privacy-friendly ChatGPT alternative, by the makers of Proton Mail & Proton VPN. We are looking for curious and talented people to help us grow Lumo with any of these skills: ML engineering, LLM inference, GPU infra & devops, Fron…
Location
Geneva, Paris, London, Barcelona, Prague, Taipei, EU Remote