vLLM Jobs - June 2026

Bjak

Source: Remote OK

Applied AI Engineer

Posted 2 weeks ago

CompanyA1 is building a proactive AI smart assistant for everyday users to bring intelligence to conversations, errands, organising and workflows. Our product focuses on achieving high reliability for long-running workflows, persistent context, and real-world …

Roles

Applied AI Engineer

Tech stack

llama, OpenAI API, Python, PyTorch, Qwen, vLLM, vector database, JAX

Location

Zurich

Work setup

Full-time · Remote (Source: Remote OK, Work arrangement: Remote). Interviews via virtual meetings and/or onsite.

Preferred Networks

LLM Inference Optimization Engineer

Posted 1 month, 2 weeks ago

Preferred Networks is an AI company based in Tokyo working across the stack. They design in-house chips (MN-Core series) and train LLMs (PLaMo series). They are hiring two roles: (1) MN-Core LLM Serving Engine Engineer to build software infrastructure to serv…

Roles

LLM Inference Optimization Engineer, MN-Core LLM Serving Engine Engineer

Tech stack

PLaMo, vLLM, MN-Core, CuPy, LLM inference, Optuna, Hugging Face, Chainer, MN-Core L1000

Location

Tokyo, Japan; Remote in Japan

Compensation

Visa and relocation support provided.

VLM Run

Product ML Staff Engineer

Posted 1 month, 2 weeks ago

Building the inference and orchestration layer for production Vision-Language Models. Focus on fast and ergonomic visual inference, reliable structured outputs, and observability for iteration. Shipped projects include Orion (visual agent for images/video/doc…

Roles

Product ML Staff Engineer

Tech stack

Python, Rust, vLLM, SGLang, Ollama

Location

Santa Clara, CA (HQ)

Freight Brokerage

Founding Backend Architect (Python/PostgreSQL)

Posted 2 months, 1 week ago

We're a freight brokerage in NJ building a self-hosted, AI-first platform to replace our legacy CRM/TMS. The AI compute infrastructure and data foundation are already in place. We need the architect who turns it into a production operating system. This is a s…

Roles

Founding Backend Architect (Python/PostgreSQL)

Tech stack

email processing, Postgres, document ingestion, FastAPI, Python, Linux, event-driven pipelines, vLLM, REST APIs, AI, Microsoft Graph API, CRM migration, LLM inference, Neo4j, Docker

Location

NJ, Remote (US, EST hours)

Compensation

$150K + benefits; 1-week paid onsite onboarding in NJ; equity conversation open once mutual fit is established.

Nuance Labs

Systems Engineer (Real-Time Engine)

Posted 3 months, 2 weeks ago

Nuance Labs is building a human foundation model that understands and expresses human emotion in real-time across speech, facial expression, and body language. Small, fast-moving founding team (MIT, UW, Oxford) valuing in-person collaboration at Seattle HQ. O…

Roles

Systems Engineer (Real-Time Engine), Machine Learning Infra Engineer, Machine Learning Research Engineer

Tech stack

Asyncio, Dagster, Kubernetes, Ray, Python, GPU serving, Rust, TensorRT, vLLM

Location

Seattle, WA

Compensation

Competitive equity; no salary disclosed. Opportunity to shape the company from the ground up, with significant ownership.

NVIDIA

Engineering Manager

Posted 7 months, 2 weeks ago

NVIDIA | vLLM + SGLang | Deep Learning Inference | Remote (North America preferred). Hi everyone, I’m Akbar, Senior Manager of Deep Learning Inference Software at NVIDIA. I lead our engineering efforts around vLLM and SGLang, two of the most widely used open-…

Roles

Engineering Manager, DL Performance Software Engineer - LLM Inference, Deep Learning Inference, Inference, Senior Deep Learning Software Engineer

Tech stack

compiler/runtime, kernel fusion, runtime optimizations, vLLM, scheduling optimizations, SGLang, Blackwell, continuous integration, GPUs, LLM inference, distributed serving, Hopper

Location

Remote (North America preferred), Santa Clara, CA
