Roles

Product ML Staff Engineer

Locations

Santa Clara, CA (HQ)

Contacts

hiring@vlm.run

Description

Building the inference and orchestration layer for production Vision-Language Models. The focus is fast, ergonomic visual inference, reliable structured outputs, and observability for rapid iteration. Shipped projects include Orion (a visual agent for images, video, and documents), mm-ctx (a Unix-style multimodal CLI with a Rust core and Python devex), and vlmbench (a single-file CLI for benchmarking VLM inference across vLLM, Ollama, and SGLang). To apply, email hiring@vlm.run with your GitHub profile and a couple of recent projects.
