Roles
Compensation
USD 120,000 - 150,000
$120K-$150K + equity; visa sponsorship available
- Salary period: yearly
- Location basis: US
- Equity: offered
Benefits
- $120K-$150K salary range
- Equity
- Visa sponsorship available
Tech stack
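- TypeScript, Node.js, Electron, native macOS/Windows
- Next.js 15, React 19
- PostgreSQL + Supabase
- OCR, accessibility APIs
- OpenAI/Anthropic/Gemini
- Vercel, GitHub Actions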
Location
REMOTE (US)
Work setup
- Employment: full-time
- Level: Senior
- Remote policy: REMOTE (US)
- Remote scope: country-limited
- Visa: Visa sponsorship available
- Authorization: unclear
Role details
Responsibilities
- Cross-platform desktop capture across macOS + Windows (AX/UIA trees, OCR, clipboard, keyboard/mouse boundaries, file interactions)
- Trace fusion: turning noisy low-level signals into clean human-readable session timelines
- Own the evaluation evidence pipeline (data model, ingestion, exports)
- Implement LLM normalization that summarizes traces without hallucinating or hiding evidence quality
- Privacy/consent/proctoring safety for enterprise customers
- Scale the capture and evaluation infrastructure underlying the AgentIF-OneDay benchmark
- Build and improve cross-platform capture for macOS and Windows: app and window activity, accessibility trees, OCR snapshots, clipboard events, keyboard and mouse activity boundaries, file interactions, and replay metadata.
- Turn noisy low-level signals into clean, human-readable session timelines: what the user did, where they went, what they typed, what they viewed, what files they touched, and what they submitted.
- Design the data model and ingestion pipeline that converts desktop traces, artifacts, recordings, and challenge metadata into reliable evidence for AI competency evaluation.
- Build robust fallback logic across AX/UIA, OCR, keyboard and mouse triggers, clipboard, video segments, and artifact diffs, with explicit confidence and gap tracking.
- Use LLMs carefully to summarize, normalize, and structure session traces without hallucinating or hiding evidence quality.
- Implement session-scoped capture, transparent user consent, sensitive-field handling, permission diagnostics, retention controls, and enterprise-ready auditability.
- Own Electron packaging, native helpers, permission flows, auto-updates, logging, crash handling, and local performance budgets.
- Maintain PostgreSQL and Supabase schemas, migrations, storage flows, async jobs, replay assets, and export formats for evaluation and research partners.
Requirements
- 3+ years shipping production TypeScript + Node.js
- Built or shipped Electron + native helpers
- Strong PostgreSQL (schema design, indexing, data quality)
- Comfortable with OS APIs, permissions, background processes
- Hands-on LLM API experience (structured output, evaluation, hallucination control)
- Uses Claude Code, Cursor, or similar as a daily working medium
Application
Apply by emailing jack@emergences.ai with your resume and a brief note about a system you've built (desktop software, AI infra, data pipelines, evaluation, or low-level debugging). Subject: "Your Name — AI / Systems Engineer". Please mention HN in the email.
- Email subject: Your Name — AI / Systems Engineer
- Portfolio: unclear
- GitHub: not mentioned
- Cover letter: not mentioned
- Apply flow: email
- Canonical URL: https://emergences.ai/careers/ai-engineer
Company context
Build NeoHuman to observe how people use AI tools, files, and workflows during real tasks and convert those sessions into reliable evaluation evidence.
- Product: NeoHuman (desktop-first system for observing AI tool usage and producing evaluation evidence)
- Industry: AI / Systems
- Size: small
- Stage: VC-backed
- Funding: VC-backed
- Open source: unclear
Contact
jack
jack@emergences.ai
Description
Emergences Labs (emergences.ai) is hiring an AI / Systems Engineer for NeoHuman, a desktop-first system that observes how people actually use AI tools, files, and workflows during real tasks and turns those sessions into reliable evaluation evidence. The company is building the capture and evaluation infrastructure underlying the AgentIF-OneDay benchmark.

The role involves owning cross-platform desktop capture across macOS and Windows, fusing traces into clean session timelines, building the evaluation evidence pipeline (data model, ingestion, exports), implementing LLM normalization that summarizes traces without hallucinating or hiding evidence quality, and handling privacy, consent, and proctoring safety for enterprise customers.

They are looking for an AI-native engineer with 3+ years shipping production TypeScript + Node.js, experience building or shipping Electron with native helpers, strong PostgreSQL skills (schema design, indexing, data quality), comfort with OS APIs, permissions, and background processes, and hands-on LLM API experience (structured output, evaluation, hallucination control). Candidates should also use Claude Code, Cursor, or similar tools as a daily working medium.

The stack includes TypeScript, Node.js, Electron, native macOS/Windows, Next.js 15, React 19, PostgreSQL + Supabase, OCR, accessibility APIs, OpenAI/Anthropic/Gemini, Vercel, and GitHub Actions.

To apply, email jack@emergences.ai with a resume and a brief note about a system you've built. Use the subject format "Your Name — AI / Systems Engineer" and mention HN. Full JD: https://emergences.ai/careers/ai-engineer