About TrueFoundry

TrueFoundry is an enterprise-grade AI/ML platform that accelerates the development, deployment, and scaling of GenAI and ML applications with security, cost efficiency, and cross-cloud flexibility. Trusted by companies like NVIDIA, CVS, Merck, Synopsys, and many more, we're helping enterprises unlock the full value of AI faster.

We're now scaling our Enterprise Outcomes motion — a strategic arm focused on delivering domain-specific solutions that drive business transformation and shape our product roadmap. We're hiring a senior leader to build and lead the engineering arm of this motion.

Role summary

You'll be an individual contributor embedding modern LLM applications into customer workloads and platform features. You'll work closely with senior engineers and product teams to implement production-grade RAG pipelines, prompt/chain design, small agent runtimes (LangGraph/LangChain), vector DB integrations, and monitoring instrumentation on TrueFoundry. This role is hands-on and ideal if you enjoy shipping code, learning the latest agent tools, and turning prototypes into repeatable engineering patterns.

What you'll do

Implement, test, and maintain LLM-powered features and AI Agent / RAG pipelines (prompting, retrieval, vector DB + embeddings).
Build and extend agent workflows using LangGraph / LangChain or equivalent frameworks; help harden state persistence and retry logic.
Integrate models and runtimes via the platform's API (deploy/serve/instrument LLMs, configure token/cost guards).
Write end-to-end tests, small services, and automation to reproduce customer issues and demo solutions.
Instrument observability: logs, traces, latency/cost dashboards and basic alerting for LLM workloads.
Collaborate with product, support, and customers to convert POCs into documented, repeatable patterns.

Must-have

2–3 years of software engineering experience building backend services or ML infrastructure; comfortable with Python and at least one other language.
Practical experience using LLMs (OpenAI/Anthropic/other) and building prompt + retrieval workflows.
Familiarity with at least one vector DB (e.g., Chroma, Pinecone, Weaviate) and embeddings pipelines.
Experience with REST/gRPC APIs, containers (Docker), and basic Kubernetes concepts.
Strong debugging skills and ability to write clean, testable code.

Nice-to-have

Hands-on experience with LangChain or LangGraph and agent architectures.
Experience with RAG evaluation, prompt engineering best practices, or prompt-testing frameworks.
Exposure to production monitoring for LLMs (token usage, cost controls, latency SLAs).
Prior experience deploying to or operating on Kubernetes.

Qualifications & signals we like

BS/MS in CS or related field (or equivalent industry experience).
Public repo or demo showing an LLM project, small agent, or RAG pipeline.
Curiosity about LLM safety, reliability, and cost-efficient deployment.

AI/ML Engineer — LLM & Agent Stack
