AI Inference Engineer
Posted 22 days ago
USD 200,000 - 350,000 / year
We are looking for an AI Inference Engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will work on large-scale deployment of machine learning models for real-time inference.
Responsibilities
- Develop APIs for AI inference that will be used by both internal and external customers
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM inference optimizations
Qualifications
- Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
- Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization)
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA