Kubernetes Sr. Technical Lead
Job Summary
- Location: California
- Project role: Kubernetes Sr. Technical Lead
- Primary skill: Kubernetes
- Secondary skills: Azure DevOps, DevOps
- No. of positions: 4
- Pay range: $92,000 to $141,000
Job description:
SR Number: | ERS/ERS/2025/2766752 |
BR Number: | 1620751BR |
Job Title: | DevOps Engineer – LLM & GPU Inference Services |
No. of Positions: | 1 |
Hiring Mode: | Contract -TP |
Location: | Remote |
Customer: | DigitalOcean, LLC |
Bill Rate: | Junior: $55-$65/hr max. all-inclusive C2C; Intermediate: $70-$80/hr max. all-inclusive C2C; Senior: $85-$95/hr max. all-inclusive C2C |
Skills Matrix (SR Number: ERS/ERS/2025/2766752, BR Number: 1620751BR):
| Skill | Last Used | Experience in Years/Months | Hands-on Exp. (Yes/No) |
| Cloud environments | | | |
| Large Language Models (LLMs), particularly hosting them to run inference | | | |
| Distributed services experience | | | |
| GPU (Dedicated Inference Service) | | | |
Job Description:
We are looking for developers with general cloud services and distributed services experience, with LLM experience as a secondary skill. GPU experience (Dedicated Inference Service) is now low on the list of preferred skills.
Required Skills:
- Deep experience building services on distributed systems in modern cloud environments, e.g., containerization (Kubernetes, Docker), infrastructure as code, CI/CD pipelines, APIs, authentication and authorization, data storage, deployment, logging, monitoring, and alerting
- Experience working with Large Language Models (LLMs), particularly hosting them to run inference
- Strong verbal and written communication skills. Your job will involve communicating with local and remote colleagues about technical subjects and writing detailed documentation.
- Experience building or using benchmarking tools for evaluating LLM inference across various model, engine, and GPU combinations
- Familiarity with LLM performance metrics such as prefill throughput, decode throughput, time per output token (TPOT), and time to first token (TTFT)
- Experience with one or more inference engines: e.g., vLLM, SGLang, and Modular Max
- Familiarity with one or more distributed inference serving frameworks, e.g., llm-d, NVIDIA Dynamo, and Ray Serve
- Experience with AMD and NVIDIA GPUs, using software like CUDA, ROCm, AITER, NCCL, RCCL, etc.
- Knowledge of distributed inference optimization techniques such as tensor/data parallelism, KV-cache optimizations, and smart routing
Responsibilities:
- Develop and maintain an inference platform for serving large language models, optimized for the various GPU platforms they will run on.
- Work on complex AI and cloud engineering projects through the entire product development lifecycle (PDLC): ideation, product definition, experimentation, prototyping, development, testing, release, and operations.
- Build tooling and observability to monitor system health, and build auto-tuning capabilities.
- Build benchmarking frameworks to test model-serving performance and guide system and infrastructure tuning efforts.
- Build native cross-platform inference support across NVIDIA and AMD GPUs for a variety of model architectures.
- Contribute to open-source inference engines to make them perform better on the DigitalOcean cloud.
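For candidates unfamiliar with the metrics named above: TTFT (time to first token) and TPOT (time per output token) are standard LLM-serving latency measures. A minimal illustrative sketch (not part of this posting; the function name and timestamps are hypothetical) of how they are typically derived from per-token wall-clock timestamps:

```python
# Illustrative sketch: computing common LLM inference metrics from
# per-token timestamps for a single request. All names are examples,
# not part of any specific benchmarking tool.

def inference_metrics(request_start: float, token_times: list[float]) -> dict:
    """Return TTFT, TPOT, and decode throughput for one request.

    TTFT: delay from request start to the first generated token
          (dominated by the prefill phase).
    TPOT: average interval between output tokens after the first.
    Decode throughput: tokens generated per second after the first token.
    """
    ttft = token_times[0] - request_start
    n = len(token_times)
    if n > 1:
        decode_time = token_times[-1] - token_times[0]
        tpot = decode_time / (n - 1)
        decode_tps = (n - 1) / decode_time
    else:
        tpot = decode_tps = 0.0
    return {"ttft_s": ttft, "tpot_s": tpot, "decode_tok_per_s": decode_tps}

# Example: first token arrives 0.35 s after the request, then four more
# tokens spaced roughly 20 ms apart.
metrics = inference_metrics(0.0, [0.35, 0.37, 0.39, 0.41, 0.43])
# TTFT 0.35 s, TPOT about 0.02 s, decode throughput about 50 tok/s
```

Benchmarking frameworks of the kind described above typically aggregate these per-request numbers into percentiles (e.g., p50/p99 TTFT) across model, engine, and GPU combinations.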
Compensation and Benefits
A candidate’s pay within the range will depend on their work location, skills, experience, education, and other factors permitted by law. This role may also be eligible for performance-based bonuses subject to company policies. In addition, this role is eligible for the following benefits subject to company policies: medical, dental, vision, pharmacy, life, accidental death & dismemberment, and disability insurance; employee assistance program; 401(k) retirement plan; 10 days of paid time off per year (some positions are eligible for need-based leave with no designated number of leave days per year); and 10 paid holidays per year.
Disclaimer
HCLTech is an equal opportunity employer, committed to providing equal employment opportunities to all applicants and employees regardless of race, religion, sex, color, age, national origin, pregnancy, sexual orientation, physical disability or genetic information, military or veteran status, or any other protected classification, in accordance with federal, state, and/or local law. Should any applicant have concerns about discrimination in the hiring process, they should provide a detailed report of those concerns to secure@hcltech.com for investigation.