Software Engineer, Training Infrastructure, Machine Learning Platform
Waymo is an autonomous driving technology company with a mission to make it safe and easy for people and things to get where they’re going. Since our start as the Google Self-Driving Car Project in 2009, Waymo has been focused on building the Waymo Driver—The World’s Most Experienced Driver™—to improve everyone's access to mobility while saving thousands of lives now lost to traffic crashes. Our Waymo Driver powers Waymo One, our fully autonomous ride-hailing service, as well as Waymo Via, our trucking and local delivery service. To date, Waymo has driven over 20 million miles autonomously on public roads across 25 U.S. cities and conducted over 20 billion miles of simulation testing.
At Waymo, we are mission-driven and believe deeply in the opportunity of autonomous driving technology to improve mobility and make people's lives better. We are united by purpose and responsibility (for our employees and riders alike). We are looking for kind, committed employees who have integrity, dream big, work together as one team, and create a sense of belonging for one another that is the foundation of our culture. We want each team member to feel welcomed and included in every step of our exciting journey.
The ML Platform team at Waymo provides a set of tools and technologies to support and automate the lifecycle of the machine learning workflow, including feature and experiment management, model development, debugging & evaluation, deployment, and monitoring. These efforts have resulted in making machine learning more accessible to teams at Waymo, including Perception, Behavior Prediction, Planner, Routing, Maps and Research, ensuring greater degrees of consistency and repeatability, and addressing the “last mile” of getting models into production and managing them once they are in place.
Join the Training Infrastructure group within ML Platform, and help us make training and evaluating ML models at Waymo easier, faster, and better! We develop and maintain a set of frameworks and tools on top of TensorFlow that address many of the pain points experienced by ML practitioners: training fast and at scale, discovering optimal hyperparameters, automatically retraining nets on a schedule, computing reliable and noiseless metrics on validation sets, and validating newly trained nets when deployed into the full onboard software stack. We work hand in hand with machine learning experts in all parts of the company and our collaborators across Alphabet.
We are looking for an individual contributor (IC) with a strong background in ML deployment toolchains, ML accelerator (GPU/TPU) profiling and application, and remote procedure call (RPC) service management and optimization. Non-exhaustive examples of our work include:
- Build a comprehensive and user-friendly toolchain for scalable and flexible ML deployment workflows
- Develop and improve our scalable and performant ML training library
- Profile ML performance on accelerators (e.g., GPU and TPU) at both the model level and the system level, and identify performance bottlenecks and optimization opportunities
At a minimum, we’d like you to have:
- BS in Computer Science, Math, or equivalent real-world experience
- Solid Python or C++ skills
- Passion for infrastructure work: building libraries, tools, and pipelines for machine learning practitioners
- Experience with TensorFlow and Keras, including distributed training and distribution strategies
- Experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and existing systems
It’s preferred if you have:
- MS or PhD in Computer Science, Math, or equivalent real-world experience
- Practical expertise in training models on TPUs
- Experience building and architecting large-scale, production-quality backend systems, especially in applied machine learning or data pipelines
- Knowledge of and experience with machine learning algorithms
- Experience with multi-threaded and stream-based programming models
- Familiarity working with RPC services
- Experience with high performance computing or data mining
The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. During the hiring process, your recruiter can share more about the specific salary range for the role location or, if the role can be performed remotely, the specific salary range for your preferred location.
Waymo employees are also eligible to participate in Waymo’s discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements.
While at Waymo, you will enjoy benefits that cover…
Health and wellness: Our people are at the heart of everything we do. At Waymo, you can enjoy top-notch medical, dental and vision insurance, mental wellness support, a Flexible Spending Account (FSA), a Health Savings Account (HSA), on-site physicians and/or nurses in some locations, and special wellness programs.
Financial wellness: Your financial peace of mind is important to us. At Waymo, we offer competitive compensation, bonus opportunities, equity, a generous 401(k) plan or regional retirement plans, 1-on-1 financial coaching, a 529 College Savings Plan and lots of other perks and employee discounts.
Flexibility and time off: Take the time you need to relax and recharge. Enjoy the flexibility to work from another location for four weeks per year. We support an on-site, hybrid work model and offer remote working opportunities, paid time off, Waymo recharge days, bereavement, sick, and parental leave.
Supporting families: When it comes to growing your family or caring for your loved ones, you have our full support. Enhanced leave options include paid parental leave (the birthing parent gets 24 weeks of paid leave, and the non-birthing parent gets 18 weeks of paid leave) and 20 subsidized days of backup childcare or adult/elder care. You also have access to fertility care or adoption support as you grow your family.
Community and personal development: At Waymo, you’ll find a range of opportunities to grow, connect, and give back. We offer education tuition reimbursement, personal and professional development, mentorship, and other ways to connect through Employee Resource Groups (ERGs), other internal groups, and even time off to volunteer.
Cool perks: Access to Google offices, cafes, wellness centers, personal training sessions, massages, haircuts, bike repairs, office transportation, commuter benefits and so much more. To support your wellbeing at home, you can enjoy at-home fitness and cooking classes, and more.
* Please note that while our benefits philosophy is the same in every place Waymonauts work, benefits may vary by office/country and are subject to eligibility requirements.