City: Rochester
State: MN
Remote: Yes
Department: Mayo Clinic Platform
Why Mayo Clinic
Mayo Clinic is top-ranked in more specialties than any other care provider according to U.S. News & World Report. As we work together to put the needs of the patient first, we are also dedicated to our employees, investing in competitive compensation and comprehensive benefit plans – to take care of you and your family, now and in the future. And with continuing education and advancement opportunities at every turn, you can build a long, successful career with Mayo Clinic.
Benefits Highlights
- Medical: Multiple plan options.
- Dental: Delta Dental or reimbursement account for flexible coverage.
- Vision: Affordable plan with national network.
- Pre-Tax Savings: HSA and FSAs for eligible expenses.
- Retirement: Competitive retirement package to secure your future.
Responsibilities
The Mayo Clinic Platform AI team is currently looking for a Senior Platform Engineer to join our efforts in building and advancing our model hosting capabilities for the Mayo Clinic Platform (MCP). In this role, you will architect and develop a comprehensive model hosting framework supporting diverse model types, from traditional ML and computer vision models to large language and generative AI models, serving both internal Practice teams and external customers. You will collaborate closely with data scientists, product managers, and engineering teams to drive innovation, ensure scalable system architecture, and deliver high-impact capabilities that power next-generation AI applications.
Key Responsibilities
- Model Hosting Framework Development: Architect and implement scalable model hosting solutions that enable seamless deployment and serving of ML models from R&D to production environments.
- Model Lifecycle Management: Build and optimize systems for model versioning, deployment, monitoring, and rollback capabilities across diverse model types and serving patterns.
- Platform Scalability: Design modular, extensible hosting components and services to handle varying model sizes, traffic patterns, and computational requirements.
- Performance & Reliability: Implement robust serving infrastructure with auto-scaling, load balancing, and fault tolerance to ensure high availability and low-latency inference.
- Collaboration: Work cross-functionally with data scientists, MLOps engineers, and product teams to translate model serving requirements into scalable technical solutions.
- Code Quality & Best Practices: Perform thorough code reviews, maintain high standards for code quality, and champion best practices in ML infrastructure engineering.
- Optimization & Efficiency: Identify bottlenecks in model serving pipelines and implement solutions to improve throughput, reduce latency, and optimize resource utilization.
- Monitoring & Observability: Develop comprehensive monitoring, logging, and alerting systems for model performance, drift detection, and infrastructure health.
- Documentation: Create and maintain technical documentation, including deployment guides, API specifications, and troubleshooting runbooks for development and operations teams.
- Technology Leadership: Stay current with model serving technologies, containerization advances, and ML platform innovations to continuously enhance the hosting framework.
Qualifications
- Bachelor's degree in a relevant information technology field or a minimum of 7 years of direct full-stack engineering experience with increasing complexity.
- 3-5 years of experience working in diverse environments utilizing Agile principles of software development.
- 5+ years of professional software development experience, with at least 2 years in a senior capacity, ideally in ML infrastructure, distributed systems, or platform engineering.
- Strong proficiency in Python, Java, Go, or similar languages, with demonstrated experience building production-grade ML serving systems.
- Hands-on experience with model serving technologies (e.g., TorchServe, TensorFlow Serving, Triton, KServe) and REST/gRPC API development.
- Solid understanding of Docker, Kubernetes, and container orchestration for ML workloads, including resource management and scaling strategies.
- Hands-on experience with cloud platforms (AWS, Azure, GCP) and infrastructure-as-code. Familiarity with CI/CD pipelines for ML deployments.
- Working knowledge of model optimization techniques, caching strategies, and infrastructure performance tuning.
- Excellent written and verbal communication skills, able to translate complex technical requirements across diverse stakeholders.
Exemption Status: Exempt
Compensation Detail: $125,444 - $181,875 / year
Benefits Eligible: Yes
Schedule: Full Time
Hours/Pay Period: 80
Schedule Details: Monday - Friday, Normal Business Hours
100% Remote. This position may work remotely from any location within the US.
10%+ travel may be required.
This vacancy is not eligible for sponsorship; we will not sponsor or transfer visas for this position.
Weekend Schedule: Not Applicable
International Assignment: No
Site Description
Just as our reputation has spread beyond our Minnesota roots, so have our locations. Today, our employees are located at our three major campuses in Phoenix/Scottsdale, Arizona; Jacksonville, Florida; and Rochester, Minnesota; at Mayo Clinic Health System campuses throughout Midwestern communities; and at our international locations. Each Mayo Clinic location is a special place where our employees thrive in both their work and personal lives.
Learn more about what each unique Mayo Clinic campus has to offer, and where your best fit is.
Equal Opportunity
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity, sexual orientation, national origin, protected veteran status or disability status. Learn more about "EOE is the Law". Mayo Clinic participates in E-Verify and may provide the Social Security Administration and, if necessary, the Department of Homeland Security with information from each new employee's Form I-9 to confirm work authorization.
Recruiter: Julie Melton