As an ML DevOps Engineer, you will collaborate on the cloud infrastructure for Machine Learning and be responsible for translating the needs and requirements (security, data governance and performance) of all interested departments into solid and reliable machine learning products in production.
- Build and maintain data pipelines, orchestrators and APIs in Python using tools like S3, Airflow, Kafka and FastAPI.
- Design and build effective, user-friendly infrastructure, tooling, and automation to accelerate Machine Learning and development within the team and wider company.
- Collaborate with teams to support the ML technical roadmap.
- Collaborate with Machine Learning Engineers, Product Managers and Platform Engineers to develop tools to support experimentation, training and production operations.
- Offer support and troubleshooting assistance for the ML pipeline, while continuously improving stability, monitoring and alerting along the way.
- Collaborate on managing the AWS stack which comprises all ML resources.
- Build and maintain systems employing an Infrastructure-as-Code approach.
- Establish standards and practices around MLOps, including governance, compliance, and data security.
- Collaborate on managing ML infrastructure costs.
Knowledge and Experience
- Degree or higher degree in a STEM area (or equivalent work experience).
- 2+ years of experience with ML infrastructure and ML DevOps.
- 3+ years of overall engineering experience in distributed systems and data infrastructure.
- 2+ years of experience building and managing streaming (Kafka) pipelines.
- 3+ years of experience coding in Python.
- 2+ years of experience defining REST APIs (FastAPI) and microservices.
- Experience working with ML engineers to build tooling and automation to support the entire ML engineering lifecycle, from experimentation to production operation.
- Experience with Kubernetes, Docker Swarm and ML CI/CD workflows.
- 2+ years of experience with AWS or other public cloud platforms (GCP, Azure, etc.).
- Excellent understanding of software architectures and best practices.
- Knowledge of agile practices such as Kanban, XP or Scrum.
- Experience programming in JVM languages (Java/Kotlin).
- Knowledge of serverless and on-demand cloud services.
This is a fantastic time to get on board with a business that plans to double in size over the next 4 years. They are a close-knit and collaborative team, backed by the larger group, and offer great training and development.