Congratulations! Your model is ready for deployment to a production ML pipeline. This section of the course introduces testing guidelines for ML pipelines. It describes the guidelines without demonstrating them, because such a demonstration is not possible in a sandboxed environment.
You will learn about:
- Writing appropriate tests for launch and production.
- Detecting failure modes in your ML pipeline using tests.
- Evaluating your model quality in production.
What is an ML Pipeline?
An ML pipeline consists of several components, as the diagram shows. We’ll become familiar with these components later. For now, notice that the “Model” (the black box) is a small part of the pipeline infrastructure necessary for production ML.
Role of Testing in ML Pipelines
In software development, the ideal workflow follows test-driven development (TDD). In ML, however, starting with tests is not straightforward, because your tests depend on your data, model, and problem. For example, before training your model, you cannot write a test to validate the loss, since you do not yet know what loss is achievable. Instead, you discover the achievable loss during model development and then test new model versions against that achievable loss.
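As a rough illustration of testing a new model version against the achievable loss, the check below compares a candidate's loss with a baseline recorded during model development. This is a minimal sketch, not a prescribed method: the function name `loss_within_tolerance`, the relative `tolerance` parameter, and the example loss values are all assumptions made for illustration.

```python
def loss_within_tolerance(new_loss: float, baseline_loss: float,
                          tolerance: float = 0.05) -> bool:
    """Return True if new_loss is no more than `tolerance` (relative)
    worse than baseline_loss.

    baseline_loss is the achievable loss found during model development
    (hypothetical value supplied by the caller). Lower loss is better.
    """
    if baseline_loss <= 0:
        raise ValueError("baseline_loss must be positive for a relative check")
    # Allow the new version to be at most `tolerance` worse than the baseline.
    return new_loss <= baseline_loss * (1.0 + tolerance)


# Example: a baseline loss of 0.30 was discovered during development (assumed).
candidate_ok = loss_within_tolerance(new_loss=0.31, baseline_loss=0.30)
candidate_regressed = loss_within_tolerance(new_loss=0.40, baseline_loss=0.30)
```

A relative tolerance is only one possible choice; depending on how noisy training is, an absolute margin or a comparison averaged over several runs may be more suitable.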
You need tests for:
- Validating input data.
- Validating feature engineering.
- Validating the quality of new model versions.
- Validating serving infrastructure.
- Testing integration between pipeline components.
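To make the first item in the list above concrete, here is a small sketch of an input-data check that compares each example against an expected schema. The schema, the feature names (`age`, `income`), their ranges, and the helper `validate_example` are hypothetical and chosen only for illustration. A real pipeline would derive the schema from its own data.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema: feature name -> (expected type, minimum, maximum).
SCHEMA: Dict[str, Tuple[type, float, float]] = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1e7),
}


def validate_example(example: Dict[str, Any],
                     schema: Dict[str, Tuple[type, float, float]] = SCHEMA
                     ) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for name, (expected_type, low, high) in schema.items():
        if name not in example:
            errors.append(f"missing feature: {name}")
            continue
        value = example[name]
        if not isinstance(value, expected_type):
            errors.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if not (low <= value <= high):
            errors.append(f"{name}: value {value} outside [{low}, {high}]")
    return errors


good = validate_example({"age": 30, "income": 50000.0})
bad = validate_example({"age": -1, "income": 50000.0})
```

Returning a list of problems rather than raising on the first one lets a test report every violation in a batch at once. The same pattern could be extended to the outputs of feature engineering.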