MLOps · Model Deployment · 15 questions · 20 min

Model Deployment MCQ · test your MLOps knowledge

From containerization to canary releases – 15 questions covering serving frameworks, scaling, monitoring, and production best practices.

Easy: 5 Medium: 6 Hard: 4
Topics: Docker · Kubernetes · Model Serving · Monitoring

Model Deployment: from training to production

Deploying machine learning models involves making them available for inference in production environments. This MCQ covers essential topics: serving frameworks (TensorFlow Serving, TorchServe, ONNX Runtime), containerization with Docker, orchestration (Kubernetes), deployment strategies (shadow, canary, A/B), and monitoring for drift and performance.

Why deployment matters

A model is only valuable if it can be reliably integrated into business applications. Proper deployment ensures scalability, low latency, and continuous validation.

Deployment & MLOps glossary – key concepts

Containerization (Docker)

Packages the model and its dependencies into a portable container image, ensuring consistent behavior across development, staging, and production environments.

Kubernetes

Orchestrates containers; handles scaling, load balancing, and rolling updates.

Model Serving

Dedicated frameworks such as TensorFlow Serving, TorchServe, and NVIDIA Triton optimize inference with request batching, multi-model hosting, and versioned endpoints.

Deployment strategies

Blue‑green, canary, shadow testing, and A/B tests – different ways to introduce new model versions safely.
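A canary release can be sketched in a few lines. In this illustrative snippet (the function and names are not from any library), a fixed share of traffic is routed to the new model version by hashing the user id, so each user is pinned to the same version across requests:

```python
import zlib

def route(user_id: str, canary_percent: int = 10) -> str:
    """Deterministically route a fixed share of users to the canary model.

    Hashing the user id (rather than sampling randomly per request) keeps
    each user on the same model version, which makes results comparable.
    """
    bucket = zlib.crc32(user_id.encode()) % 100
    return "canary" if bucket < canary_percent else "stable"

# Sticky assignment: the same user always hits the same version.
assert route("user-42") == route("user-42")
```

If the canary's error rate or latency degrades, dropping `canary_percent` to 0 rolls all traffic back to the stable version without a redeploy.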

Model Monitoring

Track prediction drift, data drift, latency, error rates. Tools: Prometheus, Grafana, Evidently AI.
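Data drift is often quantified with the Population Stability Index (PSI), which tools like Evidently AI compute out of the box. A minimal stdlib-only sketch for a single numeric feature, with bin edges taken from the training sample (the usual rules of thumb: PSI < 0.1 means no drift, > 0.25 means significant drift):

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample (expected)
    and a production sample (actual) of one numeric feature."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            counts[sum(x > e for e in edges)] += 1  # bin index
        # Small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In production this would run on a schedule against recent inference inputs, with the resulting score exported as a metric (e.g. to Prometheus) and alerted on.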

ONNX / TensorRT

ONNX is a framework‑neutral model format; TensorRT compiles and optimizes models for NVIDIA GPUs. Together they enable cross‑platform, high‑performance inference.

Model Versioning

Managing multiple model versions simultaneously, typically through a model registry such as MLflow or DVC.
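The core idea of a registry fits in a short sketch: versioned artifacts plus an alias for whichever version is currently serving. The class below is illustrative only, loosely mirroring MLflow's registered‑model and stage/alias concepts rather than its actual API:

```python
class ModelRegistry:
    """Minimal in-memory sketch of a model registry."""

    def __init__(self):
        self._versions = {}    # model name -> {version: artifact}
        self._production = {}  # model name -> version currently serving

    def register(self, name, artifact):
        """Store an artifact under the next version number for this model."""
        versions = self._versions.setdefault(name, {})
        version = len(versions) + 1
        versions[version] = artifact
        return version

    def promote(self, name, version):
        """Point the 'production' alias at a registered version."""
        if version not in self._versions.get(name, {}):
            raise KeyError(f"{name} v{version} is not registered")
        self._production[name] = version

    def production_model(self, name):
        """Return the artifact currently serving production traffic."""
        return self._versions[name][self._production[name]]
```

Because old versions stay registered, rollback is just promoting a previous version, which is exactly what makes blue‑green and canary releases cheap to undo.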

# Example: Deploy a Keras model with TensorFlow Serving
docker pull tensorflow/serving
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
# Inference via REST: POST http://localhost:8501/v1/models/my_model:predict
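The REST call above can be issued from Python's standard library. TensorFlow Serving's predict endpoint expects a JSON body of the form `{"instances": [...]}`; the input vector below is a placeholder, and actually sending the request assumes the container from the example is running locally:

```python
import json
from urllib import request

# TensorFlow Serving's REST predict API expects {"instances": [...]}.
payload = json.dumps({"instances": [[1.0, 2.0, 5.0]]}).encode()

req = request.Request(
    "http://localhost:8501/v1/models/my_model:predict",
    data=payload,
    headers={"Content-Type": "application/json"},
)
# resp = request.urlopen(req)                 # needs the server running
# print(json.load(resp)["predictions"])
```

For latency‑critical paths, the same server also exposes a gRPC endpoint (port 8500 by default), which avoids JSON serialization overhead.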

Interview tip: Be prepared to compare deployment strategies (canary vs blue‑green), explain how to monitor for concept drift, and discuss trade‑offs between REST and gRPC for inference.

Common model deployment interview questions

  • What is the difference between batch and online inference?
  • How would you A/B test two versions of a model?
  • Explain how Kubernetes manages rolling updates of a model container.
  • What metrics would you monitor for a production model?
  • How does ONNX help with model deployment?
  • Describe a canary deployment and its benefits.
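For the first question in the list, the distinction is easiest to see in code. In this sketch, `predict` is a stand‑in for a real model; the names are illustrative:

```python
def predict(features):
    # Stand-in for a real model's scoring function.
    return sum(features)

def predict_online(features):
    """Online inference: one request, one prediction, latency-sensitive
    (served behind an API, e.g. fraud scoring at checkout)."""
    return predict(features)

def predict_batch(rows):
    """Batch inference: score many rows on a schedule, throughput-sensitive
    (e.g. a nightly job writing churn scores to a table)."""
    return [predict(r) for r in rows]
```

Online serving optimizes tail latency per request; batch serving optimizes cost and throughput, and tolerates results being minutes or hours old.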