Weekly Digest #70

Weekly Dev Blog
Apr 11, 2022

Articles

Ship to Production, Darkly: Moving Fast, Staying Safe with ML Deployments

ML change management (dark rollout)

Step 0: Pre-production iteration

Step 1: Production: Shadow traffic, 1% volume

Step 2: Production: Shadow traffic, 100% volume

Step 3: Experiment: Incumbent model vs. new model

This boils down to:

  • Pre-production iteration
  • Shadow testing on live data (covering both model performance and infrastructure performance)
  • Go live
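
A minimal sketch of the shadow-traffic idea, not the article's actual implementation: the incumbent model always serves the response, while a configurable fraction of requests (1% in Step 1, 100% in Step 2) is also sent to the new model and logged for later comparison. `ShadowRouter`, `shadow_fraction`, and the model callables are hypothetical names for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ShadowRouter:
    """Serve from the incumbent; mirror a fraction of traffic to the candidate."""

    incumbent: Callable[[float], float]   # model currently in production
    candidate: Callable[[float], float]   # new model under dark rollout
    shadow_fraction: float = 0.01         # 0.01 = Step 1, 1.0 = Step 2
    rng: random.Random = field(default_factory=random.Random)
    log: List[Tuple[float, float, float]] = field(default_factory=list)

    def predict(self, x: float) -> float:
        served = self.incumbent(x)
        if self.rng.random() < self.shadow_fraction:
            try:
                shadow = self.candidate(x)
                # Record (input, served output, shadow output) for offline comparison.
                self.log.append((x, served, shadow))
            except Exception:
                # A failure in the shadow path must never affect the caller.
                pass
        # The caller only ever sees the incumbent's answer.
        return served
```

In a real service the shadow call would likely run asynchronously so that it adds no latency to the served request; it is kept synchronous here for brevity.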

Detecting silent errors in the wild: Combining two novel approaches to quickly detect silent data corruptions at scale

This article covers the theory behind detecting silent data corruption, using two approaches:

  1. Out-of-production testing (opportunistic testing)
  2. In-production testing (ripple testing)
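
A rough sketch of the underlying idea, assuming a simple known-answer test rather than the article's actual test suite: run a deterministic workload on a host and compare the result with a golden value computed on trusted hardware. `workload` and `check_host` are hypothetical names; the same check could in principle run in both modes, with a long run while a host is out of production and a much shorter one alongside live workloads.

```python
import hashlib
from typing import Callable


def workload(seed: int, rounds: int = 1000) -> str:
    """Deterministic integer workload; healthy hardware always yields the same digest."""
    acc = seed
    digest = hashlib.sha256()
    for _ in range(rounds):
        # 64-bit linear congruential step to exercise integer arithmetic.
        acc = (acc * 6364136223846793005 + 1442695040888963407) % (1 << 64)
        digest.update(acc.to_bytes(8, "little"))
    return digest.hexdigest()


def check_host(compute: Callable[[int], str], seed: int, golden: str) -> bool:
    """Return True if the host's result matches the golden value."""
    return compute(seed) == golden
```

A mismatch does not say which component is faulty, only that the host produced a wrong answer without raising an error, which is what makes the corruption "silent".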

Practical Argo Workflows Hardening

High-level practices

  • Run the latest version
  • Use TLS for network requests
  • Limit ingress/egress using network policies
  • Follow the principle of least privilege
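
As an illustration of the network-policy point, here is a small sketch that builds a default-deny Kubernetes NetworkPolicy as a JSON manifest. The namespace name `argo` and the policy name are assumptions, and a real setup would add explicit allow rules on top for the traffic the workflows actually need.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches all pods in the namespace.
            "podSelector": {},
            # Listing both types with no ingress/egress rules denies both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    # "argo" is an assumed namespace name, not taken from the article.
    print(json.dumps(default_deny_policy("argo"), indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.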