lennart.cl

Blink Twice - Automatic Workload Pinning and Regression Detection for Versionless Apache Spark using Retries

Justin Breese, Vijayan Prabhakaran, Martin Grund, Stefania Leone, Amit Shukla, Michael Armbrust, Reynold Xin, Matei Zaharia, Lennart Kats, Sung Chiu, Tatiana Romanova, Philip Nord, Mitchell Webster, Chris Munson, Bo Pang, David Ma. Blink Twice - Automatic Workload Pinning and Regression Detection for Versionless Apache Spark using Retries. In Companion of the 2025 International Conference on Management of Data (SIGMOD 2025), pages 103–106, ACM, 2025. [doi

Abstract

We present Blink Twice, a system for automatic workload pinning and regression detection in versionless Apache Spark deployments. In versionless systems, the runtime is continuously upgraded without user intervention, introducing the risk of performance regressions. Blink Twice leverages automatic retry mechanisms to detect regressions by comparing execution metrics between different runtime versions. When a regression is detected, the system pins the affected workload to a known-good version while investigation and fixes are conducted. This approach enables rapid deployment of improvements while maintaining high reliability guarantees for production workloads.
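The retry-and-pin loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `RunResult`, `is_regression`, and `run_with_pinning`, the use of duration as the only compared metric, and the 1.5x slowdown threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RunResult:
    """Outcome of one workload execution (hypothetical metric set)."""
    succeeded: bool
    duration_s: float


def is_regression(new: RunResult, old: RunResult,
                  slowdown_threshold: float = 1.5) -> bool:
    """Compare a run on the new runtime against a run on the old one."""
    if not old.succeeded:
        # Fails on both versions: probably not caused by the upgrade.
        return False
    if not new.succeeded:
        # Fails only on the new version: treat as a regression.
        return True
    # Both succeeded: flag a slowdown beyond the assumed threshold.
    return new.duration_s > slowdown_threshold * old.duration_s


def run_with_pinning(workload_id: str,
                     run: Callable[[str], RunResult],
                     new_version: str,
                     known_good_version: str,
                     pins: Dict[str, str]) -> RunResult:
    """Run a workload, retrying on the known-good version after a failure."""
    # A pinned workload bypasses the new runtime entirely.
    if workload_id in pins:
        return run(pins[workload_id])

    first = run(new_version)
    if first.succeeded:
        return first

    # Automatic retry on the previous runtime version.
    retry = run(known_good_version)
    if is_regression(first, retry):
        # Pin the workload until the regression is investigated and fixed.
        pins[workload_id] = known_good_version
    return retry
```

In this sketch the retry serves two purposes at once: it keeps the workload succeeding from the user's point of view, and it provides the second data point needed to attribute the failure to the runtime upgrade rather than to the workload itself.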

More information

More information about this paper can be found at the ACM Digital Library.

