This post is a follow-up on our project, High Availability in YARN. In the previous post, we explained the motivation and our proposed solution to the availability problem in YARN. Here, we continue with the implementation and the experiments that we carried out as a proof of concept for that solution.
As a proof of concept of our proposed architecture, we designed and implemented an NDB storage module for the YARN resource manager. Due to limited time, we used the recovery failure model in our implementation. In this post, we refer to the proof-of-concept NDB-based YARN as YARN-NDB.