Finally, it’s the end of my 3rd semester with EMDC and I would like to share our latest project: High Availability in YARN. This project is a collaboration between EMDC and the Swedish Institute of Computer Science (SICS). The project members are Arinto (me :p) and Mário. Our project partners are Umit and Strahinja (they worked on the node manager of YARN). The project is supervised by Jim Dowling and mentored by Vasia Kalavri.
This post explains the motivation behind the project and our proposed solution. The follow-up post explains the implementation and the experiments that serve as proofs of concept for our solution.
YARN solves the scalability issues of the previous MapReduce framework. It also offers flexibility in executing computation frameworks on top of a cluster where YARN is deployed [1]. However, it still has one limitation: its availability.
It has been more than two weeks since my last post! 🙁 I was busy with exams, project reports and my trip preparation. Finally, I managed to find time (on my journey from Warsaw to Prague) to update my blog :p
Well, I would like to cover our SDS project titled “Flume-based Independent News Aggregator”, but my project-mate, Mario, has already covered it in his blog. So in this case, I’ll just give you the link to Mario’s post, which is here.
As a follow-up to the project, our professor asked us to experiment with the system with regard to its scalability. It is an individual project, and I plan to experiment with Flume scalability in terms of the number of events it can support. Mario will do something related to its reliability and fault tolerance. I plan to update this blog once I finish the project 🙂
*Update: The post about the Flume scalability mini project can be found here.