YARN, the Apache Hadoop Platform for Streaming, Realtime and Batch Processing

by Éric Charles
Thursday 27 February 2014

Apache Hadoop YARN is a sub-project of Hadoop at the Apache Software Foundation. Introduced in Hadoop 2.0, it separates resource management from the processing components. YARN was born of a need to enable a broader array of interaction patterns for data stored in HDFS, beyond MapReduce alone.

These added capabilities allow enterprises to realize near real-time processing and increased ROI on their Hadoop investments. With MapReduce becoming a user-land library, it can evolve independently of the underlying resource manager layer and in a much more agile manner.

During this talk, we will explain how graph processing and iterative modelling are now possible on top of Apache Hadoop YARN. We will also highlight the benefits of running a key-value storage system such as Apache HBase and a streaming cluster such as Storm within YARN.

About Éric Charles

Éric Charles is the founder of DATALAYER, which provides development services based on the Hadoop ecosystem in Belgium.

He worked in London on big data projects with Hadoop, Hive, Cascading, HBase, Cassandra, Kafka and Storm.

Éric is also an Apache Member and Committer.

You can contact him via email (eric@datalayer.io) or on Twitter (@echarles).

Where?

La Forge
Rue de la Cathédrale, 58
4000 Liège

When?

Thursday 27 February,
from 19:00