Hadoop YARN clusters - Adding a node at runtime


I am working on a solution for adding resources to a Hadoop YARN cluster at run time. The purpose is to handle heavy peaks in our application's load.

I am not an expert on this, and I need help in order to confirm or contest what I understand.

Hadoop YARN

Our application runs in cluster mode. YARN provides the resource management (CPU & RAM). A Spark application, for example, asks for a job to be done; YARN handles the request and provides executors that do the computing on the YARN cluster.
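As a hypothetical sketch of that request (the jar name and resource sizes are placeholders, not from our setup), this is how a Spark application asks YARN for executors:

```shell
# Submit a Spark application to YARN in cluster mode.
# YARN allocates the requested executors (CPU & RAM) on its NodeManagers.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my-streaming-app.jar   # placeholder application jar
```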

HDFS - data & executors

The data is not shared between executors, so it has to be stored in a file system. In our case: HDFS. That means I will have to run a copy of the Spark Streaming application on the new server (the new Hadoop node).

I am not sure of the following:

The YARN cluster and HDFS are different things, so writing to HDFS won't write to the new Hadoop node's local disk (because it is not an HDFS DataNode).

Since the Spark Streaming application only writes new data to HDFS, creating the new application instance should not be a problem.
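One way to check that assumption, using the standard Hadoop and YARN CLIs, is to list the two memberships separately and confirm the new server appears only in the second:

```shell
# DataNodes: the machines whose local disks actually store HDFS blocks.
hdfs dfsadmin -report

# NodeManagers: the machines offering CPU & RAM to YARN.
# A node can appear in this list without being an HDFS DataNode.
yarn node -list -all
```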

  1. Submit the job to YARN
    --- during a peak, more resources are needed
  2. Provision a new server
  3. Install / configure Hadoop & YARN on it, making it a slave

    • Modify hadoop/conf/slaves, adding its IP address (or a DNS name from the hosts file)
    • Modify dfs.include and mapred.include

      Then, on the master machine:

    • yarn rmadmin -refreshNodes
    • bin/hadoop dfsadmin -refreshNodes
    • bin/hadoop mradmin -refreshNodes
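The steps above can be sketched as follows. The hostname and file paths are placeholders and depend on the installation (on older Hadoop 1.x-style layouts the slaves file lives in hadoop/conf/); the include files only matter if dfs.hosts / mapred.hosts point at them:

```shell
# On the master: register the new slave in the worker and include files.
NEW_NODE="new-node.example.com"   # placeholder hostname for the new server

echo "$NEW_NODE" >> "$HADOOP_HOME/etc/hadoop/slaves"
echo "$NEW_NODE" >> /etc/hadoop/dfs.include      # only if dfs.hosts points here
echo "$NEW_NODE" >> /etc/hadoop/mapred.include   # only if mapred.hosts points here

# Tell the running daemons to re-read the host lists without a restart.
yarn rmadmin -refreshNodes
hdfs dfsadmin -refreshNodes
```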

Should this work? -refreshQueues does not sound useful here, since it seems to take care of the scheduler queues rather than the nodes.

I am also not sure whether an already running job can increase its capacity. Another idea is to wait for the new resources to be available and then submit a new job.
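The wait-then-submit idea could look like this hypothetical loop; EXPECTED_NODES and the jar name are placeholders for the actual setup:

```shell
# Poll YARN until the new NodeManager has registered, then submit the job.
EXPECTED_NODES=5   # placeholder: node count after scaling out
until [ "$(yarn node -list 2>/dev/null | grep -c RUNNING)" -ge "$EXPECTED_NODES" ]; do
  sleep 10
done
spark-submit --master yarn --deploy-mode cluster my-streaming-app.jar
```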

Thanks for your help

