Samza的ApplicationMaster
当Samza ApplicationMaster启动时,它做以下的事情:
- 通过STREAMING_CONFIG环境变量从YARN获取配置信息(configuration)
- 在随机端口上 启动一个JMX server
- 实例化一个metrics registry和reporter来追踪计量信息
- 将AM向YARN的RM注册
- 使用每个stream的PartitionManager来获取总共的partition数量
- 从Samza的job configuration里获取总的container数量
- 将partition分给container(在Samza AM的dashboard里,称为Task Group)
- 为每个container向YARN发送一个ResourceRequest
- 每秒向YARN RM poll一次,检查allocated and released containers
AMRMClientAsync
handles communication with the
ResourceManager and provides asynchronous updates on events such as container
allocations and completions. It contains a thread that sends periodic heartbeats
to the ResourceManager. It should be used by implementing a CallbackHandler:
class MyCallbackHandler implements AMRMClientAsync.CallbackHandler {
public void onContainersAllocated(List<Container> containers) {
[run tasks on the containers]
}
public void onContainersCompleted(List<ContainerStatus> statuses) {
[update progress, check whether app is done]
}
public void onNodesUpdated(List<NodeReport> updated) {}
public void onReboot() {}
}
The client‘s lifecycle should be managed similarly to the
following: AMRMClientAsync asyncClient =
createAMRMClientAsync(appAttId, 1000, new MyCallbackhandler());
asyncClient.init(conf);
asyncClient.start();
RegisterApplicationMasterResponse response = asyncClient
.registerApplicationMaster(appMasterHostname, appMasterRpcPort,
appMasterTrackingUrl);
asyncClient.addContainerRequest(containerRequest);
[... wait for application to complete]
asyncClient.unregisterApplicationMaster(status, appMsg, trackingUrl);
asyncClient.stop();
这个类是用来做为一个Client和RM进行通信,并且注册一个用于回调的对象来处理container 的allocation和completion事件。它启动一个线程,周期性地发送hearbeat至ResourceManager
郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。