Apache Knox Ambari Cluster Monitoring
Introduction
My Apache Knox Dynamic Service Endpoint Discovery article describes some exciting new functionality available in the 0.14.0 release of Apache Knox. The gateway is now able to dynamically determine the endpoint URLs of cluster services to proxy from Ambari. The associated benefits are described in that article.
Another benefit of this new functionality, not mentioned in that article, is the ability to respond dynamically to cluster configuration changes that affect generated Knox topologies, by regenerating and redeploying those topologies. Without this, a deployed topology can easily stop working when the configuration of any proxied Hadoop service changes in the cluster.
Cluster Monitoring
When Knox deploys a simple topology descriptor and generates a corresponding topology from the discovered cluster configuration details, it can subsequently monitor that cluster configuration for changes. When it discovers a change, it regenerates and redeploys all of its topologies that are based on the modified cluster. This has the potential to greatly reduce Knox downtime caused by cluster configuration changes.
For example, suppose a descriptor (docker-sandbox.json) is deployed, intended to proxy services in the HDP Docker Sandbox. Following the successful generation and deployment of the docker-sandbox topology, Knox can monitor the Sandbox cluster managed by Ambari. If an administrator were to update the dfs.namenode.http-address property value in the hdfs-site configuration, changing the port number for example, the Knox proxy for the WEBHDFS service would no longer work. However, if the Ambari cluster monitor is enabled, Knox would regenerate and redeploy the docker-sandbox topology, such that it would contain the correct port for the WEBHDFS service URL, and Knox clients would continue to work.
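To make the example concrete, the change described above might look like the following in the hdfs-site configuration. This is only a sketch: the host name and both port numbers are assumed values chosen for illustration, and in practice the edit would be made through Ambari rather than by hand.

```xml
<!-- Hypothetical hdfs-site entry after the administrator's change.
     The host name and port are assumptions for illustration only;
     here the port has been moved from 50070 to 50071. -->
<property>
    <name>dfs.namenode.http-address</name>
    <value>sandbox.hortonworks.com:50071</value>
</property>
```

A topology generated before this change would still point at the old port, which is why the WEBHDFS proxy stops working until the topology is regenerated.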
By default this monitor is disabled, but it can easily be enabled by setting the gateway.cluster.config.monitor.ambari.enabled property value to true in the gateway-site configuration.
<property>
    <name>gateway.cluster.config.monitor.ambari.enabled</name>
    <value>true</value>
    <description>Enable/disable Ambari cluster configuration monitoring.</description>
</property>
Also in the gateway-site configuration, a second property controls how often Knox polls Ambari for configuration changes to the clusters for which it has deployed topologies. The value is expressed in seconds. The example below uses 60 seconds; for demonstration purposes, you may want to set it as low as 20 or 30 seconds.
<property>
    <name>gateway.cluster.config.monitor.ambari.interval</name>
    <value>60</value>
    <description>The interval (in seconds) for polling Ambari for cluster configuration changes.</description>
</property>
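For a quick demonstration, the two properties could be combined as in the sketch below. It is an excerpt rather than a complete gateway-site.xml: any properties already present in your file would remain alongside these, and the 20-second interval is simply an assumed value at the low end of the range suggested above.

```xml
<configuration>
    <!-- Turn on Ambari cluster configuration monitoring (disabled by default) -->
    <property>
        <name>gateway.cluster.config.monitor.ambari.enabled</name>
        <value>true</value>
    </property>
    <!-- Poll Ambari every 20 seconds; a low value assumed here for demonstration only -->
    <property>
        <name>gateway.cluster.config.monitor.ambari.interval</name>
        <value>20</value>
    </property>
</configuration>
```

A short interval makes the effect of a configuration change visible sooner, at the cost of more frequent requests to Ambari.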
Try It
The Apache Knox Dynamic Service Endpoint Discovery article includes instructions for deploying topologies using simple descriptors, employing service URL discovery. Starting from there, you can enable the Ambari cluster monitoring, and make a cluster configuration change like the one described in this article. Then, you’ll see how Knox responds to the change, and adapts to continue providing the proxied WEBHDFS service to its clients.
- Set the gateway.cluster.config.monitor.ambari.enabled property value to true in {GATEWAY_HOME}/conf/gateway-site.xml.
- Restart the gateway.
- Use Ambari to modify the hdfs-site dfs.namenode.http-address configuration property value as described in the example.
- Allow the gateway to notice the configuration change (watch {GATEWAY_HOME}/logs/gateway.log for the related messages).
- Review {GATEWAY_HOME}/conf/topologies/docker-sandbox.xml, and notice the change to the WEBHDFS service URL.
Your sandbox must expose the new port you specified for the dfs.namenode.http-address property so that Knox can reach the new endpoint. Otherwise, even though the regenerated topology will be correct, requests will fail because the connection cannot be established.
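As a rough guide to what to look for in the last step, the WEBHDFS entry in the regenerated topology might resemble the following. The host name and port are assumed values for illustration; your topology will contain whatever Knox discovered from Ambari.

```xml
<!-- Hypothetical excerpt of the regenerated docker-sandbox.xml topology -->
<service>
    <role>WEBHDFS</role>
    <!-- Assumed host and port; the port should match the updated
         dfs.namenode.http-address value set in Ambari -->
    <url>http://sandbox.hortonworks.com:50071/webhdfs</url>
</service>
```

If the port in the url element matches the value you set in Ambari, the monitor has picked up the change and redeployed the topology.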
Summary
While it doesn’t take long to describe, this feature is a significant addition to the value provided by Knox. The ability to dynamically adapt to cluster service configuration changes reduces both the effort required of administrators and the potential for errors when making such changes.
N.B. Statically defined topologies (i.e., those deployed directly as regular topology XML files) do NOT benefit from this monitoring support.
More details are available in the User Guide.