The Principles of ZK Watcher

What is a ZK Watcher

A common requirement for ZK applications is to know the state of the ZK ensemble. One way to do this is polling: the ZK client checks on a timer whether the system state has changed. However, polling is inefficient, particularly when the state changes infrequently.

To avoid the performance problems caused by polling, ZK provides a mechanism that notifies clients of specific events they are interested in: the Watcher. By setting a Watcher, a ZK client registers a notification request on a specified znode and receives a single notification when that znode changes. For example, a Watcher for deletion sends a notification when the znode is deleted.

Application code that uses a ZK Watcher typically follows this skeleton:

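import org.apache.zookeeper.AsyncCallback.StatCallback;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.data.Stat;

Watcher myWatcher = new Watcher() {
  public void process(WatchedEvent event) {
    // process the watch event
  }
};

StatCallback existsCallback = new StatCallback() {
  public void processResult(int rc, String path, Object ctx, Stat stat) {
    // process the result of the exists call
  }
};

zk.exists("/myZnode", myWatcher, existsCallback, null);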

Taking the exists operation as an example, the skeleton above shows the general pattern of calling ZK asynchronously while registering a Watcher.

Classification of WatchedEvent

An important part of using Watchers is understanding how a Watcher is set and when it is triggered: not every ZK operation can set a Watcher, and a Watcher is not triggered by every event.

Although WatchedEvent is also overloaded to carry connection-state changes, the WatchedEvents encountered in business logic fall into the following types:

    NodeCreated – the Watcher can be set by calling exists; triggered when the znode is created

    NodeDeleted – the Watcher can be set by calling exists or getData; triggered when the znode is deleted

    NodeDataChanged – the Watcher can be set by calling exists or getData; triggered when the data in the znode changes

    NodeChildrenChanged – the Watcher can be set by calling getChildren; triggered when a direct child of the znode is created or deleted

    DataWatchRemoved – triggered when a Watcher set by exists or getData is removed

    ChildWatchRemoved – triggered when a Watcher set by getChildren is removed

As we can see, only three operations can set a Watcher: exists, getData, and getChildren.

Note that a Watcher set through getData never receives a NodeCreated event: when the node does not exist, getData throws KeeperException.NoNodeException and does not set the Watcher.
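The classification above can be condensed into a small lookup table. The following is a minimal sketch in plain Java rather than ZooKeeper's own code: the class WatchEventTable and its enums are hypothetical names introduced only for illustration, and the mapping simply mirrors the list given in this article.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical illustration: events a Watcher can receive, by the call that set it. */
class WatchEventTable {
    /** The three read operations that can register a Watcher. */
    enum WatchSource { EXISTS, GET_DATA, GET_CHILDREN }

    /** Event types as listed in the article (connection-state events omitted). */
    enum EventType {
        NODE_CREATED, NODE_DELETED, NODE_DATA_CHANGED,
        NODE_CHILDREN_CHANGED, DATA_WATCH_REMOVED, CHILD_WATCH_REMOVED
    }

    /** Returns the event types that a Watcher set through the given call can receive. */
    static Set<EventType> eventsFor(WatchSource source) {
        switch (source) {
            case EXISTS:
                // exists succeeds even when the znode is absent, so creation is observable
                return EnumSet.of(EventType.NODE_CREATED, EventType.NODE_DELETED,
                        EventType.NODE_DATA_CHANGED, EventType.DATA_WATCH_REMOVED);
            case GET_DATA:
                // getData fails on a missing znode, so NODE_CREATED can never arrive
                return EnumSet.of(EventType.NODE_DELETED,
                        EventType.NODE_DATA_CHANGED, EventType.DATA_WATCH_REMOVED);
            case GET_CHILDREN:
                return EnumSet.of(EventType.NODE_CHILDREN_CHANGED,
                        EventType.CHILD_WATCH_REMOVED);
            default:
                throw new IllegalArgumentException("unknown source: " + source);
        }
    }

    public static void main(String[] args) {
        for (WatchSource source : WatchSource.values()) {
            System.out.println(source + " -> " + eventsFor(source));
        }
    }
}
```

A table like this is handy in a process(WatchedEvent) implementation for asserting that an incoming event type is one the Watcher could legitimately have received.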

Implementation mechanism and life cycle of Watcher

From the application's point of view, once a Watcher is registered it only needs to wait for the event to fire; how ZK implements this is of no concern. Understanding ZK's implementation, however, helps us identify the source of a problem and troubleshoot it in a targeted way when we run into errors or exceptions.

The key questions about the Watcher mechanism are where exactly a Watcher is registered and how exactly it is triggered. The two are hard to explain separately, so they are analyzed together below.

Ideally, an explanation of internals would walk through excerpts of the corresponding source code. ZK's source is hard to read, though, and pasting it here would probably confuse readers more than it helps. Instead, I will describe the code logic together with the corresponding source locations and a small amount of pseudo-code; interested readers can read the source on their own, and please take care of your health.

The Watcher mechanism starts with registration. When the ZK client executes an exists, getData, or getChildren operation, it can either pass a custom Watcher or use a flag to reuse the default Watcher that was set when the client was created. The latter is rarely used and is not covered further here. The Watcher is wrapped in a Packet and queued in ClientCnxn, whose EventThread later dispatches events to it, and it is registered in the client's set of watches once the corresponding operation completes successfully. On the server side, a request such as GetDataRequest carries a flag indicating whether a Watcher was set, and the server uses it to decide whether to register a corresponding Watcher. On the server, ServerCnxn implements the Watcher interface. Each ServerCnxn is the server-side object for one client connection, and its process(WatchedEvent) method wraps the WatchedEvent into a WatcherEvent and sends it to the client.

Once a Watcher has been set successfully, the next concern is how it is triggered when the state changes. Triggering a Watcher ultimately means running callback logic in the client, but state changes in the ZK ensemble are determined by the server, which maintains the ensemble's state mainly through ZKDatabase and DataTree. When the server determines that a state change requires Watchers to be triggered, it iterates over the Watchers registered on the changed node, that is, the ServerCnxn objects of the corresponding client connections, and calls their process(WatchedEvent) method. As described above, this sends the WatchedEvent to the corresponding client.

The registration and triggering process described above covers the entire life cycle of a Watcher: it begins when the corresponding operation succeeds on both the client and the server, and it ends after the Watcher is triggered. In other words, a Watcher is one-shot; to keep listening to a node's state, the Watcher must be set again after it fires. A Watcher's life also ends when the session is closed or expires. In addition, since version 3.5.0 the removeWatches operation can actively remove Watchers on nodes that are no longer of interest.
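The one-shot life cycle can be illustrated with a toy model of a server-side watch table. This is a simplified sketch in plain Java, not ZooKeeper's implementation: the class OneShotWatchTable and its methods are hypothetical, and a Consumer<String> stands in for the client connection that would receive the event.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Toy model of a watch table with one-shot semantics (not ZooKeeper's code). */
class OneShotWatchTable {
    // path -> watchers currently registered on that path
    private final Map<String, Set<Consumer<String>>> watches = new HashMap<>();

    /** Registers a watcher on a path; it will be notified at most once. */
    void register(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
    }

    /** Notifies and removes every watcher on the path; returns how many were notified. */
    int trigger(String path, String eventType) {
        Set<Consumer<String>> fired = watches.remove(path);
        if (fired == null) {
            return 0; // nobody is watching this path right now
        }
        for (Consumer<String> watcher : fired) {
            watcher.accept(eventType + " " + path);
        }
        return fired.size();
    }

    /** Drops all watchers, modelling a session that is closed or has expired. */
    void clear() {
        watches.clear();
    }

    public static void main(String[] args) {
        OneShotWatchTable table = new OneShotWatchTable();
        List<String> received = new ArrayList<>();
        table.register("/myZnode", received::add);
        table.trigger("/myZnode", "NodeDataChanged"); // delivered, watcher removed
        table.trigger("/myZnode", "NodeDataChanged"); // not delivered: no watcher left
        table.register("/myZnode", received::add);    // set the watcher again
        table.trigger("/myZnode", "NodeDeleted");     // delivered
        System.out.println(received);
    }
}
```

Removing the watcher set before notifying it is what makes the watch one-shot in this model: the second change is simply not seen by anyone until the client registers again.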

Watcher error handling

As described above, the Watcher is a lightweight change-notification mechanism. Because its function is so simple, building more complex semantics on top of it in real applications requires considering how Watchers behave under certain failure conditions.

The first issue stems from the one-shot semantics and the limited information a WatchedEvent carries. Because a Watcher is one-shot, we may miss events that occur between the moment a Watcher fires and the moment it is set again. Normally this is not a problem, because ZK's goal is consensus on stored state in a distributed environment, not guaranteeing that every client records and processes every event. The read that comes with re-setting the Watcher is enough to sync us with the latest state at that time. So although we may miss events, what we miss are only intermediate states; what ZK guarantees is the final state after some period of time. From another angle, because a WatchedEvent carries only the information that "an event occurred", any new state must be fetched from the ZK ensemble, which is a trade-off ZK made for the sake of a simple implementation.
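The effect of missing intermediate states can be seen in a small simulation. The sketch below is plain Java under simplifying assumptions, not ZooKeeper code: MissedEventDemo is a hypothetical single-znode model, and a queue stands in for notifications that are still in flight to the client.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model: a one-shot watch misses intermediate updates but still converges. */
class MissedEventDemo {
    private String data = "v0";        // current znode data on the "server"
    private boolean watchSet = false;  // whether a one-shot data watch is registered
    private final Queue<String> pending = new ArrayDeque<>(); // notifications in flight

    /** Reads the data and (re)registers the watch in one step. */
    String getDataAndWatch() {
        watchSet = true;
        return data;
    }

    /** Server-side write: notifies only if a watch is set, then clears it. */
    void setData(String value) {
        data = value;
        if (watchSet) {
            watchSet = false;
            pending.add("NodeDataChanged");
        }
    }

    /** Client handles its notifications; each one only says that something changed. */
    List<String> drainAndReread() {
        List<String> seen = new ArrayList<>();
        while (!pending.isEmpty()) {
            pending.remove();
            seen.add(getDataAndWatch()); // re-read to learn the state, re-set the watch
        }
        return seen;
    }

    public static void main(String[] args) {
        MissedEventDemo node = new MissedEventDemo();
        node.getDataAndWatch();
        node.setData("v1"); // triggers the watch
        node.setData("v2"); // no watch set, so no notification
        node.setData("v3"); // no watch set, so no notification
        System.out.println(node.drainAndReread());
    }
}
```

Three writes produce a single notification, and the client never observes v1 or v2, yet the re-read that accompanies re-setting the watch leaves it holding the latest value.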

The second issue concerns the CONNECTIONLOSS exception. Strictly speaking this is not a Watcher concern, because the Watcher of an operation that fails with CONNECTIONLOSS is not set successfully. CONNECTIONLOSS means the client has lost its connection to the server it was connected to. Since a ZK ensemble has several servers, the client will then try to connect to another one. Because the Watcher was not set, after a successful reconnection you should retry the operation so that the Watcher is set correctly. Watchers that had already been set successfully are not affected by this change of connection: when the client reconnects, it resends all of its Watchers to the server, which compares the state of each znode against the relative zxid to work out which Watchers need to be triggered immediately, and sets the rest as usual.
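The comparison made on reconnection can be sketched as follows. This is a simplified model in plain Java based on the description above, not ZooKeeper's actual code: ReconnectWatchDemo, resendWatches, and the field names are hypothetical, only data watches are modelled, and lastSeenZxid is assumed to be the latest zxid the client had seen before it was disconnected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of re-setting data watches after a reconnect (not ZooKeeper's code). */
class ReconnectWatchDemo {
    // path -> zxid of the last modification of that znode, as known to the new server
    final Map<String, Long> modifiedZxid = new HashMap<>();
    // paths with a data watch registered on the new server
    final Set<String> dataWatches = new HashSet<>();

    /**
     * Handles the watches a client resends after reconnecting.
     * Returns the events to fire immediately; every other watch is set again.
     */
    List<String> resendWatches(List<String> paths, long lastSeenZxid) {
        List<String> fired = new ArrayList<>();
        for (String path : paths) {
            Long mzxid = modifiedZxid.get(path);
            if (mzxid == null) {
                fired.add("NodeDeleted " + path);     // znode vanished while disconnected
            } else if (mzxid > lastSeenZxid) {
                fired.add("NodeDataChanged " + path); // znode changed while disconnected
            } else {
                dataWatches.add(path);                // nothing missed: set the watch again
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        ReconnectWatchDemo server = new ReconnectWatchDemo();
        server.modifiedZxid.put("/unchanged", 5L);
        server.modifiedZxid.put("/changed", 12L);
        List<String> fired = server.resendWatches(
                Arrays.asList("/unchanged", "/changed", "/gone"), 10L);
        System.out.println("fired: " + fired);
        System.out.println("re-set: " + server.dataWatches);
    }
}
```

In this model a client never silently loses a change that happened while it was disconnected: a watch is either fired at once or put back in place.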


The normal flow of the ZK Watcher mechanism is quite smooth, but having to actively fetch the state and set the Watcher again after every trigger is rather tedious, and ZK operations can run into all sorts of strange exceptions. How ZK handles the various exceptions that arise under network delay or partition will be covered in a separate article. Also, ZK's source code is hazardous to your health; unless you are chasing down a specific symptom, it is best not to read it for no reason.
