Improvement #276

Correct usage of ZCL transaction IDs

Added by Philipp Buluschek almost 4 years ago. Updated over 3 years ago.

Status:Feedback Start date:11/12/2014
Priority:Low Due date:
Assignee:Stefano Lenzi % Done:

0%

Category:zigbee.ha.driver Spent time: -
Target version:org.aaloa.zb4osgi.zigbee.ha.driver-0.7.0
Has a patch:Yes Has license agreement signed:No

Description

In ZigBeeDeviceImpl.invoke() the transaction ID is put to 0, with a comment that it is a bug in the Z-Stack, that the transaction ID of the incoming message is always 0.
In fact, this is intended behavior (see http://e2e.ti.com/support/wireless_connectivity/w/design_notes/transaction-sequence-number-z-stack-zigbee.aspx). There is no transaction ID which is valid for all incoming messages. But instead, each application must define its own transaction IDs.

In our case, we handle ZCL messages which have defined a transaction ID in the ZCL header. This is the value which must be checked to be sure that the incoming message corresponds to the sent query. (note that the transaction ID which is passed into AF_DATA_REQUEST is a different one (!). It is only used to match the AF_DATA_REQUEST to the corresponding AF_DATA_CONFIRM).

To correctly consume incoming messages, we should
- Allow the "Cluster" which is passed into invoke() to also provide its ZCL transaction ID (from the ZCL header).
- Add the ZCL transaction ID to the waiter (instead of the current transaction ID which is used only in the AF_DATA_REQUEST/AF_DATA_CONFIRM exchange)
- Allow the AF_INCOMING_MSG to return the ZCL transaction ID from the data (it's the 2nd byte in the data). In particular, it should not return the TransID value from the serial frame as it is always 0.

Currently the comparison of ZCL transaction IDs is done, for example, in AttributeImpl.doClusterWideRead(), but at this point it is already too late (the message was already consumed, we can just log an error).

One unresolved problem is how to identify if the incoming AF_INCOMING_MSG is actually a ZCL frame. Is it always so?

bugfix_276.patch Magnifier (13.9 kB) Stefano Lenzi, 11/18/2014 12:05 am

History

#1 Updated by Stefano Lenzi almost 4 years ago

  • Tracker changed from Bug to Improvement
  • Category set to zigbee.ha.driver
  • Status changed from New to In Progress
  • Assignee set to Stefano Lenzi
  • Priority changed from Normal to Low
  • Target version set to org.aaloa.zb4osgi.zigbee.ha.driver-0.7.0
I have assigned this issue to the zigbee.ha.driver because it is the one that uses the ZigBeeDevice service, but the issue can affect also the following modules:
  • zigbee.zcl.library
  • zigbee.basedriver

#2 Updated by Stefano Lenzi almost 4 years ago

Philipp Buluschek wrote:

In ZigBeeDeviceImpl.invoke() the transaction ID is put to 0, with a comment that it is a bug in the Z-Stack, that the transaction ID of the incoming message is always 0.
In fact, this is intended behavior (see http://e2e.ti.com/support/wireless_connectivity/w/design_notes/transaction-sequence-number-z-stack-zigbee.aspx). There is no transaction ID which is valid for all incoming messages. But instead, each application must define its own transaction IDs.

Okay thank you for clarifying it!

In our case, we handle ZCL messages which have defined a transaction ID in the ZCL header. This is the value which must be checked to be sure that the incoming message corresponds to the sent query. (note that the transaction ID which is passed into AF_DATA_REQUEST is a different one (!). It is only used to match the AF_DATA_REQUEST to the corresponding AF_DATA_CONFIRM).

Yes the "TransactionID" of the ZCL Frame is completely different to the parameter of the AF_DATA_REQUEST parameter

To correctly consume incoming messages, we should
- Allow the "Cluster" which is passed into invoke() to also provide its ZCL transaction ID (from the ZCL header).
- Add the ZCL transaction ID to the waiter (instead of the current transaction ID which is used only in the AF_DATA_REQUEST/AF_DATA_CONFIRM exchange)
- Allow the AF_INCOMING_MSG to return the ZCL transaction ID from the data (it's the 2nd byte in the data). In particular, it should not return the TransID value from the serial frame as it is always 0.

The problem is that at by design we want to separate the zigbee.basedriver level from the zigbee.ha.driver because the first one provide abstraction and access to ZigBee network and it should be unique, while the other provide access to the Application Profile that are present on the ZigBee network and other Application Profile driver will exists: one per ZigBee Profile Docuement (i.e. zigbee.hc.driver provides access to the ZigBee Health Care Profile)

Keeping the above in mind, I have no practical idea on how to enable the ZigBeeDevice to consume only the cluster that linked to the transaction without breaking the separation... Idea are more then welcomed!

Currently the comparison of ZCL transaction IDs is done, for example, in AttributeImpl.doClusterWideRead(), but at this point it is already too late (the message was already consumed, we can just log an error).

One unresolved problem is how to identify if the incoming AF_INCOMING_MSG is actually a ZCL frame. Is it always so?

At the moment all the ZigBee Profile uses the ZCL Frame, but by design ZigBee Specification defers the specification of the frame to Application Profile and some of the can be even Vendor specific so there is we may assume that all the data is ZCLFrame but it will a restriction compared to current approch.

#3 Updated by Stefano Lenzi almost 4 years ago

Some idea out-of-the-box

Extend ZigBeeDevice interface and breaking separation

Considering that most of the Cluster sent with ZigBeeDevice.invoke() method are ZCLFrame, we may move ZCLFrame frmo zigbee.zcl.library to zigbee.basedriver.api and add the method:
  • ZigBeeDevice.send(int clusterId, ZCLFrame packet);
  • ZigBeeDevice.invoke(int clusterId, ZCLFrame packet);

So that the consume() can actually inspect the packet so that only matching TransactionID field packet will be consumed

ZCLClusterBase never use the ZigBeeDevice.invoke

We can replace all the ZigBeeDevice.invoke() with ZigBeeDevice.send() and register register a ClusterListener for matching request / response command. This solution will avoid to break the separation between ZigBee specification layer and ZigBee Application layer

Dirty 2nd-byte magic

Even if assume no knowledge on the data sent by ZigBeeDevice.invoke() but we expect that most of the cluster sent are ZCLFrame; instead of matching the first AFMessageConsumer that matches src and clusterId, we select the one that has also the 2-nd byte of the payload matching. In case,none of the AFMessageConsumer match the 2-nd byte then the current dispatching is used.

Add a filter on invoke

We can add a filter parameter to the ZigBeeDriver.invoke(Cluster, ClusterFilter) that will be used as extra matching for the incoming message so that we avoid to consume wrong messages

Which solution do you prefer?

#4 Updated by Stefano Lenzi almost 4 years ago

Please provide some comment on the proposed solutions...

#5 Updated by Philipp Buluschek almost 4 years ago

I currently use the dirty 2nd byte version :-)

I think we should keep the ZigBee/Profile separation clean. But as we call invoke with a "Cluster", it is already specific to ZCL messages. Also the response AF_INCOMING_MESSAGE has a "clusterID" field which hints at the use of the ZCL. Thus my question whether all AF_INCOMING_MSG are in fact ZCL frames.

How is a profile which does not use the ZCL supposed to be handled? Maybe not through the AF layer?
If the AF layer is purely ZCL specific, we can use the 2nd byte solution without breaking separation (ie. all profiles using the ZCL will be handled correctly, not just HA)

Else, I'm in favor of the ClusterFilter version. It's really the cleanest. (the one where we use only send() and then register the listener is too low level for the caller. This should be abstracted.)

So for me the decisive question is: are all AF_INCOMING_MSG ZCL frames?

#6 Updated by Stefano Lenzi over 3 years ago

Philipp Buluschek wrote:

I currently use the dirty 2nd byte version :-)

I think we should keep the ZigBee/Profile separation clean. But as we call invoke with a "Cluster", it is already specific to ZCL messages. Also the response AF_INCOMING_MESSAGE has a "clusterID" field which hints at the use of the ZCL. Thus my question whether all AF_INCOMING_MSG are in fact ZCL frames.

I agree! We can expected that 99.0% of the AF_INCOMING_MSG are ZCL Frames but is not guarantee, so fare the ZCL frames are the payload of AF_* API so anything can be transmitted. For example:
vendor specific application can use their own data, or a user can send ZDP frame for ZDO cluster that are not supported by the dongle

How is a profile which does not use the ZCL supposed to be handled? Maybe not through the AF layer?

As I said the AF Layer is general purpose and it is the reason for keeping it clean from ZCL layer, the application sending data is the only aware of the meaning of the byte[] content of the it.cnr.isti.zigbee.api.Cluster

If the AF layer is purely ZCL specific, we can use the 2nd byte solution without breaking separation (ie. all profiles using the ZCL will be handled correctly, not just HA)

Else, I'm in favor of the ClusterFilter version. It's really the cleanest. (the one where we use only send() and then register the listener is too low level for the caller. This should be abstracted.)

AF is not pure ZCL specific so I will go for ClusterFilter, moreover the 2-nd byte solution must check the ZCL frame Header because if sending Manufacturer enabled ZCL frame then the transaction field will be the 4-th byte

So for me the decisive question is: are all AF_INCOMING_MSG ZCL frames?

As I said the correct answer is NO even if most of the time we will expect ZCL frames

#7 Updated by Stefano Lenzi over 3 years ago

I have implemented the ClusterFilter solution but before applying to the trunk I want to release the current codebase.

#8 Updated by Stefano Lenzi over 3 years ago

  • Has a patch changed from No to Yes

Patch against r1107

#9 Updated by Stefano Lenzi over 3 years ago

  • Status changed from In Progress to Feedback

#10 Updated by Philipp Buluschek over 3 years ago

In WaitForClusterResponse.consume(), why do we still check the transaction ID (which is always 0), and more importantly, why do we check the cluster ID. I think it woould make more sense to use the Filter for all the checks.

Basically a request sender must fully identify which messages he accepts as answer.

This makes the Filter mandatory in the call (not null), but this is a good thing in my opinion - having code with null objects increases the number of possible states (and thus the bugs) and also puts us at risk to have NPE.

Maybe we need a more general Filter interface like

public boolean accept(ZToolPacket);

which also support non ZCL messages.

Also available in: Atom PDF