Hi all, not sure if this is the correct forum to ask these questions :)
We have a mesh network here with 10 nodes of different HW types. We ran this network for nearly 2 months on the latest official SW release with wonderful stability: nearly no node failures in two months.
Then we decided to try the develop-164 build, mostly to benefit from the extended statistics.
After the upgrade we have experienced significant instability, such as blocked nodes. We have also often found the OLSRD daemon crashed.
We are trying to investigate and debug these issues.
First question: has anybody else experienced similar instability with this build?
Second question: can anybody give any hints on how to trace and debug this behaviour? The main problem is how to get any sort of post-mortem dump, because restoring normal operation requires power-cycling the device.
Thanks for any attention
Mike
Rebooting clears all data; a node that fails needs to be looked at without rebooting to have any chance of obtaining usable information. Depending on the issue you're seeing, this can often be done from the local interfaces (you may need to telnet in). As of develop-162 the support tool can be run from the command line (/usr/local/bin/supporttool).
You can try grabbing support data files at intervals to see if there is any information in them leading up to your failure, but there are no guarantees that will catch it.
If you can catch it and duplicate it, a ticket should be opened in the AREDN ticket system ( http://bloodhound.aredn.org/ ) so it can be acted upon.
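The suggestion of grabbing support data at intervals could be sketched roughly as below. This is only a minimal sketch under assumptions: the tool path is the one given above, but the function names (run_once, run_loop), the variables (INTERVAL, LOGFILE, SAMPLES) and the log location are hypothetical, and where the support tool writes its own output is not stated in this thread. Since rebooting clears all data, anything collected this way would still have to be copied off the node before a power cycle.

```shell
#!/bin/sh
# Hypothetical sketch: run the support tool at fixed intervals and keep a
# small log of each run. Only the tool path comes from the thread; all
# other names and defaults here are assumptions for illustration.

SUPPORTTOOL="${SUPPORTTOOL:-/usr/local/bin/supporttool}"
INTERVAL="${INTERVAL:-600}"                       # seconds between runs (assumed)
LOGFILE="${LOGFILE:-/tmp/supporttool-runs.log}"   # assumed log location

# Run the tool once and append a timestamped result line to the log.
run_once() {
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    if "$SUPPORTTOOL" >/dev/null 2>&1; then
        rc=0
    else
        rc=$?
    fi
    if [ "$rc" -eq 0 ]; then
        echo "$ts supporttool ok" >> "$LOGFILE"
    else
        echo "$ts supporttool failed (exit $rc)" >> "$LOGFILE"
    fi
}

# Run the tool a given number of times, sleeping INTERVAL seconds between runs.
run_loop() {
    remaining="$1"
    while [ "$remaining" -gt 0 ]; do
        run_once
        remaining=$((remaining - 1))
        if [ "$remaining" -gt 0 ]; then
            sleep "$INTERVAL"
        fi
    done
}

# Only start collecting when SAMPLES is set, e.g. SAMPLES=144 sh collect.sh
if [ -n "${SAMPLES:-}" ]; then
    run_loop "$SAMPLES"
fi
```

With the default 10-minute interval, SAMPLES=144 would cover roughly one day. A bounded loop is used rather than an endless one so that a forgotten collector does not keep running on the node.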
The SW we run is just the one coming from the repository.
I will try to run the support tool periodically and see if anything meaningful comes out.
Best regards
Mike