Channel: THWACK: Discussion List - All Communities

Alerts work when testing, but not in actual trigger condition


So I recently started setting up alerts for typical network issues, such as node down, high interface utilization, etc.  For a trigger action, I have it set up to send me an email and also to put a short entry into a log file.  It works perfectly when I test it, but I just saw two nodes go down, and another one is currently at 96% utilization, and none of the alerts triggered.  They ARE turned on.  Does anyone know why these would work in a test, but not in actual production deployment?  I've double and triple checked, and as far as I can see, everything is set correctly.

Here's what I have:

General tab:  Alert name, description.  Enable box is checked.  Checks for alert every 1 minute.

Trigger Condition:  Trigger when ALL of the following apply:  Node Status is equal to Down.  (I have since changed ALL to ANY and added Node Status is equal to Unreachable.)  Type of property to monitor:  Node

Reset Condition:  Reset when trigger conditions are no longer true.

Alert Suppression:  Suppress Alert when ALL of the following apply:  Device (This is a Custom Property, and it IS set correctly on the device that triggered) is NOT equal to Router (I only want router down messages for now)

Time of day:  Default setting (24x7)

Trigger Actions:  Send email to (me), Log Alert to (log file)

Reset Actions:  Log alert to (log file)

Alert Sharing:  Default setting, except I changed severity from warning to critical.
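One way to sanity-check the settings above is to model the trigger and suppression conditions as plain boolean logic. The sketch below is only an illustration of how the configuration reads on paper, not how Advanced Alert Manager evaluates it internally; the `Node` class, the `device` field (standing in for the Device custom property), and the sample node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    status: str   # e.g. "Up", "Down", "Unreachable"
    device: str   # hypothetical stand-in for the "Device" custom property


def trigger_matches(node: Node) -> bool:
    # Trigger when ANY apply: Node Status == Down, Node Status == Unreachable
    return node.status in ("Down", "Unreachable")


def is_suppressed(node: Node) -> bool:
    # Suppress when ALL apply: Device is NOT equal to Router
    # (exact string comparison here is an assumption of this sketch)
    return node.device != "Router"


def should_fire(node: Node) -> bool:
    # The alert fires only if the trigger matches and suppression does not
    return trigger_matches(node) and not is_suppressed(node)


if __name__ == "__main__":
    samples = [
        Node("edge-rtr-1", "Down", "Router"),
        Node("edge-rtr-2", "Unreachable", "Router"),
        Node("core-sw-1", "Down", "Switch"),
        Node("edge-rtr-3", "Down", ""),        # custom property left empty
        Node("edge-rtr-4", "Up", "Router"),
    ]
    for n in samples:
        print(f"{n.name}: fires={should_fire(n)}")
```

Under this reading, a down router whose Device value is empty or does not match "Router" is suppressed, so the value actually stored on the affected nodes is worth checking alongside the trigger condition.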

Again, when I test the alert, it sends me an email with all the correct data I specified in the Trigger Actions, and it writes a line into the log file.  This leads me to believe that I'm not monitoring for the proper event.  Basically, our provider just had a DS3 issue, and it took down two of our routers.  It seems to me that this would count as the node being "down."  I added "unreachable" to the alert criteria, and I also changed the default Node Name property under the Alert Sharing tab from ${SysName} to ${NodeName}.

Now I guess I just have to sit back and wait to see if the changes I made work, unless anyone else can see any glaring errors as to how I set it up.

Just FYI, the same thing is happening with the interface utilization alerts I created.  The trigger condition exists, tests work fine, and the alert is turned on, but it doesn't appear to trigger when the criteria are met.

Edit:  I'm using Advanced Alert Manager V2013.2.1 (We're planning an update soon)

