Quantcast
Channel: THWACK: All Content - Network Performance Monitor
Viewing all articles
Browse latest Browse all 21870

Alert Prioritising Dashboard (SWQL) with Alert Time and Acknowledge Status

$
0
0

I have added 2 addition columns for Alert Prioritising Dashboard which is introduced in my previous document “Alert Prioritising Dashboard (SWQL) for Problematic Nodes (Servers)” (https://thwack.solarwinds.com/docs/DOC-176521)

 

One column is AlertTime, another one is Acknowledge (Ack). The Ack column is click-able. Right click it and open a new windows to View or Acknowledge an alert.

 

alert-000.jpg

 

 

Here are the steps to add the 2 columns:

 

  • Create an alert using "Advanced Alert Manager"

 

 

Alert Name: “SAM Alert - Nodes with Issues (SQL)”

 

Alert Description:

Generate an alert for a node if any condition below is met:

Node is Down

OR

Node is UP but AvgResponseTime > 500

OR

Node is UP but CPU Usage > 95%

OR

Node is UP but Memory Usage > 95%

OR

Node is UP but at least one Volume Usage > 95% (if apply)

OR

Node is UP but at least one Hardware component Status is in Warning or worse condition (if apply)

OR

Node is UP but at least one application monitor Status is Warning or worse condition (if apply)

 

alert-001.jpg

 

Trigger Condition: (SQL)

 

 

LEFT JOIN

(

SELECT NodeID, MAX(VolumePercentUsed) AS vPercentage

FROM Volumes

WHERE VolumeType = 'Fixed Disk'

GROUP BY NodeID

) AS h ON h.NodeID = Nodes.NodeID

LEFT JOIN

(SELECT c.NodeID AS NodeID,

  MAX(CASE

   WHEN c.Status=1 THEN 0  -- UP

   WHEN c.Status=17 THEN 1  -- - Undefined

   WHEN c.Status=0 THEN 2  -- Unknown

   WHEN c.Status=3 THEN 4  -- Warning

   WHEN c.Status=14 THEN 5  -- Critical

   ELSE NULL

   END

   ) AS HWStatusID

FROM APM_HardwareCategoryStatus AS c

JOIN APM_HardwareItem AS e ON e.HardwareCategoryStatusID = c.ID

Group BY c.NodeID

) AS g ON g.NodeID = Nodes.NodeID

LEFT JOIN

(SELECT NodeID,

  MAX(CASE

    WHEN ApplicationAvailability='Up' THEN 0  -- UP

    WHEN ApplicationAvailability='Unmanaged' THEN 1  -- Unmanaged

    WHEN ApplicationAvailability='Unknown' THEN 2  -- Unknown

    WHEN ApplicationAvailability='Unreachable' THEN 3  -- Unreachable

    WHEN ApplicationAvailability='Warning' THEN 4  -- Warning

    WHEN ApplicationAvailability='Critical' THEN 5  -- Critical

    WHEN ApplicationAvailability='Down' THEN 6   -- Down

    ELSE NULL

    END

   ) AS aStatusID

FROM APM_CurrentStatusOfApplication

GROUP BY NodeID

) AS i ON i.NodeID = Nodes.NodeID

WHERE

(

    Nodes.Status = 2

    OR

    (Nodes.Status=1 AND Nodes.AvgResponseTime > 500)

    OR

    (Nodes.Status=1 AND Nodes.CPULoad > 95)

    OR

    (Nodes.Status=1 AND Nodes.PercentMemoryUsed > 95)

    OR

    (Nodes.Status=1 AND h.vPercentage > 95.0 AND h.vPercentage < 100.0)

    OR

   (Nodes.Status=1 AND g.HWStatusID > 3)

    OR

    (Nodes.Status=1 AND i.aStatusID > 3)

)

 

alert-002.jpg

 

 

Reset Condition: (SQL)


LEFT JOIN

(

SELECT NodeID, MAX(VolumePercentUsed) AS vPercentage

FROM Volumes

WHERE VolumeType = 'Fixed Disk'

GROUP BY NodeID

) AS h ON h.NodeID = Nodes.NodeID

LEFT JOIN

(SELECT c.NodeID AS NodeID,

  MAX(CASE

   WHEN c.Status=1 THEN 0  -- UP

   WHEN c.Status=17 THEN 1  -- - Undefined

   WHEN c.Status=0 THEN 2  -- Unknown

   WHEN c.Status=3 THEN 4  -- Warning

   WHEN c.Status=14 THEN 5  -- Critical

   ELSE NULL

   END

   ) AS HWStatusID

FROM APM_HardwareCategoryStatus AS c

JOIN APM_HardwareItem AS e ON e.HardwareCategoryStatusID = c.ID

Group BY c.NodeID

) AS g ON g.NodeID = Nodes.NodeID

LEFT JOIN

(SELECT NodeID,

  MAX(CASE

    WHEN ApplicationAvailability='Up' THEN 0  -- UP

    WHEN ApplicationAvailability='Unmanaged' THEN 1  -- Unmanaged

    WHEN ApplicationAvailability='Unknown' THEN 2  -- Unknown

    WHEN ApplicationAvailability='Unreachable' THEN 3  -- Unreachable

    WHEN ApplicationAvailability='Warning' THEN 4  -- Warning

    WHEN ApplicationAvailability='Critical' THEN 5  -- Critical

    WHEN ApplicationAvailability='Down' THEN 6   -- Down

    ELSE NULL

    END

   ) AS aStatusID

FROM APM_CurrentStatusOfApplication

GROUP BY NodeID

) AS i ON i.NodeID = Nodes.NodeID

WHERE

(

  Nodes.Status=1

  AND Nodes.AvgResponseTime < 500

  AND Nodes.CPULoad < 95

  AND (Nodes.PercentMemoryUsed < 95 OR Nodes.PercentMemoryUsed IS NULL)

  AND (h.vPercentage < 95.0 OR h.vPercentage IS NULL)

  AND (g.HWStatusID < 3 OR g.HWStatusID IS NULL)

  AND (i.aStatusID < 4 OR i.aStatusID is NULL)

)


alert-003.jpg


 

  • Find Alert Definition ID:

 

If you import the attached alert definition file (SAM_Alert-NodesWithIssuesSQL.AlertDefinition), the AlertDefID is “14d4ba87-7ed3-4b0d-9119-b592b8cfacd8”.

If you created your own alert, using the method below to find out AlertDefID:

·         Create a temporary summary view in Solarwinds Web Console, add a “Custom Query” Resource.

·         Edit the “Custom Query”, add the SWQL query below:

SELECT AlertDefID, Name

FROM Orion.AlertDefinitions

WHERE Name = 'SAM Alert - Nodes with Issues (SQL)'

 

·         Back to the view, and find out the AlertDefID

alert-005.jpg

  • Create Alert Dashboard:

 

Follow the steps in my previous document (https://thwack.solarwinds.com/docs/DOC-176521) to create the alert page

 

The new SWQL queries are attached.

 

If you import my Alert, the queries should work out-of-box. If you creates your own alert, please replace AlertDefID
“14d4ba87-7ed3-4b0d-9119-b592b8cfacd82 with your own one. (2 occurrences in the query)

 

Thanks


Viewing all articles
Browse latest Browse all 21870

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>