Orion Platform 2017.1 - Hot Fix 3 - Slowness after install

July 26, 2017, 6:38 am

We recently installed Orion Platform 2017.1 - Hot Fix 3 on all of our Pollers. We're noticing that response times on the Orion Server are getting longer: loading up apps/desktop, starting remote sessions, launching the Orion website, etc. Response times are wonderful after rebooting. Has anyone else noticed similar issues after installing Hot Fix 3?

Thanks!

↧

QoE - RDP 443

October 18, 2017, 6:43 am

Hi,

I am looking for some advice on QoE. I have just discovered it and am trying to monitor specific application traffic.

The RDP config file connects to gatewayhostname:s:xxxxxxx.net:443

I can see an RDP monitor, which I have assigned, but it shows no hits. Will this inspect packets over 443? If not, could you recommend a solution?

Thank you in advance.

Andy

↧

How to remain logged into www.solarwinds.com

October 13, 2017, 11:12 am

This might just be me, but does anyone else have issues remaining logged into the www.solarwinds.com website? I can be logged in on one tab and click something that opens another tab, then all of a sudden I need to log in again, then click back to the tab that I was in only to find I am actually logged in. I have checked the "remain logged in for two weeks" option and that still does not help. What am I doing wrong?

↧

Service Now Integration Error

October 19, 2017, 11:31 am

Hello,

I am testing the integration of ServiceNow with SolarWinds. I am currently trying to set up alerts, and I am receiving an error: Failed to create incident. Check the connection to your ServiceNow instance on the ServiceNow configuration page. The simulation does create an incident in ServiceNow, so I am not sure why the error is occurring. Also, when I set up the alert for ServiceNow, the "configure alert: create service now" action reverts back to default with weird characters. I have tried saving, deleting, and recreating the alert, but it always reverts back to these weird characters and never saves what I change. Please help with this issue.

↧

Change email Notifications Font?

October 17, 2017, 10:49 am

How and where can I change the email Font from ‘Times New Roman’?

Thank you in advance for your help.

↧

How to integrate Solarwinds Offline Help Files

October 19, 2017, 1:43 pm

Hello,

I'm operating in a closed environment where internet access is unavailable. I can scan cleared data and, using procedures, move it from the internet to our closed system, but I don't want to go through this arduous process daily or weekly if possible.

Does anyone know how to properly implement offline help so that it is available for my userbase, and myself, without jotting down each URL and bouncing back and forth between the closed system and an internet-connected computer?

I located one post with great instructions, but all of the offline files have been removed for an unknown reason. (See: How to integrate Solarwinds Offline Help Files; LokiR was involved in resolving this before.)

Thank you for any support or assistance,

Matthew

↧

Custom Poller Atlas

December 30, 2013, 9:11 am

Hi,

I know it wasn't possible a year ago, but my understanding is that you can now add labels to Maps in Atlas to show Custom Polling information. I have sites connected together, and wish to display the current utilization of the connections. I currently have all the Custom Polling, Advanced Alerting and Reporting set up, but having this information on the Map would be extremely useful.

Regards,

****

↧

When you installed NPM, did you add Nodes manually or did you run discovery?

July 20, 2016, 12:33 am

We would like to improve the user experience, and for that reason I'd like to better understand whether our users prefer INITIALLY to add nodes manually or to run the product's network discovery in order to import devices into NPM.

↧

When will the Upgrade Advisor Include SolarWinds v12.2?

October 19, 2017, 1:43 pm

Looking to potentially upgrade a legacy SolarWinds Orion v10.5 up to v12.2

The Upgrade Advisor currently only shows up to SolarWinds Orion v12.1

↧

The Ultimate CPU Alert

September 10, 2013, 5:49 am

CPU alerts are a yawner. Grab the CPULoad, check it against a threshold (maybe even a per-node custom threshold, as explained here: TIPS & TRICKS: Stop The Madness: How to set alert thresholds per-device), cut the alert, move on, right?

Here's the problem: If you are working with sophisticated Operations or server staff, you probably already know that they hate CPU alerts because they are

always vague
frequently invalid
way too frequent because they are tuned too low OR
never triggered when you need them because they are tuned too high.

At the heart of the issue is the fact that high CPU, by itself, tells you nothing of use. So the CPU is high? So what? If I've got a box that is constantly running hot but it is keeping up with the work, that's called "correctly sized".

What you really need to know about CPU are 3 things:

How many processors are in the box
How many jobs are in the Processor Queue
What's the current CPU load

If you've got more jobs in the queue than you have CPUs and you also have high CPU, then you have the makings of a meaningful, actionable issue.
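Before getting into the SolarWinds setup, the rule itself can be written down in a few lines. This is only a sketch of the logic, not anything SolarWinds runs: the `CpuSample` class, the `is_actionable` function, and the default threshold of 90 are names and values made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    """One snapshot of the three facts that matter (hypothetical structure)."""
    cpu_count: int        # how many processors are in the box
    queue_length: float   # how many jobs are in the Processor Queue
    cpu_load: float       # current CPU load, in percent


def is_actionable(sample: CpuSample, load_threshold: float = 90.0) -> bool:
    """True only when the queue is deeper than the CPU count AND the load is high."""
    queue_backed_up = sample.queue_length > sample.cpu_count
    cpu_is_hot = sample.cpu_load > load_threshold
    return queue_backed_up and cpu_is_hot


if __name__ == "__main__":
    # Hot but keeping up with the work: "correctly sized", so no alert.
    print(is_actionable(CpuSample(cpu_count=4, queue_length=2, cpu_load=99)))
    # Hot and falling behind: this is the case worth waking someone up for.
    print(is_actionable(CpuSample(cpu_count=4, queue_length=10, cpu_load=95)))
```

Neither condition alone is enough; a box that is merely hot, or one with a long queue but idle CPUs, stays quiet.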

Let's add a little icing on the cake: When the condition above occurs, I want to know what the top 10 processes are at that moment, so I can get an idea of the likely culprits.

Interested? Let's get to work!

For this to work, you need NPM and SAM. You will be assigning one Perfmon counter to all your servers, and doing a little bit of SQL voodoo in the alert.

The Perfmon Counter:

In SAM, set up a new template. In it, you want to add a perfmon counter monitor named “Win_Processor_Queue_Len” that points to

Counter: “Processor Queue Length”,
Instance: (blank)
Category: “System”

After appropriate testing, adjustments, etc, you will eventually roll this template out to all your Windows systems.

The Alert Trigger

Your alert trigger is going to require some hardcore SQL. So you are setting up a Custom SQL Alert, with “Nodes” as the target table.

Along with the top part of the query that is automatically provided, you will add the following:

inner join APM_AlertsAndReportsData
    on (Nodes.NodeID = APM_AlertsAndReportsData.NodeId)
inner join (select c1.NodeID, COUNT(c1.CPUIndex) as CPUCount
            from (select DISTINCT CPUMultiLoad.NodeID, CPUMultiLoad.CPUIndex
                  from CPUMultiLoad) c1
            group by c1.NodeID) c2 on Nodes.NodeID = c2.NodeID
where
    APM_AlertsAndReportsData.ComponentName = 'Win_Processor_Queue_Len'
    AND APM_AlertsAndReportsData.StatisticData > c2.CPUCount
    AND Nodes.CPULoad > 90

What this is doing is

pulling the count of CPUs for this node from the CPUMultiLoad table
pulling the current statistic for the Win_Processor_Queue_Len perfmon counter
checking that the number of processes in the queue is greater than the number of CPUs
and finally checking that the CPULoad is over 90%

If the conditions in items 3 and 4 are true, you will get an alert.
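If you want to see how the joins behave before touching a production alert, here is a rough sandbox using an in-memory SQLite database. Everything about the setup is an assumption made for illustration: the three tables are stripped down to only the columns the query touches, the sample rows are invented, and the leading `SELECT ... FROM Nodes` is a stand-in for the part of the query that the alert editor provides. SQLite is not SQL Server, so treat this as a logic check only.

```python
import sqlite3

# Stand-in SELECT (assumed) followed by the joins and WHERE clause from the post.
TRIGGER_SQL = """
SELECT Nodes.NodeID, Nodes.Caption
FROM Nodes
INNER JOIN APM_AlertsAndReportsData
    ON (Nodes.NodeID = APM_AlertsAndReportsData.NodeId)
INNER JOIN (SELECT c1.NodeID, COUNT(c1.CPUIndex) AS CPUCount
            FROM (SELECT DISTINCT CPUMultiLoad.NodeID, CPUMultiLoad.CPUIndex
                  FROM CPUMultiLoad) c1
            GROUP BY c1.NodeID) c2 ON Nodes.NodeID = c2.NodeID
WHERE APM_AlertsAndReportsData.ComponentName = 'Win_Processor_Queue_Len'
  AND APM_AlertsAndReportsData.StatisticData > c2.CPUCount
  AND Nodes.CPULoad > 90
ORDER BY Nodes.NodeID
"""


def build_demo_db() -> sqlite3.Connection:
    """Create simplified, made-up versions of the three tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Nodes (NodeID INTEGER PRIMARY KEY, Caption TEXT, CPULoad REAL);
        CREATE TABLE APM_AlertsAndReportsData
            (NodeId INTEGER, ComponentName TEXT, StatisticData REAL);
        CREATE TABLE CPUMultiLoad (NodeID INTEGER, CPUIndex INTEGER);
    """)
    conn.executemany("INSERT INTO Nodes VALUES (?, ?, ?)", [
        (1, "web01", 95.0),   # hot, queue deeper than CPU count -> should alert
        (2, "db01", 97.0),    # hot, but queue is short          -> no alert
        (3, "app01", 50.0),   # long queue, but CPU is not hot   -> no alert
    ])
    conn.executemany("INSERT INTO APM_AlertsAndReportsData VALUES (?, ?, ?)", [
        (1, "Win_Processor_Queue_Len", 10.0),
        (2, "Win_Processor_Queue_Len", 2.0),
        (3, "Win_Processor_Queue_Len", 12.0),
    ])
    # Two rows per CPU index mimic repeated samples; DISTINCT collapses them.
    cpu_rows = [(node_id, index)
                for node_id, cpus in ((1, 4), (2, 4), (3, 2))
                for index in range(cpus)
                for _ in range(2)]
    conn.executemany("INSERT INTO CPUMultiLoad VALUES (?, ?)", cpu_rows)
    return conn


def nodes_in_alert(conn: sqlite3.Connection) -> list:
    """Run the trigger query and return the (NodeID, Caption) rows that match."""
    return conn.execute(TRIGGER_SQL).fetchall()


if __name__ == "__main__":
    print(nodes_in_alert(build_demo_db()))
```

Only the node that is both hot and backed up comes back. The duplicate CPUMultiLoad rows are there to show why the inner `select DISTINCT` matters: without it, the CPU count would be inflated by every repeated row.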

If you stop here, you have a nifty alert that will tell you when something meaningful (and bad) is going on with your server. But let’s kick it up a notch.

Trigger Action

Your alert action is going to have two key steps:

Run the “Solarwinds.APM.RealTimeProcessPoller.exe” utility to get the top 10 processes
After a 60 second delay, send your message

Get the Processes

The “Solarwinds.APM.RealTimeProcessPoller.exe” comes as part of SolarWinds SAM.

NOTE: If you installed SolarWinds somewhere other than the default location (C:\program files (x86)) then you will need to provide the full path to \SolarWinds\Orion\APM\Solarwinds.APM.RealTimeProcessPoller.exe

Otherwise, your command will look like this:

SolarWinds.APM.RealTimeProcessPoller.exe -n=${NodeID} -alert=${AlertDefID} -timeout=60

The only thing you may want to adjust is the -timeout value, if you find you are getting alerts coming back with no process information (i.e., it’s taking longer for the servers to respond).

Send Your Message

At its most basic, your alert message needs to look like this:

CPU on Node ${NodeName} is at ${CPULoad} at ${AlertTriggerTime}.

Top 10 processes at the time of the alert are:

${Notes}

NOTE: The ${Notes} field is populated with the top 10 processes as part of the previous action.

However, if you want to dress it up, you can include more information using more SQL voodoo:

CPU on Node ${NodeName} is at ${CPULoad} at ${AlertTriggerTime}.

There are ${SQL:Select APM_AlertsAndReportsData.StatisticData from APM_AlertsAndReportsData where APM_AlertsAndReportsData.NodeId = ${NodeID} and APM_AlertsAndReportsData.ComponentName = 'Win_Processor_Queue_Len'} items in the process queue and only ${SQL:Select COUNT(c1.CPUIndex) from (select DISTINCT CPUMultiLoad.NodeID, CPUMultiLoad.CPUIndex from CPUMultiLoad where CPUMultiLoad.nodeid = ${NodeID} ) c1 } CPUs to process them.

Top 10 processes at the time of the alert are:

${Notes}

If there is no list of processes, it's because it took longer than 2 minutes to collect off the server. We felt that delivering the alert fast was more important.

What that big ${SQL… block in the middle does is pull the current Win_Processor_Queue_Len statistic, along with the count of CPUs for this node from the CPUMultiLoad table. The result would read:

There are 10 items in the process queue and only 4 CPUs to process them.

After setting up the message, make sure you go to the “Alert Escalation” tab and set the “Delay the execution of this action” to at least 1 minute.

Summary

So there you have it. A CPU alert that not only tells you when something meaningful and actionable is happening, but also gives you (or your support staff) some initial information to get you started finding and resolving the problem.

As anecdotal proof of how valuable this is, within 24 hours of rolling out this alert at my company, we found 3 different applications which were chronically misbehaving across the enterprise. 2 resulted in our being able to prove an issue to the vendor (who didn’t believe us) and get a bug fix under way.

EDIT 2014-10-31:

As discovered by jbiggley in this post: Custom SQL Alerts - Do reset conditions also need to be custom?, the reset trigger is problematic for this alert (as with all custom SQL alerts). You can't just select "reset when the condition is no longer true". The solution, as elaborated by RichardLetts here: Warning about custom SQL alerts (reset trigger), is that the reset trigger needs to be:

inner join APM_AlertsAndReportsData
    on (Nodes.NodeID = APM_AlertsAndReportsData.NodeId)
inner join (select c1.NodeID, COUNT(c1.CPUIndex) as CPUCount
            from (select DISTINCT CPUMultiLoad.NodeID, CPUMultiLoad.CPUIndex
                  from CPUMultiLoad) c1
            group by c1.NodeID) c2 on Nodes.NodeID = c2.NodeID
where
    (APM_AlertsAndReportsData.ComponentName = 'Win_Processor_Queue_Len' AND APM_AlertsAndReportsData.StatisticData <= c2.CPUCount)
    OR Nodes.CPULoad <= 90

The key change here is that you want to reset when EITHER the number of processes in the queue is no longer greater than the number of CPUs, OR the CPU load is no longer over the threshold.
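One way to convince yourself the comparisons point the right way is to write the trigger and the reset side by side and check that one is exactly the opposite of the other. The sketch below does that with two made-up functions, `should_trigger` and `should_reset`, which mirror the numeric comparisons only (they ignore the ComponentName filter and the joins).

```python
def should_trigger(queue_length: float, cpu_count: int, cpu_load: float,
                   threshold: float = 90.0) -> bool:
    """Trigger: queue deeper than the CPU count AND load over the threshold."""
    return queue_length > cpu_count and cpu_load > threshold


def should_reset(queue_length: float, cpu_count: int, cpu_load: float,
                 threshold: float = 90.0) -> bool:
    """Reset: queue no deeper than the CPU count OR load at or under the threshold."""
    return queue_length <= cpu_count or cpu_load <= threshold


def reset_is_exact_opposite() -> bool:
    """Brute-force a small grid of values to confirm reset == not trigger."""
    for cpu_count in (1, 2, 4, 8):
        for queue_length in range(0, 17):
            for cpu_load in (0, 50, 89, 90, 91, 100):
                triggered = should_trigger(queue_length, cpu_count, cpu_load)
                reset = should_reset(queue_length, cpu_count, cpu_load)
                if reset == triggered:
                    return False
    return True


if __name__ == "__main__":
    print(reset_is_exact_opposite())
```

Flipping an AND of two "greater than" tests means flipping each test to "less than or equal" and joining them with OR. Get either half wrong and there are readings where the alert neither fires nor resets, or does both.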

EDIT 2015-02-23

Hat-Tip to garyuk who caught my greater-than / less-than confusion in the reset logic above. It's fixed now.

↧

Custom Table Resource Grouping Data with Expansion(+) Option Based On Site

October 19, 2017, 5:55 am

Hi Guys

I have created a custom table resource to display node reboots over a period of 7 days. Due to limited space on the dashboard, I am wondering: is there a way to add an expansion option on the grouping of sites? As you can see below, one site has a node that rebooted a few times this week. It would be great to have an expansion option on the site name; this would help with the space constraint on the display.

↧

Alert based on Event count?

October 19, 2017, 6:59 am

All,

Referencing this 6 year old posting, is there a new/better way to get the same result using NPM 11.5+ and the newer web-based alerting?

Re: Alert on event counts over a period of time

↧

Cisco ASA Context Discovery (NPM 12.2 & NCM 7.7)

October 3, 2017, 9:26 pm

With the latest combination of NPM and NCM, SolarWinds can discover contexts and list them all in the admin context. We tried this after the upgrade, and the admin context only lists one context out of the 60 contexts on that firewall.

Does anyone else have the same issue? Is the product team working on a hot fix?

↧

Recommendations on Training Videos for New SolarWinds tech

October 13, 2017, 9:57 am

I am looking for input on training videos for one of the other members of my team that is going to be working with me in SolarWinds. I have been going over the SolarWinds site and Thwack and have a bunch of videos and links but just looking for input to see what others have done if anything. This is what I was thinking about for training materials.

1. Start with Orion that way they can understand the foundation that SW is built on.

2. Nodes - go over what a node is and how to add a node.

3. Alerting - what is alerting and how to configure a basic alert

4. SAM - go over what SAM is and how it works.

At this point I am just working on this - we have DPA, VIM, NCM, VNQM, etc... but want to focus on Orion and alerting basics. Thanks!

↧

Cisco Firepower

October 20, 2017, 9:04 am

Is there a way to integrate Cisco Firepower into SolarWinds Orion?

↧

Volume Alert Macro Question

February 26, 2010, 8:09 am

This is going to sound a bit strange. That's probably because it is, but I'm wondering if anyone has any unique ideas/suggestions on how to leverage alert macros to pass along the correct variable to something like a TreeSize Pro, or WinDirStat that would generate a report upon a volume alert trigger.

TreeSize Pro offers some pretty extensive command-line options, but I can't seem to figure out how to leverage the Orion alert macros so I don't have to create an alert for each and every monitored volume.

http://www.jam-software.com/treesize/online_manual/EN/index.html?command_line_opt.html

The idea is simple. If any monitored volume goes beyond 90% capacity, generate a report telling the system administrators where the disk space is being used. Like all products in this category, TreeSize depends on a UNC path. I need some method of passing this UNC path to TreeSize in the alert trigger.

i.e., "treesize.exe \\%SERVERNAME%\%DRIVELETTER%$" or something cool that would translate into "treesize.exe \\servername\C$". I have no interest in specific directories or shares. For my purposes I'm only interested in the administrative drive letter shares.

Does anyone have any creative ideas on how to do this that doesn't involve creating lots of different alerts?
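To make the translation I'm after concrete, here is a rough sketch of a small wrapper that a single alert action could call. It assumes the alert can hand the server name and the volume's drive letter to a script as two values; which Orion alert macros would supply them is exactly the open question, so none are named here. The function names are made up, and the only thing passed to treesize.exe is the UNC path (no TreeSize command-line options are assumed).

```python
def admin_share_unc(server_name: str, drive_letter: str) -> str:
    """Build the administrative share path, e.g. \\\\servername\\C$ from 'servername', 'C:\\'."""
    server = server_name.strip().lstrip("\\")
    letter = drive_letter.strip()[:1].upper()
    if not server:
        raise ValueError("server name is empty")
    if not letter.isalpha():
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    return f"\\\\{server}\\{letter}$"


def treesize_command(server_name: str, drive_letter: str) -> list:
    """Return the argument list for the report run (hypothetical wrapper)."""
    return ["treesize.exe", admin_share_unc(server_name, drive_letter)]


if __name__ == "__main__":
    # Accepts the drive letter as "C", "C:" or "C:\".
    print(treesize_command("servername", "C:\\"))
```

With something like this, one alert covers every monitored volume, because the path is assembled at trigger time instead of being hard-coded per alert.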

↧

Mobile device tracking

March 27, 2014, 1:47 pm

Do you track mobile devices in your network?

↧

Which Help Desk / Service Desk are you using?

September 24, 2015, 1:22 am

↧

Do you need Network Access Control solution ?

January 14, 2014, 3:15 am

I wonder how many of you already have some version of a Network Access Control (NAC) solution in place. Theoretically speaking, mobile device management (MDM) is just a part of a bigger NAC, so I would like to ask you to vote "yes" even if you have only the MDM solution in your organization. If voting for "We don't have one but plan to buy", could you please leave a comment on why it is important for you to have one?

↧

F5 - vCMP

October 20, 2017, 10:22 am

↧
