I am working to resolve the issue described in the KB article below. The problem is that I have about 150 enabled volume alerts. Does anyone know of a table I can query to help me find alert trigger conditions with blank values, as the article mentions?
I have looked a bit and haven't found information on this yet. I can't imagine I am the only one wanting this kind of alert. In the screenshot below you can see that Interface Utilization has Warning and Critical values, and the drop-down shows Greater Than and Less Than.
Anyone know of a way to have two separate Warning and Critical Thresholds based on a low value and a high value?
I can see the Baseline Statistics with standard deviations for greater than and less than for all hours here:
Is this a feature in a newer version than my current NPM?
I haven't been able to find these values in the DB or SWQL. I am guessing they may be calculated on the fly and never stored in the DB. I could figure ways to create alerts based on custom properties if I could grab those values with a query.
Anyone have some magic customization that makes this manageable and not a manual task per interface?
Being the SolarWinds SME at my work, I sometimes have to explain things to others in IT. Today, I explained that we can stop alerts for a node we still want to collect data on by muting alerts on it, as opposed to unmanaging the node, which stops both alerts and data collection. Having explained that, I then muted 4 devices which have been consistently alerting, to show them how it was done. All fine and dandy until several hours later, when new alert e-mails on the same nodes showed up in my inbox. I immediately jumped into Manage Nodes and ran a search for the nodes. They show up as still muted! Could anyone suggest why we might still get alerts on nodes that are muted?
Additional info:
The alert is on Volume (as opposed to alerting on Interface or Node) status.
Scope of Alert: All objects in my environment
The triggers are:
Volume Percent Used is greater or equal to 90%
Volume Capacity_Class is equal to 90
The volumes are on servers. While I do have SAM installed, I think this alert just relies on NPM information.
Here are our versions: Orion Platform 2018.2 HF3, VNQM 4.5.0, NCM 7.8, NPM 12.3, NTA 4.4.0, SAM 6.6.1, Toolset 11.0.6
Running on Windows 2016, SQL 2016
The only thing I can think of is that someone edited the alert, which kicked off new alert e-mails when the changes were saved. But shouldn't those e-mails be squelched by the mute setting? Or maybe someone could have unmuted and then re-muted the nodes before I logged in and saw that. I checked the system events. While the events view has a search for managed node and unmanaged node, there is none for muted and unmuted.
Thanks, Eric
ps. If you point out this is not the most effective way to do an alert, I would agree. The creator of these alerts has 8 alerts set up, one for each of 8 different thresholds. Were I to have set this up, I would have used this as a trigger, and just one alert for all thresholds: Volume Percent Used is greater or equal to Capacity_Class
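To make the single-alert idea in the postscript concrete, here is a minimal sketch of the comparison outside Orion. It assumes each volume carries its own Capacity_Class value as a percentage; the Volume class and the sample volumes are hypothetical illustrations, not Orion objects or an Orion API.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    # Hypothetical stand-in for a monitored volume; not an Orion API object.
    caption: str
    percent_used: float
    capacity_class: float  # per-volume threshold, like the Capacity_Class custom property


def should_alert(volume: Volume) -> bool:
    """One rule covering every threshold: Percent Used >= Capacity_Class."""
    return volume.percent_used >= volume.capacity_class


# Made-up sample data for illustration only.
volumes = [
    Volume("C:\\ on SRV01", 91.0, 90.0),
    Volume("D:\\ on SRV01", 91.0, 95.0),
    Volume("E:\\ on SRV02", 80.0, 80.0),
]

alerting = [v.caption for v in volumes if should_alert(v)]
print(alerting)
```

Because the threshold travels with the volume, adding a ninth threshold would mean changing a custom property value rather than creating a ninth alert.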
This article gives you a quick way to review your current environment and run a health check. It will help you address the most common causes of performance issues on your server without sending diagnostics to SolarWinds support.
With this article you can also quickly audit your own environment to see whether it has been set up according to the SolarWinds MINIMUM requirements and the recommended settings that eliminate bottlenecks causing performance issues.
It also saves the time spent uploading diagnostics for Support where there is an air gap between the server and the outside world, since you can run the basic health check on the server itself.
Checking environment health internally has other benefits too, for example where security procedures do not allow uploading diagnostics to SolarWinds support.
Your check list
Server Hardware
Total Elements (Nodes / Interfaces / Volumes) being polled per server
Check free disk space on the Orion Server and SQL server
Check Your Server Polling Rate
SQL Server / Orion DB Size / Settings / Options
Check SQL Server Disk Performance
Orion Antivirus directory exclusion
Webpages Customization
Collect System diagnostics as below.
Navigate to Start -> SolarWinds Orion -> Documentation and Support
Launch Orion Diagnostics (the gray icon) > Click "Start"
This program will generate a .zip file as output.
Unzip it into a folder: Right Click > Select Extract Here.
Server Hardware
Let's first check whether your system hardware is anywhere near the SolarWinds MINIMUM recommendation.
Go to the SystemInformation folder > Open the SystemInfo.txt file
You will find the system hardware specification there. Below is an example where the system is assigned only two PHYSICAL CPU SOCKETS, which is below the SolarWinds MINIMUM recommendation.
You must have a MINIMUM of 4 PHYSICAL CPU SOCKETS here.
System Type: x64-based PC
✘Processor(s): 2 Processor(s) Installed.
[01]: Intel64 Family 6 Model 45 Stepping 7 GenuineIntel ~1600 Mhz
[02]: Intel64 Family 6 Model 45 Stepping 7 GenuineIntel ~1400 Mhz
Total Physical Memory: 49.082 MB
Available Physical Memory: 39.408 MB
Virtual Memory: Max Size: 56.250 MB
Virtual Memory: Available: 45.376 MB
Virtual Memory: In Use: 10.874 MB
Now open the SysInfo.csv file and check the current CPU load on the system and the CPU clock speed (GHz).
Below is an example where the CPU load is around 70% on the current system, for the two main reasons listed after the table.
Parameter | Value |
OSVersion | Windows Server 2012 R2 (Microsoft Windows NT 6.2.9200.0) |
CPUInformation | Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz |
CurrentCPUUssage | 70 % |
TotalPhysicalMemmory | 49152 MB |
FreePhysicalMemmory | 39802 MB |
FreeVirtualMemmory | 45843 MB |
FreeSpaceInPagingFiles | 7109 MB |
CurrentTimeZone | xxxx Standard Time (UTC+01:00:00) |
Low number of physical sockets assigned
Low CPU clock speed, less than 3.0 GHz
You should see a MINIMUM of 4 physical processor sockets, as below.
Strongly recommended: do NOT use a processor slower than 3.0 GHz. You will not get the performance you are looking for, even if neither the host nor the guest shows the CPU as busy.
Most likely you will see CPU spikes and Orion services consuming high CPU and memory. Once you move the same VM to a host with a 3.0 GHz or faster processor, all the above symptoms will be resolved.
With a processor slower than 3.0 GHz there might be other issues, such as SQL Server TCP connection timeout errors and a large amount of data queued in MSMQ on the system.
Make sure you have a host of at MINIMUM 3.0 GHz with Hyper-Threading active. It will improve guest performance significantly, and you will get full performance out of the SolarWinds application.
This is how you set up your VM in ESX
Here is an example of the output when you assign the number of CPU SOCKETS to the VM
System Model: VMware Virtual Platform
System Type: x64-based PC
Processor(s): 4 Processor(s) Installed.
[01]: Intel64 Family 6 Model 15 Stepping 1 GenuineIntel ~3493 Mhz
[02]: Intel64 Family 6 Model 15 Stepping 1 GenuineIntel ~3493 Mhz
[03]: Intel64 Family 6 Model 15 Stepping 1 GenuineIntel ~3493 Mhz
[04]: Intel64 Family 6 Model 15 Stepping 1 GenuineIntel ~3493 Mhz
BIOS Version: Phoenix Technologies LTD 6.00, 4/14/2014
Further, check how much memory is assigned and available to the system, and check Task Manager to see which application is consuming the most memory.
In the above case the system hardware is not even near the recommended SolarWinds production deployment, therefore the CPU load and system resource usage will remain high.
The following table lists minimum hardware requirements and recommendations for your SolarWinds Orion server.
Installing multiple SolarWinds Orion Platform products on the same computer may change the requirements.
Hardware requirements are listed by SolarWinds NPM license level.
These minimum requirements are for the Orion Platform. Products that run on the Orion Platform may have different requirements, such as different OS or memory requirements.
Consult your product-specific documentation for the exact requirements.
Hardware | SL100, SL250, SL500 | SL2000 | SLX |
---|---|---|---|
CPU speed | Quad core processor, 2.5 GHz or better | Quad core processor, 2.5 GHz or better | Quad core processor, 3.0 GHz or better |
For more details see below guide
NPM 12.0 system requirements - SolarWinds Worldwide, LLC. Help and Support
Check free disk space on the Orion Server and SQL Server
Make sure you have plenty of free space available on the Orion Server disks: the C drive and the installation directory.
Make sure you have plenty of free space available on the SQL Server where the actual DB is stored.
Total Elements (Nodes / Interfaces / Volumes) being polled per server
Go to folder "DB" > Open file "AllEngines.csv"
Check how many Elements you are polling per server
EngineID | Elements | Nodes | Interfaces | Volumes |
1 | 15828 | 934 | 6823 | 1071 |
2 | 16084 | 202 | 1305 | 77 |
With a single SolarWinds SLX license you can monitor up to 12000 elements; beyond this you will need an Additional Polling Engine.
For more details, see the Server Sizing guide.
Use additional polling engines for 12,000 or more monitored elements
If you plan to monitor 12,000 or more elements, SolarWinds recommends that you install additional polling engines on separate servers to help distribute the work load.
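The per-engine check can also be scripted. The following is a minimal sketch that flags engines at or above the 12,000-element guideline. It assumes the file is comma-separated with the column names shown in the table above; a real AllEngines.csv may use a different delimiter or carry extra columns, so treat the parsing as an assumption. The sample rows are copied from the table.

```python
import csv
import io

ELEMENT_LIMIT = 12000  # guideline quoted in the sizing text

# Sample rows copied from the table above. The comma delimiter is an assumption.
SAMPLE = """EngineID,Elements,Nodes,Interfaces,Volumes
1,15828,934,6823,1071
2,16084,202,1305,77
"""


def engines_over_limit(csv_text: str, limit: int = ELEMENT_LIMIT) -> list[tuple[str, int]]:
    """Return (EngineID, Elements) for every engine polling `limit` or more elements."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["EngineID"], int(row["Elements"]))
        for row in reader
        if int(row["Elements"]) >= limit
    ]


for engine_id, elements in engines_over_limit(SAMPLE):
    print(f"Engine {engine_id}: {elements} elements (guideline is {ELEMENT_LIMIT})")
```

With the sample rows, both engines are reported, which matches the reading of the table: each one is well past the point where an additional polling engine is recommended.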
I would also strongly advise you to check the blog post for any other questions if you are polling beyond 12000 Elements with single SLX Server.
Boost your server polling capacity with Stackable Poller license
Multi-module system guidelines
Check your Server Polling Rate
Go to Settings > Polling Engines .
Check whether any of the polling rates has increased.
Make sure none of the polling rates exceeds 100%.
POLLING COMPLETION | 100 |
ELEMENTS | 225 |
NETWORK NODE ELEMENTS | 18 |
VOLUME ELEMENTS | 50 |
INTERFACE ELEMENTS | 157 |
POLLING RATE | 2% of its maximum rate. |
ROUTING POLLING RATE | 0% of its maximum rate. |
HARDWARE HEALTH POLLING RATE | 0% of its maximum rate. |
VIM.VMWARE.POLLING | 2 |
F5 POLLING RATE | 0% of its maximum rate. |
WIRELESS HEAT MAP POLLING RATE | 0% of its maximum rate. |
WIRELESS POLLING RATE | 0% of its maximum rate. |
UNDP POLLING RATE | 0% of its maximum rate. |
SAM APPLICATION POLLING RATE | 170% of its maximum rate. |
If any polling rate has risen above 100%, you will notice high CPU and memory utilization on the system, which could affect system and application performance.
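If you copy the Polling Engines listing into a text file, a short script can pick out the rates above 100%. This is only a sketch: it assumes the lines look exactly like the listing above ("NAME | N% of its maximum rate."), and the function name is made up for illustration.

```python
import re

# Sample lines copied from the Polling Engines listing above.
SAMPLE = """POLLING RATE | 2% of its maximum rate. |
ROUTING POLLING RATE | 0% of its maximum rate. |
SAM APPLICATION POLLING RATE | 170% of its maximum rate. |
"""

# Matches "<name> | <number>% of its maximum rate"; other lines are ignored.
RATE_LINE = re.compile(r"^\s*(?P<name>[^|]+?)\s*\|\s*(?P<pct>\d+)% of its maximum rate")


def rates_over(text: str, limit: int = 100) -> dict[str, int]:
    """Map each polling-rate name to its percentage when it exceeds `limit`."""
    found = {}
    for line in text.splitlines():
        match = RATE_LINE.match(line)
        if match and int(match.group("pct")) > limit:
            found[match.group("name")] = int(match.group("pct"))
    return found


print(rates_over(SAMPLE))
```

In the sample, only the SAM application polling rate (170%) is returned, which is the one to investigate first.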
Orion DB Size and settings
Go to the DBInfo Folder > Open DatabaseInfo.csv file
Check the Database Recovery Mode
Check the Total Database Size
Default DB Recovery should be SIMPLE (Strongly recommended)
name | db_size | status |
SolarWindsOrion | 889274.25 MB | ✘Recovery=FULL |
For more details please see the post below and follow all the steps one by one to check your Orion Database Health and settings.
This guide will help you address the most common questions and issues related to the Orion database performance check and configuration without using the SolarWinds Database Administrator (DBA).
Quick Orion database health check guide
Check SQL Server Disk Performance
Orion Antivirus directory exclusion for NPM
Web pages recommended settings
Still have a question or need assistance?
Please feel free to submit a new support ticket in relation to your question or error. Our support lines are available 24/7.
http://www.solarwinds.com/support/ticket
You can also reach support by phone, 24/7.
What has your upgrade to NPM 12.3 on Orion Platform 2018.2 looked like? We on the product manager team would like to hear about it all, the good the bad and the ugly! For a starting point here is a quick getting started blog post on upgrading to 2018.2 Orion Platform: Preparing for the Upgrade to 2018.2
Hi,
I was looking for Solaris-10 & 11 MIB monitoring:
Setup Alerting of
ILOMs for T7s via mibs for hardware failure- squelched to start.
Hi all,
I'm trying to set up an alert on SolarWinds. Very basic: alert me when an interface on a specific node goes down. Reset the alert when it's back up.
Firstly, I can't believe there isn't a template for this already. So I set the alert up using the rather disorganised GUI, which wasn't exactly a walk in the park, since it's not intuitive at all!
Here's the alert trigger condition:
And reset condition:
and the trigger action:
Reset action is pretty much the same.
This alert is enabled, but doesn't seem to trigger. Is something set up wrong?
Thanks for any help.
Hi,
I created an alert that correlates multiple triggers.
Is it possible to represent this alert on the map with a colored icon or something like that?
thanks
I've been using SecureCRT (trial) for a few days and like it a lot, but I know PuTTY is free. I have yet to try PuTTY. Has anyone used both?
Can anyone give me your pros and cons for these two SSH products before I go and spend money?
thanks.
Hello Team,
In my organization we have a couple of TrendMicro hardware appliances. When I add these appliances to SolarWinds NPM 12.3, they are recognized as "net-snmp". Is there a MIB database for TrendMicro appliances in SolarWinds?
Or can anyone help me create a TrendMicro poller group in the NPM database so the devices are identified the same way as other vendors (HP, Cisco, etc.)?
Fairly new to NPM, so just fumbling my way through trying to tune it a bit so we get the right alerts and not so much noise.
One thing we discovered today is that when we ran a shutdown command on an interface that is monitored, we did not receive an interface status change alert email.
I assume that it first registers in NPM as interface "admin shutdown" (for which we have alerts disabled at the moment) and then as interface down.
I brought the same interface back up and physically removed the cable, and we did receive the interface status change email.
Could anyone please confirm this is the correct behavior?
Hello I am trying to figure out a way to add a descriptor into my alerts that tells new staff whether an alert is for a PROD, or TEST, or UAT/DEV node.
Below is the basic concept of what I'm looking to add:
This alert is on a {variable} node.
if ${NodeName} contains pr then PROD
if ${NodeName} contains ts then TEST
if ${NodeName} does not contain pr or ts then DEV\UAT
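The rules above can be written down as a small function to make the intended logic unambiguous. This is a sketch in plain Python, not Orion alert-variable syntax; the function name and the sample node names are made up, and the label is written as DEV/UAT here.

```python
def environment_label(node_name: str) -> str:
    """Classify a node as PROD, TEST, or DEV/UAT from substrings in its name."""
    name = node_name.lower()  # assumption: matching should ignore case
    if "pr" in name:
        return "PROD"
    if "ts" in name:
        return "TEST"
    return "DEV/UAT"


# Hypothetical node names for illustration.
for node in ["nyc-pr-sql01", "nyc-ts-web01", "nyc-dv-app01"]:
    print(f"This alert is on a {environment_label(node)} node: {node}")
```

One thing the sketch makes visible: a plain "contains" test is loose. A test node whose name happens to include "pr" elsewhere (for example a hypothetical "sprint-ts01") would be labelled PROD, so matching on a fixed position or a delimiter such as "-pr-" may be safer if your naming scheme allows it.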
Thanks in advance
I recently discovered that NPM was logging into my ASA's much more frequently than I expected. Every minute.
That seems excessive, and I question if it isn't the result of enabling the special NPM Secret Sauce called Advanced ASA Monitoring?
Do you have any experience with loads of logins to ASA's when the box is checked, or other proof for-or-against this theory?
The other thread is closed, so I figured I would start a new one. I usually get more help here than from actually contacting support.
So, the same issues as before, but instead of the server not responding after 36 hours or so, it took maybe a week. It is the SAME set of issues.
1. Server stopped sending alerts out sometime around 11AM on the 4th.
2. Logged onto server and opened Orion service manager and both the module engine and the administration service were going back and forth between running and stopping.
3. Orion could not connect to SQL
4. I have some alerts that are going out, but I am not sure whether they are legit.
5. After the reboot I notice that a good chunk of my nodes' interfaces are 'unknown'. This looks like it fixes itself, but again, something else is going on.
I have applied the 'hotfix' that you all pushed out to try to fix this.
I have done the change from streaming to buffered
I have done the registry change for the ports
The only thing I have not done is revert the snap shots back to June 14th prior to the update so Solarwinds is stable again.
At this point I am going to schedule a task in VMware to reboot the server every night. That is pretty much the only way I will know SolarWinds will actually work.
After installing, and then uninstalling the July Microsoft patches around .NET Framework, we have been dealing with some serious instability in our environment. If you aren't familiar with the patches, they're documented here:
Advisory on July 2018 .NET Framework Updates · Issue #74 · dotnet/announcements · GitHub
Microsoft released these, we installed, they pulled them and then released another one to fix the issues that were found, but then said they did not think that it fixed everything on the 2008 R2 servers (we have two in our environment - one being the core Orion server, along with 9 2012 R2 servers). We have since uninstalled all of the patches from our environment, but still experience the issues.
The issue we are seeing is that the businesslayerhost process is crashing very often on our pollers, and we have a ton of apps (mostly the ones monitored on our agent-based machines) going into an unknown state continuously throughout the day, about 1,000 out of the 8,000 total. The event log errors we are seeing are at the bottom of this email. My question is: are you guys aware of these patches causing instability with SolarWinds? What about on the agent side? I know the agent relies on .NET Framework, as it installs it during the installation process if it isn't already there. Given the way we are seeing the issues on our pollers, it almost makes me think that we are having trouble communicating with the agents, which causes the unknown app numbers to bounce around all day as the pollers struggle to get the data in time. I believe all of our agent-managed machines still have these patches, even though they are all 2012 R2 and up.
For reference, here is the version(s) we are at:
Errors:
Application: SolarWinds.BusinessLayerHost.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidOperationException
at SolarWinds.BusinessLayerHost.BusinessLayerHostService+<>c__DisplayClass25_0.<CheckPlugins>b__0(System.Object)
at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(System.Object)
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
Faulting application name: SolarWinds.BusinessLayerHost.exe, version: 2017.1.5300.1698, time stamp: 0x58ac4615
Faulting module name: KERNELBASE.dll, version: 6.3.9600.18938, time stamp: 0x5a7dd8a7
Exception code: 0xe0434352
Fault offset: 0x00015ef8
Faulting process id: 0x1d0c
Faulting application start time: 0x01d42e91b9959d28
Faulting application path: C:\Program Files (x86)\SolarWinds\Orion\SolarWinds.BusinessLayerHost.exe
Faulting module path: C:\WINDOWS\SYSTEM32\KERNELBASE.dll
Report Id: 00ec59ef-9a86-11e8-80fd-e4115bafdd78
Faulting package full name:
Faulting package-relative application ID:
Hello THWACKers!
The User Experience (UX) is doing some investigation into Palo Alto firewalls. We're interested in learning a bit about your current Palo Alto firewalls, and what tool(s) you're using to monitor/manage them.
For filling out this quick 10-minute survey, you'll get 500 points. As an added bonus, sending over examples of your Palo Alto configurations will get you up to 2,500 more points to use in the THWACK® store!
Is anyone else seeing an issue with adding VMware 6.5 nodes on NPM v11 and using the VMware polling option? I got this error when I tried to add a new server.
Error while connecting to VMware device - Unsupported namespace "urn:vim2" in content of SOAP body while parsing SOAP body
Edit: I updated to Solarwinds NPM v12 hoping that it would correct the issue. It didn't. It fails the credentials test when trying to edit the node.
I need to add approximately 500 ICMP nodes to NPM 12.1. I have an excel file with node names and associated IP addresses. Is there a way to add all and include the node name? When I do discover with just the IP addresses, the tool names all nodes with their IP addresses since they do not have SNMP ability.
Aaron
Many SolarWinds Orion users need to integrate the monitoring and alerting capabilities with task or ticket tracking systems. The recent addition of the integration to ServiceNow allowed many users to have an easy way to connect the two worlds without needing custom scripts. In a similar way, the research team at SolarWinds has written an integration with Atlassian JIRA.
Note: This integration currently relies on the "Alert Integration" feature in Orion. Ensure that is turned on for the alerts that you want to create JIRA issues for.
To gauge interest and find early bugs, we are releasing an unofficial alpha level integration for our customers who use JIRA. This integration synchronizes Orion alerts with JIRA issues. This post will describe the configuration and usage of this integration for users to try out and let us know additional features, usability, and usefulness of the software.
The integration is meant to link alerts in Orion with issues in JIRA. For example,
Step 1: Installation
Download the bits attached to this post.
Set up the service - install the service on your main Orion server.
Step 2: JIRA Integration Configuration
Next open the file settings.json from the installation directory with a text editor like Notepad.
Update the following fields with the correct information.
Orion Settings
Setting | Description |
---|---|
OrionHost | Hostname of your Orion server |
OrionUsername | Username of the Orion user you wish to access the Orion Information Service with. This must be an admin account in order to configure the alert notifications. |
OrionPassword | Password for the user above |
WebhookListenUrl | The url that JIRA will call back when an event occurs in JIRA. The port in this field will need to be accessible through the firewall so that the JIRA events will be able to communicate back to the Orion JIRA Integration service |
JIRA Settings
Setting | Description |
---|---|
ServerHost | Hostname for your JIRA server |
ServerPort | Port your JIRA server is listening on |
UseHttps | Whether the JIRA server is configured for HTTPS |
ProjectKey | Project key in JIRA that you want the issues opened in. |
IssueTypeKey | JIRA issue type you would like the Orion JIRA Integration to create issues as. This field is one of the predetermined list of options available in JIRA like "Story", "Task", etc. |
Username | JIRA username you want to use to connect to the JIRA server |
Password | Password for the above username |
AcknowledgedTransitionAction | Name of the transition to set the issue to when an alert is acknowledged in Orion. Set to an empty string if not used. |
ResetTransitionAction | Name of the transition to set the issue to when the alert is reset. Set to an empty string if not used |
EventsToListenFor | This field is mainly for development use so do not adjust this field |
FieldAssignment | Specify the mapping of Orion user properties to JIRA custom fields. The Orion properties must be added under the "Alert Integration" section of the Alert Summary page.
A sample definition would be
"FieldAssignment" : { "JiraField1" : "IP Address", "JiraField2" : "Caption" } |
Sample settings file
{
  "OrionHost": "orion.foo.local",
  "OrionUsername": "Test",
  "OrionPassword": "test",
  "WebhookListenUrlRoot": "http://localhost:8080",
  "Jira": {
    "ServerHost": "jira-01.foo.local",
    "ServerPort": "8080",
    "UseHttps": false,
    "ProjectKey": "ITX",
    "IssueTypeKey": "Task",
    "Username": "JiraUser",
    "Password": "JiraPass",
    "AcknowledgedTransitionAction": "",
    "ResetTransitionAction": "Done",
    "EventsToListenFor": [],
    "FieldAssignment": {}
  }
}
Restart the Orion JIRA Integration windows service after updating the configuration file.
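Before restarting the service, it can help to sanity-check the edited settings.json. The following is a minimal sketch, not part of the integration: it only confirms that the file parses as strict JSON and that the keys listed in the tables above are present. The function name is made up, and the webhook key is deliberately left unchecked because its exact spelling is not certain from this post.

```python
import json

# Key names taken from the settings tables and the sample file in this post.
REQUIRED_TOP_LEVEL = ["OrionHost", "OrionUsername", "OrionPassword", "Jira"]
REQUIRED_JIRA = ["ServerHost", "ServerPort", "UseHttps", "ProjectKey",
                 "IssueTypeKey", "Username", "Password"]


def missing_settings(text: str) -> list[str]:
    """Return the names of missing keys.

    Raises json.JSONDecodeError if the text is not strict JSON.
    """
    settings = json.loads(text)
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in settings]
    jira = settings.get("Jira", {})
    missing += [f"Jira.{key}" for key in REQUIRED_JIRA if key not in jira]
    return missing


# A deliberately incomplete example: the Jira section lacks ProjectKey.
example = """
{
  "OrionHost": "orion.foo.local",
  "OrionUsername": "Test",
  "OrionPassword": "test",
  "Jira": {
    "ServerHost": "jira-01.foo.local",
    "ServerPort": "8080",
    "UseHttps": false,
    "IssueTypeKey": "Task",
    "Username": "JiraUser",
    "Password": "JiraPass"
  }
}
"""
print(missing_settings(example))
```

To check a real file, read it with open(path, encoding="utf-8") and pass the contents to missing_settings. Python's json module is strict, so a trailing comma after the last entry of an object is reported as an error; whether the integration service itself tolerates that is not something this sketch can tell you.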
Step 3: Orion Alert Configuration
Currently, the integration relies on the "Alert Integration" feature in Orion. Edit each alert you want to create JIRA tickets for and make sure the alert is enabled and the "Alert Integration" checkbox is checked. To do so
Step 4: Test the integration
Trigger a test alert in Orion and confirm that the desired task is created in JIRA.
See the issue in JIRA
Update the alert notes. Confirm that alert notes you add in Orion get entered as a comment in the JIRA task.
See the notes as comments in the JIRA task.
Test complete! Congratulations, you have now just used the SolarWinds Orion JIRA Integration.
Step 5: Enjoy and give feedback
Thank you for using this alpha stage integration and please let us know by responding to this post any additional requests you have for this sort of alerting integration. Also, since this is not an official release, you can not call SolarWinds support and get support for this feature. Support will be provided through this post from the research team at SolarWinds. If things are not functioning well, please stop the Orion JIRA Integration windows service and set the service to be disabled so that it does not restart on reboot.
Thanks
SolarWinds Architecture Research and Innovation Team
Message was edited by: Zeid Derhally Updated attachment to provide more logging.
Message was edited by: Zeid Derhally Updated to include information about the Field Assignment functionality
I just finished a build out of NPM 12.3 using fresh Server 2016 VMs. I was testing out some new weather maps for our new main dashboard and I'm getting an error on the Network Atlas when I link to the National Weather Service's radar. I am getting the error "no image could be found at the specified location." It works fine in IE on the server.
Is anyone else getting an error on their environment with this link?