Channel: THWACK: All Content - Network Performance Monitor

Better volume threshold graphs


When looking at a volume details view, you may have noticed that the threshold colors (green/yellow/red) of the percent used of a volume only use the global thresholds and not the custom volume thresholds you have specified.

 

For example, you will see something like:

Default volume info

which shows all green, even though in my example I've changed the thresholds on /tmp to warning > 30% and critical > 36%, and on /boot to warning > 15% and critical > 19% (see below for the threshold setting screen for /tmp):

Capacity Planning Thresholds

What I've done is to write some custom SQL that shows the volume information in a more pleasant way. Now I see:

New volume info

 

where the coloring actually uses the per-volume (filesystem) thresholds instead of the system-wide generic volume threshold defaults. It also shows the thresholds both graphically and as text. I accomplished this by specifying the SQL shown below in a custom table. Using CSS might be a better way to draw the 'graph' than HTML tables, but I copied code I'd previously written and it accomplishes what I want.

 

SELECT

    Caption AS InstanceCaption,

    VolumeSize,

    (

        CAST(WarningThreshold AS varchar) +

        '% / ' +

        CAST(CriticalThreshold AS varchar) +

        '%'

    ) AS Thresholds,

    (

        CASE

            WHEN ROUND(CAST(VolumePercentUsed as float), 0) >= CAST(CriticalThreshold as float) THEN '<span style="background-color:#FF9999;color:black">'

            WHEN ROUND(CAST(VolumePercentUsed as float), 0) >= CAST(WarningThreshold as float) THEN '<span style="background-color:#ffff66;color:black">'

                ELSE '<span>'

        END +

        '<b>' +

        CAST(ROUND(CAST(VolumePercentUsed as float), 0) as varchar) +

        ' %</b></span>'

    ) AS VolumePercentUsed,

    (

        '<table style="display: inline-block; width:100px; height:10px; border-collapse:collapse; vertical-align:top;" cellspacing="0" cellpadding="0">' +

            '<tbody>' +

            '<tr>' +

                '<td style="background-color:' +

                    CASE

                        WHEN ROUND(CAST(VolumePercentUsed as float), 0) >= CAST(CriticalThreshold as float) THEN '#FF0000'

                        WHEN ROUND(CAST(VolumePercentUsed as float), 0) >= CAST(WarningThreshold as float) THEN '#FFFF00'

                            ELSE '#00FF00'

                    END +

                    ';width:' +

                        CAST(ROUND(CAST(VolumePercentUsed as float), 0) as varchar) +

                    'px;height:8px;border:1px solid black;padding:0px;">' +

                '</td>' +

                '<td style="width:' +

                    CAST(ROUND(100 - CAST(VolumePercentUsed as float), 0) as varchar) +

                    'px;border:1px solid black;padding:0px;">' +

                '</td>' +

            '</tr>' +

            '<tr>' +

                '<td colspan="2" style="height:2px;border:1px solid black;padding:0px;" valign="bottom">' +

                    '<table style="width:100px; height:2px;" cellspacing="0" cellpadding="0">' +

                        '<tbody>' +

                            '<tr>' +

                                '<td style="background-color:#00FF00; width:' +

                                    CAST(WarningThreshold as varchar) +

                                'px;border:0;padding:0;">' +

                                '</td>' +

                                '<td style="background-color:#FFFF00; width:' +

                                    CAST((CriticalThreshold - WarningThreshold) AS varchar) +

                                'px;border:0;padding:0;">' +

                                '</td>' +

                                '<td style="background-color:#FF0000; width:' +

                                    CAST((100 - CAST(CriticalThreshold as float)) AS varchar) +

                                'px;border:0;padding:0;">' +

                                '</td>' +

                            '</tr>' +

                        '</tbody>' +

                    '</table>' +

                '</td>' +

            '</tr>' +

            '</tbody>' +

        '</table>'

    ) AS USED,

    ('/Orion/NetPerfMon/VolumeDetails.aspx?NetObject=V:' + CAST(InstanceId as varchar)) AS DetailsUrl

FROM

    (

    SELECT

        n.NodeID,

        v.[VolumeID] AS InstanceId,

        v.[VolumeSize] AS VolumeSize,

        v.Caption,

        ISNULL(fm.[ThresholdType], 0) AS ThresholdType,

        v.[VolumePercentUsed] AS VolumePercentUsed,

        Floor((CASE WHEN (fcs.[WarningThreshold] IS NOT NULL)

                        THEN fcs.[WarningThreshold]

                    WHEN (fm.[ThresholdType] IS NULL OR fm.[ThresholdType] = 0)

                        THEN global.[GlobalWarningThreshold]

                    ELSE NULL END)) AS WarningThreshold,

        Floor((CASE WHEN (fcs.[CriticalThreshold] IS NOT NULL)

                        THEN fcs.[CriticalThreshold]

                    WHEN (fm.[ThresholdType] IS NULL OR fm.[ThresholdType] = 0)

                        THEN global.[GlobalCriticalThreshold]

                    ELSE NULL END)) AS CriticalThreshold

      FROM

        [dbo].Nodes n

      CROSS JOIN

        (SELECT

            Id,

            Name,

            ThresholdType,

            GETUTCDATE() AS CurrentTime,

            NULL AS ThresholdName

          FROM [dbo].[ForecastMetrics]

          WHERE

            (EntityType = N'Orion.Volumes' AND [Name] = N'Forecast.Metric.PercentDiskUsed')

        ) fm

      CROSS APPLY

        (SELECT

            ISNULL

                ((SELECT    TOP 1 sts.CurrentValue

                    FROM        [dbo].[ForecastMetrics] AS fmm WITH (NOLOCK)

                    INNER JOIN [dbo].[Settings] AS sts WITH (NOLOCK) ON sts.[SettingID] LIKE fmm.[CriticalThresholdSettingID]

                    WHERE    fmm.EntityType = N'Orion.Volumes'), 0) GlobalCriticalThreshold,

            ISNULL

                ((SELECT    TOP 1 sts.CurrentValue

                    FROM        [dbo].[ForecastMetrics] AS fmm WITH (NOLOCK)

                    INNER JOIN [dbo].[Settings] AS sts WITH (NOLOCK) ON sts.[SettingID] LIKE fmm.[WarningThresholdSettingID]

                    WHERE    fmm.EntityType = N'Orion.Volumes'), 0) GlobalWarningThreshold

        ) global

        JOIN [dbo].[Volumes] AS v ON v.[nodeID] = n.nodeID

        LEFT JOIN ForecastCapacitySettings AS fcs ON v.[VolumeID] = fcs.[InstanceId] AND fcs.[MetricId] = fm.[Id]

        where n.nodeid = ${NodeID}

        ) AS ForecastThresholds

 

I'll leave it up to you to specify all of the table layout settings.
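
For anyone who wants to check the logic without reading through the string concatenation, here is a rough restatement in Python of what the query is doing: the threshold fallback from the inner SELECT and the color choice from the CASE expressions. This is only a sketch and not part of the custom table. The function names are made up for illustration, and the rounding helper is an assumption meant to mirror T-SQL ROUND for non-negative values.

```python
import math


def effective_threshold(per_volume, threshold_type, global_default):
    """Mirror the CASE expression: prefer the per-volume value, else the global default."""
    if per_volume is not None:
        return math.floor(per_volume)
    if threshold_type is None or threshold_type == 0:
        return math.floor(global_default)
    return None  # the query yields NULL in this case


def round_half_up(value):
    """Round like T-SQL ROUND(x, 0) for non-negative x (Python's round() rounds half to even)."""
    return math.floor(value + 0.5)


def bar_color(percent_used, warning, critical):
    """Pick the bar color the same way the CASE in the USED column does."""
    used = round_half_up(percent_used)
    if critical is not None and used >= critical:
        return "#FF0000"
    if warning is not None and used >= warning:
        return "#FFFF00"
    return "#00FF00"


# /tmp from the example: warning 30, critical 36 (the usage figures are invented)
print(bar_color(33.2, 30, 36))  # prints #FFFF00
print(bar_color(12.0, 30, 36))  # prints #00FF00
```

A missing threshold is treated as "never exceeded" here, which matches how a comparison against NULL falls through to the ELSE branch in SQL.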


Creating custom Solarwinds reports using Orion Report Writer


This seems like an easy task. I am trying to use the Orion Report Writer to create a custom report from a SWQL query I already have. I create a new report and go to the SQL tab. There is text saying "Enter SQL query here"; however, I cannot type any text. Am I missing something simple here?

 

 

Tell Us Your Unknown Devices v2.0


Those who have been part of the Thwack Community for a while may be familiar with the long-running Tell us your "Unknown" devices! thread, which had been active since 2007. That thread had become too unwieldy, and most of the user submissions had been implemented many years ago. I recently reviewed each and every posting in that thread and verified which had been implemented in-product and which had not, so the latter could be included in a forthcoming release. With that done, it was time to lock that thread for good and start anew, this time providing a bit more guidance along the way to ensure everyone is successful in providing the information required to properly identify these devices.

 

What is an 'Unknown' Device anyway?

 

Orion does its best to automatically identify and classify nodes as they're added to Orion. There are, however, new device types and models released all the time. It's entirely possible you might be managing a device right now that Orion is unable to properly identify. You can find these easily by going to [Settings - Manage Nodes], changing the 'Group by:' option to 'Machine Type' and clicking on the 'Unknown' category. It's also helpful to add the 'Polling Method' column to the layout, as this thread pertains exclusively to SNMP managed nodes.

 

Any SNMP managed nodes listed under the 'Unknown' Machine Type category are prime candidates for submission to this thread. All that's required is that you provide the device's SNMP System Object Identifier (SysObjectID), as well as the Make & Model of the device associated with that SysObjectID. This post is an excellent example of the perfect submission.

 

What Exactly is a SysObjectID?

 

I have yet to find a clearer definition of what the SysObjectID (System Object Identifier) is than the following excerpt, which can typically be found verbatim in virtually every vendor's MIB file.

 

Object Name: sysObjectID
Object ID: 1.3.6.1.2.1.1.2.0
Object Syntax: OBJECT IDENTIFIER
Object Access: read-only
Object Status: mandatory
Object Description: The vendor's authoritative identification of the network management subsystem contained in the entity. This value is allocated within the SMI enterprises subtree (1.3.6.1.4.1) and provides an easy and unambiguous means for determining `what kind of box' is being managed. For example, if vendor `Flintstones, Inc.' was assigned the subtree 1.3.6.1.4.1.4242, it could assign the identifier 1.3.6.1.4.1.4242.1.1 to its `Fred Router'.

 

Essentially, it's a string of numbers in dotted notation that is (hopefully) unique to at least the manufacturer, and in most cases, to the specific make and model of the device being monitored. It's how we identify, for example, that the device vendor is 'Cisco' and the model is a 'Nexus C7018'. All System Object IDs begin with '1.3.6.1.4.1' followed by a number which uniquely identifies the manufacturer. The numbers which then follow typically identify the specific model of the device.
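
To make the structure concrete, here is a small Python sketch that pulls the manufacturer number out of a SysObjectID by stripping the '1.3.6.1.4.1' prefix. It is only an illustration and has nothing to do with how Orion itself does the lookup; the function name is made up, and the sample value is the 'Flintstones, Inc.' example from the excerpt above.

```python
ENTERPRISES_PREFIX = "1.3.6.1.4.1."


def enterprise_number(sys_object_id):
    """Return the manufacturer's number from a SysObjectID, or None if it is
    not under the enterprises subtree."""
    oid = sys_object_id.strip().lstrip(".")  # tolerate a leading dot
    if not oid.startswith(ENTERPRISES_PREFIX):
        return None
    first_arc = oid[len(ENTERPRISES_PREFIX):].split(".")[0]
    return int(first_arc) if first_arc.isdigit() else None


print(enterprise_number("1.3.6.1.4.1.4242.1.1"))  # prints 4242
```

Whatever follows that number is vendor-defined, which is why the make and model still have to be supplied alongside the SysObjectID.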

 

Where Can I Locate the SysObjectID?

 

If the device is already managed as a Node in Orion then you can locate the SysObjectID in the 'Node Details' resource as shown below, when viewing the node in the Orion web interface.

 

Node Details

NET-SNMP

Alternatively, you can use NET-SNMP to query the following SNMP OID to return the unique SysObjectID.

 

1.3.6.1.2.1.1.2.0

 

Below is an example of the 'snmpget' command line arguments which will return you the SysObjectID for the device.

 

 snmpget -v2c -On -c public 10.199.5.103 1.3.6.1.2.1.1.2.0

 

The example above is executed against a device with the IP address of '10.199.5.103' using SNMPv2c, with the community string 'public'. Below is a screenshot of the resulting output from that command. The string of numbers and periods highlighted in yellow below is this device's unique SysObjectID.
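
If you are collecting SysObjectIDs from many devices, the value can be picked out of the snmpget output with a few lines of Python. This is a sketch that assumes the usual numeric output format produced with -On (an "= OID:" marker followed by the dotted value); the sample line in the comment is made up for illustration and is not taken from the screenshot, and the function name is hypothetical.

```python
import re

# Matches the value part of a line such as
#   .1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.8072.3.2.10
OID_VALUE = re.compile(r"=\s*OID:\s*\.?(\d+(?:\.\d+)+)\s*$")


def sys_object_id_from_snmpget(line):
    """Return the dotted SysObjectID from one line of snmpget output, or None."""
    match = OID_VALUE.search(line.strip())
    return match.group(1) if match else None
```

Anything that is not an OID answer, such as a timeout message, simply comes back as None.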

 

My Device Incorrectly Appears Listed as 'NET-SNMP'

 

Linux hosts, virtual appliances, and even some network equipment built on Linux, FreeBSD, etc. are often identified as 'NET-SNMP'. This is because the SNMP Daemon running on those hosts is, you guessed it, NET-SNMP. Unfortunately, these vendors have for some reason chosen not to implement their own unique SysObjectID, and have instead kept the default SysObjectID '1.3.6.1.4.1.8072.3.2.10', which is designated for NET-SNMP. If you have a device such as this, fret not. There are a few options available to you if you'd like these devices to be properly identified by their appropriate vendor's make & model within Orion.

 

Install The Orion Linux Agent

 

The easiest solution would be to install the Orion Linux Agent on the device which is reporting itself to be 'NET-SNMP'. The Linux Agent does not rely upon SNMP to identify the machine type or vendor. Instead, the Agent will report the Vendor as 'Linux' and the 'Machine Type' as the Linux distribution running on the device as depicted in the screenshots below.

 

Red Hat

Citrix XenServer

 

 

Modify NET-SNMP Configuration

 

Another approach is to customize NET-SNMP and Orion to properly reflect the Vendor and Machine Type. Simply follow the steps outlined in adatole's post entitled No More Net-SNMP Nodes. This method uses a script, osname.sh, which is executed when a particular OID is queried. Next, you would create a custom Device Poller to query that newly created OID and populate the Machine Type value in Orion for that device.

 

If you find it more fun to follow along, you can watch adatole walk you through the entire process in the following video.

 

 

 

Can't I Just Upload My Vendor's MIB File Here And You Figure It Out?

 

While it would be nice if that's how it worked, unfortunately many (or most) vendors don't include this information within their MIB files. MIB files include a listing of all possible OIDs which could be polled across a wide variety of different devices (typically an entire product family), but they don't include the values which are returned by the devices (Enums notwithstanding). For that reason we need users, such as yourself, to post the SysObjectIDs in this thread, along with the device vendor and model information, so they can be included in our database.

 

If you'd still like your device's MIB file included in the Orion MIB database, for use with Network Performance Monitor's Universal Device Poller or the Orion Platform's SNMP Trap Receiver, simply follow the steps outlined in the KB article at the link below. The latest version of the MIB database, containing your submissions, can always be downloaded from within the Customer Portal.

 

Request additional MIBs to the SNMP MIB browser database - SolarWinds Worldwide, LLC. Help and Support

SQL to SWQL - Report writer to web reports


I'm trying to migrate some of my reports into web reports so that I have one report rather than a multitude.

 

I have a large custom report that looks for the average availability of last month per custom group. I can't get it to work in the SWQL and was wondering if anyone could help? If I can get this small section to work then I can edit the rest of the query.

 

 

SELECT TOP 10000 AVG(ResponseTime.Availability) AS AVERAGE_of_Availability,

       COUNT(DISTINCT Nodes.NodeID) AS COUNT_of_NodeID,

       'Campus-Core-Agg' AS AREA

FROM   Nodes INNER JOIN ResponseTime ON (Nodes.NodeID = ResponseTime.NodeID)

WHERE  ( DateTime between  (DATEADD(m, DATEDIFF(m, 0, getdate()) -1 , 0)) AND (DATEADD(m, DATEDIFF(m, 0, getdate()) , 0))

)

AND    (

            (Nodes.Vendor = 'Juniper Networks, Inc.')

       AND  (Nodes.Core_Aggregation = 1)

)

 

 

Many thanks!!!
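
For reference when porting the WHERE clause: the DATEADD/DATEDIFF pair above works out to the first day of last month and the first day of the current month, both at midnight. Below is a small Python sketch of the same two boundaries, purely to show what values the filter needs. The function name is made up, and nothing here is SWQL.

```python
from datetime import datetime


def previous_month_window(now):
    """Return (first day of last month, first day of this month), both at midnight."""
    start_this = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if start_this.month == 1:
        start_prev = start_this.replace(year=start_this.year - 1, month=12)
    else:
        start_prev = start_this.replace(month=start_this.month - 1)
    return start_prev, start_this


start, end = previous_month_window(datetime(2018, 7, 15, 13, 30))
print(start, end)  # prints 2018-06-01 00:00:00 2018-07-01 00:00:00
```

Note that BETWEEN is inclusive at both ends, so a sample stamped exactly at midnight on the first of the current month would also be counted.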

Atlas and connect now with Firewall interfaces


Hi all

Do I need to configure anything on firewall interfaces to allow Connect Now to dynamically show connection details when creating maps within Atlas?

At the moment, when I try Connect Now on ASA and Check Point interfaces, nothing connects dynamically.

All other Cisco devices work fine with the Connect Now feature.

Thanks Paul

NPM Design question


Hi Guys

I have a "small" design question to ask.

Today we have 2 NPM instances, one in our corp network and one in our secure enclave, which is a secluded portion of our network, so that it would be possible to cut it off and operate it autonomously in case of an attack.

But our normal procedures mean that monitoring both systems simply isn't an option on a day-to-day basis.

 

So I need to design a system where we can monitor both the Corp network and the Enclave through a single "pane of glass".

One suggestion is the Enterprise console. I understand that it will be able to receive data from several systems, but as I understand it, this will only give us monitoring capabilities; if we need to dig deeper, we have to do it on the relevant system.

Another suggestion is to place 2 Corp APEs in the Enclave and monitor it all through the Corp system while still retaining the Enclave system, but then I fear that if we ever had to run the Enclave autonomously, the equipment will not be up to date on the Enclave system.

 

I am not sure how big a problem the last one will be, because the Enclave setup is pretty static.

 

Is there anybody out there who has solved this or has some design recommendations, suggestions or anything?

What are the pros and cons, and is there another solution that I haven't thought about?

 

We have 2 NPM setups, one with unlimited nodes (Corp), and the one in the Enclave has a max of 2000 nodes, I think.

If we could do this with one setup, that would be sweet.

Hope you can help me a little.

 

Regards Jens

Report that shows unplugged interfaces


I'm looking for a report that will show me which interfaces are configured as "display interface as unplugged rather than down".

 

Anyone have any ideas? Thanks.

NPM / SAM Integration - Design Question


Hi All,

In the process of setting up a new environment. Will be running NPM, NCM, NTA, IPAM, SAM and VMAN. The intention is to segregate NPM/NCM/NTA and potentially IPAM on its own server, while running SAM/VMAN on its own server. This is being done to limit the dependencies and interaction between the network team and systems operations team for system availability, while still providing integration under a single platform.

Ultimately I want to be sure that this approach is sound, and that, so long as each product points to the same shared Orion database and integration is enabled from within the respective products, this scenario will work as intended: an integrated platform for all products within the Orion framework.

Last question: is there anything prohibiting IPAM from running on the NPM/NCM server, if the server is spec'ed accordingly? I have only skimmed the technical documents, so I am sure this is documented, but it is easier to ask.

Thanks









SolarWinds Alerting Flow Chart


I need some guidance. My director wants some kind of "visual aid" to show our higher management how we monitor and respond to alerts on the network, so I need to make a flow chart and/or a diagram describing how we monitor our critical servers and other nodes. I don't want to just send them a link to our SolarWinds dashboard/summary, since it can be a lot to take in for a non-technical person. When I was in the Army I used to just draw pretty pictures in crayon for any grunt to understand; I basically want to do that now, but more professionally, with Visio lol. Does anyone have any examples of alerting flow charts, or any advice? Thanks much!

Ping problems after upgrading


We're seeing some very odd issues with NPM here. First, a bit about our environment:

Over 7600 nodes
We use 5 additional pollers as well as main poller
Server 2016 main poller:
VMware virtual server with 24 CPU, 64GB ram.
NPM 12.3

UDT 3.3.1

NCM 7.8

VNQM 4.5

IPAM 4.7

NTA 4.2.3

 

Database server: Server 2016 with SQL 2016

 

After upgrading all the applications on 6/15, the server hung on 6/22. After resetting it, many nodes began showing as down. We found that they were up if pinged from another machine. Then we realized that ping itself is timing out or not working on the main poller. If we stop the SolarWinds services, ping starts working again. Trying to isolate an individual service causing the issue didn't reveal the culprit. It seems like it might be the job engine service, but stopping it by itself doesn't get ping working again, and we need it running regardless.

VMWare ESXi Standard Lockdown and Solarwinds NPM


I am having an issue and would like some help.

 

Simply put, I am just using some of the SolarWinds virtualization settings that are built into the network performance monitor.

We tried turning on standard lockdown mode on a cluster. The performance monitoring is completely broken when lockdown is on, and I was wondering if there is something I could do to fix that, or whether this is working as intended.

Before you ask: yes, I am positive the root password is correct, because it all works when I take the cluster out of lockdown. We do have exception users set up as well, and none of them work either.

 

I knew lockdown would shut off quite a bit, but I also thought the root account and others specified in exception lists would bypass that restriction.

Has anyone tried this or can give some guidance on getting this to work?

Make ALL Links, In A SWQL Custom Query Resource, Open In New Tabs By Default


This is a super simple, single file, single line, edit.

 

ESTIMATED TIME TO INSTALL/PERFORM MODIFICATION: <1 Minute

 

DIFFICULTY LEVEL: 1 - Youngling

  1. Youngling (Easiest/Most Basic; no coding experience required, no config wizard required, no system restart required, no system downtime.)
  2. Padawan (Easy/Basic; no coding experience required, possible config wizard required, possible system/services restart required, limited/no downtime.)
  3. Jedi Knight (Moderately Difficult/Advanced; some coding experience required/recommended, config wizard required, possible system/services restart required, limited/short duration downtime.)
  4. Jedi Master (Most Difficult/Advanced; advanced coding experience required, config wizard required, system/services restarts required, 30+ minutes downtime/maintenance window recommended, and other things that I do not even know I would need to know, required...)

 

 

For all of those "tabbers", "shift-clickers", and "middle-button mashers" out there, who know the only way to truly use a tabbed browser is to open so many tabs that you can't even see the tabs anymore... this one is for you... And, of course, by you, I mean us...

 

This simple little modification may have been mentioned elsewhere before, however, I was not fortunate enough to have found it before I figured out how to do it. So, if it has been mentioned before, well, here it is again...

 

This modification will change the default behavior when clicking on a link within a SWQL Custom Query resource. (Any link, formed within the query, using the "_LinkFor_" alias.)

By default, clicking on a link will load the link in the same page/tab as the source was in.

The new behavior, after making this change, when clicking on a link, will load the link destination in a new tab/page, without the need to shift-click, or middle-mouse button click.

 

THINGS TO KNOW:

  • I have countless SWQL Custom Query resources scattered throughout my SolarWinds environment.
  • I cannot stand directly clicking links, having them open in the same page/tab.
  • If I can still see the icons on the tabs of Chrome, then I must be sleeping.
  • I am a very inexperienced, and untrained, amateur (with the exception being all things Star Wars related, which does you absolutely no good here...)
    • Always backup your system/files BEFORE making any changes, and/or test with a demo/dev system before making changes to your production environment.
    • Please don't break your system, then blame it on me.
      • If you break your system, then blame it on me, please know, "I don't give a care...", "I told you so...", and/or "Nanna nanna boo boo, stick your head in doo doo..." will most likely be my response...

 

**WARNING! THE INFORMATION YOU ARE ABOUT TO READ COMES FROM THE MIND OF AN UNTRAINED AMATEUR, AND IS MOST LIKELY FAR, FAR FROM THE BEST PRACTICE**

 

Filename:

CustomQuery.js

 

File Location:

\inetpub\SolarWinds\Orion\NetPerfMon\Resources\Misc\

 

Open the file, and look for the line that has "if (cellInfo.linkColumn) {" (it should be on/around line 160)

The change you will be making will need to be done on the next line, line 161.

 

Change the RED part, of the line below,

            element = $('<a/>').attr('href', rowArray[cellInfo.linkColumn]);

 

To match the GREEN part, of the line below,

            element = $('<a Target="_blank" />').attr('href', rowArray[cellInfo.linkColumn]);

 

Save your file, and you are done!

 

HERE IS HOW THE DEFAULT CODE LOOKS, BEFORE ANY CHANGES:

        var element;
        if (cellInfo.linkColumn) {
            element = $('<a/>').attr('href', rowArray[cellInfo.linkColumn]);
        } else {
            element = $('<span/>');
        }

 

 

AND HERE IS HOW THE CODE SHOULD LOOK AFTER YOUR CHANGE:

        var element;
        if (cellInfo.linkColumn) {
            element = $('<a Target="_blank" />').attr('href', rowArray[cellInfo.linkColumn]);
        } else {
            element = $('<span/>');
        }

 

 

Now, you should be able to use the "_LinkFor_" column alias in your SWQL query, on a Custom Query resource, and when you click the link, on the query results, it should, by default, automatically open in a new tab/window.

 

If you have any questions, or comments, please leave them below, and I will do my best to follow up with you.

 

Thank you,

 

-Will

 

--If you are interested in customizing, and/or modifying your SolarWinds environment, CourtesyIT has put together a terrific "Page of Pages" (PoP), "List of Links" (LoL), okay, you get the idea... Please visit his page, How to do various customizations with your Solarwinds, and discover a better way to enhance your SolarWinds environment. Make sure to bookmark, like, and rate his page, as it will help you, as well as others after you.

Multiple Product Upgrade And Migration - NPM 12.1 - 12.3, SAM 6.4 - 6.7, NCM 7.6 - 7.8, And NTA 4.2 - 4.4


Please Help!!!

According to the Upgrade and Migration Guide, I have to migrate the Engine and SQL servers to Windows 2016 and, at minimum, SQL Server 2016.

 

Current Config
SAM 6.4
NCM 7.6
NPM 12.1
NTA 4.2

Existing Servers
Engine
Windows 2008 R2

Database Server
Windows 2012
SQL Database 2014

New Servers
Engine
Windows 2016

Database Server
Windows 2016
Windows SQL Server 2016

I was wondering if someone can help with the migration steps that I need to take in order to upgrade all of the applications to the latest software.

SAM 6.7
NCM 7.8
NPM 12.3
NTA 4.4

 

Keep in mind that we have to migrate to new servers for the pollers and the SQL database.

Packet loss graph


Hello,

How can I create a graph which shows packet loss and average response time over time?

Thanks

Mickael





Looking for assistance with SWQL query


Hello,

I am trying to troubleshoot an issue of high CPU usage occurring intermittently on 1200 nodes.

I found a thread where I can create a PerfMon link.

I can get a single component to work, along with overall node CPU, but I am having an issue structuring a query to be able to display 2 different components with a single link url.

 

I have tried a self join, but get an error about displaying a link from two different sources.

I have not had any luck with a SELECT sub query either.

 

I am hoping a SWQL guru can point me in the right direction.

 

 

SELECT OAA.Node.Caption

,'PerfStack_Template_A' AS [PerfStack-A] 

,'ui/perfstack/?presetTime=last10Minutes

&withRelationships=true

&charts=

0_Orion.Nodes_' + TOSTRING(OAA.NodeID) + '-Orion.CPULoad.MaxLoad;

0_Orion.APM.GenericApplication_' + TOSTRING(OAA.ApplicationID) + '-Orion.APM.ProcessEvidenceChart.' + TOSTRING(OAA.Components.ComponentID) + '.MaxPercentCPU;

' AS [_LinkFor_PerfStack-A] 

FROM Orion.APM.Application OAA

where OAA.NodeID = '%${SEARCH_STRING}%'

and (OAA.Components.Displayname = 'java.exe' or OAA.Components.Displayname = 'Poseidon Agent')


Alerting on Volume Thresholds


Back in April of 2015, NPM 11.5 was released and with it came a brand new Web-based Alert Engine in the Orion Platform. At the time, and ever since, one of the most valuable capabilities of this new engine was the ability to dynamically alert on multiple objects based on their own individually assigned thresholds. Setting individual thresholds for things like CPU Utilization, Percent Memory Utilization, Packet Loss, Response Time, Interface Errors, and Interface Utilization was a game changer to a lot of alerting schemas that allowed us to reduce our custom property footprint, as well as the complexity of the alert definitions. However, a glaring "omission" was that the thresholds made available for Volumes were not presented to the alerting engine (or so we thought). This was a bit mind-boggling, and talking to other MVPs, seasoned SW Admins, and SW employees over the years, I had never heard differently, so the assumption was cemented as a "missing item that requires a work-around". (On a side note, I am 42% sure that jbiggley was behind a very well orchestrated and elaborate trolling to keep me in the dark on this capability, but I digress...) But today, I'd like to present the solution that was hiding in the background this entire time, to save future admins the discomfort of maintaining "Disk_Crit" custom properties.

 

NOTE: As tait.cyrus mentions in the comments below:

A problem with using "Volume Capacity Forecasting" is newly added nodes will not have any forecasting data for several days, so you will not be able to get any volume alerts from newly added nodes (until forecasting volume data becomes available which can be somewhere in the 1-3 days time period) since volume thresholds won't appear until the forecasting data appears.

 

The problem arises from the fact that "Volume Capacity Forecasting" is a database 'view' made up of data from various other tables including a database JOIN of a forecasting table so until there is forecasting data available for a volume on a node, "Volume Capacity Forecasting" will not show anything so no threshold data will be available and thus no ability to generate volume alerts on the node. It would have been preferable that an appropriate database JOIN would have been used that would have shown threshold data even when forecasting data was not yet available. I did submit an incident on this and was told this is a known 'feature'. I have requested that this be changed so volume thresholds show up immediately allowing volume alerts to be immediately generated in newly added nodes.

 

tl;dr - Be aware that this solution has limitations on new volumes!!!
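
To picture why a new volume drops out, here is a toy Python sketch of the join behavior described in the note above. It does not reproduce the actual view definition, which is not shown here; the rows and field names are invented purely for illustration.

```python
def inner_join(volumes, forecasts):
    """Keep only volumes that already have a forecast row (the behavior the note describes)."""
    return [
        {**vol, "forecast": forecasts[vol["volume_id"]]}
        for vol in volumes
        if vol["volume_id"] in forecasts
    ]


def left_join(volumes, forecasts):
    """Keep every volume; the forecast is None until data exists."""
    return [
        {**vol, "forecast": forecasts.get(vol["volume_id"])}
        for vol in volumes
    ]


# Invented sample rows: volume 2 belongs to a newly added node with no forecast yet.
volumes = [
    {"volume_id": 1, "critical": 90},
    {"volume_id": 2, "critical": 36},
]
forecasts = {1: "days-to-full estimate"}

print([row["volume_id"] for row in inner_join(volumes, forecasts)])  # prints [1]
print([row["volume_id"] for row in left_join(volumes, forecasts)])   # prints [1, 2]
```

With the first kind of join, volume 2 and its threshold are invisible to anything built on the result, which is the gap the note warns about; the second kind keeps the threshold available from day one.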

 

Background: Node and Interface metric thresholds are added to the alerting engine in a very intuitive way:

 

 

However, volume thresholds are obviously not:

 

 

The key was to take a step back and look at the alerting object options. There you shall find your salvation in the form of a "Volume Capacity Forecasting" object (as opposed to the intuitive "Volume" object type):

 

 

Which then presents those valuable thresholds!

 

 

From there, you need to set up a "Double Value Comparison" in the trigger:

 

 

And then create a comparison between the current and threshold values, respectively:

 

 

Which will then trigger on Volumes whose current percent utilization exceeds the threshold you have defined on that specific volume:

 

 

 

For reference: thresholds are edited per object by editing the object's properties, and looking at the bottom of the page: (Pro Tip: you can edit multiple objects at once from the "Manage Nodes/Entities" page)

 

 

 

Verified via SQL search on the "VolumesForecastCapacity" view in the database:

 

 

SELECT TOP 100 * FROM VolumesForecastCapacity

 

 

 

There you have it. Happy monitoring everyone!

OpsGenie Super Thread


Would love to hear all opinions of the OpsGenie product and how it can revolutionize our alerting and on-call scheduling workflow.

 

How has it changed the way in which you approach SolarWinds alerting?

 

Thank you!

Dependency aware Alert examples


Hi guys,

 

Could anyone please share screenshots of their trigger condition for an alert that is suited to mute Down alerts on Child objects that are down?

I have a dependency where a router is the parent and a group is the child. I only wish to receive alerts when the router is down and suppress the hundreds of alerts I would get when the group members are down. Any help greatly appreciated.

Help me better understand our Alerts


Our SolarWinds team has set up disk alerts at Warning, High, and Critical thresholds.

 

From what I am told, they should apply to all disk volumes. 

 

If I look at the Active Alerts and filter down to volumes and to Critical alerts, I get a listing of a number of active critical disk alerts.

 

If I run a simple report asking for all volumes running above the critical alert threshold of 95%, it returns a listing of all volumes currently above that threshold, but the listing is double the size of the Critical Active Alert data.

 

This tells me that not all of my volumes are alerting.

 

If I look at the ones that are not showing on the active alert list, I see the following in the box called "All Alerts this Object can trigger":

 

Why does it not show  the Warning and Critical Alerts?

 

All Alerts this Object can trigger         (1)

            ALL ALERTS

 

ALERT NAME | DESCRIPTION | SEVERITY
_DiskUtilization_High | Triggered when disk utilization is between 90% and 95% on any device regardless of platform | Serious

Tell Us Your Unknown Devices v2.0


Those who have been part of the Thwack Community a while may be familiar with the long running Tell us your "Unknown" devices! thread, which had been active since 2007. That thread had become too unwieldy, and most of the user submissions had been implemented many years ago. I recently reviewed each and every posting in that thread and verified which had been implemented in-product and which had not, so the latter could be included in a forthcoming release. With that done, it was time to lock that thread for good and start anew, this time providing a bit more guidance along the way to ensure everyone is successful in providing the information required to properly identify these devices.

 

What is an 'Unknown' Device anyway?

 

Orion does its best to automatically identify and classify nodes as they're added. There are, however, new device types and models released all the time, so it's entirely possible you are managing a device right now that Orion is unable to properly identify. You can find these easily by going to [Settings - Manage Nodes], changing the 'Group by:' option to 'Machine Type' and clicking on the 'Unknown' category. It's also helpful to add the 'Polling Method' column to the layout, as this thread pertains exclusively to SNMP managed nodes.

 

Any SNMP managed nodes listed under the 'Unknown' Machine Type category are prime candidates for submission to this thread. All that's required is that you provide the device's SNMP System Object Identifier (SysObjectID), as well as the Make & Model of the device associated with that SysObjectID. This post is an excellent example of the perfect submission.

 

What Exactly is a SysObjectID?

 

I have yet to find a clearer definition of the SysObjectID (System Object Identifier) than the following excerpt, which can be found verbatim in virtually every vendor's MIB file.

 

Object Name: sysObjectID
Object ID: 1.3.6.1.2.1.1.2.0
Object Syntax: OBJECT IDENTIFIER
Object Access: read-only
Object Status: mandatory
Object Description: The vendor's authoritative identification of the network management subsystem contained in the entity. This value is allocated within the SMI enterprises subtree (1.3.6.1.4.1) and provides an easy and unambiguous means for determining `what kind of box' is being managed. For example, if vendor `Flintstones, Inc.' was assigned the subtree 1.3.6.1.4.1.4242, it could assign the identifier 1.3.6.1.4.1.4242.1.1 to its `Fred Router'.

 

Essentially, it's a string of numbers in dotted notation that is (hopefully) unique to at least the manufacturer and, in most cases, to the specific make and model of the device being monitored. It's how we identify, for example, that the device vendor is 'Cisco' and the model is a 'Nexus C7018'. All SysObjectIDs begin with '1.3.6.1.4.1' followed by a number which uniquely identifies the manufacturer. The numbers which then follow typically identify the specific model of the device.
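To make that structure concrete, here is a minimal Python sketch that splits a SysObjectID into the manufacturer's enterprise number and the vendor-assigned remainder. The function name `split_sys_object_id` is made up for this illustration; the sample values are the 'Flintstones, Inc.' example from the MIB excerpt above.

```python
from typing import Tuple

# The SMI enterprises subtree that every SysObjectID is allocated under.
ENTERPRISES_PREFIX = (1, 3, 6, 1, 4, 1)


def split_sys_object_id(oid: str) -> Tuple[int, Tuple[int, ...]]:
    """Return (enterprise_number, vendor_assigned_suffix) for a SysObjectID."""
    # Tolerate surrounding whitespace and an optional leading dot.
    parts = tuple(int(part) for part in oid.strip().lstrip(".").split("."))
    if len(parts) < 7 or parts[:6] != ENTERPRISES_PREFIX:
        raise ValueError(f"not under the enterprises subtree: {oid!r}")
    return parts[6], parts[7:]


enterprise, suffix = split_sys_object_id("1.3.6.1.4.1.4242.1.1")
print(enterprise)  # 4242, the number assigned to the vendor
print(suffix)      # (1, 1), the part the vendor assigns per product
```

The enterprise number alone is enough to identify the manufacturer, while the suffix only has meaning within that vendor's own numbering.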

 

Where Can I Locate the SysObjectID?

 

If the device is already managed as a Node in Orion then you can locate the SysObjectID in the 'Node Details' resource as shown below, when viewing the node in the Orion web interface.

 

Node Details | NET-SNMP

Alternatively, you can use NET-SNMP to query the following SNMP OID to return the unique SysObjectID.

 

1.3.6.1.2.1.1.2.0

 

Below is an example of the 'snmpget' command line arguments which will return the SysObjectID for the device.

 

 snmpget -v2c -On -c public 10.199.5.103 1.3.6.1.2.1.1.2.0

 

The example above is executed against a device with the IP address of '10.199.5.103' using SNMPv2c, with the community string 'public'. Below is a screenshot of the resulting output from that command. The string of numbers and periods highlighted in yellow below is this device's unique SysObjectID.
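If you are collecting SysObjectIDs from several devices, the value can be pulled out of the command's text output with a small script. The sketch below assumes the output line has the usual `<oid> = OID: <value>` form that snmpget prints for an OBJECT IDENTIFIER value; the sample line is illustrative and was not captured from the device above, and the function name is hypothetical.

```python
import re
from typing import Optional

# Matches the value part of a line such as:
#   .1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.8072.3.2.10
# (assumed output format; adjust if your snmpget prints something different)
_VALUE_RE = re.compile(r"=\s*OID:\s*\.?(\d+(?:\.\d+)+)")


def extract_sys_object_id(snmpget_output: str) -> Optional[str]:
    """Return the SysObjectID without a leading dot, or None if not found."""
    match = _VALUE_RE.search(snmpget_output)
    return match.group(1) if match else None


# Illustrative sample line, not real device output.
sample = ".1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.8072.3.2.10"
print(extract_sys_object_id(sample))
```

Returning None for unrecognized output keeps error replies, such as timeouts, from being mistaken for a SysObjectID.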

 

My Device Incorrectly Appears Listed as 'NET-SNMP'

 

Linux hosts, virtual appliances, and even some network equipment built on Linux, FreeBSD, etc. are often identified as 'NET-SNMP'. This is because the SNMP daemon running on those hosts is, you guessed it, NET-SNMP. Unfortunately, these vendors have, for some reason, chosen not to implement their own unique SysObjectID and have instead kept the default SysObjectID '1.3.6.1.4.1.8072.3.2.10', which is designated for NET-SNMP. If you have a device such as this, fret not. There are a few options available to you if you'd like these devices to be properly identified by their appropriate vendor's make & model within Orion.

 

Install The Orion Linux Agent

 

The easiest solution would be to install the Orion Linux Agent on the device which is reporting itself to be 'NET-SNMP'. The Linux Agent does not rely upon SNMP to identify the machine type or vendor. Instead, the Agent will report the Vendor as 'Linux' and the 'Machine Type' as the Linux distribution running on the device as depicted in the screenshots below.

 

Red Hat | Citrix XenServer

 

 

Modify NET-SNMP Configuration

 

Another approach is to customize NET-SNMP and Orion to properly reflect the Vendor and Machine Type. Simply follow the steps outlined in adatole's post entitled No More Net-SNMP Nodes. This method uses a script, osname.sh, which is executed when a particular OID is queried. Next, you would create a custom Device Poller to query that newly created OID and populate the Machine Type value in Orion for that device.

 

If you find it more fun to follow along, you can watch adatole walk you through the entire process in the following video.

 

 

 

Can't I Just Upload My Vendor's MIB File Here And You figure it Out?

 

While it would be nice if that's how it worked, unfortunately many (or most) vendors don't include this information within their MIB files. MIB files include a listing of all possible OIDs which could be polled across a wide variety of different devices (typically an entire product family), but it doesn't include the values which are returned by the devices (Enums notwithstanding). For that reason we need users, such as yourself, to post the SysObjectID's in this thread, along with the device vendor and model information so it can be included in our database.

 

If you'd still like your device's MIB file included in the Orion MIB database, for use with Network Performance Monitor's Universal Device Poller or the Orion Platform's SNMP Trap Receiver, simply follow the steps outlined in the KB article at the link below. The latest version of the MIB database, containing your submissions, can always be downloaded from within the Customer Portal.

 

Request additional MIBs to the SNMP MIB browser database - SolarWinds Worldwide, LLC. Help and Support
