Is there a way to create a tabular view of Netpath measurements

November 24, 2016, 6:40 am

Hello,

NetPath has been a good feature from SolarWinds, but I think the developers should add more reporting features to it, especially a table view that gives a quick glance at all measurements for a particular service from all probes.

Currently, I cannot quickly compare, for example, Mexico vs. India vs. Switzerland for response time to each service. We plan to install approximately 80 probes, one in each country. Today I have to click each link to see the current average measurement, which will become extremely difficult once we have 80 probes.

Does anyone know a way to get a tabular view of the measurements?

Message was edited by: tester12 M

↧

NPM 11.5.2 to 12.0.1 - Now syslog / trap services won't start

December 16, 2016, 11:54 am

All prerequisites met successfully, the upgrade went well until the very end when this message appeared:

Checking Orion Service Manager, all services are running except Syslog - which is stuck in "stopping" and trap, which is stopped and won't start.

New firewall ports: are they TCP or UDP? The NPM 12.0 Release Notes *do not specify*. Could this be why the site and services are simply not starting?

Thanks,

Bill

↧

High Availability in NPM

December 12, 2016, 1:28 am

Hi Experts,

I have installed NPM (Orion) on a server and now need to set it up for High Availability.

I have some doubts about this.

Do I need to install another NPM instance on a different server, or is there a different way? Please help me solve this.

↧

clear picture of NPM hardware

December 5, 2016, 12:00 am

My organisation wants to deploy NPM for an enterprise-level customer whose network spans about 8 cities. I cannot find the true hardware requirements. Can anybody help me with this and give me a rough sketch of the hardware and the NPM hardware topology?

↧

Different maps in a loop

December 16, 2016, 7:31 am

Is it possible to cycle through different maps in a loop, so that we could view map A for a couple of minutes, then map B, and so on?

This would be interesting to have.

↧

NPM 12.1/NCM 7.5.1 ?all good?

September 29, 2016, 1:32 am

Please post an update if you have had problems with NPM 12.1 or NCM 7.5.1.

I would sure like to get some good vibes before I upgrade my production setup.

/SJA

↧

F5's - Unknown Machine Type

March 21, 2016, 9:32 am

My F5's are showing up with Machine Type "unknown", which is causing it to default to a completely unrelated view (it defaults to wireless controller view). Does anyone else have any trouble with that happening, or is there something I just need to troubleshoot? Just checking before I go putting in a feature request.

↧

When you installed NPM, did you add Nodes manually or did you run discovery?

July 20, 2016, 12:33 am

We would like to improve the user experience, and for that reason I'd like to better understand whether our users prefer INITIALLY to add nodes manually or to run the product's network discovery in order to import devices into NPM.

↧

Alerting for routing neighbor

December 20, 2016, 8:26 am

Hello All,

I am working on alerting for routing neighbors, and I have the following query in my alert:

Node Name: ${N=SwisEntity;M=Router.Nodes.DisplayName}

Neighbor Name: ${SQL: SELECT Caption FROM NodesData Where IP_Address='${N=SwisEntity;M=NeighborIP}'}

NeighborIP: <a href="https://orion.xxxx.com/Orion/View.aspx?NetObject=NBR:${NeighborID}">${NeighborIP}</a>

Protocol: ${SQL:SELECT DisplayName FROM NPM_RoutingProtocol WHERE ProtocolID=${ProtocolID}}

Status: ${N=SwisEntity;M=ProtocolStatusDescription} (${SQL:SELECT StatusName FROM StatusInfo WHERE StatusID='${OrionStatus}'})

IsDeleted: ${IsDeleted}

The output I get is correct for a few IPs, but for a few others I get the below:

I checked and saw that the IPs that do not resolve to a neighbor name are not the polling IPs of the node, and hence I am not getting the neighbor name.

Is there a way I can get the neighbor to show up even if the IP is not the polling IP of the node?

Do we have a query for this?

TIA,

Malcolm.
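
One possible approach, sketched here, is to look the neighbor IP up in a table that lists every IP address known for a node rather than only the polling address. The sketch assumes such a table exists and is called NodeIPAddresses with NodeID and IPAddress columns; that name, the column names, and the sample rows are assumptions to verify against your own Orion schema. It uses Python with an in-memory SQLite database only so that the query can be run anywhere.

```python
import sqlite3

# Hypothetical miniature schema. NodesData holds the caption and polling IP
# (as in the alert above); NodeIPAddresses is an ASSUMED table name holding
# every IP address known for a node.
SCHEMA = """
CREATE TABLE NodesData (NodeID INTEGER PRIMARY KEY, Caption TEXT, IP_Address TEXT);
CREATE TABLE NodeIPAddresses (NodeID INTEGER, IPAddress TEXT);
"""

# Match the neighbor IP against the polling IP or any other IP of the node.
NEIGHBOR_NAME_QUERY = """
SELECT n.Caption
FROM NodesData AS n
LEFT JOIN NodeIPAddresses AS a ON a.NodeID = n.NodeID
WHERE n.IP_Address = :ip OR a.IPAddress = :ip
LIMIT 1
"""


def neighbor_name(conn, neighbor_ip):
    """Return the caption of the node owning neighbor_ip, or None."""
    row = conn.execute(NEIGHBOR_NAME_QUERY, {"ip": neighbor_ip}).fetchone()
    return row[0] if row else None


def demo_connection():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO NodesData VALUES (1, 'core-rtr-01', '10.0.0.1')")
    conn.executemany(
        "INSERT INTO NodeIPAddresses VALUES (?, ?)",
        [(1, "10.0.0.1"), (1, "192.168.50.1")],
    )
    return conn
```

In the alert itself the same idea would go inside the ${SQL: ...} macro, with SQL Server syntax (for example TOP 1 instead of LIMIT 1) and ${N=SwisEntity;M=NeighborIP} in place of the :ip parameter.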

↧

switch to server traffic flow via NTA

December 20, 2016, 12:36 am

I have Alcatel switches in my environment, and they are configured to send sFlow to the NTA listener. I want to know whether it is possible to see traffic flow (protocol, application, port, etc.) from switch to server or vice versa. I am not sure whether the "flow navigator" in NTA can accomplish this; the last time I tried, I did not manage it.

↧

Alert when Cisco stack member removed from stack

September 16, 2016, 3:47 am

Hello all,

We had a stack member fail in a stack yesterday and this was reported in the events section and there was also an event when the stack gained a new member once the fault was resolved:

Events were:

Switch stack on xxxx lost 1 member

Switch stack on xxxx gained 1 member

What we would like is for an alert to be created based upon this kind of event, so that we can configure it to send an email letting us know of these events when we are not in front of SolarWinds.

I have looked through the alerts and can't see anything relating to this, can anyone help us set up an alert based upon these events?

Thanks,

↧

Network Discovery Alert Variables

November 1, 2016, 10:47 am

Getting emails from scheduled discovery jobs is very helpful; not knowing which discovery they come from isn't. I feel it's possible, but I need some SQL help to sort it out.

Info:

There are 2 tables we need to tie together:

DiscoveryProfiles

DiscoveryJobs

The discovery notification comes from the discovery jobs table. If you look at the table you can see a row for every job. That row contains variable "ProfileID".

I need to tie the "ProfileID" variable from the "DiscoveryJobs" table to the "ProfileID" in the "DiscoveryProfiles" table so I can pull out the "Name" of that discovery.

I would like to have the "Name" of the profile inserted into the email so that when I get 6+ a night I know which job each came from. This would also be helpful in the discovery-failed emails, so you would know which discovery failed to run.
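
The join described above can be sketched as follows. This is a minimal illustration, not a tested Orion query: the table names and the ProfileID and Name columns come from the description, while the JobID key column, the sample rows, and the use of an in-memory SQLite database are assumptions made so the example can be run on its own.

```python
import sqlite3

# Miniature stand-in for the two tables. JobID is a HYPOTHETICAL key column
# used only to pick one job; check the real DiscoveryJobs columns.
SCHEMA = """
CREATE TABLE DiscoveryProfiles (ProfileID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE DiscoveryJobs (JobID INTEGER PRIMARY KEY, ProfileID INTEGER);
"""

# Tie DiscoveryJobs.ProfileID to DiscoveryProfiles.ProfileID to get the Name.
PROFILE_NAME_QUERY = """
SELECT p.Name
FROM DiscoveryJobs AS j
JOIN DiscoveryProfiles AS p ON p.ProfileID = j.ProfileID
WHERE j.JobID = ?
"""


def profile_name_for_job(conn, job_id):
    """Return the discovery profile name for a job, or None if no match."""
    row = conn.execute(PROFILE_NAME_QUERY, (job_id,)).fetchone()
    return row[0] if row else None


def demo_connection():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO DiscoveryProfiles VALUES (?, ?)",
        [(1, "Branch offices nightly"), (2, "Data center weekly")],
    )
    conn.executemany(
        "INSERT INTO DiscoveryJobs VALUES (?, ?)",
        [(100, 1), (101, 2)],
    )
    return conn
```

Whether the discovery notification email can evaluate a SQL macro at all is a separate question that this sketch does not answer.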

↧

Monitor a linux service and respond to output of a command

June 22, 2015, 11:49 am

I have a service, sssd, that keeps dying on two Linux servers. While I am waiting to resolve that, I would like to be able to monitor the output of the command:

service sssd status

With the following actions

1) if the word "running" is in the result, set condition up

2) if the word "down" is in the result, set condition down

3) if the word "dead" is in the result, run the command "service sssd restart"

When the service dies, if I login and run the above command, I get the result:

[root@dwdataaccm1 ~]# service sssd status

sssd dead but subsys locked

[root@dwdataaccm1 ~]# service sssd restart

Stopping sssd: cat: /var/run/sssd.pid: No such file or directory

[FAILED]

Starting sssd: [ OK ]

[root@dwdataaccm1 ~]#

I would like to use Orion to manage this while we try to find the root cause. Can anyone help? I have searched already and find lots of posts, but none that seem to use the service command, and I don't know perl well enough to modify the existing samples.

Thanks.
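
A minimal sketch of the three rules above, written in Python rather than Perl. It only classifies the output of the service command and restarts the service when the word "dead" appears; how the result is reported back to Orion (output format, exit codes) is not shown and would need to follow whatever your script monitor expects. The function names are hypothetical.

```python
import subprocess


def classify(status_output):
    """Map the text of `service sssd status` to an action or state.

    "dead" is checked first because it is the most specific case, e.g.
    "sssd dead but subsys locked".
    """
    text = status_output.lower()
    if "dead" in text:
        return "restart"
    if "down" in text:
        return "down"
    if "running" in text:
        return "up"
    return "unknown"


def check_service(name="sssd", run=subprocess.run):
    """Run the status command, restart the service if it is dead.

    `run` is injectable so the logic can be exercised without a real
    `service` binary; by default it is subprocess.run.
    """
    result = run(["service", name, "status"], capture_output=True, text=True)
    state = classify((result.stdout or "") + (result.stderr or ""))
    if state == "restart":
        run(["service", name, "restart"], capture_output=True, text=True)
    return state
```

Calling check_service() on the affected server returns "up", "down", "restart", or "unknown"; restarting a service normally requires root privileges, so the account the monitor runs as matters.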

↧

Last xx Events. Filtered. 10.6; 10.4.2; 9.5.1

March 30, 2010, 10:00 pm

↧

Alert Prioritising Dashboard (SWQL) for Problematic Nodes (Servers)

February 26, 2015, 9:01 am

Here is an example of using SWQL to build a view that displays problematic nodes (servers) with issues in one or more of the following areas:

• Node Status (column name: CONN) - (1 UP, 2 Down, ignore other status)

• Node Response Time (column name: M_SECS) - in milliseconds (> 0, or -1 when the node is down). If M_SECS > 500: Warning; if M_SECS > 1000: Critical

• Node CPU Load (column name: C_LOAD) - in percentage (between 0 and 100). If C_LOAD > 95: Warning; if C_LOAD > 98: Critical; if C_LOAD = 100: Down

• Node Memory Usage (column name: R_LOAD) - in percentage (between 0 and 100). If R_LOAD > 95: Warning; if R_LOAD > 98: Critical; if R_LOAD = 100: Down

• Node Highest Volume Usage (column name: V_PERCENT) - in percentage (between 0 and 100). If V_PERCENT > 95: Warning; if V_PERCENT > 98: Critical; if V_PERCENT = 100: Down

• Node Hardware Components worst Status (column name: HW_Status) - (UP, Undefined, Unknown, Warning, Critical, n/a)

• Node Application worst Status (column name: APP_Status) - (UP, Unmanaged, Unknown, Unreachable, Warning, Critical, Down, n/a)

So that the worst (highest-priority) conditions are shown at the top of the list, I gave each status a different score and each column a different weight, then calculated the total weighted score as the priority. Here is the calculation:

•    wConn (Connection), scores: Down - 1000, Up - 0; weight 1.00
•    wTime (Response Time), scores: > 1000ms - 80, >500ms - 10, other - 0; Weight 0.75
•    wCPU (CPU Load), scores: 100% - 600, >98% - 80, >95% - 10, Other - 0; Weight 1.00
•    wRAM (Memory Load), scores: 100% - 600, >98% - 80, >95% - 10, Other 0; Weight 1.00
•    wVol (MAX(Volume Usage)), the highest volume usage of all volumes on a node, scores: 100% - 600, >98% - 80, >95% - 10, Other 0; Weight 0.75
•    wHW (Hardware Status (worst value)), the worst HW component status of a node with HW monitoring enabled, scores: Critical - 80, Warning - 10, Up - 0, other - 1; Weight 0.50
•    wApp (Application Status (worst value)), the worst application status of a node with application monitors assigned, scores: Down - 600, Critical - 80, Warning - 10, Up - 0, other - 1; Weight 0.50

Maximum Total Weighted Score (Exclude wConn): 80*0.75 + 600*1.00 + 600*1.00 + 600*0.75 + 80*0.50 + 600 *0.50 = 2050

Priority = ROUND((t1.wTime*0.75 + t1.wCPU*1.0 + t1.wRAM*1.0 + t1.wVol*0.75 + t1.wHW*0.5 + t1.wApp*0.5)/2.05 + t1.wConn*1.00, 2)

Final Priority value is between 0 and 1000.

You can change the scores and weights to meet your requirements.
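
As a worked illustration of the calculation above, here is a small Python sketch of the scoring and the Priority formula. It is not the attached SWQL; the function names are made up for this example, and only the scores, weights, and formula are taken from the description.

```python
def usage_score(percent):
    """Score for CPU, memory, or highest volume usage (0-100 percent)."""
    if percent >= 100:
        return 600
    if percent > 98:
        return 80
    if percent > 95:
        return 10
    return 0


def response_time_score(milliseconds):
    """Score for response time; -1 (node down) falls through to 0."""
    if milliseconds > 1000:
        return 80
    if milliseconds > 500:
        return 10
    return 0


HW_SCORES = {"critical": 80, "warning": 10, "up": 0}
APP_SCORES = {"down": 600, "critical": 80, "warning": 10, "up": 0}


def status_score(status, table):
    """Score a worst-status string; any status not in the table scores 1."""
    return table.get(status.lower(), 1)


def priority(w_conn, w_time, w_cpu, w_ram, w_vol, w_hw, w_app):
    """Weighted total, scaled by the maximum of 2050, plus the connection score."""
    weighted = (w_time * 0.75 + w_cpu * 1.0 + w_ram * 1.0
                + w_vol * 0.75 + w_hw * 0.5 + w_app * 0.5)
    return round(weighted / 2.05 + w_conn * 1.00, 2)
```

For example, a node that is up, at 99% CPU, with a 600 ms response time and nothing else wrong gets (10*0.75 + 80*1.0)/2.05, which rounds to 42.68.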

Steps:

Create a view; add “Custom Query” resource.

In the view, edit Custom Query:
In the Custom SWQL Query box, paste the code from the attached file “thwack-swql-alerts.txt”.
Enable search and, in the Search SWQL Query box, paste the code from the attached file “thwack-swql-alerts-withSearch.txt”.

Done!

Using Search:

•    By Node Name
If you want to just display a node or a group of nodes with similar names, type node name or part of the name in the search box and click search button.
•    By Connection Status
If you want to just display nodes in DOWN status, type “n 1” (white space between n and 1) in the search box and click search button.
•    By CPU or RAM or Volume usage
If you want to display only nodes with CPU, RAM, or Volume usage above a certain level, use the following:

     o    “c 80” (CPU usage above 80%)
     o    “r 80” (Memory usage above 80%)
     o    “v 80” (Volume usage above 80%)
•    By Hardware Status
If you want to just display node with certain hardware status, type “h status” (‘status’ can be one of the following: UP, undefined, Unknown, Warning, Critical, n/a).
•    By Application Status
If you want to just display node with certain application status, type “a status” (‘status‘ can be one of the following: UP, Unmanaged, Unknown, Unreachable, Warning, Critical, Down, n/a).

You can customise the query to meet your requirements.

Thanks to Alex Soul's post https://thwack.solarwinds.com/docs/DOC-174568, which was very helpful!

===========================

Update: As Alex suggested, I have updated the query and new files are attached. Thanks Alex!

===========================

Update: 11/March/2015

I have added two additional columns to the Alert Prioritising Dashboard.

One column is AlertTime; the other is Acknowledge (Ack). The Ack column is clickable. Right-click it and open a new window to view or acknowledge an alert.

Please see the additional document at https://thwack.solarwinds.com/docs/DOC-176727

============================

Update: 11/11/2015

The original query is for NPM & SAM, but if you only need the NPM (network nodes) part, I have created another two queries for network devices only.

The files: "networkNOC-ForThwack.txt" and "InterfaceNOC-ForThwack.txt" are attached.

"networkNOC-ForThwack.txt" is for network devices (NPM) only.

"InterfaceNOC-ForThwack.txt" is for network interfaces only.

Both are limited to Vendor = 'Cisco'; you can change this to meet your requirements.

↧

Syslog events not generating

December 21, 2016, 1:17 am

Hi Experts,

I have installed NPM on a Windows Server 2012 R2 machine and am monitoring another 2012 R2 server.

Port 514 is open on both sides, but there are still no syslog events.

Please help solve this.

Regards,

Ishant Walia
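
One way to narrow this down is to send a hand-built test message to the Orion server and see whether it appears in the syslog viewer. The sketch below assumes the server listens for syslog on UDP port 514 (the conventional syslog port) and builds a simple BSD-style message; the hostname and tag values are placeholders.

```python
import socket
from datetime import datetime


def build_syslog_message(text, facility=1, severity=6,
                         hostname="test-host", tag="orion-test", now=None):
    """Build a BSD-style syslog line: <PRI>timestamp hostname tag: text.

    PRI is facility * 8 + severity; facility 1 (user) and severity 6
    (informational) give <14>.
    """
    pri = facility * 8 + severity
    now = now or datetime.now()
    # Day of month is space-padded to two characters in this format.
    timestamp = f"{now:%b} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}"


def send_test_message(server, text="syslog test", port=514):
    """Send one test message over UDP and return the number of bytes sent."""
    payload = build_syslog_message(text).encode("ascii", errors="replace")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))
    return len(payload)
```

Running send_test_message("your-orion-server") from the monitored machine exercises the network path and firewall; if the test message arrives but nothing else does, the monitored server is probably not sending syslog at all.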

↧

NPM 12 Upgrade broke Automatic Login

June 23, 2016, 6:20 am

On 11.5.2 we had Windows pass-through credentials working, and after the upgrade they no longer work. Has anyone else run into this, or does anyone have any ideas?

Thanks

↧

DCOM, Event 10028 in Windows System log

April 27, 2015, 7:00 am

Hello, and thanks...

Is there a way to prevent "Microsoft-Windows-DistributedCom" ID 10028 entries in the System log when a Windows system managed by NPM/Orion is no longer online?

thanks

chris

↧

Undefined status for VMWare Datacenter

February 14, 2011, 8:05 pm

We're on Core 2010.2.0, NPM 10.1, APM 3.5 and IVIM 1.0.0. ESX servers are mostly on 4.1.0

The VCenter, Datacenter and Cluster status is sometimes marked as Unknown/Undefined. It seems that when I select the VCenter server and test the existing credentials again, the VCenter server and Clusters return to the green state, but the Datacenter status remains grey and Unknown. I get the number of Clusters, ESX Hosts and Running VMs displayed when I hover the mouse over the Cluster, but when I display the Datacenter Details View the Datacenter status shows as Unknown.

Am I doing something wrong here? I'm not sure I have ever seen the Datacenter status as anything other than in this grey state since I first got the VCenter access thing going. Why do the VCenter servers appear to lose authentication periodically?

↧

Any hints on troubleshooting device polling?

December 20, 2016, 10:44 pm

I am currently using NPM 12. The 2 polling servers and the DB server seem to be in good health.

At an arbitrary date approximately 3 weeks ago, SolarWinds Orion started failing to poll device hardware stats on Cisco devices. It seems that since 11/29/16 it just can't poll hardware stats. This is bizarre, because if I click on "List Resources" on a device I can see the SNMP-driven communication occur, and SolarWinds populates the list of interfaces for me to select for management. Once managed, though, the interfaces sit at all 0's for their metrics. If it was a managed interface that existed before that arbitrary date and time, the stats display what they were at that time with the warning "This data is obsolete."

Additionally, if I navigate to my version of this page "http://oriondemo.solarwinds.com/Orion/Admin/HistoricalStatistics.aspx" I can see that the data in Cisco Buffer Statistics, Interface Traffic Statistics, Interface Error Statistics states that it has not been updated since that aforementioned time.

Data from other services such as agents is working and recording as normal.

Has anyone encountered this behavior before? I have searched these forums for answers but couldn't find anyone with my described symptoms.

EDIT:

Looked at the collector logs and I constantly see this error:

2016-12-21 09:43:12,229 [39] ERROR Main - Can't load native library 'C:\Program Files (x86)\Common Files\SolarWinds\OpenSSL\x64\libeay32.dll', error code: 126

2016-12-21 09:43:12,229 [39] ERROR Main - AES Error: Not able to load OpenSSL library from path: C:\Program Files (x86)\Common Files\SolarWinds\OpenSSL\x64\libeay32.dll into process.

2016-12-21 09:43:12,229 [39] ERROR Main - Unable to encrypt PDU with AES

2016-12-21 09:43:12,229 [39] ERROR Main - Can't load native library 'C:\Program Files (x86)\Common Files\SolarWinds\OpenSSL\x64\libeay32.dll', error code: 126

2016-12-21 09:43:12,229 [39] ERROR Main - AES Error: Not able to load OpenSSL library from path: C:\Program Files (x86)\Common Files\SolarWinds\OpenSSL\x64\libeay32.dll into process.

I presume it is related to my problem. How would I go about getting a fresh copy of the libeay32.dll library?

Message was edited by: Paul Cheek

↧