Hi,
I was wondering if anyone here could provide some direction on a serious issue we have encountered since upgrading to NPM 11.5.2.
Here are the general symptoms we are experiencing:
When the associated services are started via the Orion Service Manager, CPU on our Orion server immediately spikes and the web interface becomes unreachable. I opened a case with Orion support to document the issue (Case # 816366). When I check the Application Event Log, I find the following events:
Exception Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=30038; handshake=0;
Service was unable to open new database connection when requested.
Exception Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
Connection string - Data Source=BISGSQL118\BISGSQL118;Initial Catalog=XXXXXXX;User ID=XXXXXXXXX;Password=*******;Connect Timeout=20;Load Balance Timeout=120;Application Name=NetFlowService;Workstation ID=BISGAPP118
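To see at a glance which timeout and pool settings the service is running with, the connection string from the event can be split into key/value pairs. This is a minimal Python sketch, not anything from SolarWinds: the function name `parse_connection_string` is hypothetical, and the naive split on `;` assumes no value contains a quoted semicolon. The sample is the string logged above with the "Connection string - " prefix removed.

```python
def parse_connection_string(conn_str):
    """Split a 'key=value;key=value' connection string into a dict.

    Assumption: values contain no quoted ';' characters, which holds
    for the string logged in the event above.
    """
    settings = {}
    for part in conn_str.split(";"):
        if "=" not in part:
            continue  # skip empty or malformed segments
        key, value = part.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


# Raw string so the backslash in the instance name is kept literally.
sample = (
    r"Data Source=BISGSQL118\BISGSQL118;Initial Catalog=XXXXXXX;"
    r"User ID=XXXXXXXXX;Password=*******;Connect Timeout=20;"
    r"Load Balance Timeout=120;Application Name=NetFlowService;"
    r"Workstation ID=BISGAPP118"
)

settings = parse_connection_string(sample)
print("Connect Timeout:", settings.get("connect timeout", "not set"))
print("Max Pool Size:", settings.get("max pool size", "not set"))
```

The string does not set Max Pool Size, so the .NET SqlClient default of 100 connections per pool would apply. That is the limit the "max pool size was reached" message refers to.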
The support folks insist that it is a database issue, as there are several services that cannot connect to the database, such as NetFlow and BusinessLayer. I have worked with both my DBA and Network Engineer to confirm that connectivity is established:
1. The Network Engineer and I ran netstat -an and we definitely see established connections coming from our primary SolarWinds server:
TCP 10.128.194.22:1433 10.128.194.23:58339 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:58906 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:58913 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:58954 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59004 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59030 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59058 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59062 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59075 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59083 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59084 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59099 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59154 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59156 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59164 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59171 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59180 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59188 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59189 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59449 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59601 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59603 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59605 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59607 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59608 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59609 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59610 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59631 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59633 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59637 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59641 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59679 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59716 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59741 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59751 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59753 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:59756 ESTABLISHED
2. Our DBA looked at the Activity Monitor and can see successful connections being made to the Database from the Orion server.
3. I am able to run queries in Database Manager on the primary SolarWinds server.
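For anyone who wants to repeat the netstat check and watch the connection count over time, here is a minimal Python sketch that tallies ESTABLISHED connections to the SQL port per remote host. The function name `count_sql_connections` is hypothetical, and it assumes the four-column `netstat -an` layout shown above (protocol, local address, foreign address, state) with IPv4 addresses.

```python
from collections import Counter


def count_sql_connections(netstat_text, sql_port=1433):
    """Count ESTABLISHED TCP connections to sql_port, grouped by remote IP."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected layout: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != "ESTABLISHED":
            continue
        # Keep only rows whose local side is the SQL Server port.
        if local.rsplit(":", 1)[-1] != str(sql_port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts


# Three rows copied from the netstat output above.
sample = """\
TCP 10.128.194.22:1433 10.128.194.23:58339 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:58906 ESTABLISHED
TCP 10.128.194.22:1433 10.128.194.23:58913 ESTABLISHED
"""

counts = count_sql_connections(sample)
for remote_ip, total in counts.items():
    print(remote_ip, total)
```

Comparing the per-host total against the pool limit (100 per pool by default in .NET SqlClient, unless the connection string overrides it) may help show whether the "max pool size was reached" error reflects real pool exhaustion. Note that each service keeps its own pool, so the total across services can exceed a single pool's limit.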
I am fairly convinced that the database connection errors we are seeing originate on the SolarWinds server itself. I am also getting a number of crash dumps from SolarWinds processes in the Event Logs; one of them is listed below.
Application: SolarWinds.DataProcessor.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Data.SqlClient.SqlException
Stack:
Server stack trace:
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.ConsumePreLoginHandshake(Boolean encrypt, Boolean trustServerCert, Boolean integratedSecurity, Boolean& marsCapable)
at System.Data.SqlClient.TdsParser.Connect(ServerInfo serverInfo, SqlInternalConnectionTds connHandler, Boolean ignoreSniOpenTimeout, Int64 timerExpire, Boolean encrypt, Boolean trustServerCert, Boolean integratedSecurity, Boolean withFailover)
at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover)
at System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
at System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData)
at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
at System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnection owningObject, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions userOptions)
at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at System.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.Open()
at SolarWinds.Collector.OrionCommon.DatabaseFunctions.OpenNewDatabaseConnection()
at SolarWinds.Collector.OrionCommon.SqlHelper.ExecuteReader(SqlCommand command)
at SolarWinds.Collector.DAL.PollingEngineDAL.<LoadEnginesFor>d__0.MoveNext()
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
at SolarWinds.Collector.DAL.PollingEngineDAL.EnumeratePollingEngines()
at SolarWinds.Collector.DAL.PollingEngineDAL.LoadCurrentEngine()
at System.Lazy`1.CreateValue()
at System.Lazy`1.LazyInitValue()
at System.Lazy`1.get_Value()
at SolarWinds.Collector.DAL.PollingEngineDAL.GetCurrentEngine()
at SolarWinds.Collector.PollingPlanStore.ProcessFile(String file)
at SolarWinds.Collector.PluginStore`1.Refresh()
at SolarWinds.Collector.PluginStore`1.Initialize(String manifestRootPath)
at SolarWinds.Collector.PollingPlanStore..ctor(String manifestRootPath, IPollingEngineDAL engineDAL)
at SolarWinds.Collector.PollingPlanStore.Default(IPollingEngineDAL engineDAL)
at SolarWinds.Collector.DataProcessor.DataProcessorHost.Start()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.SyncProcessMessage(IMessage msg)
at SolarWinds.Collector.Services.CollectorService.InternalStart()
at System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
at System.Threading.ThreadHelper.ThreadStart()
At this point I am looking for any direction or assistance from SolarWinds on what this could be. I saw someone with a similar issue here:
Post orion upgrade getting DB intermittent connectivity(Slowness) issue
Thanks,
Ryan