DFSR Replication Backlog
XKJNGZ Uses PowerShell to make WMI queries to get the current backlog file count for each outbound DFSR partner on each DFSR share. These queries can be expensive if the backlog is large, so the polling interval is set to 10 minutes. If there is no backlog, the script finishes quickly. No default alerting is set, but I would recommend adding a threshold to be notified of potential replication issues.
NOTE: The collector must be able to reach both DFSR partners and will use the same credentials to make the queries for both.
440 Views, 1 like, 15 Comments

Azure Stack HCI resources don't have storage, memory, disk or cluster metrics
We have a customer with an Azure Stack HCI cluster deployed a few months ago. For those not familiar, this is basically a customised Windows Server core environment that runs Hyper-V VMs and some Azure-specific workloads on-premises. The virtualised workloads are all added as resources using a locally-deployed collector (on the Windows jumphost, if it matters) and they all show CPU, Disks, Interfaces, Processes …everything you’d expect for Windows hosts.
We’ve added the two nodes as Resources, but we don’t see any detailed metrics, only Host Status (DNS), HTTP and Ping. I also added the FQDN for the cluster management point / VNN, and it has the same minimal detail as the individual cluster nodes.
There are quite a few valid/correct properties recorded for the systems. Some examples:
- system.domain (customer AD domain)
- system.ips (all IP addresses for all interfaces)
- system.model (correctly identifies vendor and server model, presumably from WMI)
- system.sysinfo (“Microsoft Azure Stack HCI”)
- system.sysname (hostname)
- system.systemtype (“x64-based PC”)
Is there something else I need to do to have this system monitored? I’m rushing because we nearly had a CSV run out of space. We thought it was monitored, and we were wrong.
Solved, 296 Views, 16 likes, 9 Comments

Windows Services Monitoring with quite a bit more Automation applied
So today we use LM's Microsoft Windows Services DataSource to monitor Windows Services. This DS uses a Groovy script and WMI calls under the hood to fetch service metrics like state, start mode, status, etc. Everything works fine, but one of the prerequisites is to manually populate the list of Windows services, which the DS then parses out as a WILDVALUE variable in the script. You know: go to the device, click on the Down Arrow (Manage Resource Options) --> Add Additional Monitoring --> and CHOOSE from the list of Windows Services. Rinse and repeat and save. Then the DS goes to work.
Well, what if you have a list of over 100 Windows Services you need to add to, let's say, 20 Windows devices? It would take forever to populate that list manually. That's problem number 1. Scratch that. This is not really a problem, since one can run a PowerShell script (or Groovy script) to perform this task using undocumented - but working very well - LM API calls. That problem is solved.
Next: this list of over 100 services needs to be *refreshed* every, let's say, 24 hours to remove nonexistent services and add new ones based on the regex filter. That's problem number 2. Again, one can do it programmatically with API calls, but this is where I am trying to figure out how to run it. Run my script as a custom PropertySource? I am not really writing Resource Properties; I am updating the instance list (Windows Services) within Additional Monitoring on a bunch of Resources. Plus, PropertySources are applied when Active Discovery is run, which is what, every 24 hours? Or should I write a custom DataSource that would accomplish this refresh and specify a 1 day collection period? Thanks.
Solved, 588 Views, 4 likes, 2 Comments

Process Monitoring Batch Script
Is there a way we can measure the performance of a DataSource or collectors?
Repository: ProcessMonitoring
@Stuart Weenig I think I did not understand why monitoring lots of processes/services on Windows systems with _Select Data Sources might not be the best approach. Aren’t both making a WMI call? Aren’t both going to bring back all the processes in one go? Can we see the query count from WMI vs. Batch Groovy?
Solved, 122 Views, 0 likes, 7 Comments

Process Monitoring
Hi @Stuart Weenig Thank you for your awesome work! I was able to use the Win_Process_Stats_Groovy.xml file to create a data source for processes.
https://github.com/sweenig/lm/tree/main/ProcessMonitoring
I am able to see data in Discovery and Collector, but under Raw Data in Devices > Data source I do not see any data. When I poll, I do see data. Am I missing something?
My Applied To Wizard has the following query (I removed Win_Process_Stats.excludeRegEx & Win_Process_Stats.includeRegEx from “AppliesTo”):
isWindows() && system.displayname == "server001" or system.displayname == "server001"
Solved, 205 Views, 7 likes, 10 Comments

Does anyone have any experience with monitoring Windows Processes?
I’ve checked the community for datasources and I don’t see anything close to what I’m specifically looking for. Our organization currently utilizes the Microsoft_Windows_Services datasource (modified a little bit for our specific needs) to monitor services. I’m looking for something similar to monitor Windows processes. As with the Microsoft_Windows_Services datasource, what I am hoping to accomplish is to provide a list of keywords that will either match or be contained in the name of the process I want to monitor, provide a list of machines that I want to monitor those processes on, and then get alerted if those processes stop running.
Some issues I am running into so far are:
- Win32_Process always returns a value of NULL for status and state, so I cannot monitor those two class-level properties.
- PowerShell’s Get-Process does not return status or state; it just lists processes that are actively running, so I would need to get creative in how LogicMonitor creates the instance and what value to monitor in the instance.
- Some of the processes I want to monitor create multiple processes with the same name, and LogicMonitor then groups them all together into one instance, which makes monitoring difficult.
- Some of the processes I want to monitor only run if an application is manually launched, which means that again I will need to get creative in how I set up monitoring, because I don’t want to get alerts when a process that I know shouldn’t be running is not running.
Because the processes I am trying to monitor are not going to be common for everyone everywhere, here is something other people could do to replicate my scenario: Open Chrome. When Chrome is launched, you will get a process called “Chrome”. Now, open several other tabs of Chrome; you will just get more processes named “Chrome”.
Now, keeping in mind the points I made earlier, set up monitoring to let you know when the 3rd tab in Chrome has been closed, even though the rest of the Chrome tabs are still open. How would you break that down? My first thought would be to monitor the PIDs; however, when you reboot your machine, your PIDs will likely change. Also, I don’t want to have the datasource wild value search by PID, because that would get confusing really fast once you have 2 or 3 different PIDs that you want to monitor.
All suggestions are welcome, and any help is greatly appreciated. Bonus points if you can get this to work with the discovery method set to Script, using an embedded Groovy or PowerShell script.
Solved, 424 Views, 12 likes, 19 Comments

Windows System Event Log "message" details not accurate
We are using the default Windows System Event Log event source and routing those errors through a Teams integration. When tested from the Windows System Event Log event source, the event logging displays the entire “message”, detailing the event ID, reason, etc. The Alerts section of the GUI also shows the entire “Message” section with details. However, when the alert shows up in Teams, it is dumbed down and useless. We get the following:
Message: error - HOSTNAME Windows System Event Log
The Teams integration is set up identically to the Event Source alert message, as seen below. Does anyone know why ##MESSAGE## is getting overwritten with useless info instead of the actual message details from the event?
Host: ##HOST##
Eventsource: ##EVENTSOURCE##
Windows Event ID: ##EVENTCODE##
Message: ##MESSAGE##
Detected on: ##START##
71 Views, 12 likes, 7 Comments

When an anomaly isn't an anomaly what could I do?
What can I do when anomaly detection won't work (because the behaviour is seen on a regular basis) and a dynamic threshold also won't help because the value stays within range?
For example, a drive on a server gets filled with data (the drive is normally cleared down on a daily basis). When someone uploads a larger than expected amount, the drive hasn't been cleared, or other uploads arrive throughout the day, there isn't enough space. You are happy if the drive is above 80% during the night, because if it hasn't cleared it can be dealt with in the morning (no need to get anyone out of bed). But if there is a rapid spike (more than 2.5% growth in used space in a 30 min period), then someone needs an alert to get out of bed and make enough room for the data.
A possible solution is a DataSource that alerts if the drive is over 80% but only with that rapid growth. The DataSource calls the API for the last 30 min worth of data and calculates the growth rate. The code below is for a C drive, but the drive letter can be changed easily, as can the 2.5% and 80% values; they could also be parameterised for different ranges on different devices.
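The alerting rule described above (alert only when the drive is at or above 80% used and has grown by more than 2.5 percentage points over the sampled window) can be looked at separately from the API plumbing. The following is a minimal Python sketch of that rule only, not the DataSource script itself, which follows. It assumes the percent-used samples have already been fetched and are ordered newest first; `should_alert` and its parameter names are hypothetical, chosen for illustration.

```python
from typing import Sequence


def should_alert(
    percent_used: Sequence[float],
    growth_threshold: float = 2.5,
    usage_threshold: float = 80.0,
) -> bool:
    """Return True when usage is high AND it grew quickly over the window.

    percent_used is assumed to be ordered newest first and to cover
    roughly the last 30 minutes of samples.
    """
    if len(percent_used) < 2:
        return False  # not enough history to measure growth
    newest, oldest = percent_used[0], percent_used[-1]
    growth = newest - oldest  # percentage points gained over the window
    return growth > growth_threshold and newest >= usage_threshold


if __name__ == "__main__":
    print(should_alert([85.0, 84.0, 82.0]))  # high and growing fast
    print(should_alert([85.0, 84.8, 84.5]))  # high but stable overnight
    print(should_alert([60.0, 55.0, 50.0]))  # growing fast but still low
```

Keeping both conditions in one function makes it easy to pass different thresholds per device, which is the parameterisation the post suggests.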
<# Use TLS 1.2 #>
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

<# account info #>
$accessId = '##apiaccessid.key##'
$accessKey = '##apiaccesskey.key##'
$company = '##company##'
$deviceId = "##system.deviceId##"

<# request details #>
$httpVerb = 'GET'
$resourcePath = "/device/devices/$deviceId/devicedatasources"
$queryParams = '?filter=dataSourceName:"WinVolumeUsage-"'
<# GET requests carry no body, so the signed payload is an empty string #>
$data = ''

<# Construct URL #>
$url = 'https://' + $company + '.logicmonitor.com/santaba/rest' + $resourcePath + $queryParams

<# Get current time in milliseconds #>
$epoch = [Math]::Round((New-TimeSpan -start (Get-Date -Date "1/1/1970") -end (Get-Date).ToUniversalTime()).TotalMilliseconds)

<# Concatenate Request Details #>
$requestVars = $httpVerb + $epoch + $data + $resourcePath

<# Construct Signature #>
$hmac = New-Object System.Security.Cryptography.HMACSHA256
$hmac.Key = [Text.Encoding]::UTF8.GetBytes($accessKey)
$signatureBytes = $hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes($requestVars))
$signatureHex = [System.BitConverter]::ToString($signatureBytes) -replace '-'
$signature = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($signatureHex.ToLower()))

<# Construct Headers #>
$auth = 'LMv1 ' + $accessId + ':' + $signature + ':' + $epoch
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Authorization",$auth)
$headers.Add("Content-Type",'application/json')
$headers.Add("X-Version","3")

<# Make Request #>
$response = Invoke-RestMethod -Uri $url -Method $httpVerb -Headers $headers

<# Get Device DataSource ID #>
$deviceDataSourceId = $response.items.id

<# request details #>
$httpVerb = 'GET'
$resourcePath = "/device/devices/$deviceId/devicedatasources/$deviceDataSourceId/data"
$queryParams = ''

<# Construct URL #>
$url = 'https://' + $company + '.logicmonitor.com/santaba/rest' + $resourcePath + $queryParams

<# Get current time in milliseconds #>
$epoch = [Math]::Round((New-TimeSpan -start (Get-Date -Date "1/1/1970") -end (Get-Date).ToUniversalTime()).TotalMilliseconds)

<# Concatenate Request Details #>
$requestVars = $httpVerb + $epoch + $data + $resourcePath

<# Construct Signature #>
$hmac = New-Object System.Security.Cryptography.HMACSHA256
$hmac.Key = [Text.Encoding]::UTF8.GetBytes($accessKey)
$signatureBytes = $hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes($requestVars))
$signatureHex = [System.BitConverter]::ToString($signatureBytes) -replace '-'
$signature = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($signatureHex.ToLower()))

<# Construct Headers (no X-Version here, so the response is wrapped in .data) #>
$auth = 'LMv1 ' + $accessId + ':' + $signature + ':' + $epoch
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Authorization",$auth)
$headers.Add("Content-Type",'application/json')

<# Make Request #>
$response = Invoke-RestMethod -Uri $url -Method $httpVerb -Headers $headers

<# Print status and body of response #>
$status = $response.status
$body = $response.data | ConvertTo-Json -Depth 5

<# Pick the Nth element from pipeline input #>
function Select-Nth {
    param([int]$N)
    $Input | Select-Object -First $N | Select-Object -Last 1
}

<# Compare the 3rd datapoint of the newest row with that of the 20th row #>
$array1 = @($response.data.instances.'WinVolumeUsage-C:\'.values)
$first = $array1[0] | Select-Nth 3
$last = $array1[19] | Select-Nth 3
$growth = $first - $last
if (($growth -gt 2.5) -and ($first -ge 80)) {
    return 1
} else {
    return 2
}

Hope this gives you some ideas to develop alerting further 😁
144 Views, 10 likes, 2 Comments

system.info missing in the Info on a device
Hello,
I have system.info =~ “Integrated Lights-Out 4 255 Aug 1 6 2017” on 3 machines, but it is missing on 2 other machines. What am I missing in the process to get system.info to appear in the Info tab for a device and get populated? I checked the SNMP Service on each machine, and the Security tab is identical, with the same “Accepted community names” as well as the same list of IPs under “Accept SNMP packets from these hosts”. What did I miss?
Thanks,
Dom
Solved, 244 Views, 2 likes, 2 Comments

Datasource to monitor Windows Services/Processes automatically?
Hello,
We recently cloned 2 LogicMonitor out-of-the-box datasources (names: WinService- & WinProcessStats-) in order to enable the 'Active Discovery' feature on them. We did this because we need to discover services/processes automatically, since we don't have an 'exact list' of which services/processes we should monitor (due to the number of clients [+100] and the different services/solutions across them).
After enabling this, it works fine and does what we expect (discovers all the services/processes running on each box). We further added some filters to the Active Discovery for services in order to exclude common 'noisy' services and grab only the ones set to start automatically with the system.
Our problem arises when these 2 specific datasources start to impact collector performance (due to the huge number of WMI queries). It shows up as very high CPU consumption (almost 100% usage all the time), which further degrades collector performance and data collection (resulting in request timeouts and full WMI queues).
We also thought about creating 2 datasources (services/processes) for each client (with filters to grab the critical/wanted processes/services for the client in question), but that's a nightmare (especially when you have clients installing applications without any notice and expecting us to automatically pick up and monitor them).
Example of one of our scenarios (one of our clients):
- Collector is a Windows VM (VMware) with 8 GB of RAM and 4 allocated virtual processors (host processor is an Intel Xeon E5-2698 v3 @ 2.30 GHz)
- Currently, it monitors 78 Windows servers (not including the collector), and those 2 datasources are creating 12 700 instances (4513 services | 8187 processes) - examples below
This results in approx. 15 requests per second
This results in approx. 45 requests per second
According to the collector capacity document (ref. Medium Collector) we are below the limits (for WMI); however, those 2 datasources are contributing A LOT to filling the queues. We're finding errors on a regular basis - example below
To sum this up, we are looking for another way of doing the same thing without consuming so many resources on the collector end (due to the number of simultaneous WMI queries). Not sure if that's possible, though. Has anyone had this need in the past and been able to come up with a different, less resource-intensive solution? We're struggling here mainly because we come from an agent-based solution (which didn't face this problem, because the load was distributed across the individual agents on each device).
Appreciate the help in advance!
Thanks,
1.2K Views, 13 likes, 37 Comments
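
As a rough sanity check on the request rates quoted in the post above, the steady-state load from an instance-per-query datasource is approximately the instance count divided by the collection interval. The Python sketch below shows that arithmetic. The 300-second and 180-second intervals are assumptions picked only because they reproduce the quoted figures; the post does not state the intervals or which figure belongs to which datasource, and `requests_per_second` is a hypothetical helper.

```python
def requests_per_second(instance_count: int, interval_seconds: int) -> float:
    """Average WMI requests per second, assuming one request per instance per poll."""
    return instance_count / interval_seconds


# Instance counts are taken from the post; intervals are ASSUMED for illustration.
service_rate = requests_per_second(4513, 300)  # services, assumed 5-minute polling
process_rate = requests_per_second(8187, 180)  # processes, assumed 3-minute polling

if __name__ == "__main__":
    print(f"services:  {service_rate:.1f} req/s")
    print(f"processes: {process_rate:.1f} req/s")
    print(f"combined:  {service_rate + process_rate:.1f} req/s")
```

Under these assumptions the rate scales linearly with instance count, so tightening the Active Discovery filters or lengthening the collection interval reduces collector load proportionally.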