/*

Scollector is a metric collection agent for OpenTSDB 2.0 and Bosun.

tcollector (https://github.com/OpenTSDB/tcollector) is OpenTSDB's data
collection framework built for OpenTSDB 1.0. scollector aims to be tcollector
for OpenTSDB 2.0 and is one method of sending data to Bosun (http://bosun.org/)
for monitoring.

Unlike tcollector, scollector is a single binary where all collectors are
compiled into scollector itself. scollector supports external collectors, but
your goal should be to use those temporarily until the Go version is written or
the target system sends data directly to OpenTSDB or Bosun. scollector has
native collectors for Linux, Darwin, and Windows and can pull data from other
systems such as AWS, SNMP, and vSphere.

Usage:
	scollector [flag]

The flags are:

	-h=""
		OpenTSDB or Bosun host. Overrides Host in conf file.
	-f=""
		Only include collectors matching these comma separated terms. Prefix
		with - to invert match and exclude collectors matching those terms. Use
		*,-term,-anotherterm to include all collectors except excluded terms.
	-b=0
		OpenTSDB batch size. Default is 500.
	-conf=""
		Location of configuration file. Defaults to scollector.toml in directory of
		the scollector executable.
	-l
		List available collectors (after Filter is applied).
	-m
		Disable sending of metadata.
	-version
		Prints the version and exits.

Additional flags on Windows:
	-winsvc=""
		Windows Service management; can be: install, remove, start, stop

Debug flags:
	-d
		enables debug output
	-p
		print to screen instead of sending to a host
	-fake=0
		generates X fake data points per second on the test.fake metric

The only required parameter is the host, which may be specified in the conf
file or with -h.
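
The include/exclude semantics of -f can be illustrated with a small sketch. It
is not scollector's implementation: the names matchesFilter and parseFilter are
hypothetical, and it assumes that terms match as plain substrings of the
collector name and that * matches every collector. The real matching rules may
differ.

```go
package main

import "strings"

// parseFilter splits the comma separated value given to -f into terms,
// dropping empty entries. (Hypothetical helper, for illustration only.)
func parseFilter(flagValue string) []string {
	var terms []string
	for _, t := range strings.Split(flagValue, ",") {
		if t = strings.TrimSpace(t); t != "" {
			terms = append(terms, t)
		}
	}
	return terms
}

// matchesFilter reports whether a collector name passes the filter terms.
// Assumption: a term matches when it is a substring of the name, and "*"
// matches every name. A term prefixed with "-" excludes matching names.
func matchesFilter(name string, terms []string) bool {
	if len(terms) == 0 {
		return true // no filter: keep every collector
	}
	included := false
	for _, term := range terms {
		if strings.HasPrefix(term, "-") {
			ex := strings.TrimPrefix(term, "-")
			// An exclude term wins over any include term.
			if ex != "" && strings.Contains(name, ex) {
				return false
			}
			continue
		}
		if term == "*" || strings.Contains(name, term) {
			included = true
		}
	}
	return included
}
```

Under these assumptions, the filter "*,-snmp" keeps a collector named
"cpu_linux" and drops one named "snmp_ifaces", because an exclude term takes
precedence over the * include term.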

Warning

scollector has not been tested outside of the Stack Exchange environment, and
thus may act incorrectly elsewhere.

scollector requires the new HTTP API of OpenTSDB 2.1 with gzip support. Ensure
that is in use if not using the OpenTSDB docker image.

Logs

If started with -p or -d, scollector logs to Stdout. Otherwise, on Unixes,
scollector logs to syslog. On Windows when started as a service, the Event Log
is used.

External Collectors

See http://bosun.org/scollector/external-collectors for details about using
external scripts or programs to collect metrics.

Configuration File

If scollector.toml exists in the same directory as the scollector
executable or is specified via the -conf="" flag, its content
will be used to set configuration flags. The format is toml
(https://github.com/toml-lang/toml/blob/master/versions/en/toml-v0.2.0.md).
Available keys are:

Host (string): the OpenTSDB or Bosun host to send data to, supports TLS and
HTTP Basic Auth.

	Host = "https://user:password@example.com/"

FullHost (boolean): enables full hostnames: doesn't truncate to first ".".

ColDir (string): is the external collectors directory.

Tags (table of strings): are added to every datapoint. If a collector specifies
the same tag key, this one will be overwritten. The host tag is not supported.

Hostname (string): overrides the system hostname.

DisableSelf (boolean): disables sending of scollector self metrics.

Freq (integer): is the default frequency in seconds for most collectors.

BatchSize (integer): is the number of metrics that will be sent in each batch.
Default is 500.

MaxQueueLen (integer): is the number of metrics kept internally.
Default is 200000.

UserAgentMessage (string): is an optional message that will be appended to the
User Agent when making HTTP requests.
This can be used to add contact details
so external services are aware of who is making the requests.
Example: Scollector/0.6.0 (UserAgentMessage added here)

Filter (array of string): Only include collectors matching these terms. Prefix
with - to invert match and exclude collectors matching those terms. Use
*,-term,-anotherterm to include all collectors except excluded terms.

MetricFilters (array of string): only send metrics matching these regular
expressions. Example ['^(win\.cpu|win\.system\..*)$', 'free']

IfaceExpr (string): Replaces the default regular expression for interface name
matching on Linux.

PProf (string): optional IP:Port binding to be used for debugging with pprof.
Examples: localhost:6060 for loopback or :6060 for all IP addresses.

MetricPrefix (string): optional prefix prepended to all metric paths.

Collector configuration keys

Following are configurations for collectors that do not autodetect.

KeepalivedCommunity (string): if not empty, enables the Keepalived collector
with the specified community.

	KeepalivedCommunity = "keepalivedcom"

HAProxy (array of table, keys are User, Password, Instances): HAProxy instances
to poll. The Instances key is an array of table with keys User, Password, Tier,
and URL. If User is specified for an instance, User and Password override the
common ones.

	[[HAProxy]]
	  User = "hauser"
	  Password = "hapass"
	  [[HAProxy.Instances]]
	    Tier = "1"
	    URL = "http://ny-host01:17/haproxy\;csv"
	  [[HAProxy.Instances]]
	    Tier = "2"
	    URL = "http://ny-host01:26/haproxy\;csv"
	  [[HAProxy.Instances]]
	    Tier = "3"
	    URL = "http://ny-host01:40/haproxy\;csv"
	  [[HAProxy.Instances]]
	    User = "hauser2"
	    Password = "hapass2"
	    Tier = "1"
	    URL = "http://ny-host01:80/haproxy\;csv"

SNMP (array of table, keys are Community, Host, and MIBs): SNMP hosts to connect
to at a 5 minute poll interval.

	[[SNMP]]
	  Community = "com"
	  Host = "host"
	  MIBs = ["cisco"]
	[[SNMP]]
	  Community = "com2"
	  Host = "host2"
	  # List of mibs to run for this host. Default is built-in set of ["ifaces","cisco"]
	  MIBs = ["custom", "ifaces"]

MIBs (map of string to table): Allows user-specified, custom SNMP configurations.

	[MIBs]
	  [MIBs.cisco] #can name anything you want
	    BaseOid = "1.3.6.1.4.1.9.9" # common base for all metrics in this mib

	    # simple, single key metrics
	    [[MIBs.cisco.Metrics]]
	      Metric = "cisco.cpu"
	      Oid = ".109.1.1.1.1.6"
	      Unit = "percent"
	      RateType = "gauge"
	      Description = "cpu percent used by this device"

	    # can also iterate over snmp tables
	    [[MIBs.cisco.Trees]]
	      BaseOid = ".48.1.1.1" #common base oid for this tree

	      # tags to apply to metrics in this tree. Can come from another oid, or specify "idx" to use
	      # the numeric index as the tag value. Can specify multiple tags, but must supply one.
	      # all tags and metrics should have the same number of rows per query.
	      [[MIBs.cisco.Trees.Tags]]
	        Key = "name"
	        Oid = ".2"
	      [[MIBs.cisco.Trees.Metrics]]
	        Metric = "cisco.mem.used"
	        Oid = ".5"
	      [[MIBs.cisco.Trees.Metrics]]
	        Metric = "cisco.mem.free"
	        Oid = ".6"

ICMP (array of table, keys are Host): ICMP hosts to ping.

	[[ICMP]]
	  Host = "internal-router"
	[[ICMP]]
	  Host = "backup-router"

Vsphere (array of table, keys are Host, User, Password): vSphere hosts to poll.

	[[Vsphere]]
	  Host = "vsphere01"
	  User = "vuser"
	  Password = "pass"

AWS (array of table, keys are AccessKey, SecretKey, Region, BillingProductCodesRegex,
BillingBucketName, BillingBucketPath, BillingPurgeDays): AWS hosts to poll, and associated
billing information.

To report AWS billing information to OpenTSDB or Bosun, you need to configure AWS to
generate billing reports, which will be put into an S3 bucket. For more detail, see:
http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/detailed-billing-reports.html

Once the reports are going into the S3 bucket, the Bucket Name and the Prefix Path that
you entered during the report setup need to be entered below. Do not enter a blank bucket
path as this is not supported.

Reports older than the number of days set in BillingPurgeDays are purged by scollector. Set
BillingPurgeDays to 0 to disable purging of old reports (note that this may increase your S3
usage costs as all reports are processed each time the collector runs).

Do not populate the Billing keys if you do not wish to load billing data into OpenTSDB or
Bosun.

Only products whose name matches the BillingProductCodesRegex key will have their billing
data sent to OpenTSDB or Bosun.

	[[AWS]]
	  AccessKey = "aoesnuth"
	  SecretKey = "snch0d"
	  Region = "somewhere"
	  BillingProductCodesRegex = "^Amazon(S3|Glacier|Route53)$"
	  BillingBucketName = "mybucket.billing"
	  BillingBucketPath = "reports"
	  BillingPurgeDays = 2

AzureEA (array of table, keys are EANumber, APIKey, LogBillingDetails, LogResourceDetails,
and LogExtraTags): Azure Enterprise Agreements to poll for billing information.

EANumber is your Enterprise Agreement number.
You can find this in your Enterprise Agreement portal.

APIKey is the API key as provided by the Azure EA Portal. To generate your API key for this collector,
you will need to log into your Azure Enterprise Agreement portal (ea.azure.com), click the
"Download Usage" link, then choose "API Key" on the download page. You can then generate your API
key there. Keys are valid for 6 months, so this collector will need maintenance twice a year.

LogBillingDetails tells scollector to add the following tags to your metrics:
	- costcenter
	- accountname
	- subscription

LogResourceDetails tells scollector to add the following tags to your metrics:
	- resoucegroup
	- resourcelocation

LogExtraTags tells scollector to take resource tags and add them to your metrics. Careful: this will
add all tags as they exist in Azure, so you may end up with a large number of distinct tags.
It will not process any tags that begin with "hidden".

If you are a heavy Azure EA user, then these additional tags may be useful for breaking down costs.

	[[AzureEA]]
	  EANumber = "123456"
	  APIKey = "joiIiwiaXNzIjoiZWEubWljcm9zb2Z0YXp1cmUuY29tIiwiYXVkIjoiY2xpZW50LmVhLm1"
	  LogBillingDetails = false
	  LogResourceDetails = false
	  LogExtraTags = false

Process: processes to monitor.

ProcessDotNet: .NET processes to monitor on Windows.

See http://bosun.org/scollector/process-monitoring for details about Process and
ProcessDotNet.

HTTPUnit (array of table, keys are TOML, Hiera, Freq): httpunit TOML and Hiera
files to read and monitor. See https://github.com/StackExchange/httpunit
for documentation about the toml file. TOML and Hiera may both be specified,
or just one. Freq is collector frequency as a duration string (default 5m).

	[[HTTPUnit]]
	  TOML = "/path/to/httpunit.toml"
	  Hiera = "/path/to/listeners.json"
	[[HTTPUnit]]
	  TOML = "/some/other.toml"
	  Freq = "30s"

Riak (array of table, keys are URL): Riak hosts to poll.

	[[Riak]]
	  URL = "http://localhost:8098/stats"

RabbitMQ (array of table, keys are URL): RabbitMQ hosts to poll.
Regardless of config, the collector will automatically poll the
management plugin at http://guest:guest@127.0.0.1:15672/ .

	[[RabbitMQ]]
	  URL = "https://user:password@hostname:15671"

Cadvisor: Cadvisor endpoints to poll.
Cadvisor collects system statistics about running containers.
See https://github.com/google/cadvisor/ for documentation about configuring
cadvisor. You can optionally enable per-CPU usage metric reporting with
PerCpuUsage, and optionally use IsRemote to disable block device lookups.

	[[Cadvisor]]
	  URL = "http://localhost:8080"
	  PerCpuUsage = true
	  IsRemote = false

RedisCounters: Reads a hash of metric/counters from a redis database.

	[[RedisCounters]]
	  Server = "localhost:6379"
	  Database = 2

Expects data populated via bosun's udp listener in the "scollectorCounters" hash.

ExtraHop (array of table): ExtraHop hosts to poll. The two filter options, FilterBy and
FilterPercent, specify how scollector should filter out traffic from being submitted. The
valid values for FilterBy are:

	- namedprotocols (Only protocols that have an explicit name are submitted. The rest of the
	  traffic will be pushed into proto=unnamed. So any protocol that begins with
	  "tcp", "udp" or "SSL" will not be submitted, with the exception of SSL443.)
	- toppercent (The top n% of traffic by volume will be submitted. The rest of the traffic
	  will be pushed into proto=otherproto.)
	- none (All protocols of any size will be submitted.)

FilterPercent applies when the FilterBy option is set to "toppercent". Only protocols that account
for this much traffic will be logged.
For example, if this is set to 90, a protocol that
accounts for less than 10% of the traffic will be dropped. This is OK if your traffic is
heavily dominated by a small set of protocols, but if you have a fairly even spread of protocols
then this filtering loses its usefulness.

AdditionalMetrics is formatted as such: [object_type].[object_id].[metric_category].[metric_spec_name]

	- object_type: is one of: "network", "device", "application", "vlan", "device_group", "activity_group"
	- object_id: can be found by querying the ExtraHop API (through the API Explorer) under the endpoint
	  for the object type. For example, for "application", you would query the "/applications/"
	  endpoint and locate the ID of the application you want to query.
	- metric_category: can be found in the Metric Catalogue for the metric you want to query. For
	  custom metrics, this is always "custom_detail".
	- metric_spec_name: can be found in the Metric Catalogue for the metric you want to query. For
	  custom metrics, this is the name you have specified in the metricAddDetailCount() function in
	  a trigger.

For these additional metrics, the key for the metric is expected to be a comma separated list of key=value
pairs. This key will be converted into an OpenTSDB tagset. For example, if you have a key of
"client=192.168.0.1,server=192.168.0.9,port=21441", this will be converted into an OpenTSDB tagset of the same
values.

CAUTION: Do not include unbounded values in your key if you can help it. Putting in something like client IP, or
source/destination port, which are out of your control and specified by people external to your network, could
end up putting millions of different keys into your Bosun instance, which is something you probably don't want.

CertificateSubjectMatch and CertificateActivityGroup are used for collecting SSL information from ExtraHop.
The key CertificateSubjectMatch is used to match against the certificate subject. If there is no match, we
discard the certificate record. This is important as certificate subjects are essentially unbounded, because
ExtraHop returns all certificates it sees, regardless of where they originated.

The key CertificateActivityGroup is the Activity Group you want to pass through to ExtraHop to pull the certificates
from. There is a group called "SSL Servers" which is most likely the group you want to use. You will need to discover
the group number for this group and put it in here.

	[[ExtraHop]]
	  Host = "extrahop01"
	  APIkey = "abcdef1234567890"
	  FilterBy = "toppercent"
	  FilterPercent = 75
	  AdditionalMetrics = [ "application.12.custom_detail.my trigger metric" ]
	  CertificateSubjectMatch = "example.(com|org|net)"
	  CertificateActivityGroup = 46

LocalListener (string): local_listener listens for HTTP requests and forwards
them to the configured OpenTSDB host while adding defined tags to
metrics.

	LocalListener = "localhost:4242"

TagOverride (array of tables, keys are CollectorExpr, MatchedTags and Tags): if a collector
name matches CollectorExpr, MatchedTags and Tags will be merged into all outgoing messages
produced by the collector, in that order. MatchedTags applies a regexp to the tag
named by each key and adds tags based on the named match groups defined in the
regexp. Tags defined in Tags are merged afterwards; defining a tag as an empty string
deletes it.

	[[TagOverride]]
	  CollectorExpr = 'cadvisor'
	  [TagOverride.MatchedTags]
	    docker_name = 'k8s_(?P<container_name>[^\.]+)\.[0-9a-z]+_(?P<pod_name>[^-]+)'
	    docker_id = '^(?P<docker_id>.{12})'
	  [TagOverride.Tags]
	    docker_name = ''
	    source = 'kubelet'

Oracles (array of table, keys are ClusterName, Instances): Oracle database
instances to poll.
The Instances key is an array of table with keys
ConnectionString and Role, which are the same as using sqlplus.

	[[Oracles]]
	  ClusterName = "oracle rac name"
	  [[Oracles.instances]]
	    ConnectionString = "/"
	    Role = "sysdba"
	  [[Oracles.instances]]
	    ConnectionString = "username/password@oraclehost/sid"
	  [[Oracles.instances]]
	    ConnectionString = "/@localnodevip/sid"
	    Role = "sysdba"

By default Elastic nodes are auto-detected on localhost:9200, but if you have a
node running on another network interface, a non-standard port or even multiple
nodes running on the same host you can use the Elastic configuration. It also lets
you specify basic auth credentials and use TLS by setting the Scheme to https:

	[[Elastic]]
	  Host = "192.168.1.1"
	  Port = 9201
	  ClusterInterval = "10s"
	  IndexInterval = "1m"
	  User = "user"
	  Password = "pass"
	  Scheme = "https"

	[[Elastic]]
	  Host = "192.168.1.1"
	  Port = 9202
	  ClusterInterval = "10s"
	  IndexInterval = "1m"

Windows

scollector has full Windows support. It can be run standalone, or installed as a
service (see -winsvc). The Event Log is used when installed as a service.

*/
package main