Thursday, June 22, 2017

Trying Out SNMP Device Monitoring with netdata



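After the netdata trial installation ( http://xrcd2.blogspot.tw/2017/06/netdata_21.html ),
the next step was to try ordinary SNMP monitoring through netdata.
Most network management software offers this as a basic feature, but with netdata
you still need some SNMP fundamentals. So here is a roundup of the basics I wrote up earlier,
for anyone who needs them.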


SNMP basics


(1) http://xrcd2.blogspot.tw/2012/10/snmp-oids-zabbix.html
     [ Adding Zabbix monitoring with SNMP OIDs ]
(2) http://xrcd2.blogspot.tw/2012/10/snmp-oid.html
     [ SNMP OIDs revisited ]
(3) http://xrcd2.blogspot.tw/2016/11/snmp-oids.html
     [ SNMP OIDs, part 3 ]
(4) http://xrcd2.blogspot.tw/2017/04/snmp-oids.html
     [ SNMP OIDs, part 4 ]

Additional notes

(1) http://xrcd2.blogspot.tw/2012/11/cisco-router-interface-reliability.html
     [ Cisco Router Interface Reliability Status Monitor ( DIY cacti template ) ]


Main text

Reference configuration

https://github.com/firehol/netdata/blob/master/conf.d/node.d/snmp.conf.md

SNMP Data Collector


example:

{
    "enable_autodetect": false,
    "update_every": 5,
    "max_request_size": 100,
    "servers": [
        {
            "hostname": "10.11.12.8",
            "community": "public",
            "update_every": 10,
            "max_request_size": 50,
            "options": { "timeout": 10000 },
            "charts": {
                "snmp_switch.bandwidth_port1": {
                    "title": "Switch Bandwidth for port 1",
                    "units": "kilobits/s",
                    "type": "area",
                    "priority": 1,
                    "family": "ports",
                    "dimensions": {
                        "in": {
                            "oid": "1.3.6.1.2.1.2.2.1.10.1",
                            "algorithm": "incremental",
                            "multiplier": 8,
                            "divisor": 1024,
                            "offset": 0
                        },
                        "out": {
                            "oid": "1.3.6.1.2.1.2.2.1.16.1",
                            "algorithm": "incremental",
                            "multiplier": -8,
                            "divisor": 1024,
                            "offset": 0
                        }
                    }
                },
                "snmp_switch.bandwidth_port2": {
                    "title": "Switch Bandwidth for port 2",
                    "units": "kilobits/s",
                    "type": "area",
                    "priority": 1,
                    "family": "ports",
                    "dimensions": {
                        "in": {
                            "oid": "1.3.6.1.2.1.2.2.1.10.2",
                            "algorithm": "incremental",
                            "multiplier": 8,
                            "divisor": 1024,
                            "offset": 0
                        },
                        "out": {
                            "oid": "1.3.6.1.2.1.2.2.1.16.2",
                            "algorithm": "incremental",
                            "multiplier": -8,
                            "divisor": 1024,
                            "offset": 0
                        }
                    }
                }
            }
        }
    ]
}



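From the example above, the netdata SNMP data collector works on OIDs.
For interface traffic, use 1.3.6.1.2.1.2.2.1.10.<ifIndex> ( inbound, ifInOctets )
and 1.3.6.1.2.1.2.2.1.16.<ifIndex> ( outbound, ifOutOctets ) to get traffic usage,
where <ifIndex> is the number that follows "ifName." in the snmpwalk output.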

At this point, snmpwalk and snmpget can be used to work out the correct snmp.conf settings.

Taking a Cisco 2960S switch as an example:



[root@centos73 ~]#  snmpwalk -Os -c cisco -v 2c 192.168.111.198  system | more
sysDescr.0 = STRING: Cisco IOS Software, C2960S Software (C2960S-UNIVERSALK9-M),
Version 15.2(1)E, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2013 by Cisco Systems, Inc.
Compiled Tue 27-Aug-13 11:59 by prod_rel_team
sysObjectID.0 = OID: enterprises.9.1.1208
sysUpTimeInstance = Timeticks: (758822632) 87 days, 19:50:26.32
sysContact.0 = STRING:
sysName.0 = STRING: Switch
sysLocation.0 = STRING:
sysServices.0 = INTEGER: 6
sysORLastChange.0 = Timeticks: (0) 0:00:00.00
sysORID.1 = OID: enterprises.9.7.129
sysORID.2 = OID: enterprises.9.7.115

.....
.....
.....
.....


[root@centos73 ~]#  snmpwalk -Os -c cisco  -v 2c 192.168.111.198  1.3.6.1.2.1.31.1.1.1.1
ifName.1 = STRING: Vl1
ifName.5137 = STRING: StackPort1
ifName.10101 = STRING: Gi1/0/1
ifName.10102 = STRING: Gi1/0/2
ifName.10103 = STRING: Gi1/0/3
ifName.10104 = STRING: Gi1/0/4
ifName.10105 = STRING: Gi1/0/5
ifName.10106 = STRING: Gi1/0/6
ifName.10107 = STRING: Gi1/0/7
ifName.10108 = STRING: Gi1/0/8
ifName.10109 = STRING: Gi1/0/9
ifName.10110 = STRING: Gi1/0/10
ifName.10111 = STRING: Gi1/0/11
ifName.10112 = STRING: Gi1/0/12
ifName.10113 = STRING: Gi1/0/13
ifName.10114 = STRING: Gi1/0/14
ifName.10115 = STRING: Gi1/0/15
ifName.10116 = STRING: Gi1/0/16
ifName.10117 = STRING: Gi1/0/17
ifName.10118 = STRING: Gi1/0/18
ifName.10119 = STRING: Gi1/0/19
ifName.10120 = STRING: Gi1/0/20
ifName.10121 = STRING: Gi1/0/21
ifName.10122 = STRING: Gi1/0/22
ifName.10123 = STRING: Gi1/0/23
ifName.10124 = STRING: Gi1/0/24
ifName.10125 = STRING: Gi1/0/25
ifName.10126 = STRING: Gi1/0/26
ifName.10127 = STRING: Gi1/0/27
ifName.10128 = STRING: Gi1/0/28
ifName.10129 = STRING: Gi1/0/29
ifName.10130 = STRING: Gi1/0/30
ifName.10131 = STRING: Gi1/0/31
ifName.10132 = STRING: Gi1/0/32
ifName.10133 = STRING: Gi1/0/33
ifName.10134 = STRING: Gi1/0/34
ifName.10135 = STRING: Gi1/0/35
ifName.10136 = STRING: Gi1/0/36
ifName.10137 = STRING: Gi1/0/37
ifName.10138 = STRING: Gi1/0/38
ifName.10139 = STRING: Gi1/0/39
ifName.10140 = STRING: Gi1/0/40
ifName.10141 = STRING: Gi1/0/41
ifName.10142 = STRING: Gi1/0/42
ifName.10143 = STRING: Gi1/0/43
ifName.10144 = STRING: Gi1/0/44
ifName.10145 = STRING: Gi1/0/45
ifName.10146 = STRING: Gi1/0/46
ifName.10147 = STRING: Gi1/0/47
ifName.10148 = STRING: Gi1/0/48
ifName.10149 = STRING: Gi1/0/49
ifName.10150 = STRING: Gi1/0/50
ifName.10151 = STRING: Gi1/0/51
ifName.10152 = STRING: Gi1/0/52
ifName.12001 = STRING: Nu0
ifName.12002 = STRING: Fa0
[root@centos73 ~]#
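The lookup from port name to OID can also be scripted. The following is only a minimal sketch: parse_ifname_walk and traffic_oids are hypothetical helper names, and the sample text is a shortened copy of the snmpwalk output above.

```python
import re

IF_IN_OCTETS = "1.3.6.1.2.1.2.2.1.10"   # IF-MIB::ifInOctets
IF_OUT_OCTETS = "1.3.6.1.2.1.2.2.1.16"  # IF-MIB::ifOutOctets


def parse_ifname_walk(text):
    """Map interface name -> ifIndex from `snmpwalk -Os ... ifName` output."""
    mapping = {}
    for line in text.splitlines():
        match = re.match(r"ifName\.(\d+) = STRING: (\S+)", line.strip())
        if match:
            mapping[match.group(2)] = int(match.group(1))
    return mapping


def traffic_oids(if_index):
    """Build the inbound/outbound octet counter OIDs for one ifIndex."""
    return {
        "in": f"{IF_IN_OCTETS}.{if_index}",
        "out": f"{IF_OUT_OCTETS}.{if_index}",
    }


# Shortened copy of the walk output shown above.
sample = """\
ifName.1 = STRING: Vl1
ifName.10101 = STRING: Gi1/0/1
ifName.10111 = STRING: Gi1/0/11
"""

index = parse_ifname_walk(sample)["Gi1/0/11"]
print(index, traffic_oids(index))
```

The two OIDs it prints for Gi1/0/11 are the ones that go into the "oid" fields of snmp.conf.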

Get the traffic counters for switch port Gi1/0/11 (ifName.10111 = STRING: Gi1/0/11)

In

[root@centos73 ~]# snmpget -v2c -c cisco 192.168.111.198  1.3.6.1.2.1.2.2.1.10.10111
IF-MIB::ifInOctets.10111 = Counter32: 2637654576

Out

[root@centos73 ~]# snmpget -v2c -c cisco 192.168.111.198  1.3.6.1.2.1.2.2.1.16.10111
IF-MIB::ifOutOctets.10111 = Counter32: 1963738193
[root@centos73 ~]#
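Both values are Counter32 octet counters, so they only become a rate once two readings are compared. The sketch below illustrates that arithmetic with the same multiplier 8 and divisor 1024 used in the example config. It is not netdata's own code, the function names are made up for illustration, and the readings in the example are invented.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets counted between two readings, allowing for at most one wrap."""
    return (current - previous) % modulus


def kilobits_per_second(previous, current, seconds):
    """Octet delta -> bits (x8) -> 1024-bit units (/1024) -> per second."""
    return counter_delta(previous, current) * 8 / 1024 / seconds


def seconds_to_wrap(link_bits_per_second):
    """How long a 32-bit octet counter lasts at a sustained line rate."""
    return COUNTER32_MODULUS * 8 / link_bits_per_second


# Invented readings 10 seconds apart; the counter wrapped in between.
print(counter_delta(4294967000, 704))
print(kilobits_per_second(4294967000, 704, 10))
print(seconds_to_wrap(1_000_000_000))
```

At a sustained 1 Gbit/s a 32-bit octet counter wraps in a little over 34 seconds, so a long polling interval can miss wraps on a busy gigabit port. IF-MIB also defines 64-bit counters, ifHCInOctets (1.3.6.1.2.1.31.1.1.1.6) and ifHCOutOctets (1.3.6.1.2.1.31.1.1.1.10); whether this plugin version reads them correctly was not tested here.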



=================


[root@centos73 node.d]# pwd
/etc/netdata/node.d


Adapt the example above:


[root@centos73 node.d]# vi   snmp.conf  [Figure 1]
{
    "enable_autodetect": false,
    "update_every": 5,
    "max_request_size": 100,
    "servers": [
        {
            "hostname": "192.168.111.198",
            "community": "cisco",
            "update_every": 10,
            "max_request_size": 50,
            "options": { "timeout": 10000 , "version": 1 },
            "charts": {
                "snmp_switch.bandwidth_port1": {
                    "title": "Switch Bandwidth for port 1",
                    "units": "kilobits/s",
                    "type": "area",
                    "priority": 1,
                    "family": "ports",
                    "dimensions": {
                        "in": {
                            "oid": "1.3.6.1.2.1.2.2.1.10.10101",
                            "algorithm": "incremental",
                            "multiplier": 8,
                            "divisor": 1024,
                            "offset": 0
                        },
                        "out": {
                            "oid": "1.3.6.1.2.1.2.2.1.16.10101",
                            "algorithm": "incremental",
                            "multiplier": -8,
                            "divisor": 1024,
                            "offset": 0
                        }
                    }
                },
                "snmp_switch.bandwidth_port11": {
                    "title": "Switch Bandwidth for port 11",
                    "units": "kilobits/s",
                    "type": "area",
                    "priority": 1,
                    "family": "ports",
                    "dimensions": {
                        "in": {
                            "oid": "1.3.6.1.2.1.2.2.1.10.10111",
                            "algorithm": "incremental",
                            "multiplier": 8,
                            "divisor": 1024,
                            "offset": 0
                        },
                        "out": {
                            "oid": "1.3.6.1.2.1.2.2.1.16.10111",
                            "algorithm": "incremental",
                            "multiplier": -8,
                            "divisor": 1024,
                            "offset": 0
                        }
                    }
                }
            }
        }
    ]
}
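Editing one chart block per port by hand gets tedious, so the same file can be generated. This is a rough sketch that reproduces the structure above; port_chart and build_config are hypothetical helper names, and the option values are simply copied from the config shown.

```python
import json


def port_chart(label, if_index):
    """One bandwidth chart with an inbound and an outbound dimension."""
    def dimension(column, sign):
        return {
            "oid": f"1.3.6.1.2.1.2.2.1.{column}.{if_index}",
            "algorithm": "incremental",
            "multiplier": 8 * sign,   # negative draws "out" below the axis
            "divisor": 1024,
            "offset": 0,
        }

    return {
        "title": f"Switch Bandwidth for port {label}",
        "units": "kilobits/s",
        "type": "area",
        "priority": 1,
        "family": "ports",
        "dimensions": {"in": dimension(10, 1), "out": dimension(16, -1)},
    }


def build_config(hostname, community, ports):
    """ports maps a display label (port number) to its ifIndex."""
    charts = {
        f"snmp_switch.bandwidth_port{label}": port_chart(label, if_index)
        for label, if_index in ports.items()
    }
    return {
        "enable_autodetect": False,
        "update_every": 5,
        "max_request_size": 100,
        "servers": [{
            "hostname": hostname,
            "community": community,
            "update_every": 10,
            "max_request_size": 50,
            "options": {"timeout": 10000, "version": 1},
            "charts": charts,
        }],
    }


config = build_config("192.168.111.198", "cisco", {1: 10101, 11: 10111})
print(json.dumps(config, indent=4))
```

Redirecting the output to a file and comparing it with the hand-written snmp.conf is a quick way to spot a mistyped OID.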



========

Adapted from another example, multiply_range


[root@centos73 node.d]# cat snmp.conf [Figure 2]
{
    "enable_autodetect": false,
    "update_every": 60,
    "servers": [
        {
            "hostname": "192.168.111.198",
            "community": "cisco",
            "update_every": 60,
            "options": { "timeout": 20000, "version": 1 },
            "charts": {
                "snmp_switch.bandwidth_port": {
                    "title": "Switch Bandwidth for port ",
                    "units": "kilobits/s",
                    "type": "area",
                    "priority": 1,
                    "family": "ports",
                    "multiply_range": [ 10101, 10152 ],
                    "dimensions": {
                        "in": {
                            "oid": "1.3.6.1.2.1.2.2.1.10.",
                            "algorithm": "incremental",
                            "multiplier": 8,
                            "divisor": 1024,
                            "offset": 0
                        },
                        "out": {
                            "oid": "1.3.6.1.2.1.2.2.1.16.",
                            "algorithm": "incremental",
                            "multiplier": -8,
                            "divisor": 1024,
                            "offset": 0
                        }
                    }
                }
            }
        }
    ]
}
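As I read snmp.conf.md, multiply_range clones the chart once per index and appends the index to the chart id, the title and every dimension OID, which is why the OIDs above end with a dot. The sketch below imitates that expansion purely as an illustration. It is not the plugin's code, and expand_multiply_range is a made-up name.

```python
import copy


def expand_multiply_range(chart_id, chart):
    """Return {expanded_id: chart} with the index appended to id, title and OIDs."""
    first, last = chart["multiply_range"]
    expanded = {}
    for index in range(first, last + 1):
        clone = copy.deepcopy(chart)
        del clone["multiply_range"]
        clone["title"] += str(index)
        for dimension in clone["dimensions"].values():
            dimension["oid"] += str(index)
        expanded[chart_id + str(index)] = clone
    return expanded


# Trimmed-down copy of the chart template above.
template = {
    "title": "Switch Bandwidth for port ",
    "multiply_range": [10101, 10152],
    "dimensions": {
        "in": {"oid": "1.3.6.1.2.1.2.2.1.10."},
        "out": {"oid": "1.3.6.1.2.1.2.2.1.16."},
    },
}

charts = expand_multiply_range("snmp_switch.bandwidth_port", template)
print(len(charts))
print(charts["snmp_switch.bandwidth_port10111"]["dimensions"]["in"]["oid"])
```

Under that reading, the range 10101 to 10152 yields 52 charts, and each chart is labelled with its ifIndex (for example "port 10111") rather than the physical port number.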

===================

Verifying the snmp plugin

( See the [ Testing the configuration ] section of
 https://github.com/firehol/netdata/blob/master/conf.d/node.d/snmp.conf.md )

[root@centos73 node.d]# /usr/libexec/netdata/plugins.d/node.d.plugin 1 snmp
/usr/libexec/netdata/plugins.d/node.d.plugin: line 2: exec: ERROR node.js IS NOT AVAILABLE IN THIS SYSTEM: not found


This message means the nodejs package is not installed.

Fix

#yum install epel-release
#yum install nodejs

Verify nodejs

[root@centos73 node.d]# node --version
v6.10.3


Once it is installed, test again

[root@centos73 node.d]# /usr/libexec/netdata/plugins.d/node.d.plugin 1 snmp
2017-06-22 11:07:00: node.d.plugin: ERROR: snmp: 192.168.111.198: Received error = TypeError: snmp.varbindError is not a function
    at Object.responseCb (/usr/libexec/netdata/node.d/snmp.node.js:267:89)
    at Object.feedCb (/usr/libexec/netdata/node.d/node_modules/net-snmp.js:646:8)
    at Object.Session.onSimpleGetResponse [as onResponse] (/usr/libexec/netdata/node.d/node_modules/net-snmp.js:960:7)
    at Session.onMsg (/usr/libexec/netdata/node.d/node_modules/net-snmp.js:929:9)
    at emitTwo (events.js:106:13)
    at Socket.emit (events.js:191:7)
    at UDP.onMessage (dgram.js:549:8) varbinds = undefined
DISABLE

This message means snmp.conf is misconfigured; use snmpwalk and snmpget to check the SNMP settings (community, version, OIDs).
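Some mistakes can be caught before the plugin is even started. The checker below is a heuristic sketch and not part of netdata: check_snmp_conf is a hypothetical name, and the rule that an OID ends with a dot only when multiply_range is present is an assumption drawn from the two configs above. It cannot detect a wrong community string or an ifIndex that does not exist on the device, so snmpget is still needed for that.

```python
import json
import re

# Digits separated by dots, with an optional trailing dot.
NUMERIC_OID = re.compile(r"^\d+(\.\d+)+\.?$")


def check_snmp_conf(text):
    """Return a list of obvious problems; an empty list means none were found."""
    try:
        conf = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    problems = []
    for server in conf.get("servers", []):
        host = server.get("hostname", "<no hostname>")
        if "community" not in server:
            problems.append(f"{host}: missing community")
        for chart_id, chart in server.get("charts", {}).items():
            ranged = "multiply_range" in chart
            for name, dimension in chart.get("dimensions", {}).items():
                oid = dimension.get("oid", "")
                where = f"{host}: {chart_id}.{name}"
                if not NUMERIC_OID.match(oid):
                    problems.append(f"{where}: OID {oid!r} is not numeric")
                elif oid.endswith(".") != ranged:
                    problems.append(f"{where}: trailing dot and multiply_range do not match")
    return problems


good = ('{"servers": [{"hostname": "192.168.111.198", "community": "cisco", '
        '"charts": {"c": {"dimensions": {"in": {"oid": "1.3.6.1.2.1.2.2.1.10.10111"}}}}}]}')
bad = good.replace("10.10111", "10.")   # index dropped, no multiply_range

print(check_snmp_conf(good))
print(check_snmp_conf(bad))
```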


When everything is correct, output like the following appears

[root@centos73 node.d]# /usr/libexec/netdata/plugins.d/node.d.plugin 1 snmp
CHART "snmp_switch.bandwidth_port1" "snmp_switch.bandwidth_port1" "Switch Bandwidth for port 1" "kilobits/s" "ports" "" "area" 50001 10
DIMENSION "in" "in" "incremental" 8 1024
DIMENSION "out" "out" "incremental" -8 1024
BEGIN snmp_switch.bandwidth_port1
SET in = 1167620306
SET out = 220094275
END

CHART "snmp_switch.bandwidth_port11" "snmp_switch.bandwidth_port11" "Switch Bandwidth for port 11" "kilobits/s" "ports" "" "area" 50001 10
DIMENSION "in" "in" "incremental" 8 1024
DIMENSION "out" "out" "incremental" -8 1024
BEGIN snmp_switch.bandwidth_port11
SET in = 2509704807
SET out = 1913107763
END

BEGIN snmp_switch.bandwidth_port1 3548000
SET in = 1167620306
SET out = 220107600
END

BEGIN snmp_switch.bandwidth_port11 3548000
SET in = 2509816273
SET out = 1913162904
END

BEGIN snmp_switch.bandwidth_port1 10040000
SET in = 1167620306
SET out = 220133355
END

BEGIN snmp_switch.bandwidth_port11 10040000
SET in = 2510106481
SET out = 1913238854
END
.....
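The SET values are still raw counters; the number after the chart id on a BEGIN line is the time in microseconds since that chart's previous update. The sketch below parses this output and estimates roughly what the dashboard should plot. It assumes the "incremental" algorithm means delta per second scaled by multiplier/divisor, it ignores counter wraps, and parse_samples and rates_kbits are hypothetical names.

```python
def parse_samples(output):
    """Return a list of (chart_id, elapsed_microseconds_or_None, {dimension: value})."""
    samples = []
    chart, elapsed, values = None, None, {}
    for line in output.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "BEGIN":
            chart = parts[1]
            elapsed = int(parts[2]) if len(parts) > 2 else None
            values = {}
        elif parts[0] == "SET" and chart is not None:
            values[parts[1]] = int(parts[3])   # "SET in = 123"
        elif parts[0] == "END" and chart is not None:
            samples.append((chart, elapsed, values))
            chart = None
    return samples


def rates_kbits(samples, chart_id, dimension, multiplier=8, divisor=1024):
    """Per-interval rates for one dimension (counter wraps are not handled)."""
    rates, previous = [], None
    for chart, elapsed, values in samples:
        if chart != chart_id:
            continue
        current = values[dimension]
        if previous is not None and elapsed:
            seconds = elapsed / 1_000_000
            rates.append((current - previous) * multiplier / divisor / seconds)
        previous = current
    return rates


# The port 11 samples from the output above.
output = """\
BEGIN snmp_switch.bandwidth_port11
SET in = 2509704807
SET out = 1913107763
END
BEGIN snmp_switch.bandwidth_port11 3548000
SET in = 2509816273
SET out = 1913162904
END
BEGIN snmp_switch.bandwidth_port11 10040000
SET in = 2510106481
SET out = 1913238854
END
"""

samples = parse_samples(output)
print(rates_kbits(samples, "snmp_switch.bandwidth_port11", "in"))
```

For the inbound side of port 11 this comes out at roughly 245 and 226 kilobits/s for the two intervals.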

For adding more charts, see the following URL

https://github.com/firehol/netdata/wiki/Add-more-charts-to-netdata#network

Add more charts to netdata

configuring plugins

Most plugins come with auto-detection, configured to work out-of-the-box on popular
operating systems with the default settings.

However, there are cases that auto-detection fails. Usually the reason is that the
applications to be monitored do not allow netdata to connect. In most of the cases,
allowing the user netdata from localhost to connect and collect metrics, will
automatically enable data collection for the application in question
(it will require a netdata restart).

You can verify netdata plugins are able to collect metrics, following this procedure:

# become user netdata
sudo su -s /bin/bash netdata

# execute the plugin in debug mode, for a specific module.
# example for the python plugin, mysql module:
/usr/libexec/netdata/plugins.d/python.d.plugin 1 debug mysql

Other references

General Info node.d

https://github.com/firehol/netdata/wiki/General-Info---node.d


Demo

2 interfaces [Figure 1]




All interfaces [Figure 2]



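Wednesday, June 21, 2017

netdata trial installation

While browsing the web today I came across a piece of software called netdata.
It is described as a real-time monitoring tool for Linux,
and installation is said to be simple, so I installed it to try it out.

Installation steps, using CentOS 7 as the OS (netdata release 1.6.0):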

#yum install -y zlib-devel gcc make git autoconf autogen automake pkgconfig libuuid libuuid-devel

#git clone https://github.com/firehol/netdata.git

#cd netdata

#./netdata-installer.sh

If a dependency is missing, the installer shows:
....

Sorry! netdata failed to build...

You many need to check these:

1. The package uuid-dev (or libuuid-devel) has to be installed.

   If your system cannot find libuuid, although it is installed
   run me with the option:  --libs-are-really-here

2. The package zlib1g-dev (or zlib-devel) has to be installed.

   If your system cannot find zlib, although it is installed
   run me with the option:  --libs-are-really-here

3. You need basic build tools installed, like:

   gcc make autoconf automake pkg-config

   Autoconf version 2.60 or higher is required.

If you still cannot get it to build, ask for help at github:

   https://github.com/firehol/netdata/issues

If the installation completes, it shows:
....

Downloading default configuration from netdata...
New configuration saved for you to edit at /etc/netdata/netdata.conf
 --- Check KSM (kernel memory deduper) ---

Memory de-duplication instructions

You have kernel memory de-duper (called Kernel Same-page Merging,
or KSM) available, but it is not currently enabled.

To enable it run:

    echo 1 >/sys/kernel/mm/ksm/run
    echo 1000 >/sys/kernel/mm/ksm/sleep_millisecs

If you enable it, you will save 40-60% of netdata memory.

 --- Check version.txt ---
 --- Check apps.plugin ---
 --- Generate netdata-uninstaller.sh ---
 --- Basic netdata instructions ---

netdata by default listens on all IPs on port 19999,
so you can access it with:

  http://this.machine.ip:19999/

To stop netdata run:

  systemctl stop netdata

To start netdata run:

  systemctl start netdata


Uninstall script generated: ./netdata-uninstaller.sh
Update script generated   : ./netdata-updater.sh

netdata-updater.sh can work from cron. It will trigger an email from cron
only if it fails (it does not print anything when it can update netdata).
Run this to automatically check and install netdata updates once per day:

sudo ln -s /usr/local/src/netdata/netdata/netdata-updater.sh /etc/cron.daily/netdata-updater

 --- We are done! ---

  ^
  |.-.   .-.   .-.   .-.   .-.   .  netdata                          .-.   .-
  |   '-'   '-'   '-'   '-'   '-'   is installed and running now!  -'   '-'
  +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->

  enjoy real-time performance and health monitoring...

..

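You can then connect directly to that host's monitoring web page.
If netdata is installed on several hosts, you can also switch between them from the same browser.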








About netdata

https://github.com/firehol/netdata


Demo

http://my-netdata.io/#demosites



Other demo site URLs are listed at ( https://github.com/firehol/netdata/wiki )

http://london.my-netdata.io/
http://atlanta.my-netdata.io/
http://bangalore.my-netdata.io/
http://sanfrancisco.my-netdata.io
.....


https://my-netdata.io

★ Scalable
netdata scales out, your web browser is the central netdata connecting all your servers together.
But netdata can also replicate its database to other netdata, and archive its metrics to graphite,
 opentsdb, influxdb or prometheus at a lower rate, to avoid congesting these servers with the
amount of data collected.


https://github.com/firehol/netdata/wiki/netdata-backends





So the dashboard can take many different forms....

For example grafana + netdata ( https://grafana.com/dashboards/1295 ).....