Zabbix 4.0 and beyond

1 Zabbix 4.0 and beyond What we may expect in the future The Universal Open Source Enterprise Level Monitoring Solution

2 Alexei Vladishev Founder and CEO of Zabbix 2

3 Zabbix is a universal open source enterprise level monitoring solution 3

4 Zabbix Team 4

5 Where we are? Item pre-processing Released Dependent items 3.0 LTS LTS Maps and dashboards Remote commands by Proxies Elastic Search 5

6 Where we are? Released Under development 4.0 LTS 3.0 LTS LTS 6

7 A few major improvements of Zabbix 4.0 and why they are important 7

8 1 Making problems independent 8

9 Problems and events 3.x Triggers: {HOST.NAME} has just been restarted. Problems: *No problem name*. Slow: problem and event names are calculated on the fly 9

10 Problems and events 4.0 Triggers: {HOST.NAME} has just been restarted. Problems: Name: Linux006 has just been restarted. Fast: the name is displayed as stored in problems and events 10

11 2 Better integration options 11

12 Real-time export of history and events History file Trends file Events file JSON Zabbix Server ExportDir=/var/log/zabbix ExportFileSize=100M 12
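
A minimal sketch of consuming the exported files, assuming each file under ExportDir holds one JSON object per line. The field names `itemid` and `value`, and the helper names `parse_export_lines` and `read_export`, are illustrative assumptions, not taken from the slides.

```python
import json
from pathlib import Path
from typing import Iterable, List


def parse_export_lines(lines: Iterable[str]) -> List[dict]:
    """Parse newline-delimited JSON, keeping only well-formed objects."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A half-written last line may appear while the server is still
            # appending to the file; skip it instead of failing.
            continue
        if isinstance(record, dict):
            records.append(record)
    return records


def read_export(path: Path) -> List[dict]:
    """Read one export file (hypothetical path under ExportDir)."""
    with path.open(encoding="utf-8") as handle:
        return parse_export_lines(handle)


if __name__ == "__main__":
    sample = ['{"itemid": 1, "value": 0.5}', "", '{"broken']
    print(parse_export_lines(sample))
```

Skipping malformed lines is a design choice for tailing a file that is still being written; a batch job over rotated files might prefer to raise instead.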

13 3 Better work flow 13

14 3.x Acknowledgements Monitoring -> Problems -> Ack. Message is mandatory. No way to add a message only. No way to just close a problem 14

15 4.0 Advanced problem work flow Monitoring -> Problems -> Update Message is optional Operations are optional: - ACK - Change severity (!) - Close problem 15
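
The optional operations above can be sketched as a bitmask on a JSON-RPC request. This assumes the `event.acknowledge` API method with an `action` bitmask (1 close, 2 acknowledge, 4 add message, 8 change severity) as documented for Zabbix 4.0; verify the values against your version. The helper `build_update_request` and the token value are hypothetical.

```python
import json
from enum import IntFlag
from typing import Iterable, Optional


class UpdateAction(IntFlag):
    # Bit values assumed from the Zabbix 4.0 event.acknowledge documentation.
    CLOSE = 1
    ACKNOWLEDGE = 2
    ADD_MESSAGE = 4
    CHANGE_SEVERITY = 8


def build_update_request(event_ids: Iterable[str], auth_token: str, *,
                         acknowledge: bool = False, close: bool = False,
                         message: Optional[str] = None,
                         severity: Optional[int] = None,
                         request_id: int = 1) -> dict:
    """Build a JSON-RPC body; every operation is optional, but one is required."""
    action = UpdateAction(0)
    params = {"eventids": list(event_ids)}
    if close:
        action |= UpdateAction.CLOSE
    if acknowledge:
        action |= UpdateAction.ACKNOWLEDGE
    if message:
        action |= UpdateAction.ADD_MESSAGE
        params["message"] = message
    if severity is not None:
        action |= UpdateAction.CHANGE_SEVERITY
        params["severity"] = severity
    if not action:
        raise ValueError("at least one operation must be selected")
    params["action"] = int(action)
    return {"jsonrpc": "2.0", "method": "event.acknowledge",
            "params": params, "auth": auth_token, "id": request_id}


if __name__ == "__main__":
    # Message only, with no acknowledgement: not possible in the 3.x work flow.
    print(json.dumps(build_update_request(["123"], "token", message="Investigating")))
```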

16 4 New ways of monitoring 16

17 HTTP item type 17

18 Some use cases Monitoring content of WEB application Getting data out of APIs, which are based on JSON/XML Access to HTTP header fields Server: Apache/2.4.1 (Unix) 18

19 Typical HTTP processing TEXT HTML HTTP check JSON Pre-processing History XML BINARY XPath JSONPath Regex HTTP data processing 19
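
A rough illustration of the pre-processing step, using only the Python standard library. It is not how Zabbix implements it: the key-list walk is a simplified stand-in for JSONPath, `ElementTree` supports only a subset of XPath, and all function names are hypothetical.

```python
import json
import re
import xml.etree.ElementTree as ET
from typing import Optional, Sequence, Union


def extract_json(body: str, path: Sequence[Union[str, int]]) -> object:
    """Walk a JSON document by keys and indexes (simplified JSONPath stand-in)."""
    node = json.loads(body)
    for key in path:
        node = node[key]
    return node


def extract_xml(body: str, xpath: str) -> Optional[str]:
    """Return the text of the first element matching a limited XPath."""
    element = ET.fromstring(body).find(xpath)
    return None if element is None else element.text


def extract_regex(text: str, pattern: str, group: int = 1) -> Optional[str]:
    """Return one capture group of the first match, or None if nothing matches."""
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(group).strip() if match else None


if __name__ == "__main__":
    headers = "HTTP/1.1 200 OK\nServer: Apache/2.4.1 (Unix)\n"
    print(extract_regex(headers, r"^Server:\s*(.+)$"))
    print(extract_json('{"data": {"users": [{"name": "alice"}]}}',
                       ["data", "users", 0, "name"]))
    print(extract_xml("<response><status><code>200</code></status></response>",
                      "status/code"))
```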

20 5 Better interface 20

21 UI getting simpler No Monitoring->Triggers anymore, use Monitoring->Problems 21

22 New widgets and more! 22

23 We create a universal self-service monitoring platform delivering business value 23

24 Getting the most business value out of collected data. Self-service: give access to everyone (finance, analytics, sales, support, developers, customers, etc.). Requires the best user experience. Security and flexible user roles are important 24

25 Collect any data Pre-process and transform collected data in any way Extreme flexibility Modules and webhooks for extending Zabbix Choice of: OS, HW, database, programming languages 25

26 More platforms Official packages for more hardware and cloud platforms 26

27 Modularity 27

28 3.x 28

29 3.x 4.x + Independent modules 29

30 Single pane of glass Central place to see and control monitoring of whole infrastructure Central management of alerting Event collection from various sources Observing information from multiple Zabbix Servers 30

31 Events Events Events Events Events Events Unified dashboard and alerting 31

32 Root cause analysis 32

33 Root cause analysis. It gives a clear answer to the question "What is the cause of the problem?" It provides information about impact and importance. Reduces recovery time (MTTR) 33

34 3.4 Root cause analysis Trigger dependencies Event correlation 34

35 3.4 4.x Root cause analysis Trigger dependencies Event correlation Automatic and manual relationship between problems Complex event processing (de-duplication, filtering, enrichment incl. AI & machine learning) 35

36 Root cause analysis Server B is not available Datacenter: Tokyo1 Class: Availability Enrichment 36

37 Root cause analysis Server B is not available Datacenter: Tokyo1 Class: Availability Enrichment AI & ML, CMDB, Network topology, service tree External systems Server B is not available Datacenter: Tokyo1 Location: Rack4,32 Service: Helpdesk Class: Availability Contact: Alexei HW: HP DL380 GEO:

38 Root cause analysis: relationship between problems. Server B is not available (Datacenter: Tokyo1, Class: Availability). Server C is not available (Datacenter: Tokyo1, Class: Availability). Datacenter Tokyo1 is not available (Datacenter: Tokyo1, Class: Availability). Problem list: Network is not available to Tokyo1; Server B is not available; Server C is not available; 197 more problems. 38

39 Root cause analysis: relationship between problems. Server B is not available (Datacenter: Tokyo1, Class: Availability). Server C is not available (Datacenter: Tokyo1, Class: Availability). Datacenter Tokyo1 is not available (Datacenter: Tokyo1, Class: Availability). Problem list: Network is not available to Tokyo1; Server B is not available; Server C is not available; 197 more problems. Correlation rules: Datacenter Tokyo1 is not available (2 related problems): Server B is not available; Server C is not available 39
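
One way to picture a correlation rule like the one on this slide is grouping problems that share a tag value. The sketch below is only an illustration under that assumption; the `Problem` class, `correlate_by_tag`, and the summary wording are hypothetical and do not describe the planned Zabbix implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Problem:
    name: str
    tags: Dict[str, str] = field(default_factory=dict)


def correlate_by_tag(problems: List[Problem], tag: str,
                     min_size: int = 2) -> Tuple[Dict[str, List[Problem]], List[Problem]]:
    """Group problems sharing a tag value; return (summaries, ungrouped)."""
    groups: Dict[str, List[Problem]] = defaultdict(list)
    ungrouped: List[Problem] = []
    for problem in problems:
        value = problem.tags.get(tag)
        if value is None:
            ungrouped.append(problem)
        else:
            groups[value].append(problem)

    summaries: Dict[str, List[Problem]] = {}
    for value, members in groups.items():
        if len(members) >= min_size:
            title = f"{tag} {value} is not available ({len(members)} related problems)"
            summaries[title] = members
        else:
            # Too few members to justify a parent problem; leave them as-is.
            ungrouped.extend(members)
    return summaries, ungrouped


if __name__ == "__main__":
    problems = [
        Problem("Server B is not available", {"Datacenter": "Tokyo1", "Class": "Availability"}),
        Problem("Server C is not available", {"Datacenter": "Tokyo1", "Class": "Availability"}),
    ]
    summaries, _ = correlate_by_tag(problems, "Datacenter")
    for title, members in summaries.items():
        print(title, [p.name for p in members])
```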

40 Service as a first class citizen 40

41 Our services Helpdesk VOIP Ticket selling system Transaction processing WEB Server Oracle Database Java Middleware API Disk space CPU Network 41

42 Our services Helpdesk VOIP Ticket selling system Transaction processing WEB Server Oracle Database Java Middleware API Tag based linkage to problems Disk space Service: Oracle System: Disk CPU Network 42

43 Services. Much easier maintenance: tag-based linkage between problems and services. Choice of service propagation rules (up, down). Visualisation: more widgets to display services (status, SLA). Alerting on service status changes. Use the service tree for problem correlation and impact analysis 43
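
Tag-based linkage and upward propagation can be sketched as follows. This is a toy model under stated assumptions: a service is linked to a problem when the problem carries all of the service's tags, and a parent takes the worst severity of its children. The `Service` class and both functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A problem is modelled as (tags, severity); a higher number means more severe.
ProblemInfo = Tuple[Dict[str, str], int]


@dataclass
class Service:
    name: str
    match_tags: Dict[str, str] = field(default_factory=dict)
    children: List["Service"] = field(default_factory=list)


def is_linked(service: Service, problem_tags: Dict[str, str]) -> bool:
    """A problem is linked when it carries every tag the service asks for."""
    return bool(service.match_tags) and all(
        problem_tags.get(key) == value for key, value in service.match_tags.items()
    )


def service_severity(service: Service, problems: List[ProblemInfo]) -> int:
    """Worst severity among linked problems and child services (0 means OK)."""
    own = [severity for tags, severity in problems if is_linked(service, tags)]
    inherited = [service_severity(child, problems) for child in service.children]
    return max(own + inherited, default=0)


if __name__ == "__main__":
    disk = Service("Disk space", {"Service": "Oracle", "System": "Disk"})
    oracle = Service("Oracle Database", children=[disk])
    helpdesk = Service("Helpdesk", children=[oracle])
    problems = [({"Service": "Oracle", "System": "Disk"}, 4)]
    print(helpdesk.name, service_severity(helpdesk, problems))
```

Taking the maximum is only one possible propagation rule; the slide mentions a choice of rules (up, down), which this sketch does not cover.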

44 More value for business IT Infrastructure level More about technology Business level About SLA and KPIs Metrics Problems Services Value 44

45 Our services Helpdesk VOIP Ticket selling system Business IT Infrastructure Transaction processing WEB Server Oracle Database Java Middleware API Disk space CPU Network 45

46 A few announcements 46

47 New training programs, by difficulty: ZCU, Certified User (new): low. ZCS, Certified Specialist: medium. ZCP, Certified Professional: high. ZCE, Certified Expert (new): very high 47

50 Thank you! Some of the used icons made by Freepik from The Universal Open Source Enterprise Level Monitoring Solution