OpenTSDB-Writing Data

Writing Data

You may want to jump right in and start throwing data into your TSD, but to really take advantage of OpenTSDB's power and flexibility, you may want to pause and think about your naming schema. After you've done that, you can proceed to pushing data over the Telnet or HTTP APIs, or use an existing tool with OpenTSDB support such as tcollector.


Naming Schema

Many metrics administrators are used to supplying a single name for their time series. For example, systems administrators used to RRD-style systems may name their time series webserver01.sys.cpu.0.user. The name tells us that the time series is recording the amount of time in user space for cpu 0 on webserver01. This works great if you want to retrieve just the user time for that cpu core on that particular web server later on.


But what if the web server has 64 cores and you want to get the average time across all of them? Some systems allow you to specify a wild card such as webserver01.sys.cpu.*.user that would read all 64 files and aggregate the results. Alternatively, you could record a new time series called webserver01.sys.cpu.user.all that represents the same aggregate, but you must now write 64 + 1 different time series. What if you had a thousand web servers and you wanted the average cpu time for all of your servers? You could craft a wild card query like *.sys.cpu.*.user and the system would open all 64,000 files, aggregate the results and return the data. Or you could set up a process to pre-aggregate the data and write it to webservers.sys.cpu.user.all.
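To make the scale of the wild card approach concrete, here is a minimal Python sketch. It assumes a hypothetical fleet of 1,000 servers named webserver0001 through webserver1000 (the four-digit naming is invented for the illustration) and uses shell-style matching from Python's fnmatch module as a stand-in for whatever wild card syntax an RRD-style system offers.

```python
from fnmatch import fnmatchcase

# Hypothetical fleet: 1,000 web servers with 64 cores each, one series per core.
names = [
    f"webserver{host:04d}.sys.cpu.{core}.user"
    for host in range(1, 1001)
    for core in range(64)
]

def match(pattern, all_names):
    """Return every series name matched by a shell-style wild card pattern."""
    return [name for name in all_names if fnmatchcase(name, pattern)]

# One server: the system has to read one file per core.
one_host = match("webserver0001.sys.cpu.*.user", names)

# Whole fleet: every per-core file on every server has to be opened.
fleet = match("*.sys.cpu.*.user", names)

print(len(one_host), len(fleet))
```

Every name matched by the pattern is a separate file that must be opened and merged at query time, which is the cost the pre-aggregated .all series tries to avoid.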


OpenTSDB handles things a bit differently by introducing the idea of "tags". Each time series still has a "metric" name, but it's much more generic, something that can be shared by many unique time series. Instead, the uniqueness comes from a combination of tag key/value pairs that allows for flexible queries with very fast aggregations.


Note

Every time series in OpenTSDB must have at least one tag.


Take the previous example where the metric was webserver01.sys.cpu.0.user. In OpenTSDB, this may become sys.cpu.user host=webserver01, cpu=0. Now if we want the data for an individual core, we can craft a query like sum:sys.cpu.user{host=webserver01,cpu=42}. If we want all of the cores, we simply drop the cpu tag and ask for sum:sys.cpu.user{host=webserver01}. This will give us the aggregated results for all 64 cores. If we want the results for all 1,000 servers, we simply request sum:sys.cpu.user. The underlying data schema will store all of the sys.cpu.user time series next to each other so that aggregating the individual values is very fast and efficient. OpenTSDB was designed to make these aggregate queries as fast as possible since most users start out at a high level, then drill down for detailed information.
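The renaming described above can be sketched in a few lines of Python. The helper names to_tagged and build_query are hypothetical and not part of OpenTSDB or any client library. The sketch assumes the dotted name always has the shape host.<path>.<core>.<leaf>, as in the example, and only builds the query strings shown in the text without sending anything to a TSD.

```python
def to_tagged(dotted_name):
    """Split an RRD-style name such as 'webserver01.sys.cpu.0.user' into a
    generic metric name plus the tags that make the series unique.

    Assumes the layout host.<path...>.<core>.<leaf>.
    """
    host, *middle, core, leaf = dotted_name.split(".")
    metric = ".".join(middle + [leaf])
    return metric, {"host": host, "cpu": core}

def build_query(aggregator, metric, tags=None):
    """Build a query expression in the form used in the text,
    e.g. sum:sys.cpu.user{host=webserver01}."""
    if not tags:
        return f"{aggregator}:{metric}"
    pairs = ",".join(f"{key}={value}" for key, value in tags.items())
    return f"{aggregator}:{metric}{{{pairs}}}"

metric, tags = to_tagged("webserver01.sys.cpu.0.user")

single_core = build_query("sum", metric, {"host": "webserver01", "cpu": 42})
all_cores = build_query("sum", metric, {"host": "webserver01"})
all_hosts = build_query("sum", metric)

print(metric, tags)
print(single_core)
print(all_cores)
print(all_hosts)
```

Widening the scope of a query only means dropping tags from the filter. The metric name itself never changes.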


Aggregations

While the tagging system is flexible, some problems can arise if you don't understand how the querying side of OpenTSDB works, hence the need for some forethought. Take the example query above: sum:sys.cpu.user{host=webserver01}. We recorded 64 unique time series for webserver01, one time series for each of the CPU cores. When we issued that query, all of the time series for metric sys.cpu.user with the tag host=webserver01 were retrieved, summed, and returned as one series of numbers. Let's say the resulting value was 50 for timestamp 1356998400. Now suppose we were migrating from another system to OpenTSDB and had a process that pre-aggregated all 64 cores so that we could quickly get the aggregate value, and it simply wrote a new time series sys.cpu.user host=webserver01. If we run the same query, we'll get a value of 100 at 1356998400. What happened? OpenTSDB aggregated all 64 time series and the pre-aggregated time series to get to that 100. In storage, we would have something like this:


sys.cpu.user host=webserver01        1356998400  50
sys.cpu.user host=webserver01,cpu=0  1356998400  1
sys.cpu.user host=webserver01,cpu=1  1356998400  0
sys.cpu.user host=webserver01,cpu=2  1356998400  2
sys.cpu.user host=webserver01,cpu=3  1356998400  0
...
sys.cpu.user host=webserver01,cpu=63 1356998400  1
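The double counting in this listing can be reproduced with a small in-memory model. This is a rough sketch, not OpenTSDB's actual query engine: the functions tags_match and sum_query are hypothetical, and the per-core values that the listing elides with "..." are filled with invented numbers chosen only so that the 64 cores add up to 50.

```python
TS = 1356998400

# cpu 0-3 and cpu 63 use the values from the listing; the 59 values in
# between are elided there, so this filler is invented to make the total 50.
core_values = [1, 0, 2, 0] + [1] * 46 + [0] * 13 + [1]

# Each point is (metric, tags, timestamp, value).
points = [
    ("sys.cpu.user", {"host": "webserver01", "cpu": str(cpu)}, TS, value)
    for cpu, value in enumerate(core_values)
]

def tags_match(series_tags, filters):
    """A series matches when it carries every requested tag pair;
    any additional tags on the series are ignored."""
    return all(series_tags.get(key) == val for key, val in filters.items())

def sum_query(all_points, metric, filters, timestamp):
    """Toy version of sum:<metric>{<filters>} at a single timestamp."""
    return sum(
        value
        for m, tags, ts, value in all_points
        if m == metric and ts == timestamp and tags_match(tags, filters)
    )

before = sum_query(points, "sys.cpu.user", {"host": "webserver01"}, TS)

# The migrated, pre-aggregated series carries only the host tag.
points.append(("sys.cpu.user", {"host": "webserver01"}, TS, 50))
after = sum_query(points, "sys.cpu.user", {"host": "webserver01"}, TS)

print(before, after)
```

The pre-aggregated series matches host=webserver01 just like the per-core series do, so its 50 is added on top of the 50 from the individual cores.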

OpenTSDB will automatically aggregate all of the time series for the metric in a query if no tags are given. If one or more tags are defined, the aggregate will include all time series that match on that tag, regardless of other tags. With the query sum:sys.cpu.user{host=webserver01}, we would include sys.cpu.user host=webserver01,cpu=0 as well as sys.cpu.user host=webserver01,cpu=0,manufacturer=Intel, sys.cpu.user host=webserver01,foo=bar and sys.cpu.user host=webserver01,cpu=0,datacenter=lax,department=ops.

The moral of this example is: be careful with your naming schema.
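As a quick illustration of the "regardless of other tags" rule, the following Python sketch applies a subset match to the four series named above, plus one invented series on webserver02 for contrast. The selects helper is hypothetical and only models the matching rule described in the text.

```python
def selects(filters, series_tags):
    """True when the series carries every tag pair in the filter,
    no matter which other tags it also has."""
    return all(series_tags.get(key) == val for key, val in filters.items())

series = [
    {"host": "webserver01", "cpu": "0"},
    {"host": "webserver01", "cpu": "0", "manufacturer": "Intel"},
    {"host": "webserver01", "foo": "bar"},
    {"host": "webserver01", "cpu": "0", "datacenter": "lax", "department": "ops"},
    {"host": "webserver02", "cpu": "0"},  # invented, for contrast only
]

by_host = [s for s in series if selects({"host": "webserver01"}, s)]
by_host_and_cpu = [s for s in series if selects({"host": "webserver01", "cpu": "0"}, s)]
no_filter = [s for s in series if selects({}, s)]

print(len(by_host), len(by_host_and_cpu), len(no_filter))
```

Series with inconsistent tag sets under the same metric all end up in the same aggregate, which is why the tag layout is worth settling before any data is written.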


References

1. http://opentsdb.net/docs/build/html/user_guide/writing.html

 
