TSBS Supplemental Guide: Timestream

Amazon Timestream is a serverless time series database service. This supplemental guide explains how the data generated for TSBS is stored, additional flags available when using the data importer (tsbs_load load timestream), and additional flags available for the query runner (tsbs_run_queries_timestream). This should be read after the main README.

Data format

Data generated by tsbs_generate_data for Timestream is serialized in a "pseudo-CSV" format, along with a custom header at the beginning. The header is several lines long:

  • one line composed of a comma-separated list of dimensions, with the literal string tags as the first value in the list
  • one or more lines composed of a comma-separated list of measures, with the table name as the first value in the list
  • a blank line

An example for the cpu-only use case:

tags,hostname,region,datacenter,rack,os,arch,team,service,service_version,service_environment
cpu,usage_user,usage_system,usage_idle,usage_nice,usage_iowait,usage_irq,usage_softirq,usage_steal,usage_guest,usage_guest_nice

Following this, each reading is composed of two rows:

  1. a comma-separated list of tag values for the reading, with the literal string tags as the first value in the list
  2. a comma-separated list of field values for the reading, with the table the reading belongs to being the first value and the timestamp as the second value

An example for the cpu-only use case:

tags,host_0,eu-central-1,eu-central-1b,21,Ubuntu15.10,x86,SF,6,0,test
cpu,1451606400000000000,58.1317132304976170,2.6224297271376256,24.9969495069947882,61.5854484633778867,22.9481393231639395,63.6499207106198313,6.4098777048301052,44.8799140503027445,80.5028770761136201,38.2431182911542820
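
The layout described above can be read back with a few lines of Python. This is only a sketch for inspecting generated files, not the loader's own parser; the function names (`parse_header`, `parse_readings`, `parse`) are hypothetical, and it assumes the input is well formed and that all lines come from one shared iterator.

```python
from typing import Dict, Iterator, List, Tuple

# (table, timestamp in nanoseconds, tag values by dimension, field values by measure)
Reading = Tuple[str, int, Dict[str, str], Dict[str, float]]


def parse_header(lines: Iterator[str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Consume the header up to the blank line; return (dimensions, {table: measures})."""
    first = next(lines).split(",")
    if first[0] != "tags":
        raise ValueError("header must begin with a 'tags' line")
    dimensions = first[1:]
    tables: Dict[str, List[str]] = {}
    for line in lines:
        if line == "":  # the blank line ends the header
            break
        table, *measures = line.split(",")
        tables[table] = measures
    return dimensions, tables


def parse_readings(lines: Iterator[str], dimensions: List[str],
                   tables: Dict[str, List[str]]) -> List[Reading]:
    """Pair each 'tags' row with the field row that follows it."""
    readings: List[Reading] = []
    for tag_line in lines:
        if tag_line == "":
            continue
        field_line = next(lines, None)
        if field_line is None:
            raise ValueError("tag row without a matching field row")
        tag_values = tag_line.split(",")[1:]  # drop the literal "tags"
        table, timestamp, *values = field_line.split(",")
        tags = dict(zip(dimensions, tag_values))
        fields = dict(zip(tables[table], map(float, values)))
        readings.append((table, int(timestamp), tags, fields))
    return readings


def parse(text: str):
    """Parse a whole pseudo-CSV document held in memory."""
    lines = iter(text.splitlines())
    dimensions, tables = parse_header(lines)
    return dimensions, tables, parse_readings(lines, dimensions, tables)
```

Calling `parse` on the contents of a generated file returns the dimension names, the measures per table, and one tuple per reading.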

The resulting table in Timestream for LiveAnalytics depends on whether the data is written as single-measure or multi-measure records.

Resulting cpu-only Timestream Table using multi-measure records (default)

| hostname | region | datacenter | rack | os | arch | team | service | service_version | service_environment | measure_name | time | usage_user | usage_system | usage_idle | usage_nice | usage_iowait | usage_irq | usage_softirq | usage_steal | usage_guest | usage_guest_nice |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| host_0 | eu-central-1 | eu-central-1b | 21 | Ubuntu15.10 | x86 | SF | 6 | 0 | test | metrics | 2016-01-01 00:00:00.000000000 | 58.1317132304976170 | 2.6224297271376256 | 24.9969495069947882 | 61.5854484633778867 | 22.9481393231639395 | 63.6499207106198313 | 6.4098777048301052 | 44.8799140503027445 | 80.5028770761136201 | 38.2431182911542820 |

Resulting cpu-only Timestream Table using single-measure records

| hostname | region | datacenter | rack | os | arch | team | service | service_version | service_environment | measure_name | time | measure_value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| host_0 | eu-central-1 | eu-central-1b | 21 | Ubuntu15.10 | x86 | SF | 6 | 0 | test | usage_user | 2016-01-01 00:00:00.000000000 | 58.1317132304976170 |
| host_0 | eu-central-1 | eu-central-1b | 21 | Ubuntu15.10 | x86 | SF | 6 | 0 | test | usage_system | 2016-01-01 00:00:00.000000000 | 2.6224297271376256 |
| host_0 | eu-central-1 | eu-central-1b | 21 | Ubuntu15.10 | x86 | SF | 6 | 0 | test | usage_idle | 2016-01-01 00:00:00.000000000 | 24.9969495069947882 |
...
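
The difference between the two table shapes can be illustrated by turning one reading into rows. The sketch below is an illustration only, not the loader's implementation; the helper names are hypothetical, and it assumes the generated timestamp is nanoseconds since the Unix epoch in UTC, which is consistent with the example values above.

```python
from datetime import datetime, timezone
from typing import Dict, List


def ns_to_time(ts_ns: int) -> str:
    """Format a nanosecond epoch timestamp the way the tables above show it."""
    seconds, nanos = divmod(ts_ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{base:%Y-%m-%d %H:%M:%S}.{nanos:09d}"


def multi_measure_row(tags: Dict[str, str], ts_ns: int, fields: Dict[str, float],
                      measure_name: str = "metrics") -> Dict[str, object]:
    """One row per reading: every field becomes its own column."""
    return {**tags, "measure_name": measure_name, "time": ns_to_time(ts_ns), **fields}


def single_measure_rows(tags: Dict[str, str], ts_ns: int,
                        fields: Dict[str, float]) -> List[Dict[str, object]]:
    """One row per field: the field name moves into measure_name."""
    time = ns_to_time(ts_ns)
    return [
        {**tags, "measure_name": name, "time": time, "measure_value": value}
        for name, value in fields.items()
    ]
```

A reading with ten fields therefore yields one row in the multi-measure layout and ten rows in the single-measure layout.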


tsbs_load load timestream Additional Flags

loader.db-specific.aws-region (type: string, default: us-east-1)

AWS region where the database is located.

loader.db-specific.use-common-attributes (type: boolean, default: true)

If true, the Timestream client makes write requests with common attributes. If false, each value is written as a separate record, and records are sent in requests of 100 at a time.
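
The two ideas behind this flag, factoring shared attributes out of a group of records and sending records in groups of 100, can be sketched as follows. This is a plain-Python illustration of the concept, not the loader's code and not a call to the Timestream API; `split_common` and `batches` are hypothetical names, and records are modeled as simple dictionaries.

```python
from typing import Dict, Iterator, List, Tuple

Record = Dict[str, object]


def split_common(records: List[Record]) -> Tuple[Record, List[Record]]:
    """Return (common, trimmed): key/value pairs shared by every record,
    and the records with those shared pairs removed."""
    if not records:
        return {}, []
    common = dict(records[0])
    for rec in records[1:]:
        common = {k: v for k, v in common.items() if k in rec and rec[k] == v}
    trimmed = [{k: v for k, v in rec.items() if k not in common} for rec in records]
    return common, trimmed


def batches(records: List[Record], size: int = 100) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most `size` records (100 per request here)."""
    for start in range(0, len(records), size):
        yield records[start:start + size]
```

With common attributes, values such as the dimensions and timestamp of a reading are stated once per request instead of being repeated on every record.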

loader.db-specific.hash-property (type: string, default: hostname)

Dimension to use when hashing points to different workers.
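
As a rough picture of what hashing on a dimension means: every point with the same value for that dimension is routed to the same worker. The sketch below uses CRC32 purely because it is deterministic and in the standard library; the hash function the loader actually uses is not stated here, and `worker_for` is a hypothetical name.

```python
import zlib
from typing import Dict


def worker_for(tags: Dict[str, str], num_workers: int,
               hash_property: str = "hostname") -> int:
    """Map a point to a worker index from the value of one dimension."""
    key = tags[hash_property].encode("utf-8")
    # CRC32 is an assumption for illustration, not the loader's hash.
    return zlib.crc32(key) % num_workers
```

Because only the chosen dimension feeds the hash, all readings for a given host land on the same worker regardless of their other tag values.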

loader.db-specific.use-current-time (type: boolean, default: false)

Use the current local timestamp when creating the records to load. Useful when you don't want to worry about how the retention period relates to the simulated period.

loader.db-specific.mag-store-retention-in-days (type: int, default: 180)

The duration, in days, for which data must be stored in the magnetic store.

loader.db-specific.mem-store-retention-in-hours (type: int, default: 12)

The duration, in hours, for which data must be stored in the memory store.

loader.db-specific.custom-partition-key-dimension (type: string, default: "")

The dimension to use as the partition key. This flag is required if custom-partition-key-type is set to dimension.

loader.db-specific.custom-partition-key-type (type: string, default: measure)

The type of custom partition key to use. Valid options are dimension or measure. The dimension option requires custom-partition-key-dimension to also be set. If this parameter is not provided, newly created tables will use default partitioning and none of the parameters relating to custom partition keys will be used.
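
The dependency between the two partition key flags can be expressed as a small check. This is a sketch of the rule as described above, not the loader's validation code; `validate_partition_key` is a hypothetical name.

```python
def validate_partition_key(key_type: str, key_dimension: str = "") -> None:
    """Raise ValueError if the flag combination contradicts the rules above."""
    if key_type not in ("dimension", "measure"):
        raise ValueError("custom-partition-key-type must be 'dimension' or 'measure'")
    if key_type == "dimension" and not key_dimension:
        raise ValueError(
            "custom-partition-key-dimension is required when "
            "custom-partition-key-type is 'dimension'"
        )
```

For example, `validate_partition_key("dimension", "hostname")` passes, while `validate_partition_key("dimension")` raises.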

loader.db-specific.enforce-custom-partition-key (type: bool, default: false)

Whether to only allow the ingestion of records that contain the custom partition key. Valid options are true or false.

loader.db-specific.enable-mag-store-writes (type: bool, default: true)

Enables writes to the magnetic store for late-arriving data.
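
How the retention flags, magnetic store writes, and use-current-time interact is easiest to see with some date arithmetic. The sketch below is a deliberately simplified model and an assumption, not a description of the service's exact rules: it treats a record as landing in the memory store if its timestamp is within the memory retention window, in the magnetic store if magnetic writes are enabled and it is within the magnetic retention window, and as rejected otherwise. `target_store` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def target_store(record_time: datetime, now: datetime,
                 mem_hours: int = 12, mag_days: int = 180,
                 mag_writes: bool = True) -> str:
    """Classify a record timestamp under the simplified model described above."""
    age = now - record_time
    if age <= timedelta(hours=mem_hours):
        return "memory"
    if mag_writes and age <= timedelta(days=mag_days):
        return "magnetic"
    return "rejected"
```

Under this model and the default retention values, a simulated timestamp such as 2016-01-01 is far older than either window when loaded today, which is the situation use-current-time is meant to sidestep.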

loader.db-specific.multi-measure-measure-name (type: string, default: "metrics")

The name to use for the measure_name field in multi-measure records.


tsbs_generate_queries required -db-name flag

Timestream requires the database name to be part of the FROM clause of every query, so --db-name is a required flag.


tsbs_run_queries_timestream Additional Flags

-aws-region (type: string, default: us-east-1)

AWS region where the database is located