Manual Chapter : Configure Data Collection Devices

Applies To:

BIG-IQ Centralized Management

8.2.0, 8.1.0

Configure Data Collection Devices

Data Collection Device configuration overview

You configure the Data Collection Devices (DCDs) for your BIG-IQ solution using the BIG-IQ CM. The BIG-IQ controls the settings for everything except the zone assignment and master key settings.

Statistics retention policy overview

Statistics retention affects the quantity of data reported by your managed BIG-IP devices. When you choose how much data to retain, you need to consider how much disk space you have available. The controls for data retention require an understanding of how much space you have to store data, and which data needs additional storage.

The fields on the Statistics Retention Policy screen set the size of the indices that BIG-IQ uses to store raw data. These fields all work in a similar fashion. One way to understand how these indices work is to think of your data storage space as a set of containers. The values you specify on this screen determine how much storage space each container (index) consumes. Because data is saved for the time periods you specify, the longer the time period that you specify, the more space you consume. The disk storage that is consumed depends on several factors.

The number of BIG-IP devices for which you are collecting data

The number of objects those BIG-IP devices have (for example, virtual servers, pools, pool members, and iRules)

The frequency of data collection

The data retention policy

The data replication policy

Additional data storage for prioritized service groups

Note that the system has default global retention settings, so fine-tuning the statistics retention policy is not required. Changing the retention settings can improve the efficiency of disk usage, or the quality of the statistics data retained, depending on how long you retain certain data after its initial collection.

The following are key concepts to understand about how the retention policy works.

How long is data in each container retained?	Data is retained in each container for the time period you specify. When the specified level is reached, the oldest chunk of data is deleted. For example, if you specify a raw data value of 48 hours of retained collected data, when 48 hours of raw data has accumulated, the next hour of incoming raw data triggers the BIG-IQ to delete the oldest hour of collected data.
When does data from one container transfer to the next?	The BIG-IQ transfers data from one container to the next in increments that are the size of the next (larger) container. That is, every 60 minutes, BIG-IQ aggregates the last 60 minutes of collected raw data into a data set and passes it to the Hour(s) container. Every 24 hours, BIG-IQ aggregates the last 24 hours of hourly data into a data set and passes it to the Day(s) container, and so on for the Month(s) container.
What about limits?	Limit max storage to specifies the percentage of total disk space that you want data to consume on the DCDs in your cluster. If more disk space is consumed than the percentage you specified, BIG-IQ takes two actions to prevent the data corruption that can occur when storage is completely exhausted. First, the DCDs do not collect new data until the available disk space complies with the Limit max storage to setting. Second, statistical data not required to calculate the next higher time layer is removed (for example, you need 60 minutes of raw data to aggregate to the Hours level). Data is removed starting with the raw data container, then the hourly data container, then the daily data container. This process stops when storage consumption is below the Limit max storage to setting.
Global vs. Group properties	Global properties apply to all statistics data retention, and you can adjust retention settings for specific service groups that are activated on your system DCDs. Global Properties: Global properties provide the statistics retention settings for all statistics collection for your BIG-IQ. These are the default settings for all service modules activated on your DCD. Group Properties: Service groups can have statistics retention settings that differ from the default global properties. You can use these settings to fine-tune retention such that more, or less, data is retained for an activated service. This can improve the efficiency of how data is retained in your system. A service group applies only the settings you specify and uses the global values for settings that are left blank (for example, when Keep monthly data up to is left blank).
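
The fallback from group properties to global properties described above can be pictured with a short Python sketch. This is an illustration only, not BIG-IQ code: the key names and every value except the 48-hour raw data example are hypothetical, and blank fields are modeled as None.

```python
# Hypothetical key names and values; BIG-IQ exposes these settings in its UI.
GLOBAL_PROPERTIES = {
    "raw_hours": 48,        # the 48-hour example used in the text above
    "hourly_points": 168,   # illustrative value
    "daily_points": 31,     # illustrative value
    "monthly_points": 12,   # illustrative value
}


def effective_retention(global_props: dict, group_props: dict) -> dict:
    """Use the group value where one is set; fall back to the global value for blanks."""
    merged = dict(global_props)
    for key, value in group_props.items():
        if value is not None:  # None models a field that was left blank
            merged[key] = value
    return merged


# A service group that keeps more raw data but leaves the monthly field blank.
group = {"raw_hours": 96, "monthly_points": None}
settings = effective_retention(GLOBAL_PROPERTIES, group)
print(settings)
```

In this sketch the service group keeps 96 hours of raw data, while the monthly, daily, and hourly values come from the global properties.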

Manage the retention policy for your statistics data

Before you can set the statistics retention policy, you must have added a data collection device (DCD). If you are adding statistics retention for a specific service group, ensure that the service is activated on the DCD.

You can manage the default settings that determine how your statistics data is retained, based on quality. The highest quality data is real-time (raw) data (data that has not been averaged), but it consumes a lot of disk space, so weigh your needs when choosing your data retention settings.

From BIG-IQ, at the top of the screen, click System, then, on the left, click BIG-IQ DATA COLLECTION and then select BIG-IQ Data Collection Cluster.

The BIG-IQ Data Collection Cluster screen opens. On this screen, you can view summary status for the Data Collection Device (DCD) cluster and access the screens that you can use to configure the DCD cluster.

Under SUMMARY, you can access screens detailing how much data is stored, as well as how the data is stored.

Under CONFIGURATION, you can access the screens that control DCD cluster performance.

Under the screen name, click CONFIGURATION > Statistics Data Collection.

The Statistics Collection Status screen opens.

Click the Configure Retention button.

The Statistics Retention Policy screen opens.

Go to Global Properties to edit default retention:

In the Keep real-time (raw) data up to field, type the number of hours of raw data to retain.

You must specify a minimum of 1 hour, so that there is sufficient data to average and create a data point for the Keep hourly data up to container.

In the Keep hourly data up to field, type the number of hourly data points to retain.

You must specify a minimum of 24 hours, so that there is sufficient data to average and create a data point for the Keep daily data up to container.

In the Keep daily data up to field, type the number of daily data points to retain.

You must specify a minimum of 31 days, so that there is sufficient data to average and create a data point for the Keep monthly data up to container.

In the Keep monthly data up to field, type the number of monthly data points to retain.

Once the specified number of months passes, the oldest monthly data set is deleted.

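In the Limit max storage to field, type the percentage of total disk space that you want data to consume on the DCDs in your cluster.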

In the Keep events up to field, type the number of days that you want to keep events before the oldest events data set is deleted.

In the Keep traffic capturing up to field, type the number of days that you want to keep captured traffic before the oldest traffic data set is deleted.

The Global Properties are the default retention settings for any service groups, or service group values, that are not populated in Group Properties.

Go to Group Properties to add custom time retention settings for specific service groups.

Any setting that you do not specify in Group Properties keeps its Global Properties value. Ensure that you have enough disk space to accommodate service group statistics retention.

Expand Advanced Settings:

The following settings configure data scaling and resilience using Elasticsearch. For more information, see General Elasticsearch FAQ.

Select the Replicas check box to enable high availability for the stored data on your DCD cluster.

Replicas are copies of data sets that remain available to the DCD cluster when one or more devices within that cluster become unavailable. By default, data replication for statistics is enabled. Disabling replication reduces the amount of disk space required for data retention. However, this provides no protection from data corruption that can occur when you remove a DCD. You should enable replicas to provide this protection.

Select the Auto expand replicas check box to enable automatic duplication of the number of replicas for a specific data set.

This allows the DCD cluster to dynamically host up to 2 separate replicas for a given data set, based on the number of DCDs available. This provides redundancy that protects from data loss even when more than one DCD becomes unavailable.

This option is only available when Enable Replicas is selected. In addition, your system must include at least 3 DCDs (one primary and two replicas) with sufficient disk space.

When you are satisfied with the values specified for data retention, click Save & Close.
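
The minimum values named in this procedure (1 hour of raw data, 24 hourly data points, 31 daily data points) can be checked before you enter them. The following Python sketch is only a planning aid under stated assumptions: the key names are hypothetical and the function is not part of BIG-IQ.

```python
# Minimums stated in the procedure above; the key names are hypothetical.
MINIMUMS = {"raw_hours": 1, "hourly_points": 24, "daily_points": 31}


def check_minimums(settings: dict) -> list:
    """Return a list of readable problems; an empty list means every value passes."""
    problems = []
    for key, minimum in MINIMUMS.items():
        value = settings.get(key)
        if value is not None and value < minimum:
            problems.append(f"{key}={value} is below the minimum of {minimum}")
    return problems


planned = {"raw_hours": 48, "hourly_points": 12, "daily_points": 31}
issues = check_minimums(planned)
print(issues)
```

Here the planned hourly value of 12 is flagged because fewer than 24 hourly data points cannot be averaged into a daily data point.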

Log index rotation policy overview

The optimum settings used to configure your Data Collection Device (DCD) indices depend on a number of key factors.

The system provides the ability to dynamically create new indices based on either a specified interval or a specified size. The primary goal to consider when you make these decisions is how to maintain a maximum disk allocation for the DCD data, while maintaining capacity for new data that flows in.

Secondary considerations include search optimization, and the ability to optimize old indices to reduce their size.

Generally, the best policy is one that does not create unnecessary indices. The more indices, the lower the overall performance, because your searches have to deal with more shards. For example, if you know a service has a low indexing volume (thousands/day) then it makes the most sense to have a large aggregation per rotation (5 days or 30 days). For services like Web Application Security that probably have high indexing volumes, it makes more sense to rotate every 8 hours (which reduces the number of retained indices).

Index rotation also allows new sharding and replica counts by changing the template on a given index type. New indices created from that template will contain the new shard and replica count properties.

This table shows the default configuration values for each index running on BIG-IQ Centralized Management. These values are based on anticipated data ingestion rates and typical usage patterns.

Service	Index Name	Minimum Number of DCDs	Rotation Policy	Retained Index Count	Approximate time window	Size of /var file system
Access	access-event-logs	2	Time/5 days	19	95 days	500 GB
Access	access-stats	2	Time/5 days	19	95 days	500 GB
Web Application Security	asmindex	2	Size/100000 MB	5	N/A	500 GB
FPS	websafe	2	Time/30 days	100	8 years	10 GB

If multiple services are running on a given DCD, or you have higher inbound data rates, you might have to adjust these values to keep the /var file system from filling up. (There is a default alert to warn of this when the file system becomes 80% full.)

The simplest resolution is to revise the retained index count; lowering this value reduces the disk space requirements, but it will also reduce the amount of data available for queries. For details about changing this setting, refer to the modifying indices topic for the service you are configuring.
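
For the time-based rows in the table above, the approximate time window appears to be the rotation period multiplied by the retained index count. That relationship is an inference from the table values rather than a documented formula, and the Python sketch below simply reproduces it so you can see how lowering the retained index count shortens the window.

```python
# Time-based defaults from the table: (index name, rotation period in days, retained index count)
TIME_BASED_DEFAULTS = [
    ("access-event-logs", 5, 19),
    ("access-stats", 5, 19),
    ("websafe", 30, 100),
]


def approximate_window_days(rotation_days: int, retained_count: int) -> int:
    """Assumed relationship: window = rotation period x retained index count."""
    return rotation_days * retained_count


for name, rotation_days, retained in TIME_BASED_DEFAULTS:
    days = approximate_window_days(rotation_days, retained)
    print(f"{name}: {days} days (about {days / 365:.1f} years)")
```

Under this assumption the Access indices cover 95 days and the websafe index covers 3000 days, which matches the table's rounded figure of 8 years.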

How does the DCD aggregate raw data?

The DCD stores raw data coming from the BIG-IP devices in data indices. As data is received, it accumulates in the current index. When the accumulated data reaches the rotation threshold that you set, four things happen.

A new current index is created.

BIG-IP data begins accumulating in the new index.

The former current index becomes one of the retained indices.

If the total number of indices is now larger than the retained index count, the oldest one is deleted.

When you set up index rotation, you determine what triggers the rotation threshold. The Indices settings specify the characteristics of how the Data Collection Device manages your data.

The ideal configuration for data indices depends on the amount, frequency, and type of data your devices send to the DCD. The default settings are designed to satisfy most user scenarios, but you might want to explore the settings for the data types that you plan to send to the DCD, to make sure that those settings meet your needs.
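
The rotation sequence described in this section can be modeled in a few lines of Python. This is a toy model, not the DCD implementation: the class name is hypothetical, it uses a size-based threshold, and it assumes the retained index count applies to the retained indices only (the text does not say whether the current index is counted).

```python
from collections import deque


class RotatingIndexSet:
    """Toy model: one current index plus a bounded queue of retained indices."""

    def __init__(self, max_index_mb: int, retained_count: int):
        self.max_index_mb = max_index_mb
        self.retained_count = retained_count
        self.current_mb = 0
        self.retained = deque()  # oldest retained index is on the left
        self.deleted = 0

    def ingest(self, size_mb: int) -> None:
        """Add data to the current index and rotate when the threshold is reached."""
        self.current_mb += size_mb
        if self.current_mb >= self.max_index_mb:
            # The former current index becomes one of the retained indices.
            self.retained.append(self.current_mb)
            # A new, empty current index starts accumulating data.
            self.current_mb = 0
            # Delete the oldest index if there are now too many retained indices.
            if len(self.retained) > self.retained_count:
                self.retained.popleft()
                self.deleted += 1


indices = RotatingIndexSet(max_index_mb=1000, retained_count=3)
for _ in range(50):
    indices.ingest(100)  # 50 batches of 100 MB each
print(len(indices.retained), indices.deleted)
```

Feeding 5000 MB through a 1000 MB threshold rotates five times, so with a retained count of 3 the two oldest indices are deleted.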

Modify log indices

Before you can configure the indices for a data collection device, you must activate data collection for the services that you want to collect data for.

BIG-IQ stores incoming BIG-IP device data in indices on the Data Collection Devices (DCD) cluster. Each service that sends data uses its own indices. You control how the BIG-IQ manages your data by adjusting the settings for the Indices for each service.

From BIG-IQ, at the top of the screen, click System, then, on the left, click BIG-IQ DATA COLLECTION and then select BIG-IQ Data Collection Cluster.

Under SUMMARY, you can access screens detailing how much data is stored, as well as how the data is stored.

Under CONFIGURATION, you can access the screens that control DCD cluster performance.

Under the screen name, click CONFIGURATION > Logging Data Collection.

The Settings screen opens.

Click the Configure button for the service that you want to set up.

BIG-IQ displays the indices settings for the selected service.

Perform the next two steps for each index.

If you are configuring the Access service, use the same indices values for the access-event-logs and access-stats indices to avoid a mismatch in the reports generated from your logging data.

Specify the Rotation Type:

To chunk your data based on the amount of data:

Select Size Based.

For the Max Index Size, type the size of the indices you want to create.

For example, if you type 1000, when the index size reaches 1 GB, it becomes a retained index and new data from your BIG-IP begins accumulating in a new current index. If your Retained Index Count is set to 10, then the maximum disk space used by these indices will be approximately 10 GB.

To chunk your data based on the increments of time:

Select Time Based.

For the Rotation Period, specify a time unit, and type how many of those units each index should span.

For example, if you select Hours and type a value that corresponds to half an hour, a new index is created every half hour. If your Retained Index Count is set to 10, then the retained indices together contain approximately 5 hours of data.

For the Retained Index Count, type the total number of indices you want to store on the DCD.

This setting determines the maximum amount of data stored on the DCD. When this limit is reached, the oldest data is truncated or discarded. For example, if you set the number of indices to 10 and each index is 1 GB, then you must have 10 GB of storage available on your DCD.

Click Save & Close to save the indices configuration settings.
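
The disk space arithmetic used in the size-based example can be written as a small Python helper. It is a rough planning sketch only: it assumes Max Index Size is entered in MB (consistent with the example of 1000 corresponding to 1 GB), treats 1 GB as 1000 MB, and ignores replicas and any other overhead.

```python
def max_disk_gb(max_index_size_mb: float, retained_index_count: int) -> float:
    """Approximate upper bound on disk used by one service's retained indices."""
    return max_index_size_mb * retained_index_count / 1000


estimate_gb = max_disk_gb(1000, 10)
print(f"approximately {estimate_gb:.0f} GB")
```

Doubling either the index size or the retained index count doubles the estimate, which is why lowering the retained index count is the simplest way to reduce disk usage.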

Change the minimum number of master eligible devices

You can manage the minimum number of devices that must be available for the cluster to be considered operational. If the number of available devices is less than the value specified for the Minimum Master Eligible Devices, the cluster is deemed unhealthy.

From BIG-IQ, at the top of the screen, click System, then, on the left, click BIG-IQ DATA COLLECTION and then select BIG-IQ Data Collection Cluster.

Under SUMMARY, you can access screens detailing how much data is stored, as well as how the data is stored.

Under CONFIGURATION, you can access the screens that control DCD cluster performance.

Under the screen name, click CONFIGURATION > Cluster Settings.

The Cluster Settings screen opens.

To change this setting, click Override.

The button text changes to Update.

In the Minimum Master Eligible Devices field, type or select the new minimum number of healthy devices for this DCD cluster, and click Update.

The system updates the setting.

When you are satisfied with the minimum number of devices setting, click Cancel to close the screen.

How do Data Collection Device zones work?

There are two ways to use Data Collection Device (DCD) zones to control how data is stored for your managed BIG-IP devices.

You can use zones to optimize statistics traffic routing. By assigning DCDs to a zone and then assigning managed BIG-IP devices to that zone, you control which DCDs collect statistics traffic for each device.

DCD zone awareness factors into how the DCD cluster performs during Disaster Recovery scenarios. The role zones play in these scenarios is discussed in the Disaster Recovery Best Practices article on support.f5.com.

To specify which DCDs collect statistics traffic for a BIG-IP device, you perform two tasks:

Change the zone for a Data Collection Device

Normally, you assign a Data Collection Device (DCD) to a zone as part of the initial setup for that device. But you can change the zone to which a DCD is assigned as needed.

From BIG-IQ, at the top of the screen, click System, then, on the left, click BIG-IQ DATA COLLECTION and then select BIG-IQ Data Collection Devices.

The BIG-IQ Data Collection Devices screen opens, listing the DCDs in the cluster. The Services column lists the BIG-IP services monitored by each DCD. If no services are enabled for a DCD, this column displays Add Services instead.

Under Device Name, select the DCD that you want to revise.

On the DCD properties page, click Edit to display the Edit Zone popup.

To use an existing Zone, select the zone you want to assign to this DCD and click Continue.

To use a new Zone, select Create New, then type the name of the zone you want to create and assign to this DCD and click Continue.

Click Save & Close to close the DCD properties screen.

Use SSH to log in to the DCD as root.

Type bigstart restart elasticsearch and press Enter.

Repeat the last three steps for each DCD that you want to move to this zone.

As you run this command on each DCD, it momentarily stops processing DCD data, so the data routes to another node in the cluster and no data is lost.

You can now assign managed BIG-IP devices to this zone for data collection.

Change a zone for a BIG-IP device

Before you can change a BIG-IP device's zone, you must have created the zone on the Data Collection Device (DCD).

Changing the zone assignment for a BIG-IP determines which DCDs collect statistics data for that device. Normally, you assign a BIG-IP to a zone as part of the initial setup for that device, but you can change the zone to which a BIG-IP device is assigned as needed.

At the top of the screen, click Devices.

Click on the name of the device for which you want to change the zone.

The properties screen displays for that device.

On the left, click STATISTICS COLLECTION.

For Collect Statistics Data, select Enabled to collect statistics from this device.

For Zone, select the zone to which you want to assign this BIG-IP device.

Click Save & Close to close the device properties screen.

DCDs assigned to the zone you selected start collecting the statistics data for this device.

General Elasticsearch FAQ

Scaling incoming data on BIG-IQ

BIG-IQ uses Elasticsearch (ES) to automatically distribute data across all available data collection devices (DCDs) in your system setup.

On BIG-IQ, data coming in from managed BIG-IP devices is distributed into indices. Each index is a logical grouping of physical shards, where each shard is a self-contained index (each index usually includes 5 shards). ES automatically balances new data across the shards in an index using an internal hashing algorithm. Each shard grows as the amount of data increases. Since the statistics and event data provided by managed BIG-IPs is fairly consistent, the shards each grow at approximately the same rate. With the use of zones (see Overview of DCD Zones), this may not be the case.

To protect against potential data loss, there are two different kinds of shards: primary and replica. BIG-IQ allows each primary shard to have up to two replicas (total of three shards). By default, there is one replica shard for each of the five primary shards. ES manages these shards so that each copy of a given shard (the primary and each of its replicas) is on a separate DCD. All data is first written to the primary shard and then copied to each of the replicas. Replica shards allow for high availability and improved read performance of the data. Once an index is created, the number of primary shards cannot be changed, but the number of replicas can be adjusted. When a new DCD is added to the cluster, ES redistributes both the primary and replica shards to take advantage of the new node.

The following is an example of how a single index might get balanced across three DCDs when there are four primary shards and two replicas per primary (4 primary + (2 replicas * 4 primary) = 12 total shards).

DCD 1	DCD 2	DCD 3
Primary-0	Replica-0	Replica-0
Primary-1	Replica-1	Replica-1
Primary-2	Replica-2	Replica-2
Primary-3	Replica-3	Replica-3
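
The shard arithmetic and the rule that copies of the same shard sit on different DCDs can be sketched in Python. This is a simplified illustration, not the Elasticsearch allocator: the round-robin placement is an assumption chosen for clarity, so it spreads primaries across the DCDs instead of reproducing the exact layout in the table above.

```python
def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Every primary shard plus all of its replicas."""
    return primaries * (1 + replicas_per_primary)


def place_copies(primaries: int, replicas_per_primary: int, nodes: list) -> dict:
    """Place the copies of each shard on distinct nodes, primary first."""
    copies = 1 + replicas_per_primary
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies per shard")
    layout = {node: [] for node in nodes}
    for shard in range(primaries):
        for copy in range(copies):
            # Shift the starting node per shard so primaries spread across nodes.
            node = nodes[(shard + copy) % len(nodes)]
            role = "Primary" if copy == 0 else "Replica"
            layout[node].append(f"{role}-{shard}")
    return layout


layout = place_copies(4, 2, ["DCD 1", "DCD 2", "DCD 3"])
print(total_shards(4, 2))
for node, shards in layout.items():
    print(node, shards)
```

With four primaries and two replicas each, the sketch places 12 shards in total, four per DCD, and no DCD holds two copies of the same shard.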

Data scaling in DCD zones

Elasticsearch (ES) and BIG-IQ provide an additional option to group data using zones. Zoning allows you to put BIG-IP devices and data collection devices (DCDs) in close proximity to each other while the BIG-IQ management console is in another location. In an environment that has more than one data center, it is helpful to ensure that the DCDs closest to the data source are used for storing that data. The biggest reason for doing this is to avoid any issues relating to latency between the BIG-IP and DCDs.

Upon initial setup, BIG-IQ uses one zone for all data (named 'default') and additional zones may be created as needed. When you specify a zone for a host device, ES ensures the primary data for that host is on a DCD in that zone. This can lead to several shards being larger in an index if the amount of data coming from the assigned hosts is greater in one zone than another. ES prefers to allocate replica shards in a different zone than the primary, but will allocate replicas in the same zone if space is not available in a different zone. This ensures that the loss of a single data center doesn't cause total data loss for that zone. Adding more replicas increases resiliency, because each replica is placed in a different zone, which protects the data even with the loss of multiple data centers.

When adding a DCD to the cluster, each node is assigned a specific zone to store primary data. BIG-IQ then allows you to associate a host to that zone to ensure data from that host goes to that zone. This allows you to ensure the primary data within a data center will not traverse the WAN.
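
The replica placement preference described above (a different zone when space allows, otherwise the primary's zone) can be expressed as a small decision function. This Python sketch is a toy model and not how Elasticsearch is implemented: the function name, the free space figures, and the choice of the zone with the most free space are all assumptions made for illustration.

```python
def choose_replica_zone(primary_zone: str, free_gb_by_zone: dict, shard_gb: float) -> str:
    """Prefer a zone other than the primary's; fall back to the primary's zone."""
    others = {
        zone: free
        for zone, free in free_gb_by_zone.items()
        if zone != primary_zone and free >= shard_gb
    }
    if others:
        # Arbitrary choice for this sketch: the other zone with the most free space.
        return max(others, key=others.get)
    if free_gb_by_zone.get(primary_zone, 0) >= shard_gb:
        return primary_zone
    raise RuntimeError("no zone has enough free space for the replica")


print(choose_replica_zone("east", {"east": 100, "west": 50}, shard_gb=10))
print(choose_replica_zone("east", {"east": 100, "west": 5}, shard_gb=10))
```

The first call selects the other zone because it has room for the replica. The second falls back to the primary's zone because the other zone is too full.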

For more information about configuring zones, see Configure Data Collection Devices in Setting up and Configuring a BIG-IQ Centralized Management Solution on support.f5.com. For a sample BIG-IQ zone configuration, see BIG-IQ Zone Management.
