Manual Chapter: Diagnostic agent overview
Applies To: F5OS-A 1.2.0
Diagnostic agent overview
The Diagnostic Agent is the entity responsible for performing a
diagnostic process. Internally, the agent has a component model that
represents the system, and each component represents the health of one part
of the system. The agent can load a diagnostic profile and execute it. A profile
consists of a set of tasks that carry out the work of updating components.
The Diagnostic Agent maintains the list of components that make up the
platform. This list varies, depending on whether it is for an appliance,
blade, or system controller, and each platform type defines its own set of
hardware components. The components can be services, firmware, or hardware
blocks. Components that represent services can be discovered and created
dynamically.
Example components include:
- NVMe drives
- FPGAs (ATSE & VQF)
- LOP Application
- Memory
- CPU
- Sensors
- Platform HAL Service
- FPGA Manager
Component overview
Components adhere to the following guidelines, which allow multiple tasks to update
the health of a component:
- A component container can have one or more components.
- A component can have zero or more child components.
- A component can have zero or more attributes.
- Each hardware type has a fixed set of hardware components.
- Each component has a Health and Severity value.
- Each attribute has a Health and Severity value.
A component encapsulates some object in the system and includes these
fields:
- A unique key that identifies the component. The key is a lowercase name whose segments are separated by the Unix path separator, for example blade/hardware/drives. The root segment is the most generic name, and the leaf is generally the most specific.
  - By convention, the root component nodes are:
    - controller/
    - blade/
    - chassis/
  - By convention, there are three major subject nodes:
    - hardware/
      - Components associated with the physical hardware
    - services/
      - Components associated with a running service
    - firmware/
      - Components associated with a firmware element
- A user-friendly name or description
- A health value
- A severity value
- Zero or more attribute values (0..N)
- Zero or more child components (0..M)
- A parent component (none for a root component)
The component structure is hierarchical. Each component has at most one
parent and zero or more child components. A root component has no
parent. Examples of root components include blade, chassis, and controller.
By convention, the segments of a component key are separated by the Unix path
separator, a forward slash (/).
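The hierarchy and key convention described above can be illustrated with a minimal Python sketch. The `Component` class, its `add_child` method, and the `blade/hardware/drives/nvme0n1` key are hypothetical names chosen for illustration; they are not the agent's actual data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Component:
    """Hypothetical in-memory model of one component (illustration only)."""

    name: str                                   # one key segment, e.g. "drives"
    description: str = ""                       # user-friendly name or description
    parent: Component | None = field(default=None, repr=False)
    children: dict[str, Component] = field(default_factory=dict, repr=False)

    @property
    def key(self) -> str:
        """Full key: segments joined with '/', from the root down to this node."""
        if self.parent is None:
            return self.name
        return f"{self.parent.key}/{self.name}"

    def add_child(self, name: str, description: str = "") -> Component:
        """Return the child with this segment name, creating it if needed."""
        if not name or name != name.lower() or "/" in name:
            raise ValueError(f"invalid key segment: {name!r}")
        child = self.children.get(name)
        if child is None:
            child = Component(name, description, parent=self)
            self.children[name] = child
        return child


# Build one branch of a blade tree. A root component has no parent.
blade = Component("blade", "Blade")
drives = blade.add_child("hardware").add_child("drives", "Drives")
nvme = drives.add_child("nvme0n1", "NVMe drive 0")

print(nvme.key)            # blade/hardware/drives/nvme0n1
print(drives.parent.key)   # blade/hardware
```

Deriving the key from the parent chain, rather than storing it, keeps the key and the hierarchy from drifting apart in this sketch; a real implementation might store the key directly.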
Component health overview
A component can have these health states:
- Ok
  - The component is considered healthy. This is the initial health state.
- Unhealthy
  - The component is no longer healthy.
- NA
  - The health status does not apply to this component.
You can configure the health of a component to impact the health of its
parent component. For example, if the "nvme0n1" component is unhealthy, then
the "blade/hardware/drives", "blade/hardware", and "blade" components also
become unhealthy.
Component severity levels and health
Each component has a severity value, which adds weight to the health
status, and follows standard syslog severity levels. The severity value also
determines the health of the attribute and/or component.
Name | Level | Description | Health | Action |
---|---|---|---|---|
Emergency | 0 | The system is unusable. | Unhealthy | Create RMA SO#. |
Alert | 1 | A problem has occurred and must be remedied immediately. | Unhealthy | Create support SR#. |
Critical | 2 | A problem has occurred and must be remedied soon. | Unhealthy | Create support SR#. |
Error | 3 | A problem has occurred that needs attention. | Unhealthy | Run foreground diagnostics (currently unavailable, but to be implemented in a future version). |
Warning | 4 | A possible problem has occurred that needs attention. | Healthy | N/A |
Notice | 5 | A condition has occurred that might need attention. For example, voltage limits that are out of range do not fail and are instead marked as Notice. | Healthy | N/A |
Info | 6 | The component is operating normally. | Healthy | N/A |
Debug | 7 | The component does not reflect health conditions. | N/A | N/A |
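As a rough illustration, the severity table can be expressed as a lookup and combined with the parent-impact behavior described under Component health overview. This is a sketch under stated assumptions, not the agent's implementation: `Severity`, `health_for`, `Component`, the `impacts_parent` switch, and the component keys are hypothetical names, and the sketch only propagates an unhealthy state upward because the text does not describe how a parent recovers.

```python
from __future__ import annotations

from enum import IntEnum


class Severity(IntEnum):
    """Syslog severity levels, as listed in the table."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFO = 6
    DEBUG = 7


def health_for(severity: Severity) -> str:
    """Map a severity to the Health column of the table."""
    if severity <= Severity.ERROR:
        return "Unhealthy"
    if severity <= Severity.INFO:
        return "Healthy"
    return "N/A"  # Debug does not reflect health conditions


class Component:
    """Hypothetical component node (illustration only)."""

    def __init__(self, key: str, parent: Component | None = None,
                 impacts_parent: bool = True) -> None:
        self.key = key
        self.parent = parent
        self.impacts_parent = impacts_parent  # assumed configuration switch
        self.severity = Severity.INFO
        self.health = health_for(self.severity)

    def set_severity(self, severity: Severity) -> None:
        """Record a severity and push an unhealthy state up the tree."""
        self.severity = severity
        self.health = health_for(severity)
        node = self
        while (self.health == "Unhealthy" and node.impacts_parent
               and node.parent is not None):
            node = node.parent
            node.health = "Unhealthy"


blade = Component("blade")
hardware = Component("blade/hardware", parent=blade)
drives = Component("blade/hardware/drives", parent=hardware)
nvme = Component("blade/hardware/drives/nvme0n1", parent=drives)

nvme.set_severity(Severity.WARNING)
print(nvme.health, blade.health)    # Healthy Healthy

nvme.set_severity(Severity.CRITICAL)
print(nvme.health, blade.health)    # Unhealthy Unhealthy
```

Using an `IntEnum` keeps the numeric syslog ordering available, so the Unhealthy range (levels 0 through 3) can be tested with a single comparison.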
Component attributes overview
An attribute of a component encapsulates a specific value at a moment in time. Each attribute includes:
- A unique key that identifies the health attribute
  - Attributes are similar to fields of a class or structure; they have a name and a single basic value type.
  - Attribute keys are all lowercase.
  - Attribute keys can contain generic and specific names, separated by a colon (:). For example: switch:port:link-status.
- A user-friendly name
- An optional reference to a criteria object
- A health status
- A severity value
- A current value
- An updated-at timestamp
- An update count
- Extra data
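The field list above maps naturally onto a small record type. The following Python sketch is one possible shape, not the agent's actual structure: the `Attribute` class, its field names and defaults, and the exact set of characters allowed in a key segment are assumptions made for illustration.

```python
from __future__ import annotations

import re
import time
from dataclasses import dataclass, field
from typing import Any

# Assumed key grammar: lowercase segments (letters, digits, hyphens)
# separated by colons, e.g. "switch:port:link-status".
_KEY_PATTERN = re.compile(r"^[a-z0-9-]+(:[a-z0-9-]+)*$")


@dataclass
class Attribute:
    """Hypothetical model of a component attribute (illustration only)."""

    key: str                          # unique key, e.g. "switch:port:link-status"
    name: str                         # user-friendly name
    criteria: Any = None              # optional reference to a criteria object
    health: str = "Ok"                # assumed initial health status
    severity: int = 6                 # assumed default: Info
    value: Any = None                 # current value
    updated_at: float | None = None   # time of the last update, in seconds
    update_count: int = 0             # number of updates so far
    extra: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not _KEY_PATTERN.match(self.key):
            raise ValueError(f"invalid attribute key: {self.key!r}")

    def update(self, value: Any, now: float | None = None) -> None:
        """Store a new value and refresh the timestamp and update count."""
        self.value = value
        self.updated_at = time.time() if now is None else now
        self.update_count += 1


link = Attribute("switch:port:link-status", "Port link status")
link.update("up")
print(link.value, link.update_count)   # up 1
```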
Each attribute can have an association with a criteria object. A criteria object can have 1..M limits, which can be applied to a data object. Each criteria limit includes:
- A unique key, which describes the limit within the criteria.
- A user-friendly message associated with the limit.
- A string expression that can be applied to a data object.
- A severity to set the attribute to if the above expression evaluates to "true".
Each criteria object can have multiple limits. The limits are evaluated in the order in which they are defined within the framework. If one expression evaluates to "true", then the evaluator stops processing the remaining limits.
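The ordered, first-match evaluation described above can be sketched in Python. The expression syntax used here (`value <op> <number>`), the `Limit` class, and the `matches` and `evaluate` helpers are assumptions for illustration; the actual expression language and evaluator are not described in this chapter.

```python
from __future__ import annotations

import operator
import re
from dataclasses import dataclass

_OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}
# Assumed expression grammar: "value <op> <number>", e.g. "value >= 95".
_EXPR = re.compile(r"^\s*value\s*(<=|>=|==|!=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$")


@dataclass(frozen=True)
class Limit:
    """Hypothetical criteria limit (illustration only)."""
    key: str          # unique key within the criteria
    message: str      # user-friendly message
    expression: str   # string expression applied to the data value
    severity: int     # syslog level to set when the expression is true


def matches(expression: str, value: float) -> bool:
    """Evaluate one expression of the assumed grammar against a value."""
    parsed = _EXPR.match(expression)
    if parsed is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    op, threshold = parsed.groups()
    return _OPS[op](value, float(threshold))


def evaluate(limits: list[Limit], value: float) -> Limit | None:
    """Return the first limit whose expression is true, in definition order."""
    for limit in limits:
        if matches(limit.expression, value):
            return limit   # stop: remaining limits are not processed
    return None            # no limit applied


# Order matters: the stricter limit is defined first.
temperature_limits = [
    Limit("temp-critical", "Temperature is critical", "value >= 95", 2),
    Limit("temp-warning", "Temperature is elevated", "value >= 80", 4),
]

hit = evaluate(temperature_limits, 97.0)
print(hit.key, hit.severity)                      # temp-critical 2
print(evaluate(temperature_limits, 85.0).key)     # temp-warning
print(evaluate(temperature_limits, 40.0))         # None
```

Because evaluation stops at the first true expression, a value of 97 would be reported only as a warning if the two limits were defined in the opposite order. The sketch returns `None` when no limit matches; what the agent does in that case is not stated here.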