Supplemental Document : F5OS-C 1.5.1 Fixes and Known Issues Release Notes

Applies To:


F5OS-C

  • 1.5.1
Updated Date: 02/17/2023

F5OS-C Release Information

Version: 1.5.1
Build: 14085

Note: This content is current as of the software release date.
Updates to bug information occur periodically. For the most up-to-date bug data, see Bug Tracker.



Cumulative fixes from F5OS-C v1.5.0 that are included in this release
Known Issues in F5OS-C v1.5.x

Functional Change Fixes

ID Number Severity Links to More Info Description
1161557-2 1-Blocking BT1161557 BIG-IP tenants created before F5OS-C 1.5.1 or F5OS-A 1.3.0 may be allocated a smaller disk than required


F5OS-C Fixes

ID Number Severity Links to More Info Description
1189013-1 2-Critical BT1189013 Race condition in platform bringup can result in incorrect Openshift images in local registry after upgrade
1173853-3 2-Critical BT1173853 Packet loss caused by failure of internal hardware bus
1173061-1 2-Critical BT1173061 etcd database may be corrupted in certain failure scenarios
1169341-3 2-Critical BT1169341 Using MAC Masquerade in a BIG-IP tenant causes traffic issues when re-deploying the tenant
1161761 2-Critical BT1161761 Egress traffic is dropped on interface 1/1.1
1135853-1 2-Critical BT1135853 Openshift kubelet-server and kubelet-client certificates expire after 365 days
1132485-2 2-Critical BT1132485 Controller sync can enter an erroneous double standby configuration in rare circumstances
1128765-1 2-Critical BT1128765 Data Mover lock-up causes major application traffic impact and tenant deploy failures
1185497-1 3-Major BT1185497 Tenant health in the partition shows additional entries that are not part of the tenant configuration
1154089-1 3-Major BT1154089 After a controller upgrade, Kubevirt pods fail to upgrade due to leftover pods stuck in Unknown state
1146013-1 3-Major BT1146013 VELOS floating IP may not work properly with IPv4 prefix-length other than /24, /16, or /8
1144633-1 3-Major BT1144633 System controller components can hang during controller rolling upgrade
1141293-2 3-Major BT1141293 F5OS will not import system images copied with WinSCP
1137669-2 3-Major BT1137669 Potential mis-forwarding of packets caused by stale internal hardware acceleration configuration
1136829-2 3-Major BT1136829 Blank server error popup appears over unauthorized popup for operator user
1135181-1 3-Major BT1135181 Controller rolling upgrade may cause blades to reboot into partition "none", deleting tenant data
1116869 3-Major BT1116869 Tcpdump on F5OS does not capture packets of certain sizes
1113225-1 3-Major BT1113225 The tcam-mgr neuron client disconnects
1091537-1 4-Minor   CVE-2022-23943 mod_sed: Read/write beyond bounds



Cumulative fixes from F5OS-C v1.5.0 that are included in this release


Vulnerability Fixes

ID Number CVE Links to More Info Description
1075693 CVE-2021-22543 K01217337 CVE-2021-22543 Linux Kernel Vulnerability
1067077 CVE-2021-43527 K54450124, BT1067077 NSS: Memory corruption in decodeECorDsaSignature with DSA


Functional Change Fixes

ID Number Severity Links to More Info Description
1043701 3-Major   Allow configurable partition hostname


F5OS-C Fixes

ID Number Severity Links to More Info Description
1091933 1-Blocking BT1091933 After a partition upgrade, tenant health and traffic statistics are no longer presented
950901 2-Critical BT950901 Wrong chassis serial number chs599996s populates to VELOS tenants
1125505-1 2-Critical BT1125505 LOP communication may stop working on a system controller after failover
1114861 2-Critical BT1114861 Local controller etcd instance may become unreachable after a rolling upgrade.
1111549 2-Critical BT1111549 System import functionality is unstable if PXE install source is not imported
1109021-1 2-Critical BT1109021 CLI commands are not logged in audit.log
1107433 2-Critical BT1107433 BIG-IP Next floating IPs do not properly issue GARPs on failover
1105001 2-Critical BT1105001 Large tar/gz/iso file download via the restconf API fails.
1095977 2-Critical BT1095977 Tenant disk image not removed from blade on scale-down of tenant deployment
1092913 2-Critical BT1092913 Tenant CPU pinning can fail when a blade is moved to a new partition and the blade was previously running a deployed tenant
1092257 2-Critical BT1092257 Downloading files larger than 500 megabytes via File Utilities in the webUI can result in a corrupted file.
1091313 2-Critical BT1091313 Rapid transition of BIG-IP Next tenant from Standby-to-Active-to-Standby can leave floating addresses active
1090521-1 2-Critical BT1090521 Tenant deployment may fail if the memory configured is an odd number.
1088565 2-Critical BT1088565 Various services may stop working on a system controller if the LCD is malfunctioning
1080421-2 2-Critical BT1080421 LACP does not transmit PDUs when creating a LAG
1078433-1 2-Critical BT1078433 BIG-IP Next Tenant Management IP address is not reachable after a chassis power cycle.
1076705-1 2-Critical BT1076705 Etcd instance might not start correctly after upgrade
1073777 2-Critical BT1073777 LACP interface goes down after the partition is disabled/enabled
1073305-1 2-Critical BT1073305 Upgrade to F5OS-C 1.3.0 failed to upgrade chassis partition
1072209 2-Critical BT1072209 Packets are dropped on VELOS when a masquerade MAC is on a shared VLAN
1053873 2-Critical BT1053873 Partition CLI command 'show cluster nodes node' may time out
1039721 2-Critical BT1039721 The system image displays data for only one VELOS controller
968881 3-Major BT968881 Creating a partition using the CLI, 'commit check' fails
1117561 3-Major BT1117561 Vcc-terminal-server gets stuck in unsuccessful restart loop
1113233 3-Major BT1113233 External access for BIG-IP Next pods can be lost after tenant redeployment
1112229-2 3-Major   File download API changes to support file download from the webUI
1111237 3-Major BT1111237 Logrotate parameters do not get updated by software upgrade
1110429 3-Major BT1110429 Duplicate service-instance entries in chassis partition
1106881 3-Major BT1106881 F5OS with an AFM license provisioned may provide incorrect AFM stats to a BIG-IP tenant
1106093 3-Major BT1106093 After a node is removed from a running tenant, the operational status of that node remains
1104769 3-Major BT1104769 After a node is removed from a tenant in provisioned mode, the operational status of that node remains
1103105 3-Major BT1103105 The naming convention of core files has changed
1100861 3-Major BT1100861 System aaa primary-key state not returning both hash and status
1097833-1 3-Major BT1097833 Debug messages logged in platform.log
1093301 3-Major BT1093301 GUI keeps showing "Firmware updates are currently in progress" after upgrade
1091641 3-Major BT1091641 NTP (chrony) packet authentication is not fully implemented on VELOS
1090145 3-Major BT1090145 VLAN-Listener incorrectly updated on Network Manager component restart
1089037-1 3-Major BT1089037 Dnsmasq configuration blocks resolution of names in .local domains
1084817-2 3-Major BT1084817 Container api-svc-gateway crashes due to certificate issues partition database
1084581 3-Major BT1084581 Log files collected by QKView are truncated with the newest entries removed
1083993 3-Major BT1083993 File import should check that the target doesn't exist
1081333 3-Major BT1081333 Local file path in file transfer-status for remote file import operation does not show appropriately
1080417-1 3-Major BT1080417 List of running containers is not captured in host QKView
1079809 3-Major BT1079809 Alert manager status is occasionally reported as unhealthy on startup
1079037 3-Major BT1079037 Tenant deployment fails when tenant name ends with hyphen
1073581 3-Major BT1073581 Removing a 'patch' version of services might remove the associated 'base' version as well
1069917 3-Major BT1069917 Platform registry ports can become de-synchronized, impacting Openshift deployment
1066185 3-Major   MIB files cannot be downloaded or exported using file utilities.
1064225 3-Major BT1064225 HTTP response status codes do not match the result of the file import/export operations
1060097 3-Major   Chassis admin user cannot use SCP to import controller/chassis partition ISO images
1059073 3-Major   QKView upload to iHealth is not supported by web proxy
1051241 3-Major BT1051241 LAG and interface names should not contain special characters such as whitespace, asterisk, slashes, and curly braces.
1049737-1 3-Major BT1049737 F5OS: Some members in LACP trunks may not come up
1040461 3-Major BT1040461 Permissions of some QKView control files do not follow standards
1034093 3-Major   protobuf vulnerability: CVE-2021-3121
1102137 4-Minor   Diagnostics ihealth upload qkview-file does not auto-complete with available qkview file names
1085005 4-Minor BT1085005 'cluster nodes node blade-N reboot' failure message is incorrect
1079857 4-Minor BT1079857 Orchestration-agent logs spurious "Warning" severity messages



 

Cumulative fix details for F5OS-C v1.5.1 that are included in this release

968881 : Creating a partition using the CLI, 'commit check' fails

Links to More Info: BT968881

Component: F5OS-C

Symptoms:
When creating a partition using the CLI, and trying to validate the changes with 'commit check', a validation error occurs:

partitions 'partition part1 uuid' is not configured.

Conditions:
-- Create a partition using the CLI.
-- Attempt to validate the changes using 'commit check'.

Impact:
The 'commit check' operation rejects this config change. This error is misleading, indicating that you need to specify a uuid value.

Note: Not only is uuid irrelevant, it is not possible for you to specify it.

Workaround:
None


950901 : Wrong chassis serial number chs599996s populates to VELOS tenants

Links to More Info: BT950901

Component: F5OS-C

Symptoms:
Under some corner cases, system controllers may have a blank serial number in the /etc/PLATFORM file. If partition software is started during this period, any tenants deployed on that partition will report an incorrect chassis serial number chs599996s.

Conditions:
- Restarting the system controllers
- Removing and adding the blades from the tenant

Impact:
VELOS tenants may report the wrong serial number, visible in "tmsh show sys hardware" or "tmsh list cm device".

If a VELOS tenant spans multiple blades, and the different blades pick up different serial numbers from F5OS, the tenant may fail to properly cluster; multiple tenant blades will function as cluster primary, competing over the cluster management IP.

Workaround:
The correct chassis serial number can be seen in the license file, which can be viewed from a tenant by running "tmsh show sys license".

If a tenant is currently bifurcated (multiple blades functioning as cluster primary), the immediate mitigation is to set the tenant to "provisioned" and then back to "deployed".

If a tenant is reporting the incorrect chassis serial number (chs599996s), then the following should restore the correct serial number:

1. Determine which controller has a blank serial number for the partition, by looking at "grep CHASSIS_SERIAL_NO /var/log/sw-util.log | tail -1" run on each controller, e.g.:

    [root@controller-2 ~]# for i in controller-{1,2}; do echo -n "$i: "; ssh $i grep CHASSIS_SERIAL_NO /var/log/sw-util.log | tail -1; done
    controller-1: ++ CHASSIS_SERIAL_NO=chs700144s
    controller-2: ++ CHASSIS_SERIAL_NO=
    [root@controller-2 ~]#

    In this example, controller-2 is affected.

2. Reboot the affected controller. After it reboots, check whether it has a blank serial number. Repeat this step until it boots and reports a non-blank serial number.
3. From the partition, set the tenant to "provisioned" and then back to "deployed".
4. After the tenant reboots, confirm the serial number is now correct (not chs599996s) in the output of "tmsh show sys hardware".


1189013-1 : Race condition in platform bringup can result in incorrect Openshift images in local registry after upgrade

Links to More Info: BT1189013

Component: F5OS-C

Symptoms:
After an upgrade to F5OS-C controller OS version 1.5.x, the docker registry from which the Openshift platform pulls container images can have the wrong contents. This can leave necessary images missing from the registry and lead to failures in Openshift cluster bringup after a cluster re-install.

Conditions:
Upgrading F5OS-C controller OS from a pre-1.5.x version to 1.5.x and triggering a cluster re-install after upgrade.

Impact:
Openshift cluster does not come up after re-install.

Workaround:
1. Remove the controller ISO version used to USB install the system. This version can be determined by running the following in a bash shell on either system controller:

grep ^version: /var/platform-services/VERSION | cut -d' ' -f2

2. Reboot both system controllers.
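Step 1 above can be scripted; a minimal sketch, assuming a bash shell on a system controller (the helper name get_iso_version is introduced here for illustration and is not an F5OS command):

```shell
# Extract the controller ISO version from the platform-services VERSION file.
# The file contains a line such as "version: 1.5.0-12901"; print the value.
get_iso_version() {
  grep '^version:' "$1" | cut -d' ' -f2
}

# On a system controller this would be invoked as:
#   get_iso_version /var/platform-services/VERSION
```

The printed version identifies which controller ISO to remove before rebooting both system controllers.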

Fix:
Fix for race condition in platform bringup that can result in incorrect Openshift images in local registry after upgrade.


1185497-1 : Tenant health in the partition shows additional entries that are not part of the tenant configuration

Links to More Info: BT1185497

Component: F5OS-C

Symptoms:
When the admin upgrades the system software from 1.3.x to 1.5.0, the platform updates the tenant's table with additional entries that are not running as part of the tenant's original configuration.

Conditions:
Power cycle or system software upgrades from 1.3.x to 1.5.0.

Impact:
There is no impact on the critical functionality of the tenant, and traffic continues to work. However, the tenant health status shows some unwanted information, which could be confusing.

Workaround:
Toggling the affected tenant's running state from "Deployed" to "Provisioned" and back to "Deployed" will fix the state of the tenant in the table.
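The state toggle above can be performed from the chassis partition CLI; a minimal sketch, assuming a tenant named tenant1 (the prompt and tenant name are illustrative):

```
partition-1# config
partition-1(config)# tenants tenant tenant1 config running-state provisioned
partition-1(config-tenant-tenant1)# commit
partition-1(config-tenant-tenant1)# top
partition-1(config)# tenants tenant tenant1 config running-state deployed
partition-1(config-tenant-tenant1)# commit
```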

Fix:
During the power cycle/system upgrade, the platform re-populates the tenant oper status from Openshift and publishes it to Partition. If the REST response of the tenants from Openshift is incomplete, the platform is populating entries under the wrong key/value. As a result, the partition tenant's table ends up with some unwanted entries.
It is a cosmetic issue and will not impact any tenants.


1173853-3 : Packet loss caused by failure of internal hardware bus

Links to More Info: BT1173853

Component: F5OS-C

Symptoms:
All or 50% of from-network packets arriving at a front panel port are dropped in hardware prior to delivery to tenant(s) running on the CPU. Packet loss is caused by CRC errors on an internal bus connecting two hardware components leading to eventual failure of the bus.

Conditions:
The issue occurs randomly, but is most commonly seen soon after bootup when packets first start to be handled by fastL4 hardware acceleration, hardware per-virtual-server SYN cookie protection, or AFM hardware protection.

Impact:
Total loss of from-network to CPU packets on r5900, r5800, and r5600 appliances, and either total loss or loss of 50% of from-network to CPU packets on r10900, r10800, and r10600 appliances. The r4800, r4600, r2800, and r2600 appliances are unaffected.

Workaround:
Reboot the appliance and disable fastL4 acceleration, per-virtual-server SYN cookie hardware protection, and AFM hardware protection before re-enabling ingress traffic.

Fix:
This issue has been corrected.


1173061-1 : etcd database may be corrupted in certain failure scenarios

Links to More Info: BT1173061

Component: F5OS-C

Symptoms:
/etc/etcd/dump_etcd.sh might show that the etcd instance native to system controller #1 or #2 does not come up after an upgrade.

This displays in the output of /etc/etcd/dump_etcd.sh and might occur for the .3.51 or .3.52 node:

failed to check the health of member 25fa6669d235caa6 on https://100.65.3.52:2379: Get https://100.65.3.52:2379/health: dial tcp 100.65.3.52:2379: connect: connection refused
member 25fa6669d235caa6 is unreachable: [https://100.65.3.52:2379] are all unreachable

This can cause a longer OpenShift outage if the system controller containing the healthy instance is rebooted, and complete outage if the system controller containing the healthy instance is lost.

Conditions:
This can happen if both system controllers are rebooted at the same time.

Impact:
The local etcd instance on the affected system controller will not work correctly, compromising the high availability (HA) of the OpenShift cluster. The cluster will continue to work correctly while both system controllers are up.

Workaround:
The only workaround is to rebuild the OpenShift cluster by running "touch /var/omd/CLUSTER_REINSTALL" from the shell as root on the active system controller. This will cause all running tenants to be taken down during the cluster reinstall, which takes 90+ minutes.

Fix:
This is fixed in F5OS-C-1.5.1 and later.

With this fix, the impacted etcd instance will be recovered automatically, restoring full high availability support in etcd.


1169341-3 : Using MAC Masquerade in a BIG-IP tenant causes traffic issues when re-deploying the tenant

Links to More Info: BT1169341

Component: F5OS-C

Symptoms:
If MAC Masquerade is configured in the tenant, and the tenant is moved to the Configured or Provisioned state and then back to Deployed, the tenant may experience loss of traffic.

Conditions:
MAC Masquerade is configured in the tenant, and the tenant is redeployed.

Impact:
The tenant may experience loss of datapath traffic.

Workaround:
N/A

Fix:
Using MAC Masquerade in a BIG-IP tenant no longer causes traffic issues.


1161761 : Egress traffic is dropped on interface 1/1.1

Links to More Info: BT1161761

Component: F5OS-C

Symptoms:
Egress traffic on interface 1/1.1 is dropped. If that interface is configured as part of a LAG with LACP enabled, the interface will remain in an LACP_DEFAULTED state.

Conditions:
-- F5OS-C partition using the blade in slot 1.
-- The port group for the interfaces in slot 1 is configured in 4x25GbE or 4x10GbE mode.

Impact:
All traffic that the system tries to transmit out of interface 1/1.1 is dropped.

Workaround:
Do not use interface 1/1.1 in affected software versions.

Fix:
Egress traffic is no longer dropped on interface 1/1.1.


1161557-2 : BIG-IP tenants created before F5OS-C 1.5.1 or F5OS-A 1.3.0 may be allocated a smaller disk than required

Links to More Info: BT1161557

Component: F5OS-C

Symptoms:
If the BIG-IP tenant disk space is fully used by creating multiple software volumes within the tenant, it will generate disk errors.

Conditions:
- A tenant originally deployed from an “ALL-F5OS” tenant image (e.g., BIGIP-15.1.6.1-0.0.10.ALL-F5OS.qcow2.zip.bundle) built from one of the following:
 -- 14.1.5 or above in the 14.1.x branch of code
 -- 15.1.6.1 or above in the 15.1.x branch of code

- The tenant is configured to use 76G of disk space (the default)

Impact:
Software installs within the tenant may fail.

Workaround:
Beginning in F5OS-A 1.3.0, the system detects the minimum size of a disk created from a tenant image file and enforces that minimum on newly-created tenants.

If a tenant is affected by this issue and the system is upgraded to F5OS-A 1.3.0 or later, set the tenant to "configured", and then deploy the tenant again.

If the disk size is too small, the system will report the minimum size; adjust the tenant disk size to the advised value or larger.

From 1.4.0, the user does not need to adjust the size unless a larger size is needed;
the correct minimum size is auto-allocated when the state is changed.
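Adjusting the tenant disk size can be done from the partition CLI; a minimal sketch, assuming a tenant named tenant1 and a minimum of 82 GB (both values illustrative):

```
partition-1(config)# tenants tenant tenant1 config storage size 82
partition-1(config-tenant-tenant1)# commit
```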

Fix:
The tenant disk size will be detected and auto-allocated.

Behavior Change:
Behavior differs by version:

1.3.x: If the configured disk size is smaller than required, the system warns the user and does not start the tenant until the user specifies at least the minimum size.

1.4.0: The system automatically increases the disk to the minimum size if the user did not specify a disk size.


1154089-1 : After a controller upgrade, Kubevirt pods fail to upgrade due to leftover pods stuck in Unknown state

Links to More Info: BT1154089

Component: F5OS-C

Symptoms:
Tenants will not move to a running state.

Conditions:
After a controller upgrade, it is possible that some of the Kubevirt pods from the previous software version can remain in an Unknown state. With these leftover pods, the Kubevirt install script will fail to install the newer Kubevirt pods.

Impact:
Tenants are not running.

Workaround:
Manually delete the leftover Kubevirt pods in the Unknown state and rerun the Kubevirt install script.
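Finding the leftover pods can be sketched as follows, assuming "oc get pods --all-namespaces" output (NAMESPACE NAME READY STATUS ... columns); the helper name is illustrative and not part of F5OS:

```shell
# Print "namespace pod" for rows whose STATUS column is Unknown.
unknown_pods() {
  awk '$4 == "Unknown" { print $1, $2 }'
}

# On the active controller this might be combined as:
#   oc get pods --all-namespaces | unknown_pods
# and each reported pod deleted with:
#   oc delete pod <pod> -n <namespace> --grace-period=0 --force
```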

Fix:
Kubevirt pods will update as expected.


1146013-1 : VELOS floating IP may not work properly with IPv4 prefix-length other than /24, /16, or /8

Links to More Info: BT1146013

Component: F5OS-C

Symptoms:
When a VELOS device is configured with a prefix-length other than /24, /16, or /8 for IPv4 management addresses, the system may fail to install correct routes for handling reply traffic sourced from the floating management address.

One of the two following situations may occur:

1. The floating management address will not be accessible from other devices on the same local network (cannot ping the floating management IP from the standby system controller).

2. The floating management address will not be accessible from another range of IPs, because the system thinks those addresses are link-local.

For instance, if a device is assigned an IP address of 198.51.100.88/26:

[root@controller-1 ~]# ip route show table mgmt-floating4
default via 198.51.100.126 dev mgmt-floating
198.51.100.0/26 dev mgmt-floating scope link

The system will not be accessible from devices with IP address 198.51.100.0 through 198.51.100.63.
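The incorrect on-link route above comes from mis-deriving the network from the address and prefix length. The correct derivation can be sketched in plain bash arithmetic (the function name is illustrative):

```shell
# Derive the on-link network for an IPv4 address and prefix length.
network_of() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  local ip=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  local mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  local net=$(( ip & mask ))
  printf '%d.%d.%d.%d/%d\n' \
    $(( net >> 24 & 255 )) $(( net >> 16 & 255 )) \
    $(( net >> 8 & 255 )) $(( net & 255 )) "$2"
}

network_of 198.51.100.88 26   # prints 198.51.100.64/26, not 198.51.100.0/26
```

For a /26, only the hosts in the derived /26 block are on-link; truncating to an octet boundary, as in the faulty route table, wrongly marks other addresses as link-local.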

Conditions:
-- VELOS controller
-- Management network with an IPv4 management address configured, and management network prefix-length other than /24, /16, or /8.

Impact:
Floating system controller management IP may not be able to reply to traffic from all IPs.

Workaround:
On active system controller (and after any reboot or system controller failover), fix the routing rules. Log in to the active system controller as root and run the following commands:

CORRECT_NETWORK=$(ip route show table main | grep mgmt-floating | cut -f1 -d' ')
WRONG_ROUTE=$(ip route show table mgmt-floating4 | grep 'scope link')
ip route delete table mgmt-floating4 $WRONG_ROUTE
ip route add table mgmt-floating4 $CORRECT_NETWORK dev mgmt-floating

Fix:
The system correctly handles IPv4 management addresses with a prefix-length other than /24, /16, and /8.


1144633-1 : System controller components can hang during controller rolling upgrade

Links to More Info: BT1144633

Component: F5OS-C

Symptoms:
System controller components can hang during controller rolling upgrade, resulting in failure to start the partitions correctly, and other incorrect operation.

Partition instance state may show as "failed", "offline", or "running", rather than the normal "running-active"/"running-standby".

This can also cause imported ISO images to not synchronize across controllers following an upgrade.

Conditions:
Performing a system controller rolling upgrade from a version prior to 1.5.0, to version 1.5.0.

Impact:
Partition instances may not reach the normal state of running-active/running-standby, and will not operate correctly.

Workaround:
If the system is in this state, it can be fixed by rebooting both system controllers, in sequence. Failover/go-standby is not sufficient; both controllers must be restarted to clear the issue.

The problem can be avoided by performing an out-of-service upgrade, using the "out-of-service true" option with the system controller "system image set-version" command.

Fix:
The system controller components no longer hang during the rolling upgrade.


1141293-2 : F5OS will not import system images copied with WinSCP

Links to More Info: BT1141293

Component: F5OS-C

Symptoms:
F5OS will not import system images copied into /var/import/staging/ using WinSCP. The files will be present on the filesystem, but the system will not process and validate them.

On older software versions (prior to F5OS-C 1.3.0 and F5OS-A 1.1.0), the image will remain stuck in an "In Queue" state.

Conditions:
Importing F5OS system images (F5OS-C controller and chassis partition images and F5OS-A system images) to /var/import/staging/.

Impact:
The images cannot be used for F5OS software installs.

Workaround:
After importing the images, log in to the F5OS device as root and run touch against the newly-uploaded files. For instance:

    touch /var/import/staging/F5OS-C-1.4.0-4112.CONTROLLER.iso

Fix:
F5OS will correctly import system images copied with WinSCP.


1137669-2 : Potential mis-forwarding of packets caused by stale internal hardware acceleration configuration

Links to More Info: BT1137669

Component: F5OS-C

Symptoms:
Configuration entries added to the internal ePVA hardware acceleration tables may become stuck; packets arriving from front panel ports may then be handled by stale entries, resulting in unexpected forwarding behavior. The stale entries may also prevent TMM from offloading new connections to ePVA.

Conditions:
The most likely cause for entries to become stuck is a reboot of the tenant or a restart of TMM while it has active connections offloaded to ePVA, without also rebooting the entire appliance.

Impact:
Packets may be forwarded to unexpected destinations, and/or new connections are unable to be offloaded to ePVA.

Workaround:
Do not reboot the tenant or restart TMM without also rebooting the entire appliance.

Fix:
Stale ePVA entries are no longer left behind, and packets are forwarded as expected.


1136829-2 : Blank server error popup appears over unauthorized popup for operator user

Links to More Info: BT1136829

Component: F5OS-C

Symptoms:
When an operator user performs any operation that makes a REST call that is unauthorized for the operator role, a blank server error popup appears behind the unauthorized popup.

Conditions:
The logged-in user has the operator role and performs an unauthorized action.

Impact:
A blank server error popup is seen behind an unauthorized popup, which is unnecessary.

Workaround:
None

Fix:
Only the unauthorized popup is displayed when the operator user performs an unauthorized action.


1135853-1 : Openshift kubelet-server and kubelet-client certificates expire after 365 days

Links to More Info: BT1135853

Component: F5OS-C

Symptoms:
See https://support.f5.com/csp/article/K64001020

The kubelet-server and kubelet-client certificates on each blade and controller expire after 365 days and are not automatically renewed when they expire.

When the blade kubelet-server and kubelet-client certificates expire, the blade(s) will go offline in the Openshift cluster and be re-added to the cluster by the orchestration-manager daemon. This causes a tenant outage.

On the active system controller, messages appear similar to the following example, indicating the certificates are expired:

controller-2.chassis.local dockerd-current[4212]: E0809 19:48:01.601509 1 authentication.go:62] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]

The systemd journal on the system controller logs messages similar to the following example:

controller-2.chassis.local origin-node[19920]: E0808 08:35:03.754013 19930 certificate_manager.go:326] Certificate request was not signed: timed out waiting for the condition

Conditions:
Any system where the Openshift cluster was installed with a release of 1.5.0 or earlier.

Impact:
The blade(s) will go offline in the Openshift cluster and be re-added to the Openshift cluster by the orchestration-manager daemon. This will cause a tenant outage, and the tenants may not restart correctly after the blades have been re-added to the cluster.

Workaround:
The renew_nodes.sh script mentioned in K64001020 can be used to renew the kubelet-server and kubelet-client certificates for one more year. It is not possible to renew these certificates for more than a year without rebuilding the Openshift cluster.

At 2 years, other certificates in the Openshift cluster will expire, so it is necessary to rebuild the Openshift cluster with the fix for this issue.
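Certificate expiry can also be checked by hand with openssl to judge how urgent renewal is. A minimal sketch; the kubelet certificate path shown in the comment is an assumption and may vary by release and node role:

```shell
#!/bin/sh
# Print the expiry (notAfter) date of a PEM-encoded certificate.
cert_expiry() {
    openssl x509 -enddate -noout -in "$1"
}
# On a system controller this might be, for example (path is an assumption):
#   cert_expiry /etc/origin/node/certificates/kubelet-server-current.pem
```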

Fix:
Openshift has been updated to use a certificate expiration time of 10 years, and new Openshift containers have been added to releases with this fix. To make use of these new containers with longer certificate expiration times, it is necessary to rebuild the Openshift cluster.

Warning messages have been added to the “show cluster cluster-status” output on the system controller CLI that warn when certificates are within 90 days of expiring, and when the Openshift cluster needs to be rebuilt to take advantage of the new containers with the longer certificate expiration times.

syscon-1-active# show cluster cluster-status
cluster cluster-status summary-status "Openshift cluster is healthy, and all controllers and blades are ready. WARNING: 1 or more Openshift certificates expiring within 90 days. WARNING: Manual Openshift cluster rebuild necessary to update containers."

INDEX  STATUS
----------------------------------------------------------------------------------------------
15     2023-08-20 12:03:09.773660 - WARNING: Openshift cluster needs manual rebuild to upgrade to latest version.
16     2023-08-20 12:05:05.373785 - WARNING: Openshift certificates expiring within 90 days.

The Openshift cluster can be rebuilt after upgrading to a release containing the fix by issuing a “touch /var/omd/CLUSTER_REINSTALL” command from the shell on the active system controller. This rebuild will take 90+ minutes and will cause a tenant outage. Once the cluster rebuild is complete, all chassis partitions should be disabled and re-enabled, and all tenants should be cycled to provisioned and back to deployed to ensure they have restarted correctly after the cluster rebuild. At this point all certificates in the cluster will have a 10 year expiration.

Once the Openshift cluster is rebuilt using this fix, it is not possible to downgrade without rebuilding the Openshift cluster after the download. This is due to the new Openshift containers not being available after the downgrade. If a downgrade is done before the Openshift cluster is rebuilt, there will not be any issues.


1135181-1 : Controller rolling upgrade may cause blades to reboot into partition "none", deleting tenant data

Links to More Info: BT1135181

Component: F5OS-C

Symptoms:
System controller components can hang during controller rolling upgrade, resulting in failure to start the partitions correctly, and other incorrect operation.

Partition instance state may show as "failed", "offline", or "running", rather than the normal "running-active"/"running-standby".

If switchd hangs during the rolling upgrade, failure messages appear when blades reboot.

Conditions:
Performing a system controller rolling upgrade to version F5OS-C 1.5.0 from an earlier version.

Impact:
Tenant instance data (the virtual disk image) may be deleted from the blades if the blades are rebooted while this issue is occurring.

Workaround:
The problem can be avoided by performing an out-of-service upgrade, using the "out-of-service true" option with the system controller "system image set-version" command.

If a VELOS chassis has already undergone a rolling upgrade to F5OS-C 1.5.0, reboot both system controllers to get them back into a stable state.

If blades in a partition were affected, reboot those blades after rebooting the system controllers. The tenant instance data cannot be recovered, and must be recreated and/or restored from a UCS backup.

Fix:
The system controller components no longer hang during the rolling upgrade.


1132485-2 : Controller sync can enter an erroneous double standby configuration in rare circumstances

Links to More Info: BT1132485

Component: F5OS-C

Symptoms:
Under rare circumstances, the controller sync daemon (which is responsible for mirroring data between active and standby controllers) can end up in a "double standby" configuration. This results in an interruption in proper controller synchronization, and can result in negative impacts such as controller software import or upgrade failure.

Conditions:
Controller sync daemon on both active and standby controllers is configured to "standby" mode.

Impact:
Controller sync does not work until corrected, which can result in a number of negative side effects such as import and live upgrade failures.

Workaround:
On the active controller, run:

echo get_state | nc -U /var/ccsync.unix

If output is "standby", run:

echo stop | nc -U /var/ccsync.unix
echo start_active | nc -U /var/ccsync.unix
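The two steps can be combined into one guarded script so the restart only happens when the daemon really reports standby. A sketch; the socket path and the get_state/stop/start_active commands are taken from the workaround above, not from a documented API:

```shell
#!/bin/sh
# Restart the controller sync daemon as active, but only if it currently
# reports "standby" on the local control socket.
fix_double_standby() {
    sock="${1:-/var/ccsync.unix}"
    state=$(echo get_state | nc -U "$sock")
    if [ "$state" = "standby" ]; then
        echo stop | nc -U "$sock"
        echo start_active | nc -U "$sock"
    fi
}
```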

Fix:
Fixed intermittent issue where controller sync could enter an erroneous double standby configuration.


1128765-1 : Data Mover lock-up causes major application traffic impact and tenant deploy failures

Links to More Info: BT1128765

Component: F5OS-C

Symptoms:
Major impact to BIG-IP tenant virtual server traffic. PoolMember health monitors fluctuate up and down, or remain down. LACP LAGs may go down.

Depending on which Data Mover (DM) is impacted, a subset of the BIG-IP tenant TMMs will no longer transmit packets. The LACP daemon will be unable to transmit its PDUs.

/var/F5/partition<n>/log/velos.log contains messages like these at the time the problem started:

  blade-1(p1) dma-agent[10]: priority="Alert" version=1.0 msgid=0x4201000000000129 msg="Health monitor detected DM Tx Action Completion ring hung." ATSE=0 DM=2 OQS=3.
  blade-1(p1) dma-agent[10]: priority="Info" version=1.0 msgid=0x4201000000000135 msg="Health monitor DM register dump requested.".
  blade-1(p1) dma-agent[10]: priority="Info" version=1.0 msgid=0x4201000000000137 msg="Health monitor DM register dump complete." FILE="agent-dump-1666310215.txt".


In the BIG-IP tenant, the tmctl sep_stats table shows high counts for tx_send_drops2 or tx_send_drops3 (over 10,000). In the output below, all of the TMMs with SEP devices on DM 2 are impacted, unable to transmit packets.


  # tmctl sep_stats --select=iface,dm,sep,atse_socket,tx_send_drops2,tx_send_drops3
  iface   dm  sep  atse_socket  tx_send_drops2  tx_send_drops3
  ------  --  ---  -----------  --------------  --------------
  1/0.1    2    0            0     1180470 <--       80068 <--
  1/0.10   2    9            0           0           33046 <--
  1/0.11   0   10            0           0               0
  1/0.2    0    1            0           0               0
  1/0.3    1    2            0           0               0
  1/0.4    2    3            0           0           33714 <--
  1/0.5    0    4            0           0               0
  1/0.6    1    5            0           0               0
  1/0.7    2    6            0           0           32980 <--
  1/0.8    0    7            0           0               0
  1/0.9    1    8            0           0               0

In the F5OS Partition CLI, the following command will show a high count of tx-action-ring-full drops. In the output below, DM 2 on blade-1 is impacted:

  default-1# show dma-states dma-state state dm-packets dm-packet * 2-3 tx-action-ring-full
                      TX ACTION
  NAME     DM  QOS  RING FULL
  --------------------------------
  blade-1   0   2   0
            0   3   0
            1   2   0
            1   3   0
            2   2   65890377811  <--
            2   3   328664822594 <--
  merged    0   2   0
            0   3   0
            1   2   0
            1   3   0
            2   2   65890377811  <--
            2   3   328664822594 <--


After encountering this, subsequent attempts to deploy a tenant may fail until the blade is recovered, since the locked-up Data Mover is unable to free the memory it is holding for the impacted tenants.

Conditions:
Although the exact conditions are unknown, the problem is more likely to occur when standard virtual servers are configured to mirror traffic to the peer BIG-IP.

While L7 connection mirroring increases the risk, it is not a necessary condition.

Impact:
Significant or total loss of application traffic for BIG-IP tenant instances running on the affected blade. This impact could also affect tenant instances on other blades if the LACP LAGs are marked down.

Subsequent attempts to launch a new tenant or to stop and then start an existing one may fail.

Workaround:
To recover a device, determine which blade is affected by looking at the blade name at the start of the following dma-agent log message in /var/F5/partition<n>/log/velos.log:

  blade-1(p1) dma-agent[10]: priority="Alert" version=1.0 msgid=0x4201000000000129 msg="Health monitor detected DM Tx Action Completion ring hung." ATSE=0 DM=2 OQS=3.
  ^^^^^^^

Then, reboot the blade. This will shut down all tenant instances on the blade. Once the blade boots up, the tenants should run and pass traffic normally.

If the blade cannot be rebooted immediately, it may be possible to mitigate the problem for a multi-slot tenant by disabling the impacted slot to steer traffic to the remaining slots that are still healthy:

  # An example of disabling BIG-IP tenant slot 1
  tmsh modify sys cluster default members { 1 { disabled } }

Reducing the use of connection mirroring, especially for standard virtual servers, should reduce the likelihood of encountering this issue.

Fix:
No fix exists yet.


1125505-1 : LOP communication may stop working on a system controller after failover

Links to More Info: BT1125505

Component: F5OS-C

Symptoms:
Various services may become unresponsive or not work correctly when communicating with the LOP.

Conditions:
This can happen rarely during failover of a system controller.

Impact:
Any functionality that interacts with the LOP could be impacted.

Workaround:
Reboot the affected system controller.

Fix:
Resolved a platform-hal LOP communication lockup that could occur due to a race condition.


1117561 : Vcc-terminal-server gets stuck in unsuccessful restart loop

Links to More Info: BT1117561

Component: F5OS-C

Symptoms:
Terminal services will fail to function and the vcc-terminal-server-main component container will appear to be stuck with a status of restarting.

Conditions:
When the vcc-terminal-server-main component container is restarted without first recreating the container from scratch, internal filesystem changes persist which prevent it from starting up successfully.

Impact:
This should generally not be seen in normal usage, although error conditions can trigger a basic container reset.

Workaround:
Restarting the system should perform a complete bringup of all component containers, including vcc-terminal-server-main, resulting in normal operation.

Fix:
This fix modifies the internal filesystem operations such that they do not prevent the container from initializing properly after a basic restart.


1116869 : Tcpdump on F5OS does not capture packets of certain sizes

Links to More Info: BT1116869

Component: F5OS-C

Symptoms:
When using tcpdump on the F5OS host, packets of certain sizes may not be captured via tcpdump.

Conditions:
Packets between 1484 and 1500 bytes in size, as well as several other size ranges, are affected by this issue.

Impact:
Tcpdumps may be incomplete.

Fix:
Packets of certain sizes are no longer dropped.


1114861 : Local controller etcd instance may become unreachable after a rolling upgrade.

Links to More Info: BT1114861

Component: F5OS-C

Symptoms:
The local etcd instance on a system controller may be unreachable after a rolling upgrade. This can be seen via the output of /etc/etcd/dump_etcd.sh script on the controller.

Conditions:
This can happen during a rolling upgrade.

Impact:
The local etcd instance will become unreachable. If the local instance on the active controller is unreachable, and the standby controller fails, etcd and the openshift cluster will go offline.

Workaround:
None

Fix:
The unreachable etcd instance will be restored once the rolling update is completed.


1113233 : External access for BIG-IP Next pods can be lost after tenant redeployment

Links to More Info: BT1113233

Component: F5OS-C

Symptoms:
Some of the pods for a BIG-IP Next tenant can lose external access after the tenant is redeployed.

Conditions:
Stale iptables rules assigned to the tenant are not being cleared after a tenant redeployment.

Impact:
The tenant loses external access on one or more of its pods.

Fix:
The iptables rules entries are cleared on a tenant redeploy.


1113225-1 : The tcam-mgr neuron client disconnects

Links to More Info: BT1113225

Component: F5OS-C

Symptoms:
A tenant's neuron client connection to the tcam-mgr is disconnected without any indication to the tenant, or continually tries to re-establish this connection.

Conditions:
A tenant has more than 512 virtual addresses, including a wildcard virtual address (one that spans a vast range of addresses). The tenant may then restart, or the tenant's wildcard range may not be protected as expected.

Impact:
The DDoS protection for the client when not using Fast L4 will not be properly established.

Workaround:
Configure fewer than 512 virtual addresses, or rely on software-only DDoS protection.


1112229-2 : File download API changes to support file download from the webUI

Component: F5OS-C

Symptoms:
The existing header handling is not sufficient to download files from the webUI.

Conditions:
X-Auth token is required to download from the webUI.

Impact:
Downloading files from the webUI fails.

Workaround:
None


1111549 : System import functionality is unstable if PXE install source is not imported

Links to More Info: BT1111549

Component: F5OS-C

Symptoms:
If a VELOS controller or rSeries appliance is PXE installed with a given ISO, and that ISO is not imported manually on the controller after the installation, future imports may fail or be left in an inconsistent state.

Conditions:
1. PXE install VELOS system controller or rSeries appliance
2. Fail to manually import ISO used for PXE install
3. Import other software

Impact:
Import and upgrade failures occur under conditions that would not be expected to cause problems, which can be confusing.

Workaround:
After PXE installing a VELOS controller, make sure to manually import the ISO used for PXE install before importing any other platform software components.

Fix:
Better handling for cases where the ISO that is used for PXE install of VELOS controllers is not imported after the install.


1111237 : Logrotate parameters do not get updated by software upgrade

Links to More Info: BT1111237

Component: F5OS-C

Symptoms:
If the parameters (frequency/size) for log file rotation are updated in a new software release, they are not updated on the target system during upgrade. The result is that the size of retained log messages depends on the upgrade history, not on the software version.

Conditions:
System that is live upgraded from any version to any other version prior to F5OS-C 1.5.0.

Impact:
When logfiles are collected by qkview, differing amounts of data may be gathered, perhaps omitting information that was intended to be collected.

Workaround:
None.

Fix:
The system updates the logrotate parameters during software install, so that the settings correspond to the software version, not the upgrade history.


1110429 : Duplicate service-instance entries in chassis partition

Links to More Info: BT1110429

Component: F5OS-C

Symptoms:
In rare circumstances, when viewing the partition service-instance entries, duplicate entries will exist for system level daemons like LACPD, L2FwdSvc, and SwRbcaster. The issue occurs rarely, and the user should only notice a cosmetic difference.

Conditions:
Adding blades to and removing blades from a partition may trigger the issue.

Impact:
Display is not correct.

Workaround:
Delete and recreate the affected partition.

Fix:
Duplicate service-instance entries will be removed in cases of a blade rebooting and a blade being added to a partition.


1109021-1 : CLI commands are not logged in audit.log

Links to More Info: BT1109021

Component: F5OS-C

Symptoms:
CLI commands from ConfD are not getting logged in audit.log.

Conditions:
Execute commands using the ConfD CLI.

Impact:
CLI commands that are required for security compliance audits will not be logged in the audit.log file.

Workaround:
None


1107433 : BIG-IP Next floating IPs do not properly issue GARPs on failover

Links to More Info: BT1107433

Component: F5OS-C

Symptoms:
The floating management IP of a BIG-IP Next high availability (HA) pair may not function correctly after an HA failover when running on F5OS-C 1.4.0.

Conditions:
When the BIG-IP Next active instance changes location, the mapping of IP to MAC address and switch ports needs to be updated in the network, or packets will continue to be routed to the "old" active instance.

Impact:
The floating management IP will not be usable for some period of time after failover. If a client is actively connected and sending packets, this condition can persist forever.

Workaround:
None.

Fix:
The high availability (HA) failover logic now correctly issues unsolicited ARP replies (GARP) in order to flush/update the IP address mappings.


1106881 : F5OS with an AFM license provisioned may provide incorrect AFM stats to a BIG-IP tenant

Links to More Info: BT1106881

Component: F5OS-C

Symptoms:
This is an intermittent problem where the affected BIG-IP tenant may receive incorrect statistics from the F5OS platform. This can cause the BIG-IP tenant to drop DNS traffic that should not be dropped.

Typically, the BIG-IP tenant will have periods of time where it receives the correct stats, and periods where it receives incorrect stats.

Conditions:
All of the below must be true:

-- Two or more BIG-IP tenants are deployed either on the same node in a partition or on the same appliance.
-- An AFM license is installed on the F5OS platform.
-- At least one tenant is receiving malformed DNS traffic.

Impact:
Clients that send DNS traffic to the affected BIG-IP tenant will not receive DNS responses when they should.

Workaround:
When AFM is provisioned for the system, deploying tenants on different nodes of a chassis-based system (or one tenant per appliance) avoids the issue.

Fix:
BIG-IP tenants receive the correct platform statistics regardless of the node in which they are deployed.


1106093 : After a node is removed from a running tenant, the operational status of that node remains

Links to More Info: BT1106093

Component: F5OS-C

Symptoms:
When a node is removed from the configuration of a running tenant, the node that was removed remains in the CLI operational status of the tenant.

Conditions:
-- A tenant spans multiple nodes and is running.
-- One of the nodes is removed.

Impact:
The operational status of the tenant for the node that was removed is incorrect.

Workaround:
None

Fix:
The correct operational status of the tenant is displayed.


1105001 : Large tar/gz/iso file download via the restconf API fails.

Links to More Info: BT1105001

Component: F5OS-C

Symptoms:
Downloading large tar/gz/iso files using the restconf API results in a corrupted file.

Conditions:
Large tar/gz/iso file download via the restconf API.

Impact:
Download fails, the downloaded file is corrupted.

Workaround:
None

Fix:
Fixed the code to download large tar/gz/iso files.


1104769 : After a node is removed from a tenant in provisioned mode, the operational status of that node remains

Links to More Info: BT1104769

Component: F5OS-C

Symptoms:
When a node is removed from the configuration of a tenant that is in provisioned mode, the node that was removed remains in the CLI operational status of the tenant.

Conditions:
-- A tenant spans multiple nodes and is in a state of Provisioned.
-- One of the nodes is removed.

Impact:
The operational status of the tenant for the node that was removed is incorrect.

Workaround:
None

Fix:
The correct operational status of the tenant is displayed.


1103105 : The naming convention of core files has changed

Links to More Info: BT1103105

Component: F5OS-C

Symptoms:
Core files were previously named <app>-1.core.gz or <app>-2.core.gz, where <app> was the first 12 characters of the failing executable. New core files are named as follows: core.<app>.<pid>.<timestamp>.core.gz.

Conditions:
A core file is created.

Impact:
The name of the core file is now different.

Workaround:
None

Fix:
The core files will now have a name that is consistent with other F5 formatting for core-files.


1102137 : Diagnostics ihealth upload qkview-file does not auto-complete with available qkview file names

Component: F5OS-C

Symptoms:
The ConfD command system diagnostics ihealth upload qkview-file is not tab-expandable, and you are not presented with the list of available qkview files.

Conditions:
Running "system diagnostics ihealth upload qkview-file <TAB>" to see the list of available qkview files.

Impact:
The available qkview files are not presented using tab autocomplete.

Workaround:
Run "system diagnostics qkview list" to obtain the list of available qkview files, and then manually type the desired qkview file name in when using the "system diagnostics ihealth upload qkview-file" command.
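Taken together, the workaround amounts to two CLI commands; the qkview file name placeholder must be replaced with a name from the list output:

```
system diagnostics qkview list
system diagnostics ihealth upload qkview-file <qkview-file-name>
```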

Fix:
Pressing <TAB> after system diagnostics ihealth upload qkview-file will produce a list of available files. Entering part of the name and <TAB> will auto-complete selecting a valid and available qkview file name.


1100861 : System aaa primary-key state not returning both hash and status

Links to More Info: BT1100861

Component: F5OS-C

Symptoms:
Requesting both hash and status by the query command "system aaa primary-key state" fails.

Conditions:
When no key-migration has been performed.

Impact:
Requesting the status fails.

Workaround:
Perform a key migration.

Fix:
The query of the system state now succeeds without failure, returning "NONE" if no key migration is known to the system.


1097833-1 : Debug messages logged in platform.log

Links to More Info: BT1097833

Component: F5OS-C

Symptoms:
When performing an ISO install on the hardware, some services log debug messages to platform.log until ConfD comes up.

Conditions:
This occurs during an ISO install.

Impact:
Unnecessary debug logs are logged to platform.log.

Workaround:
None


1095977 : Tenant disk image not removed from blade on scale-down of tenant deployment

Links to More Info: BT1095977

Component: F5OS-C

Symptoms:
If a tenant is scaled down to run on fewer blades, the tenant disk image will be left behind on the blade that the tenant was scaled down from.

Conditions:
BIG-IP tenant image running on one or more blades where one of the running blades is removed from the tenant configuration.

Impact:
The tenant image file will be left on the blade the tenant was scaled down from:

/var/F5/partition<#>/cbip-disks/<tenant_name>.raw

Workaround:
The tenant disk image can be removed directly from the blade via the shell using the rm command. This can be done by the chassis administrator.
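As a sketch, the cleanup can be wrapped in a small helper; the base-directory argument exists only so the logic can be exercised outside a real blade, and the partition number and tenant name must match the path shown above:

```shell
#!/bin/sh
# Remove a leftover tenant disk image from a blade after scale-down.
remove_tenant_disk() {
    part="$1"              # partition number
    tenant="$2"            # tenant name
    base="${3:-/var/F5}"   # override is for illustration/testing only
    img="$base/partition$part/cbip-disks/$tenant.raw"
    [ -f "$img" ] && rm "$img"
}
```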

Fix:
The bug has been fixed so that the tenant disk image is removed when the tenant deployment is scaled down.


1093301 : GUI keeps showing "Firmware updates are currently in progress" after upgrade

Links to More Info: BT1093301

Component: F5OS-C

Symptoms:
- GUI keeps showing "Firmware updates are currently in progress"
- Both system controllers report that they are active
- Chassis firmware updates can fail

Conditions:
The active system controller is updating its AOM, and while that update is occurring the peer system controller assumes the active role. When the initially active system controller's AOM boots again it will not report that the arbitration state has changed.

Impact:
There seems to be no operational impact.

Workaround:
Reset the initially active system controller.

Fix:
The GUI no longer continuously shows "Firmware updates are currently in progress" after an upgrade.


1092913 : Tenant CPU pinning can fail when a blade is moved to a new partition and the blade was previously running a deployed tenant

Links to More Info: BT1092913

Component: F5OS-C

Symptoms:
Newly deployed tenants on a blade just moved to a new partition will not perform optimally, and the ConfD tenant status may show that the CPU allocations to the tenant failed.

Conditions:
A blade is running a deployed tenant

The blade is moved into a new partition

   or

The blade tenant is deleted, and less than about 2 minutes after that, the blade is moved to a new partition

Impact:
It's possible one or more new tenants deployed on the blade in the new partition will fail to be assigned to appropriate cpus. This could affect the performance of those tenants.

Workaround:
The main issue is that, before the blade moved to the new partition, it never had the chance to release the CPUs assigned to any tenants that had been deployed on it.

First, try to avoid the problem entirely by ensuring that tenants are not deployed on a blade before moving it to a new partition. Changing a tenant's deployed state to any other state (provisioned, configured, deleted) is not synchronous, so wait at least 2 minutes before moving the blade to a new partition.

If avoidance fails, it is still possible to clean up the tenant CPU allocator database. The simplest steps are:

1. If any tenants are deployed on the blade in the new partition, set them to provisioned, and wait more than 2 minutes.
2. Manually remove the tenant cpu database file on the blade: "rm /opt/f5/cpumgr/cpu_users"
3. Reboot the blade (this will recreate the above file, but with all tenant records cleared)
4. Redeploy any tenants from step 1.


1092257 : Downloading files larger than 500 megabytes via File Utilities in the webUI can result in a corrupted file.

Links to More Info: BT1092257

Component: F5OS-C

Symptoms:
When using the File Utilities feature in the webUI, if you select a file that is larger than 500MB and attempt to download it locally the file could become corrupted.

Conditions:
Selecting and attempting to download a file 500MB or greater when using File Utilities in the webUI.

Impact:
The downloaded file is corrupted.

Workaround:
Files 500 megabytes or larger can be selected and exported from the device using the "Export" option in File Utilities, which exports the file over HTTPS. Additionally, files can be exported from the device using the Secure Copy Protocol (SCP).

Fix:
Downloading files that are 500 megabytes or larger in size has been temporarily disabled in the webUI. You will receive a warning popup when you select a file that is 500MB or larger and click the Download button. The warning popup advises you to use another supported option to export the file.


1091933 : After a partition upgrade, tenant health and traffic statistics are no longer presented

Links to More Info: BT1091933

Component: F5OS-C

Symptoms:
Health and traffic statistics for a tenant are not available after a partition upgrade.

Conditions:
This occurs after upgrading a partition.

Impact:
You are unable to see updated health and traffic statistics for the tenant.

Workaround:
The tenant running-state can be toggled from Running to Configured and then back to Running.


1091641 : NTP (chrony) packet authentication is not fully implemented on VELOS

Links to More Info: BT1091641

Component: F5OS-C

Symptoms:
It is not possible to enable NTP packet authentication.

Conditions:
Running a version of F5OS-C earlier than 1.5.0.

Impact:
NTP packet authentication is not available.

Workaround:
None

Fix:
Added support for NTP packet authentication.


1091537-1 : CVE-2022-23943 mod_sed: Read/write beyond bounds

Component: F5OS-C

Symptoms:
Out-of-bounds Write vulnerability in mod_sed of Apache HTTP Server allows an attacker to overwrite heap memory with possibly attacker provided data.

This issue affects Apache HTTP Server 2.4 version 2.4.52 and prior versions.

Acknowledgements: Ronald Crane (Zippenhop LLC)

Conditions:
F5 does not believe that F5OS is vulnerable in a standard/supported/recommended configuration.

Impact:
N/A

Fix:
Upgraded to an unaffected version of apache.


1091313 : Rapid transition of BIG-IP Next tenant from Standby-to-Active-to-Standby can leave floating addresses active

Links to More Info: BT1091313

Component: F5OS-C

Symptoms:
A rapid transition of a BIG-IP Next tenant from Standby-to-Active-to-Standby can leave the floating IP address active even though the tenant is in Standby.

Conditions:
This happens only if the transition to Active and back to Standby happens in under 40 seconds or so.

Impact:
The high availability (HA) floating address remains active on both instances of the BIG-IP Next tenant, causing high availability (HA) issues with the tenant.

Workaround:
If this condition occurs, effect a failover on the BIG-IP Next tenant.

This causes the address to be removed on the previously Active instance.

If failover is unsuccessful, wait for 60 seconds and issue one more failover.


1090521-1 : Tenant deployment may fail if the memory configured is an odd number.

Links to More Info: BT1090521

Component: F5OS-C

Symptoms:
1. Tenant deployment fails.
2. The system may go into a bad state.

Conditions:
When memory configured for a tenant is set to an odd number.

Impact:
Tenant deployment fails.

Workaround:
This issue has been fixed in F5OS-A 1.2.0.


1090145 : VLAN-Listener incorrectly updated on Network Manager component restart

Links to More Info: BT1090145

Component: F5OS-C

Symptoms:
When the Network Manager component is restarted, VLAN Listener entries can be incorrectly updated to each tenant's default Service ID.

Conditions:
Network Manager restarts can happen due to system controller restarts, partition upgrades, or a manual restart.

Impact:
Some traffic could incorrectly follow the default Port Hash disaggregation algorithm. For example, if a VLAN has been set to use the IPPORT disaggregation algorithm, this reset can cause some of the traffic to revert to using the default Port Hash algorithm.

Workaround:
Inside the affected tenants, the cmp-hash field can be changed back to default, then changed back to the desired setting.
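In tmsh inside an affected tenant, the toggle might look like the following; "external" and src-ip are placeholder examples for the VLAN name and the desired non-default setting, not values taken from this entry:

```
tmsh modify net vlan external cmp-hash default
tmsh modify net vlan external cmp-hash src-ip
```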


1089037-1 : Dnsmasq configuration blocks resolution of names in .local domains

Links to More Info: BT1089037

Component: F5OS-C

Symptoms:
DNS resolution of names in .local domains will be blocked by the dnsmasq configuration.

Conditions:
Some domain names are in the .local domain.

Impact:
Name resolution of hostnames in the .local domain will not work correctly.

Workaround:
None

Fix:
The dnsmasq configuration has been updated to remove the overly restrictive local=/local/ entry.


1088565 : Various services may stop working on a system controller if the LCD is malfunctioning

Links to More Info: BT1088565

Component: F5OS-C

Symptoms:
Various services may become unresponsive or not work correctly.

Conditions:
LCD is not working or host cannot communicate with the LCD.

Impact:
Any functionality that interacts with platform-hal could be impacted.

Workaround:
Recover or repair the LCD. Rebooting the affected system controller can also help temporarily.

Fix:
Fixed a leak that occurs when platform-hal cannot communicate with the LCD.


1085005 : 'cluster nodes node blade-N reboot' failure message is incorrect

Links to More Info: BT1085005

Component: F5OS-C

Symptoms:
Attempting to reboot a blade that is not assigned to the partition generates a cryptic error message:

default-2(config)# cluster nodes node blade-2 reboot
Error: Node is Not Assigned false

Conditions:
Attempt to reboot a blade that is not assigned to the partition.

Impact:
Inaccurate information provided to the user.

Workaround:
None

Fix:
Fixed the error message:

default-1(config)# cluster nodes node blade-2 reboot
Error: Node is not assigned to this partition.


1084817-2 : Container api-svc-gateway crashes due to certificate issues in the partition database

Links to More Info: BT1084817

Component: F5OS-C

Symptoms:
The api-svc-gateway container crashes when a bad self-signed certificate or key is published to the partition database.

Conditions:
A corrupted certificate/key causes the issue.

Impact:
The api-svc-gateway service crashes.

Workaround:
Run the following command:

(config) # system database reset-to-default proceed

Fix:
When this scenario occurs, api-svc-gateway now:

 * detects when it cannot set up an SSL connection using these credentials
 * logs an error
 * sets health status to unhealthy with appropriate error and severity
 * tries to start a GRPC server with only insecure credentials


1084581 : Log files collected by QKView are truncated with the newest entries removed

Links to More Info: BT1084581

Component: F5OS-C

Symptoms:
If log files are exceedingly large, they may be truncated when collected by QKView from the 'bottom-up', meaning that the most recent log entries are clipped.

Conditions:
Log files exceed the maximum file size (default 500 MB) specified during QKView creation.

Impact:
Most recent log entries are clipped, making diagnosis difficult.

Workaround:
Collect the log files manually.

Fix:
QKView log files are now truncated 'top-down', preserving the most recent log entries.
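The behavior change can be illustrated with standard tools (this is not F5OS code, just an analogy for the two truncation directions):

```shell
# Truncating a 20-byte log to a 10-byte limit:
# 'head -c' keeps the oldest bytes (old behavior: newest entries clipped);
# 'tail -c' keeps the newest bytes (fixed behavior: oldest entries clipped).
printf 'old-entry\nnew-entry\n' > /tmp/example.log
head -c 10 /tmp/example.log   # bottom-up truncation keeps "old-entry"
tail -c 10 /tmp/example.log   # top-down truncation keeps "new-entry"
```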


1083993 : File import should check that the target doesn't exist

Links to More Info: BT1083993

Component: F5OS-C

Symptoms:
File import will fail if the same file name already exists.

Conditions:
Importing a file that already exists on the file system.

Impact:
An error occurs if the file already exists.

Workaround:
None


1081333 : Local file path in file transfer-status for remote file import operation does not show appropriately

Links to More Info: BT1081333

Component: F5OS-C

Symptoms:
The local file path does not display correctly when a file import operation is done from a remote URL.

Conditions:
File transfer from a remote URL with query params.

Impact:
An incorrect file name is shown in the local file path.

Fix:
Modified the utils-agent code to strip unnecessary query parameters from the file name.


1080421-2 : LACP does not transmit PDU's when creating a LAG

Links to More Info: BT1080421

Component: F5OS-C

Symptoms:
LAG interface creation does not succeed, and the tx packet count in 'show lacp' is zero.

Conditions:
This issue occurs due to a race condition while creating a LAG interface and is not reproducible every time.

Impact:
Link aggregation of the front panel ports will not work as expected.

Workaround:
1) clear newly added lag configurations
   a) remove lacp interface
      no lacp interfaces interface <lag-name>
   b) remove interfaces from lag
      no interfaces interface <interface> ethernet config aggregate-id
   c) remove lag interface
      no interfaces interface <lag-interface>
2) create Lag interface and add interfaces to the lag
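Concretely, for a LAG named lag1 with member interface 2/1.0 (both names are placeholders), the recovery sequence above might look like this in the partition CLI:

```shell
default-1(config)# no lacp interfaces interface lag1 ;commit
default-1(config)# no interfaces interface 2/1.0 ethernet config aggregate-id ;commit
default-1(config)# no interfaces interface lag1 ;commit
default-1(config)# interfaces interface lag1 config type ieee8023adLag ;commit
default-1(config)# interfaces interface lag1 aggregation config lag-type LACP ;commit
default-1(config)# interfaces interface 2/1.0 ethernet config aggregate-id lag1 ;commit
```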

Fix:
Fixed the race condition so that the lag-type is correctly read as LACP.


1080417-1 : List of running containers is not captured in host QKView

Links to More Info: BT1080417

Component: F5OS-C

Symptoms:
The list of running containers is not captured in the host QKView.

Conditions:
Collect a QKView and look for the list of containers running on the system in the QKView file.

Impact:
Unable to get the list of running containers from the QKView.

Workaround:
Administrator needs to run 'docker ps' command on the system and share the output with support.

Fix:
QKView now includes the list of running containers on the system.


1079857 : Orchestration-agent logs spurious "Warning" severity messages

Links to More Info: BT1079857

Component: F5OS-C

Symptoms:
Orchestration-agent can issue warning-level log messages about "unknown tags" in velos.log, similar to this:

2022-02-09T19:57:07.444093+00:00 controller-1(p1) orchestration-agent[1]: priority="Warn" version=1.0 msgid=0x503000000000003 msg="unknown tag in operation" TAG=1013977200 OP=4.

Conditions:
Schema changes can result in new items being added that the code does not care about, but can result in log messages.

Impact:
No functional impact; these warnings can be safely ignored.

Workaround:
None

Fix:
The orchestration-agent no longer logs warnings about unknown tags.


1079809 : Alert manager status is occasionally reported as unhealthy on startup

Links to More Info: BT1079809

Component: F5OS-C

Symptoms:
When the alert manager is reported as unhealthy, system controller health is also reported as unhealthy, which causes a failover.

Conditions:
On startup, if any one of the analog sensors reports a sensor fault while its initial state is being read, the alert manager goes unhealthy.

Impact:
A controller health fault occurs, which causes a failover.

Workaround:
None

Fix:
Added diagnostic tasks to monitor the health of sensors, VFC and VPC periodically and raise alerts.


1079037 : Tenant deployment fails when tenant name ends with hyphen

Links to More Info: BT1079037

Component: F5OS-C

Symptoms:
A tenant fails to deploy and reports "Tenant deployment failed - Server is not responding" if the user-configured tenant name ends with a hyphen.

Conditions:
Tenant name ends with a hyphen.

Impact:
Tenant will not deploy.

Workaround:
Delete the tenant and recreate it with a valid name, consisting only of alphanumeric characters with embedded hyphens.

Fix:
The configuration validation code rejects attempts to create a tenant with an invalid name.
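The naming rule (alphanumeric characters with embedded hyphens, so no leading or trailing hyphen) can be pre-checked before creating a tenant; this helper and its regex are illustrative, not part of F5OS:

```shell
# Hypothetical pre-check: tenant name must start and end with an
# alphanumeric character, with hyphens allowed only in between.
valid_tenant_name() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$'
}

valid_tenant_name "bigip-tenant1" && echo "bigip-tenant1 is valid"
valid_tenant_name "tenant1-"      || echo "tenant1- is invalid"
```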


1078433-1 : BIG-IP Next Tenant Management IP address is not reachable after a chassis power cycle.

Links to More Info: BT1078433

Component: F5OS-C

Symptoms:
After a chassis is restarted during a system reboot or power cycle, the tenant cannot be reached using the management IP address.

Conditions:
Occurs primarily after a chassis power cycle or system reboot.

Impact:
Tenant admins will not be able to perform management operations such as network creation or application configuration.

Workaround:
If the tenant is recovered within a certain time period, toggling its running state from the partition fixes the issue.

CLI:

tenants tenant <name> config running-state configured; commit; exit
tenants tenant <name> config running-state deployed; commit; exit


1076705-1 : Etcd instance might not start correctly after upgrade

Links to More Info: BT1076705

Component: F5OS-C

Symptoms:
/etc/etcd/dump_etcd.sh might show that the etcd instance native to system controller #1 or #2 does not come up after an upgrade.

This displays in the output of /etc/etcd/dump_etcd.sh and might occur for the .3.51 or .3.52 node:

failed to check the health of member 25fa6669d235caa6 on https://100.65.3.52:2379: Get https://100.65.3.52:2379/health: dial tcp 100.65.3.52:2379: connect: connection refused
member 25fa6669d235caa6 is unreachable: [https://100.65.3.52:2379] are all unreachable

This can cause a longer OpenShift outage if the system controller containing the healthy instance is rebooted, and complete outage if the system controller containing the healthy instance is lost.

Conditions:
This is caused by a previous mount failure of the drbd file system, which causes a corruption of the etcd instance on the standby system controller. This is seen very infrequently.

Impact:
The local etcd instance on the affected system controller will not work correctly, compromising the high availability (HA) of the OpenShift cluster. The cluster will continue to work correctly while both system controllers are up.

Workaround:
Rebuild the OpenShift cluster by running "touch /var/omd/CLUSTER_REINSTALL" from the CLI on the active system controller. This will cause all running tenants to be taken down during the cluster reinstall, which takes 50+ minutes.

Once the cluster rebuild is complete, all chassis partitions should be disabled and re-enabled, and all tenants should be cycled to provisioned and back to deployed to ensure they have restarted correctly after the cluster rebuild.

Fix:
This is fixed in F5OS-C 1.4.0 and later.


1075693 : CVE-2021-22543 Linux Kernel Vulnerability

Links to More Info: K01217337


1073777 : LACP interface goes down after the partition is disabled/enabled

Links to More Info: BT1073777

Component: F5OS-C

Symptoms:
If LACP interfaces are configured and used by the tenant, the LACP interface may not work properly after the partition is disabled/enabled, due to an incorrect interface LACP state caused by the disable/enable process.

Symptom: the interface lacp_state is LACP_DEFAULTED.
For example: interface 2/1.0 is a member of an LACP interface and works fine before the partition is disabled/enabled. After the partition is disabled/enabled, running the "show interfaces interface 2/1.0 state" command shows lacp_state LACP_DEFAULTED.

The LACP_DEFAULTED state is incorrect and causes VLAN listeners to be missing on the blade, which in turn causes tenant interface failure.

Conditions:
1. LACP interface is used.
2. LACP interface is configured correctly and up before the partition is disabled/enabled.
3. Disable/enable the partition.

Impact:
Tenant will not be able to send/receive user traffic and it will not recover automatically.

Workaround:
Remove the ethernet interface from the aggregation and re-attach it to the aggregation. This will recover the interface state and resolve the problem.
Example: commands to recover the interface (interface 2/1.0 belongs to the LACP interface lag1 and 2/1.0 has a wrong lacp_state).
1. In the partition CLI, remove the aggregate-id for the interface.

Entering configuration mode terminal
default-1(config)# no interfaces interface 2/1.0 ethernet config aggregate-id ;commit
Commit complete.
default-1(config)#

2. Re-add the aggregate-id.
default-1(config)# interfaces interface 2/1.0 ethernet config aggregate-id lag1 ;commit

3. Reboot the blade (optional step -- use only if step 1 and 2 did not fix the problem).


1073581 : Removing a 'patch' version of services might remove the associated 'base' version as well

Links to More Info: BT1073581

Component: F5OS-C

Symptoms:
Removing a 'patch' version (X.Y.Z, Z>0) of a platform ISO or services might, under certain conditions, lead to the unexpected removal of the 'base' version (X.Y.0) associated with that patch.

Conditions:
1. A 'patch' ISO is imported when the 'base' associated with the patch is not already imported (example: An F5OS-C 1.2.2 ISO is imported, and F5OS-C1.2.0 is not already imported).
2. Some time later, the F5OS-C 1.2.2 ISO is removed. This also removes the 1.2.0 services.

Impact:
F5OS-C removes software that wasn't explicitly chosen to be removed.

Workaround:
To work around this issue, import the 'base' version ISO (X.Y.0) before importing any patches. If this is done, removal of a 'patch' will not remove the 'base'. If a 'base' was already removed accidentally, re-importing the 'base' ISO will also make it available again.

Fix:
N/A


1073305-1 : Upgrade to F5OS-C 1.3.0 failed to upgrade chassis partition

Links to More Info: BT1073305

Component: F5OS-C

Symptoms:
Upgrading VELOS from F5OS-C 1.2.2 to 1.3.0 can cause partition containers to go into a CrashLoopBackOff state. This can be checked by running this command:

oc get pods --all-namespaces |grep -i crash

Conditions:
After upgrading to F5OS-C 1.3.0, tenant datapath interfaces do not come up.

Impact:
Traffic is impacted.

Workaround:
Restarting the chassis partition (disabling and then re-enabling it) fixes the issue.

Fix:
N/A


1072209 : Packets are dropped on VELOS when a masquerade MAC is on a shared VLAN

Links to More Info: BT1072209

Component: F5OS-C

Symptoms:
On the VELOS platform, any packets destined to a masquerade MAC address are dropped when the masquerade MAC is located on a shared VLAN (a VLAN shared between multiple F5OS tenants).

On rSeries hardware platforms, all traffic for this MAC is first handled by the software-rebroadcaster and is replicated to all tenants sharing that VLAN.

Conditions:
-- A masquerade MAC is configured on a shared VLAN.
-- Traffic to the MAC address is initiated, for example, a ping to a floating self-IP.
-- The packets are dropped on ingress.

Impact:
Connectivity issues.

Workaround:
Configure a static FDB entry at the partition level.

Fix:
Packets are no longer dropped when a masquerade MAC is on a shared VLAN.


1069917 : Platform registry ports can become de-synchronized, impacting Openshift deployment

Links to More Info: BT1069917

Component: F5OS-C

Symptoms:
Under certain circumstances, platform services on the active and standby controllers can end up residing in docker registries that use different port numbers. This can result in Openshift pods failing to start, because they expect these port assignments to be synchronized.

Conditions:
Varied causes, which lead to inconsistent registry port assignments on active and standby controllers.

Impact:
Openshift pods cannot start.

Fix:
Added more checks to ensure platform registry ports are synchronized between controllers.


1067077 : NSS: Memory corruption in decodeECorDsaSignature with DSA

Links to More Info: K54450124, BT1067077


1066185 : MIB files cannot be downloaded or exported using file utilities.

Component: F5OS-C

Symptoms:
The MIB directory is not available for download or export via file utilities.

Conditions:
-- VELOS chassis or rSeries appliance
-- You would like to download the MIB file(s) via the file utilities API

Impact:
You are unable to download the MIBs or export them.

Workaround:
None


1064225 : HTTP response status codes do not match the result of the file import/export operations

Links to More Info: BT1064225

Component: F5OS-C

Symptoms:
HTTP response status codes do not match the result of the file import/export operations.

Conditions:
When a disallowed path is given in the file import CLI.

Impact:
User experience

Workaround:
None

Fix:
Proper HTTP status codes are now returned in error scenarios as well.


1060097 : Chassis admin user cannot use SCP to import controller/chassis partition ISO images

Component: F5OS-C

Symptoms:
The controller admin user cannot copy ISO images to the chassis for import.

Conditions:
SCP is not allowed for the admin user.

Impact:
The user must either use root shell access or a different upload protocol, which may not be convenient depending on the user's network configuration.

Workaround:
None

Fix:
The controller admin user can now use the SCP command on a client machine to upload controller or chassis partition ISOs to the chassis active or floating management IP by specifying a target directory of "IMAGES/", in the same fashion that the partition admin user imports tenant images.

scp F5OS-C-1.5.0-4444.PARTITION.DEV.iso admin@activeccip:IMAGES/


1059073 : QKView upload to iHealth is not supported through a web proxy

Component: F5OS-C

Symptoms:
The upload to iHealth feature for sending QKView data to the external F5 service is not supported through a web proxy. This means that QKView files must be copied to a local computer on your intranet before they can be uploaded.

Conditions:
Uploading a QKView file to the iHealth service.

Impact:
Several steps are required to move QKView files around before uploading to iHealth, which is inconvenient.

Workaround:
Copy the QKView file to a computer on the intranet.

Then you can upload the QKView file to the iHealth service from that secondary computer when it has internet access.

Fix:
Uploading QKView files to iHealth through a web proxy is now supported.


1053873 : Partition CLI command 'show cluster nodes node' may time out

Links to More Info: BT1053873

Component: F5OS-C

Symptoms:
Running the command 'show cluster nodes node' in the chassis partition CLI may cause a time-out.

Conditions:
An internal problem in the CLI design can cause the CLI command to fail and time out.

Impact:
User is unable to access the output of this command.

Fix:
CLI command infrastructure was redesigned.


1051241 : LAG and interface names should not contain special characters such as whitespace, asterisk, slashes, and curly braces.

Links to More Info: BT1051241

Component: F5OS-C

Symptoms:
-- LAGs (trunks) not functional in system.
-- LACPD daemon in a restart loop
-- The "LAGs" and "Interfaces" pages in the F5OS partition GUI fail to load and report "Something went wrong. Check the web browser console for more details or contact technical support for assistance."

Conditions:
-- A LAG is defined that has a space in the name.

When a user creates a LAG interface whose name contains whitespace characters, the LACPD daemon fails to read the interface information and keeps restarting until the interface is removed or renamed to a valid name.

Impact:
All trunks in the partition are down.

Workaround:
Use the F5OS partition CLI to remove the LAG, for example:

ottersPart-1(config)# no interfaces interface "test lag"
ottersPart-1(config)# no interfaces interface 2/2.0 ethernet config aggregate-id
ottersPart-1(config)# commit and-quit
Commit complete.
ottersPart-1#

Fix:
N/A


1049737-1 : F5OS: Some members in LACP trunks may not stand up

Links to More Info: BT1049737

Component: F5OS-C

Symptoms:
When configuring an LACP trunk (aggregate link), if the trunk has interfaces on multiple blades, some members of the trunk may not join the trunk, and the peer layer-2 switch may produce warnings stating that the LACP members are not all on the same remote device.

In addition, after enabling debug logging for the lacpd daemon, messages will be seen from both blades that indicate the value of "actor_oper_key". These values should be the same for all the ports within the same LACP trunk, but in this situation, the debug output may show different values for ports on different blades.

Conditions:
- VELOS chassis
- LACP trunk with member interfaces on multiple blades

Impact:
One or more ports in the LACP trunk (aggregate link) will not be able to join the trunk.

Workaround:
Restart the LACPD container on each blade in the affected partition.

For example, if the partition consists of slots 1 and 2, log in as root to the controller, and run the following command:

   for i in 1 2; do ssh blade-$i docker restart lacpd ; done


1043701 : Allow configurable partition hostname

Component: F5OS-C

Symptoms:
The partition name is set by the chassis administrator at creation time and cannot be changed. Every chassis starts with a partition named "default", so it can be difficult to determine what partition the user is connected to.

Conditions:
Users who have multiple partitions on multiple chassis, and need to easily tell them apart.

Impact:
User confusion, possible unintended configuration changes when connected to the wrong partition.

Workaround:
Create all partitions with unique names, and remove/do not use the "default" partition.

Fix:
The partition now has a configurable hostname under /system/config/hostname. This hostname accepts either a simple label or a fully qualified domain name, following DNS name rules.

The first label is used in the partition prompt in preference to the partition name.

The partition hostname can be set by the chassis administrator, or by partition administrator if not configured at the chassis level.
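For example, from the CLI (the hostname value is a placeholder):

```shell
default-1(config)# system config hostname part1.example.com
default-1(config)# commit
```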

Behavior Change:
Partition hostname can now be configured, and is used in the partition CLI prompt if present.


1040461 : Permissions of some QKView control files do not follow standards

Links to More Info: BT1040461

Component: F5OS-C

Symptoms:
Permissions of some QKView control files do not follow standards.

Conditions:
Viewing permissions of QKView files.

Impact:
Some QKView control file permissions do not follow standards.

Workaround:
None

Fix:
Permissions of all QKView control files now follow the standards.


1039721 : The system image displays data for only one VELOS controller

Links to More Info: BT1039721

Component: F5OS-C

Symptoms:
The 'show system image' command displays image details for only one VELOS controller, even though there is more than one.

Conditions:
Running the 'show system image' command after uploading VELOS software.

Impact:
No image information is displayed for a VELOS controller on the user interface.

Workaround:
Reboot the VELOS controllers (specifically vcc-partition-software-manager container).

Fix:
This issue has been fixed.


1034093 : protobuf vulnerability: CVE-2021-3121

Component: F5OS-C

Symptoms:
A flaw was found in github.com/gogo/protobuf before 1.3.2 that allows an out-of-bounds access when unmarshalling certain protobuf objects.

Conditions:
- Unmarshalling protobuf objects

Impact:
This flaw allows a remote attacker to send crafted protobuf messages, causing panic and resulting in a denial of service. The highest threat from this vulnerability is to availability.

Workaround:
N/A

Fix:
Protobuf updated to mitigate CVE-2021-3121



Known Issues in F5OS-C v1.5.x


F5OS-C Issues

ID Number Severity Links to More Info Description
1162233-1 1-Blocking   Mixed front panel port speed configurations are unsupported on F5OS-C v1.5.0
1211465-1 2-Critical BT1211465 Partition openshift tokens may go invalid, causing tenants to not start after configuration or reboot
1210073-2 2-Critical BT1210073 Observing "Building LLDP PDU Failed!" error messages in partition's VELOS log continuously
1207537 2-Critical BT1207537 Chassis partition ConfD may fail to start completely during controller rolling upgrade
1200665 2-Critical   During an upgrade from 1.3 to 1.5.1, a core file may be created from the diag-agent
1196813-1 2-Critical   Adding or removing nodes from a running BIG-IP tenant instance can cause data plane and management IP access issues
1195417-1 2-Critical   BIG-IP Next tenant configuration defaults to 15GB virtual disk size upon creation
1134105-1 2-Critical   BIG-IP tenants might not start correctly after upgrade from 1.2.1 to 1.5.0
1133985-1 2-Critical BT1133985 Controller upgrade from 1.4.0 to 1.5.0 caused the partition to go into failed state.
1081281-1 2-Critical   Multi-node BIG-IP tenants may fail to cluster after rolling upgrade
1210885 3-Major   Core file generated on blade by lldpd pod during an upgrade from 1.3.2 to 1.5.1
1209749-1 3-Major BT1209749 Core file generated for cc-lacpd.vcc-lacpd
1209669-1 3-Major   BIG-IP Next fails to come up intermittently upon system power cycle/reboot
1167821-1 3-Major BT1167821 Tcpdump may not capture large packets
1154789-1 3-Major   Unexpected flow type logs
1134157-1 3-Major   When upgrading VELOS from v1.4.0 to v1.5.0, core file dagd is generated
1134117 3-Major   BIG-IP tenant libvirtd process can generate a core file when being shut down
1124061 3-Major   Tenant pods stuck in terminating stage for longer duration than necessary
1102765-1 3-Major   Blade is not in the Ready status in the cluster
1100713 3-Major   After a partition upgrade, a tenant in Provisioned state may show inconsistent CLI status
1084785 3-Major BT1084785 etcd database may be corrupted on upgrade to 1.3.1 release.
1065641 3-Major BT1065641 File import/export operation performed on disallowed paths is not shown in file transfer status
1190985-1 4-Minor   WebUI server error when opening entry for added NTP server created with FQDN

 

Known Issue details for F5OS-C v1.5.x

1211465-1 : Partition openshift tokens may go invalid, causing tenants to not start after configuration or reboot

Links to More Info: BT1211465

Component: F5OS-C

Symptoms:
Tenants not coming up correctly after upgrade or blade reboot.

The tenants will be stuck in ContainerCreating in the "oc get pods --all-namespaces" output:

partition-2 virt-launcher-velos1-cf-gslb-2-gksrp 0/1 ContainerCreating 0 46m <none> blade-2.chassis.local <none>
partition-2 virt-launcher-velos1-cf-rprxy1-1-jcw9k 0/1 ContainerCreating 0 46m <none> blade-1.chassis.local <none>
partition-2 virt-launcher-velos1-cf-rprxy2-2-gl7kw 0/1 ContainerCreating 0 46m <none> blade-2.chassis.local <none>
partition-2 virt-launcher-velos1-cloud-rprxy1-1-kwg4b 0/1 ContainerCreating 0 46m <none> blade-1.chassis.local <none>

If this condition is hit, the token can be validated from the system controller (CC) shell with the following command:

oc get pods -n partition-<#> --token="`cat /tmp/omd/tokens/partition-<#>/tokens/partition-<#>-saToken`"

e.g.

[root@controller-1 ~]# oc get pods -n partition-6 --token="`cat /tmp/omd/tokens/partition-6/tokens/partition-6-saToken`"
NAME READY STATUS RESTARTS AGE
lldpd-6d4458d967-xfs7d 0/1 Pending 0 7m
stpd-6f844d8d65-wf6s8 0/1 Pending 0 7m
tmstat-rsync-65c9cfb8b9-m2j7j 0/1 Pending 0 7m
[root@controller-1 ~]#

If the token is bad, an error is returned:

[root@controller-1 ~]# oc get pods -n partition-3 --token="`cat /tmp/omd/tokens/partition-3/tokens/partition-3-saToken`"
No resources found.
error: You must be logged in to the server (Unauthorized)
[root@controller-1 ~]#

Conditions:
This is related to deleting and re-creating partitions, and then upgrading or rebooting blades, but does not happen every time. There may be other conditions that can cause this.

Impact:
Tenants will not start correctly, causing an outage.

Workaround:
The workaround is to remove the token files from the /tmp/omd/tokens/partition-<#>/tokens directory.

e.g., rm /tmp/omd/tokens/partition-1/tokens/partition-1-saToken

orchestration-manager will then regenerate the token file with the correct partition token.


1210885 : Core file generated on blade by lldpd pod during an upgrade from 1.3.2 to 1.5.1

Component: F5OS-C

Symptoms:
A core file can be generated by the lldpd pod during the upgrade process from 1.3.2 to 1.5.1.

Conditions:
Upgrade of F5OS-C from 1.3.2 to 1.5.1

Impact:
A Linux core file is generated by the lldpd pod. The pod will automatically restart. No functional impact is visible.

Workaround:
No workaround; the pod will restart automatically and function normally.


1210073-2 : Observing "Building LLDP PDU Failed!" error messages in partition's VELOS log continuously

Links to More Info: BT1210073

Component: F5OS-C

Symptoms:
Observing "Building LLDP PDU Failed!" error messages in partition's VELOS log continuously.

[root@controller-2 IMAGES]# tail -f /var/F5/partition2/log/velos.log |grep Err
2022-12-15T16:22:07.718412+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:22:08.719292+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:22:37.733326+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:22:38.733361+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:23:07.751908+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:23:08.751963+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:23:37.770928+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:23:38.771927+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:24:07.795989+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".
2022-12-15T16:24:08.796913+00:00 100.65.18.3 blade-3(p2) lldpd[9]: priority="Err" version=1.0 msgid=0x6001000000000009 msg=": Building LLDP PDU Failed! : Unable to obtain System Serial Number" func="buildPktBuf".

Conditions:
So far this case has only been observed in an upgrade (1.3.0 -> 1.5.0), downgrade (1.5.0 -> 1.3.2), upgrade (1.3.2 -> 1.5.1) scenario.

Impact:
When this occurs, LLDP PDUs fail to be constructed due to the missing chassis serial number, resulting in loss of LLDP functionality.

Workaround:
The following steps will resolve this issue:

1. On the active system controller, run 'docker restart vcc-chassis-manager'.
2. Restart the LLDP pod for all active partitions.
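A sketch of these steps from the active system controller shell; the pod name and partition namespace below are examples only:

```shell
docker restart vcc-chassis-manager
# Find the lldpd pod in each active partition namespace and delete it;
# Openshift reschedules a replacement pod automatically.
oc get pods --all-namespaces | grep lldpd
oc delete pod lldpd-6d4458d967-xfs7d -n partition-2   # example pod/namespace
```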


1209749-1 : Core file generated for cc-lacpd.vcc-lacpd

Links to More Info: BT1209749

Component: F5OS-C

Symptoms:
A core file for cc-lacpd.vcc-lacpd is generated.

Conditions:
Occurs occasionally when creating a new interface of type ieee_8023adLag.

Impact:
Controller LACPD restarts and recovers. During the restart, mgmt backplane links between the system controllers and blades may go down for a second or less.

Workaround:
Do not create interfaces of type ieee_8023adLag on the controllers.


1209669-1 : BIG-IP Next fails to come up intermittently upon system power cycle/reboot

Component: F5OS-C

Symptoms:
Every tenant gets its own storage space, and F5OS applies the correct permissions for BIG-IP Next to access the paths needed to generate certificates, databases, and so on. When the system reboots, F5OS has to remount the storage path from the volume, but it skips reapplying those permissions, so the tenant containers fail to access the path and enter a crash loop.

Conditions:
When the system goes for power cycle or blade reboot.

Impact:
The tenant will not be functional or pass any traffic, since the majority of its containers are in a restart loop due to permission issues.

Workaround:
Run the following commands from the blade shell:

setfacl -Rdm u:7053:rwx /mnt/disks/<tenant-name>/
setfacl -Rm u:7053:rwx /mnt/disks/<tenant-name>/


1207537 : Chassis partition ConfD may fail to start completely during controller rolling upgrade

Links to More Info: BT1207537

Component: F5OS-C

Symptoms:
Following a controller rolling upgrade, one or both of the chassis partition controller instances may fail to start completely.

This can be seen by running the "show partitions" command. Normal status is that one controller instance will show "running-active" and one will show "running-standby". If any other status is shown (running, offline, failed, or no status), then the database is not operating correctly.
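
The status check can be scripted: a sketch that scans 'show partitions' output and flags any controller instance whose status is not running-active or running-standby. The sample text below is illustrative, not captured from a real system.

```shell
# Flag partition controller instances in an abnormal state.
check_partitions() {
  awk 'NR > 1 && NF && $0 !~ /running-(active|standby)/ {print "abnormal:", $0}'
}
check_partitions <<'EOF'
NAME       CONTROLLER   STATUS
default    1            running-active
default    2            failed
EOF
```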

Conditions:
At database startup, it is possible for a chassis partition to hang while retrieving the database primary key. The presence of this defect is confirmed by observing this message at the end of the partition devel.log file:

ERR> 6-Jan-2023::17:51:49.205 partition1 confd[109]: confd encryptedStrings command timed out after 300000 ms inactivity

Impact:
One or both instances of the chassis partition control plane are not operating. This will prevent the chassis partition rolling upgrade from completing, and may stop tenant traffic.

Workaround:
If the chassis partition is in this state, it can be recovered by disabling the partition, waiting for both instances to transition to "disabled", and then re-enabling. The error state is unlikely to occur unless the partition startup happens during a controller failover.


1200665 : During an upgrade from 1.3 to 1.5.1, a core file may be created from the diag-agent

Component: F5OS-C

Symptoms:
A core file can be generated by the diag-agent during the upgrade process from 1.3 to 1.5.1.

Conditions:
Upgrade of F5OS-C from 1.3 to 1.5.1.

Impact:
A Linux core file is generated by the diag-agent service. The service will automatically restart. No functional impact is visible.

Workaround:
No workaround; the service will restart automatically and function normally.


1196813-1 : Adding or removing nodes from a running BIG-IP tenant instance can cause data plane and management IP access issues

Component: F5OS-C

Symptoms:
If nodes are added to the tenant, the tenant management IP may bounce between the nodes of the tenant instance. There may also be data plane issues where traffic is not routed to the nodes added to an existing tenant instance. This occurs because the slot masks are not updated in the existing tenant instances.

Conditions:
- Nodes are added or removed from a BIG-IP tenant instance on F5OS.

Impact:
Data plane traffic may be impacted, and management access to the tenant IP may be unreliable.

Workaround:
- If the node population of a tenant has already been modified, configure the tenant to provisioned and then back to deployed. This restarts all the tenant instances and makes the node masks consistent across all instances.

- If a node population change is planned, configure the tenant to provisioned, change the node population, and then configure it back to deployed.


1195417-1 : BIG-IP Next tenant configuration defaults to 15GB virtual disk size upon creation

Component: F5OS-C

Symptoms:
When initially configuring a BIG-IP Next tenant, the virtual disk size always defaults to 15GB regardless of whether you specify a larger size at the time of initial creation.

Conditions:
Applies to newly created BIG-IP Next tenants on the VELOS hardware platform running F5OS-C v1.5.1 or earlier.

Impact:
You must subsequently edit the tenant's configuration, specify the desired virtual disk size, and save the changes.

Workaround:
Change the tenant's state to Configured, edit the tenant's virtual disk size to the desired value, save the configuration changes, and redeploy.


1190985-1 : WebUI server error when opening entry for added NTP server created with FQDN

Component: F5OS-C

Symptoms:
When the user creates an NTP server with an FQDN, the NTP server data table on the time settings screen shows the resolved IP address instead of the FQDN. If the user clicks the hyperlinked IP address to launch the edit screen for the NTP server, the webUI throws an error because no record with that IP address exists.

Conditions:
An NTP server is created with an FQDN.

Impact:
The edit screen for the NTP server does not launch.

Workaround:
If the user replaces the IP address in the browser URL with the FQDN of the NTP server, they are able to view the Edit screen and make the required changes.


1167821-1 : Tcpdump may not capture large packets

Links to More Info: BT1167821

Component: F5OS-C

Symptoms:
The tcpdump utility may not capture packets larger than 1371 bytes.

Conditions:
Capturing packets larger than 1371 bytes on the chassis platform.

Impact:
Troubleshooting network issues by running tcpdump in the partition may not work effectively.


1162233-1 : Mixed front panel port speed configurations are unsupported on F5OS-C v1.5.0

Component: F5OS-C

Symptoms:
Attempting to set the blade front panel ports into a 100:10/25G or 40:10/25G configuration will result in a non-functional data path. The blade may appear to be linked at the physical layer, but no traffic will pass through the blade until the front panel port speed configuration is the same for both ports (for example, 2x100G, 2x40G, or 2x 4x10G).

Conditions:
Setting a mixed speed configuration on the front panel ports (for example, one port at 100G and the other at 4x10G, or one port at 40G and the other at 4x10G).

Impact:
No traffic will pass through the blade.
Tenants will not deploy.

Workaround:
If mixed front panel port speeds are required, update to the next version when it becomes available.


1154789-1 : Unexpected flow type logs

Component: F5OS-C

Symptoms:
fpgamgr docker logs will display lines such as:

hdp_cap_fc_get: Unexpected flow type (17) in HDP_CAP_FC_CAP_11_REG

Conditions:
These lines can appear at any time, triggered by various internal API calls that determine hardware capabilities. Despite the appearance of the log message, this is not tied to the use (or non-use) of ePVA.

Impact:
No impact, these logs are purely cosmetic.

Workaround:
There is no workaround. However, these lines are purely cosmetic and can safely be ignored.


1134157-1 : When upgrading VELOS from v1.4.0 to v1.5.0, core file dagd is generated

Component: F5OS-C

Symptoms:
When upgrading VELOS from v1.4.0 to v1.5.0, core file dagd is generated.

Conditions:
Upgrading VELOS from v1.4.0 to v1.5.0.

Impact:
The core is generated during the blade shutdown phase of the upgrade; there is no traffic impact.


1134117 : BIG-IP tenant libvirtd process can generate a core file when being shut down

Component: F5OS-C

Symptoms:
This happens occasionally when BIG-IP tenants are shut down. It does not happen with BIG-IP Next tenants.

Because it occurs while the tenant is being shut down, after the virtual machine has already stopped, it does no harm other than leaving a core file in the blade host environment where it happened.

Conditions:
A BIG-IP tenant is shut down.

Impact:
A core file is generated on a blade in /var/F5/partition1/shared/blade<N>/core/container

Workaround:
The core file may be manually removed.


1134105-1 : BIG-IP tenants might not start correctly after upgrade from 1.2.1 to 1.5.0

Component: F5OS-C

Symptoms:
BIG-IP tenants might not start correctly after upgrading from version 1.2.1 to version 1.5.0.

This can be seen in the "show tenant" output from the chassis partition CLI:

                      INSTANCE
NODE POD NAME ID PHASE CREATION TIME READY TIME STATUS MGMT MAC
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 bigiptenant1-1 1 Pending 2022-08-05T09:40:52Z 2022-08-05T09:40:52Z Not ready: containers with unready status: [compute] 00:00:00:00:00:00
2 bigiptenant1-2 2 Pending 2022-08-05T09:40:52Z 2022-08-05T09:40:52Z Not ready: containers with unready status: [compute] 00:00:00:00:00:00
3 bigiptenant1-3 3 Pending 2022-08-05T08:35:15Z 2022-08-05T09:34:25Z Node 3 which was running tenant instance bigiptenant1 is unresponsive 5e:ad:6f:14:f0:a5
4 bigiptenant1-4 4 Pending 2022-08-05T08:35:17Z 2022-08-05T09:34:20Z Node 4 which was running tenant instance bigiptenant1 is unresponsive a2:e4:ef:be:9b:2f

Conditions:
This can occur when a v1.2.1 system with existing BIG-IP tenants is upgraded directly to v1.5.0.

Impact:
Tenant instance might not start correctly, which will cause an interruption to dataplane traffic.

Workaround:
The workaround is to configure the tenant to provisioned, and then back to deployed. This causes the tenant instance to restart.
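
A hedged sketch of this cycle from the chassis partition CLI; the tenant name is illustrative and the exact syntax may vary by release.

```
config
tenants tenant bigiptenant1 config running-state provisioned
commit
! wait for the tenant instances to stop, then:
tenants tenant bigiptenant1 config running-state deployed
commit
```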


1133985-1 : Controller upgrade from 1.4.0 to 1.5.0 caused the partition to go into failed state.

Links to More Info: BT1133985

Component: F5OS-C

Symptoms:
After an upgrade of the system controller, the chassis partition went into a failed state.

syscon-1-active# show partitions partition default
                                                            RUNNING
             BLADE OS SERVICE PARTITION SERVICE STATUS
NAME ID VERSION VERSION CONTROLLER STATUS VERSION AGE
--------------------------------------------------------------------------------
default 1 1.4.0-4112 1.4.0-4112 1 failed - 5m
                                     2 running 1.4.0-4112 15m

Conditions:
Upgrade to v1.5.0 causes the chassis partition to go into a failed state.

Impact:
The partition management software goes into a failed state and the chassis partition cannot be managed.

Workaround:
Rebooting the failed controller after the upgrade, or restarting the "vcc partition software manager" service, fixes the issue:

"docker restart vcc-partition-software-manager"


1124061 : Tenant pods stuck in terminating stage for longer duration than necessary

Component: F5OS-C

Symptoms:
When the system is rebooted/upgraded, Openshift will bring up new pods and terminate the old ones. During this process, terminating pods may run into a cleanup issue and be stuck in that state for a long period of time.

Conditions:
The system is upgraded or rebooted.

Impact:
No impact on the tenant traffic or performance.

Workaround:
Clean up the pods by running the following Openshift command:

oc delete pod <pod-name> --grace-period=0 --force -n partition-X

example:

partition-1 tenant2-data-store-7fc9bf578d-526wf 0/1 Terminating

oc delete pod tenant2-data-store-7fc9bf578d-526wf --grace-period=0 --force -n partition-1
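
For multiple stuck pods, the commands can be generated in bulk. A sketch that reads 'oc get pods --all-namespaces' output and emits a force-delete command for each pod in Terminating; the column positions follow the example above, so verify against real output before piping the result to 'sh'.

```shell
# Emit a force-delete command per Terminating pod.
emit_force_delete() {
  awk '$NF == "Terminating" {
    printf "oc delete pod %s --grace-period=0 --force -n %s\n", $2, $1 }'
}
emit_force_delete <<'EOF'
partition-1 tenant2-data-store-7fc9bf578d-526wf 0/1 Terminating
partition-1 tenant2-api-7d9c 1/1 Running
EOF
```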


1102765-1 : Blade is not in the Ready status in the cluster

Component: F5OS-C

Symptoms:
Blade does not join the cluster.

[root@controller-2 ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
blade-1.chassis.local NotReady compute 3h v1.11.0+d4cacc0
blade-2.chassis.local Ready compute 3h v1.11.0+d4cacc0
controller-1.chassis.local Ready infra,master 3h v1.11.0+d4cacc0
controller-2.chassis.local Ready infra,master 3h v1.11.0+d4cacc0

Service failure messages appear on the blade console:

May 11 00:11:28 blade-2.chassis.local platform-deployment[13130]: Job for var-mnt-chassis.mount failed. See "systemctl status var-mnt-chassis.mount" and "journalctl -xe" for details.

Conditions:
This happens rarely and intermittently during an upgrade or downgrade.

Impact:
The blade cannot join the cluster and cannot run any tenant.

Workaround:
Reboot the blade through the chassis partition ConfD.
CLI command:
cluster nodes node blade-X reboot
Where X is the slot number of the blade to reboot.
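
A sketch that derives the reboot command from 'oc get nodes' output for each blade stuck in NotReady. Deriving the slot number from the node name is an assumption based on the naming shown above.

```shell
# Print the ConfD reboot command for each NotReady blade node.
emit_blade_reboot() {
  awk '$2 == "NotReady" && $1 ~ /^blade-/ {
    split($1, a, "[-.]"); printf "cluster nodes node blade-%s reboot\n", a[2] }'
}
emit_blade_reboot <<'EOF'
NAME                         STATUS     ROLES     AGE   VERSION
blade-1.chassis.local        NotReady   compute   3h    v1.11.0+d4cacc0
blade-2.chassis.local        Ready      compute   3h    v1.11.0+d4cacc0
EOF
```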


1100713 : After a partition upgrade, a tenant in Provisioned state may show inconsistent CLI status

Component: F5OS-C

Symptoms:
After a partition upgrade, if the running-state of a tenant is configured in the Provisioned state, the operational status of the tenant may oscillate between "Ready to deploy" and "Allocating resources to tenant is in progress" state in the partition CLI status.

Conditions:
A race condition exists after a partition upgrade that may display an inaccurate tenant operational state when the tenant is configured as Provisioned.

Impact:
The tenant state constantly changes.

Workaround:
Configure the running-state of the tenant to Deployed.


1084785 : etcd database may be corrupted on upgrade to 1.3.1 release.

Links to More Info: BT1084785

Component: F5OS-C

Symptoms:
The etcd instance on one of the CCs may become unresponsive after a rolling upgrade to 1.3.1, due to a failure to mount the drbd partition. The system will continue to function, but if the CC with the functioning etcd instance goes offline, the Openshift cluster will stop responding.

Conditions:
Rolling upgrade to 1.3.1 release from older software release.

Impact:
No immediate impact, but this will cause an HA failure if the unaffected CC goes offline while the etcd instance on the remaining CC has hit this issue.

Workaround:
Re-install the Openshift cluster by running "touch /var/omd/CLUSTER_REINSTALL" on the active CC.


1081281-1 : Multi-node BIG-IP tenants may fail to cluster after rolling upgrade

Component: F5OS-C

Symptoms:
BIG-IP tenant instances may fail to cluster after a rolling upgrade, due to the CHASSIS_SERIAL_NO being set incorrectly in the config-map that is used to deploy the tenant instance.

This can be seen in the tmsh "show sys cluster" output on the tenant, which shows the slots in a failed state:

root@(localhost)(cfg-sync Standalone)(/S1-green-P::Active)(/Common)(tmos)# show sys cluster
 
-----------------------------------------
Sys::Cluster: default
-----------------------------------------
Address 10.238.133.200/24
Alt-Address ::
Availability available
State enabled
Reason Cluster Enabled
Primary Slot ID 1
Primary Selection Time 07/21/22 01:10:47
 
  -------------------------------------------------------------------------------------------
  | Sys::Cluster Members
  | ID Address Alt-Address Availability State Licensed high availability (HA) Clusterd Reason
  -------------------------------------------------------------------------------------------
  | 1 :: :: available enabled true active running Run
  | 2 :: :: offline enabled false unknown shutdown Slot Failed
  | 3 :: :: offline enabled false unknown shutdown Slot Failed
  | 4 :: :: offline enabled false unknown shutdown Slot Failed
  | 5 :: :: offline enabled false unknown shutdown Slot Failed
  | 6 :: :: offline enabled false unknown shutdown Slot Failed
  | 7 :: :: offline enabled false unknown shutdown Slot Failed
  | 8 :: :: offline enabled false unknown shutdown Slot Failed

This condition can be verified by displaying the config map for a tenant instance and checking whether the CHASSIS_SERIAL_NO field is empty.

e.g.

From the system controller shell:

oc get cm -n partition-1 <tenant_name>-<blade_#>-configmap -o json | egrep CHASSIS

Bad Entry:
# oc get cm -n partition-1 bigiptenant1-1-configmap -o json | egrep CHASSIS
        "CHASSIS_SERIAL_NO": "",

Good Entry:
# oc get cm -n partition-1 bigiptenant1-2-configmap -o json | egrep CHASSIS
        "CHASSIS_SERIAL_NO": "chs414616s",

Conditions:
This can happen during a rolling upgrade if the CHASSIS_SERIAL_NO field is not read correctly and the tenant instance is restarted as part of the rolling upgrade. This is an intermittent issue.

Impact:
If this issue occurs, one or more instances of the tenant may not communicate correctly, which can cause some or all of the data plane to stop functioning, causing an outage.

Workaround:
1.) Set tenant(s) state to provisioned for BIG-IP, or configured for BIG-IP Next.
2.) Once the tenant(s) have stopped, disable the partition.
3.) Re-enable the partition.
4.) Set tenant(s) state back to deployed.


1065641 : File import/export operation performed on disallowed paths is not shown in file transfer status

Links to More Info: BT1065641

Component: F5OS-C

Symptoms:
When a file import/export is performed on a disallowed path, the status is not shown in the file transfer status.

Conditions:
A file import/export is performed on a disallowed path.

Impact:
The rejected transfer status is not visible in the file transfer status.

Workaround:
None




This issue may cause the configuration to fail to load or may significantly impact system performance after upgrade


*********************** NOTICE ***********************

For additional support resources and technical documentation, see:
******************************************************