Applies To: F5OS-C 1.8.1
F5OS-C Release Information
Version: 1.8.1
Build: 26929
Tag: LTS
Note: This content is current as of the software release date.
Updates to bug information occur periodically. For the most up-to-date bug data, see Bug Tracker.
Cumulative fixes from F5OS-C v1.8.0 that are included in this release
Known Issues in F5OS-C v1.8.x
Vulnerability Fixes
ID Number | CVE | Links to More Info | Description |
1871517 | CVE-2017-18342 | K000139901 | CVE-2017-18342 PyYaml arbitrary code execution from untrusted data |
Functional Change Fixes
None
F5OS-C Fixes
ID Number | Severity | Links to More Info | Description |
1818185 | 1-Blocking | BT1818185 | The meaning of the interface phyport internal field changed from phyport to DID. This will break functionality that is using phyport★ |
1789417-1 | 1-Blocking | BT1789417 | Component fpgamgr in restart loop with segmentation fault after failed FPGA firmware update |
1750613-1 | 1-Blocking | BT1750613 | If a system controller PXE boots and reimages, partitions may not start correctly, and cause data loss★ |
1926829-2 | 2-Critical | | When attributes are added under exporters for Open Telemetry, the keys are not visible on webUI. |
1921793-1 | 2-Critical | | Health summary is not reported for some nodes in controller and partition ConfD |
1920325-1 | 2-Critical | BT1920325 | The network-manager container crashes when it fails to create FDB entry in database |
1890297-1 | 2-Critical | BT1890297 | Memory leak in l2_agent daemon on F5OS |
1889913-2 | 2-Critical | | VELOS partition Allowed IP rule restrictions |
1850481 | 2-Critical | BT1850481 | Standby tenant is unreachable after F5OS partition upgrade to 1.7.x or higher. |
1814053-2 | 2-Critical | | Orchestration Agent process may core |
1814045-2 | 2-Critical | | Daemons that handle ZMQ messages may crash under certain conditions. |
1789141-2 | 2-Critical | | If 'ldap-group' is configured for a role but LDAP search fails, users with the default GID for the role can still get those privileges |
1782925-3 | 2-Critical | BT1782925 | Active Directory LDAP integration without uidNumber/gidNumber does not work after system reboot |
1779465-1 | 2-Critical | | SwitchD core file observed after live upgrade |
1737677-1 | 2-Critical | | Reboot of both system controllers results in dataplane issues |
1709665-2 | 2-Critical | BT1709665 | Blade NotReady after liveupgrade★ |
1696325-1 | 2-Critical | | Unresolved VQF IMM watchdogs after system controller failover, VoQ Window Errors, and extensive disconnect to confd |
1682425-1 | 2-Critical | | Rate limiting does not work on BX520 front panel interfaces |
1681533 | 2-Critical | | F5 VELOS ATSE firmware v7.10.7.12 |
1681529 | 2-Critical | | F5 VELOS ATSE firmware v7.10.7.02 |
1681525 | 2-Critical | | F5 VELOS ATSE firmware v7.10.7.22 |
1681521 | 2-Critical | | F5 VELOS ATSE firmware v7.10.7.11 |
1681501 | 2-Critical | | F5 VELOS ATSE firmware v7.10.7.00 |
1638629-1 | 2-Critical | BT1638629 | "Unhealthy" kubevirt pod due to internal networking issue with blade★ |
1633681-1 | 2-Critical | | Dynamic FDB entries may not be flushed from all blades when a vlan tag is removed from a LAG. |
1586089-2 | 2-Critical | | Resource-admin is unable to perform SCP. |
1933721-2 | 3-Major | BT1933721 | Interface remains down in F5OS after removing and reinserting SFP modules |
1891301-2 | 3-Major | | CVE-2020-27743: pam_tacplus through 1.5.1 lacks a check for a failure of RAND_bytes()/RAND_pseudo_bytes(). |
1850165-1 | 3-Major | | Missing internal interface pgindex field causes l2-agent to restart★ |
1819289-2 | 3-Major | BT1819289 | Zero is not allowed as Prefix Length for allowed IPs |
1817669-1 | 3-Major | | Timeout for the Ansible playbook during cluster install cannot be retried. |
1814073-1 | 3-Major | | F5OS chassis switchd core dump |
1789125-1 | 3-Major | | VQF VOQ entries missing for the functional blades in the show fpga-tables output |
1785621-1 | 3-Major | | Tenant deployed with Max Memory available on system results in Resource allocation failed - Node is up but Platform services not responding |
1783781 | 3-Major | | Bash history file containing "PRIVATE KEY" may block qkview |
1779677-1 | 3-Major | | Multiple docker containers can get assigned the same bridge IP during rolling upgrade★ |
1778689-1 | 3-Major | | Duplicate OMD alerts during Inaccessible Memory incident |
1772433-2 | 3-Major | BT1772433 | Config restore fails after upgrade★ |
1772305-1 | 3-Major | | Unable to deploy a tenant to both BX110 and BX520 blade in same partition |
1772053-1 | 3-Major | BT1772053 | High memory usage due to log flood when one controller is in FIPS error state |
1759733-1 | 3-Major | BT1759733 | Controller reboot during a controller RMA can cause openshift cluster to fail. |
1757729-2 | 3-Major | BT1757729 | Default port for LDAP server does not match default server type |
1752821-1 | 3-Major | BT1752821 | Cluster re-install with missing system controller does not complete★ |
1737517-1 | 3-Major | BT1737517 | Rare partition startup conditions can cause persistent application-communication error on that partition |
1730833-4 | 3-Major | BT1730833 | Tmm may egress broadcast traffic even when VLANs are disabled in F5OS |
1710765-2 | 3-Major | BT1710765 | The node number fetched by the SNMP disk stats handler from the disk operational handler has the wrong blade value.★ |
1710453-1 | 3-Major | BT1710453 | Partition configuration wiped out during Controller reboot |
1710405-1 | 3-Major | BT1710405 | MAC exhausted error can occur even though there are available MACs |
1709121-5 | 3-Major | BT1709121 | Unable to create a tenant as the Network Manager start-up or failover may result in a looping process |
1699821-2 | 3-Major | BT1699821 | Partition data missing |
1696157-4 | 3-Major | BT1696157 | Container api-svc-gateway crashes after enabling a tenant |
1680105-2 | 3-Major | BT1680105 | Using 'iburst' option is preferred when adding NTP servers. |
1670029-1 | 3-Major | BT1670029 | Reset counter functionality not working properly on rSeries platforms |
1633073-4 | 3-Major | BT1633073 | A core can occur in a forked process with an Orchestration Agent |
1624853-3 | 3-Major | BT1624853 | ETCD consumes a high amount of CPU time |
1612557-1 | 3-Major | | Dma-agent service health warnings appear in show system summary |
1600693-1 | 3-Major | BT1600693 | F5OS - BIG-IP Tenant does not display VELOS Chassis slot serial number |
1595113-4 | 3-Major | | Interface state enabled value stale due to timeout to reach confd |
1586661-2 | 3-Major | BT1586661 | First login for a remote user fails |
1586641 | 3-Major | | OPT-0063 400G-FR4 periodically has non-zero RMON_RX_BAD_FCS |
1586057-1 | 3-Major | | F5OS displays an incorrect error if the admin tries to set a password before committing a new user |
1584469-1 | 3-Major | | BX520 TCPDUMP throughput improvement |
1582105-1 | 3-Major | | Partition RESTCONF may return an incomplete response for f5-cluster:cluster/nodes/node |
1574861-1 | 3-Major | BT1574861 | Incomplete API payload and CLI failure for openconfig interfaces when one controller node is not ready |
1469333-1 | 3-Major | BT1469333 | VELOS management LAG may bridge traffic between management interfaces during LACP negotiation |
1381385-3 | 3-Major | | Additional password policy settings |
1321429-5 | 3-Major | BT1321429 | F5-PLATFORM-STATS-MIB::diskPercentageUsed not available. |
1285997-7 | 3-Major | | LLDP is allowed to configure on interfaces when virtual wire is enabled |
1135845-4 | 3-Major | | Increased interval for boot device selector hot-key 'b' acceptance after the BIOS banner |
1826209-1 | 4-Minor | BT1826209 | Error log does not contain all needed information. |
1624057-2 | 4-Minor | BT1624057 | BX110 Port Flapping or interface/connectivity issues |
Cumulative fixes from F5OS-C v1.8.0 that are included in this release
Vulnerability Fixes
ID Number | CVE | Links to More Info | Description |
1620513-1 | CVE-2024-38477 | K000140784, BT1620513 | CVE-2024-38477 httpd: NULL pointer dereference in mod_proxy |
1614821-3 | CVE-2024-3596 | K000141008, BT1614821 | CVE-2024-3596 - Blast-RADIUS |
1607745-3 | CVE-2024-38474, CVE-2024-38475, CVE-2024-38476 | K000140618 | Apache HTTPD vulnerabilities CVE-2024-38474, CVE-2024-38475, and CVE-2024-38476 |
1388477-1 | CVE-2025-46265 | K000139503, BT1388477 | Default GID group mapping authorized even when GID mapped to different group ID |
1365409-2 | CVE-2023-3341 | K000137582 | CVE-2023-3341: bind: stack exhaustion in control channel code may lead to DoS |
1327689-1 | CVE-2025-36546 | K000140574, BT1327689 | Manually remove root and user keys before entering Appliance Mode |
1285669-6 | CVE-2022-21216 | K000133432 | CVE-2022-21216 - Intel BIOS vulnerabilities on r2000/r4000 and r5000/r10000/r12000 |
1691557-1 | CVE-2020-8037 | K000149929 | CVE-2020-8037: tcpdump memory leak. |
1353001-1 | CVE-2025-43878 | K000139502, BT1353001 | tcpdump service improvements |
1577049-1 | CVE-2024-1086 | K000139430, BT1577049 | CVE-2024-1086 - Linux kernel vulnerability |
Functional Change Fixes
ID Number | Severity | Links to More Info | Description |
1353161-1 | 3-Major | BT1353161 | Snmpd daemon stuck in loop deleting and recreating 'system snmp communities community' entry after recreating and deleting SNMP config a few times |
F5OS-C Fixes
ID Number | Severity | Links to More Info | Description |
1642081 | 1-Blocking | BT1642081 | "default" partition key sometimes initialized improperly★ |
1628557-3 | 1-Blocking | | F5OS high memory usage when using SNMP |
1624777-1 | 1-Blocking | BT1624777 | Tenants will not deploy since Orchestration Agent process is continuously generating a core |
1614429-1 | 1-Blocking | K000140362, BT1614429 | iHealth upload is failing with error "certificate signed by unknown authority" |
1576545-2 | 1-Blocking | BT1576545 | After upgrade, BIG-IP Next tenant is unable to export toda-otel (event logs) data to Central Manager★ |
1572493-2 | 1-Blocking | BT1572493 | LAG Trunk Configuration is Missing Inside of Tenant |
1496837-2 | 1-Blocking | BT1496837 | User-manager's ConfD socket getting closed. |
1360285-1 | 1-Blocking | BT1360285 | Partition is not reachable after performing multiple powercycles |
1349257 | 1-Blocking | K000137531, BT1349257 | Rolling software upgrade is stuck with one system controller in an "in-progress" state, and a "No such file or directory" error in sw-mgmt.debug★ |
1345977-1 | 1-Blocking | K000136113, BT1345977 | VELOS interfaces flapping if an interface is disabled |
1314453-5 | 1-Blocking | BT1314453 | Datapath is broken when LAG type is changed from LACP to Static on r2000/r4000 platforms |
1208573-3 | 1-Blocking | BT1208573 | Disabling Basic Authentication does not block the RESTCONF GET requests |
1753469 | 2-Critical | BT1753469 | Add notification to set-version when downgrading the system from F5OS-A/C-1.8.0 |
1677797-1 | 2-Critical | BT1677797 | OMD on Active CC hangs due to 'oc delete project' command hang after deleting and recreating a partition and moving slots |
1673925-4 | 2-Critical | BT1673925 | Missing masquerade MAC FDB entry causes excessive DLFs following tenant failover. |
1672269-1 | 2-Critical | BT1672269 | Blades missing L2 entries causing excessive DLFs. |
1660961-4 | 2-Critical | BT1660961 | Active Directory LDAP integration without uidNumber/gidNumber does not work with LDAP over TLS |
1644221-3 | 2-Critical | BT1644221 | Log file grows to gigabytes (GBs) under /var/log |
1634545 | 2-Critical | BT1634545 | OpenShift cluster may fail to install if no management IP's are configured★ |
1629257-2 | 2-Critical | BT1629257 | Diag-agent service memory utilization increases because of heartbeat probe |
1622869-5 | 2-Critical | BT1622869 | Might see TPOB core after HA disassembly |
1620077-4 | 2-Critical | BT1620077 | FDB entry port motion not working if new interface is a trunk/LAG |
1612405-5 | 2-Critical | BT1612405 | LACP status shows UP in BIG-IP tenant even if it is down on F5OS. |
1603509 | 2-Critical | BT1603509 | No alarm sent when front panel management link is down |
1596149-1 | 2-Critical | BT1596149 | Monitor rSeries ATSE to BE2 links and Raise Alarms in the Event of Failures |
1594125 | 2-Critical | BT1594125 | GUI fails to modify interfaces on F5OS-C |
1591645-3 | 2-Critical | BT1591645 | EPVA related dma-agent crash |
1590617-1 | 2-Critical | BT1590617 | Partition Network Manager is crashing when turning up. |
1587925-1 | 2-Critical | BT1587925 | Modifying a RADIUS server from the web UI requires the Secret to be configured or re-entered |
1586965-1 | 2-Critical | BT1586965 | No active instance of ConfD after failover |
1585001 | 2-Critical | BT1585001 | Radius authentication does not work when the shared secret key in the radius configuration is more than or equal to 32 characters |
1581589 | 2-Critical | BT1581589 | Lack of IPv4 management address causes OpenShift Ansible playbooks to fail |
1580489-1 | 2-Critical | BT1580489 | BE2 GCI interface training issue results in failure to process networking traffic |
1576241 | 2-Critical | K000139293, BT1576241 | Duplicate MAC on different tenants |
1575925 | 2-Critical | BT1575925 | Running 'show system aaa primary-key state status' while a key migration is in progress can cause key migration errors |
1549521-1 | 2-Critical | BT1549521 | VQF and VoQs fail to synchronize after system controller reboot |
1538277-1 | 2-Critical | BT1538277 | Duplicate Service-Instance IDs for L2FwdSvc causes L2 entries to not be forwarded to all blades |
1536413-1 | 2-Critical | BT1536413 | Allowed-ips allowed-ip <name> is not accepting the '-' in the names |
1505589 | 2-Critical | K000139300, BT1505589 | Subject-Alternative-Name (SAN) feature now supports client-side SSL Validation |
1498009 | 2-Critical | BT1498009 | Learned L2 entries in data-plane L2 forwarding table may disrupt some traffic flows between tenants |
1497657-1 | 2-Critical | BT1497657 | First SSH login after editing remote RADIUS or TACACS+ user privileges will still apply old privileges |
1496977-2 | 2-Critical | BT1496977 | Remote GID mappings to F5OS roles are disconnected for TACACS+/RADIUS authentication methods. |
1494945-2 | 2-Critical | BT1494945 | ConfD Application Error when tenant interface stats are not available |
1472373 | 2-Critical | BT1472373 | Failure of BX110 10G Links to recover after going DOWN |
1462329 | 2-Critical | BT1462329 | CC takes time to come up after reboot is triggered in active CC. |
1455725-1 | 2-Critical | BT1455725 | Partition go-standby command sometimes fails to change active instance |
1436153-2 | 2-Critical | BT1436153 | F5OS upgrades fail when SNMP configuration contains special characters. |
1429741-3 | 2-Critical | BT1429741 | Appliance management plane egress traffic from F5OS-A host going via BIG-IP Next tenant management interface instead of host management when both are in same subnet |
1429713 | 2-Critical | BT1429713 | VELOS ATSE v7.10.4.12 firmware |
1410229 | 2-Critical | BT1410229 | Display a GUI warning to let user know tenants might be affected/reboot★ |
1410225 | 2-Critical | BT1410225 | Enhanced the upgrade prompt for better understanding the impacts of upgrade on tenants |
1408369-1 | 2-Critical | BT1408369 | The "MAC exhaustion" error message during tenant creation may be caused by configuration processed during startup initialization★ |
1400221-2 | 2-Critical | BT1400221 | OpenTelemetry exporters may not produce data upon first tenant being added to system |
1400125 | 2-Critical | BT1400125 | Non-patch version of orchestration may start on controller after RMA replacement or rolling upgrade. |
1389001 | 2-Critical | BT1389001 | Controller upgrade failed with certificate bundle |
1388525 | 2-Critical | BT1388525 | Partition configuration database locks up, preventing database changes |
1379565-2 | 2-Critical | BT1379565 | Observing QKView start from 100% and then going back to 1% |
1378805-2 | 2-Critical | BT1378805 | Error occurs when changing LAG type for an existing LAG interface on webUI |
1365985-1 | 2-Critical | BT1365985 | GID role mapping may not work with secondary GID |
1355277-1 | 2-Critical | BT1355277 | Incorrect Vlan Listeners when a Static FDB is configured |
1353649-1 | 2-Critical | BT1353649 | System controller can configure an invalid chassis network prefix |
1342129-1 | 2-Critical | BT1342129 | Issues with liveness probe during tenant deploy/re-deploy causing incorrect identification of container health status |
1332781-1 | 2-Critical | BT1332781 | A remote user with the same username as the local F5OS user will be granted the local user's roles |
1330797 | 2-Critical | BT1330797 | Interfaces removed from LACP trunk due to traffic congestion |
1330793 | 2-Critical | BT1330793 | Interfaces removed from LACP trunk due to traffic congestion |
1325893-5 | 2-Critical | BT1325893 | A vqfdm system software core file is occasionally observed on system reboot |
1315041-1 | 2-Critical | BT1315041 | Partition config-restore failed after reset-default-config is performed★ |
1304921-1 | 2-Critical | BT1304921 | F5OS file download API does not work with basic authentication |
1304765-4 | 2-Critical | BT1304765 | A remote LDAP user with an admin role is unable to make config changes through the F5 webUI |
1300749-1 | 2-Critical | BT1300749 | Syslog target files do not use the hostname configured via system user interface. |
1296997-3 | 2-Critical | BT1296997 | Large core files can cause system instability |
1196813-3 | 2-Critical | BT1196813 | Adding or removing nodes from a running BIG-IP tenant instance can cause data plane and management IP access issues |
1126865 | 2-Critical | BT1126865 | F5OS HAL lock up if the LCD module is not responding. |
1047689-5 | 2-Critical | BT1047689 | Sw_rbcast core file found on system |
1018557-1 | 2-Critical | BT1018557 | On system controller failover, tenant mgmt IP's may be unreachable for several minutes. |
1696269-1 | 3-Major | BT1696269 | If partition confd initiates a failover due to a health fault, it may incorrectly attempt to fail over repeatedly |
1695589-1 | 3-Major | BT1695589 | Data-plane links are bounced on HA failover |
1670437-1 | 3-Major | BT1670437 | Jumbo frames with an IP length greater than 9174 bytes may be dropped |
1644293 | 3-Major | BT1644293 | Interface status alert and SNMP trap is not sent immediately after interface is disabled |
1644185-1 | 3-Major | BT1644185 | DAG State table is not cleaned when a tenant is deleted or moved to configured/provisioned |
1627541-1 | 3-Major | BT1627541 | System Controller unexpected failover in auto mode due to unhealthy SwitchD |
1624665-4 | 3-Major | BT1624665 | ConfD state data shows key and certificate configured for secure (mTLS) even after deleting from config |
1624449-2 | 3-Major | BT1624449 | SNMP polling of coreTotal5minAvg causing timeouts and genErrors |
1623761 | 3-Major | BT1623761 | After cleaning up disk due to disk space full error, tcpdump program still detects the disk as full and aborts |
1623101-2 | 3-Major | BT1623101 | External OTEL server receives log data for both the platform and event logs, even if only one of them has been configured |
1615969-4 | 3-Major | BT1615969 | Tenant operational data is not getting updated properly after upgrade |
1615917-1 | 3-Major | BT1615917 | L2_agent crashed due to SNMP★ |
1612217-1 | 3-Major | BT1612217 | A large amount of SPVA DoS allow list entries can overload DMA-Agent causing a tenant to fail to pass traffic |
1612101-2 | 3-Major | BT1612101 | When vCPU cores configuration changed for BIG-IP Next tenant, RRD stats shows both the old and new CPU data stats |
1598937 | 3-Major | BT1598937 | SNMP traps are not always sent★ |
1598509-2 | 3-Major | BT1598509 | iHealth client can occasionally throw a core file |
1593385 | 3-Major | BT1593385 | F5OS Tenant Throughput (bits/packets) and TMM CPU usage higher than expected until VLAN is added or removed |
1592221 | 3-Major | BT1592221 | A partition's internal bridge IP address is not detected correctly if there is a missing partition ID in the list of partitions. |
1591585 | 3-Major | BT1591585 | Sshd, httpd, and rsync crash when the /etc/hosts file contains a large amount of whitespace |
1591549-1 | 3-Major | BT1591549 | Support for case-insensitive LDAP username lookup |
1591069 | 3-Major | BT1591069 | Blades may fail to get marked as InCluster in "show cluster" output after rolling upgrade |
1590425 | 3-Major | BT1590425 | Adding blade to openshift cluster can fail with ansible error |
1588093-1 | 3-Major | BT1588093 | Forwarding host log files to remote targets |
1587837 | 3-Major | BT1587837 | Memory leak in multiple components |
1586893 | 3-Major | BT1586893 | Metrics server pod on system controller can exit and not be restarted |
1586773 | 3-Major | BT1586773 | BX520 Internal FPGA links can fail to come UP during initialization |
1585853 | 3-Major | BT1585853 | Telemetry streaming pauses if mgmt-ip gets updated |
1585749-1 | 3-Major | BT1585749 | Including lspci commands in QKView capture |
1585237-2 | 3-Major | BT1585237 | When telemetry exporter is not reachable, logs to enable send_queue or retry will be printed in platform.log |
1583233-1 | 3-Major | BT1583233 | The 'show portgroups' command may not display DDM statistics, or may display stale/out-of-date DDM statistics |
1582553-1 | 3-Major | BT1582553 | The 'components component state' data is not displayed in ConfD. |
1580349-1 | 3-Major | BT1580349 | Loading backup file with partition ID 1 that is not named "default", throws an error★ |
1580165-1 | 3-Major | BT1580165 | Removing a failed patch ISO can remove base services imported from a different ISO★ |
1579453-1 | 3-Major | BT1579453 | SAN Validation Mismatch: Key/Cert virtual server No Key Configured |
1575585 | 3-Major | BT1575585 | Unable to add blade to Openshift cluster if newly-installed blade is not member of active partition |
1573493-1 | 3-Major | BT1573493 | Qkview does not collect the files gid-map.txt, /etc/libnss-udr/passwd, or /etc/libnss-udr/group |
1572929-2 | 3-Major | BT1572929 | Changing remote authentication methods from RADIUS/TACACS to LDAP may break remote-gid functionality. |
1572489-1 | 3-Major | BT1572489 | User accounts with username which includes only numeric values or special characters like "." or ".." or starts with '-' are inactive |
1572137-1 | 3-Major | BT1572137 | Upload/Download API should work with '/api' and '/restconf' |
1560533 | 3-Major | BT1560533 | Inconsistent case values (upper and lower case) for different F5OS-C SNMP OIDs |
1559509 | 3-Major | BT1559509 | Incorrect displayed state of blade internal data link |
1558505 | 3-Major | BT1558505 | After restarting the fpgamgr service, the last service-instance is not processed |
1556173 | 3-Major | BT1556173 | Poor management backplane link performance on system controller failover |
1555457 | 3-Major | BT1555457 | System controller failover may take up to 60 seconds |
1552945-1 | 3-Major | BT1552945 | Tenant images renamed with bracket are not supported★ |
1552721 | 3-Major | BT1552721 | Partition IPv6 management address is not reachable after a partition switchover |
1552369 | 3-Major | BT1552369 | F5OS-C: Partition volume cannot be removed if an active shell in that directory |
1550413 | 3-Major | BT1550413 | System events visible in the CLI may not be visible in the GUI |
1549753-1 | 3-Major | BT1549753 | System telemetry exporter send queue and retry settings are causing memory issues |
1549549 | 3-Major | BT1549549 | Blades in the "none" partition may cause kubernetes services to fail. |
1538217-1 | 3-Major | BT1538217 | View fpgamgr core file after partition shutdown |
1519869-1 | 3-Major | BT1519869 | BIG-IP tenant reports blank interface |
1505221-1 | 3-Major | BT1505221 | Bad ISO images that are accidentally imported may not be removed automatically |
1497349 | 3-Major | BT1497349 | Support for SSH-RSA host key algorithm for partitions added in non-fips mode |
1496893 | 3-Major | BT1496893 | Third etcd instance can get into an error state on controller upgrade from 1.5.1 to 1.6.1 |
1496397-2 | 3-Major | BT1496397 | Allowing entry of a Subject-Alternative-Name (SAN) for certificate and CSR creation |
1494809-1 | 3-Major | | Allowing user to configure HostKeyAlgorithms parameters |
1492621-4 | 3-Major | BT1492621 | Config-restore fails when backup file has expiry-status field for admin or root user |
1492401-1 | 3-Major | BT1492401 | User with operator role does not have read access to all pages |
1490753-2 | 3-Major | BT1490753 | linkUp and linkDown traps are sent when an up interface is disabled, and vice versa |
1488225 | 3-Major | BT1488225 | Partition dagd cores during system startup |
1486697-2 | 3-Major | BT1486697 | Configuring Expiry-status of root and admin users should not be allowed |
1474833 | 3-Major | BT1474833 | Debug output is missing from qkview |
1472917-1 | 3-Major | BT1472917 | LDAP authenticated admins logging in via the serial console may have trouble disabling appliance mode during system instability |
1469385-2 | 3-Major | BT1469385 | GUI freezes during LDAP user authentication if no remote GID mapped locally. |
1466397 | 3-Major | BT1466397 | LDAP authentication is consuming several minutes to authenticate via GUI and SSH. |
1461289 | 3-Major | BT1461289 | On a rSeries appliance, config-backup proceed is broken |
1455913-4 | 3-Major | BT1455913 | Tcpdump on F5OS does not honor the -c flag |
1455769 | 3-Major | BT1455769 | Slow execution of ansible-playbooks on cluster reinstall caused timeouts and retries for many hours. |
1429721-2 | 3-Major | BT1429721 | SCP as non-root user does not report errors correctly for bad/non-existent files. |
1411137-2 | 3-Major | BT1411137 | Audit log entries are missing when creating or deleting objects via UI or API |
1410729 | 3-Major | BT1410729 | VELOS backplane packet priority issue |
1410609 | 3-Major | BT1410609 | Watchdog resets during PSU management may cause AOM/LOP to remain in bootloader mode |
1408477-1 | 3-Major | BT1408477 | When more than one PCIe AER error has occurred, diag-agent reports this as a "RAS AER 'unknown' error" instead of the individual AER errors. |
1403817 | 3-Major | BT1403817 | SNMP IF-MIB misreports the status and speed of LACP LAGs |
1401621-1 | 3-Major | BT1401621 | Modifying a remote server with multiple selectors from the web UI removes the AUTHPRIV configuration. |
1400557-1 | 3-Major | BT1400557 | Incorrect slot info may cause blade backplane link errors |
1399757 | 3-Major | BT1399757 | SNMP ifTable data missing for some interfaces when ports unbundled |
1397145-3 | 3-Major | BT1397145 | Unable to add blade to Openshift cluster if VELOS partition root password is expired or locked |
1394993 | 3-Major | BT1394993 | Upon configuration changes, the l2-agent container restarts with a core. |
1394913 | 3-Major | BT1394913 | Rare LACPD crash during process termination |
1394201 | 3-Major | BT1394201 | Vcc-lacpd can intermittently core dump when disconnected from system database |
1393269-2 | 3-Major | BT1393269 | Error log: "PINGLOOP Failed to ssh to 127.0.0.1" |
1381737-1 | 3-Major | BT1381737 | On VELOS, utils-agent generates "item is not writable" errors every fifteen minutes |
1381661-1 | 3-Major | BT1381661 | LDAP external authentication fails if there is no group definition for user's primary GID |
1381277-1 | 3-Major | BT1381277 | Most recent login information is not displayed in F5OS webUI |
1381057-2 | 3-Major | BT1381057 | Opening and closing preview pane is causing the page scrollbar to disappear on View Tenant Deployments screen |
1379625-3 | 3-Major | BT1379625 | Changing the max-age attribute in password policy is not reflecting immediately |
1377945-2 | 3-Major | BT1377945 | Controller Upgrade Failure Reported by ConfD★ |
1366417-1 | 3-Major | BT1366417 | Long BIG-IP tenant names prevent virtual console access |
1366157-2 | 3-Major | BT1366157 | Warning needed about creating tenant with same name as existing user account name |
1365977-1 | 3-Major | BT1365977 | Container daemons running as PID 1 cannot be cored on-demand |
1360905-1 | 3-Major | BT1360905 | Unexpected log messages in /var/log/boot.log post-integrity recovery |
1360137-2 | 3-Major | BT1360137 | Non-root users unable to download or pull qkview/pcap/core files via SCP |
1359933 | 3-Major | BT1359933 | System controller fails over when mgmt ports are aggregated |
1354697 | 3-Major | BT1354697 | Stale trunk data after trunk deletion |
1354341-1 | 3-Major | BT1354341 | Changing a VLAN from trunked (tagged) to native (untagged) on a LAG in a single transaction can cause traffic outage |
1354329-3 | 3-Major | BT1354329 | Unable to access tenant through console access. |
1353985 | 3-Major | BT1353985 | Controller-manager pods fail to start with status of CrashLoopBackOff |
1353085-1 | 3-Major | BT1353085 | Configure admin/operator roles in LDAP without uidNumber or gidNumber attributes |
1352845-3 | 3-Major | BT1352845 | Some internal log content may not appear in external log server |
1352449-3 | 3-Major | BT1352449 | iHealth upload is failing with error "certificate signed by unknown authority" |
1352353 | 3-Major | BT1352353 | Remove integrity-check configurable option from CLI |
1351893-3 | 3-Major | BT1351893 | ConfD Logging 'Failed to change working directory' Error Message |
1351541-1 | 3-Major | BT1351541 | Unable to remove the ISO images that share the same minor version with the running version |
1349977-2 | 3-Major | BT1349977 | Setup wizard fails and immediately exits if it is given incorrect credentials. |
1349953-2 | 3-Major | BT1349953 | Setup wizard script gives an "All IP addresses must be unique" error when NTP and DNS servers match |
1349465 | 3-Major | BT1349465 | Partition s/w upgrade compatibility check doesn't use correct target version |
1348989-1 | 3-Major | BT1348989 | GUI virtual server CLI has different limitations for days-valid |
1348093-1 | 3-Major | BT1348093 | Appliance-setup-wizard traceback on invalid NTP input |
1341521-2 | 3-Major | BT1341521 | Incorrect subnet mask returned for GET call for /systems |
1338521-1 | 3-Major | BT1338521 | Unable to login when accessing F5OS GUI through a network proxy on a port other than 443. |
1329797-1 | 3-Major | BT1329797 | A RADIUS user who logs in through the webUI without the F5-F5OS-UID attribute configured is disconnected after 10 minutes |
1329449 | 3-Major | BT1329449 | Missing days-valid, store, and key type logging items of a certificate |
1329161-2 | 3-Major | BT1329161 | In non-FIPS mode, added support for the SSH-RSA host key algorithm |
1326125-1 | 3-Major | BT1326125 | RADIUS authentication fails if F5-F5OS-HOMEDIR attribute is not specified |
1319613-1 | 3-Major | BT1319613 | Sluggishness in SSH access to system on VELOS system controllers |
1316097 | 3-Major | BT1316097 | LAGs not programmed when adding VLAN to LAG |
1315425 | 3-Major | BT1315425 | Manual Configuration of FEC for 25G ports |
1314593 | 3-Major | BT1314593 | The snmp table F5-PLATFORM-STATS-MIB::platformMemoryStatsTable is not available on a partition. |
1307577-1 | 3-Major | BT1307577 | Add more resilience to the file download API |
1307565-1 | 3-Major | BT1307565 | The file download API is not working with the x-auth-token header |
1305005-1 | 3-Major | BT1305005 | Error handling in F5OS file-download API |
1304749-1 | 3-Major | BT1304749 | Implements duplicate port check and fix logic on standby controller |
1304085 | 3-Major | BT1304085 | Unable to set local user's password if the same user exists on a remote LDAP server |
1297357-4 | 3-Major | | WebUI authentication does not follow best practices in some situations |
1295141 | 3-Major | BT1295141 | Ability to change SNMPD listening port |
1294561-1 | 3-Major | BT1294561 | When OCSP is disabled, configurations are not accurately shown outside of 'config' mode |
1293249-1 | 3-Major | BT1293249 | AAA server group Port and Type are not displayed on ConfD |
1291513-1 | 3-Major | BT1291513 | Some log messages/timestamps do not observe configured timezone |
1289861-1 | 3-Major | BT1289861 | Ability to suppress the proceed warning generated when portgroup mode is changed |
1288765-1 | 3-Major | BT1288765 | Provide ability to manage services through systemd/docker commands from F5OS CLI |
1287245 | 3-Major | BT1287245 | DAGD component crashes during live upgrade or downgrade |
1286153-1 | 3-Major | BT1286153 | Error logs while generating the qkview |
1282185 | 3-Major | BT1282185 | Unable to restore backup file containing expired TLS certificate |
1277429 | 3-Major | BT1277429 | Operational and Configurational prompts do not persist through user sessions |
1272469 | 3-Major | BT1272469 | FPGA update status in ConfD may show error even though it was successful |
1271417 | 3-Major | BT1271417 | VELOS system controller fails to PXE boot when network-range-type is RFC1918 |
1268433-1 | 3-Major | BT1268433 | Some firewall rules do not generate denial logs |
1251957-1 | 3-Major | | SNMP OIDs to monitor serial number of the device, type of hardware and hostname |
1251161-3 | 3-Major | BT1251161 | Authentication fails via the webUI when “:” is at the end or beginning of the password |
1233865 | 3-Major | BT1233865 | Memory capacity and utilization details are confusing / misleading |
1229465-1 | 3-Major | BT1229465 | QKView is not collecting core files in /var/crash |
1224261-1 | 3-Major | BT1224261 | Chassis internal controlplane and mgmtplane traffic outage during failover and controller reboot. |
1211233-5 | 3-Major | BT1211233 | F5OS dashboard in webUI displays the system root file system usage, not the entire disk |
1204985-1 | 3-Major | BT1204985 | The root-causes of F5OS upgrade compatibility check failures are hidden in /var/log/sw-util.log. |
1196417-2 | 3-Major | BT1196417 | First time user SSH session is getting closed after password change |
1189057-1 | 3-Major | BT1189057 | LACPD fails to read system-priority at container starting time |
1188825-1 | 3-Major | | New role named "user" with read-only access to non-sensitive system level data |
1188069-1 | 3-Major | BT1188069 | F5OS installer does not indicate progress or completion state |
1181929-1 | 3-Major | BT1181929 | F5OS install may partially fail, leaving system with mismatched OS and services★ |
1166313 | 3-Major | BT1166313 | QKView now collects data from unassigned but active blades |
1162341-1 | 3-Major | BT1162341 | Front panel interface status is not reported in alarms or events |
1141573-1 | 3-Major | BT1141573 | ConfD management IP configuration command DHCP shows unusable extra options which might confuse user |
1137413 | 3-Major | BT1137413 | F5OS prompt parses \t incorrectly |
1136557-4 | 3-Major | BT1136557 | F5OS config restore fails if .iso or components vary between two devices. |
1135021-2 | 3-Major | BT1135021 | F5OS config-restore with an incorrect primary-key does not produce a warning |
1124809-1 | 3-Major | BT1124809 | Add or improve the reporting status of imported images |
1096341-3 | 3-Major | BT1096341 | During ISO import, the size was incorrectly displayed as 1 |
1069365-1 | 3-Major | BT1069365 | Error shown when configuring known-host for file transfer when FIPS mode is enabled |
1679941-2 | 4-Minor | BT1679941 | "gen error" while running snmpget/snmpbulkget commands |
1591553 | 4-Minor | BT1591553 | Including /etc/resolv.conf and /etc/hosts files in QKView capture |
1505293 | 4-Minor | BT1505293 | Partition image removal message is truncated |
1401965 | 4-Minor | BT1401965 | Copying BIG-IP ISO to /var/import/staging/, leaves ISO loopback mounted |
1399929 | 4-Minor | BT1399929 | F5OS permits non-existent ethernet interfaces to be configured |
1393441 | 4-Minor | BT1393441 | Partition fails over on link fault when mgmt ports are aggregated |
1367041 | 4-Minor | BT1367041 | Import of a system controller image fails on standby system controller during removal★ |
1353429 | 4-Minor | BT1353429 | False indication of Always-On Management (AOM) Power-On Self-Test (POST) failure for I2C1 interface |
1298865-2 | 4-Minor | BT1298865 | Upgrade compatibility issue from 1.6.0-A to 1.7.0-A, 1.6.0-C to 1.8.0-C and 1.7.0-C to 1.8.0-C |
1297349-3 | 4-Minor | | Tightening controls on uploading files to F5OS |
1186781 | 4-Minor | BT1186781 | "Warning: Invalid HW_TYPE_MINOR: 01." is observed in BIOS banner during the controller restart |
1185805 | 4-Minor | BT1185805 | The "test media" option during USB install may be interrupted by the hardware watchdog |
1161117 | 4-Minor | BT1161117 | DNS warning on cluster status is ambiguous |
1148177 | 4-Minor | BT1148177 | Add MAC Address to "show system mgmt ip" Command |
1147673-1 | 4-Minor | BT1147673 | Downloading QKViews directly from the System Reports screen. |
1128633 | 4-Minor | BT1128633 | Failed upload entries displayed under CLI file transfer-operations |
1121921-2 | 4-Minor | BT1121921 | Common name for setup-wizard tool across platforms |
Cumulative fix details for F5OS-C v1.8.1 that are included in this release
1933721-2 : Interface remain down in F5OS after removing and reinserting SFP modules
Links to More Info: BT1933721
Component: F5OS-C
Symptoms:
After SFPs are removed and reinserted in a VELOS blade, the interface will remain down in F5OS until the blade is rebooted. The peer switch may report the interfaces as having a link.
Conditions:
- VELOS chassis running F5OS-C 1.8.0
- SFPs in blade are removed and reinserted.
Impact:
F5OS interfaces remain reported as operationally down until the blade is rebooted.
Workaround:
After SFP modules are removed and reinserted on a blade, reboot the blade.
1926829-2 : When attributes are added under exporters for Open Telemetry, the keys are not visible on webUI.
Component: F5OS-C
Symptoms:
When attributes are added under exporters for Open Telemetry, the keys are not visible on the webUI.
If an exporter has existing attributes and is edited from the webUI, the attributes are deleted.
Conditions:
Adding or updating attributes to an open telemetry exporter through webUI.
Impact:
New attributes created under exporters do not have their keys visible on the webUI.
Editing the exporter from the webUI will delete existing attributes.
Workaround:
Adding attributes to exporters or updating existing exporters can be done from the CLI.
1921793-1 : Health summary is not reported for some nodes in controller and partition ConfD
Component: F5OS-C
Symptoms:
System health summary is missing for some nodes.
Conditions:
Observed when the ISO is upgraded to the 1.8.1 branch.
Impact:
System health summary is not reported for some nodes. An error is thrown while fetching the summary.
Workaround:
None
Fix:
Updated the Node tag in components. With the GET:health API fixed in diag-agent, 'show system health summary' now reports details properly for all nodes.
1920325-1 : The network-manager container crashes when it fails to create FDB entry in database
Links to More Info: BT1920325
Component: F5OS-C
Symptoms:
Network-manager container crashes.
Conditions:
The issue may occur during an upgrade/downgrade, tenant creation/deletion, or a database reset/restore.
Impact:
The network-manager container will restart.
Workaround:
None
Fix:
The network-manager will not crash when it fails to create FDB entry in database.
1891301-2 : CVE 2020-27743: pam_tacplus through 1.5.1 lacks a check for a failure of RAND_bytes()/RAND_pseudo_bytes().
Component: F5OS-C
Symptoms:
libtac in pam_tacplus through 1.5.1 lacks a check for a failure of RAND_bytes()/RAND_pseudo_bytes(). This could lead to use of a non-random/predictable session_id.
Conditions:
The pam_tacplus version in use (1.6.0) does not include the fix, which was added in the 1.6.1 source package.
Impact:
This could lead to use of a non-random/predictable session_id which means an adversary could gain access.
Workaround:
N/A
Fix:
The pam_tacplus source code has been updated to 1.7.0, which includes the fix for this vulnerability (first introduced in 1.6.1).
1890297-1 : Memory leak in l2_agent daemon on F5OS
Links to More Info: BT1890297
Component: F5OS-C
Symptoms:
- Large memory consumption by the l2_agent.
- Tenant disruption on rSeries appliance
Conditions:
- An F5OS system with SNMP configured and a LAG (Link Aggregation Group) with more than 1 member.
- SNMP monitoring in use.
You can check the l2_agent memory consumption by using the `top` command.
Example: top output showing a 15 GB l2_agent process:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19454 root 20 0 17.3g 14.9g 1788 S 0.0 5.9 174:04.57 /confd/bin/l2_agent -s appliance-1
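A minimal sketch of how the resident memory could be read from a `top` line like the one above. It assumes the default `top` column order shown in the example and that unsuffixed values are KiB; the function names are hypothetical and not part of F5OS.

```python
def parse_top_size(field: str) -> float:
    """Convert a top memory field such as '14.9g', '512m', or '1788' to KiB."""
    units = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field)  # assumption: plain numbers are KiB (top's default)


def res_gib(top_line: str) -> float:
    """Return the RES column of a top process line, in GiB."""
    cols = top_line.split()
    # Assumed columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    return parse_top_size(cols[5]) / 1024 ** 2


line = ("19454 root 20 0 17.3g 14.9g 1788 S 0.0 5.9 174:04.57 "
        "/confd/bin/l2_agent -s appliance-1")
print(f"l2_agent RES: {res_gib(line):.1f} GiB")  # prints "l2_agent RES: 14.9 GiB"
```

Sampling this value periodically and watching for steady growth is one way to spot the leak before the system runs out of memory.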
Impact:
Eventually the system experiences an out-of-memory (OOM) condition.
On an rSeries appliance, a tenant might experience disruptions and slowness, up to and including a TMM SIGABRT core.
Workaround:
Restart the l2_agent process on the F5OS host.
On an rSeries appliance, one of the following:
- as root, run the command 'docker restart system_L2'
- as an administrative user from the F5OS CLI, enter config mode and then run 'system diagnostics os-utils docker restart node platform service system_L2'
On a VELOS system, restart the partition<ID>_L2 instance on each controller. Log into the system controller as an administrative user, enter config mode, and then run:
system diagnostics os-utils docker restart node controller-1 service partition1_L2
system diagnostics os-utils docker restart node controller-2 service partition1_L2
(Adjust 'partition1' to the partition ID for the partition in question)
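As an illustration of the substitution described above, the following sketch builds the two VELOS restart commands for a given partition ID. The helper name is hypothetical; the command text is copied from the workaround, and the output is meant to be pasted into the F5OS CLI in config mode, not executed by this script.

```python
from typing import List, Sequence


def l2_restart_commands(partition_id: int,
                        controllers: Sequence[int] = (1, 2)) -> List[str]:
    """Build the per-controller restart commands for partition<ID>_L2."""
    service = f"partition{partition_id}_L2"
    return [
        "system diagnostics os-utils docker restart "
        f"node controller-{c} service {service}"
        for c in controllers
    ]


for cmd in l2_restart_commands(3):
    print(cmd)
```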
Fix:
The l2_agent process no longer leaks memory.
1889913-2 : VELOS partition Allowed IP rule restrictions
Component: F5OS-C
Symptoms:
Configuring a "system allowed-ips allowed-ip" rule in a VELOS partition without limiting it to a specific port (22, 80, 443, 7001-7032, 8888, or SNMP) allows access to more than expected.
Conditions:
- VELOS partition
- Allowed IP Address rules configured to use port 'All' in the GUI (or that do not specify a specific port when configured via the CLI)
Impact:
Permits more traffic than it should.
Workaround:
Instead of creating an 'All' port allowed IP address rule, create individual rules for each specific port.
Fix:
Rules that use port 'All' now restrict access as expected.
1871517 : CVE-2017-18342 PyYaml arbitrary code execution from untrusted data
Links to More Info: K000139901
1850481 : Standby tenant is unreachable after F5OS partition upgrade to 1.7.x or higher.
Links to More Info: BT1850481
Component: F5OS-C
Symptoms:
- The `tmsh show net arp` command may show ARP entries with an unknown status.
- The confd CLI `show dag-states` command shows dag tables consisting of only zeros.
Conditions:
* Multi-slot tenant in a device group
* Connection mirroring enabled
* Upgrade F5OS partition from 1.6.x to 1.7.x or greater
Impact:
Standby tenant is inaccessible.
Workaround:
None
Fix:
The standby tenant now remains reachable after an F5OS partition upgrade to 1.7.x or higher.
1850165-1 : Missing internal interface pgindex field causes l2-agent to restart★
Component: F5OS-C
Symptoms:
Upon upgrade from 1.1 -> 1.6 -> 1.8, l2-agent on blade will exit due to interface data mismatch. This mismatch happens because the pgindex hidden leaf is missing from cdb, but the l2-agent on blade expects it.
Conditions:
Chain upgraded from 1.1 -> 1.6 -> 1.8. Version 1.8 is the version where l2-agent added more logic to check interface data inconsistency.
Impact:
Dataplane is not functioning.
Workaround:
Delete the blades from the partition and re-add them. This requires reconfiguring the interface data (VLANs, LAG members).
Fix:
With this fix, the upgrade into 1.8 will work as expected, and l2-agent on the blade will find matching interface data.
1826209-1 : Error log does not contain all needed information.
Links to More Info: BT1826209
Component: F5OS-C
Symptoms:
An "Interface data differ" log is logged by l2-agent, but all of the compared fields in the log message are identical.
Conditions:
L2-agent logs an error message that the interface data differs.
Impact:
The lack of some data such as interface type and slot ID in the log entry makes troubleshooting more complex.
Workaround:
Save the backup configuration file, and inspect the file for hidden fields. For example, search for pgindex under the interface entry.
Fix:
With this change, the error log entry displays all required data.
1819289-2 : Zero is not allowed as Prefix Length for allowed IPs
Links to More Info: BT1819289
Component: F5OS-C
Symptoms:
It is not possible to save a prefix length with a value of ‘0’.
Conditions:
Prefix Length value is configured to '0'.
Impact:
Allowed IPs cannot be created with prefix value '0'.
Workaround:
Configure the prefix length from the CLI, which accepts a value of '0'.
Fix:
A prefix length value of '0' is now accepted.
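For context, a prefix length of 0 is a valid CIDR value that matches every address. The short sketch below uses Python's standard `ipaddress` module to show this; the `is_allowed` helper is hypothetical and only illustrates the matching semantics, not how F5OS evaluates allowed-IP rules.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_cidr: str) -> bool:
    """Return True if client_ip falls inside the allowed CIDR range."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_cidr)


everything = ipaddress.ip_network("0.0.0.0/0")
print(everything.prefixlen)                 # prints 0
print(is_allowed("192.0.2.10", "0.0.0.0/0"))     # prints True
print(is_allowed("192.0.2.10", "198.51.100.0/24"))  # prints False
```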
1818185 : The meaning of the interface phyport internal field changed from phyport to DID. This will break functionality that is using phyport★
Links to More Info: BT1818185
Component: F5OS-C
Symptoms:
Two different symptoms surfaced after the meaning of phyport changed:
- reset-counters for a specific interface
- LAG member association with a LAG in fpgamgr
Conditions:
You configure LAGs while running 1.6.2, with LAG members spanning multiple blades (e.g. blade-1 and blade-3), and then perform a live upgrade to 1.8.0 and above.
You attempt to reset counters for a specific interface (e.g. 1/1.0) after a live upgrade from 1.6.2 into 1.8.0 and above.
Impact:
For the lag members mismatch, the data packets will be forwarded to the wrong port.
For the reset counters, the counters are not reset for the specified interface.
Workaround:
For the reset-counters issue, execute reset counters for all interfaces. This will clean up the counters.
For the LAG member mismatch, reboot the blades after live install ends successfully.
Fix:
The fix addresses the mismatch in interface phyport values.
This will allow reset counters per interface to work.
This will allow the lag members to be properly handled after live upgrade.
1817669-1 : Timeout for the Ansible playbook during cluster install cannot be retried.
Component: F5OS-C
Symptoms:
If there are other issues on the chassis that cause the ansible playbooks to run slowly during Kubernetes cluster install, the playbook cannot be retried correctly if it reaches timeout.
Conditions:
This can occur when a Kubernetes cluster rebuild is executed while other issues on the chassis, such as DNS or remote auth issues, cause the ansible playbooks to run slowly.
Impact:
The Kubernetes cluster install may fail repeatedly because it will not correctly recognize the timeout, and raise the amount of time it will wait.
Workaround:
Resolve the issue(s) causing the playbooks to run slowly. This may involve removing bad DNS servers or remote auth servers that are causing the slowdown.
Fix:
The orchestration-manager code has been updated to correctly recognize the timeout error, and handle it correctly.
1814073-1 : F5OS chassis switchd core dump
Component: F5OS-C
Symptoms:
The switchd process experiences crashes that generate core dumps.
Conditions:
These crashes are typically observed during certain interface queries or other operations involving statistics updates.
Impact:
The switchd process crashes and generates core files. Temporary service disruptions may occur for functionalities reliant on the switchd process.
Workaround:
None
Fix:
This issue has been fixed, ensuring switchd includes proper handling for TMSTAT query.
1814053-2 : Orchestration Agent process may core
Component: F5OS-C
Symptoms:
Users may see the Orchestration Agent process on the active controller core.
Logs with message ID 0x401000000000027 may also appear, similar to the following:
network-manager[1]: priority="Err" version=1.0 msgid=0x401000000000027 msg="Failed to parse ZMQ message header" session_id="<sessionid>".
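A minimal sketch for pulling the key/value fields out of a log line in the format shown above, for example to search collected logs for a given msgid. The regular expression and function name are assumptions based only on this sample line, not an official log format specification.

```python
import re

# key=value pairs where the value is either double-quoted or a bare token
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_fields(line: str) -> dict:
    """Return the key/value fields of a log line (last value wins on repeats)."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = bare if bare else quoted
    return fields


sample = ('network-manager[1]: priority="Err" version=1.0 '
          'msgid=0x401000000000027 msg="Failed to parse ZMQ message header" '
          'session_id="<sessionid>".')
fields = parse_fields(sample)
print(fields["msgid"], "-", fields["msg"])
```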
Conditions:
N/A
Impact:
Tenants that are currently running are not affected. New tenants that are deployed when this condition is occurring may be delayed in coming to a running state.
Workaround:
None
Fix:
The Orchestration Agent behaves correctly.
1814045-2 : Daemons that handle ZMQ messages may crash under certain conditions.
Component: F5OS-C
Symptoms:
Certain Allowed IP configurations on partitions may cause daemons to crash when handling ZMQ messages.
Conditions:
Configurations that allow all IPs as part of Allowed IP settings.
Impact:
Under this condition, handling certain ZMQ messages may cause daemons to crash.
Workaround:
Avoid configuring a rule that allows all IPs as part of the Allowed IP configuration.
Fix:
Daemons no longer crash in partitions while handling ZMQ messages.
1789417-1 : Component fpgamgr in restart loop with segmentation fault after failed FPGA firmware update
Links to More Info: BT1789417
Component: F5OS-C
Symptoms:
Component fpgamgr experiences segmentation fault after failed FPGA firmware update and persists in a reboot loop. The CLI command "show cluster nodes node state platform fpga-state" indicates that FPGA_STATE persists in FPGA_INIT and never reaches the state FPGA_RDY.
Conditions:
FPGA firmware update fails and one or more of the FPGA devices does not show up on the PCI bus. This causes a FPGA SDK segmentation fault upon the fpgamgr component startup, and perpetual reboot loop so long as the FPGA issue persists.
Impact:
A failure in the FPGA firmware update process results in one or more FPGA devices not being detected on the PCI bus. This, in turn, causes a segmentation fault in the FPGA SDK upon the startup of the fpgamgr component, leading to a continuous reboot cycle until the FPGA issue is resolved.
Workaround:
None. A perpetual reboot loop that does not recover after attempting to load FPGA firmware typically indicates a hardware error and requires an RMA.
Fix:
One of the BARs fails to initialize when the PCIE speed does not load at the intended Generation. This issue was causing a segmentation fault in the SDK, but it has now been resolved by having the SDK notify the fpgamgr of the missing BAR instead. While the device may still fail to load, the fpgamgr will no longer experience a crashing loop as a result.
1789141-2 : If 'ldap-group is configured for a role but LDAP search fails, users with the default GID for the role can still get those privileges
Component: F5OS-C
Symptoms:
When an 'ldap-group' mapping is configured for an F5OS role and the mapping fails (because the filter is invalid or the LDAP query of remote groups fails for some other reason), the default mapping for the role (or whatever is configured in 'remote-gid' for the role) is still used.
For example, if you were attempting to map the F5OS role 'admin' (default GID 9000) to an LDAP group 'CN=my-ldapgroup', and the LDAP search for that group failed (because the provided filter was invalid, the group does not exist, etc.), users with GID 9000 would still be able to authenticate and login with 'admin' privileges.
Conditions:
1. LDAP authentication is enabled.
2. A role mapping is applied via the 'ldap-group' configuration for an F5OS role.
3. The provided 'ldap-group' filter is invalid or another unexpected issue is encountered when querying the LDAP server.
Impact:
Users can login with privileges in excess of what one might expect given the system configuration.
Workaround:
If the LDAP group/users have Posix attributes ('gidNumber'), it is possible to map the F5OS role using this GID number by specifying it in the 'remote-gid' configuration under the role.
If this is not feasible, it is possible to directly validate the 'ldap-group' mapping was successful by inspecting this file from a bash shell:
[root@appliance-1(test):Active] ~ # cat /etc/ldap-gid-map.txt
1108:=9000
If there is an entry that has the default GID for the role on the right-hand side of ':=' in this file, it means the mapping was applied successfully and users with the default or 'remote-gid' GID will not be able to obtain the role permissions. If such an entry is missing, you will need to fix the 'ldap-group' filter so an LDAP query of the group can be successful.
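The check described above can be sketched as follows. This assumes the file holds one `<remote GID>:=<role GID>` entry per line, as in the single example entry shown; the function names are hypothetical, and the text would be read from /etc/ldap-gid-map.txt on the system.

```python
def mapped_role_gids(map_text: str) -> set:
    """Collect the GIDs that appear on the right-hand side of ':=' entries."""
    gids = set()
    for raw in map_text.splitlines():
        _, sep, right = raw.strip().partition(":=")
        if sep and right.strip().isdigit():
            gids.add(int(right.strip()))
    return gids


def mapping_applied(map_text: str, role_gid: int) -> bool:
    """True if an entry maps some remote group to the role's default GID."""
    return role_gid in mapped_role_gids(map_text)


example = "1108:=9000\n"
print(mapping_applied(example, 9000))  # prints True
print(mapping_applied("", 9000))       # prints False
```

A `False` result for a role that has an 'ldap-group' configured suggests the filter needs to be corrected.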
Fix:
If a configured 'ldap-group' mapping fails, deny all role-based access for the mapped role until it is fixed or de-configured.
1789125-1 : VQF VOQ entries missing for the functional blades in the show fpga-tables output
Component: F5OS-C
Symptoms:
For example, blade 13 is in a faulty state due to a separate issue related to memory DIMMs.
The 'show fpga-tables' command returns VOQ output for blades 1 and 11.
In the vqf_voq_stat table output, the remaining VOQ stat requests, starting from blade 13, do not return data, although the tmstat tables for some other blades are intact.
Conditions:
One of the intermediate blades in the 'show components' list is faulty, which causes processing of the vqf voq stat requests to be skipped for the remaining, properly functioning blades.
Impact:
Improper output for the 'show fpga-tables vqf_voq_stat' command.
Workaround:
None
Fix:
Stats collection now completes for the remaining functional blades when one of the blades is faulty.
1785621-1 : Tenant deployed with Max Memory available on system results in Resource allocation failed - Node is up but Platform services not responding
Component: F5OS-C
Symptoms:
Tenant fails to come to running state when deployed with max memory on system.
Conditions:
The tenant was deployed with the maximum available memory on the blade in a release prior to F5OS-C 1.8.1.
Impact:
Tenant fails to come to running state.
Workaround:
Since the max memory available for tenants on blade is corrected in F5OS-C-1.8.1, the tenant memory should be configured accordingly.
Step 1. Move failed tenant to configured state and adjust the memory to the new max-available memory of the tenant.
Step 2. Move the tenant back to the deployed state.
Fix:
The maximum memory available on the system for tenant deployment has been corrected.
1783781 : Bash history file containing "PRIVATE KEY" may block qkview
Component: F5OS-C
Symptoms:
Qkview file generation gets stuck at zero percent complete:
# system diagnostics qkview status
result {"Busy":true,"Percent":0,"Status":"collecting","Message":"Collecting Data","Filename":"controller1.qkview.tar.gz"}
Subsequent attempts to generate a qkview fail with the result "Qkview capture can not be initiated. Another Qkview capture is already in progress"
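A small sketch for detecting this condition from repeated status output. It assumes the status line has the `result {...}` shape shown above; the function names and the idea of comparing several samples taken over time are illustrative, not an F5OS feature.

```python
import json


def parse_qkview_status(output: str) -> dict:
    """Extract the JSON object from a 'result {...}' status line."""
    return json.loads(output[output.index("{"):])


def looks_stuck(samples: list) -> bool:
    """True if every collected sample is busy and still at 0 percent."""
    return bool(samples) and all(
        s["Busy"] and s["Percent"] == 0 for s in samples
    )


line = ('result {"Busy":true,"Percent":0,"Status":"collecting",'
        '"Message":"Collecting Data","Filename":"controller1.qkview.tar.gz"}')
status = parse_qkview_status(line)
print(status["Status"], status["Percent"])  # prints "collecting 0"
print(looks_stuck([status, status, status]))  # prints True
```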
Conditions:
-- Generating qkview
-- The bash history file is large and contains the text "PRIVATE KEY"
Impact:
Qkview files cannot be collected.
Workaround:
1. Run system diagnostics qkview cancel
2. mv ~/.bash_history ~/.bash_history.bak
3. Re-run qkview
Fix:
TBD
1782925-3 : Active Directory LDAP integration without uidNumber/gidNumber does not work after system reboot
Links to More Info: BT1782925
Component: F5OS-C
Symptoms:
After an rSeries appliance reboot, Active Directory LDAP authentication configured with "Unix Attributes" set to false does not work and users from Active Directory are unable to authenticate with the F5OS system.
There will be messages similar to the following logged in platform.log shortly after the reboot:
authd[8]: priority="Err" version=1.0 msgid=0x3901000000000101 msg="LDAP API error during : -" oper="SASL bind" code=-1 msg="Can't contact LDAP server".
authd[8]: priority="Warn" version=1.0 msgid=0x3901000000000098 msg="Unable to retrieve domain Sid for supplied servers and domains; server will be treated as if it has unix attributes present.".
Conditions:
- F5OS device configured with Active Directory LDAP authentication, and the "Unix Attributes" setting configured as false.
- System reboots
Impact:
LDAP remote authentication does not work.
Workaround:
To work around this issue on an rSeries appliance, create a cron task to restart the system_user_manager and authentication-mgr docker containers after a system reboot:
1. Log into the system as root and create /etc/cron.d/ldap-post-reboot with these contents (not including the '==='):
===
# Workaround for post-reboot issue with LDAP auth (ID1782925)
#
# In the first five minutes after the system reboots, assume the first
# instance of the following log message that we see is a result of the management
# port lack of connectivity when the docker containers start up, and restart both
# system_user_manager and authentication-mgr once.
#
# authd[8]: priority="Err" version=1.0 msgid=0x3901000000000101 msg="LDAP API error during : -" oper="SASL bind" code=-1 msg="Can't contact LDAP server".
@reboot root timeout 5m sh -c 'tail -n0 -F /var/F5/system/log/platform.log | grep -a -m1 authd.*0x3901000000000101 && sleep 20s && echo Restarting authd and user-manager && docker restart system_user_manager authentication-mgr' || echo "Timed out"
===
This mitigation may fail under some corner cases, e.g. potentially after an upgrade or if something goes wrong with the platform services such that they don't start up within the first five minutes after system boot. In those circumstances, log into the system as root and restart the system_user_manager and authentication-mgr containers:
docker restart system_user_manager authentication-mgr
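To confirm after a reboot whether the trigger message was logged, the same pattern used by the cron entry above can be applied to collected log lines. The sketch below mirrors the `grep -m1` behavior (stop at the first match); the function name is hypothetical and the pattern is copied from the cron entry.

```python
import re
from typing import Iterable, Optional

# Same expression the cron entry passes to grep
TRIGGER = re.compile(r"authd.*0x3901000000000101")


def first_trigger(lines: Iterable[str]) -> Optional[str]:
    """Return the first line matching the LDAP bind failure, or None."""
    for line in lines:
        if TRIGGER.search(line):
            return line
    return None


log = [
    'authd[8]: priority="Warn" version=1.0 msgid=0x3901000000000098 msg="..."',
    'authd[8]: priority="Err" version=1.0 msgid=0x3901000000000101 '
    'msg="LDAP API error during : -" oper="SASL bind" code=-1 '
    'msg="Can\'t contact LDAP server".',
]
print(first_trigger(log) is not None)  # prints True
```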
1779677-1 : Multiple docker containers can get assigned the same bridge IP during rolling upgrade★
Component: F5OS-C
Symptoms:
Multiple containers can get the same bridge IP during a rolling upgrade or docker restart
[root@controller-2 ~]# docker inspect controller-services-registry-2502 | grep IPAddress
"SecondaryIPAddresses": null,
"IPAddress": "100.64.0.2",
"IPAddress": "100.64.0.2",
[root@controller-2 ~]# docker inspect partition-services-registry-2202 | grep IPAddress
"SecondaryIPAddresses": null,
"IPAddress": "100.64.0.2",
"IPAddress": "100.64.0.2",
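A minimal sketch for spotting this symptom once the bridge IP of each container has been collected (for example from `docker inspect` output as shown above). The function name is hypothetical, and the third container in the sample data is invented for illustration.

```python
from collections import defaultdict


def find_duplicate_ips(container_ips: dict) -> dict:
    """Map each bridge IP used by more than one container to those containers."""
    by_ip = defaultdict(list)
    for name, ip in container_ips.items():
        by_ip[ip].append(name)
    return {ip: sorted(names) for ip, names in by_ip.items() if len(names) > 1}


observed = {
    "controller-services-registry-2502": "100.64.0.2",
    "partition-services-registry-2202": "100.64.0.2",
    "example-other-container": "100.64.0.3",  # hypothetical entry
}
print(find_duplicate_ips(observed))
```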
Conditions:
When multiple containers start at the same time.
Impact:
This causes one of the two containers to answer requests depending on which container last refreshed the arp cache.
The other container does not work properly.
Workaround:
Reboot the system.
Fix:
The Docker address allocator uses a bitmap to manage the IP address pool, but the bitmap was not thread-safe.
The bitmap set/unset operations are now protected by a lock.
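The idea behind the fix can be illustrated with a simplified sketch: a bitmap-backed pool whose set and unset operations take a lock, so that containers starting at the same time cannot claim the same slot. This is not Docker's actual allocator code; the class and method names are hypothetical.

```python
import threading


class BitmapPool:
    """Toy address pool: index i stands for the i-th address in a subnet."""

    def __init__(self, size: int) -> None:
        self._bits = [False] * size
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # Find-and-set happens under the lock, so it is atomic across threads.
        with self._lock:
            for i, used in enumerate(self._bits):
                if not used:
                    self._bits[i] = True
                    return i
        raise RuntimeError("address pool exhausted")

    def release(self, index: int) -> None:
        with self._lock:
            self._bits[index] = False


pool = BitmapPool(128)
claimed = []
workers = [
    threading.Thread(target=lambda: claimed.extend(pool.allocate() for _ in range(16)))
    for _ in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(len(claimed), len(set(claimed)))
```

Without the lock, two threads could both observe the same free bit before either sets it, which corresponds to the duplicate bridge IP symptom.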
1779465-1 : SwitchD core file observed after live upgrade
Component: F5OS-C
Symptoms:
Users may observe core files being generated on both controllers after a system live upgrade.
Conditions:
The occurrence of the core is non-deterministic, but it can happen after the live upgrade.
Impact:
When this issue occurs, the SwitchD process generates a core file on the controller.
Workaround:
Reboot the controllers after observing SwitchD core file on the controller.
Fix:
This issue has been resolved by ensuring proper initialization during SwitchD startup.
1778689-1 : Duplicate OMD alerts during Inaccessible Memory incident
Component: F5OS-C
Symptoms:
During certain conditions where an “Inaccessible Memory” issue occurs, duplicate OMD alerts may accidentally be triggered at the same time due to overlapping OID/alert IDs associated with the same root cause.
Conditions:
This issue arises when an “Inaccessible Memory” incident occurs and OMD generates redundant “openshiftCertsExpWithinNinetyDays” alerts for the same event, causing confusion and unnecessary noise in alert tracking systems.
Impact:
False-positive or duplicate alerts for OMD.
Workaround:
To verify and troubleshoot the issue, you can:
1. Use the confD command 'show cluster cluster-status' to check the cluster's current status.
2. Analyze the openshift.log/velos.log file for any errors or abnormalities related to the incident or cluster health.
Fix:
The issue has been addressed by implementing enhanced logic in OMD alert generation to eliminate duplicate alerts resulting from overlapping OID/alert IDs. The system now ensures each alert is uniquely identified and mapped to its respective event, preventing redundancy during “Inaccessible Memory” incidents. All configurations have been updated to maintain integrity and consistency.
1772433-2 : Config restore fails after upgrade★
Links to More Info: BT1772433
Component: F5OS-C
Symptoms:
1. Bare metal to: 1.6.1-19136
2. Upgrade to: 1.8.0-19115
3. Take controller backup
4. Reset database: system database config reset-default-config
5. Attempt to apply the backup from step 3. This fails.
Conditions:
-- Upgrade from 1.6.1 to 1.8.0+
-- Perform config-restore
Impact:
Unable to perform config-restore after upgrade.
Workaround:
None
Fix:
With the fix for ID1917841, you can now perform the config-restore.
1772305-1 : Unable to deploy a tenant to both BX110 and BX520 blade in same partition
Component: F5OS-C
Symptoms:
A tenant can only be deployed to a partition if it is deployed to a node that is the same type as the other nodes that are running tenants. Deploying a multi-bladed tenant that includes both BX110 and BX520 blades is not supported.
Conditions:
Deploying a tenant to a partition that contains a mix of BX110 and BX520 blades.
Impact:
If a partition contains both BX110 and BX520 blades, you must choose to deploy tenants to one blade type or the other but not both.
Workaround:
Deploy tenants to nodes that are of the same blade type.
Fix:
None
1772053-1 : High memory usage due to log flood when one controller is in FIPS error state
Links to More Info: BT1772053
Component: F5OS-C
Symptoms:
In FIPS error state, the active controller triggers a sync to the errored controller, which results in an infinite wait loop because the peer is unreachable. This writes an enormous number of log messages to ccsync.log and consumes excessive memory.
Conditions:
One active controller and one FIPS errored out controller.
Impact:
System memory consumption is high, and log files rotate almost immediately, leaving a large volume of log messages in ccsync.log.
Workaround:
- Stop ccswatch.service
- Recover the FIPS errored controller
- Restart ccswatch.service
Fix:
Added retries to wait for a finite time period before exiting to reduce log flood and memory usage.
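The general pattern behind this fix, waiting a bounded number of times instead of looping forever, can be sketched as below. This is an illustration only, not the actual sync code; the function name, attempt count, and delay are assumptions.

```python
import time
from typing import Callable


def wait_for_peer(is_reachable: Callable[[], bool],
                  attempts: int = 5,
                  delay: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the peer a fixed number of times, then give up and return False."""
    for attempt in range(1, attempts + 1):
        if is_reachable():
            return True
        if attempt < attempts:
            sleep(delay)  # no sleep after the final attempt
    return False


# A peer that never answers: the loop ends after 3 checks and logs once.
if not wait_for_peer(lambda: False, attempts=3, delay=0.01):
    print("peer unreachable, giving up")
```

Because the caller gets a single failure result, it can log once and exit instead of emitting a message on every iteration.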
1759733-1 : Controller reboot during a controller RMA can cause openshift cluster to fail.
Links to More Info: BT1759733
Component: F5OS-C
Symptoms:
If a system controller is rebooted after its ETCD instance has been started, but before the controller has been fully added to the cluster, it can cause a failure that will not automatically recover. The controller will not be able to join the cluster after this failure.
Conditions:
A system controller is rebooted after its local ETCD instance has been started, but before the controller is fully added into the openshift cluster.
Impact:
The rebooted controller will persistently fail to join the cluster after this failure. As such the cluster will not be redundant between the 2 system controllers.
Workaround:
Rebuild the openshift cluster to recover the affected system controller.
Fix:
The fix cleans any stale ETCD state during the process of adding the controller to the cluster after the reboot. This allows the controller to be re-added to the cluster correctly.
1757729-2 : Default port for LDAP server does not match default server type
Links to More Info: BT1757729
Component: F5OS-C
Symptoms:
On the Server Groups screen, when adding an LDAP server, the port defaults to 636 (the port used for LDAP over SSL) even though the default type is LDAP over TCP. This behavior causes confusion.
Conditions:
When configuring an LDAP server.
Impact:
The default port for the LDAP over TCP type is 636 instead of 389 (the port used for LDAP over TCP), which can be confusing.
Workaround:
None
Fix:
The default value for the ‘Port’ field has been changed to 389 to align with the default value for LDAP over TCP type.
1753469 : Add notification to set-version when downgrading the system from F5OS-A/C-1.8.0
Links to More Info: BT1753469
Component: F5OS-C
Symptoms:
A downgrade to an earlier version of F5OS from F5OS-A/C 1.8.0 can leave the system inoperable. Refer to ID1712009 for more information.
Conditions:
Perform a config-restore or config reset-to-default operation to an earlier version of F5OS.
Impact:
A downgraded system may be inoperable.
Workaround:
Refer to ID1712009 for workaround.
Fix:
There is an issue with performing a config-restore after downgrading from F5OS-A/C 1.8.0 (ID1712009). If you intend to perform a config-restore or config reset-to-default operation, please refer to the F5OS-A/C 1.8.0 release notes for information on avoiding this issue.
1752821-1 : Cluster re-install with missing system controller does not complete★
Links to More Info: BT1752821
Component: F5OS-C
Symptoms:
If a cluster re-install is issued when only one system controller is installed in the chassis, the cluster re-install will not complete and the system will not be functional.
Conditions:
-- Only one system controller is in a chassis, or one of the system controllers is broken.
-- Re-installing the cluster via 'touch /var/omd/CLUSTER_REINSTALL'
Impact:
System will not be able to launch tenants or pass traffic.
Workaround:
None
Fix:
The cluster orchestration layer has been updated to allow K8S cluster install when one system controller is missing from the system. If the system controller is broken but still inserted into the system, the file "/var/omd/FORCE_PEER_CC_MISSING" can be created on the remaining controller, and it will behave as if the broken CC has been removed from the chassis. Once the broken controller is replaced, the /var/omd/FORCE_PEER_CC_MISSING file should be removed.
1750613-1 : If a system controller PXE boots and reimages, partitions may not start correctly, and cause data loss★
Links to More Info: BT1750613
Component: F5OS-C
Symptoms:
If a system controller PXE boots, the partition instance restart on that controller may not work and the partition instance will be left in the "failed"/not running state with no configuration database. If that instance later becomes "active" it will overwrite the correct partition configuration database with the empty database.
Example failed partition instance state:
syscon-1-active# show partitions
                                                                   RUNNING
             BLADE OS     SERVICE                  PARTITION       SERVICE      STATUS
NAME     ID  VERSION      VERSION      CONTROLLER  STATUS          VERSION      AGE
----------------------------------------------------------------------------------------
none     -   -            -
default  1   1.6.2-22734  1.6.2-22734  1           running-active  1.6.2-22734  40m
                                       2           failed          -            11m
Normally following a controller reimage, the partitions will complete restart after all the ISOs are replicated to the controller and reimported. This may take 15 to 30 minutes depending on how many images are present. The partitions will show as "failed" while this resync occurs, and then they will start up normally. In the failure case, the instance stays "failed" indefinitely.
Do NOT attempt to enable/disable the partition while it is in this "failed" state, or perform a software upgrade (set-version). If that happens, the "wiped" partition instance may start up and become Active, and all partition configuration will be lost.
Conditions:
This problem occurs when the partition is running a "patch" version of partition-services rather than a "base" version. Patch versions have a version number (major.minor.patch) that ends in a number other than “0” (zero).
A race condition may occur between the completion of the partition ISO import and the initiation of the partition, resulting in a potential declaration of success despite failure. In such cases, the operation will not be retried.
In this scenario, the partition might never get started, so it has no opportunity to form an HA pair with the other partition instance and synchronize the configuration database and tenant images. If it does eventually become Active it will erase all partition configurations.
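As a rough illustration of the base-versus-patch distinction described in these conditions, the following Python sketch classifies a version string. The function name `is_patch_version` is hypothetical, and the sketch assumes that a version string begins with major.minor.patch, optionally followed by a build suffix such as `-22734`; it is not part of F5OS.

```python
import re


def is_patch_version(version: str) -> bool:
    """Return True when the patch component of major.minor.patch is not 0."""
    # Assumption: the string starts with three dot-separated integers;
    # anything after them (for example a "-22734" build suffix) is ignored.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(3)) != 0


print(is_patch_version("1.6.2-22734"))  # patch component is 2
print(is_patch_version("1.8.0"))        # patch component is 0
```

A partition for which this check returns True would fall under the conditions described for this issue.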
Impact:
All partition and tenant configuration in that partition is lost.
Workaround:
Following a PXE boot or reimage of the controller, check the status of all partition ISOs using the "show image partition" command. For patch versions, the partitions may stay in the "failed" state. However, for base versions, the partition should automatically restart and become running-standby within approximately 5 minutes after the ISOs have been imported. No further corrective action is necessary in this scenario.
To recover, force the partition instance startup code to retry by changing the partition configuration in a minimally disruptive way. F5 recommends toggling the partition mgmt-ip to 'none' and then back, as this forces the retry but does not permanently change any configuration.
Example:
syscon-1-active(config)# partitions partition default config mgmt-ip ipv4 address 0.0.0.0 ; exit
syscon-1-active(config)# commit
Commit complete.
syscon-1-active(config)# partitions partition default config mgmt-ip ipv4 address <ip address>; exit
syscon-1-active(config)# commit
Commit complete.
syscon-1-active(config)#
Do NOT attempt to enable/disable the partition while an instance is in this "failed" state following a reimage or perform a software upgrade (set-version). If that happens, the "wiped" partition instance may become Active, and all partition configuration will be lost.
Fix:
Partitions restart and form an HA pair correctly following system controller reimage/replacement, regardless of partition services version.
1737677-1 : Reboot of both system controllers results in dataplane issues
Component: F5OS-C
Symptoms:
Traffic outage after simultaneously rebooting both system controllers.
Conditions:
With a multi-blade partition configured, reboot both system controllers simultaneously.
Impact:
Traffic outage
Workaround:
Reboot blades in affected partition.
1737517-1 : Rare partition startup conditions can cause persistent application-communication error on that partition
Links to More Info: BT1737517
Component: F5OS-C
Symptoms:
A persistent application-communication error occurs while executing partition commands related to tenants. Affected commands include, but are not limited to, commits related to VLANs, tenants, and interfaces, and commands that show data related to VLANs, tenants, and interfaces. Errors are logged persistently in the partition's confd.log and devel.log about an unregistered lac_mac_hook/write_all callpoint.
Conditions:
Specific cases in which a partition fails over, starts up, or is reset to its default settings.
Impact:
The partition is effectively inoperable, as very few commands are unrelated to VLANs and tenants. Additionally, VLANs are functional.
Workaround:
Reboot active partition's system controller or toggle the partition's enabled state.
1730833-4 : Tmm may egress broadcast traffic even when VLANs are disabled in F5OS
Links to More Info: BT1730833
Component: F5OS-C
Symptoms:
In certain scenarios such as restoring a UCS on an F5OS tenant, if the VLANs in F5OS are disabled, tmm may egress broadcast traffic such as gratuitous ARPs onto the disabled VLANs.
Conditions:
-- An F5OS tenant where VLANs were assigned and then removed.
-- An F5OS tenant where tmm is not in forced-offline mode.
-- An action occurs on the tenant (such as restoring a UCS or restarting tmm, or loading the config) that results in gratuitous ARPs.
Impact:
This could cause IP address conflicts on the network or other issues related to unexpected broadcast traffic such as gratuitous ARPs on the network.
Workaround:
- In F5OS, remove the affected VLANs from the LAG or interface.
- On the tenant use forced offline to prevent traffic egress.
- If you are restoring a UCS from another BIG-IP, such as for a platform migration, put the source BIG-IP into the forced offline state before taking the UCS.
- Delete the tenant, and recreate it without any VLANs assigned.
Fix:
A single tenant with a VLAN that was configured and then removed via F5OS will no longer leak broadcast traffic onto the network on the removed VLAN.
This fix does not address the issue when multiple tenants are attached to the same vlan. F5 has created ID1758957 for that issue.
1710765-2 : The node number fetched by the SNMP disk stats handler from the disk operational handler has the wrong blade value.★
Links to More Info: BT1710765
Component: F5OS-C
Symptoms:
Rarely, SNMP command output may not show the disk stats for a particular blade. This can happen because an incorrect blade value is fetched from the backend.
The partition "velos.log" file may show the following logs:
1. <Timestamp> default platform-stats-bridge[8]: nodename=controller-2(p1) priority="Err" version=1.0 msgid=0x4305000000000007 msg="" msg="Invalid slot value." value=761491247.
2. <Timestamp> default platform-stats-bridge[8]: nodename=controller-2(p1) priority="Err" version=1.0 msgid=0x4305000000000007 msg="" msg="Failed to assign blade instance" value=761491247.
Conditions:
1. Upgrade the partition
2. Configure SNMP community of any version
3. Execute SNMPWalk command on the disk stats table MIB.
Impact:
SNMPWalk will miss the disk utilization stats of the problematic blade.
Workaround:
As a workaround, either restart the platform-stats-bridge container of the partition or disable/enable the partition from Confd.
Fix:
As a workaround, either restart the platform-stats-bridge container of the partition or disable/enable the partition from Confd.
1710453-1 : Partition configuration wiped out during Controller reboot
Links to More Info: BT1710453
Component: F5OS-C
Symptoms:
In rare cases the partition configuration volume can be wiped during a system controller reboot when partitions are disabled, resulting in partition configuration loss.
Conditions:
When partitions are disabled and a system controller is rebooted there can be a shutdown race between a (spurious) resize request and LVM shutdown that can cause one of the partition volumes to get removed.
When the partition is subsequently enabled, whichever controller instance starts first will establish the current configuration. If the instance that was removed starts first, the partition is reinitialized to a clean configuration.
If the partition is running when the system controller reboots it will automatically resync itself from the other system controller as soon as it restarts. Configuration loss is not observed, though there may be missing logfiles on one of the system controller partition instances.
Impact:
Partition and tenant configuration is lost, and must be restored from backup before continuing.
Workaround:
Partitions should be left enabled. As long as at least one partition instance is running, the high availability subsystem will ensure that no configuration is lost.
Chassis power loss won't trigger this problem since there won't be a "race" between the stopping components.
Fix:
The spurious resizes no longer occur, and the error paths in partition volume resize and partition enablement can no longer result in removing the volumes.
1710405-1 : MAC exhausted error can occur even though there are available MACs
Links to More Info: BT1710405
Component: F5OS-C
Symptoms:
MAC address processing during tenant configuration can result in a "MAC exhausted" error even though there are available MAC addresses.
Conditions:
If the processing of a tenant's configuration releases MAC addresses to the partition's free list then this can erroneously cause a MAC exhaustion error. In this case there may be error logs in velos.log as well indicating failure to update or modify the MAC address pool.
Impact:
This can disrupt tenant configuration.
Workaround:
Modifying the tenant in the CLI when adding VLANs to a tenant is less likely to run into this issue.
Fix:
The code has been modified to log the error but not cause the misleading MAC exhaustion error and not block tenant configuration.
1709665-2 : Blade NotReady after liveupgrade★
Links to More Info: BT1709665
Component: F5OS-C
Symptoms:
A blade is stuck in the NotReady state after an upgrade.
Conditions:
-- The VELOS system is being upgraded.
-- A reboot is triggered before the grub config update is complete.
Impact:
Blade stuck in NotReady state.
Workaround:
Perform a clean install of the blade by PXE installing it. Connect to the serial console of the blade and interrupt the boot process by selecting 'b' when the boot process displays "Press <c> to enter setup".
1709121-5 : Unable to create a tenant as the Network Manager start-up or failover may result in a looping process
Links to More Info: BT1709121
Component: F5OS-C
Symptoms:
While creating a new tenant, an error occurs:
"Failure for data/f5-tenants:tenants API. The server or an underlying service is unreachable."
The network-manager service seems to hang, or it might be in a restart loop.
In confd, the 'show system mac-allocation state' command indicates that no MAC addresses have been allocated.
$ show system mac-allocation state
system mac-allocation state free-single-macs 16
system mac-allocation state allocated-single-macs 0
system mac-allocation state free-large-blocks 2
system mac-allocation state allocated-large-blocks 0
system mac-allocation state free-medium-blocks 0
system mac-allocation state allocated-medium-blocks 0
system mac-allocation state free-small-blocks 0
system mac-allocation state allocated-small-blocks 0
system mac-allocation state total-free-mac-count 80
system mac-allocation state total-allocated-mac-count 0 <---
system mac-allocation state total-mac-count 80
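A minimal Python sketch of how this output could be checked programmatically is shown here. It assumes the output has already been captured as text in the line format shown in the example; the function names `parse_mac_allocation` and `nothing_allocated` are hypothetical helpers, not F5OS tooling.

```python
import re

# Each counter line has the form:
#   system mac-allocation state <counter-name> <value>
LINE_RE = re.compile(r"^system mac-allocation state (\S+)\s+(\d+)")


def parse_mac_allocation(output: str) -> dict:
    """Map each counter name to its integer value."""
    counters = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def nothing_allocated(counters: dict) -> bool:
    """True when the total allocated MAC count is reported as 0."""
    return counters.get("total-allocated-mac-count") == 0


SAMPLE = """\
system mac-allocation state total-free-mac-count 80
system mac-allocation state total-allocated-mac-count 0 <---
system mac-allocation state total-mac-count 80
"""
counters = parse_mac_allocation(SAMPLE)
print(counters, nothing_allocated(counters))
```

Trailing annotations such as the `<---` marker are ignored because only the counter name and the first integer after it are captured.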
Conditions:
This can occur with combinations of tenants using MAC blocks larger than size 1. The specific combinations are somewhat unpredictable.
Impact:
Tenants cannot be created.
Workaround:
None
Fix:
The code will be updated to prevent the hang condition.
1699821-2 : Partition data missing
Links to More Info: BT1699821
Component: F5OS-C
Symptoms:
The system controller can be rebooted while a partition is being created. This can cause the partition to not function correctly.
Conditions:
A system controller is rebooted while the partition is being created.
Impact:
The partition does not work as expected. One or more of the /config, /shared, and /images paths will be missing.
Workaround:
Disable and delete the defective partition, then re-create the partition.
Fix:
Controller reboot during partition creation completes correctly after the controller returns to service.
1696325-1 : Unresolved VQF IMM watchdogs after system controller failover, VoQ Window Errors, and extensive disconnect to confd
Component: F5OS-C
Symptoms:
The VoQ IMM Enabled status in the fpga-tables vqf-voq-stats output from the CLI remains 0 indefinitely resulting in traffic loss between blades.
Example:
show fpga-tables vqf-voq-stats
COS MEM COS WIN
EMM IMM SMS FILL FULL HI COS LO SMS EMM IMM ERR
SLOT NAME ENABLED ENABLED DRPLVL PKT CNT BYTE CNT DROP DROP DROP DROP DROP DROP DROP CNT
--------------------------------------------------------------------------------------------------------------------------
3 13.12 1 0 32767 1819895878 2330473381038 200121 0 0 86532 0 14 9 0
3 13.13 1 0 32767 1815815755 2322725261469 251277 0 0 58031 0 14 9 0
3 13.14 1 0 32767 1824204787 2337092078111 211707 0 0 1528 0 14 9 0
3 13.15 1 0 32767 1839939128 2357633747305 208636 0 0 0 0 14 9 0
3 13.4 1 0 32767 0 0 0 0 0 0 0 14 9 0
3 13.9 1 0 5427 0 0 0 0 0 0 0 14 9 0
Conditions:
A temporary loss of the dataplane links between the system controller and a blade on a system, followed by an extensive outage for that blade to the confD database.
Impact:
Traffic loss from the blade reporting the zero values for IMM Enabled towards the destination blade. The destination blade is indicated by the first number in the decimal of the "NAME" column.
For instance, if the IMM ENABLED values are 0 for slot 3 and NAME "13.12", this indicates that traffic from slot 3 towards slot 13 will be lost.
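The interpretation described here can be sketched in Python as follows. The sketch assumes the table has been captured as whitespace-separated text with SLOT, NAME, EMM ENABLED, and IMM ENABLED as the first four columns, as in the example output; `lost_traffic_paths` is a hypothetical helper name.

```python
def lost_traffic_paths(table_text: str) -> set:
    """Return (source slot, destination slot) pairs whose IMM ENABLED is 0."""
    paths = set()
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric slot followed by a NAME such as "13.12";
        # header and separator lines are skipped.
        if len(fields) < 4 or not fields[0].isdigit() or "." not in fields[1]:
            continue
        source_slot = int(fields[0])
        imm_enabled = int(fields[3])
        if imm_enabled == 0:
            # The destination blade is the number before the decimal point.
            destination_slot = int(fields[1].split(".")[0])
            paths.add((source_slot, destination_slot))
    return paths


SAMPLE = """\
SLOT NAME ENABLED ENABLED DRPLVL PKT CNT BYTE CNT
3 13.12 1 0 32767 1819895878 2330473381038 200121 0 0 86532 0 14 9 0
3 13.4 1 0 32767 0 0 0 0 0 0 0 14 9 0
"""
print(lost_traffic_paths(SAMPLE))
```

Several rows can map to the same pair, so the result is a set of distinct source and destination slots.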
Workaround:
Reboot the blades reporting the IMM Enabled values of 0.
1696269-1 : If partition confd initiates a failover due to a health fault, it may incorrectly attempt to fail over repeatedly
Links to More Info: BT1696269
Component: F5OS-C
Symptoms:
In some conditions, when the partition confd initiates a failover to the other controller, it fails to complete the failover in a timely fashion and the original instance reclaims the active role. If the failover was due to a controller fault and the fault is still present, it will immediately fail over again.
Conditions:
If a controller health fault is present on system controller-1, and the partition redundancy mode is set to either "auto" or "prefer-1".
Impact:
While the partition instance is failing back and forth, the control-plane functions are unavailable or degraded, and this can impact dataplane operations.
Workaround:
Set the partition "system redundancy config mode" to "active-controller". When a controller fault exists, and the controller fails over, the partition will automatically prefer to follow the active controller location.
1696157-4 : Container api-svc-gateway crashes after enabling a tenant
Links to More Info: BT1696157
Component: F5OS-C
Symptoms:
The api-svc-gateway container crashes intermittently.
The logs contain the following entries:
appliance-1.chassis.local tcpdumpd-manager[8]: priority="Info" version=1.0 msgid=0x5401000000000095 msg="Interfaces/VLANs were removed. No change to hardware programming needed.".
appliance-1.chassis.local Core-helper.Appliance: priority="Err" msgid="0x6501000000000001" msg="Core dumped on Appliance" process="api_svc_gateway" location="/var/shared/core/container/core.system_api_svc.api_svc_gateway.25499.1728690599.core.gz"
appliance-1.chassis.local alert-service[9]: priority="Notice" version=1.0 msgid=0x2201000000000029 msg="Received event." event="327680 appliance core-dump EVENT NA 'Core dumped on appliance. process=api_svc_gateway, location=/var/shared/core/container/core.system_api_svc.api_svc_gateway.25499.1728690599.core.gz'
Conditions:
1. Enabling a tenant by changing its running-state to deployed.
2. Enabling a tenant followed by deleting the tenant from the CLI promptly.
Impact:
The api-svc-gateway container crashes.
Workaround:
None. The api-svc-gateway will restart immediately and tenants will be recovered automatically.
Fix:
The api-svc-gateway no longer crashes, and the tenant will be in the expected state after performing the operations.
1695589-1 : Data-plane links are bounced on HA failover
Links to More Info: BT1695589
Component: F5OS-C
Symptoms:
If the active management port link is cycled down and up, a system controller and partition HA failover will occur. When the system controller failover occurs, a slot state change event is generated causing switchd to "link bounce" all data plane ports even though the slot state on those ports has not changed.
Any act performed on the chassis that would cause a slot state change event will trigger this behavior. That includes inserting or removing a blade.
The impact of the link bounce can be observed by 'IMM watchdog events' reported in the partition's velos.log (/var/F5/partition<id>/velos.log):
fpgamgr[14]: nodename=controller-1(p4) nodename=blade-3(p4) priority="Warn" version=1.0 msgid=0x305000000000008 msg="VQF IMM Watchdog." slot=5 port=9.
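As a hedged illustration, watchdog events of this form could be extracted from a captured log with a short Python sketch such as the one below. It assumes the message text and the `slot=`/`port=` fields appear exactly as in the sample line; `watchdog_events` is a hypothetical helper name.

```python
import re

# Matches the message text and the slot/port fields shown in the sample log line.
WATCHDOG_RE = re.compile(r'msg="VQF IMM Watchdog\."\s+slot=(\d+)\s+port=(\d+)')


def watchdog_events(log_text: str) -> list:
    """Return a (slot, port) tuple for every VQF IMM watchdog log entry."""
    return [(int(slot), int(port)) for slot, port in WATCHDOG_RE.findall(log_text)]


SAMPLE = (
    'fpgamgr[14]: nodename=controller-1(p4) nodename=blade-3(p4) priority="Warn" '
    'version=1.0 msgid=0x305000000000008 msg="VQF IMM Watchdog." slot=5 port=9.'
)
print(watchdog_events(SAMPLE))
```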
Conditions:
This occurs when the active system controller management link is marked down, resulting in an HA switchover, or when any other act performed on the chassis leads to a slot state change event (for example, removing or inserting a blade).
Impact:
The data plane links are bounced (brought down and immediately back up), which triggers the VQF IMM watchdogs.
Workaround:
None.
1691557-1 : CVE-2020-8037: tcpdump memory leak.
Links to More Info: K000149929
1682425-1 : Rate limiting does not work on BX520 front panel interfaces
Component: F5OS-C
Symptoms:
Broadcast and DLF traffic on BX520 front-panel interfaces is not rate-limited.
Conditions:
Excessive broadcast or DLF traffic is present at the front panel interfaces.
Impact:
Excessive broadcast or DLF traffic can cause traffic loss.
Workaround:
None
Fix:
This issue has been fixed by configuring the BX520 rate-limiter hardware correctly.
1681533 : F5 VELOS ATSE firmware v7.10.7.12
Component: F5OS-C
Symptoms:
F5 VELOS ATSE firmware v7.10.7.12
Conditions:
F5 VELOS system
Impact:
Not applicable.
Workaround:
None
Fix:
Fixes intermittent register access. See ID1624057 for more information.
1681529 : F5 VELOS ATSE firmware v7.10.7.02
Component: F5OS-C
Symptoms:
F5 VELOS ATSE firmware v7.10.7.02
Conditions:
F5 VELOS system
Impact:
Not applicable.
Workaround:
None
Fix:
Fixes intermittent register access. See ID1624057 for more information.
1681525 : F5 VELOS ATSE firmware v7.10.7.22
Component: F5OS-C
Symptoms:
F5 VELOS ATSE firmware v7.10.7.22
Conditions:
F5 VELOS system
Impact:
Not applicable.
Workaround:
None
Fix:
Fixes intermittent register access. See ID1624057 for more information.
1681521 : F5 VELOS ATSE firmware v7.10.7.11
Component: F5OS-C
Symptoms:
F5 VELOS ATSE firmware v7.10.7.11
Conditions:
F5 VELOS system
Impact:
Not applicable.
Workaround:
None
Fix:
Fixes intermittent register access. See ID1624057 for more information.
1681501 : F5 VELOS ATSE firmware v7.10.7.00
Component: F5OS-C
Symptoms:
F5 VELOS ATSE firmware v7.10.7.00
Conditions:
F5 VELOS system
Impact:
Not applicable.
Workaround:
None
Fix:
Fixes intermittent register access. See ID1624057 for more information.
1680105-2 : Using 'iburst' option is preferred when adding NTP servers.
Links to More Info: BT1680105
Component: F5OS-C
Symptoms:
System time sometimes drifts even with an NTP server configured.
Conditions:
This is a common occurrence among specific NTP servers.
Impact:
System time drift.
Workaround:
Use 'iburst' option.
This option makes synchronization with the server more reliable and improves initial accuracy.
Fix:
In 1.8.1 and later, if the NTP server settings are not explicitly specified, they default to iburst=true and association-type=pool.
Existing NTP configurations that use the default settings are updated to the new default settings after the upgrade.
This change is relatively safe and is not likely to result in any problems.
1679941-2 : "gen error" while running snmpget/snmpbulkget commands
Links to More Info: BT1679941
Component: F5OS-C
Symptoms:
A shell script that runs snmpget/snmpbulkget in a loop with a 50-second delay between iterations reports genError for hrStorageAllocationUnits.
Conditions:
snmpwalk fetches the value for any index, and the key that is passed is not validated.
Impact:
Some OIDs report an error, for example
Error in packet
Reason: (genError) A general failure occured
Failed object: HOST-RESOURCES-MIB::hrStorageAllocationUnits.131080
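For scripts that poll in a loop, a small Python sketch like the following could pick the failing OIDs out of captured command output. It assumes the error text has the three-line shape shown in the example; `failed_objects` is a hypothetical helper name.

```python
import re

# Captures the OID named on a "Failed object:" line.
FAILED_RE = re.compile(r"^Failed object:\s*(\S+)", re.MULTILINE)


def failed_objects(snmp_output: str) -> list:
    """Return the OIDs reported as failed when the output contains a genError."""
    if "(genError)" not in snmp_output:
        return []
    return FAILED_RE.findall(snmp_output)


SAMPLE = """\
Error in packet
Reason: (genError) A general failure occured
Failed object: HOST-RESOURCES-MIB::hrStorageAllocationUnits.131080
"""
print(failed_objects(SAMPLE))
```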
Workaround:
None
Fix:
The index/key is now validated.
1677797-1 : OMD on Active CC hung due to 'oc delete project' command hang, after delete and recreate a partition and move slots
Links to More Info: BT1677797
Component: F5OS-C
Symptoms:
After deleting and recreating a partition and then moving slots into the new partition, the following occur:
* Blade scheduling is disabled
* multus and/or kubevirt are unhealthy
* Pods pending in the new partition
* Controller-manager pods CrashLoopBackOff
* New partition namespace is terminating
Conditions:
This issue occurs when you delete and recreate a partition.
During this operation, slots are moved to the new partition.
The ‘oc delete project’ command hangs, causing OMD Active CC to hang.
Impact:
This leads to system instability due to blade scheduling issues. Unhealthy pods impact functionality and service availability.
Workaround:
Restart OMD services on Active CC.
Fix:
The issue has been resolved by adding timeouts to the ‘oc delete project’ command. This ensures the operation will not hang indefinitely, preventing the OMD Active CC from locking up and allowing the system to recover cleanly after partition and slot changes. You should now experience improved reliability during these operations.
1673925-4 : Missing masquerade MAC FDB entry causes excessive DLFs following tenant failover.
Links to More Info: BT1673925
Component: F5OS-C
Symptoms:
The FDB entry for the tenant's masquerade MAC is missing from a blade's internal L2 table after a tenant failover.
The output of the following command does not contain the expected entry:
[root@blade-1 ~]# docker exec -i partition_fpga tmctl -d blade -w 180 nse_l2 -s mac,l2_tag
mac l2_tag
--- ------
[root@blade-1 ~]
where MAC and L2_tag match the masquerade MAC and VLAN from the output of 'show FDB'
Conditions:
During tenant failover, the system will delete the masquerade MAC from the old active and add it to the new active. In parallel, the system will detect a port-motion event when the tenant issues a GARP for the new MAC.
This introduces a race condition between the static ADD from the system and the dynamic port-motion event from the H/W. If the port-motion event is processed last, the new static entry can be deleted erroneously.
Impact:
All front-panel traffic towards the tenant will encounter a DLF, causing excessive DLF traffic to the tenant.
Workaround:
From the tenant, remove and then re-add the masquerade MAC to the traffic group.
Fix:
For port-motion events, the existing entry is no longer deleted if it is a static system entry.
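The ordering problem and the fix can be modeled with a simplified Python sketch. This is an illustration only, not the actual F5OS implementation: the event names, the dictionary-based table, the `protect_static` switch, and the MAC address are all assumptions made for the example.

```python
def replay(events, protect_static):
    """Apply FDB events in order and return the resulting table."""
    fdb = {}
    for kind, mac in events:
        if kind == "static_add":
            # Static ADD issued by the system for the masquerade MAC.
            fdb[mac] = "static"
        elif kind == "port_motion":
            # Fixed behavior: leave a static system entry in place.
            if protect_static and fdb.get(mac) == "static":
                continue
            # Original behavior: the existing entry is deleted.
            fdb.pop(mac, None)
    return fdb


MAC = "02:00:00:00:00:01"  # hypothetical masquerade MAC
bad_order = [("static_add", MAC), ("port_motion", MAC)]
print(replay(bad_order, protect_static=False))  # entry lost
print(replay(bad_order, protect_static=True))   # entry kept
```

When the port-motion event is processed first, the static entry survives in both modes, which is why the problem depends on event ordering.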
1672269-1 : Blades missing L2 entries causing excessive DLFs.
Links to More Info: BT1672269
Component: F5OS-C
Symptoms:
Excessive DLFs from certain blades due to missing L2 entries.
The 'l2fs_stat' tmstat table shows the IDs of the blades to which L2 entries will be forwarded:
[root@blade-1 ~]# docker exec -i partition_fpga tmctl -d blade -w 180 l2fs_stat -s svc_ids
svc_ids
---------------------------------
[ 0x2c 0x4c 0x6c 0x8c 0xac 0xcc ]
[root@blade-1 ~]#
In this example, blade-1 will forward to blades 3, 5, 7, 9, 11, and 13.
A blade should have an entry for all other blades in the partition.
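A minimal Python sketch of this check follows. It assumes the tmctl output has been captured as text and that the ID list appears on a single bracketed line, as in the example; the helper names and the seven-blade partition size are assumptions for illustration.

```python
import re

# Matches a line such as "[ 0x2c 0x4c ]" or "[ ]" but not a shell prompt.
SVC_IDS_RE = re.compile(r"^\[\s*((?:0x[0-9a-fA-F]+\s*)*)\]$")


def parse_svc_ids(output: str) -> list:
    """Return the service IDs from the first bracketed line, as integers."""
    for line in output.splitlines():
        match = SVC_IDS_RE.match(line.strip())
        if match:
            return [int(token, 16) for token in match.group(1).split()]
    return []


def missing_peer_entries(output: str, blades_in_partition: int) -> int:
    """Count how many of the other blades have no entry in svc_ids."""
    expected = blades_in_partition - 1
    return max(0, expected - len(parse_svc_ids(output)))


SAMPLE = """\
[root@blade-1 ~]# docker exec -i partition_fpga tmctl -d blade -w 180 l2fs_stat -s svc_ids
svc_ids
---------------------------------
[ 0x2c 0x4c 0x6c 0x8c 0xac 0xcc ]
"""
print(parse_svc_ids(SAMPLE))
print(missing_peer_entries(SAMPLE, blades_in_partition=7))
```

An empty list (`[ ]`) on a blade in a multi-blade partition yields a non-zero count, which corresponds to the failure case for this issue.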
Conditions:
Reboot of a tenant or changing the tenant from deployed to configured back to deployed.
Impact:
L2 entries learned on the affected blade are not forwarded to other blades causing missing L2 entries on those blades.
Workaround:
Reboot the blade that's missing the entries for other blades.
For example, blade-1 is missing IDs for all blades in the partition:
[root@blade-1 ~]# docker exec -i partition_fpga tmctl -d blade -w 180 l2fs_stat -s svc_ids
svc_ids
---------------------------------
[ ]
[root@blade-1 ~]#
Fix:
On tenant deletion, service IDs belonging to the L2FwdSvc are no longer removed.
1670437-1 : Jumbo frames with an IP length greater than 9174 bytes may be dropped
Links to More Info: BT1670437
Component: F5OS-C
Symptoms:
Jumbo frames with an IP total length greater than 9174 bytes are dropped when traversing the VELOS inter-blade backplane.
Conditions:
This issue may occur for VELOS tenants with a VLAN MTU set to 9175 or higher.
Impact:
Data transfers between a VELOS tenant and another host configured with the same MTU may be disrupted. Individual packets may be dropped, or some flows may be permanently dropped.
Workaround:
Do not set the VLAN MTU higher than 9174 on a VELOS tenant.
Fix:
The MTU limit of the inter-blade backplane has been increased to align with the maximum supported size of jumbo frames, ensuring that jumbo frame communication is reliably transmitted without packet drops.
1670029-1 : Reset counter functionality not working properly on rSeries platforms
Links to More Info: BT1670029
Component: F5OS-C
Symptoms:
On rSeries appliances, interface counters will be reset briefly but then revert to the previous values. This behavior occurs within both the Link Aggregation Group (LAG) and individual interfaces, affecting the accuracy of network statistics and troubleshooting efforts.
Conditions:
Execute the “reset counters all” or equivalent command. The counters briefly reset before reverting to their previous values.
Impact:
The issue impacts the accuracy of interface statistics displayed in the GUI section under “Network -> Network Details.” When you reset counters for a specific interface, only the “Out” counters are successfully reset to 0, while the “In” counters remain unchanged or continue increasing. This causes confusion or incorrect reporting during network diagnostics or performance monitoring.
Workaround:
None
1660961-4 : Active Directory LDAP integration without uidNumber/gidNumber does not work with LDAP over TLS
Links to More Info: BT1660961
Component: F5OS-C
Symptoms:
Configuring an F5OS device to integrate with Active Directory using group names to map to roles rather than requiring unix attributes (uidNumber/gidNumber) in the directory will not work if the LDAP servers are configured to use encryption (TLS/SSL).
Log messages similar to the following in platform.log / velos.log:
authd[8]: priority="Err" version=1.0 msgid=0x3901000000000101 msg="LDAP API error during : -" oper="bind" code=-1 msg="Can't contact LDAP server".
authd[8]: priority="Warn" version=1.0 msgid=0x3901000000000098 msg="Unable to retrieve domain Sid for supplied servers and domains; server will be treated as if it has unix attributes present.".
Conditions:
- LDAP system authentication configured to authenticate against an Active Directory Server
- Under the system Authentication Settings configuration in the Common LDAP Configuration section, "Authenticate with Active Directory" set to True and "Unix Attributes" set to False
- LDAP group filters specified for one or more roles
Impact:
LDAP authentication operates as if unix attributes (uidNumber/gidNumber) were present in the directory, so mapping group names to roles does not work.
Workaround:
None
1644293 : Interface status alert and SNMP trap is not sent immediately after interface is disabled
Links to More Info: BT1644293
Component: F5OS-C
Symptoms:
When an interface is disabled, the alert or SNMP trap is not sent immediately.
Conditions:
-- Disable an interface.
Impact:
No alert or SNMP trap is sent when an interface is disabled. The trap is sent when the interface is re-enabled.
Workaround:
None
Fix:
A new "Interface disabled" event is now triggered when an interface is disabled. The "Interface up" and "Interface down" alerts have been changed to events.
1644221-3 : Log file grows to gigabytes (GBs) under /var/log
Links to More Info: BT1644221
Component: F5OS-C
Symptoms:
The default setting for log rotation on host-os is once per day. This can be troublesome if a problem arises and causes an excessive volume of logs to be generated. In such cases, the log files will grow to several GBs within a day.
Conditions:
If any service floods the log files under /var/log, the files grow to gigabytes in size.
Impact:
System disk gets full and becomes unusable.
Workaround:
None
Fix:
This issue has been fixed, and log files no longer grow to gigabytes in size.
1644185-1 : DAG State table is not cleaned when a tenant is deleted or moved to configured/provisioned
Links to More Info: BT1644185
Component: F5OS-C
Symptoms:
The DAG State table is not cleared when a tenant is deleted, or moved to the configured or provisioned state.
Conditions:
1. Deploy a tenant and confirm the sDAG state table is present in partition ConfD.
2. Delete the tenant
Impact:
DAG State table is not deleted. The stale table is no longer functional.
Workaround:
The stale table can be manually deleted.
Fix:
DAG State table is now cleaned when a tenant is deleted.
1642081 : "default" partition key sometimes initialized improperly★
Links to More Info: BT1642081
Component: F5OS-C
Symptoms:
There is a potential for the default partition to incorrectly initialize the partition primary key at initial startup.
If this happens the API gateway on the blades will log this error message and secure tenants will be unable to connect.
2024-09-05T17:05:18.626737+00:00 default api-svc-gateway[12]: nodename=blade-1(p1) priority="Err" version=1.0 msgid=0x5803000000000010 msg="Key header check failed" HEADER="????xg?A????j?8?????p?}=?ajT".
Once the database & key are mismatched, the partition database is non-recoverable.
Conditions:
This issue only affects the "default" partition, and only during initial database creation following either a USB install or resetting the system controller database using "system database config reset-default-config true".
It does not affect any other partition. It does not occur if the controller database is reinitialized using "system database reset-to-default".
Impact:
Tenant will be unable to connect to the API Gateway and start up correctly.
Other encrypted fields will also be unable to be decoded.
Workaround:
Before configuring and enabling the default partition, recreate the default partition using the following command sequence.
syscon-2-active# config
Entering configuration mode terminal
syscon-2-active(config)# no partitions partition default
syscon-2-active(config)# validate
Failed: illegal reference 'slots slot 1 partition'
syscon-2-active(config)# partitions partition default ; exit
syscon-2-active(config)# validate
Validation complete
syscon-2-active(config)# commit
Commit complete.
If the partition has ever been enabled, this sequence will not have the desired effect, and will not repair the partition.
Fix:
The database startup initialization is fixed to ensure that the default partition primary key is correctly initialized.
1638629-1 : "Unhealthy" kubevirt pod due to internal networking issue with blade★
Links to More Info: BT1638629
Component: F5OS-C
Symptoms:
Some kubevirt pods are in a "CrashLoopBackOff" state following a live upgrade. The output of the 'show cluster' command shows that kubevirt status is unhealthy.
Conditions:
Exact conditions are unknown and this occurs rarely.
It was encountered during internal testing after a live upgrade.
Impact:
Might affect tenant deployment and traffic on the affected blade.
Workaround:
There are two workarounds for this issue:
1. Reboot the affected blade.
2. Unschedule and reschedule the affected node.
Steps for workaround #2:
'oc adm cordon <node>' : Marks <node> as unschedulable.
'oc adm drain <node> --delete-local-data --ignore-daemonsets' : Safely evicts all pods from the specified node, preparing it for maintenance or decommissioning.
'oc adm uncordon <node>' : Marks the node as schedulable again. After the maintenance is complete, use this command to allow new pods to be scheduled onto the node.
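The three commands in workaround #2 must run in order, and the drain should not proceed if the cordon fails. A minimal sketch of that sequence follows. The names `DRY_RUN`, `run`, and `reschedule_node` are illustrative only and are not part of F5OS or `oc`. With `DRY_RUN=1` (the default in this sketch) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: wraps the three 'oc adm' commands from workaround #2.
# DRY_RUN, run, and reschedule_node are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Cordon, drain, then uncordon a single node; stop at the first failure.
reschedule_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: reschedule_node <node>" >&2
        return 2
    fi
    run oc adm cordon "$node" || return 1
    run oc adm drain "$node" --delete-local-data --ignore-daemonsets || return 1
    run oc adm uncordon "$node"
}
```

For example, `reschedule_node <node>` prints the three commands for review. Setting `DRY_RUN=0` before calling it would execute them.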
Fix:
Follow the workaround steps, and contact F5 Support if you need further assistance.
1634545 : OpenShift cluster may fail to install if no management IP's are configured★
Links to More Info: BT1634545
Component: F5OS-C
Symptoms:
The OpenShift cluster may fail to install after a bare-metal install or cluster rebuild if no management IPs have been configured on the system controller management ports. The output of the 'show cluster' command reports that 'MasterInstall' is in a state of Failed.
syscon-1-standby# show cluster
STAGE NAME                STATUS
--------------------------------------
AddingBlade               Not Started
HealthCheck               Done
HostedInstall             Not Started
MasterAdditionalInstall   Not Started
MasterInstall             Failed        <===========
NodeBootstrap             Done
NodeJoin                  Not Started
Prerequisites             Done
RemoveBlade               InProgress
ServiceCatalogInstall     Not Started
etcdInstall               Done
Conditions:
No management IPs are configured on the system controller management ports while an OpenShift cluster install is initiated, either via a bare-metal install or a manual cluster rebuild.
Impact:
OpenShift cluster install will fail until management IPs are configured on the system controller management ports.
Workaround:
Configure management IPs on the system controller management ports.
Fix:
F5OS will now add a default route on both system controllers that will allow the OpenShift cluster install to complete even when no management addresses have been configured on the system controller management ports.
1633681-1 : Dynamic FDB entries may not be flushed from all blades when a vlan tag is removed from a LAG.
Component: F5OS-C
Symptoms:
When a vlan tag is removed from a LAG in a VELOS partition, existing FDB entries for that vlan that were learned on that LAG may not be flushed out on each blade.
If that vlan is then added to a different interface or LAG, the old FDB entries may get updated via L2 learning. But if that fails to happen (e.g. due to ID1620077), the old entries may persist.
Conditions:
Remove a vlan tag from a LAG on VELOS, and add the vlan to another.
Old FDB entries may persist when moving a vlan tag from a LAG to another LAG. If moving a vlan tag from a LAG to an interface, L2 learning seems to correct the situation.
Impact:
Because the old FDB entries are not flushed, if the system also fails to update them via L2 learning, egress traffic that matches these old entries is dropped.
This depends on which blades have the old entries and where the tenants are assigned to run. Tenant instances running on those blades are impacted, for the MAC address and vlan matching the old entry.
Workaround:
If old L2 entries persist, a reboot of the blade is required to clear them out.
1633073-4 : A core can occur in a forked process with an Orchestration Agent
Links to More Info: BT1633073
Component: F5OS-C
Symptoms:
You may occasionally notice a core file from a forked process of the orchestration agent.
Conditions:
This can occur in the orchestration agent during normal operation.
Impact:
There’s a minimal impact. The core occurs rarely. It happens in a forked process during a read of the partition token. It doesn’t core the overall orchestration agent, only the forked process. There are no error logs. If the read fails, there will be a retry.
Workaround:
None
1629257-2 : Diag-agent service memory utilization increases because of heartbeat probe
Links to More Info: BT1629257
Component: F5OS-C
Symptoms:
Diag-agent service memory utilization rises if not controlled which can lead to OOM.
Conditions:
The diag-agent service generates heartbeat events, which sometimes create a deadlock in the service. Once the deadlock is hit, the memory queue of the diag-agent service keeps growing because of the heartbeat probes, and the service's memory utilization rises with it.
Impact:
Diag-agent service memory utilization rises if not controlled which can lead to OOM.
Workaround:
None
Fix:
The diag-agent service now handles event locking so that a deadlock does not occur.
1628557-3 : F5OS high memory usage when using snmp
Component: F5OS-C
Symptoms:
Excess memory usage by snmpd while running commands.
Conditions:
Excess memory usage by snmpd on the F5OS Chassis or Appliance system.
Impact:
Potential system crash with out of memory errors.
Workaround:
Excess memory used can be released by restarting the snmpd service.
The CLI commands to restart the service are given below.
In 1.8.0:
==========
F5OS-A:
appliance-1(config)# system diagnostics os-utils docker restart node platform service snmpd
appliance-1(config)# system diagnostics os-utils docker restart node platform service system_platform-stats-bridge
Releases earlier than 1.8.0
==========================
F5OS-A:
docker restart snmpd
docker restart system_platform-stats-bridge
F5OS-C (Controller):
docker restart snmpd
docker restart platform-stats-bridge-cc
F5OS-C (Partition):
docker restart partition<n>_snmpd
docker restart partition<n>_platform-stats-bridge
Fix:
Excess memory usage no longer occurs.
1627541-1 : System Controller unexpected failover in auto mode due to unhealthy SwitchD
Links to More Info: BT1627541
Component: F5OS-C
Symptoms:
An issue was identified where an unhealthy status reported by switchd was causing a system controller failover.
Conditions:
This issue occurs when switchd experiences a transient connection problem with ConfD and as a result reports it is unhealthy.
Impact:
The reporting of a transient ConfD connection problem as unhealthy triggers an unexpected system controller failover.
Workaround:
None.
Fix:
Switchd no longer reports an unhealthy condition because of a transient ConfD connection interruption, which removes this as a trigger of system controller failover.
1624853-3 : ETCD consumes a high amount of CPU time
Links to More Info: BT1624853
Component: F5OS-C
Symptoms:
ETCD may consume a significant amount of CPU time after a controller failover, or when tenants are being deployed or removed.
Conditions:
Conditions causing extended high CPU time are unknown at the moment.
Impact:
This may slow down other F5OS control plane processes while ETCD is consuming a high amount of CPU.
Workaround:
If the ETCD CPU usage is continually high, it is possible to restrict the CPUs that ETCD is allowed to run on.
This can be done from the system controller shell, and needs to be done on both system controllers. It must be redone after a system controller reboot or failover.
for x in $(pgrep 'etcd$'); do taskset -cp 4-7 $x; done
1624777-1 : Tenants will not deploy since Orchestration Agent process is continuously generating a core
Links to More Info: BT1624777
Component: F5OS-C
Symptoms:
When attempting to deploy a tenant an error occurs:
tenants tenant my-bigip-1 config type BIG-IP (fill out all prompts)
default-1(config-tenant-my-bigip-1)# commit
Aborted: application communication failure
Core files are found in the partition's /shared/core/container/ directory.
Conditions:
-- Creating a BIG-IP tenant
-- Orchestration agent is crashing
Impact:
Tenants cannot be deployed if Orchestration Agent is crashing. User will not be able to deploy a tenant successfully.
Workaround:
None
1624665-4 : ConfD state data shows key and certificate configured for secure (mTLS) even after deleting from config
Links to More Info: BT1624665
Component: F5OS-C
Symptoms:
ConfD operational state data shows key and certificate configured for mutual transport layer security (mTLS) even after deleting them from configuration.
Conditions:
The exporter is configured with mutual TLS, and then the key and certificate are deleted from the configuration. ConfD operational state data still displays the deleted key and certificate for the exporter.
Impact:
No functional impact.
Workaround:
Delete the exporter and reconfigure it.
Command to delete the exporter from ConfD CLI:
no system telemetry exporters exporter <exporter-name>
1624449-2 : SNMP polling of coreTotal5minAvg causing timeouts and genErrors
Links to More Info: BT1624449
Component: F5OS-C
Symptoms:
While running an snmpwalk that includes coreTotal5minAvg, you may get a timeout or a general error:
Timeout: No Response from 10.170.9.16
The general error occurs less frequently:
Error in packet
Reason: (genError) A general failure occured
Conditions:
-- snmpwalk a MIB that includes coreTotal5minAvg
-- The polling is done for CPUs that are not present
Impact:
Error in packet
Reason: (genError) A general failure occurred
Failed object: iso.3.6.1.4.1.12276.1.2.1.1.3.1.6.8.112.108.97.116.102.111.114.109.0
Workaround:
About two minutes after the system starts, platform-stats-bridge logs this message:
msg="DB ready check done" NAME="SnmpCpuStatsHandler".
After that log message, you will be able to check coreTotal5minAvg.
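The wait described above can be scripted so that SNMP polling of coreTotal5minAvg starts only after the readiness message has been logged. The following is a rough sketch, not an F5-provided tool. The function names are hypothetical, and the caller must supply the log file path, because this document does not state where platform-stats-bridge writes its log.

```shell
#!/bin/sh
# Sketch only: snmp_cpu_stats_ready and wait_for_snmp_cpu_stats are
# hypothetical names. The log file path is an argument, not a known location.

# Return 0 if the readiness message quoted in the workaround is in the log file.
snmp_cpu_stats_ready() {
    grep -q 'msg="DB ready check done" NAME="SnmpCpuStatsHandler"' "$1" 2>/dev/null
}

# Poll the log file every 5 seconds until the message appears
# or the timeout in seconds (default 180) is reached.
wait_for_snmp_cpu_stats() {
    log_file="$1"
    timeout="${2:-180}"
    waited=0
    while ! snmp_cpu_stats_ready "$log_file"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 0
}
```

A monitoring script could call `wait_for_snmp_cpu_stats <log-file>` and run its snmpwalk only when the function returns 0.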
Fix:
The code was modified so that snmpwalk is not executed for offline CPUs.
1624057-2 : BX110 Port Flapping or interface/connectivity issues
Links to More Info: BT1624057
Component: F5OS-C
Symptoms:
F5OS-C v1.8.0 has a fix for an issue "VELOS interfaces flapping if an interface is disabled"; however a corner case remains that could still cause port flapping or have ATSE register reads return 0xebade001 instead of the correct value.
Conditions:
VELOS system
Impact:
Interfaces are intermittently marked DOWN and then UP. Traffic is disrupted while the interface is marked DOWN.
There may be other intermittent issues with interfaces or general connectivity issues.
Workaround:
Upgrade to F5OS-C 1.8.0 EHF-1
1623761 : After cleaning up disk due to disk space full error, tcpdump program still detects the disk as full and aborts
Links to More Info: BT1623761
Component: F5OS-C
Symptoms:
The tcpdump program checks the disk and aborts if the disk does not have enough space. However, even after the disk is cleaned up, tcpdump does not recover from the abort state.
Conditions:
Fill up the disk space in /var/F5/partition/shared/. Then, run tcpdump from confd. An abort error will show up. After cleaning up the disk space, the system will still show abort errors when running tcpdump in confd.
Impact:
You cannot run tcpdump after the disk has been full at any point.
Workaround:
Restart the tcpdumpd_manager container on the controller that is running-active for the partition.
1623101-2 : External OTEL server receives log data for both the platform and event logs, even if only one of them has been configured
Links to More Info: BT1623101
Component: F5OS-C
Symptoms:
The configured OTEL exporter receives log data from both platform-log and event log, even when only one of them is configured.
Conditions:
This occurs when you configure one telemetry exporter with only one of the “platform-log” or “event-log” instruments and another telemetry exporter with the “all” or “logs” instruments, or with both “[platform-log event-log]” instruments.
Impact:
The telemetry exporter configured to receive only platform-log or event-log instrument data will receive data from both log instruments.
Workaround:
None
1622869-5 : Might see TPOB core after HA disassembly
Links to More Info: BT1622869
Component: F5OS-C
Symptoms:
TPOB container might crash after performing BIG-IP Next-HA disassembly operation.
Conditions:
-- BIG-IP Next in an HA pair
-- The HA pair is disassembled and factory reset
Impact:
No impact, as the container gets re-created
Workaround:
None
Fix:
No fix needed.
1620513-1 : CVE-2024-38477 httpd: NULL pointer dereference in mod_proxy
Links to More Info: K000140784, BT1620513
1620077-4 : FDB entry port motion not working if new interface is a trunk/LAG
Links to More Info: BT1620077
Component: F5OS-C
Symptoms:
Immediately after a fail-over of traffic from one trunk/LAG to another, outbound traffic from the appliance or chassis to certain addresses may be interrupted for up to five minutes before recovering.
Conditions:
Switching traffic from one LAG to another on an appliance or chassis.
Impact:
Temporary disruption of tenant’s outbound traffic on an appliance or chassis system.
Workaround:
None
Fix:
Updated handling of FDB entry port motion to include cases with a trunk/LAG as the new interface.
1615969-4 : Tenant operational data is not getting updated properly after upgrade
Links to More Info: BT1615969
Component: F5OS-C
Symptoms:
Tenant pods are up and running but not all details are updated.
Intermittently, after an upgrade to F5OS-A version 1.8.0, tenant operational data in ConfD is not updated.
Conditions:
Occasionally, the tenant's operational data is not completely updated.
Impact:
Intermittently, operational data for a tenant is not updated properly after the system upgrades to F5OS-A 1.8.0.
Workaround:
Toggle tenant running-state to configured and deployed, then verify the tenant details again.
Fix:
Tenant operational map data updates are now handled properly.
1615917-1 : L2_agent crashed due to SNMP★
Links to More Info: BT1615917
Component: F5OS-C
Symptoms:
After upgrading system to 1.8.0, L2-agent crashes.
Conditions:
1. Create a system with an older version (earlier than 1.8.0)
2. Configure SNMP
3. Upgrade system to 1.8.0 version
4. L2-agent will start crashing.
Impact:
L2-agent crashes and you are unable to do get/set operations for interfaces using ConfD interfaces.
Workaround:
None
Fix:
Fixed an issue causing l2-agent to crash after upgrade.
1614821-3 : CVE-2024-3596 - Blast-RADIUS
Links to More Info: K000141008, BT1614821
1614429-1 : iHealth upload is failing with error "certificate signed by unknown authority"
Links to More Info: K000140362, BT1614429
Component: F5OS-C
Symptoms:
When attempting to use the QKView upload feature, the upload may fail with the message "certificate signed by unknown authority". This is due to a recent change in certificate authority that is inconsistent between F5OS and iHealth.
Conditions:
Always, after mid-July 2024.
Impact:
Unable to upload QKView files to iHealth with a single click.
Workaround:
You can use the File Export feature to download QKView files, and then upload these files to iHealth.
You can find the QKView files in the GUI at System Settings > File Utilities, then choose "diags/shared" as the base directory, then select "qkview".
Fix:
Certificate authorities used by the iHealth upload feature in F5OS will be updated.
1612557-1 : Dma-agent service health warnings appears in show system summary
Component: F5OS-C
Symptoms:
Dma-agent service health warnings shown in show system health summary even when dma-agent service is reporting healthy.
Conditions:
When the health file is not deleted by any means and created again making it untracked.
Impact:
When the dma-agent service health file reports dma-agent to be healthy, stale data (including warnings) might be seen in show system health summary.
Workaround:
SSH to the impacted blade and restart the platform-monitor service. E.g.
ssh blade-1
docker restart platform-monitor
Fix:
Show system health won't show stale data (warnings) when dma-agent service health file reports dma-agent to be healthy.
1612405-5 : LACP status shows UP in BIG-IP tenant even if its down on F5OS.
Links to More Info: BT1612405
Component: F5OS-C
Symptoms:
LACP Trunk is UP in BIG-IP tenant even when it’s DOWN on F5OS.
Conditions:
Condition 1:
1. Set up an rSeries or VELOS system.
2. Configure LACP LAG with interfaces operationally down.
3. Make sure LACP Trunk is DOWN on F5OS.
4. Upgrade the software.
5. Launch a BIG-IP tenant.
6. Check LACP trunk status inside tenant.
Condition 2:
1. Set up an rSeries or VELOS system.
2. Configure STATIC LAG with interfaces operationally down.
3. Ensure STATIC Trunk is DOWN on F5OS.
4. Launch a BIG-IP tenant.
5. Check the Trunk status inside the tenant. It will be DOWN.
6. Convert LAG type to LACP
7. Check the Trunk status inside the tenant. It will be UP even though it is down on F5OS.
Impact:
LACP Trunk members are shown as working members even though they are DOWN.
Workaround:
Check the interface configuration. If the interface is administratively disabled, enable it.
Fix:
The status of LACP members is read whenever an LACP member is added as an operational member.
1612217-1 : A large amount of SPVA DoS allow list entries can overload DMA-Agent causing a tenant to fail to pass traffic
Links to More Info: BT1612217
Component: F5OS-C
Symptoms:
If the DMA-Agent receives a high volume of SPVA allow list entries at once, it may become overwhelmed and stop working. As a result, no traffic will be able to exit the tenant. This can be identified by observing the DMA-Agent using 100% of the CPU.
Conditions:
This is usually seen in configurations where there are many virtual servers configured with a DoS profile that contains an IP-based allow list.
The problem does not arise when VIPs are added individually, but it often happens after TMM is restarted following a tenant reboot.
Impact:
Tenant will fail to pass any traffic on the data-plane.
The TMSTAT sep_stats.tx_send_drops3 will be incremented.
Workaround:
Perform the following on the tenant:
tmsh modify sys db dos.forceswdos value true
tmsh save sys conf
To recover the DMA-Agent in F5OS, set the tenant state to “configured” and then set it back to “deployed”.
Fix:
The DMA-Agent now handles a high volume of SPVA allow list entries.
1612101-2 : When vCPU cores configuration changed for BIG-IP Next tenant, RRD stats shows both the old and new CPU data stats
Links to More Info: BT1612101
Component: F5OS-C
Symptoms:
The RRD stats display the data for old and new CPU cores. You can match the new CPU cores and validate the data. The old CPU cores data is invalid and should not be displayed.
Conditions:
When user configures BIG-IP Next tenant and changes the vCPU cores.
Impact:
No Functional Impact. Both old and new data stats appear for cpu-stats in RRD. However, data streaming works as expected.
Workaround:
None
Fix:
None
1607745-3 : Apache HTTPD vulnerabilities CVE-2024-38476, CVE-2024-38474 and CVE-2024-38475
Links to More Info: K000140618
1603509 : No alarm sent when front panel management link is down
Links to More Info: BT1603509
Component: F5OS-C
Symptoms:
When the front panel management port is down, no alarm is sent
Conditions:
Happens only when chassis is power cycled or blades are inserted/removed in slot 0 and 1.
Impact:
No alarm sent when front panel management link is down and switch stats displayed will not have accurate entries in "show system health".
Workaround:
None
Fix:
Diag-agent will not remove switch port entries when it receives module present events for slot 0 and 1.
1600693-1 : F5OS - BIG-IP Tenant does not display VELOS Chassis slot serial number
Links to More Info: BT1600693
Component: F5OS-C
Symptoms:
F5OS BIG-IP Tenant does not display the serial number for the slot ("Host Board Serial") under "System Information"
Conditions:
BIG-IP tenant is running on a chassis, and command "tmsh show sys hardware" is run from the tenant
Impact:
The slot serial number is not immediately visible to the user
Workaround:
For the CLI, log in to the partition and run the command "show components component state serial-no". For the GUI, log in to the active controller, then go to System Settings -> System Inventory. The blade serial number will be shown.
Fix:
F5OS was updated to provide the blade serial number to the tenant for display. The tenant was updated to populate the blade serial number into "show sys hardware" command output, so it is now visible to the user. This fix requires a version 17.5 tenant.
1598937 : SNMP traps are not always sent★
Links to More Info: BT1598937
Component: F5OS-C
Symptoms:
After upgrading to version 1.8.0, SNMP traps may stop working.
Conditions:
Upgrade system to 1.8.0 from previous version.
Impact:
SNMP trap functionality does not work
Workaround:
Reconfigure the SNMP configuration.
Fix:
The SNMP configuration is now corrected during the upgrade, which resolves the issue.
1598509-2 : iHealth client can occasionally throw a core file
Links to More Info: BT1598509
Component: F5OS-C
Symptoms:
The iHealth client, accessible from the command line with "system diagnostics ihealth", can be used for uploading QKView files to the iHealth service. If this client loses connection to the system database for any reason, it may throw a core file in the host system's /var/shared/core directory.
Conditions:
System has been up for a long time, and there is a problem with the ConfD database causing the iHealth client to disconnect.
Impact:
A core file may be thrown. The iHealth client will restart if this happens, so functionality is not affected.
Workaround:
Retry the iHealth client operation.
Fix:
The iHealth client will only access the ConfD database when it needs to query information, and not maintain an open connection.
1596149-1 : Monitor rSeries ATSE to BE2 links and Raise Alarms in the Event of Failures
Links to More Info: BT1596149
Component: F5OS-C
Symptoms:
Monitor rSeries ATSE to BE2 links and Raise Alarms in the Event of Failures
Conditions:
F5 rSeries r5000, r10000, or r12000-series appliance.
This update is not applicable to r2000 or r4000-series appliances.
Impact:
In cases where errors are detected between the ATSE and BE2 links, alarms and events will be reported.
Workaround:
None
Fix:
Monitor ATSE to BE2 links and raise alarms and report events when errors are detected.
1595113-4 : Interface state enabled value stale due to timeout to reach confd
Component: F5OS-C
Symptoms:
When trying to modify the interface admin status to disabled across five different interfaces on five blades in a VELOS partition in a single commit message, the CLI operation to update the state interface enabled field fails with an error "system call failed". "Failed to write 68 bytes to ConfD: Connection timed out".
Conditions:
This can occur when a failover of chassis-controller and partition occurs, right before the interface enabled field changes.
Impact:
Stale value for interface/state/enabled field.
Workaround:
Enable and re-disable the interfaces.
Fix:
With the fix, the interface/state/enabled field will reflect accurately the configuration admin status of the interface.
1594125 : GUI fails to modify interfaces on F5OS-C
Links to More Info: BT1594125
Component: F5OS-C
Symptoms:
Interface-related operations from the GUI fail.
Conditions:
-- Interface-related operations like LAG creation or deletion.
-- F5OS build prior to 1.8.0-15246
Impact:
You are unable to perform interface operations from the GUI
Workaround:
None
Fix:
The GUI can now modify the interfaces on F5OS-C.
1593385 : F5OS Tenant Throughput (bits/packets) and TMM CPU usage higher than expected until VLAN is added or removed
Links to More Info: BT1593385
Component: F5OS-C
Symptoms:
Higher CPU usage and throughput from the tenant than expected. Traffic being directed to a single blade in a multi-blade system.
Conditions:
Repeated deletes/adds of a VLAN from/to a tenant. After approximately 130 deletes, the issue occurs.
Impact:
Traffic imbalance, higher than normal CPU usage.
Workaround:
Re-add the recently deleted VLAN to the tenant.
Fix:
Properly clean up internal storage when a VLAN is deleted from a tenant.
1592221 : A partition's internal bridge IP address is not detected correctly if there is a missing partition ID in the list of partitions.
Links to More Info: BT1592221
Component: F5OS-C
Symptoms:
The system controller logs will include the msg "Floating IP is not present for enabled partition; do not change controller state" on system controller failover.
Conditions:
When the list of partition IDs includes "holes" in the list. For example, there are partition IDs 1 and 3 (but no 2) on the chassis. This can happen if a partition is deleted.
Impact:
System controller failover is impacted.
Workaround:
Recreate a partition (no need to enable it). It will use the missing ID in the list.
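Before applying this workaround, it can help to confirm which partition IDs are missing. The following is a minimal sketch that prints the gaps in a list of IDs. The function name `missing_partition_ids` is hypothetical, and the sketch assumes IDs are positive integers counted from 1, as in the example above (IDs 1 and 3 present, 2 missing). How the ID list is obtained from the system is left to the operator.

```shell
#!/bin/sh
# Sketch only: missing_partition_ids is a hypothetical helper.
# Assumes partition IDs are positive integers starting at 1.

# Print every ID between 1 and the highest ID given that is not in the arguments.
missing_partition_ids() {
    max=0
    for id in "$@"; do
        if [ "$id" -gt "$max" ]; then
            max="$id"
        fi
    done
    i=1
    while [ "$i" -le "$max" ]; do
        found=0
        for id in "$@"; do
            if [ "$id" -eq "$i" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "$i"
        fi
        i=$((i + 1))
    done
    return 0
}
```

Calling `missing_partition_ids 1 3` prints `2`, which is the ID a newly created partition would be expected to fill according to the workaround.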
Fix:
The code has been fixed to correctly check the partition ID when detecting presence of the partition bridge IP address.
1591645-3 : EPVA related dma-agent crash
Links to More Info: BT1591645
Component: F5OS-C
Symptoms:
A dma-agent seg_fault occurs when there is a conflict between special EPVA allow-list entries.
Conditions:
A conflict between two entries on the allow-list triggers a code path in the dma-agent that results in a seg_fault.
Impact:
Traffic loss as the dma-agent needs to be restarted by its watchdog/start up script. Tenants need to re-register with the datapath.
Workaround:
None
Fix:
This issue has been fixed by setting a THREAD local variable in the epva_tbl_mgmt thread, preventing a seg_fault when the edge case method is triggered.
1591585 : Sshd, httpd, rsync crashes with bunch of whitespaces in /etc/hosts file
Links to More Info: BT1591585
Component: F5OS-C
Symptoms:
When VELOS system controllers fail over, OMD rewrites /etc/hosts on each controller to move around where the 'etcd3.chassis.local' name is assigned.
When this occurs an extra space character is added to the controller-1.chassis.local and controller-2.chassis.local lines. If you add enough whitespace to /etc/hosts (uncertain how much, but megabytes will do it), it starts causing daemons to crash in getnameinfo() calls as they try to resolve the local system IP to a hostname.
Conditions:
VELOS system controllers fail over. Extra space characters are added after controller-X.chassis.local.
Impact:
Sshd, httpd, and rsync crash when the whitespace in /etc/hosts becomes excessive.
Workaround:
Run the following command in bash to remove the extra spaces in the /etc/hosts file.
sed -i 's/[[:space:]]\+$//' /etc/hosts
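Because the command above edits /etc/hosts in place, it may be worth previewing the effect first. The following sketch counts the lines that end in whitespace and writes a cleaned copy to standard output without touching the original file. The function names are hypothetical, and the pattern `[[:space:]]*$` is a POSIX-portable variant of the `\+` form used above that gives the same result on lines with trailing whitespace.

```shell
#!/bin/sh
# Sketch only: count_trailing_ws and strip_trailing_ws are hypothetical helpers.

# Print the number of lines in the file that end in whitespace.
count_trailing_ws() {
    grep -c '[[:space:]]$' "$1"
}

# Write the file to stdout with trailing whitespace removed; the file is not modified.
strip_trailing_ws() {
    sed 's/[[:space:]]*$//' "$1"
}
```

For example, `count_trailing_ws /etc/hosts` reports how many lines are affected, and `strip_trailing_ws /etc/hosts | diff /etc/hosts -` shows what the in-place edit would change.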
Fix:
Fixed in F5OS-C 1.8.0.
1591553 : Including /etc/resolv.conf and /etc/hosts files in QKView capture
Links to More Info: BT1591553
Component: F5OS-C
Symptoms:
The /etc/resolv.conf and /etc/hosts files are included to check the configured parameters in host QKView from the affected device.
Conditions:
F5OS-A 1.7.0 and lower versions QKView capture does not include the /etc/resolv.conf and /etc/hosts files.
Impact:
The /etc/resolv.conf and /etc/hosts files are not captured in F5OS-A 1.7.0 and lower versions.
Workaround:
None
Fix:
The /etc/resolv.conf and /etc/hosts files are included in QKView capture as part of F5OS-A 1.8.0 release.
1591549-1 : Support for case-insensitive LDAP username lookup
Links to More Info: BT1591549
Component: F5OS-C
Symptoms:
Previously, username lookup for LDAP-authenticated users was always case-sensitive.
Conditions:
Third-party authentication is configured with LDAP or Active Directory; user(s) in question reside in LDAP directory.
Impact:
Username lookups for authentication/authorization against LDAP directory were always conducted in a case-sensitive fashion, even for directories where case-insensitive was the default for the organization (e.g. Windows AD).
Case-insensitive default is considered a safer security posture. It prevents username masking and cache injection when multiple users that only differ by case, with differing authorization privileges, exist in the same directory.
Workaround:
Always use correct case for case-sensitive searches.
Fix:
A new option was added which allows the admin to enable case-insensitive searches for LDAP username lookups. Note that case-sensitive remains the default for security reasons.
1591069 : Blades may fail to get marked as InCluster in "show cluster" output after rolling upgrade
Links to More Info: BT1591069
Component: F5OS-C
Symptoms:
After a rolling upgrade, one or more blades may be marked as "Not In Cluster" in the "show cluster" output.
Conditions:
Perform a rolling upgrade from a manufacturing-installed F5OS-C v1.7.0 to F5OS-C v1.7.1.
Impact:
System will function correctly, but "show cluster" output will show the blade is not being marked "In Cluster".
Workaround:
To work around the issue, restart the orchestration-manager daemon, which will result in the "In Cluster" status being updated. This action needs to be performed from the shell on both controllers.
systemctl restart orchestration_manager_container.service
Fix:
Fixed issue in the orchestration-manager daemon causing the "In Cluster" status not being updated.
1590617-1 : Partition Network Manager is crashing when turning up.
Links to More Info: BT1590617
Component: F5OS-C
Symptoms:
Upon Partition turn up, the Network Manager component crashes.
Conditions:
The Partition is turning up. This can happen due to partition creation, partition enable, or controller reboot.
Impact:
No impact. The Network Manager will successfully start after a retry.
Workaround:
None
Fix:
None
1590425 : Adding blade to openshift cluster can fail with ansible error
Links to More Info: BT1590425
Component: F5OS-C
Symptoms:
Adding or re-adding a blade to the OpenShift cluster can fail with the following ansible error:
fatal: [blade-2.chassis.local -> controller-1.chassis.local]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: {{ hostvars[groups.oo_first_master.0].openshift.master.api_url }}: 'dict object' has no attribute 'master'\n\nThe error appears to be in '/usr/share/ansible/openshift-ansible/roles/openshift_manage_node/tasks/main.yml': line 5, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# systemd to start the master again\n- name: Wait for master API to become available before proceeding\n ^ here\n"}
Conditions:
Adding or re-adding a blade to the OpenShift cluster after the etcd instance has been rebuilt.
Impact:
New blade will not join the cluster correctly.
Workaround:
The workaround is to rebuild the OpenShift cluster which will regenerate the openshift.fact file that had been corrupted.
Fix:
The fix checks the openshift.fact file before running any ansible playbooks to make sure it is correct.
1588093-1 : Forwarding host log files to remote targets
Links to More Info: BT1588093
Component: F5OS-C
Symptoms:
/var/log/messages growing quickly, consuming the disk space, making the box unusable.
Conditions:
Having /var/log/messages as a host-logs files entry to forward the file lines to a remote destination.
Impact:
When syslog generated files are configured to be forwarded as files, forwarding efficiency can be affected compared to utilizing selectors.
The /var/log/messages being in this list can lead to a cyclical logging issue, where the disk space is consumed faster than the logs can be rotated out, potentially resulting in a full disk.
Workaround:
Use selectors instead for any file that is syslog generated.
The host-logs files configuration is meant for text files that cannot be forwarded through selectors configuration.
Fix:
To prevent the disk from filling, files that are forwarded line by line are no longer processed locally. This prevents the forwarded lines from producing entries in /var/log/messages.
1587925-1 : Modifying a RADIUS server from the web UI requires the Secret to be configured or re-entered
Links to More Info: BT1587925
Component: F5OS-C
Symptoms:
Modifying a RADIUS server from the webUI always requires the Secret to be configured or re-entered.
Conditions:
Modifying a RADIUS server from the webUI.
Impact:
It requires the Secret to be entered, even if it is already configured.
Workaround:
If secret configuration is not required, edit the RADIUS server from the CLI.
Fix:
When editing an existing RADIUS server, changing the port or timeout fields no longer requires the Secret to be re-entered before saving.
1587837 : Memory leak in multiple components
Links to More Info: BT1587837
Component: F5OS-C
Symptoms:
A mishandling of memory allocation in the data provider callback library can cause memory allocation to grow over time. This memory usage growth can cause poor performance and the Out Of Memory (OOM) killer may kill components, causing outages.
Conditions:
If a data provider processes overlapping requests, it can leak memory. The components most affected by this are platform-stats, snmp-service, and the L2 agent.
Impact:
Components may crash or get killed.
Workaround:
Monitor memory usage and periodically restart daemons that experience excessive memory growth. On a chassis system, a manual failover followed by a reboot of the standby controller will restart all daemons.
To minimize the occurrence of this leak, do not constantly poll for statistics, especially from multiple monitoring stations.
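The monitoring part of this workaround could be scripted along the following lines. This is a minimal sketch in Python, not an F5-provided tool: it assumes the standard Linux /proc/<pid>/status text as input, and the function names and the 1.5x growth threshold are illustrative assumptions rather than recommended values.

```python
import re

def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def grew_excessively(samples_kb, max_growth_ratio=1.5):
    """True if the newest sample exceeds the oldest by more than the ratio.

    The ratio is a hypothetical threshold; tune it to the daemon in question.
    """
    if len(samples_kb) < 2 or samples_kb[0] <= 0:
        return False
    return samples_kb[-1] / samples_kb[0] > max_growth_ratio
```

Samples would be collected at a low frequency, since the note above advises against constant polling.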
Fix:
The library has been fixed to no longer leak session data.
1586965-1 : No active instance of ConfD after failover
Links to More Info: BT1586965
Component: F5OS-C
Symptoms:
Unable to configure VELOS system, ConfD CLI commands fail.
Conditions:
Rarely, after failover, the newly active system controller silently transitions to none.
Impact:
Unable to configure VELOS system, ConfD CLI commands fail.
Workaround:
Reboot chassis.
Fix:
In releases with this fix in place, there will always be an active instance of ConfD after failover.
1586893 : Metrics server pod on system controller can exit and not be restarted
Links to More Info: BT1586893
Component: F5OS-C
Symptoms:
The OpenShift cluster does not operate properly and the controller-manager pods are in a crash loop.
Conditions:
When the metrics server pod exits it causes the controller-manager pods to go into a crash loop.
Impact:
The OpenShift cluster will not operate correctly because the controller-manager pods are in a crash loop.
Workaround:
If the metrics-server pod is in the exited state, run the following commands at the root shell prompt:
1. Run oc get pods -n kube-system |grep -i metrics and get the pod name.
2. Run oc delete -n kube-system pod/<pod-name>
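The two steps above can be sketched as a small helper. This is an illustrative Python sketch under stated assumptions: it expects the usual NAME, READY, STATUS column order in the `oc get pods` output, treats any status other than Running as "exited", and the pod names in any sample input are hypothetical. It only builds the delete commands; it does not run them.

```python
def exited_metrics_pods(oc_output):
    """Return names of metrics pods whose STATUS column is not 'Running'."""
    names = []
    for line in oc_output.splitlines():
        fields = line.split()
        # Expect at least NAME, READY, STATUS; skip the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, status = fields[0], fields[2]
        if "metrics" in name.lower() and status != "Running":
            names.append(name)
    return names

def delete_commands(pod_names, namespace="kube-system"):
    """Build the 'oc delete' command from step 2 for each pod name."""
    return [f"oc delete -n {namespace} pod/{name}" for name in pod_names]
```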
1586773 : BX520 Internal FPGA links can fail to come UP during initialization
Links to More Info: BT1586773
Component: F5OS-C
Symptoms:
On the BX520, internal FPGA links fail to initialize with the following error:
fpgamgr[9]: nodename=blade-11(p1) priority="Err" version=1.0 msgid=0x301000000000006 msg="SDK error during programming." API="f5sw_xilinx_cmac_datapath_reset" port=18 error="Waiting for the Xilinx MAC RX_ALIGN to be acheived has failed. Number of retries have exceeded".
Conditions:
Reboot of a BX520 blade.
Impact:
Traffic outage between FPGAs.
Workaround:
Reboot the affected blade.
Fix:
Implement updated initialization procedure for internal FPGA links.
1586661-2 : First login for a remote user fails
Links to More Info: BT1586661
Component: F5OS-C
Symptoms:
The first time a remote user attempts to log in to a system, access is denied despite providing the correct credentials. This is true for both TACACS and RADIUS remote users.
Conditions:
This always happens. One way to simulate the first login is to delete the file /etc/libnss-udr/passwd.
Impact:
The first login fails. Subsequent remote login attempts succeed with proper credentials.
Workaround:
Attempt the remote login again.
Fix:
The user can now log in with proper credentials on the first attempt. Note that the fix requires the following version of openssh (or newer):
# rpm -q openssh
openssh-7.4p1-21.F5.6.2.7.el7.x86_64
1586641 : OPT-0063 400G-FR4 periodically has non-zero RMON_RX_BAD_FCS
Component: F5OS-C
Symptoms:
Small number of FCS errors, up to approximately 100 per second, may be seen in FPGA 400G MAC RMON stats.
Conditions:
No special conditions known.
Impact:
Small number of packets will be dropped for FCS failure.
Workaround:
Disable and re-enable the 400G link.
1586089-2 : Resource-admin is unable to perform SCP.
Component: F5OS-C
Symptoms:
Resource-admin is unable to perform SCP.
Conditions:
When trying to use SCP with resource-admin for the available virtual paths.
Impact:
Resource-admin cannot perform SCP file transfers.
Workaround:
Although SCP fails, the file upload/download API can still be used for file uploads and downloads.
Fix:
Permissions for resource-admin to perform the SCP file transfer were added.
1586057-1 : F5OS displays an incorrect error if the admin tries to set a password before committing a new user
Component: F5OS-C
Symptoms:
F5OS reports that a password was rejected and displays the configured password policy if the admin tries to set a new user’s password before the new user has been added to the system.
Conditions:
The admin tries to set a password for a user that has just been configured but not yet committed.
Impact:
The administrator could mistakenly think that the selected password is inadequate. But the actual problem is that the user has not been committed to the system yet.
Workaround:
When creating a new user, admins must commit the new user before setting a user’s password.
Fix:
None
1585853 : Telemetry streaming pauses if mgmt-ip gets updated
Links to More Info: BT1585853
Component: F5OS-C
Symptoms:
Telemetry streaming to an external OTEL server is paused for some time if mgmt-ip of the F5OS device is updated.
Conditions:
A telemetry exporter is configured to receive data, and the mgmt-ip of the F5OS device is updated afterward.
Impact:
The external server won’t receive the telemetry data for some time after updating mgmt-ip.
Workaround:
After updating the mgmt-ip, disable and re-enable the exporters from ConfD using the commands below to re-establish the connection.
system telemetry exporters exporter <exporter-name> config disabled
system telemetry exporters exporter <exporter-name> config enabled
Fix:
Updated the otel-collector service in F5OS to re-establish the connection with the external server in the event of a lost connection caused by mgmt-ip updates.
1585749-1 : Including lspci commands in QKView capture
Links to More Info: BT1585749
Component: F5OS-C
Symptoms:
The lspci command helps in analyzing the system's faults by evaluating PCI busses. This command is not captured in the QKView file.
Conditions:
Running QKView.
Impact:
The lspci command output is not included in the QKView.
Workaround:
None
Fix:
The lspci command is added in QKView capture.
1585237-2 : When telemetry exporter is not reachable, logs to enable send_queue or retry will be printed in platform.log
Links to More Info: BT1585237
Component: F5OS-C
Symptoms:
When telemetry exporter is not reachable, logs to enable send_queue or retry will be printed in platform.log.
Conditions:
Logs will be printed only when configured telemetry exporter is not reachable.
Impact:
No functional impact.
Workaround:
Ensure the exporter is reachable.
Fix:
The OTEL service no longer logs retry and send_queue messages when the exporter is not reachable.
1585001 : Radius authentication does not work when the shared secret key in the radius configuration is more than or equal to 32 characters
Links to More Info: BT1585001
Component: F5OS-C
Symptoms:
Authentication of remote RADIUS users fails when the RADIUS shared secret has more than 31 characters.
Conditions:
The RADIUS shared secret has more than 31 characters.
Impact:
Remote RADIUS users cannot access the system.
Workaround:
Log in to the system as an admin and change the RADIUS 'secret' field to a value of 31 characters or fewer.
system aaa server-groups server-group <server-group-name> servers server <server-address> radius config secret-key <number-of-characters-should-be<=31>
Then commit the changes.
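A pre-check for the workaround above might look like the following. This is a hedged Python sketch: the 31-character limit comes from this release note, while the function name and the rejection of empty secrets are assumptions made for illustration.

```python
MAX_SECRET_LEN = 31  # limit stated in this release note

def secret_is_usable(secret):
    """True if the shared secret is non-empty and at most 31 characters long."""
    return 0 < len(secret) <= MAX_SECRET_LEN
```

Running such a check before committing the configuration avoids locking remote RADIUS users out on affected versions.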
Fix:
When the radius secret key is longer than 31, the radius users will not have access to the system.
1584469-1 : BX520 TCPDUMP throughput improvement
Component: F5OS-C
Symptoms:
The BX520 blades have more throughput than the BX110, but on the BX520 the TCPDUMP utility could not keep up with even the amount of TCPDUMP traffic that the smaller blade could handle.
Conditions:
BX520 TCPDUMP throughput was quite low compared to BX110 blades: about half that of the BX110, when it should be two to three times that of the BX110, since the BX520 has four times the throughput of the BX110.
Impact:
TCPDUMP is slower because packets are dropped when the customer uses the system diagnostics TCPDUMP in the ConfD CLI.
Workaround:
None
Fix:
Now the line-dma-agent is servicing the DMs on NSO/TAM fast enough for the TCPDUMP higher-throughput traffic on BX520.
1583233-1 : The 'show portgroups' command may not display DDM statistics, or may display stale/out-of-date DDM statistics
Links to More Info: BT1583233
Component: F5OS-C
Symptoms:
An F5OS system (rSeries appliance or VELOS partition) may display stale/out-of-date DDM statistics or no DDM statistics if there are interfaces in the system that do not have SFP modules inserted.
Conditions:
- r5000, r10000, or r12000-series appliance
- VELOS partition
- Previous interfaces in the system that do not have an SFP module inserted.
Impact:
System does not report correct DDM statistics in 'show portgroups' command output.
Workaround:
Run the ‘show portgroups’ command for each interface that has an SFP module inserted, for example, ‘show portgroups portgroup 5’.
Fix:
Fixed the display issue in ‘show portgroups portgroup state ddm data’.
1582553-1 : The 'components component state' data is not displayed in ConfD.
Links to More Info: BT1582553
Component: F5OS-C
Symptoms:
- No data will be displayed as part of “show components component” in ConfD.
- In the absence of component platform information, GUI features default to r5xxx platform, leading to some functional issues for other platforms.
Conditions:
Intermittently occurs when initializing the state data.
Impact:
You cannot view the hardware information, which is updated under “show components component”.
GUI functional issues for other platforms:
For r10xxx - Raid Configuration will not be visible.
For r4xxx/r2xxx - Port Groups may not function as expected. STP screens and Port Mappings will show up, which are not applicable to the platform and will be non-functional.
Workaround:
Log into the appliance as root and restart the platform-mgr docker container:
docker restart platform-mgr
Fix:
The functionalities disrupted on the GUI can be accessed via the CLI.
1582105-1 : Partition RESTCONF may return an incomplete response for f5-cluster:cluster/nodes/node
Component: F5OS-C
Symptoms:
When querying f5-cluster:cluster/nodes/node in a partition, it succeeds for 1000 calls, but then starts returning an incomplete response.
Conditions:
This only happens on chassis with at least one empty slot. Each time that cluster/nodes/node/<blade>/state/tenant-memory is requested on an empty slot, an internal queue will hold on to that request. When the queue is full, requests will stop working.
Impact:
After the symptom starts, cluster/nodes/node cannot be queried successfully until partition services are restarted.
Workaround:
Modify queries to avoid requesting tenant-memory on empty slots. For example, do not use the top-level cluster/nodes/node, but instead use cluster/nodes/node/blade-1.
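One way to apply this workaround in a polling script is to generate one query path per populated slot instead of using the top-level path. The Python sketch below is illustrative only: the function name is hypothetical, the caller is assumed to know which slots are populated, and the path form simply follows the example given in this workaround.

```python
def per_blade_paths(populated_slots, base="cluster/nodes/node"):
    """Return one query path per populated slot, skipping empty slots entirely."""
    return [f"{base}/blade-{slot}" for slot in sorted(populated_slots)]
```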
Fix:
Fixed platform-stats-bridge to no longer query blades that are not present or ready.
1581589 : Lack of IPv4 management address causes OpenShift Ansible playbooks to fail
Links to More Info: BT1581589
Component: F5OS-C
Symptoms:
If there are no IPv4 addresses defined, ansible playbook executions will fail to look up a default route, causing the playbook to fail.
Conditions:
VELOS chassis with no IPv4 management addresses configured.
Impact:
Adding new blades to the cluster fails, as does return merchandise authorization (RMA) replacement of both blades and controllers.
Workaround:
The workaround is to add an IPv4 default route to both controllers from the bash shell.
nmcli conn modify team0 ipv4.gateway 192.6.3.254 ipv4.route-metric 32768
nmcli conn up team0
Fix:
Added a default route so that the Ansible playbooks can look up the route and interface they require.
1580489-1 : BE2 GCI interface training issue results in failure to process networking traffic
Links to More Info: BT1580489
Component: F5OS-C
Symptoms:
Some particular rSeries systems fail to process networking traffic due to the BE2 GCI interfaces not training properly, resulting in an FPGA datapath lockup.
One potential indication of this is the DMA agent detecting a DM Tx Action ring hang, which can be observed in velos.log / platform.log:
dma-agent[13]: priority="Alert" version=1.0 msgid=0x4201000000000130 msg="Health monitor detected DM Tx Action ring hung." ATSE=0 DM=0 OQS=3
Conditions:
rSeries r5000, r10000, or r12000-series appliance.
This issue does not affect r2000 or r4000 series appliances.
Impact:
The system stops delivering traffic from front-panel ports to the host, although egress traffic may continue to work. If an LACP LAG is configured, ports will be unable to join the LAG.
Workaround:
None, and F5 continues tracking the BE2 issue via ID1596625.
Fix:
During system startup, FPGA manager now ensures that the BE2 GCI interfaces are brought up and trained properly.
1580349-1 : Loading backup file with partition ID 1 that is not named "default", throws an error★
Links to More Info: BT1580349
Component: F5OS-C
Symptoms:
Loading an F5OS-C system controller backup file with partition ID 1 that is not named "default" throws an error following a reset-to-default.
Conditions:
If the chassis admin deletes the partition named "default", and then creates a new partition, it will be assigned partition ID 1.
The reset-to-default operation re-creates the default partition with an ID of 1.
Impact:
Saved configuration cannot be restored after reset-to-default.
Workaround:
None
Fix:
Config-restore has been changed to allow restoring a saved configuration that contains a partition with ID 1 that is not named "default".
1580165-1 : Removing a failed patch ISO can remove base services imported from a different ISO★
Links to More Info: BT1580165
Component: F5OS-C
Symptoms:
Removing a failed patch ISO also removes the base services ISO imported by another ISO. Further upgrades will fail even though importing the patch version is successful. You may observe the log below.
appliance-1(config)# system image check-version iso-version 1.5.2-21056
response Compatibility verification succeeded.
Conditions:
-- Base services are already imported by another ISO.
-- Same version patch ISO import failed.
-- Delete the failed patch ISO.
Impact:
Upgrading to a newly, successfully imported patch ISO of the same version will fail.
Workaround:
Rebooting the device will resolve the issue.
Fix:
When a failed patch ISO is removed, the system now checks whether the base services were imported by another ISO and, if so, does not delete the base services ISO.
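The kind of reference check described in this fix can be illustrated as follows. This Python sketch is not the actual F5OS implementation: the mapping of ISO names to base services versions is a hypothetical data structure chosen to show the idea of deleting shared content only when nothing else references it.

```python
def base_services_to_delete(iso_to_base, failed_iso):
    """Return the base services to delete along with failed_iso, or None.

    iso_to_base maps each imported ISO name to the base services it uses
    (hypothetical representation). Base services still referenced by another
    ISO are kept.
    """
    base = iso_to_base.get(failed_iso)
    if base is None:
        return None
    still_used = any(
        other_base == base
        for iso, other_base in iso_to_base.items()
        if iso != failed_iso
    )
    return None if still_used else base
```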
1579453-1 : SAN Validation Mismatch: Key/Cert virtual server No Key Configured
Links to More Info: BT1579453
Component: F5OS-C
Symptoms:
When a TLS key/cert is set in ConfD, create-csr accepts invalid SAN values without generating a CSR or reporting an error. Without a key/cert, the ConfD CLI correctly validates the CSR.
For example, run create-csr with various SAN values:
appliance-1(config)# system aaa tls create-csr name namesan san ""
----------------------------------------------------------------^
syntax error: "" has a bad length/size. <======== EXPECTED
appliance-1(config)# system aaa tls create-csr name namesan san ''
appliance-1(config)# <===== should give error
appliance-1(config)# system aaa tls create-csr name namesan san "IP"
appliance-1(config)# <======= should give error
appliance-1(config)# system aaa tls create-csr name namesan san "DNS"
appliance-1(config)# <==== should give error
appliance-1(config)# system aaa tls create-csr name namesan san "f5best"
appliance-1(config)# <==== should give error
appliance-1(config)# system aaa tls create-csr name namesan san IP:1.1.1.1
response <====== EXPECTED
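The expected behavior in the transcript above can be approximated with a small validator. This is a Python sketch under assumptions, not the ConfD validation logic: it accepts only the `IP:<address>` and `DNS:<hostname>` forms, and the function name and hostname rule are illustrative.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
_DNS_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def san_entry_is_valid(entry):
    """Accept 'IP:<address>' or 'DNS:<hostname>'; reject everything else."""
    kind, sep, value = entry.partition(":")
    if not sep or not value:
        return False  # covers '', 'IP', 'DNS', and bare strings such as 'f5best'
    if kind == "IP":
        try:
            ipaddress.ip_address(value)
            return True
        except ValueError:
            return False
    if kind == "DNS":
        return all(_DNS_LABEL.fullmatch(label) for label in value.split("."))
    return False
```

Each value that the transcript marks "should give error" is rejected by this check, and `IP:1.1.1.1` is accepted.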
Conditions:
Invalid SAN values are accepted
Impact:
Confd accepting invalid SAN values
Workaround:
None
Fix:
Fixed in F5OS-A 1.8.0
1577049-1 : CVE-2024-1086 - Linux kernel vulnerability
Links to More Info: K000139430, BT1577049
1576545-2 : After upgrade, BIG-IP Next tenant is unable to export toda-otel (event logs) data to Central Manager★
Links to More Info: BT1576545
Component: F5OS-C
Symptoms:
After upgrade, the BIG-IP Next tenant is unable to export toda-otel (event logs) data to CM in VELOS
Conditions:
Upgrading BIG-IP Next tenant from 20.1 to 20.2 on a VELOS system.
Impact:
After upgrade, the BIG-IP Next tenant is unable to export toda-otel (event logs) data to CM
Workaround:
For VELOS Standalone
====================
After upgrade, if the f5-toda-otel-collector cannot connect to the host, change the tenant status from "DEPLOYED" to "CONFIGURED" and back to "DEPLOYED" to fix the issue. Note that it takes 5 to 10 minutes for the tenant status to change, and traffic might be impacted.
For VELOS HA follow the following steps
=======================================
1. Setup CM on Mango build
2. Add 2 BIG-IP Next instances(Mango build) on the CM
3. Bring up HA on CM with the Enable Auto Failover option unchecked
4. Add a license to the HA instance.
5. Deploy a basic HTTP app in FAST mode with WAF policy attached (Enforcement mode - Blocking, Log Events - all)
6. Send traffic and verify the WAF Dashboard under the Security section. The Total Requests and Blocked response fields should show non-zero values.
7. Upgrade the standby instance to the latest Nectarine build with the "auto-failover" button switched off.
8. The instances go into an unhealthy state on CM.
9. Change the status of the standby instance from Deployed to Configure Mode and save it through partition GUI/CLI.
10. After confirming the status of the pods, change the state of the standby instance back to the Deployed state from the configured state. There should be no impact on the traffic flow during this step.
11. Force a failover and check the health status of the instances. It will still show unhealthy because the instances are between upgrades (one instance with the Mango build (standby node) and the other with the Nectarine build (active node)).
12. Upgrade the standby instance to the latest Nectarine build with the "auto-failover" button switched off.
13. HA should look healthy in this state and traffic should continue to flow.
14. Change the state of the standby instance from Deployed to Configure Mode and save it using partition GUI/CLI
15. After confirming the status of the pods for the instance on partition CLI, change the state of the standby instance back to the Deployed state from the configured state.
16. We will observe the Event logs on the WAF Dashboard under the security section on CM.
17. We can also observe the logs on the "f5-toda-otel-collector" pod showing no Export failures.
18. Upgrade the CM. Systems should be Healthy.
1576241 : Duplicate MAC on different tenants
Links to More Info: K000139293, BT1576241
Component: F5OS-C
Symptoms:
VELOS system controller and chassis partition software may incorrectly start allocating the same MAC addresses to different objects in chassis partitions. In the worst case, this can result in multiple tenants using the same MAC addresses on the same VLAN, resulting in traffic disruptions for those tenants.
This issue occurs when the following conditions are met:
- You are running F5OS-C 1.6.x software on the F5 VELOS system controllers.
- The system controllers restarted simultaneously, such as during an out-of-service upgrade or power outage.
- The F5 VELOS system controllers then fail over.
Conditions:
After this occurs, the VELOS system controller loses track of which MAC addresses have been allocated to chassis partitions, setting up a situation where creating new tenants or chassis partitions may re-use MAC addresses already allocated to objects on the system.
Impact:
Traffic disruption on tenants due to duplicated MAC address.
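When auditing a system for this condition, the core check is whether two different objects hold the same MAC address on the same VLAN. The Python sketch below shows that check on plain data; how the (owner, VLAN, MAC) tuples are collected is left out, and all names and addresses used with it are hypothetical.

```python
from collections import defaultdict

def find_duplicate_macs(allocations):
    """Return {(vlan, mac): sorted owners} for MACs used by more than one owner.

    allocations is an iterable of (owner, vlan, mac) tuples. MAC addresses are
    compared case-insensitively.
    """
    owners = defaultdict(set)
    for owner, vlan, mac in allocations:
        owners[(vlan, mac.lower())].add(owner)
    return {key: sorted(names) for key, names in owners.items() if len(names) > 1}
```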
Workaround:
None
Fix:
Once a system is affected, upgrading to a version or engineering hotfix (EHF) that contains the fix for ID1576241 does not resolve the issue; manual intervention is also required to fix the issue.
1575925 : Running 'show system aaa primary-key state status' while a key migration is in progress can cause key migration errors
Links to More Info: BT1575925
Component: F5OS-C
Symptoms:
If a key migration is in progress (initiated via the ConfD action 'system aaa primary-key set'), and while it is in progress the status of the key migration is checked ('show system aaa primary-key state status'), this can intermittently cause the key migration to fail. Under these conditions, future attempts to 'show' this area of state will also return 'application communication failure'.
Conditions:
1. A ConfD primary key migration is initiated on a VELOS Controller or Appliance system.
2. While the key migration is in progress, the status of the migration is checked.
Impact:
Key migration fails, leaving encrypted ConfD elements in a corrupted state. Furthermore, all operational data callbacks for the 'system aaa primary-key' schema tree will fail indefinitely with 'application communication error'.
Workaround:
To work around this issue, reboot the affected controller(s) or appliance. After the reboot, the user may re-attempt the key migration.
Fix:
Fixed issue where checking status of key migration could cause the migration to fail.
1575585 : Unable to add blade to Openshift cluster if newly-installed blade is not member of active partition
Links to More Info: BT1575585
Component: F5OS-C
Symptoms:
After a blade is clean-installed (PXE, USB, etc.), if the blade is not a member of an enabled/functioning partition, the system is unable to add it to the OpenShift cluster successfully.
If an administrator attempts to log into the blade via SSH, it will prompt them that root's password is expired and needs to be changed:
[root@controller-1(VELOS) ~]# ssh blade-1
You are required to change your password immediately (root enforced)
Changing password for root.
(current) UNIX password:
Connection to blade-1 closed.
[root@controller-1(VELOS) ~]#
The "show cluster" command output will report that a blade is reachable ("able to ping"), but that the system cannot connect to it ("able to SSH"):
                                                          ABLE   ABLE
                                        IN       READY    TO     TO                     PARTITION
INDEX  NAME                   INSERTED  CLUSTER  CLUSTER  PING   SSH    STATE           LABEL
-------------------------------------------------------------------------------------------------
1      blade-1.chassis.local  true      false    false    true   false  Not In Cluster
2      blade-2.chassis.local  true      false    false    true   false  Not In Cluster
3      blade-3.chassis.local  true      false    false    true   false  Not In Cluster
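Blades in this state can be picked out of the "show cluster" output programmatically. The following Python sketch assumes the column order shown above (INDEX, NAME, INSERTED, IN CLUSTER, READY CLUSTER, ABLE TO PING, ABLE TO SSH, STATE) and whitespace-separated fields; the function name is hypothetical.

```python
def pingable_but_no_ssh(rows):
    """Return names of blades that answer ping but cannot be reached over SSH."""
    names = []
    for row in rows:
        fields = row.split()
        # Data rows start with a numeric index; header and separator rows do not.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        name, able_to_ping, able_to_ssh = fields[1], fields[5], fields[6]
        if able_to_ping == "true" and able_to_ssh == "false":
            names.append(name)
    return names
```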
Conditions:
-- Blade is not a member of a VELOS partition, or is a member of a disabled partition.
-- A clean install is performed on blade (i.e. PXE install); this will be the case during an RMA replacement.
Impact:
- Blade will not join Openshift cluster.
Workaround:
Either configure the blade to be a member of an enabled partition, or manually log into the blade as root and go through the "change password" process.
1574861-1 : Incomplete API payload and CLI failure for openconfig interfaces when one controller node is not ready
Links to More Info: BT1574861
Component: F5OS-C
Symptoms:
When one of the system controller nodes transitions to a "NotReady" state:
The OpenConfig Interfaces API (/openconfig-interfaces:interfaces) returns incomplete or "unfinished chunk" payloads.
CLI commands such as 'show interfaces' fail, displaying an "application communication failure" error.
Conditions:
The problem might occur when one of the system controllers is not available.
Impact:
API users may experience incomplete data responses. Users might be temporarily unable to retrieve interface data from the CLI.
Workaround:
Minimize scenarios where one controller is not available.
Fix:
Modified the callpoint registration to ensure reliable data retrieval even when one of the system controllers is unavailable.
1573493-1 : Qkview does not collect the files gid-map.txt, /etc/libnss-udr/passwd, or /etc/libnss-udr/group
Links to More Info: BT1573493
Component: F5OS-C
Symptoms:
When a QKView is collected, the files gid-map.txt, /etc/libnss-udr/passwd, and /etc/libnss-udr/group are not present in the QKView.
Conditions:
A qkview is collected.
Impact:
It may not be possible to troubleshoot certain issues related to authentication.
Workaround:
None
Fix:
The files gid-map.txt, /etc/libnss-udr/passwd, and /etc/libnss-udr/group have been added to QKView collection. Whenever a QKView is collected, these files are present.
1572929-2 : Changing remote authentication methods from RADIUS/TACACS to LDAP may break remote-gid functionality.
Links to More Info: BT1572929
Component: F5OS-C
Symptoms:
If RADIUS or TACACS are utilized for authentication, the user’s ‘passwd’ details will be saved in /etc/libnss-udr/passwd. However, if the user switches to LDAP authentication and disables the previous method, their entry may not be removed from /etc/libnss-udr/passwd.
If a user is using GID remapping (by configuring remote-gid), the authentication will fail, at least when logging into the CLI.
Conditions:
- Enable RADIUS authentication and log into the system as a remote RADIUS-defined user.
- Change the authentication method to LDAP and disable RADIUS authentication.
- Configure remote-gid functionality for an LDAP-defined user. This LDAP-defined user should have the same name as the RADIUS-defined user.
- Log into the system as that remote LDAP-defined user.
Impact:
The authentication will fail for the LDAP-defined user. An error message will appear such as: “No valid role group found in user groups: 9002 123 5340”.
Workaround:
Log into the system as a ‘root’ user and clear the information in /etc/libnss-udr/passwd.
Fix:
The remote-gid functionality will no longer be affected by changing authentication methods from RADIUS/TACACS to LDAP. LDAP users with valid credentials will be allowed in.
1572493-2 : LAG Trunk Configuration is Missing Inside of Tenant
Links to More Info: BT1572493
Component: F5OS-C
Symptoms:
When creating a LACP LAG or Static LAG, the lag and its members will show as up on the F5OS and switch side (Arista and Cisco). However, on the tenant, tmsh will show that neither the trunk nor trunk members are present:
root@(localhost)(cfg-sync Standalone)(Active)(/Common)(tmos)# list net trunk
root@(localhost)(cfg-sync Standalone)(Active)(/Common)(tmos)#
Conditions:
BIG-IP tenant on F5OS system
Impact:
The trunk information will not be visible in the tenant.
- On high-end rSeries appliances (r5000, r10000, and r12000-series systems) and VELOS tenants, traffic will still work.
- On low-end rSeries appliances (r2000 and r4000-series systems), traffic will not flow.
Workaround:
None
1572489-1 : User accounts with username which includes only numeric values or special characters like "." or ".." or starts with '-' are inactive
Links to More Info: BT1572489
Component: F5OS-C
Symptoms:
User accounts created with a username that contains only numeric values are inactive or non-functional. Usernames that start with a dash ('-'), consist only of "." or "..", or contain any invalid characters (anything other than letters, digits, underscores, dashes, and a trailing $) also create non-functional user accounts.
Conditions:
A user account is created with a username that contains only numerics, starts with a dash ('-'), or is "." or "..".
Impact:
Non-functional user accounts are created. User functions such as set-password and change-password do not work as expected.
Workaround:
None
Fix:
User account creation with invalid username will not be possible. An error will be displayed for invalid usernames.
Following is an example:
appliance-1(config)# system aaa authentication users user 12345676578 ?
Possible completions:
Error: "12345676578" is an invalid value.
The 'config' option is not available to create or configure a new user account until a valid username is provided.
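The username rules described in this entry can be expressed as a short check, for example to screen usernames in a provisioning script before sending them to the system. This Python sketch encodes only the rules stated here (letters, digits, underscores, dashes, an optional trailing $, no leading dash, not purely numeric); the actual F5OS validation may apply further rules, and the function name is hypothetical.

```python
import re

# First character: letter, digit, or underscore (no leading dash or dot).
# Remaining characters: letters, digits, underscores, dashes; optional trailing '$'.
_ALLOWED = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_-]*\$?")

def username_is_valid(name):
    """Apply the username rules listed in this entry."""
    if name.isdigit():
        return False  # purely numeric usernames are rejected
    return _ALLOWED.fullmatch(name) is not None
```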
1572137-1 : Upload/Download API should work with '/api' and '/restconf'
Links to More Info: BT1572137
Component: F5OS-C
Symptoms:
Upload/Download is not working with '/api' endpoint.
Conditions:
Use '/api' endpoint to upload/download a file.
Impact:
File upload/download fails.
Workaround:
None
Fix:
Fixed an issue occurring with the Upload/Download API.
1560533 : Inconsistent case values (upper and lower case) for different F5OS-C SNMP OIDs
Links to More Info: BT1560533
Component: F5OS-C
Symptoms:
The alertSource in SNMP core alert events contains the text Controller, starting with an uppercase C instead of a lowercase c.
Similarly, core alert events generated on a blade contain Blade instead of blade.
Conditions:
A process crashes and generates a core file while SNMP alerts are enabled.
Impact:
Tools processing SNMP alerts might get affected if tooling is case-sensitive.
Workaround:
None
Fix:
Fixed alertSource text for SNMP core alert events to send lower case.
Tools that were modified to read the capitalized alertSource of SNMP core alert events must be updated to match the correction.
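Tooling that must work both before and after this fix can compare alertSource values without regard to case. The Python sketch below illustrates that approach; the function name and the assumption that the value begins with the word "controller" or "blade" are illustrative and not taken from the MIB.

```python
def alert_source_kind(alert_source):
    """Classify an alertSource value without depending on letter case."""
    normalized = alert_source.strip().casefold()
    if normalized.startswith("controller"):
        return "controller"
    if normalized.startswith("blade"):
        return "blade"
    return "unknown"
```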
1559509 : Incorrect displayed state of blade internal data link
Links to More Info: BT1559509
Component: F5OS-C
Symptoms:
The "ifcfg" TMSTAT table on VELOS blades displays an incorrect state for a blade internal link between FPGAs. The "av.1" link is shown as DOWN regardless of its actual state. This link carries tenant traffic on VELOS blades and its operating state may be relevant when performing diagnostics.
Conditions:
The issue is seen on all VELOS blades running an F5OS version that does not have the fix.
Impact:
This issue may incorrectly indicate a breakage in a blade's datapath when there is actually none.
Workaround:
It is possible to view the correct link state with a lower-level debugging command.
From the Linux CLI of a VELOS blade, run the following command to get the current state of the blade's data links.
[root@blade-4 ~]# docker exec partition_fpga fpgatool -c "linkscan show"
Fix:
Corrected a configuration field in fpgamgr code that updates link status.
1558505 : After restarting the fpgamgr service, the last service-instance is not processed
Links to More Info: BT1558505
Component: F5OS-C
Symptoms:
Traffic outage. One service-instance on the slot is missing.
Conditions:
The fpgamgr service restarting without a full system reboot.
Impact:
Traffic outage.
Workaround:
Reboot the device.
1556173 : Poor management backplane link performance on system controller failover
Links to More Info: BT1556173
Component: F5OS-C
Symptoms:
The connectivity of the chassis management backplane may be disrupted for a minimum of 1-5 seconds, and in specific situations, for up to 20 seconds. During this time, tenant instances are unable to communicate with each other over the chassis management backplane.
Conditions:
Failover of the system controller has been observed. Rebooting the active system controller may aggravate the symptoms.
Impact:
Since tenant instances cannot communicate with one another during this period, if the link downtime exceeds 10 seconds, it will trigger a BIG-IP tenant's clusterd timeout. If that BIG-IP tenant is active in an HA pair, a failover will trigger such that the standby BIG-IP is now active.
Additionally, a sod out-of-band mgmt timeout will be triggered for that BIG-IP tenant even if the system controller's management interfaces are configured in a trunk. In some scenarios, this can trigger temporary split brain behavior between BIG-IP tenants in an HA pair.
This can cause unexpected HA failovers if the downtime is long enough and the tenants are multi-slot despite a TMM self-ip being configured in the HA mesh.
Workaround:
No workaround, only mitigations.
1. Do not reboot the active system controller. Perform a system controller failover, then reboot the controller that was previously active.
2. To mitigate issues during an unplanned controller failover, for example health check failures, increase each BIG-IP tenant's clusterd timeout and/or sod timeout up to 30 seconds to reduce erroneous sod and clusterd timeouts.
The clusterd timeout can be modified on each BIG-IP via 'tmsh modify sys db clusterd.peermembertimeout value <int>'.
The sod timeout can be modified on each BIG-IP via 'tmsh modify sys db failover.nettimeoutsec value <int>'.
3. To mitigate issues during planned controller failovers in a maintenance window, it is possible to prevent unwanted failovers between BIG-IP tenants, or split brain behavior, altogether. One strategy is, for each BIG-IP HA pair, to set the BIG-IP device offline for failover on the chassis where controller failovers are to be executed. While the BIG-IP device is offline, health checks such as the sod and clusterd timeouts will not trigger a failover to offline BIG-IP devices. Once the maintenance window is over, set each BIG-IP device back online for failover. For instructions on setting a BIG-IP traffic-group's device offline, see https://my.f5.com/manage/s/article/K15122.
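The two database variables named in mitigation 2 can be set with commands generated by a small helper such as the one below. This is a minimal sketch: the function name and the range check are illustrative assumptions based on the 30-second ceiling suggested above, and only the two db variable names come from this release note.

```python
# Hypothetical helper: builds the tmsh commands named in mitigation 2.
# Only the two db variable names come from the release note; the function
# name and the validation are illustrative.
MAX_TIMEOUT_SECONDS = 30  # upper bound suggested in the mitigation text

DB_VARIABLES = {
    "clusterd": "clusterd.peermembertimeout",
    "sod": "failover.nettimeoutsec",
}


def build_timeout_commands(seconds):
    """Return the tmsh commands that set both timeouts to `seconds`."""
    if not isinstance(seconds, int) or not 1 <= seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout must be an integer from 1 to 30 seconds")
    return [
        f"tmsh modify sys db {name} value {seconds}"
        for name in DB_VARIABLES.values()
    ]


if __name__ == "__main__":
    # Print the commands to run on each BIG-IP tenant.
    for command in build_timeout_commands(30):
        print(command)
```

The helper only prints commands; running them on each BIG-IP tenant remains a manual step.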
Fix:
System controller failover incurs no chassis management backplane link downtime.
1555457 : System controller failover may take up to 60 seconds
Links to More Info: BT1555457
Component: F5OS-C
Symptoms:
During an HA failover of system controllers, it was observed that a system controller failover may take up to one minute.
Conditions:
System controller failovers that are initiated by termination or restart of the vcc-confd container on the currently active system controller.
Impact:
A delay in system controller switchover negatively impacts system controller LACP (LACPD will only send PDUs from the active SC). This can cause problems with tenant HA, which sends HA keepalive messages over the system controller control plane network.
Workaround:
Execute the system redundancy go-standby command to perform an HA switchover prior to rebooting the active system controller.
Fix:
During a system controller failover initiated by rebooting the active system controller, it takes 3 to 5 seconds for the ConfD Active role to change to the other system controller.
1552945-1 : Tenant images renamed with bracket are not supported★
Links to More Info: BT1552945
Component: F5OS-C
Symptoms:
Live upgrades with prior releases with tenants that use images with brackets in their name will fail when going to a version that restricts the tenant image name character set.
Conditions:
Tenants using image filename with brackets won't allow upgrades to releases that validate the image filename character set.
Impact:
The tenant will have to be recreated or upgraded to a version that does not have the validation.
Workaround:
Tenant has to be recreated with the original image that didn't contain brackets.
Fix:
Brackets were included in accepted character set for tenant image filename.
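Before upgrading from a release that lacks this fix, it can help to check tenant image names for brackets. The following is a minimal sketch: the release note does not say which bracket characters are affected, so the set below (square, round, and curly) is an assumption, and the function name and sample filenames are hypothetical.

```python
# Assumption: "brackets" covers square, round, and curly brackets.
# The release note does not list the exact characters.
BRACKET_CHARS = set("[](){}")


def images_with_brackets(image_names):
    """Return the image names that contain at least one bracket character."""
    return [name for name in image_names if BRACKET_CHARS & set(name)]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    names = ["tenant-image.bundle", "tenant-image(1).bundle"]
    for name in images_with_brackets(names):
        print(f"contains brackets: {name}")
```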
1552721 : Partition IPv6 management address is not reachable after a partition switchover
Links to More Info: BT1552721
Component: F5OS-C
Symptoms:
Partition IPv6 management address is not reachable after a partition switchover.
Conditions:
Partition configured with an IPv6 management address.
Partition fails over (due to either go-standby or a fault) from one controller to the other, and then back.
Impact:
Partition is not reachable
Workaround:
Configure the partition system redundancy mode to "active-controller".
When the condition occurs, reboot the system controller that is running the standby partition, and then execute "system redundancy go-standby" on the active system controller.
Fix:
Partition management address is reachable after failover.
1552369 : F5OS-C: Partition volume cannot be removed if an active shell in that directory
Links to More Info: BT1552369
Component: F5OS-C
Symptoms:
The following error will be seen if there is an active shell(session) with the current directory /var/F5/partition{n}
+ lvremove -f /dev/partition_config/partition1
Logical volume partition_config/partition1 contains a filesystem in use.
Conditions:
There is an active shell(session) with the current directory /var/F5/partition{n}
Impact:
Partition volume fails to remove.
Workaround:
Don't ssh login to the system or don't change directory to /var/F5/partition* in ssh session.
Fix:
Any ssh session in the directory will be killed.
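On releases without the fix, one way to find sessions that would block the volume removal is to look for processes whose current directory is under /var/F5/partition{n}. The sketch below reads the Linux /proc filesystem. It is an illustration under stated assumptions, not an F5 tool: the function names are hypothetical, and it needs root privileges to see other users' processes.

```python
import os
import re

# Matches /var/F5/partition<n> and anything below it.
PARTITION_DIR = re.compile(r"^/var/F5/partition\d+(/|$)")


def is_in_partition_dir(path):
    """True if `path` is a partition directory or lies inside one."""
    return PARTITION_DIR.match(path) is not None


def pids_in_partition_dirs(proc_root="/proc"):
    """Return PIDs whose current working directory is in a partition dir."""
    if not os.path.isdir(proc_root):
        return []
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            cwd = os.readlink(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        if is_in_partition_dir(cwd):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    for pid in pids_in_partition_dirs():
        print(f"process {pid} has its working directory in a partition volume")
```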
1550413 : System events visible in the CLI may not be visible in the GUI
Links to More Info: BT1550413
Component: F5OS-C
Symptoms:
Running "show system events" on the F5OS CLI typically reveals many events that are not visible in the GUI under System Settings > Alarms & Events.
The GUI filters the display of events according to their assigned severity. But since many events are not assigned a severity, such events will be hidden from view.
Conditions:
Events that are not assigned a severity are instead marked "NA". Such events are not visible in the GUI and can only be seen via the CLI or API.
Impact:
The omission of events displayed in the GUI can be misleading. Administrators using the GUI may not be aware of important events that have occurred on the platform.
Workaround:
All system events can be seen by running 'show system events' on the F5OS CLI or by retrieving them via the REST API.
Fix:
On fixed versions, a new option called 'All' has been added to the Severity drop-down selector in the GUI. This displays all events, including ones without a severity assigned.
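The filtering behavior described above can be illustrated with a short sketch. The event records, field names, and severity labels other than "NA" and "All" are assumptions made for illustration; they are not taken from the F5OS GUI or API.

```python
def filter_events(events, severity="All"):
    """Return events matching `severity`; "All" returns every event."""
    if severity == "All":
        return list(events)
    return [event for event in events if event["severity"] == severity]


if __name__ == "__main__":
    # Hypothetical events; unassigned severities are marked "NA".
    events = [
        {"id": 1, "severity": "CRITICAL"},
        {"id": 2, "severity": "NA"},
        {"id": 3, "severity": "NA"},
    ]
    # Filtering on an assigned severity hides the "NA" events.
    print(len(filter_events(events, "CRITICAL")))
    # The 'All' option shows them.
    print(len(filter_events(events, "All")))
```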
1549753-1 : System telemetry exporter send queue and retry settings are causing memory issues
Links to More Info: BT1549753
Component: F5OS-C
Symptoms:
Memory issues are seen in the system when the telemetry exporter is unreachable for a long time.
Conditions:
When exporter is not reachable for a long time.
Impact:
The system can run out of memory.
Workaround:
User can disable the send queue and retry setting using ConfD. For example:
appliance-1(config)# system telemetry exporters exporter <<exporter name>> config options send-queue-enabled false
appliance-1(config)# system telemetry exporters exporter <<exporter name>> config options state options retry-enabled false
Fix:
Send queue and retry settings are removed for telemetry exporters.
1549549 : Blades in the "none" partition may cause kubernetes services to fail.
Links to More Info: BT1549549
Component: F5OS-C
Symptoms:
If blades in a chassis are assigned to the none partition, it is possible that kubernetes services may get scheduled on those blades and fail because they cannot find the correct container version for the service. This can cause the kubernetes cluster to fail, and specific services in the cluster to fail.
Conditions:
This can happen when there are one or more blades assigned to the none partition, and other blades and controllers in the chassis are rebooted. These reboots can cause the kubernetes services to get re-assigned to the blade in the none partition.
Impact:
The kubernetes cluster may show as failed, or the kubevirt or multus services may not operate correctly if their services land on one of the blades assigned to the none partition. This can cause existing tenants to fail, and new tenant deployments to fail.
Workaround:
The workaround is to move the blades in the none partition into a dummy partition that has a valid software version and is enabled. This will allow the blades to correctly start the kubernetes services assigned to those blades.
Fix:
Blades moved to the none partition are now marked as Non-Schedulable so that kubernetes will not try to schedule any services on them.
1549521-1 : VQF and VoQs fail to synchronize after system controller reboot
Links to More Info: BT1549521
Component: F5OS-C
Symptoms:
VQF and VoQs are unable to synchronize between blades after a system controller reboot.
Conditions:
System controller reboot.
Impact:
Loss of traffic between blades.
Workaround:
Reboot affected blades.
1538277-1 : Duplicate Service-Instance IDs for L2FwdSvc causes L2 entries to not be forwarded to all blades
Links to More Info: BT1538277
Component: F5OS-C
Symptoms:
Excessive DLFs in multi-bladed system causing traffic instability.
Conditions:
Two `L2FwdSvc` entries in the service-instance table have duplicate 'instance IDs'
Impact:
L2 entries are not forwarded to the affected blades causing excessive DLFs.
Workaround:
Reboot the higher number blade having the duplicate instance ID.
Fix:
The instance ID is no longer used as the key into a map. The slot number, which is guaranteed to be unique, is used instead.
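The effect of keying a map on a value that is not unique can be shown with a short sketch. The record layout and field names below are hypothetical and are not taken from the F5OS source; they only illustrate why a duplicate instance ID hides one blade's entry while the slot number does not.

```python
def index_by(entries, key):
    """Build a dict keyed on `key`; later entries overwrite earlier ones."""
    return {entry[key]: entry for entry in entries}


if __name__ == "__main__":
    # Hypothetical L2FwdSvc service-instance records for two blades.
    entries = [
        {"slot": 1, "instance_id": 7},
        {"slot": 2, "instance_id": 7},  # duplicate instance ID
    ]
    by_instance = index_by(entries, "instance_id")
    by_slot = index_by(entries, "slot")
    # Keyed on instance ID, one blade's entry is silently lost.
    print(len(by_instance), len(by_slot))
```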
1538217-1 : View fpgamgr core file after partition shutdown
Links to More Info: BT1538217
Component: F5OS-C
Symptoms:
fpgamgr core file.
Conditions:
Partition shutdown.
Impact:
No impact other than the core file. Likely a timing problem as the portions of the fpgamgr shut down.
Workaround:
None
Fix:
This fpgamgr corefile on shutdown can be ignored.
1536413-1 : Allowed-ips allowed-ip <name> is not accepting the '-' in the names
Links to More Info: BT1536413
Component: F5OS-C
Symptoms:
Allowed IP profiles were deleted when upgrading to 1.7.0 from lower versions: allowed-ip profiles with '-' in their names were erased. This is fixed in 1.8.0.
Conditions:
While upgrading to 1.8.0 from lower versions other than 1.7.0, all allowed IP profile names should have at least one alphanumeric character and should not contain any special character other than '-', '_', and '.'.
Impact:
Allowed IP profile gets deleted if it is not matching the pattern.
Workaround:
Re-apply the allowed-IP profile configuration without a hyphen ('-') in the name.
Fix:
Fixed the schema such that allowed IP profile name accepts the '-' in profile name.
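The naming rule stated in the Conditions above (at least one alphanumeric character, and no special characters other than '-', '_', and '.') can be checked before an upgrade with a sketch like the following. The regular expression is an approximation written from that description; the actual schema pattern is not given in this release note, and the function name is hypothetical.

```python
import re

# Approximation of the rule described above, not the actual schema pattern:
# only letters, digits, '-', '_' and '.', with at least one letter or digit.
NAME_PATTERN = re.compile(r"(?=.*[A-Za-z0-9])[A-Za-z0-9._-]+")


def is_valid_allowed_ip_name(name):
    """True if `name` satisfies the rule as described in this release note."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical profile names, for illustration only.
    for name in ["mgmt-allow", "snmp_v3.hosts", "---", "bad name!"]:
        print(name, is_valid_allowed_ip_name(name))
```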
1519869-1 : BIG-IP tenant reports blank interface
Links to More Info: BT1519869
Component: F5OS-C
Symptoms:
BIG-IP tenant reports a blank ("") interface member in the trunk when removing one or more interfaces from an aggregation.
Conditions:
BIG-IP tenant reports a blank ("") interface member in the trunk when removing one or more interfaces from an aggregation.
Impact:
BIG-IP tenant has an empty member in the trunk.
Workaround:
No workaround.
Fix:
BIG-IP tenant no longer reports a blank ("") interface member in the trunk when removing one or more interfaces from an aggregation.
1505589 : Subject-Alternative-Name (SAN) feature now supports client-side SSL Validation
Links to More Info: K000139300, BT1505589
Component: F5OS-C
Symptoms:
Since no SAN was allowed to be inserted into the http-server’s self-signed certificate, client-side SSL validation was not supported.
This impacts Central Manager's VELOS/rSeries provider. The missing SAN field causes the certificate to be rejected.
Conditions:
Using the default self-signed certificate.
Impact:
Client-side SSL validation is not supported.
Workaround:
To add an SAN, you need to edit the /etc/pki/tls/openssl.cnf file and add it. However, this may not be effective for certain software that does not accurately read the configuration file.
Fix:
A new SAN field has been implemented, which is mandatory, and allows users to enter a value in the field. However, if the value “none” is used, the field can be omitted. Additionally, to allow entry of the SAN, a default tls certificate is created in /etc/auth-config/default/f5os.cert that has the SAN populated with the hostname and management-ip values. In the absence of a user-provided self-signed certificate, the http-server will automatically use the default certificate.
1505293 : Partition image removal message is truncated
Links to More Info: BT1505293
Component: F5OS-C
Symptoms:
If a partition is enabled and then disabled while running version "A", and then upgraded to version "B", an attempt to remove partition image "A" fails and the CLI displays a truncated error message.
Conditions:
The partition is upgraded while its state is disabled.
Impact:
Incomplete error messages for the failure reason. The error that is reported is:
"Error: Failed to remove software: 1.5.1-14085, error message: Standby removal failed for following reason: OS version".
Workaround:
None
1505221-1 : Accidentally imported bad ISO images may not be removed automatically
Links to More Info: BT1505221
Component: F5OS-C
Symptoms:
When you accidentally import ISO images from a faulty URL, they cannot be removed or replaced with the correct URL.
Conditions:
User accidentally imports faulty ISO images to the system.
Impact:
Deleting and importing system ISO images might have an impact.
Workaround:
Login to the command line with root user access and remove the image via 'rm' under '/var/import/staging', and import the correct ISO.
Fix:
Please refer to the workaround and further detail.
1498009 : Learned L2 entries in data-plane L2 forwarding table may disrupt some traffic flows between tenants
Links to More Info: BT1498009
Component: F5OS-C
Symptoms:
While a tenant transitions from active to standby, an egress packet in flight may trigger a L2 learn event in the FPGA data-plane. This can occur for tenants that transmit using a different MAC address while active, such as when MAC masquerading is enabled. If so, a dynamic L2 entry is created from the source MAC address of the egress packet. These dynamic entries also enable the service DAG without setting a service ID, which causes matching packets to be dropped in the VOQ system due to an invalid service DAG lookup result.
This can disrupt egress traffic for another tenant on the same device, attempting to transmit to the destination MAC address that was recently relinquished by the standby tenant. These drops increment the 'ic_voq_drops' counter in the tmctl vqf_global table.
These L2 entries will not be corrected by subsequent L2 learn events for the same MAC address from a different location. Thus, traffic disruption may persist until entries age out.
Conditions:
- MAC masquerade configured on the traffic-group of an HA pair of tenants.
- A failover from tenant A to tenant B.
- Another tenant running alongside tenant 'A' attempts to transmit to the MAC masquerade address that is now owned by tenant 'B'.
Impact:
Traffic disruption from one tenant to another in specific directions.
Workaround:
None
Fix:
L2 entries that are created from host generated L2 learn events, no longer enable the service DAG for matching packets.
1497657-1 : First SSH login after editing remote RADIUS or TACACS+ user privileges will still apply old privileges
Links to More Info: BT1497657
Component: F5OS-C
Symptoms:
The first SSH login after editing role-based privileges for a remote RADIUS or TACACS+ user will still give the user their prior privileges (or, if the user is newly created, login will be rejected with a message saying "This account is currently not available"). Subsequent logins will apply the updated user privileges.
Conditions:
1. RADIUS or TACACS+ Authentication is enabled.
2. A new user is created in one of the above auth systems, or an existing user’s role-based access is modified.
3. The affected user SSHs into F5OS for the first time after the change in step #2.
Impact:
First login to system after creation fails, or first login after modification of user privileges gives the user incorrect privileges.
Workaround:
None
Fix:
Fix issue where first SSH login after editing remote RADIUS or TACACS+ user privileges will still apply old privileges.
1497349 : Support for SSH-RSA host key algorithm for partitions added in non-fips mode
Links to More Info: BT1497349
Component: F5OS-C
Symptoms:
Unable to establish an SSH connection to the partition using the SSH-RSA host key algorithm in non-FIPS mode.
Conditions:
Attempting to connect to the partition from an SSH client using the SSH-RSA host key algorithm while in non-FIPS mode.
Impact:
SSH connections to the partition cannot be established using the SSH-RSA host key algorithm in non-FIPS mode.
Workaround:
None
Fix:
Support for the SSH-RSA host key algorithm has been added in non-FIPS mode.
1496977-2 : Remote GID mappings to F5OS roles are disconnected for TACACS+/RADIUS authentication methods.
Links to More Info: BT1496977
Component: F5OS-C
Symptoms:
Remote GID mappings (on a TACACS+ or RADIUS server) to F5OS GIDs/roles are not working correctly. When attempting to configure a remote mapping, it results in the access rejection with a message similar to below:
[root@system ~]# ssh radius_or_tacacs_user@<F5OS system mgmt IP>
Password:
Last login: <date> from <source IP>
No valid role group found in user groups: '9000'
Connection to <mgmt IP> closed.
Conditions:
A remote GID mapping is configured for a role in F5OS and the authentication method used for remote users is RADIUS or TACACS+.
Impact:
Remote users cannot log in to the system.
Workaround:
Configure remote user's GIDs in a way that they correspond to the GIDs in F5OS for the desired role(s). Then, remove any remote GID mappings in the F5OS configuration.
Fix:
Fixed remote GID mapping to F5OS roles for TACACS+/RADIUS authentication methods.
1496893 : Third etcd instance can get into an error state on controller upgrade from 1.5.1 to 1.6.1
Links to More Info: BT1496893
Component: F5OS-C
Symptoms:
The internal datastore of the third OpenShift etcd process has become out of sync with the etcd processes on the other two controllers.
Conditions:
A split brain situation occurred in the lower level database of the third etcd instance on each controller and is unable to recover.
Impact:
The user may notice inconsistencies with the display of tenants due to this condition.
1496837-2 : User-manager's ConfD socket getting closed.
Links to More Info: BT1496837
Component: F5OS-C
Symptoms:
After repeating the change of network type and device reboot, the device goes into a state where the user-manager is not interacting with ConfD.
Conditions:
- Change remote GID role and check '/etc/gid-map.txt' file if the value is reflected.
- Switch network type and reboot the device.
Repeat the above process until the '/etc/gid-map.txt' file is no longer updated correctly.
Impact:
Any ConfD configuration change that goes through user-manager fails. This includes any of the user’s password changes, or remote GID changes.
Workaround:
Rebooting the system will get the correct GID value from the ConfD and update the '/etc/gid-map.txt' file.
Fix:
The user-manager has no reason to use NSS to lookup any PW/group info, as it deals exclusively with the local user database.
Additionally, there is a ZMQ service that belongs in authentication-mgr (which understands remote authentication) that is in the user-manager container. It forces user-manager to use an ‘/etc/resolv.conf’ that can reference remote sources.
If the user-manager trips over a lookup that goes to LDAP (usually a local-db miss), it can be very slow and time out. The ConfD->user-manager channel is sensitive to slow responses and shuts down any subscriber, callpoint handler, or daemon that takes more than 15 to 30 seconds to respond. When this happens, the user-manager sees an EOF on its ConfD sockets.
This fix forces the user-manager to only lookup on local databases.
1496397-2 : Allowing entry of a Subject-Alternative-Name (SAN) for certificate and CSR creation
Links to More Info: BT1496397
Component: F5OS-C
Symptoms:
There is no method available for inputting the SAN field during the creation of certificates or CSR.
Conditions:
While creating a CSR through system aaa tls create-csr in ConfD.
Impact:
The option to include the SAN field in certificates and/or certificate request is not available.
Workaround:
To add an SAN, you need to edit the /etc/pki/tls/openssl.cnf file and add it. However, this may not be effective for certain software that does not accurately read the configuration file.
Fix:
A new SAN field has been implemented, which is mandatory, and allows users to enter a value in the field. However, if the value “none” is used, the field can be omitted. Additionally, to allow entry of the SAN, a default tls certificate is created in /etc/auth-config/default/f5os.cert that has the SAN populated with the hostname and management-ip values. In the absence of a user-provided self-signed certificate, the http-server will automatically use the default certificate.
As this is a new feature, back-porting to older versions has not been implemented and would be difficult and complex.
1494945-2 : ConfD Application Error when tenant interface stats are not available
Links to More Info: BT1494945
Component: F5OS-C
Symptoms:
When attempting to get tenant interface stats, the system displays "Error: application error".
Conditions:
The creation or modification of tenants may result in inaccurate handling of historical data by the tenant interface-stats logic. This could lead to the display of an “Error: application error” message when queried.
For example:
appliance-1# tenants tenant cbip-tenant-b state interface-stats down-sample-to 10 average 10s-avg
Error: application error
Impact:
Confd reports the error on the command line and logs the error in platform logs.
2024-01-24T20:12:37.123437567Z: [Error]: confd: msg="Action Point reply error" error="confd error: 'Unknown error', last='Invalid confd_vtype value: 0', errno=5"
Workaround:
None
Fix:
The problem has been resolved in more recent versions of F5OS-A. To resolve it, upgrade to a more recent version of F5OS-A. It will resolve once all interfaces are enabled.
1494809-1 : Allowing user to configure HostKeyAlgorithms parameters
Component: F5OS-C
Symptoms:
A new config CLI (system security services service sshd config host-key-algorithm) is implemented to allow HostKeyAlgorithms configuration.
Conditions:
In non FIPS mode, to enable or disable ssh-rsa HostKeyAlgorithm, this newly implemented CLI can be used.
Impact:
HostKeyAlgorithm usage was not configurable.
Workaround:
None
Fix:
This is a new CLI that can be used to enable or disable ssh-rsa HostKeyAlgorithm
1492621-4 : Config-restore fails when backup file has expiry-status field for admin or root user
Links to More Info: BT1492621
Component: F5OS-C
Symptoms:
For a root or admin user, if the value for Expiry-status in the backup file is not set to enabled, then config-restore fails.
Conditions:
During backup, if the "Expiry-status" value for admin or root user is not set to enabled, then restore fails with the backup.
Impact:
Database config-restore fails.
Workaround:
For admin and root user, comment expiry-status, expiry-date in the backup file and try to restore.
Fix:
Added NACM rules in ConfD for successful config-restore.
1492401-1 : User with operator role is not having read-access to all pages
Links to More Info: BT1492401
Component: F5OS-C
Symptoms:
- User experiences unauthorized error when trying to access "Tenant Images", "Software Management", "File
Utilities", "Configuration Backup", and "System Report"
- User sees no items when trying to access "File Utilities", "Configuration Backup", and "System Report" pages
Conditions:
User has operator role.
Impact:
User is not able to view certain pages.
1490753-2 : A linkUp and linkDown traps are sent when an up interface is disabled, and vice versa
Links to More Info: BT1490753
Component: F5OS-C
Symptoms:
When F5OS system is configured with SNMP Targets for managing the Trap notifications, linkUp and linkDown traps will be sent when interface state is toggled.
Conditions:
Two traps (linkUp and linkDown) are always sent when the interface state is toggled, whether from UP to DOWN or from DOWN to UP.
Impact:
No functional impact, but when two traps are sent, the interface state over SNMP can be misleading.
Workaround:
None
Fix:
The appropriate trap, that is, linkDown trap when F5OS interface state is down and linkUp trap when F5OS interface state is up, will be sent.
1488225 : Partition dagd cores during system startup
Links to More Info: BT1488225
Component: F5OS-C
Symptoms:
Occasionally, the partition dagd component triggers an assert and cores due to loss of connectivity with the internal system database. The partition dagd component will automatically restart.
Conditions:
The system database experiences a loss of connectivity during the startup of the partition.
Impact:
No functional impact.
Workaround:
None
1486697-2 : Configuring Expiry-status of root and admin users should not be allowed
Links to More Info: BT1486697
Component: F5OS-C
Symptoms:
The Expiry-status of root and admin users can be configured, so there is a chance of locking out these users.
Conditions:
If Expiry-status of any root or admin user is marked as Locked, that root or admin user cannot log in to the system.
Impact:
There is a chance that default users, such as root and admin, become locked out.
Workaround:
None
Fix:
You cannot edit the ‘Expiry-status’ field in webUI for admin and root users. Thus, it cannot be configured. The 'Expiry-status' field for root and admin users will now always display the default value as 'Enabled'.
1474833 : Debug output is missing from qkview
Links to More Info: BT1474833
Component: F5OS-C
Symptoms:
NSE debug registers missing from qkview output.
Conditions:
-- VELOS system
-- A qkview is taken
Impact:
Qkview file is missing some desired component output.
Workaround:
Without this fix, manually read desired debug registers with existing tools under the guidance of F5 support.
Fix:
With the current fix in place, all NSE debug registers are included in the standard QKView output.
1472917-1 : LDAP authenticated admins logging in via the serial console may have trouble disabling appliance mode during system instability
Links to More Info: BT1472917
Component: F5OS-C
Symptoms:
If ConfD is not running, F5OS offers an emergency option to disable appliance mode when an administrator logs in successfully via the serial console.
Conditions:
The admin role has been configured with a remote-gid that is not 9000 and the admin successfully authenticates via LDAP on the serial console while ConfD is not running.
Impact:
Remotely-authenticated admin users cannot disable appliance mode if ConfD is offline.
Workaround:
None
Fix:
Remotely-authenticated admin users can disable appliance mode if ConfD is offline.
1472373 : Failure of BX110 10G Links to recover after going DOWN
Links to More Info: BT1472373
Component: F5OS-C
Symptoms:
If the 10G link on the BX110 experiences a disruption, such as a cable pull or peer device shutdown, it may occasionally fail to re-establish connectivity even after the issue is resolved.
Conditions:
The 10G link on the BX110 experiences a disruption, such as a cable pull or shutdown on the peer device, leading to a DOWN state.
Impact:
Loss of connectivity.
Workaround:
None
Fix:
Regularly reset the 'DOWN' link to clear the failure state and enable the establishment of the connection.
1469385-2 : GUI freezes during LDAP user authentication if no remote GID mapped locally.
Links to More Info: BT1469385
Component: F5OS-C
Symptoms:
The LDAP remote user authentication freezes for a long time (more than a minute).
Conditions:
When trying to authenticate a remote LDAP user through the GUI without mapping any of the remote user GIDs to the F5OS local roles.
Impact:
Authentication freezes for a long period before rejecting the user.
Workaround:
One of the remote GIDs should be mapped to the local F5OS roles.
Fix:
Map the remote GID(s) to the F5OS role(s) to authenticate remote LDAP users successfully.
1469333-1 : VELOS management LAG may bridge traffic between management interfaces during LACP negotiation
Links to More Info: BT1469333
Component: F5OS-C
Symptoms:
When the management interfaces of VELOS system controllers are configured in a LACP LAG, the VELOS system may incorrectly forward some ethernet frames ingressing one management interface out the other management interface.
This behavior occurs during the period between when an interface links up and when the system completes LACP negotiation and adds the interface to the LAG.
This can result in management switches incorrectly learning non-VELOS MAC addresses as being present on the VELOS management LAG interface.
Conditions:
- VELOS system
- Management interfaces configured in LACP LAG
Impact:
VELOS management interfaces incorrectly forward non-VELOS frames from one management interface out the other, causing upstream switches to learn non-VELOS MAC addresses as being present on the VELOS management LAG interface.
Workaround:
Configure the upstream switch ports as an LACP LAG first, then configure the VELOS system management interfaces to use an LACP LAG.
1466397 : LDAP authentication is consuming several minutes to authenticate via GUI and SSH.
Links to More Info: BT1466397
Component: F5OS-C
Symptoms:
LDAP authentication succeeds, but it takes several minutes to complete, which makes for a poor user experience.
Conditions:
- Configure LDAP server-group.
- Configure LDAP_ALL as an authentication-method.
- Log in using LDAP user via GUI or SSH.
Impact:
The user is forced to wait for several minutes to get the result of LDAP authentication.
Workaround:
None
Fix:
Removed unnecessary GID lookup to speed up LDAP authentication.
1462329 : CC takes time to come up after reboot is triggered in active CC.
Links to More Info: BT1462329
Component: F5OS-C
Symptoms:
Containers take time to come up after reboot when active CC is rebooted.
Conditions:
Reboot should be triggered in active CC.
Impact:
Current standby CC takes time to come up after reboot.
Workaround:
None
1461289 : On an rSeries appliance, config-backup proceed is broken
Links to More Info: BT1461289
Component: F5OS-C
Symptoms:
On an rSeries appliance, the system database config-backup 'proceed' prompt is broken. The prompt is meant to confirm overwriting an existing backup file, but it appears even if the file does not exist.
Conditions:
System database config-backup always prompts for the user to proceed even if a file does not exist.
Impact:
No functional impact. When you provide input 'yes', the backup file will be generated.
Workaround:
When prompted to 'proceed', you must respond with 'yes'.
Fix:
The system database config-backup prompts the user with the ‘proceed’ option only when the file exists and the user has not provided ‘proceed yes’ in the input CLI command.
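The corrected prompting rule can be expressed as a single condition. This is a minimal sketch of the logic as described in the Fix, with a hypothetical function name and argument names; it is not the actual implementation.

```python
def should_prompt_to_proceed(file_exists, proceed_arg=None):
    """Prompt only when the backup file exists and 'proceed yes' was not given."""
    return file_exists and proceed_arg != "yes"


if __name__ == "__main__":
    # A new backup name never prompts; an existing one prompts unless
    # the command already included 'proceed yes'.
    print(should_prompt_to_proceed(False))
    print(should_prompt_to_proceed(True))
    print(should_prompt_to_proceed(True, "yes"))
```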
1455913-4 : Tcpdump on F5OS does not honor the -c flag
Links to More Info: BT1455913
Component: F5OS-C
Symptoms:
When using Tcpdump on F5OS with the -c flag, Tcpdump will not stop after receiving the given number of packets.
Conditions:
A Tcpdump session is started with the -c or --count flag.
Impact:
The Tcpdump session will not terminate after receiving the requested number of packets and will continue until manually terminated.
Workaround:
N/A
Fix:
Tcpdump now honors the -c flag and will terminate after receiving the given number of packets.
1455769 : Slow execution of ansible-playbooks on cluster reinstall caused timeouts and retries for many hours.
Links to More Info: BT1455769
Component: F5OS-C
Symptoms:
An OpenShift cluster rebuild kept failing and retrying due to timeouts while running the ansible-playbooks to rebuild the cluster. This caused the cluster rebuild to fail for more than 8 hours, during which time no tenants could be started.
Conditions:
An OpenShift cluster rebuild was issued after upgrading the system.
Unable to reproduce this issue locally.
Impact:
While the ansible-playbook runs were timing out, it was not possible to launch tenants on the chassis.
Workaround:
The playbooks stopped timing out after 8 plus hours, no workaround is known.
Fix:
1.) Enhanced code that generates and corrects the /etc/hosts file to make sure all the necessary entries are always present and correct.
2.) Enhanced the code that handles the SSH connection caching to make sure it is always cleared during ansible-playbook runs, so it won't be affected by a stale connection.
3.) Playbook timeouts will be increased after a timeout failure, up to 3x, to allow the system to complete its work even if something is slowing down the playbook runs.
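Item 3 describes a timeout that grows after each timeout failure until it reaches three times its starting value. The sketch below shows one way such a schedule could work. The step size (one base interval per failure) and the function names are assumptions; the release note only states the 3x ceiling.

```python
def next_timeout(base, current, max_factor=3):
    """Grow the timeout by one base interval, capped at max_factor * base."""
    return min(current + base, base * max_factor)


def timeout_schedule(base, attempts, max_factor=3):
    """Return the timeout used on each attempt if every attempt times out."""
    schedule = []
    current = base
    for _ in range(attempts):
        schedule.append(current)
        current = next_timeout(base, current, max_factor)
    return schedule


if __name__ == "__main__":
    # With a hypothetical 600-second base, the timeout stops growing at 1800.
    print(timeout_schedule(600, 5))
```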
1455725-1 : Partition go-standby command sometimes fails to change active instance
Links to More Info: BT1455725
Component: F5OS-C
Symptoms:
The partition "go-standby" command is sometimes too slow to finish taking over. When this happens usually the system briefly goes active/active and then resolves to the preferred node.
Conditions:
Attempting to force the partition active instance location using the "go-standby" command. Normal HA framework initiated failovers work properly.
Impact:
When the confd instances are failing back & forth, the control plane daemons will be disconnected.
Workaround:
Allow the HA framework to manage instance locations and don't use "go-standby" to attempt to force instance location. If necessary, the "mode" can be set temporarily to "prefer" the desired location.
Fix:
Performance of yield/takeover operation has been improved.
1436153-2 : F5OS upgrades fail when SNMP configuration contains special characters.
Links to More Info: BT1436153
Component: F5OS-C
Symptoms:
As part of some security fixes, a special character restriction was added to the SNMP configuration in F5OS-A 1.5.1. This resulted in upgrade failures to 1.5.1. If an upgrade to 1.5.1 is successful, the SNMP configuration is deleted implicitly.
Conditions:
Upgrade to 1.5.1 fails when the SNMP configuration contains any special characters. The restricted special characters are: /*!<>^,/
Impact:
If the user encounters this issue, the system will go to an inaccessible state and require a forced downgrade.
Workaround:
Delete the SNMP configuration (community, target, or user) containing special characters before performing an upgrade to 1.5.1.
Fix:
Special characters in the SNMP configuration do not introduce any security issues and are permitted. Hence, the special character restriction is removed in F5OS-A 1.5.2 and F5OS-A 1.8.0.
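Before upgrading to an affected release, the SNMP community, target, and user names can be screened for the restricted characters. The sketch below is a hypothetical helper, not an F5 tool: it assumes the names have already been collected (for example, copied from the running configuration) and it treats every character in the list above, including the slash, as restricted.

```python
# Characters listed as restricted in the Conditions section (assumed to be
# individual characters, with the slash itself included).
RESTRICTED_CHARS = set("/*!<>^,")


def names_blocking_upgrade(names):
    """Return the names that contain at least one restricted character."""
    return [name for name in names if RESTRICTED_CHARS & set(name)]


if __name__ == "__main__":
    # Example names are made up for illustration.
    for name in names_blocking_upgrade(["public", "ops,team", "net<1>"]):
        print(f"contains a restricted character: {name}")
```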
1429741-3 : Appliance management plane egress traffic from F5OS-A host going via BIG-IP Next tenant management interface instead of host management when both are in same subnet
Links to More Info: BT1429741
Component: F5OS-C
Symptoms:
When a BIG-IP Next tenant is installed, a default route rule is added on the host. If the tenant management and host management IPs are on the same subnet, two similar rules are created with the same subnet as the destination.
The tenant route rule is created with higher priority (metric 0), so any management egress traffic destined for the same subnet goes through the tenant management interface instead of the host management interface.
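The effect of the metric can be shown with a small route-selection sketch. It is a simplified model, not the host's actual routing code: the interface names, the example subnet, and the host route metric of 100 are assumptions chosen for illustration. Only the tenant metric of 0 comes from the description above.

```python
import ipaddress


def pick_route(routes, destination):
    """Pick the route for destination: longest prefix first, then lowest metric."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r["dst"])]
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r["dst"]).prefixlen, r["metric"]),
        default=None,
    )


if __name__ == "__main__":
    routes = [
        {"dst": "192.0.2.0/24", "dev": "host-mgmt", "metric": 100},   # assumed metric
        {"dst": "192.0.2.0/24", "dev": "tenant-mgmt", "metric": 0},   # metric from the text
    ]
    print(pick_route(routes, "192.0.2.50")["dev"])
```

With both rules covering the same subnet, the prefix lengths tie and the lower metric decides, which is why the tenant management interface is chosen.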
Conditions:
BIG-IP Next tenant is deployed on appliance.
Impact:
End users receiving traffic from the appliance will observe the sender IP as the tenant management interface instead of the host management interface.
Note:
a. This issue is observed only when the host management and tenant management subnets are the same and the destination to which data is sent is also on that subnet.
b. This impacts management plane traffic within the appliance's management subnets.
Workaround:
N/A
Fix:
N/A
1429721-2 : SCP as non-root user does not report errors correctly for bad/non-existent files.
Links to More Info: BT1429721
Component: F5OS-C
Symptoms:
Using SCP to retrieve files from F5OS as "admin" or other non-root users should report a proper error when attempting to access an invalid directory or non-existent file.
Instead, the SCP command does nothing, reports no error, and exits with a non-zero exit status.
Conditions:
Attempt to read a non-existent/inaccessible file via SCP.
Impact:
The user is not informed about the failed SCP operation and the reason for the failure.
Fix:
SCP server software now reports errors for invalid/inaccessible filenames.
1429713 : VELOS ATSE v7.10.4.12 firmware
Links to More Info: BT1429713
Component: F5OS-C
Symptoms:
VELOS ATSE v7.10.4.12 firmware
Conditions:
VELOS CX410 blades.
Impact:
Not applicable.
Workaround:
Not applicable.
Fix:
Fixes RRDAG issues. See ID1347997 or ID1785385 for more information.
1411137-2 : Audit log entries are missing when creating or deleting objects via UI or API
Links to More Info: BT1411137
Component: F5OS-C
Symptoms:
When creating or deleting multiple remote-server related objects via UI or API, multiple restarts happen, causing log messages to be dropped.
Conditions:
While creating or deleting multiple objects related to remote-server, rsyslog restarts every time to apply the new configuration. Due to the restarts, some log messages are dropped.
Impact:
Log messages are dropped due to multiple restarts of the rsyslog.
Workaround:
None
1410729 : VELOS backplane packet priority issue
Links to More Info: BT1410729
Component: F5OS-C
Symptoms:
A packet priority issue was discovered during internal testing.
Conditions:
No special conditions.
Impact:
No impact has been reported.
Workaround:
Fixed in VQF bitfiles v8.10.1.3 and newer.
Fix:
Updated priority of backplane traffic in VQF bitfile.
1410609 : Watchdog resets during PSU management may cause AOM/LOP to remain in bootloader mode
Links to More Info: BT1410609
Component: F5OS-C
Symptoms:
The system controller AOM/LOP may encounter a watchdog reset while doing PSU management. If multiple watchdog resets occur in succession, then the AOM/LOP may remain in bootloader mode and be unavailable.
When this occurs, the PEL log will indicate a LOP watchdog reset in the LopPsuManagement task, for example:
07/17/2024 06:27:14 | 36644 | AOM | 128 | Network Access | 5 | LopPsuManagement task 100% of watchdog period, resettin
07/17/2024 06:27:20 | 36645 | AOM | 190 | Network Access | 5 | watchdog reset, successive watchdog resets: 9
When there have been 10 successive watchdog resets then AOM/LOP remains in bootloader mode and needs to be reprogrammed.
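To see how close a controller is to the limit, the counter can be read from PEL log lines like the ones above. The following is a minimal sketch, assuming the log text has already been collected into a list of lines; the function name is hypothetical and the message format is taken only from the two example lines shown here.

```python
import re

RESET_LIMIT = 10  # successive resets after which AOM/LOP stays in bootloader mode

_RESET_PATTERN = re.compile(r"successive watchdog resets:\s*(\d+)")


def successive_resets(pel_lines):
    """Return the highest 'successive watchdog resets' count seen, or 0 if none."""
    counts = []
    for line in pel_lines:
        match = _RESET_PATTERN.search(line)
        if match:
            counts.append(int(match.group(1)))
    return max(counts, default=0)


if __name__ == "__main__":
    sample = [
        "07/17/2024 06:27:20 | 36645 | AOM | 190 | Network Access | 5 | "
        "watchdog reset, successive watchdog resets: 9",
    ]
    count = successive_resets(sample)
    print(f"{count} successive resets, limit is {RESET_LIMIT}")
```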
Conditions:
- A VELOS system controller, in either the CX-410 or CX-1610 chassis.
Impact:
If multiple watchdog resets occur in succession, then the system controller's AOM/LOP may remain in bootloader mode and be unavailable.
Workaround:
At the system controller host prompt, verify that the AOM/LOP is in bootloader mode.
[root@controller-1 ~]# lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID f5f5:df11
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
An AOM/LOP in bootloader mode will enumerate as USB device f5f5:df11 as shown above.
To reprogram the AOM/LOP firmware, first locate the latest firmware version provided by F5OS-C. For example:
[root@controller-1 ~]# ls $(docker container inspect platform-fwu -f '{{ range.Mounts}}{{.Source}}{{printf "\n"}} {{end}}' | grep config_fw-volume) | grep ^lop-chassis-controller
lop-chassis-controller-v2.01.1238.0.1.dfu
Then reprogram the AOM/LOP using the firmware version located above, for example:
[root@controller-1 ~]# docker exec -it platform-fwu dfu-util -D /usr/lib/firmware/lop-chassis-controller-v2.01.1238.0.1.dfu
It will take approximately 2 minutes to reprogram the AOM/LOP firmware image. The AOM/LOP will enumerate as USB device f5f5:3000 after reprogramming, as shown:
[root@controller-1 ~]# lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID f5f5:3000
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
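The before and after checks in this workaround both come down to looking for a USB ID in the lsusb output. A minimal sketch of that check follows. It assumes the lsusb text has been captured into a string; the function names are hypothetical, and the two USB IDs are the ones quoted in this workaround.

```python
import re

BOOTLOADER_ID = "f5f5:df11"   # AOM/LOP in bootloader mode
APPLICATION_ID = "f5f5:3000"  # AOM/LOP after successful reprogramming


def usb_ids(lsusb_output):
    """Extract all vendor:product IDs from lsusb output."""
    return re.findall(r"\bID ([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\b", lsusb_output)


def aom_lop_state(lsusb_output):
    """Classify the AOM/LOP as 'bootloader', 'application', or 'unknown'."""
    ids = [usb_id.lower() for usb_id in usb_ids(lsusb_output)]
    if BOOTLOADER_ID in ids:
        return "bootloader"
    if APPLICATION_ID in ids:
        return "application"
    return "unknown"
```

Running the check once before and once after the dfu-util step gives a simple confirmation that the reprogramming took effect.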
Fix:
Fixed in lop-chassis-controller-application-2.01.1276.0.1 and later.
1410229 : Display a GUI warning to let user know tenants might be affected/reboot★
Links to More Info: BT1410229
Component: F5OS-C
Symptoms:
User is not informed about tenants getting temporarily affected when an F5OS upgrade operation is performed.
Conditions:
Upgrading an F5OS system.
Impact:
User can be unaware that tenants will be affected/rebooted by performing a system upgrade.
Workaround:
None
Fix:
Added a warning message that is displayed in the confirmation popup before a system upgrade is triggered. This new warning message conveys that the upgrade operation may lead to temporary downgrade of tenants.
1410225 : Enhanced the upgrade prompt for better understanding the impacts of upgrade on tenants
Links to More Info: BT1410225
Component: F5OS-C
Symptoms:
Older upgrade prompt didn't include information about impacts of upgrade on tenants.
Conditions:
An upgrade is triggered.
Impact:
Upgrading does not warn you that tenants will be restarted and traffic will be disrupted.
Workaround:
None
Fix:
Fixed in 1.8.0
1408477-1 : When more than one PCIe AER error has occurred, diag-agent reports this as a "RAS AER 'unknown' error" instead of the individual AER errors.
Links to More Info: BT1408477
Component: F5OS-C
Symptoms:
When more than one PCIe AER error occurs simultaneously, diagnostics will not report the events.
Conditions:
This occurs when more than one PCIe AER error occurs simultaneously.
Impact:
You are unable to see the individual PCIe errors.
Workaround:
None
Fix:
Updated diagnostics to consider and report more than one PCIe AER error when they occur simultaneously.
1408369-1 : The "MAC exhaustion" error message during tenant creation may be caused by configuration processed during startup initialization★
Links to More Info: BT1408369
Component: F5OS-C
Symptoms:
If tenant configuration is processed before startup initialization has completed then there may be a MAC exhaustion error issued.
Conditions:
If startup initialization has not completed then the available MAC addresses are not known and a MAC exhaustion error is issued.
Impact:
The tenant configuration fails.
Workaround:
Reconfigure the tenant once startup has completed.
Fix:
The configuration code is now gated to not run until startup initialization has completed.
1403817 : SNMP IF-MIB misreports the status and speed of LACP LAGs
Links to More Info: BT1403817
Component: F5OS-C
Symptoms:
SNMP polling on IF-MIB provides incorrect status and speed of LACP LAG interfaces.
Conditions:
The issue is seen only on the SNMP interface. The correct status and speed are displayed in the CLI or GUI.
Impact:
The user will see incorrect status and speed details when polling IF-MIB over SNMP for LACP LAG interfaces.
Workaround:
None
Fix:
Fixed the issue to display the correct values of LACP LAG interfaces in IF-MIB SNMP polling.
1401965 : Copying BIG-IP ISO to /var/import/staging/, leaves ISO loopback mounted
Links to More Info: BT1401965
Component: F5OS-C
Symptoms:
An error occurs:
ERROR: sw-mgmt: priority=error msgid=0x3501000000000154 msg=Unexpected error processing "import /var/export/chassis/import/iso/<image>.iso": [Errno 30] Read-only file system: 'ace-1.1.7-0.0.3.i686.rpm'
Conditions:
Copying a BIG-IP ISO to /var/import/staging/ (rather than /var/F5/system/IMAGES or /var/F5/partition<num>/images)
Impact:
An error occurs and the ISO loopback remains mounted
Workaround:
None
Fix:
Fixed in F5OS-A/C 1.8.0
1401621-1 : Modifying a remote server with multiple selectors from the web UI removes the AUTHPRIV configuration.
Links to More Info: BT1401621
Component: F5OS-C
Symptoms:
The AUTHPRIV option is not available on the webUI. Modifying a remote log server that has multiple selectors from the webUI removes the AUTHPRIV configuration.
Conditions:
Modifying a remote server with multiple selectors from the webUI.
Impact:
The AUTHPRIV selector has been removed from the configuration.
Workaround:
To modify the configuration of a remote server with more than one selector, use the CLI.
Fix:
Added the AUTHPRIV option to the webUI. Modifying the configuration of a remote server with more than one selector from the webUI no longer removes AUTHPRIV from the configuration.
1400557-1 : Incorrect slot info may cause blade backplane link errors
Links to More Info: BT1400557
Component: F5OS-C
Symptoms:
In VELOS v1.7.1, the system controller relies on the slot width of each blade to configure the backplane port as required for the blade occupying the slot. If the slot width is one, the system controller configures the port for Bx110, and if it is two, it configures the port for Bx520. If the ConfD slot info is not available, the resulting backplane port configuration may not be suitable for the blade occupying the given slot.
Conditions:
A Bx110 or Bx520 blade is present in the chassis.
Impact:
Blade will fail to send/receive traffic over the backplane.
Workaround:
This defect is addressed in 1.8.0 by changing chassis manager so that it updates ConfD with slot info provided by platform-ha before processing other HA info that might result in an error which skips the processing of the slot info.
Fix:
Blade backplane port link errors no longer observed when suitable slot info is present in system controller ConfD.
1400221-2 : OpenTelemetry exporters may not produce data upon first tenant being added to system
Links to More Info: BT1400221
Component: F5OS-C
Symptoms:
Telemetry streaming stops when the first tenant is configured.
Conditions:
When OpenTelemetry exporters are configured before the first tenant is configured within F5OS, this can lead to a condition where the exporters stop streaming metrics and logs.
Impact:
OpenTelemetry exporters stop producing metrics and logs.
Workaround:
The work-around is to disable and re-enable all exporters from the ConfD CLI.
system telemetry exporters exporter <name> config disabled
system telemetry exporters exporter <name> config enabled
Fix:
N/A
1400125 : Non-patch version of orchestration may start on controller after RMA replacement or rolling upgrade.
Links to More Info: BT1400125
Component: F5OS-C
Symptoms:
In a patch release after an RMA or rolling upgrade, the orchestration-manager version that is started may be from the base build of the patch release, rather than the version from the patch release, if orchestration-manager was updated in the patch release.
Conditions:
An RMA of a system running a patch release, or a rolling upgrade to a patch release.
Impact:
Base version of orchestration-manager may run on either controller instead of the patch version, until the controller is rebooted. This means that the patch release will not take effect until the controller is restarted.
Workaround:
Reboot the affected controller.
Fix:
Orchestration manager code was updated to wait for the patch registry to be created and populated before launching orchestration-manager.
1399929 : F5OS permits non-existent ethernet interfaces to be configured
Links to More Info: BT1399929
Component: F5OS-C
Symptoms:
F5OS allows you to manually type in non-existent interfaces of type "ethernetCsmacd" when adding an interface component.
The system later prohibits you from deleting this non-existent interface while the type is ethernetCsmacd.
Conditions:
User-triggered command for non-exposed interface type.
Impact:
The configuration contains a non-existent ethernet interface with no actual activity.
Attempting to delete the interface from the Partition CLI will result in the following error:
Partition-1(config)# no interfaces interface 1/1.1
Warning: Some elements could not be removed due to NACM rules prohibiting access.
From the Partition GUI, the Network -> Interfaces page will show blank:
"There are no items to show in this view."
Workaround:
Delete the non-existent interface.
Depending on the type of the created interface, there are two ways to delete it.
If the interface is not a valid interface for the blade, such as 25.0 or even a fake name like "example", you can delete it using "no interfaces interface 25.0".
If the interface is for a real interface (for example, you have created interface 1/1.1 while the portgroup is in 100G mode, where only 1.0 is valid), use the following procedure to delete it:
From the controller running the active partition:
Log in to the controller that is the active one for the needed partition (this can be seen via the "show partitions" command on the controller). Also check the partition ID in the "show partitions" output on the controller.
From controller bash: docker exec -it partition2_manager bash (assuming the partition ID is 2)
From inside the partition run the following:
confd_cmd -c "mdel /interfaces/interface{1/1.1}"
Fix:
With this fix, F5OS will reject the creation of non-existent interfaces of type ethernetCsmacd.
1399757 : SNMP ifTable data missing for some interfaces when ports unbundled
Links to More Info: BT1399757
Component: F5OS-C
Symptoms:
SNMP interface data is not returned for all interfaces on the system when the device is configured with unbundled interfaces (4x10Gb or 4x25Gb modes).
Conditions:
Configure device with unbundled interfaces.
default-1(config)# portgroups portgroup 1/2 config mode MODE_4x25GB
default-1(config-portgroup-1/2)# commit
The following warnings were generated:
'portgroups portgroup': VLAN, LAG, FDB, L2 protocols configuration is lost for the interfaces corresponding to the changed portgroups. Blade(s) 1 will reboot.
Proceed? [yes,no] yes
Commit complete.
default-1(config-portgroup-1/2)# top
default-1(config)# portgroups portgroup 2/2 config mode MODE_4x25GB
default-1(config-portgroup-2/2)# commit
The following warnings were generated:
'portgroups portgroup': VLAN, LAG, FDB, L2 protocols configuration is lost for the interfaces corresponding to the changed portgroups. Blade(s) 2 will reboot.
Proceed? [yes,no] yes
Commit complete.
Check the SNMP output. It will not list all the interfaces.
Impact:
SNMP will not list all the unbundled (subports) interfaces.
Workaround:
None
Fix:
Fixed an issue with SNMP not listing all unbundled interfaces.
1397145-3 : Unable to add blade to Openshift cluster if VELOS partition root password is expired or locked
Links to More Info: BT1397145
Component: F5OS-C
Symptoms:
If a VELOS partition root password is expired or locked, the system may be unable to add the blade to the Openshift cluster (or manage the cluster).
The "show cluster" command output will report that a blade is reachable ("able to ping"), but will not be able to connect to it ("able to SSH"):
                                                          ABLE  ABLE
                                        IN       READY    TO    TO                     PARTITION
INDEX  NAME                   INSERTED  CLUSTER  CLUSTER  PING  SSH    STATE           LABEL
--------------------------------------------------------------------------------------------------
1      blade-1.chassis.local  true      false    false    true  false  Not In Cluster
2      blade-2.chassis.local  true      false    false    true  false  Not In Cluster
3      blade-3.chassis.local  true      false    false    true  false  Not In Cluster
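The symptom in the output above (reachable by ping but not by SSH) can be picked out of saved "show cluster" rows with a short script. This is a sketch under stated assumptions: the function name is hypothetical, and the column order (index, name, inserted, in cluster, ready cluster, able to ping, able to SSH) is inferred from the header shown above.

```python
def pingable_but_no_ssh(rows):
    """Return blade names whose 'able to ping' is true and 'able to SSH' is false."""
    flagged = []
    for row in rows:
        parts = row.split()
        # Skip header, separator, and any row without the expected columns.
        if len(parts) < 7 or not parts[0].isdigit():
            continue
        name, able_to_ping, able_to_ssh = parts[1], parts[5], parts[6]
        if able_to_ping == "true" and able_to_ssh == "false":
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    rows = [
        "1 blade-1.chassis.local true false false true false Not In Cluster",
        "2 blade-2.chassis.local true false false true false Not In Cluster",
    ]
    print(pingable_but_no_ssh(rows))
```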
Conditions:
-- VELOS partition
-- root account in partition is expired or locked
Impact:
- Blade will not join Openshift cluster.
- Unable to deploy Tenants to blade.
Workaround:
Re-enable the root user account for the partition:
system aaa authentication users user root config expiry-status enabled
1394993 : Upon configuration changes, the l2-agent container restarts with a core.
Links to More Info: BT1394993
Component: F5OS-C
Symptoms:
On systems running F5OS-A or F5OS-C, when the owner field of the fdb entry is updated by the system for L2_LISTENER entries, l2_agent crashes.
Conditions:
Configuration changes triggered by system for L2_LISTENER fdb entries. Note that this field is not used by STATIC fdb entries, but the problem can be reproduced easily with STATIC entries.
Impact:
When l2_agent crashes there is a potential disruption to configuration processing.
Workaround:
None
Fix:
The fix will avoid the crash, and the update of the owner leaf will be processed accordingly.
1394913 : Rare LACPD crash during process termination
Links to More Info: BT1394913
Component: F5OS-C
Symptoms:
LACPD crashes, generating a core file.
Conditions:
While the LACPD process terminates, it may crash. Operations such as a host reboot and software upgrade cause the process to terminate.
Impact:
A core file is generated. No functional impact to the system.
Workaround:
N/A
Fix:
LACPD no longer crashes during process termination.
1394201 : Vcc-lacpd can intermittently core dump when disconnected from system database
Links to More Info: BT1394201
Component: F5OS-C
Symptoms:
Vcc-lacpd unexpectedly restarts, leaving a core file on the related system controller.
Conditions:
Vcc-lacpd can disconnect from the system database while the process is running. A disconnect of this nature is hard to predict and is not typical. When the connection is reestablished, the process typically crashes.
Impact:
A core file for vcc-lacpd process is generated. Vcc-lacpd process restarts and recovers. Chassis backplane LACP aggregations may go down for a few seconds while the process restarts, briefly interrupting mgmt traffic to blades. User dataplane traffic is unaffected.
Workaround:
None
Fix:
Vcc-lacpd no longer crashes in this case.
1393441 : Partition fails over on link fault when mgmt ports are aggregated
Links to More Info: BT1393441
Component: F5OS-C
Symptoms:
After aggregating management ports, failover can occur if the active controller's management link goes down.
Conditions:
-- Aggregated system controller management ports
-- The active controller management link goes down
Impact:
An unexpected failover occurs
Workaround:
None
Fix:
In releases with this fix, if the user aggregates mgmt ports and the active system controller link goes down, no failover will occur.
1393269-2 : Error log: "PINGLOOP Failed to ssh to 127.0.0.1"
Links to More Info: BT1393269
Component: F5OS-C
Symptoms:
"PINGLOOP Failed to ssh to 127.0.0.1" logged in platform.log by Appliance Orchestration Manager.
Conditions:
1. root user locked with expiry status set to "locked".
2. Appliance rebooted after locking root user.
Impact:
Internal processes relying on root user may malfunction.
Workaround:
Avoid locking the root user account by not setting the expiry status to "locked".
Use appliance mode for root user lockdown.
1389001 : Controller upgrade failed with certificate bundle
Links to More Info: BT1389001
Component: F5OS-C
Symptoms:
System controller upgrade failed with "Compatibility verification failed" error in CLI and webUI.
Conditions:
A certificate bundle is configured.
Impact:
Upgrade failed.
Workaround:
Delete the certificate bundle.
1388525 : Partition configuration database locks up, preventing database changes
Links to More Info: BT1388525
Component: F5OS-C
Symptoms:
At times, the partition HA cluster fails to start up correctly, leading to issues with database replicas and the secondary controller instance not reaching "standby".
The "show system redundancy" command at the partition CLI can confirm this issue. Blades will be either "offline" or "failed", with a reason of "reconnecting" or "database disconnected" for an extended period (more than a few seconds).
Conditions:
Write transactions occurring during HA cluster formation can sometimes interfere with database initialization/replication, most often observed when multiple blades reboot together during a rolling upgrade.
Impact:
Blades fail to initialize, causing tenants to not restart correctly.
Workaround:
Disable and re-enable the partition.
If both partition controller instances are healthy (active/standby), in the partition CLI, enter config mode and use the "system redundancy go-standby" command.
Fix:
The HA framework recognizes the database replication lockup and automatically resets the cluster.
1388477-1 : Default GID group mapping authorized even when GID mapped to different group ID
Links to More Info: K000139503, BT1388477
1381737-1 : On VELOS, utils-agent generates "item is not writable" errors every fifteen minutes
Links to More Info: BT1381737
Component: F5OS-C
Symptoms:
The "utils-agent" daemon generates a number of error messages every 15 minutes:
utils-agent[18]: priority="Err" version=1.0 msgid=0x5e01000000000011 msg="utils-agent : failed get value for cdb" COMPONENT="/file/transfer-operations" ERROR="item is not writable" LASTERROR="Not allowed in slave mode" ERRORNO=4.
These error messages are generated on the standby node.
Conditions:
-- VELOS system controller or VELOS partition
Impact:
These error messages can be ignored.
Workaround:
None
1381661-1 : LDAP external authentication fails if there is no group definition for user's primary GID
Links to More Info: BT1381661
Component: F5OS-C
Symptoms:
LDAP external authentication (e.g. REST API or GUI; but not ssh) fails in the following scenario:
- User is defined in external auth system (e.g. LDAP)
- User has a primary GID assigned
- There is no group definition for user's primary GID
While this is legal, because the numeric GID should be sufficient, when we try to look up the group info and fail, this short circuits authentication resulting in an error.
Conditions:
- User is defined in external auth system (e.g. LDAP)
- User has a primary GID assigned
- There is no group definition for user's primary GID
Impact:
Externally defined users may not be able to log in.
Workaround:
Define a group for the user's primary group ID.
system aaa authentication roles role <group name> config remote-gid <group ID>
Fix:
LDAP external authentication no longer fails if there is no group definition for user's primary GID. The numeric GID is sufficient.
1381385-3 : Additional password policy settings
Component: F5OS-C
Symptoms:
You are unable to configure min-days, warn-age, and remember when configuring a password policy.
min-days: a limit on how many days a user must wait between password changes
warn-age: indicates how many days before their password expires a user will be warned
remember: indicates the number of previous user passwords that will be saved in the system
Conditions:
Configuring the password policy
Impact:
It is not possible to configure min-days, warn-age, or remember.
Workaround:
None
Fix:
You can now configure warn-age, min-days, and remember when setting a password policy.
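The three settings can be read as three independent checks. The sketch below models them in plain Python to make the semantics concrete. It is not F5OS code: the class, function names, and example values are hypothetical, and max_age is included only because the warning is relative to the expiry date.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PasswordPolicy:
    min_days: int   # days a user must wait between password changes
    max_age: int    # days after which the password expires
    warn_age: int   # days before expiry at which the user is warned
    remember: int   # number of previous passwords kept


def may_change_password(policy, last_change, today):
    """min-days: a change is allowed only after min_days have passed."""
    return (today - last_change).days >= policy.min_days


def should_warn(policy, last_change, today):
    """warn-age: warn when expiry is at most warn_age days away."""
    expires = last_change + timedelta(days=policy.max_age)
    days_left = (expires - today).days
    return 0 <= days_left <= policy.warn_age


def is_reused(policy, new_hash, history):
    """remember: reject a password found among the last 'remember' entries."""
    recent = history[-policy.remember:] if policy.remember > 0 else []
    return new_hash in recent
```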
1381277-1 : Most recent login information is not displayed in F5OS webUI
Links to More Info: BT1381277
Component: F5OS-C
Symptoms:
The most recent login information is not available in the F5OS webUI. These details can only be accessed through the CLI.
Conditions:
When using F5OS webUI.
Impact:
To access the most recent login information, you must use the CLI.
Workaround:
Use CLI command 'show last-logins' to access the recent login information.
Fix:
From F5OS-A 1.8.0, the most recent login information can be found in the User & Roles screen of the F5OS webUI.
1381057-2 : Opening and closing preview pane is causing the page scrollbar to disappear on View Tenant Deployments screen
Links to More Info: BT1381057
Component: F5OS-C
Symptoms:
On the "View Tenant Deployments" screen, when there are a significant number of tenants on the tenant data table, there will be a page level scroll. Opening and closing the preview pane by clicking on any row makes the page level scroll bar disappear.
Conditions:
User should be on the "View Tenant Deployments" screen and there should be many tenants configured on the system so that user can see a page level scroll bar.
Impact:
Opening and closing preview pane is causing the page level scrollbar to disappear making it impossible for a user to scroll down and see the tenants that are out of scroll view.
Workaround:
N/A
Fix:
The issue is now fixed and opening and closing preview pane no longer hides the page level scrollbar. The user can scroll down to see the tenants that are hidden in scroll view.
1379625-3 : Changing the max-age attribute in password policy is not reflecting immediately
Links to More Info: BT1379625
Component: F5OS-C
Symptoms:
Even after setting the max-age value (maximum age, in days, after which the password expires) to less than 7 days, the password expiration warning is not displayed at the next login.
Conditions:
Set max-age attribute to less than 7 (days) and check if password expiration warning is prompted at the time of next login.
Impact:
Password expiration feature is not working as expected.
Workaround:
N/A
Fix:
The max-age value updated from the ConfD CLI is now synced with the user's password expiration attribute in /etc/shadow on the system.
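Whether the synced value has reached /etc/shadow can be checked by reading the standard shadow fields, where the third field is the last password change (in days since 1970-01-01) and the fifth is the maximum age in days. The sketch below works on a single shadow-format line passed in as a string; the function names and the sample entry are hypothetical.

```python
from datetime import date


def days_since_epoch(day):
    """Convert a date to the day count used in shadow entries."""
    return (day - date(1970, 1, 1)).days


def days_until_expiry(shadow_line, today_days):
    """Return days left before the password expires, or None if not set."""
    fields = shadow_line.rstrip("\n").split(":")
    if len(fields) < 5:
        return None
    last_change, max_age = fields[2], fields[4]
    if not last_change or not max_age:
        return None
    return int(last_change) + int(max_age) - today_days


if __name__ == "__main__":
    # Made-up entry: last change on day 19700, max-age of 5 days.
    line = "admin:*:19700:0:5:7:::"
    print(days_until_expiry(line, 19702))
```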
1379565-2 : Observing QKView start from 100% and then going back to 1%
Links to More Info: BT1379565
Component: F5OS-C
Symptoms:
On a second execution of QKView, it is possible that the percent complete reported by the system diagnostics QKView status command will remain at the previous setting until the QKView collection set-up has been completed. This has no effect on the QKView collection, but it can be confusing.
Conditions:
QKView is executed two or more times.
Impact:
Confusing percent-complete number for a few moments.
Workaround:
Wait for a few moments until QKView capture set-up has finished (up to 30 seconds).
1378805-2 : Error occurs when changing LAG type for an existing LAG interface on webUI
Links to More Info: BT1378805
Component: F5OS-C
Symptoms:
On the webUI, if a LAG type changes from LACP, an error displays when that LAG type changes back to LACP.
Conditions:
The error occurs when attempting to change the LAG type on an existing LAG interface to a previously used type.
(i.e. Creating a LAG interface with type LACP, changing that type to Static, and then changing it back to LACP)
Impact:
This issue does not affect functionality; however, an unnecessary "Object Already Exist" error pop-up appears.
Workaround:
To avoid the pop-up, change the LAG type to LACP using the CLI in this scenario.
Fix:
Changing the LAG type on an existing LAG interface to a previously used type no longer triggers an error pop-up on the webUI.
1377945-2 : Controller Upgrade Failure Reported by ConfD★
Links to More Info: BT1377945
Component: F5OS-C
Symptoms:
During a rolling upgrade, the system controller image may display a completed status, but both controllers report running the new image.
Conditions:
Upgrading system controller images.
Impact:
The user's system functions as expected. To proceed with another upgrade, the user must execute the system image install-abort command. To clear the status and continue running the same image, the user must downgrade to the old image and then upgrade to the desired one.
Workaround:
Abort the failed upgrade using the system image install-abort command. Up/downgrade to a different version than the one currently running. After completion, upgrade to the desired version.
Fix:
After an upgrade where both controllers run the updated version, the "show system image" command will display an install-status of "success" for both controllers.
1367041 : Import of a system controller image fails on standby system controller during removal★
Links to More Info: BT1367041
Component: F5OS-C
Symptoms:
In previous releases, an import fails on the standby controller when it is performed while software removals are in progress.
Conditions:
Importing while removal is in progress
Impact:
Standby import fails.
Workaround:
On the standby system controller, run the Linux touch command on the ISO that failed to import.
For example:
touch /var/import/staging/F5OS-C-1.6.0-18695.CONTROLLER.iso
Fix:
Import will not fail in the 1.8.0 release. Import is delayed for 20 minutes if it is started while software removals are in progress. A log message like the following is displayed on VELOS:
<number of removals>:Removal of software is in progress, Import will take sometime, please wait...
1366417-1 : Long BIG-IP tenant names prevent virtual console access
Links to More Info: BT1366417
Component: F5OS-C
Symptoms:
No access to the BIG-IP tenant virtual console.
Conditions:
BIG-IP tenant name is longer than 32 characters.
Impact:
The creation of the tenant-console user fails, preventing access to the virtual console for that tenant.
Workaround:
Use tenant names that don't exceed 32 characters in length.
Fix:
The system now warns the user when a BIG-IP tenant name exceeds 32 characters in length.
1366157-2 : Warning needed about creating tenant with same name as existing user account name
Links to More Info: BT1366157
Component: F5OS-C
Symptoms:
When a tenant is created with the same name as an existing user account, the end user will not be able to log into the tenant console with that user account. A warning is not included.
Conditions:
Creating the tenant with the same name as an existing user account.
Impact:
The end user will not be able to connect to the tenant mgmt-ip with the user account.
Workaround:
Delete the tenant and redeploy it with a different name.
Fix:
A warning has been added stating that a console user will not be created if the tenant name matches the name of an existing user account.
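The two tenant-name conditions that prevent a tenant-console user from being created (this issue and ID 1366417 above) can be checked before deployment. The helper below is a hypothetical pre-check, not part of F5OS: the function name and warning text are made up, and the list of existing user names is assumed to be supplied by the caller.

```python
MAX_TENANT_NAME_LENGTH = 32  # longer names prevent console user creation


def tenant_name_warnings(tenant_name, existing_users):
    """Return warnings for a proposed tenant name (empty list if none apply)."""
    warnings = []
    if len(tenant_name) > MAX_TENANT_NAME_LENGTH:
        warnings.append(
            f"name is {len(tenant_name)} characters; "
            f"limit for console access is {MAX_TENANT_NAME_LENGTH}"
        )
    if tenant_name in existing_users:
        warnings.append("name matches an existing user account")
    return warnings


if __name__ == "__main__":
    # Example account names are illustrative.
    for warning in tenant_name_warnings("admin", ["admin", "operator"]):
        print(warning)
```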
1365985-1 : GID role mapping may not work with secondary GID
Links to More Info: BT1365985
Component: F5OS-C
Symptoms:
When a user in an external authentication system (LDAP, RADIUS, TACACS) is given a GID for an F5 role, and that GID is a secondary GID, the role assignment may not be discovered. As a result, that user may be unable to access or configure the system.
Conditions:
- User in an external authentication system (LDAP, RADIUS, TACACS)
- GID corresponding to F5 role is a secondary GID (that is, it is not the user's default GID, but rather a GID from a group to which the user belongs)
Impact:
Inability to log into the system, or inability to configure the system for the user in question.
Workaround:
The GID for the desired role should be the GID directly mapped to the user in the external authentication system (for example, in LDAP, the gidNumber on the user object should be the F5 role GID), rather than a secondary GID (for example, in LDAP, the gidNumber on a group of which the user is a member).
Fix:
All GID role mappings are properly considered when discovering role assignments for users in external authentication systems.
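The difference between checking only the default GID and checking all GIDs can be sketched as follows. This is an illustration under stated assumptions, not the F5OS implementation: the GID-to-role table and the function name are hypothetical placeholders.

```python
from typing import Dict, Iterable, Set

# Hypothetical GID-to-role table, for illustration only.
ROLE_BY_GID: Dict[int, str] = {9000: "admin", 9001: "operator"}


def discover_roles(primary_gid: int, secondary_gids: Iterable[int]) -> Set[str]:
    """Return every role whose GID matches the primary or any secondary GID."""
    # Consider the user's default GID together with all group memberships,
    # rather than the default GID alone.
    candidate_gids = {primary_gid, *secondary_gids}
    return {ROLE_BY_GID[gid] for gid in candidate_gids if gid in ROLE_BY_GID}
```

With this approach, a user whose default GID is unrelated to any role but who belongs to a group carrying a role GID is still assigned that role.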
1365977-1 : Container daemons running as PID 1 cannot be cored on-demand
Links to More Info: BT1365977
Component: F5OS-C
Symptoms:
- Sending kill -QUIT (or any other core-producing signal) to a container process running as PID 1 does not produce a core file.
- Actual runtime errors do generate cores as expected.
Conditions:
Containers that run their services directly as PID 1.
Impact:
Not possible to force a core file for diagnostic purposes.
Workaround:
None
Fix:
Containers that were running directly as PID 1 have been modified to use a minimal "init" process to catch and forward signals to the real service process.
The command:
"docker exec {containername} kill -QUIT 1"
can be used to core a daemon running as a child of /dev/init.
More complicated containers that have multiple processes running under a 'bash' script may need to use:
"docker exec {containername} kill -ABRT -1"
Note that if the "docker kill" or "docker stop" commands are used instead of "docker exec", the container will not restart, resulting in an inoperative system.
1365409-2 : CVE-2023-3341: bind: stack exhaustion in control channel code may lead to DoS
Links to More Info: K000137582
1360905-1 : Unexpected log messages in /var/log/boot.log post-integrity recovery
Links to More Info: BT1360905
Component: F5OS-C
Symptoms:
Users may observe the following inappropriate log message in /var/log/boot.log after recovering from integrity failure:
Sep 28 08:45:08 appliance-1 journal: FIPS Integrity Check: This system has been placed in an error state. Try to recover the system using /usr/libexec/ostree_recover utility or reinstall the system. On many devices pressing the escape key followed by '(' key will bring up a menu that allows the system to be restarted.
Conditions:
The integrity failure occurs when the device is in FIPS mode, and a user alters or removes a file, subsequently executing an on-demand integrity test or a boot-up integrity test.
Impact:
There are no noticeable performance issues or anomalies associated with these log messages, and the issue does not affect the overall system performance or user experience. There are no potential risks or security concerns related to the inappropriate log messages.
Workaround:
N/A
Fix:
The code has been modified to provide more user-friendly log messages.
1360285-1 : Partition is not reachable after performing multiple powercycles
Links to More Info: BT1360285
Component: F5OS-C
Symptoms:
During boot up, there is a possibility that the primary key can cause the key logic to create a new key, thus making the partition unreachable.
Conditions:
The primary key normally does not change unless there is an error reading the key.
Attempting multiple reboots where there is a possibility of interruption with the key storage access can cause the key logic to create a new key.
Impact:
Once a new key is generated, existing encrypted data can no longer be decrypted, and the partitions in particular become unreachable.
Workaround:
N/A
Fix:
The retry logic was enhanced and no longer creates a new key when resources are temporarily unavailable.
1360137-2 : Non-root users unable to download or pull qkview/pcap/core files via SCP
Links to More Info: BT1360137
Component: F5OS-C
Symptoms:
Remote auth and non-root users are unable to download qkview/pcap/core files via SCP or pull files while specifying the local path.
Conditions:
A non-root user tries to download or pull files (qkview/pcap/core) without the required permissions.
Impact:
Accounts with non-root remote access will not be able to download/pull the files using scp.
Workaround:
None
Fix:
Added support for virtual paths with permissions that allow a non-root user to download or pull files using SCP.
1359933 : System controller fails over when mgmt ports are aggregated
Links to More Info: BT1359933
Component: F5OS-C
Symptoms:
After the mgmt ports are aggregated, a failover can occur if the active mgmt link goes down.
Conditions:
-- Aggregated system controller management ports
-- The active management link goes down
Impact:
An unexpected failover occurs
Workaround:
None
Fix:
In releases with this fix, if the mgmt ports are aggregated and the active system controller link goes down, no failover occurs.
1355277-1 : Incorrect Vlan Listeners when a Static FDB is configured
Links to More Info: BT1355277
Component: F5OS-C
Symptoms:
When a Static FDB is configured on an interface, Vlan Listeners associated with that interface will have an extra Service ID configured for Service ID 1.
Conditions:
A Static FDB is configured on an interface.
Impact:
Extra broadcast traffic will be generated on the system, which could affect performance.
Workaround:
N/A
Fix:
N/A
1354697 : Stale trunk data after trunk deletion
Links to More Info: BT1354697
Component: F5OS-C
Symptoms:
- There could be mismatching actor key for interfaces in the same aggregation.
- The non-selected LACP members could be marked as LACP_UP.
Conditions:
This happens every time after trunk deletion.
Impact:
LACP members and LACP aggregation might be in an unexpected state.
Workaround:
Restart lacpd container.
Fix:
Appropriately clean up the trunk data after deletion.
1354341-1 : Changing a VLAN from trunked (tagged) to native (untagged) on a LAG in a single transaction can cause traffic outage
Links to More Info: BT1354341
Component: F5OS-C
Symptoms:
Traffic outage after changing a VLAN assigned to a LAG from Trunk to Native in a single commit.
Conditions:
Changing a VLAN assigned to a LAG from Trunk to Native in a single commit.
Impact:
Traffic outage.
Workaround:
First remove the Trunk VLAN from the LAG, then commit the change. Then add the Native VLAN to the LAG and commit the change.
1354329-3 : Unable to access tenant through console access.
Links to More Info: BT1354329
Component: F5OS-C
Symptoms:
An admin can create a user with 'tenant-console' as its primary role from the ConfD CLI. This may cause tenant console access issues if a tenant is later created with the same name as that user. The 'tenant-console' role is intended only for tenants, and new users should not be created with this role.
Conditions:
Admin has created a user with the 'tenant-console' role and then created a tenant with the same name as the 'tenant-console' user.
Impact:
Console access does not work for a tenant that has the same name as a user previously created with the 'tenant-console' role.
Workaround:
None
Fix:
Added a warning to be displayed during user creation with the 'tenant-console' role from ConfD CLI.
Example:
appliance-1(config)# system aaa authentication users user test_1 config role tenant-console
appliance-1(config-user-test_1)# commit
Aborted: 'system aaa authentication users user test_1 config role': tenant-console role cant be assigned to users other than tenant users.
1353985 : Controller-manager pods fail to start with status of CrashLoopBackOff
Links to More Info: BT1353985
Component: F5OS-C
Symptoms:
When the controller-manager pods are unable to start and have a status of CrashLoopBackOff, tenants may fail to start.
oc get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default docker-registry-1-qf79w 1/1 Running 0 4d 100.77.0.44 controller-1.chassis.local <none>
default registry-console-1-dflwj 1/1 Running 0 4d 100.77.0.49 controller-1.chassis.local <none>
default router-1-cdb9h 1/1 Running 0 4d 100.76.0.43 controller-2.chassis.local <none>
default router-1-vtkv6 1/1 Running 0 4d 100.77.0.42 controller-1.chassis.local <none>
kube-service-catalog apiserver-5xz4z 1/1 Running 0 147d 100.77.0.46 controller-1.chassis.local <none>
kube-service-catalog apiserver-ltnkh 1/1 Running 6 147d 100.76.0.42 controller-2.chassis.local <none>
kube-service-catalog controller-manager-hkpz2 0/1 CrashLoopBackOff 8 18m 100.76.0.209 controller-2.chassis.local <none>
kube-service-catalog controller-manager-zw9kx 0/1 CrashLoopBackOff 3 1m 100.77.0.240 controller-1.chassis.local <none>
Conditions:
This issue is caused under the following conditions:
- VELOS chassis
- Upgrade
Impact:
Tenants fail to start.
Workaround:
Reinstall OpenShift.
1353649-1 : System controller can configure an invalid chassis network prefix
Links to More Info: BT1353649
Component: F5OS-C
Symptoms:
If a booting system controller receives invalid chassis network prefix information from its peer, it may use that information and configure an invalid network prefix for the chassis.
Conditions:
Chassis startup.
Impact:
Docker fails to start because the configured prefix info is invalid.
Workaround:
Network prefix info validation is added to ensure that the network prefix info received from the peer system controller is valid. If it is not, the receiver reboots, expecting to receive valid network prefix info on the next startup.
Fix:
Network prefix info received during startup is now validated, and docker startup failures caused by overlapping network prefixes are no longer seen.
1353429 : False indication of Always-On Management (AOM) Power-On Self-Test (POST) failure for I2C1 interface
Links to More Info: BT1353429
Component: F5OS-C
Symptoms:
The Always-On Management (AOM) may report a false indication of a Power-On Self-Test (POST) failure for the I2C1 interface.
Conditions:
- VELOS system controller.
Impact:
No functional impact. Although the AOM reports a Power-On Self-Test (POST) failure, the I2C1 interface is functional.
Workaround:
None
Fix:
Fixed in VELOS system controller AOM/LOP firmware v2.01.1282.0.1 and later.
1353161-1 : Snmpd daemon stuck in loop deleting and recreating 'system snmp communities community' entry after recreating and deleting SNMP config a few times
Links to More Info: BT1353161
Component: F5OS-C
Symptoms:
Snmpd daemon stuck in loop deleting and recreating 'system snmp communities community' entry after recreating and deleting SNMP config a few times.
Conditions:
1. Put an SNMP configuration, e.g.:
curl -sku admin:admin -H "content-type: application/yang-data+json" https://localhost/api/data/openconfig-system:system/f5-system-snmp:snmp -XPUT -d @put2.json
# jq -c . <put2.json
{"f5-system-snmp:snmp":{"targets":{"target":[{"name":"i10_2_108_100","config":{"name":"i10_2_108_100","community":"verynicecommunity","security-model":"v2c","ipv4":{"address":"10.2.108.100","port":162}}},{"name":"i10_2_108_101","config":{"name":"i10_2_108_101","community":"verynicecommunity","security-model":"v2c","ipv4":{"address":"10.2.108.101","port":162}}}]},"communities":{"community":[{"name":"verynicecommunity","config":{"name":"verynicecommunity","security-model":["v2c"]}}]},"engine-id":{"config":{"value":"mac"}}}}
#
2. Wait 10 seconds or so
3. Delete/clear the SNMP config, using one of the two methods:
a. curl -sku admin:admin -H "accept: application/yang-data+json" https://localhost/api/data/openconfig-system:system/f5-system-snmp:snmp -XDELETE
b. from the confd CLI in config mode:
no system snmp ; commit no-confirm
4. Wait 15 seconds, while monitoring /var/log/messages for repeating audit messages related to the SNMP config.
5. Repeat first three steps.
Impact:
High CPU and inconsistent state (SNMP community string comes and goes from 'show running-config system snmp' output while the user is watching it).
Workaround:
Restart snmpd container using docker command.
Fix:
The old SNMP configuration commands are now obsolete.
Behavior Change:
In recent F5OS releases (from F5OS-A-1.2.x and F5OS-C-1.6.x onwards), SNMP configuration commands have been simplified. For backward compatibility, the old-style SNMP configuration works until F5OS 1.7.0, with a confirmation warning in the CLI asking the user to use the new simplified SNMP commands and noting that the old-style commands will become obsolete in future releases.
In the latest release (from F5OS-A-1.8.x and F5OS-C-1.8.x), the old SNMP configuration commands are obsolete.
1353085-1 : Configure admin/operator roles in LDAP without uidNumber or gidNumber attributes
Links to More Info: BT1353085
Component: F5OS-C
Symptoms:
In previous versions of F5OS, when using LDAP for third-party authentication, uidNumber and gidNumber LDAP attribute mappings were required. These attributes are common on Unix systems and Unix-based directories, but are optional in Windows environments. In Windows environments (for example, Active Directory), an admin may be required to manually add uidNumber attributes to users, and gidNumber attributes to admin/operator groups.
Conditions:
Third-party LDAP authentication using Active Directory or other LDAP directory where uidNumber and gidNumber attributes are not provided by default.
Impact:
In the above conditions, administrators are required to add uidNumber attributes to users in the directory, and gidNumber attributes to admin/operator groups.
Workaround:
Create uidNumber/gidNumber attributes if they are not present in the directory.
Fix:
A feature was added to map LDAP groups to F5OS roles using LDAP filter (group names) instead of numeric IDs. Additionally, code was added to use objectSid mapping instead of uidNumber/gidNumber to eliminate the need to create missing attributes in Active Directory environments.
1353001-1 : tcpdump service improvements
Links to More Info: K000139502, BT1353001
1352845-3 : Some internal log content may not appear in external log server
Links to More Info: BT1352845
Component: F5OS-C
Symptoms:
When a remote log server is configured, some internal log content may not appear in the logs on the remote server. Notable are logs related to audit login failures.
Conditions:
Remote logging server is configured. Log messages do not appear on remote server for user trying to log in with wrong password repeatedly, causing account lockout.
Impact:
Brute-force password attack indications may not be seen on external log server.
Workaround:
For logs of this type, consult the log files directly on the appliance.
1352449-3 : iHealth upload is failing with error "certificate signed by unknown authority"
Links to More Info: BT1352449
Component: F5OS-C
Symptoms:
When attempting to use the QKView upload feature, the upload may fail with the message "certificate signed by unknown authority". This is due to a recent change in certificate authority that is inconsistent between F5OS and iHealth.
Conditions:
Always, after mid-September 2023.
Impact:
Unable to upload QKView files to iHealth with a single click.
Workaround:
Users may use the File Export feature to download QKView files to their PCs, and then upload those files to iHealth.
You can find the qkview files in the GUI at System Settings :: File Utilities, then choose "diags/shared" as the base directory, then select "qkview".
Fix:
Certificate authorities used by the iHealth upload feature in F5OS will be updated.
1352353 : Remove integrity-check configurable option from CLI
Links to More Info: BT1352353
Component: F5OS-C
Symptoms:
In F5OS systems, root and admin users are allowed to toggle the integrity-check option from the CLI. When in FIPS mode, integrity-check should always execute on system startup and when demanded. Since the integrity-check option is configurable, users can disable it which puts the integrity of the system at risk.
Conditions:
The configurable integrity-check option is visible when the device is in FIPS mode.
Impact:
An admin or root user could access the CLI and disable integrity-check. With the check disabled, files and packages could be replaced, which could impact the integrity of the system.
Workaround:
N/A
Fix:
We have removed the enable/disable integrity-check option from the CLI.
1351893-3 : ConfD Logging 'Failed to change working directory' Error Message
Links to More Info: BT1351893
Component: F5OS-C
Symptoms:
When running the tcpdump client from the ConfD command line interface, ConfD logs 'failed to change working directory /var/roothome' error message in the devel.log file.
Conditions:
Running tcpdump client from the ConfD CLI.
Impact:
No known impact.
Workaround:
No work around.
Fix:
When ConfD executes external commands, the working directory is set to the user's home directory by default. ConfD logs an error if it is unable to find the user's home directory.
1351541-1 : Unable to remove the ISO images that share the same minor version with the running version
Links to More Info: BT1351541
Component: F5OS-C
Symptoms:
Removal of an ISO (controller/partition/appliance) fails when it shares the same minor version as the running version.
Example: Version 1.5.1 is imported and the system is upgraded to 1.6.1. Later, 1.6.2 (1.6.*) is imported and the system is upgraded to 1.6.2. While the system is running 1.6.2, 1.6.1 cannot be deleted.
Conditions:
The major and minor version of the current ISO are the same as those of the ISO that is being removed/deleted.
Impact:
Unable to remove the unused ISO.
Workaround:
For controller/appliance, you must remove the ISO while running a software version with a different minor release. For example, you can remove 1.6.1-5555 while running ISO version 1.5.X or 1.7.X.
For partition, disable and unset the ISO versions of any partitions that use the same minor version of the ISO that needs to be removed. For example, you can remove 1.6.1-5555 by disabling all the partitions running on 1.6.X and de-configure the SW versions.
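The condition for this issue is a match on major and minor version between the running release and the ISO being removed. The following sketch shows that comparison; the helper names are hypothetical, and the version string format is assumed from the examples in this entry (for example, 1.6.1-5555).

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.6.1-5555' or '1.6.2'."""
    # Drop the build suffix after '-', then keep the first two dotted fields.
    release = version.split("-", 1)[0]
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


def removal_blocked(running: str, candidate: str) -> bool:
    """True when the candidate ISO shares major.minor with the running version."""
    return major_minor(running) == major_minor(candidate)
```

For example, removing 1.6.1-5555 is blocked while running 1.6.2, but not while running 1.5.X or 1.7.X.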
1349977-2 : Setup wizards fails and immediately exits if it is given incorrect credentials.
Links to More Info: BT1349977
Component: F5OS-C
Symptoms:
If incorrect credentials are entered while using the setup wizard tool, it fails and exits immediately without allowing the user to correct the given credentials.
The setup wizard utility should make it clear that only non-root admin accounts can be used.
Conditions:
Incorrect credentials are passed to the setup wizard tool.
Impact:
User is not given the chance to correct incorrect credentials.
1349953-2 : Setup wizard script gives an "All IP addresses must be unique" error when NTP and DNS servers match
Links to More Info: BT1349953
Component: F5OS-C
Symptoms:
When the given IP addresses of NTP and DNS servers match, the setup wizard script gives the error, "All IP addresses must be unique" even though it is a valid configuration.
Conditions:
The IP addresses of NTP and DNS servers given to the Setup wizard tool are the same.
Impact:
Through the setup wizard tool, the user is not able to provide the same IP address for NTP and DNS servers, which is a valid configuration.
Workaround:
The same IP address for NTP and DNS servers can be configured using the webUI or CLI instead of the setup wizard tool.
1349465 : Partition s/w upgrade compatibility check doesn't use correct target version
Links to More Info: BT1349465
Component: F5OS-C
Symptoms:
When performing the partition database compatibility upgrade check (check-version/set-version), the check logic does not always use the correct target version. This potentially can cause the compatibility check to pass, but the actual database upgrade can fail and automatically roll back.
Conditions:
When the target partition version is a patch release (such as 1.5.1, 1.6.1), the compatibility check will use the wrong (base release) version.
Impact:
The check-version/set-version database compatibility check might pass even though the actual upgrade would fail.
Workaround:
Upgrade the controller s/w to version F5OS-C 1.6.1 or later prior to attempting upgrade to a partition patch release.
Fix:
The controller OS services now use the correct partition patch version for the compatibility check.
1349257 : Rolling software upgrade is stuck with one system controller in an "in-progress" state, and a "No such file or directory" error in sw-mgmt.debug★
Links to More Info: K000137531, BT1349257
Component: F5OS-C
Symptoms:
While performing a rolling software upgrade on VELOS system controller software, one controller completes the installation process, but the other remains stuck in an "in-progress" state, and is not reachable on its management IP.
1. One of the two system controllers is "stuck" and largely inaccessible after a rolling upgrade:
a. Cannot connect to system controller's management IP.
b. Cannot connect to the system controller as root from the active system controller (for example, "ssh controller-#"). The controller should be accessible over the "ccpeer" link.
c. Platform services are not running.
2. When you access the stuck system controller (via console or connection over the "ccpeer" link):
a. Some subset of the files in /var/docker/config/ are broken symlinks (env_var, env_var.patch, platform.yml, platform.patch.yml)
b. A log message similar to this is in /var/log/sw-mgmt.debug with the error "No such file or directory":
19-Oct-23 14:55:34 - ERROR: sw-mgmt: priority=error msgid=0x3501000000000153 msg=Unexpected error importing controller services 1.6.1-19136: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
Conditions:
Performing a rolling system update of VELOS system controller software.
Impact:
The upgrade process is stuck, and one controller remains inoperative.
Workaround:
To avoid running into this issue during an upgrade, either:
1) perform an out-of-service upgrade, rather than a rolling upgrade. Refer to https://techdocs.f5.com/en-us/velos-1-5-0/velos-systems-installation-upgrade/title-install-upgrade-software.html for more information.
2) add a systemd drop-in file for the sw-mgmt service on each controller, by logging into each controller as root and doing the following:
a. Create a systemd drop-in file for the sw-mgmt service by running the following commands:
mkdir /etc/systemd/system/sw-mgmt.service.d/
echo -e '[Unit]\nWants=docker.service\nAfter=docker.service' > /etc/systemd/system/sw-mgmt.service.d/deps.conf
cat /etc/systemd/system/sw-mgmt.service.d/deps.conf
The output of displaying the file should look like this:
[root@controller-2 ~]# cat /etc/systemd/system/sw-mgmt.service.d/deps.conf
[Unit]
Wants=docker.service
After=docker.service
[root@controller-2 ~]#
b. Activate the modified configuration:
systemctl daemon-reload
c. Verify that the service is in a functioning state, and now has an explicit dependency on docker:
systemctl status -l sw-mgmt
systemctl list-dependencies sw-mgmt | grep docker
To remove the workaround, run the following on each system controller individually:
a. Log into the system controller as root
b. Rename the systemd drop-in file to have a ".disabled" extension
mv /etc/systemd/system/sw-mgmt.service.d/deps.conf /etc/systemd/system/sw-mgmt.service.d/deps.conf.disabled
c. Reload systemd:
systemctl daemon-reload
If a system has encountered this problem:
1. Log into the active system controller via SSH as 'root'.
2. SSH to the offline system controller over the internal 'ccpeer' link.
By default, if a chassis uses RFC6598 IP addressing, the IP addresses of the system controllers on this network will be:
controller-1: 100.65.7.51
controller-2: 100.65.7.52
The IP address of the peer controller on the ccpeer link can be found by running this command:
echo "peer controller: $(ifconfig ccpeer | grep -Po '(?<=inet )([^.]+\.){3}')$(( 53 - $(grep Slot /etc/PLATFORM | cut -d':' -f2 | tr -d ' ')))"
3. Verify that both system controllers have the same set of software images present, by comparing the output of "ls /var/import/staging/*.iso" on both system controllers.
4. On the offline system controller, stop the sw-mgmt service
systemctl stop sw-mgmt
5. On the offline system controller, make a backup copy of import.json:
cp /var/import/import.json ~/import.json.bak
6. On the offline system controller, copy import.json from the working controller over the ccpeer link.
If controller-1 is working, and controller-2 is offline: scp 100.65.7.51:/var/import/import.json /var/import/import.json
If controller-1 is offline, and controller-2 is working: scp 100.65.7.52:/var/import/import.json /var/import/import.json
7. On the offline system controller, start the sw-mgmt service
systemctl start sw-mgmt
8. Wait about 5 or 10 minutes (you can monitor progress by tailing /var/log/sw-mgmt.debug), and then run this command to list the controller services versions that the sw-mgmt service has imported:
echo list cc_iso | nc -U /var/sw-mgmt.unix
9. If that works as expected, reboot the offline system controller.
reboot
After the system controller reboots, it should progress further in the installation process. If there are pending firmware upgrades, the system controller may reboot automatically again to complete those upgrades.
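The shell one-liner in step 2 derives the peer address by keeping the first three octets of the local ccpeer address and setting the last octet to 53 minus the local slot number. The same arithmetic is sketched below for clarity; the function name is hypothetical, and the sketch assumes the slot is 1 or 2 as on a two-controller chassis.

```python
def peer_ccpeer_address(local_address: str, slot: int) -> str:
    """Compute the peer controller's ccpeer address from the local one.

    Mirrors the arithmetic of the shell command in step 2:
    first three octets of the local address, last octet = 53 - slot.
    """
    if slot not in (1, 2):
        raise ValueError("slot must be 1 or 2")
    octets = local_address.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    prefix = ".".join(octets[:3])
    return f"{prefix}.{53 - slot}"
```

With the default RFC6598 addressing, slot 1 (100.65.7.51) yields 100.65.7.52 as the peer, and slot 2 yields 100.65.7.51.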
1348989-1 : GUI virtual server CLI has different limitations for days-valid
Links to More Info: BT1348989
Component: F5OS-C
Symptoms:
The range of acceptable values for days-valid for a certificate had inconsistent range limits between the GUI and CLI.
Conditions:
Creating a self-signed certificate.
Impact:
Possible to enter a value that cannot be reflected in both the GUI and CLI.
Workaround:
Limit the number of days-valid to the smaller of the two limits (65535).
Fix:
Both the CLI and the GUI now have the same range limits.
1348093-1 : Appliance-setup-wizard traceback on invalid NTP input
Links to More Info: BT1348093
Component: F5OS-C
Symptoms:
The appliance setup wizard throws an uncaught Python traceback if you enter non-numeric input for the NTP port:
[root@appliance-1 ~]# appliance-setup-wizard
Traceback (most recent call last):
File "/usr/bin/appliance-setup-wizard", line 1355, in <module>
curses.wrapper(main)
File "/usr/lib64/python2.7/curses/wrapper.py", line 43, in wrapper
return func(stdscr, *args, **kwds)
File "/usr/bin/appliance-setup-wizard", line 1329, in main
if scene.setting.is_valid(input_string) is not True:
File "/usr/bin/appliance-setup-wizard", line 282, in is_valid_ntp_port
int(input_string) < MIN_NTP_PORT or
ValueError: invalid literal for int() with base 10: 'abc'
Conditions:
Entering a non-numeric value for the NTP port while configuring through the setup wizard.
Impact:
Throws an uncaught Python traceback.
Workaround:
None
Fix:
Fixed in F5OS-A 1.8.0
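The traceback above comes from calling int() on non-numeric input without handling ValueError. One way to guard against this is sketched below. It is not the actual wizard code: the port bounds are assumed values, since the real MIN_NTP_PORT and upper limit are not shown in this entry.

```python
# Assumed bounds for illustration; the wizard's real limits are not shown above.
MIN_NTP_PORT = 1
MAX_NTP_PORT = 65535


def is_valid_ntp_port(input_string: str) -> bool:
    """Return True only for numeric input within the allowed port range."""
    try:
        port = int(input_string)
    except ValueError:
        # Non-numeric input such as 'abc' is rejected instead of raising.
        return False
    return MIN_NTP_PORT <= port <= MAX_NTP_PORT
```

Returning False for unparseable input lets the caller prompt the user again rather than exiting with a traceback.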
1345977-1 : VELOS interfaces flapping if an interface is disabled
Links to More Info: K000136113, BT1345977
Component: F5OS-C
Symptoms:
After disabling an interface on a VELOS blade:
-- Interfaces intermittently start flapping UP/DOWN.
-- "Optics removed" is found in fpgamgr logs for 4x10G or 4x25G port groups
-- LACP links go down randomly
Conditions:
-- VELOS system
-- Port groups configured in 4x10GB or 4x25GB mode.
-- One or more interfaces on the blade are disabled.
Impact:
Interfaces are intermittently marked DOWN and then UP. Traffic is disrupted while the interface is marked DOWN.
Workaround:
Enable all physical interfaces on the blade, even interfaces that may be unplugged or unused.
1342129-1 : Issues with liveness probe during tenant deploy/re-deploy causing incorrect identification of container health status
Links to More Info: BT1342129
Component: F5OS-C
Symptoms:
Occasional error messages may appear, indicating unhealthiness during tenant deploy and re-deploy due to liveness probe misidentification.
Conditions:
The issue may occur during tenant deploy/re-deploy.
Impact:
These error messages are false indications of an issue. If the tenant is operational, these messages can be disregarded. They should resolve themselves within two hours.
Workaround:
Ignore the false status report messages.
Fix:
The issue is resolved by implementing additional checks to ensure that the data remains up-to-date, thereby excluding stale positives from the results.
1341521-2 : Incorrect subnet mask returned for GET call for /systems
Links to More Info: BT1341521
Component: F5OS-C
Symptoms:
The GET call for /systems returns the wrong subnet mask for the management IP on VELOS and rSeries.
Conditions:
BIG-IP Next instances on VELOS and rSeries.
Impact:
Does not impact any functionality. GET API call for /systems returns the wrong subnet mask for the management IP.
Workaround:
Log in to the machine/tenant and check the management IP address by using the ip addr show command.
Fix:
N/A
1338521-1 : Unable to login when accessing F5OS GUI through a network proxy on a port other than 443.
Links to More Info: BT1338521
Component: F5OS-C
Symptoms:
Users are not able to log in to the UI when trying to access F5OS GUI through a network proxy running on a port other than 443.
Conditions:
The GUI is accessed through a network proxy running on a port other than 443.
Impact:
Users are not able to log in to the GUI.
Workaround:
None
Fix:
The GUI now reads the port along with the hostname from the URL and uses that port when making API calls (including login API calls).
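The idea of carrying the port from the page URL into API calls can be illustrated as follows. This is a sketch of the technique only, written in Python rather than the GUI's own code; the function name, the example hostnames, and the "/api" path are assumptions.

```python
from urllib.parse import urlsplit


def api_base_url(page_url: str, api_path: str = "/api") -> str:
    """Build an API base URL that keeps the host and port of the page URL."""
    parts = urlsplit(page_url)
    # netloc is 'host' or 'host:port', so a non-default proxy port is kept.
    return f"{parts.scheme}://{parts.netloc}{api_path}"
```

When the page is loaded through a proxy on port 8443, API calls built this way target port 8443 as well, instead of falling back to 443.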
1332781-1 : A remote user with the same username as the local F5OS user will be granted the local user's roles
Links to More Info: BT1332781
Component: F5OS-C
Symptoms:
If you create a remote user on the RADIUS, TACACS+, or LDAP servers with the same username as a local F5OS user, the remote user will be granted the local user's roles upon authentication.
Conditions:
A remote user is created with the same username as a local user and remote authentication is enabled.
Impact:
Remote user will take the local user's privileges.
Workaround:
Do not create a remote user with the same username as a local user. If you have already created one, change the username of either the local user or the remote user.
Fix:
If a remote user is created with the same username as a local user, the remote user's authentication will be rejected. Only the local user will have access to the F5OS system.
1330797 : Interfaces removed from LACP trunk due to traffic congestion
Links to More Info: BT1330797
Component: F5OS-C
Symptoms:
Interfaces repeatedly removed and added to a LACP LAG due to dropped LACP PDUs.
Conditions:
High traffic volume resulting in weighted-random-early-drop (WRED) being invoked.
Impact:
LACP PDUs dropped resulting in loss of LACP state.
Workaround:
Reboot affected blade.
Fix:
Modify LACP, STP and LLDP to use class-of-service 0 (highest priority) for PDUs.
1330793 : Interfaces removed from LACP trunk due to traffic congestion
Links to More Info: BT1330793
Component: F5OS-C
Symptoms:
Interfaces repeatedly removed and added to a LACP LAG due to dropped LACP PDUs.
Conditions:
High traffic volume resulting in weighted-random-early-drop (WRED) being invoked.
Impact:
LACP PDUs dropped resulting in loss of LACP state.
Workaround:
Reboot affected blade.
Fix:
Adjust traffic management settings for Class-of-Service '0' (highest priority) so it is never dropped due to weighted-random-early-drop.
1329797-1 : RADIUS user logs in through the WebUI without configuring the F5-F5OS-UID, will be disconnected after 10 minutes
Links to More Info: BT1329797
Component: F5OS-C
Symptoms:
When a RADIUS user is configured without F5-F5OS-UID and then logged in through the WebUI, they will be disconnected after 10 minutes. This problem has also been observed with other remote authentication methods where the UID and GID are configured.
Conditions:
1) Create a RADIUS user without F5-F5OS-UID configured
2) Log in as the RADIUS user through the WebUI
Impact:
If logged in as the RADIUS user through the WebUI, they will be disconnected after 10 minutes. This problem has also been observed with other remote authentication methods where the UID and GID are configured.
Workaround:
To avoid encountering this problem, the F5-F5OS-UID should be provided. Additionally, the UID for every user (which spans across all remote users as well as local users) should be unique (or have the same GID).
Fix:
UIDs for RADIUS and TACACS+ users no longer default to 1001. Remote users are assigned UIDs from the range 40,000 - 65,000.
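The uniqueness rule in the workaround above can be checked before users are provisioned. The following is a minimal sketch, not an F5OS tool: the (username, UID, GID) tuple layout, the function name, and the sample values are all hypothetical. It reports UIDs that are shared by users whose GIDs differ.

```python
from collections import defaultdict


def find_uid_conflicts(users):
    """Return {uid: [usernames]} for UIDs shared by users with different GIDs.

    `users` is a list of (username, uid, gid) tuples. This layout is a
    hypothetical one chosen for the sketch, not an F5OS data format.
    """
    by_uid = defaultdict(list)
    for name, uid, gid in users:
        by_uid[uid].append((name, gid))

    conflicts = {}
    for uid, entries in by_uid.items():
        gids = {gid for _, gid in entries}
        # A shared UID is only a problem when the GIDs are not all the same.
        if len(entries) > 1 and len(gids) > 1:
            conflicts[uid] = sorted(name for name, _ in entries)
    return conflicts


# Illustrative values only.
users = [
    ("admin", 1000, 9000),
    ("radius_ops", 1001, 9001),
    ("tacacs_admin", 1001, 9000),  # same UID as radius_ops, different GID
]
print(find_uid_conflicts(users))
```

Users that share both a UID and a GID are not reported, which matches the "or have the same GID" exception in the workaround.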
1329449 : Missing days-valid, store, and key type logging items of a certificate
Links to More Info: BT1329449
Component: F5OS-C
Symptoms:
Most of the certificate request fields are logged, but the days-valid, store, and key type fields are not. These fields were added for the creation of the certificate, whereas the logging was done as part of the certificate request.
Conditions:
Always
Impact:
The user will still see logging of all items used in the creation of a self-signed certificate, except for a few that are not necessary for the certificate request.
Workaround:
Check the history and observe the values that were entered.
Fix:
The key type and days-valid will now be logged. The store-tls field is a logical value and is not logged.
1329161-2 : In non-FIPS mode, added support for the SSH-RSA host key algorithm
Links to More Info: BT1329161
Component: F5OS-C
Symptoms:
Not able to establish an SSH connection using the SSH-RSA host key algorithm in non-FIPS mode.
Conditions:
Connect to the device from the SSH client using the SSH-RSA host key algorithm in non-FIPS mode.
Impact:
The SSH connection to the device could not be established.
Workaround:
None
Fix:
Added SSH-RSA host key algorithm support in non-FIPS mode.
1327689-1 : Manually remove root and user keys before entering Appliance Mode
Links to More Info: K000140574, BT1327689
1326125-1 : RADIUS authentication fails if F5-F5OS-HOMEDIR attribute is not specified
Links to More Info: BT1326125
Component: F5OS-C
Symptoms:
Authenticating F5OS users against an external RADIUS server fails if the server does not specify an F5-F5OS-HOMEDIR attribute.
The F5-F5OS-HOMEDIR attribute is supposed to be optional.
Conditions:
F5OS system authenticating against a RADIUS server
Impact:
F5OS authentication fails even if the server sends back the required F5-F5OS-GID attribute.
Workaround:
Configure the RADIUS server to include an F5-F5OS-HOMEDIR attribute with a value of "/tmp"
1325893-5 : A vqfdm system software core file is occasionally observed on system reboot
Links to More Info: BT1325893
Component: F5OS-C
Symptoms:
The line-dma-agent or vqf-dm occasionally hits a cosmetic failure state as the entire system is rebooting, leading to a core file being produced prior to shutdown. There is no problem with the state of the system.
Conditions:
Due to left over data on a communication buffer from the tcpdump daemon to the line-dma-agent, on live-upgrade reboot the line-dma-agent can segfault during its shutdown after the tcpdump daemon already has gone down.
Impact:
A core file is observed on the system after the system finishes rebooting. This will happen during an upgrade from a version that is affected by this bug. The core file can be ignored.
Workaround:
None
Fix:
The memory for the communication buffer between the line-dma-agent and the tcpdump daemon has been refactored so this is no longer a possibility during shutdown. The only way this core can be seen now is if a system is live-upgrading from a version without the fix in the line-dma-agent to a new version with this fix. Even then, the core is completely cosmetic. Once the system is updated to a version with the fix, the core will not be reproduced on subsequent upgrades/reboots.
1321429-5 : F5-PLATFORM-STATS-MIB::diskPercentageUsed not available.
Links to More Info: BT1321429
Component: F5OS-C
Symptoms:
The diskPercentageUsed OID is not available.
snmpwalks/getnext of diskUtilizationStatsTable will not return diskPercentageUsed.
snmpget of diskPercentageUsed will fail with a No Such Instance error.
snmptable of F5-PLATFORM-STATS-MIB:diskUtilizationStatsTable shows a question mark (?) for diskPercentageUsed.
Conditions:
Snmpget of diskPercentageUsed
Impact:
The disk percentage used statistic is not available via SNMP.
Workaround:
None
1319613-1 : Sluggishness in SSH access to system on VELOS system controllers
Links to More Info: BT1319613
Component: F5OS-C
Symptoms:
User experiencing delays and slowness in SSH access to system on VELOS system controllers.
Conditions:
By default, UseDNS is set to yes in the sshd configuration on the controller:
grep UseDNS /etc/ssh/sshd_config
UseDNS yes
Impact:
Slowness in SSH to access system controller.
Workaround:
Set UseDNS to no in file /etc/ssh/sshd_config.
Fix:
This is fixed in F5OS-C 1.7.0.
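The grep check in the conditions above can also be scripted. The following is a minimal sketch that reads sshd_config text and returns the UseDNS value; the function name is hypothetical. It handles only the whitespace-separated "Keyword value" form, skips comment lines, and ignores Match blocks.

```python
def effective_use_dns(sshd_config_text):
    """Return the first UseDNS value in the text, lowercased, or None if unset.

    Only the whitespace-separated form is handled. Comment lines are
    skipped and Match blocks are not interpreted.
    """
    for raw in sshd_config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        # sshd keywords are case-insensitive.
        if len(parts) == 2 and parts[0].lower() == "usedns":
            return parts[1].strip().lower()
    return None


sample = "# sample configuration\nPort 22\nUseDNS yes\n"
print(effective_use_dns(sample))
```

On a real system the text would come from /etc/ssh/sshd_config; a return value of "yes" indicates the affected setting.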
1316097 : LAGs not programmed when adding VLAN to LAG
Links to More Info: BT1316097
Component: F5OS-C
Symptoms:
Traffic from a LAG is not reaching the tenant.
Conditions:
1) Add a VLAN to a LAG and add that VLAN to a tenant in the same commit.
2) Configuration read following blade reboot.
Impact:
LAGs are not programmed; traffic doesn't reach tenant.
Workaround:
Workaround for condition (1): Add the VLAN to the LAG, commit; then add the VLAN to the tenant.
Fix:
Fixed the usage of mutexes to prevent a deadlock when LAG programming happens in parallel with VLAN programming.
1315425 : Manual Configuration of FEC for 25G ports
Links to More Info: BT1315425
Component: F5OS-C
Symptoms:
FEC configuration was automatic prior to this change. FEC can now be configured manually.
Conditions:
When using a 25G port FEC can be configured manually.
Impact:
FEC can be configured manually for a 25G interface.
Workaround:
None
Fix:
FEC can be configured manually for a 25G interface.
1315041-1 : Partition config-restore failed after reset-default-config is performed★
Links to More Info: BT1315041
Component: F5OS-C
Symptoms:
An error occurs when running 'system database config-restore name my-backup-partition proceed yes'
Error: Database config-restore failed.
Conditions:
Attempting to restore the partition database after an upgrade using 'system database config-restore'.
Impact:
Partition's database config restore is not possible.
Workaround:
Partition config-restore won't work until all the blades have started up at least once (usually takes around 5 minutes from when the partition containers start). Use 'show system redundancy' to see when the blades have finished starting.
Fix:
The system-generated configuration is no longer deleted during database reset-default-config.
1314593 : The snmp table F5-PLATFORM-STATS-MIB::platformMemoryStatsTable is not available on a partition.
Links to More Info: BT1314593
Component: F5OS-C
Symptoms:
Snmpwalk for F5-PLATFORM-STATS-MIB::platformMemoryStatsTable is failing on partition.
Conditions:
Snmpwalk for platformMemoryStatsTable is executed on partition.
Impact:
platformMemoryStatsTable data is not available through SNMP on a partition.
Fix:
The code has been modified to support snmpwalk for platformMemoryStatsTable on the partition.
1314453-5 : Datapath is broken when LAG type is changed from LACP to Static on r2000/r4000 platforms
Links to More Info: BT1314453
Component: F5OS-C
Symptoms:
On r2000 and r4000 platforms, we can create a LAG as type LACP with a BIG-IP tenant. Later, when the datapath is up and running, if we change the LAG type to Static, the datapath on the tenant is broken. The platform sends the state of the members of the LAG as DOWN and hence LAG is DOWN on the BIG-IP tenant.
Conditions:
When LAG type is changed from LACP to Static.
Impact:
Datapath is completely broken while using the LAG configured.
Workaround:
Bring the DOWN members of the LAG back UP by disabling and then re-enabling each interface:
1. interfaces interface <ifc name> config admin disable
This moves the interface to the DOWN state.
2. interfaces interface <ifc name> config admin enable
This moves the interface back to the enabled state.
Fix:
Datapath no longer breaks when changing the LAG type from LACP.
1307577-1 : Add more resilience to the file download API
Links to More Info: BT1307577
Component: F5OS-C
Symptoms:
If basic authentication is used in place of the x-auth-token, the system blocks the requests and they eventually stall in the request queue.
Conditions:
Use of basic authentication instead of the x-auth-token causes this situation in file download.
Impact:
No new download requests can be made.
Workaround:
Restart the platform-services.
Fix:
N/A
1307565-1 : The file download API is not working with the x-auth-token header
Links to More Info: BT1307565
Component: F5OS-C
Symptoms:
The x-auth-token in the header of the request is not working for file download.
Conditions:
Try to download a file using the file download API with the x-auth-token header.
Impact:
The file download fails when using the file download API with the x-auth-token header.
Workaround:
Pass x-auth-token as part of the form-data of the API instead of in the header.
Fix:
N/A
1305005-1 : Error handling in F5OS file-download API
Links to More Info: BT1305005
Component: F5OS-C
Symptoms:
Upon file download failure, API is returning an Apache error page that isn't an F5OS-specific error and isn't aligned with other F5OS API errors. This is a negative user experience.
Conditions:
Due to unhandled errors, when data not in the FormData format are passed through a Curl request, an Apache error page is thrown, misaligning from other F5OS APIs errors.
Impact:
There is no functional impact. It is a negative user experience.
Workaround:
N/A
Fix:
All errors are handled in the file-download API and aligned with other F5OS APIs errors with no more Apache error pages in error cases.
1304921-1 : F5OS file download API does not work with basic authentication
Links to More Info: BT1304921
Component: F5OS-C
Symptoms:
File upload and download using basic auth is not supported.
Conditions:
When trying to upload or download the file from F5OS using basic auth.
Impact:
Upload/download failed with authentication error.
Workaround:
None
Fix:
The file download API now works with both basic auth and x-auth-token.
1304765-4 : A remote LDAP user with an admin role is unable to make config changes through the F5 webUI
Links to More Info: BT1304765
Component: F5OS-C
Symptoms:
When a remote user's GID is mapped to the F5OS system's local GID, the GID mapping is not parsed correctly by the system. If the remote GID is known to the F5 system, there is no issue. For example, a mapping of the form 9000:9000 works fine. However, mapping of the form 5555:9000, 6666:9000 etc. will not work.
Conditions:
Local GID is being mapped to a remote GID.
Impact:
The admin user mapped to a remote GID cannot access the ConfD config mode.
Fix:
Update the system to the version with the fix.
1304749-1 : Implements duplicate port check and fix logic on standby controller
Links to More Info: BT1304749
Component: F5OS-C
Symptoms:
An edge case that duplicates registry causes different controller level images to be incorrect and causes the live upgrade to hang in the standby controller in the middle of the live upgrade.
Conditions:
This condition might happen during live upgrade, where the standby was upgraded first. Since it was unable to even deploy services due to the duplicate port conflicting with the active CC services port, it never went active and was never able to fix itself.
Impact:
Live upgrade fails on the old standby controller.
Workaround:
Please contact F5 Support if this issue occurs. The workaround requires F5 Support to intervene to manually fix the file.
Fix:
Fixed the duplicate port assignment edge case on the standby CC.
1304085 : Unable to set local user's password if the same user exists on a remote LDAP server
Links to More Info: BT1304085
Component: F5OS-C
Symptoms:
If a user exists locally (in F5OS) as well as on a remote LDAP server, and LDAP-based authentication is configured as an accepted authentication method, attempting to set the user's local password in F5OS will fail. In the ConfD CLI, an error like the following will be observed:
syscon-1-active(config)# system aaa authentication users user ldap_user config set-password
Value for 'password' (<string>): ****************
Error: Rejected,
Configured password-policy:
min-length:6
required-differences:8
max-letter-repeat:3
policy applies to root:true
It should be emphasized that in the case of such duplicate user definitions locally/remotely, the local user's credentials will need to be used to login even if remote authentication is preferred.
Conditions:
A user exists locally (in F5OS) as well as on a remote LDAP server, and LDAP-based authentication is configured as an accepted authentication method.
Impact:
Unable to set the local user's password.
Workaround:
Temporarily remove LDAP as an authentication method, set the user's password, and then re-configure the preferred authentication method(s).
Fix:
Fixed an issue with setting a local user's password when an identically named user exists on a remote LDAP server and LDAP is enabled as an authentication method.
1300749-1 : Syslog target files do not use the hostname configured via system user interface.
Links to More Info: BT1300749
Component: F5OS-C
Symptoms:
Syslog target files, for example: /var/F5/system/log/platform.log, use a hardcoded nodename for every device as a hostname.
Conditions:
No special conditions.
Impact:
In a remote log collector, source IPs are the only way to differentiate among devices.
Workaround:
It is possible to use an iRule workaround that replaces custom strings in syslog traffic depending on the client's IP address. This iRule is applied to the virtual server on another LTM that consumes the syslog traffic and load balances it.
when CLIENT_DATA {
    switch [IP::client_addr] {
        "10.10.10.10" { UDP::payload replace 38 11 "ABCDC01F5OS01" }
        "10.10.10.20" { UDP::payload replace 38 11 "ABCDC01F5OS02" }
    }
}
The following example shows how the iRule workaround changes a message. The message changes from this:
Jul 31 03:33:50 10.10.10.10 2023-07-31T07:33:50.181136+00:00 appliance-1 lacpd[1]: priority="Info" version=1.0 msgid=0x3401000000000046 msg="" info_str="check_if_op_modify(): new oc_if_enabled: 0 (1:Enabled 2:Disabled ... )".
to this:
Jul 31 06:00:01 10.10.10.10 2023-07-31T10:00:01.356324+00:00 ABCDC01F5OS01 lacpd[1]: priority="Info" version=1.0 msgid=0x3401000000000046 msg="" info_str="check_if_op_modify(): new oc_if_enabled: 1 (1:Enabled 2:Disabled ... )".
Jul 31 06:00:04 10.10.10.20 2023-07-31T10:00:04.983677+00:00 ABCDC01F5OS02 lacpd[1]: priority="Info" version=1.0 msgid=0x3401000000000046 msg="" info_str="check_if_op_modify(): new oc_if_enabled: 0 (1:Enabled 2:Disabled ... )".
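The effect of "UDP::payload replace 38 11" in the iRule above (replace 11 bytes starting at byte offset 38) can be sketched in Python. The raw payload below is an assumption: it uses a five-character priority prefix ("<134>") so that the 11-character hostname "appliance-1" starts at offset 38. The actual prefix sent by a device may differ, in which case the offset would need to be adjusted. The function name is hypothetical.

```python
def replace_payload(payload: bytes, offset: int, length: int, new: bytes) -> bytes:
    """Replace `length` bytes starting at `offset` with `new`."""
    return payload[:offset] + new + payload[offset + length:]


# Assumed raw payload: "<134>" (5 bytes) + timestamp (32 bytes) + space
# puts the hostname at byte offset 38.
payload = b'<134>2023-07-31T07:33:50.181136+00:00 appliance-1 lacpd[1]: priority="Info"'

rewritten = replace_payload(payload, 38, 11, b"ABCDC01F5OS01")
print(rewritten.decode())
```

The replacement string may be longer or shorter than the 11 bytes it replaces; the rest of the payload simply shifts.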
Fix:
Infrastructure to use the user-configured system hostname in the syslog target logs has been added. It is controlled by a configuration option that is enabled by default and can be turned off if the old behavior is preferred.
1298865-2 : Upgrade compatibility issue from 1.6.0-A to 1.7.0-A, 1.6.0-C to 1.8.0-C and 1.7.0-C to 1.8.0-C
Links to More Info: BT1298865
Component: F5OS-C
Symptoms:
As part of this bug fix, webUI banner text and color details are not allowed when the webUI banner is disabled. The webUI banner text and color can be configured and shown only when the webUI banner is enabled.
As a result of this fix, there is an upgrade compatibility issue from 1.6.0-A to 1.7.0-A, 1.6.0-C to 1.8.0-C, and 1.7.0-C to 1.8.0-C (or later).
If the webUI banner is enabled without values for color and text in a 1.6.0-A/C or 1.7.0-C build, an upgrade to a later version (1.7.0-A or 1.8.0-C), where the banner cannot be enabled without text, fails with a compatibility error.
Conditions:
If webUI banner is enabled without text and color details then upgrade from 1.6.0-A to 1.7.0-A, 1.6.0-C to 1.8.0-C and 1.7.0-C to 1.8.0-C will fail with compatibility error.
Impact:
We will not be able to upgrade from 1.6.0-A to 1.7.0-A, 1.6.0-C to 1.8.0-C, and 1.7.0-C to 1.8.0-C with webUI banner enabled and color and text fields empty.
Workaround:
Either disable the webUI banner or enable the webUI banner with color and text fields.
Fix:
WebUI banner text and color details are not allowed when the webUI banner is disabled. The webUI banner text and color can be configured and shown only when the webUI banner is enabled.
1297357-4 : WebUI authentication does not follow best practices in some situations
Component: F5OS-C
Symptoms:
Under certain circumstances, the WebUI interface and RestConf requests do not follow best practices when handling authentication-related requests.
Conditions:
Undisclosed.
Impact:
Undisclosed.
Workaround:
Secure access to the F5OS GUI and expose only to trusted users and networks.
Fix:
WebUI and RestConf requests now follow best practices.
1297349-3 : Tightening controls on uploading files to F5OS
Component: F5OS-C
Symptoms:
The File Upload Manager permits arbitrary file types to be uploaded by an admin user.
Conditions:
-- Uploading files
-- User role is admin
Impact:
Arbitrary file types can be uploaded.
Workaround:
Do not upload untrusted files to the F5OS system. Reduce access to the management plane to trusted users.
Fix:
Only .iso, .os, .img, and .patch files are permitted to be uploaded.
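A client-side check against the permitted file types listed in the fix can avoid a rejected upload. The following is a minimal sketch; the function name and the sample file names are hypothetical, and the comparison is case-sensitive because the source does not say how F5OS treats uppercase extensions.

```python
import os

# File types the fix text lists as permitted for upload.
ALLOWED_EXTENSIONS = {".iso", ".os", ".img", ".patch"}


def is_uploadable(filename):
    """Return True if the file name ends in one of the permitted extensions."""
    extension = os.path.splitext(filename)[1]
    return extension in ALLOWED_EXTENSIONS


# Hypothetical file names for illustration.
for name in ["controller-image.iso", "notes.txt", "archive.tar.gz"]:
    print(name, is_uploadable(name))
```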
1296997-3 : Large core files can cause system instability
Links to More Info: BT1296997
Component: F5OS-C
Symptoms:
When a system generates and stores large core files, it can make the system unstable.
Conditions:
F5OS generates a large core file.
Impact:
F5OS core-writing script does not check filesystem availability before writing a core file and can fill up the filesystem, causing catastrophic system instability until disk-space is reclaimed.
For more information about other impacts, see:
1185577 - F5OS-A memory leak in ImageAgent process on rSeries hosts may affect tenant performance or lead to unexpected restarts of tenant or host
https://cdn.f5.com/product/bugtracker/ID1185577.html
1284705 - Appliance Orchestration Manager core file may consume entire root filesystem
https://cdn.f5.com/product/bugtracker/ID1284705.html
1290949 - Invalid memory read in appliance orchestration manager
https://cdn.f5.com/product/bugtracker/ID1290949.html
1327701 - Space in SNMP community/user/target name causing snmpd container restart
https://cdn.f5.com/product/bugtracker/ID1327701.html
Workaround:
None
Fix:
F5OS now takes into account the available filesystem space before writing a core file. If the core file is too large then it will be truncated and deleted to maintain system stability. The system log message will indicate if the core file was too large to safely write.
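The kind of check described in the fix, comparing the size of a core file with the available filesystem space before writing it, can be sketched as follows. This is an illustration of the general technique only, not the F5OS core-writing script: the function names, the reserve parameter, and the 1 GiB default are assumptions.

```python
import shutil


def can_write_core(core_size, free_bytes, reserve_bytes):
    """Return True if the core fits while leaving `reserve_bytes` free."""
    return core_size + reserve_bytes <= free_bytes


def check_core_path(path, core_size, reserve_bytes=1 << 30):
    """Check the filesystem that holds `path`.

    The 1 GiB default reserve is an arbitrary value chosen for this sketch.
    """
    free = shutil.disk_usage(path).free
    return can_write_core(core_size, free, reserve_bytes)


# A 600-byte core does not fit in 1000 free bytes with a 500-byte reserve.
print(can_write_core(600, 1000, 500))
```

Keeping the comparison in a pure function makes the decision easy to test without touching a real filesystem.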
1295141 : Ability to change SNMPD listening port
Links to More Info: BT1295141
Component: F5OS-C
Symptoms:
The SNMP listening port could not be changed from the default port 161 to another port.
Conditions:
snmpwalk worked only on the default port 161.
Impact:
N/A
Workaround:
N/A
Fix:
Added below API to configure SNMP port.
Configuration:
CLI# system snmp config port <port_num>
Show:
CLI# show system snmp state port
1294561-1 : When OCSP is disabled, configurations are not accurately shown outside of 'config' mode
Links to More Info: BT1294561
Component: F5OS-C
Symptoms:
When the OCSP feature is disabled, changes to OCSP configurations (for example, nonce request or override-responder) are not reflected outside of 'config' mode on the ConfD CLI. When the OCSP feature is enabled, there is no issue.
Conditions:
Occurs when OCSP is set to 'disabled' and changes are made to the OCSP configurations. Running 'show system aaa authentication ocsp' will display incorrect information.
Impact:
No functional impact. User will not be able to see an accurate display of the OCSP configurations while the feature is disabled.
Workaround:
N/A
Fix:
Starting in F5OS 1.8.0, OCSP configurations are accurately displayed even if the feature is disabled.
1293249-1 : AAA server group Port and Type are not displayed on ConfD
Links to More Info: BT1293249
Component: F5OS-C
Symptoms:
When a server group is created on an F5OS appliance, "show system aaa server-groups" does not display the Port and Type of the server group.
Conditions:
When a AAA server group is created (LDAP/RADIUS/TACACS).
Impact:
This is a cosmetic issue.
Port and Type information is not displayed on ConfD:
appliance-1# show system aaa server-groups
NAME        TYPE  ADDRESS     PORT
-------------------------------------------
ldap-group  -     10.50.5.25  -
The Port and Type information can be viewed via Web UI.
1291513-1 : Some log messages/timestamps do not observe configured timezone
Links to More Info: BT1291513
Component: F5OS-C
Symptoms:
Some logfiles and timestamps report the time as UTC even when the system is configured with a non-UTC timezone.
Conditions:
The orchestration-manager is not aware of the configured timezone, so Openshift/Kubernetes/Ansible log files produced by this component are reported as UTC. Also, the 'user login/last login' times reported by the CLI are always in UTC.
Impact:
Difficult to correlate timestamps across log files.
Workaround:
None
Fix:
Orchestration Manager recognizes the current timezone setting, and produces all timestamps as localtime using RFC3339 format (localtime + offset). All debug logfiles produced by this component are now timezone aware.
The sshd/login programs report login/last login times as localtime, not UTC. The CLI no longer (incorrectly) reports login time.
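The "localtime + offset" form of RFC3339 mentioned in the fix can be illustrated with the Python standard library. This is a sketch of the timestamp format only, not of the Orchestration Manager code; the function name and the fixed UTC-7 offset are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def to_local_rfc3339(utc_dt, offset_hours):
    """Render a UTC datetime as local time with an explicit UTC offset."""
    local_tz = timezone(timedelta(hours=offset_hours))
    return utc_dt.astimezone(local_tz).isoformat()


utc_stamp = datetime(2023, 7, 31, 7, 33, 50, tzinfo=timezone.utc)
print(to_local_rfc3339(utc_stamp, -7))  # 2023-07-31T00:33:50-07:00
```

Because the offset is part of the timestamp, entries written under different timezone settings can still be ordered and correlated.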
1289861-1 : Ability to suppress the proceed warning generated when portgroup mode is changed
Links to More Info: BT1289861
Component: F5OS-C
Symptoms:
When the user commits portgroup mode changes, the system generates a proceed warning to inform the user of the potential consequences.
Conditions:
When committing portgroup mode changes.
Impact:
While the proceed warning is present, the user needs to input “yes” or “no” before the transaction is committed.
Workaround:
None
Fix:
You can now suppress the proceed warning for the entire system. The setting is called portgroup-confirmation-warning and can be disabled in ConfD with the following command:
system settings config portgroup-confirmation-warning off
1288765-1 : Provide ability to manage services through systemd/docker commands from F5OS CLI
Links to More Info: BT1288765
Component: F5OS-C
Symptoms:
You are unable to start/stop/check service status for systemd units or docker containers.
Conditions:
-- Confd CLI
-- You would like to check status of specific containers
Impact:
You are unable to check service status for specific containers.
Workaround:
None
Fix:
You can now start/stop/check service status for systemd services or docker containers:
system diagnostics os-utils docker [start|stop|restart] node platform service <name>
1287245 : DAGD component crashes during live upgrade or downgrade
Links to More Info: BT1287245
Component: F5OS-C
Symptoms:
The DAGD component crashes occasionally during live upgrade or downgrade. However, these incidents won't affect the overall system, and the DAGD component will restart automatically without requiring any user action.
Conditions:
The DAGD component crashes occur rarely during live upgrade or downgrade.
Impact:
There is no impact on the overall health of the system.
Workaround:
N/A
Fix:
N/A
1286153-1 : Error logs while generating the qkview
Links to More Info: BT1286153
Component: F5OS-C
Symptoms:
The system logs the following errors in platform.log while capturing a qkview:
---
2023-04-09T13:21:23.774606+00:00 appliance-1 tcam-manager[78]: priority="Err" version=1.0 msgid=0x6b01000000000007 msg="ERROR" MSG="handle_dbg_cmd_snapshot: bad tcam id 2".
2023-04-09T13:21:32.905003+00:00 appliance-1 tcam-manager[78]: priority="Err" version=1.0 msgid=0x6b01000000000007 msg="ERROR" MSG="handle_dbg_cmd_snapget: bad row id 512".
---
Conditions:
Generating a qkview
Impact:
The errors are false alarms and have no functional impact.
1285997-7 : LLDP is allowed to configure on interfaces when virtual wire is enabled
Component: F5OS-C
Symptoms:
LLDP can be configured on interfaces even though virtual wire is enabled.
Conditions:
1) Enable virtual wire on interface.
2) Attach interfaces to a LAG.
3) Enable LLDP on the interfaces.
Impact:
When virtual wire is enabled, BIG-IP will function in transparent mode and is not expected to see interfaces on either side.
With this issue, F5 interfaces will be visible when LLDP is enabled.
Workaround:
Do not configure LLDP on the interfaces when virtual wire is enabled.
Fix:
N/A
1285669-6 : CVE-2022-21216 - Intel BIOS vulnerabilities on r2000/r4000 and r5000/r10000/r12000
Links to More Info: K000133432
1282185 : Unable to restore backup file containing expired TLS certificate
Links to More Info: BT1282185
Component: F5OS-C
Symptoms:
If a user attempts to restore a configuration backup whose contents include a TLS certificate that has expired, the configuration restore will fail.
Conditions:
User attempts to restore a configuration backup file which contains an expired TLS certificate.
Impact:
User is unable to restore their backed up configuration.
Workaround:
There is no workaround once the backup has been collected. The issue can be avoided by de-configuring any TLS certificates before collecting a configuration backup, and re-setting them manually after the configuration backup has been restored.
Fix:
Fixed issue where configuration backup files containing expired TLS certificates could not be successfully used for configuration restore.
1277429 : Operational and Configurational prompts do not persist through user sessions
Links to More Info: BT1277429
Component: F5OS-C
Symptoms:
prompt1 (Operational) and prompt2 (Configurational) do not persist over user sessions and logins once configured.
Conditions:
Configure both prompts, exit from session and re-login. It can be observed that the configured prompts are reset to default.
Impact:
Hard to identify the terminal session without configured prompts when working with multiple terminal sessions with new logins.
Workaround:
None
Fix:
Operational (oper-prompt) and Configurational (config-prompt) prompts can be configured which persist over sessions and logins.
1272469 : FPGA update status in ConfD may show error even though it was successful
Links to More Info: BT1272469
Component: F5OS-C
Symptoms:
The ConfD CLI "show components component blade-N" could show that the FPGA update generated an error, even though the FPGA loaded successfully.
Conditions:
In very rare cases, when blades are removed and then added back into a chassis, the status may fail to update correctly.
Impact:
The error message does not impact the operation of the product.
Workaround:
There is currently no way to remove the error message from the ConfD logs unless the chassis is power-cycled.
Fix:
N/A
1271417 : VELOS system controller fails to PXE boot when network-range-type is RFC1918
Links to More Info: BT1271417
Component: F5OS-C
Symptoms:
If the network-range-type is set to RFC1918, a VELOS system controller will fail to PXE boot from its peer system controller, reporting an error message "Unable to locate configuration file".
Conditions:
- The Internal Chassis Networking range (network-range-type) is set to RFC1918 (the default is RFC6598).
Impact:
Unable to PXE boot the system controller.
Workaround:
Log in to the peer controller (the one NOT being PXE booted) as root, and navigate to the /var/images/pxelinux.cfg directory. In the directory, locate the file whose name is six hex characters ("0a", followed by two hexadecimal digits, followed by "07"), and rename the file to uppercase.
For example:
[root@controller-1(VELOS):Active ~]# cd /var/images/pxelinux.cfg/
[root@controller-1(VELOS):Active pxelinux.cfg]# ls -l 0a*
-rwxr--r--. 2 root root 352 Oct 7 15:14 0ae107
[root@controller-1(VELOS):Active pxelinux.cfg]# mv 0ae107 0AE107
[root@controller-1(VELOS):Active pxelinux.cfg]#
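The rename in the example above can be planned programmatically. The following is a minimal sketch that selects file names matching the pattern in the workaround ("0a", two hexadecimal digits, "07", all lowercase) and maps each one to its uppercase form. The function name is hypothetical, and the sketch only computes the mapping; it does not rename anything.

```python
import re

# "0a" + two hexadecimal digits + "07", lowercase, per the workaround text.
PXE_NAME = re.compile(r"0a[0-9a-f]{2}07")


def rename_plan(filenames):
    """Map each matching lowercase file name to its uppercase form."""
    return {name: name.upper() for name in filenames if PXE_NAME.fullmatch(name)}


print(rename_plan(["0ae107", "default", "0AE207"]))  # {'0ae107': '0AE107'}
```

Names that are already uppercase, or that do not match the pattern, are left out of the plan.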
1268433-1 : Some firewall rules do not generate denial logs
Links to More Info: BT1268433
Component: F5OS-C
Symptoms:
Logging for system_latest_vers network namespaces is disabled by default to prevent host kernel log flooding from inside a container.
Conditions:
By default, all network namespace logs are disabled except for init namespace.
Impact:
When traffic is denied from an IP, we do not get a message saying traffic from a particular IP is denied.
Workaround:
Command to enable system_latest_vers network namespace denial logs:
sysctl -w net.netfilter.nf_log_all_netns=1 (not-persistent)
Persistent solution:
1) Create a file: /etc/sysctl.conf
2) Run the command:
echo "net.netfilter.nf_log_all_netns = 1" >> /etc/sysctl.conf
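Running the echo command above more than once appends a duplicate line each time. The following is a minimal sketch of an idempotent alternative that works on the text of the file and appends the setting only when it is not already present with the requested value. The function name is hypothetical, and the sketch returns the new text rather than writing /etc/sysctl.conf.

```python
def ensure_setting(conf_text, key="net.netfilter.nf_log_all_netns", value="1"):
    """Return conf_text with 'key = value' appended unless already present."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip comments and lines that are not key/value assignments.
        if stripped.startswith(("#", ";")) or "=" not in stripped:
            continue
        found_key, found_value = (part.strip() for part in stripped.split("=", 1))
        if found_key == key and found_value == value:
            return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + f"{key} = {value}\n"


updated = ensure_setting("")
print(updated, end="")
```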
1251957-1 : SNMP OIDs to monitor serial number of the device, type of hardware and hostname
Component: F5OS-C
Symptoms:
Device serial number, type, and hostname are not available for the SNMP interface.
Conditions:
Install the F5OS-A/F5OS-C version and run SnmpWalk.
You cannot find the device’s serial number, type, and hostname.
Impact:
You are not able to poll for device serial number, type, and hostname through the SNMP interface.
Workaround:
None
Fix:
Added support for device serial number, type, and hostname for SNMP interfaces.
1251161-3 : Authentication fails via the webUI when “:” is at the end or beginning of the password
Links to More Info: BT1251161
Component: F5OS-C
Symptoms:
After modifying the user's password to include ":" either at the beginning or the end of the password, the user is not able to log in via the webUI.
The user is able to log in via the CLI (SSH).
Conditions:
The password includes ":" at the beginning or end of the password string.
Impact:
User not able to log in via the webUI.
Workaround:
Do not use ":" at the beginning or end of the password string.
Since it is possible to log in via the CLI, modify the password accordingly.
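The condition in the workaround above is easy to test before a password is set. The following is a minimal sketch; the function name is hypothetical and it checks only the leading and trailing ":" described in this issue, not the configured password policy.

```python
def webui_login_safe(password):
    """Return False if the password starts or ends with ':'."""
    return not (password.startswith(":") or password.endswith(":"))


for candidate in [":secret", "secret:", "sec:ret"]:
    print(candidate, webui_login_safe(candidate))
```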
1233865 : Memory capacity and utilization details are confusing / misleading
Links to More Info: BT1233865
Component: F5OS-C
Symptoms:
The memory statistics do not provide a clear or accurate representation of the total memory and how it is being utilized.
Conditions:
Using ConfD to retrieve information about memory capacity and utilization.
Impact:
There are no clear, easy-to-understand statistics for memory capacity and utilization.
Workaround:
N/A
Fix:
More detailed, granular memory statistics are provided to give user a clear understanding of total memory and how it is being used.
1229465-1 : QKView is not collecting core files in /var/crash
Links to More Info: BT1229465
Component: F5OS-C
Symptoms:
QKView was designed to collect core files in /var/core only. The operating system kernel can create core files in /var/crash. SEs need to know about these files.
Conditions:
OS kernel creates a core file.
Impact:
Core file not collected by QKView.
Workaround:
Core file can be manually copied from /var/crash.
Fix:
QKView takes a directory listing from /var/crash and collects core files in that directory.
1224261-1 : Chassis internal controlplane and mgmtplane traffic outage during failover and controller reboot.
Links to More Info: BT1224261
Component: F5OS-C
Symptoms:
Mgmt and controlplane traffic can be unstable due to several issues in the system controller LACP implementation.
Conditions:
Standby system controller reboot, system controller software failover using "go-standby" command, and system controller software upgrade.
Impact:
A mgmt and controlplane traffic outage lasting between 5 and 60 seconds is anticipated between the system controllers and blades. The system impacts include users losing connection to the tenant mgmt address, errors on blade processes that communicate with controller processes, and some system database write or read operations failing.
Workaround:
N/A
Fix:
During system controller reboot, there is no mgmt or controlplane traffic outage. During System controller failover, there is typically a brief traffic outage lasting around 3 seconds.
1211233-5 : F5OS dashboard in webUI displays the system root file system usage, not the entire disk
Links to More Info: BT1211233
Component: F5OS-C
Symptoms:
The Dashboard page displays disk usage information that can be misleading.
For example, on an r5900 the following information may be shown:
Storage Capacity: 109.4GB
System Storage Free: 89.1GB
System Storage Used: 15%
However, the storage capacity is a value taken from the root (/) filesystem. It does not represent the entire 800GB disk, and does not show information about the file systems where tenant images reside.
Conditions:
View Dashboard page in webUI.
Impact:
This is a cosmetic issue.
Workaround:
Linux commands such as "df -hl -t ext4" will provide detailed information about disk usage.
Another breakdown of the disk partition use can also be seen using "lsblk /dev/nvme0n1". Note that nvme0n1 is the physical disk of interest.
Example from rSeries appliance:
# lsblk /dev/nvme0n1
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 683.5G 0 disk
|-nvme0n1p1 259:1 0 1G 0 part /boot/efi
|-nvme0n1p2 259:2 0 1G 0 part /boot
|-nvme0n1p3 259:3 0 455.3G 0 part
| `-partition_tenant-root 253:2 0 455.3G 0 lvm /var/F5/system/cbip-disks
|-nvme0n1p4 259:4 0 113.9G 0 part
| `-vdo_vol 253:3 0 227.7G 0 vdo
| `-partition_image-export_chassis 253:4 0 227.7G 0 lvm /var/export/chassis
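The lsblk listing above can also be summarized programmatically. The following is a minimal Python sketch, not an F5-provided tool: the function name parse_lsblk and the field names are illustrative assumptions, and the sample text is copied from the rSeries example above. It separates the physical disk row from the partition rows so the whole-disk size can be compared with the root filesystem figure shown on the Dashboard.

```python
import re

# Sample copied from the rSeries example above.
LSBLK_OUTPUT = """\
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 683.5G 0 disk
|-nvme0n1p1 259:1 0 1G 0 part /boot/efi
|-nvme0n1p2 259:2 0 1G 0 part /boot
|-nvme0n1p3 259:3 0 455.3G 0 part
| `-partition_tenant-root 253:2 0 455.3G 0 lvm /var/F5/system/cbip-disks
|-nvme0n1p4 259:4 0 113.9G 0 part
| `-vdo_vol 253:3 0 227.7G 0 vdo
| `-partition_image-export_chassis 253:4 0 227.7G 0 lvm /var/export/chassis
"""

# Convert lsblk size suffixes to the "G" unit used in the listing.
UNIT_TO_G = {"M": 1 / 1024, "G": 1.0, "T": 1024.0}
MAJ_MIN = re.compile(r"^\d+:\d+$")


def parse_lsblk(text):
    """Return one dict per device row (hypothetical helper for illustration)."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        tokens = line.split()
        # The MAJ:MIN column is the first token shaped like "259:0";
        # the device name is the token immediately before it.
        idx = next((i for i, t in enumerate(tokens) if MAJ_MIN.match(t)), None)
        if idx is None or idx == 0:
            continue
        size = tokens[idx + 2]
        rows.append({
            "name": tokens[idx - 1].lstrip("|`-"),  # drop tree-drawing characters
            "size_g": float(size[:-1]) * UNIT_TO_G[size[-1]],
            "type": tokens[idx + 4],
            "mountpoint": tokens[idx + 5] if len(tokens) > idx + 5 else None,
        })
    return rows


rows = parse_lsblk(LSBLK_OUTPUT)
disk = next(r for r in rows if r["type"] == "disk")
parts = [r for r in rows if r["type"] == "part"]
print(disk["name"], disk["size_g"], round(sum(p["size_g"] for p in parts), 1))
```

Matching on the MAJ:MIN column rather than on a fixed token position keeps the parser independent of how deeply a row is nested in the tree.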
Fix:
N/A
1208573-3 : Disabling Basic Authentication does not block the RESTCONF GET requests
Links to More Info: BT1208573
Component: F5OS-C
Symptoms:
When basic authentication is disabled by user, RESTCONF GET requests are not getting blocked.
Conditions:
User disables basic authentication. RESTCONF GET requests never get blocked.
Impact:
No effect on configuration. Some of the APIs data will be displayed in RESTCONF GET requests, even when basic authentication is disabled.
Workaround:
N/A
Fix:
The GET operation for the APIs has been blocked when basic authentication is disabled.
1204985-1 : The root-causes of F5OS upgrade compatibility check failures are hidden in /var/log/sw-util.log.
Links to More Info: BT1204985
Component: F5OS-C
Symptoms:
When performing a live upgrade, if the upgrade compatibility check fails, users can only see "System database upgrade compatibility check failed" error message. The applicable information about what failed is neither displayed nor shown in platform.log/velos.log.
Conditions:
1. Perform a live-upgrade.
2. If the upgrade compatibility check fails, users can only see "System database upgrade compatibility check failed" error message. The applicable information about what failed is neither displayed nor shown in platform.log/velos.log.
Impact:
Upgrade failure logs are not logged in platform.log/velos.log.
Workaround:
None
Fix:
This issue is fixed and displays the error scenarios in platform.log/velos.log.
1196813-3 : Adding or removing nodes from a running BIG-IP tenant instance can cause data plane and management IP access issues
Links to More Info: BT1196813
Component: F5OS-C
Symptoms:
If nodes are added to the tenant, then tenant management IP may bounce between nodes of a tenant instance. There may also be data plane issues where traffic will not be routed to the nodes added to an existing tenant instance. This occurs because the slot masks are not being updated in the existing tenant instances.
Conditions:
- Nodes are added or removed from a BIG-IP tenant instance on F5OS.
Impact:
Data plane traffic may be impacted, and management access to the tenant IP may be unreliable.
Workaround:
- If the node population of a tenant has already been modified, configure the tenant to provisioned and then back to deployed. This restarts all the tenant instances and makes the node masks consistent across all instances.
- If a node population change is planned, configure the tenant to provisioned, change the node population on the tenant, and then configure the tenant back to deployed.
Fix:
Dynamic updates of the node population are allowed.
1196417-2 : First time user SSH session is getting closed after password change
Links to More Info: BT1196417
Component: F5OS-C
Symptoms:
User SSH session is getting closed after password change, at the time of first SSH login.
Conditions:
When changing password at the time of first SSH login.
Following is an example:
ssh jeevan1@10.238.160.60
The authenticity of host '10.238.160.60 (10.238.160.60)' can't be established.
ECDSA key fingerprint is SHA256:RlyjC/Tx6uI7rX9zZy6q0ADKkx6GNReSyb1iohYnKio.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.238.160.60' (ECDSA) to the list of known hosts.
jeevan1@10.238.160.60's password:
You are required to change your password immediately (root enforced)
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user jeevan1.
Changing password for jeevan1.
(current) UNIX password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to 10.238.160.60 closed. <=== SSH session shouldn't be closed.
Impact:
No impact on any of the features due to this issue. The user just needs to log in again with the changed password as the current SSH session will be closed after password change.
Workaround:
N/A
Fix:
N/A
1189057-1 : LACPD fails to read system-priority at container starting time
Links to More Info: BT1189057
Component: F5OS-C
Symptoms:
Error logs occur when LACPD starts.
Conditions:
Occurs every time LACPD starts up.
Impact:
User is not able to configure system-priority and the system-priority remains with the default value.
Workaround:
N/A
Fix:
LACPD is now able to read system-priority properly. User is able to configure system-priority and see the field in the CLI.
1188825-1 : New role named "user" with read-only access to non-sensitive system level data
Component: F5OS-C
Symptoms:
To meet security requirements, you need to create a user account on F5OS that cannot access sensitive data, such as platform logs, system events, login activities, and more.
Conditions:
Create user account with roles available on the F5OS using the following CLI command:
system aaa authentication users user <user_name> config role <role_name>
Impact:
F5OS is unable to meet defined security requirements.
Workaround:
None
Fix:
A new user role named “user” is provided on F5OS to have a role with no access to the sensitive data such as platform logs, system events, and login activities and meet security requirements.
1188069-1 : F5OS installer does not indicate progress or completion state
Links to More Info: BT1188069
Component: F5OS-C
Symptoms:
The F5OS installer does not indicate the progress or completion state of an upgrade/installation.
Conditions:
Upgrade/reboot the system.
Impact:
You are unable to identify the readiness state of system.
Workaround:
None
Fix:
The upgrade, installation or initialization detail is now included in the system's bash prompt.
1186781 : "Warning: Invalid HW_TYPE_MINOR: 01." is observed in BIOS banner during the controller restart
Links to More Info: BT1186781
Component: F5OS-C
Symptoms:
A warning occurs, "Warning: Invalid HW_TYPE_MINOR: 01.", in the BIOS banner.
Conditions:
A system controller (CX16xx variant only) that is at hardware minor type of 01 with a BIOS earlier than BIOS version 2.03.171.1
Impact:
No functional impact. Only the warning in the banner.
Workaround:
Requires a BIOS update to BIOS version 2.03.171.1 or later
Fix:
The HW_TYPE_MINOR of 01 is supported in BIOS version 2.03.171.1 and later. With this BIOS version there is no warning in the BIOS banner.
1185805 : The "test media" option during USB install may be interrupted by the hardware watchdog
Links to More Info: BT1185805
Component: F5OS-C
Symptoms:
During USB booting there is an option for "Test this media & install F5OS". If this is selected then the system verifies the media for only 5 minutes before the hardware watchdog reboots the device and the verification is interrupted.
Conditions:
USB booting, "test media" option selected.
Impact:
The "test media" option does not work.
1181929-1 : F5OS install may partially fail, leaving system with mismatched OS and services★
Links to More Info: BT1181929
Component: F5OS-C
Symptoms:
After an attempted upgrade, administrators are unable to access the system via management UI, or log into the system as any user other than "root".
A message such as the following in the platform log:
priority=Fatal msgid=0x3501000000000021 msg=OStree rebase to version 1.2.0-10139 failed.
Conditions:
The first part of an F5OS software upgrade fails, but the system continues on and performs subsequent steps of the upgrade.
Impact:
The system may be completely inoperative, or the system may be running with different OS and services versions, which could lead to unknown problems.
On VELOS systems, "show system image" will report a failed install, and one of the system controllers may report a running OS version that is not aligned with the OS version and services versions, as can be seen on system controller 2 here:
syscon-1-active# show system image
SERVICE ISO INSTALL
NUMBER OS VERSION VERSION VERSION STATUS
----------------------------------------------------
1 1.8.0-18829 1.8.0-18829 - failed
2 1.6.1-19136 1.8.0-18829 - failed
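The mismatch visible in the output above can be detected from captured text. This is a minimal Python sketch under the assumption that each data row has exactly five whitespace-separated fields in the order shown (controller number, OS version, service version, ISO version, install status). The function name find_version_mismatches is hypothetical, and the sample is copied from the example above.

```python
# Sample copied from the "show system image" example above.
SHOW_SYSTEM_IMAGE = """\
SERVICE ISO INSTALL
NUMBER OS VERSION VERSION VERSION STATUS
----------------------------------------------------
1 1.8.0-18829 1.8.0-18829 - failed
2 1.6.1-19136 1.8.0-18829 - failed
"""


def find_version_mismatches(text):
    """Return rows whose OS version differs from the service version."""
    mismatched = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five fields and start with the controller
        # number; header and separator lines are skipped.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        number, os_version, service_version, _iso_version, status = fields
        if os_version != service_version:
            mismatched.append((int(number), os_version, service_version, status))
    return mismatched


for number, os_v, svc_v, status in find_version_mismatches(SHOW_SYSTEM_IMAGE):
    print(f"controller {number}: OS {os_v} != services {svc_v} (install {status})")
```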
Workaround:
If this issue occurs, contact F5 Support for assistance.
Note: This issue exists in the starting software version. It can affect upgrades to versions where this issue is fixed, i.e. upgrades to F5OS-A 1.7.0 or later or F5OS-C 1.8.0 or later.
1166313 : QKView now collects data from unassigned but active blades
Links to More Info: BT1166313
Component: F5OS-C
Symptoms:
If a blade in a chassis is unassigned from a partition, qkview ignores it, and will not collect any data from it.
Conditions:
-- Taking a partition qkview
-- You wish to see qkview data for a blade that was recently removed from that partition
Impact:
Diagnostics data is not collected from unassigned blades.
Workaround:
Run qkview-collect on an unassigned blade.
1. ssh blade-n
2. qkview-collect
3. resulting qkview data will be in qkview.tgz
Fix:
Chassis qkviews will now contain the results of qkview-collect for unassigned blades.
1162341-1 : Front panel interface status is not reported in alarms or events
Links to More Info: BT1162341
Component: F5OS-C
Symptoms:
Front panel interface flap events are not displayed in alarms or events CLI/GUI.
Conditions:
Front panel interface is down or oper-status changes.
Impact:
Interface status is not shown in alarms or events.
Workaround:
View interface with "show interfaces interface state oper-status".
1161117 : DNS warning on cluster status is ambiguous
Links to More Info: BT1161117
Component: F5OS-C
Symptoms:
If the F5OS configured DNS server is not reachable then the cluster summary status displays 'Check DNS server configuration'. This warning is not specific enough to quickly identify the problem due to it appearing as if the cluster DNS is problematic, rather than the F5OS configuration.
Conditions:
The DNS server that was configured on F5OS is not reachable.
Impact:
The DNS warning on the cluster summary status slows the process of identifying the cause of the message.
Workaround:
Configure reachable DNS servers on F5OS.
For example:
syscon-2-active# config
Entering configuration mode terminal
syscon-2-active(config)# system dns servers server 10.1.1.1
syscon-2-active(config)# show cluster
The output of "show cluster" may take a short time to update, assuming the configured DNS IP(s) is reachable.
Fix:
The "Check DNS server configuration" cluster summary message has been removed, and the event log has been changed accordingly.
1148177 : Add MAC Address to "show system mgmt ip" Command
Links to More Info: BT1148177
Component: F5OS-C
Symptoms:
Show system mgmt-ip does not output the mgmt interface MAC address.
Conditions:
Execute show system mgmt-ip
Impact:
User has to execute ifconfig mgmt-fixed/mgmt-floating to determine the mgmt interface mac address.
Workaround:
Execute ifconfig mgmt-fixed, ifconfig mgmt-floating to determine the mac address associated with the mgmt interfaces.
Fix:
In releases with this fix in place the user can now get the mac address by executing "show system mgmt-ip" in the CLI.
1147673-1 : Downloading QKViews directly from the System Reports screen.
Links to More Info: BT1147673
Component: F5OS-C
Symptoms:
The F5OS-A webUI lacks the ability to download QKView files directly from the System Reports screen. You must navigate to the File Utilities screen to perform the action.
Conditions:
Download QKView files.
Impact:
No functional impact, you need to navigate to a different webUI screen to download QKView files.
Workaround:
Navigate to the File Utilities screen to download QKView files.
Fix:
From F5OS-A v1.8.0, QKView files can be downloaded from System Reports screen.
1141573-1 : ConfD management IP configuration command DHCP shows unusable extra options which might confuse user
Links to More Info: BT1141573
Component: F5OS-C
Symptoms:
ConfD management IP configuration command DHCP shows unusable extra options like IP address, gateway, and prefix.
Users do not need to pass IP address, gateway, and prefix when configuring management IP with DHCP.
Conditions:
User is configuring management IP with DHCP and checking command argument after DHCP over CLI.
Impact:
A few extra unusable options exist after the DHCP command over CLI.
Workaround:
Do not pass any value on the arguments passed after DHCP.
Fix:
Added restrictions in the ConfD CLI command, which will not display extra options after DHCP over CLI.
1137413 : F5OS prompt parses \t incorrectly
Links to More Info: BT1137413
Component: F5OS-C
Symptoms:
F5OS prompt converts \t into a tab character instead of displaying the time.
Conditions:
Configure prompt1/prompt2 using the CLI command below:
""prompt2 "Config \d \h \t #""
prompt will convert '\t' into tab:
syscon-2-active# config
Entering configuration mode terminal
Config 2022-08-04 syscon-2-active #
Impact:
The prompt displays a tab character instead of the time.
Workaround:
Put the prompt string in single-quotes, or use "\\t".
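The following Python sketch illustrates why the workaround helps. It is a simplified model, not the ConfD implementation: it assumes that a double-quoted string first has \t turned into a tab character and \\ turned into a single backslash, and that the prompt codes \d, \h, and \t are expanded only afterwards. Both function names are hypothetical.

```python
def unescape_double_quoted(raw):
    # Hypothetical model of double-quote processing:
    # backslash + t becomes a TAB, a doubled backslash becomes one backslash,
    # and every other character is passed through unchanged.
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        nxt = raw[i + 1] if i + 1 < len(raw) else ""
        if ch == "\\" and nxt == "t":
            out.append("\t")
            i += 2
        elif ch == "\\" and nxt == "\\":
            out.append("\\")
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def expand_prompt(text, date, host, time):
    # Hypothetical prompt expansion: \d is the date, \h the host, \t the time.
    return text.replace("\\d", date).replace("\\h", host).replace("\\t", time)


values = ("2022-08-04", "syscon-2-active", "10:15:00")

# Double-quoted with \t: the time code is consumed as a tab before expansion.
broken = expand_prompt(unescape_double_quoted(r"Config \d \h \t #"), *values)

# Double-quoted with \\t: a literal \t survives and is expanded to the time.
fixed = expand_prompt(unescape_double_quoted(r"Config \d \h \\t #"), *values)

print(repr(broken))
print(repr(fixed))
```

In this model, single quotes correspond to skipping unescape_double_quoted entirely, which is why either form of the workaround preserves the time code.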
Fix:
Put the prompt string in single-quotes, or use "\\t".
1136557-4 : F5OS config restore fails if .iso or components vary between two devices.
Links to More Info: BT1136557
Component: F5OS-C
Symptoms:
If the .iso or components in the backup file do not match the ones in the restore file, the restore operation fails with admin access denied error:
Error: Database config-restore failed.
Conditions:
Take a config backup from one device and restore it on another device where the .iso or components vary.
Impact:
Configuration restore fails.
Workaround:
Ensure that .iso and components match when performing backup and restore between devices.
1135845-4 : Increased interval for boot device selector hot-key 'b' acceptance after the BIOS banner
Component: F5OS-C
Symptoms:
Users may miss the boot selector hot-key 'b' at the BIOS banner because of the short interval, with the banner displayed, before boot proceeds.
Conditions:
Reboot of the appliance.
Impact:
Reboot required to catch the interval where the hot-key is accepted.
Workaround:
Repeatedly pressing the 'b' hotkey during BIOS POST codes will not negatively affect the BIOS POST and should fall within the 3 second interval after the banner is displayed.
Fix:
'b' hot-key accept interval, after the BIOS banner is displayed, has been increased to 5 seconds.
1135021-2 : F5OS config-restore with an incorrect primary-key does not produce a warning
Links to More Info: BT1135021
Component: F5OS-C
Symptoms:
'system database config-restore' does not verify that the backup file is encrypted with the same database primary-key that is currently active on the device.
Conditions:
Restoring a config-backup on a device with a different primary-key than when the backup was produced.
Impact:
System will not operate properly because it will not be able to decode encrypted secrets that control certificates, private keys, and other items. Tenants will not operate properly.
Workaround:
Ensure that a new config-backup is created after executing the "system aaa authentication primary-key set" command.
Fix:
Config-restore fails if the database primary key does not match the config backup file, and reports the primary-key hash. Reset the primary-key to match the backup file in order to restore the backup file.
1128633 : Failed upload entries displayed under CLI file transfer-operations
Links to More Info: BT1128633
Component: F5OS-C
Symptoms:
Old, failed uploads continue to display in the file transfer-operations list for an unknown period of time both in CLI and GUI.
Conditions:
If the image upload operation fails for some unknown reason, then the failed entries are listed under both the transfer-status list and the transfer-operations list. The list under transfer-status is cleared every 24 hours, but the list under transfer-operations remains.
Impact:
- As old, failed uploads continue to display in the list for an unknown period of time, the list under transfer-operations is more cluttered.
- There is no functional impact.
Workaround:
None
Fix:
All operation entries are cleared if their transfer time exceeds 24 hrs making the file transfer-operations list clutter free.
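The clearing rule described in the fix can be sketched as follows. This is an illustrative Python model only: the entry layout (a dict with "name", "status", and "started" keys) and the function name prune_transfer_operations are assumptions, not the F5OS data model.

```python
from datetime import datetime, timedelta, timezone

# Entries older than this are dropped, per the 24-hour rule described above.
MAX_AGE = timedelta(hours=24)


def prune_transfer_operations(entries, now):
    """Keep only entries whose transfer started within the last 24 hours."""
    return [e for e in entries if now - e["started"] <= MAX_AGE]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
entries = [
    # Hypothetical entries used only to exercise the rule.
    {"name": "old-failed.iso", "status": "failed", "started": now - timedelta(hours=30)},
    {"name": "recent.iso", "status": "completed", "started": now - timedelta(hours=2)},
]
print([e["name"] for e in prune_transfer_operations(entries, now)])
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the rule easy to check against fixed timestamps.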
1126865 : F5OS HAL lock up if the LCD module is not responding.
Links to More Info: BT1126865
Component: F5OS-C
Symptoms:
There are rare cases where the LCD module is present, enabled, and its network link is up; however, it does not respond to requests made by the HAL. Ultimately this causes the HAL services to become unresponsive.
Conditions:
There are rare cases where the LCD does not respond to requests from the HAL services. When this happens, the HAL service can get locked up.
Impact:
When this rare event occurs, the HAL becomes unresponsive for other devices in the system, like the AOM for example.
Workaround:
If this occurs, a restart of the HAL services or a reset of the system is required to clear the condition.
1124809-1 : Add or improve the reporting status of imported images
Links to More Info: BT1124809
Component: F5OS-C
Symptoms:
No accurate error messages or status are shown in the log files or in the CLI when incompatible, corrupted, or zero-sized images are copied to the import directories.
It is difficult to determine the exact problem, because users have to examine the import directory and the mount status of the ISO file being copied.
Conditions:
Copying zero-length, corrupted, or incompatible ISO files, or ISO files whose names contain special characters, to the import directory /var/import/staging.
Impact:
No status is displayed in the CLI and in the log files.
Workaround:
None
Fix:
The log files will display the exact error messages. System events will show the cause of the error and SNMP traps are generated in the event of the error.
1121921-2 : Common name for setup-wizard tool across platforms
Links to More Info: BT1121921
Component: F5OS-C
Symptoms:
The setup-wizard tool command is named differently in F5OS-A and F5OS-C, which can be confusing for administrators of both systems.
Conditions:
'appliance-setup-wizard' is used to run tool in F5OS-A bash prompt whereas 'velos-setup-wizard' is used in F5OS-C.
Impact:
Increases complexity and creates confusion in running the tool on device.
Workaround:
None
Fix:
'setup-wizard' is made as a common command name to run the tool on both F5OS-A and F5OS-C
1096341-3 : During ISO import, the size was incorrectly displayed as 1
Links to More Info: BT1096341
Component: F5OS-C
Symptoms:
When the ISO file is copied to the /var/import/staging directory, during the verification phase the size of the ISO file was displayed as 1.
Conditions:
The size of the ISO file was shown as 1 during the verification phase.
Impact:
This was misleading as the file size was in terms of GBs.
Workaround:
None
Fix:
The problem has been fixed to display the ISO file size as - (hyphen) till the verification phase is completed.
1069365-1 : Error shown when configuring known-host for file transfer when FIPS mode is enabled
Links to More Info: BT1069365
Component: F5OS-C
Symptoms:
"Host unreachable" error is sometimes displayed when FIPS mode is enabled, if a user tries to configure known-host. The ssh-keyscan fails, as ssh-keyscan is not using FIPS approved ciphers.
Conditions:
- FIPS mode is enabled
- User configures known-host for file transfer
Impact:
"Host unreachable" error is thrown.
Workaround:
N/A
Fix:
Updated ssh-keyscan to use FIPS approved ciphers when FIPS mode is enabled.
1047689-5 : Sw_rbcast core file found on system
Links to More Info: BT1047689
Component: F5OS-C
Symptoms:
Partition_sw_rbcast producing core.
Conditions:
Starting a tenant which requires the sw_rbcast container running in the following platforms:
- r5x00
- r10x00
- VELOS
Impact:
The sw_rbcast process crashes and produces a core file.
Workaround:
None
Fix:
A new version of sw_rbcast correctly handles tenant broadcast packets.
1018557-1 : On system controller failover, tenant mgmt IP's may be unreachable for several minutes.
Links to More Info: BT1018557
Component: F5OS-C
Symptoms:
During a system controller failover, tenant management IP's may be unreachable for several minutes. Once the ARP entry for the tenants IP times out in the upstream router, it will be re-populated with the correct MAC after the failover and begin working again.
Conditions:
This occurs during a system controller failover due to the ARP entries not being updated for the tenants.
Impact:
The tenant management IP may be unreachable for several minutes after a system controller failover. Once the upstream ARP entry has timed out, the tenant management IP will be reachable again.
Workaround:
There is no workaround, and once the upstream ARP entry has timed out, the tenant management IP will be reachable again.
Fix:
The tenant orchestration layer now causes Gratuitous ARPs to be sent for the tenant management IPs when a system controller failover happens. This restores tenant management IP connectivity quickly after a system controller failover.
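For readers unfamiliar with Gratuitous ARP, the following Python sketch builds the raw bytes of one such frame as an ARP request in which the sender and target IP addresses are the same. It only illustrates the frame layout and is not how F5OS generates or sends these packets. The function name and the example MAC and IP addresses are placeholders.

```python
import socket
import struct


def build_gratuitous_arp(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP request (sketch)."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our source, EtherType 0x0806 (ARP).
    ethernet = b"\xff" * 6 + sender_mac + struct.pack("!H", 0x0806)

    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,              # hardware type: Ethernet
        0x0800,         # protocol type: IPv4
        6,              # hardware address length
        4,              # protocol address length
        1,              # opcode: request
        sender_mac,     # sender hardware address
        sender_ip,      # sender protocol address
        b"\x00" * 6,    # target hardware address (unset)
        sender_ip,      # target protocol address equals the sender's: gratuitous
    )
    return ethernet + arp


# Placeholder addresses for illustration only.
frame = build_gratuitous_arp("02:00:00:00:00:01", "192.0.2.10")
print(len(frame), frame.hex())
```

Because the frame is broadcast and announces the sender's own IP-to-MAC mapping, upstream devices can refresh their ARP entry immediately instead of waiting for it to time out.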
Known Issues in F5OS-C v1.8.x
F5OS-C Issues
ID Number | Severity | Links to More Info | Description |
1927557-1 | 1-Blocking | | Blades are not upgraded after partition upgraded to 1.8.1 from 1.8.0 EHF build |
1827869-1 | 1-Blocking | BT1827869 | Partition upgrade or creation fails on controller★ |
1772669 | 1-Blocking | | Displayed Qkview file size can sometimes be indicated as negative |
1691661-1 | 1-Blocking | BT1691661 | ASM performance drop of 5-10% after upgrading to 1.8.0 |
1627085-1 | 1-Blocking | | QAT devices do not de-allocate after tenant deletion |
1321593 | 1-Blocking | BT1321593 | Peer controller compatibility verification failed★ |
1962261 | 2-Critical | BT1962261 | The controller-manager pods can enter CrashLoopBackOff due to expired API server certificate★ |
1952797-1 | 2-Critical | | Partitions can leave stale tenant pods when controller configuration reset to default is issued |
1928137-1 | 2-Critical | | During partition software upgrade dagd process may crash and dump a core |
1819873-1 | 2-Critical | | Tenant may not come to Running state after quick movement of slots between partitions |
1818777-1 | 2-Critical | | When FIPS license is applied and telemetry enabled, some of the containers metrics will be missing in exporter data |
1754997-1 | 2-Critical | BT1754997 | Tenant instance may fail to come up after repeated blade reboots. |
1754769-1 | 2-Critical | BT1754769 | The third Openshift ETCD instance may not start up after a power cycle |
1712009-2 | 2-Critical | BT1712009 | Attempting to perform a configuration restore, after downgrading from v1.8.0, makes the system inoperable★ |
1694317-1 | 2-Critical | BT1694317 | Tenant config changes may not occur if multiple tenants are changed at once. |
1615105-1 | 2-Critical | BT1615105 | Observing Active-Active status in both controller bash prompts for long period of time after reboot |
1591961-2 | 2-Critical | | Observing "Failed to send restarting msg to VF" errors during reboot |
1567497 | 2-Critical | | Compatibility verification failed during downgrade from 1.8.0 to 1.6.0★ |
1566917-1 | 2-Critical | | The ha-1-deployment pod may get restarted after HA setup and system upgrades |
1550693-1 | 2-Critical | | Missing LACP ConfD events may lead to loss of connectivity to blade control plane |
1959361-1 | 3-Major | | When running a tenant with more than 72 VCPUs / cores, adminstall crashes |
1937881-1 | 3-Major | | Telemetry exporter attribute values are not showing in CLI |
1934645-1 | 3-Major | | Logging does not work properly if wrong tls is configured for remote log servers |
1926417-1 | 3-Major | BT1926417 | Traffic over a LAG not working after upgrade from v1.6.2 to v1.8.0 or v1.8.1★ |
1926413-1 | 3-Major | BT1926413 | Traffic over a LAG not working after upgrade from v1.6.2 to v1.8.0 or v1.8.1★ |
1921261-1 | 3-Major | | Duplicate lag members in show interfaces interface lag output |
1820613-1 | 3-Major | | BX520 Port LED solidly illuminated indicating link up while system software reports link down. |
1812497-3 | 3-Major | BT1812497 | Restoring a backup with an SNMP user on a system with a different SNMP Engine ID will duplicate the SNMP user |
1786385-1 | 3-Major | | Libvirt core is generated on upgrade from F5OS-C 1.6.1 to F5OS-C 1.8.1 |
1784125-1 | 3-Major | | Controller prompt stuck "Waiting for firmware status" |
1711105-1 | 3-Major | BT1711105 | The presence of a /var/docker/config/platform.override.yml file causes the upgrade to hang/fail from versions earlier than F5OS 1.8.0★ |
1692277-1 | 3-Major | BT1692277 | Tenant is unreachable after changing the management VLAN |
1682441-1 | 3-Major | BT1682441 | After simultaneous VELOS controller RMA, Openshift cluster needs to be reinstalled manually |
1671781 | 3-Major | BT1671781 | Lldp crash in when chassis goes for reboot |
1623325-3 | 3-Major | BT1623325 | VLAN groups or VLAN group members may be deleted on F5OS tenant |
1615849 | 3-Major | BT1615849 | LAG interface ifAdminStatus always shows "up" in SNMP ifTable o/p despite the same not being populated in cli as admin up/down is not configurable |
1612429-1 | 3-Major | BT1612429 | License installation is not working with HTTPS Proxy server |
1579781-1 | 3-Major | | Power supply controller firmware update can cause failover |
1552921-1 | 3-Major | BT1552921 | Password policy option reject-username set to false has no effect |
1505497-2 | 3-Major | | During remote logging server configuration, selectors help menu does not display when using Tab key. |
1497893-1 | 3-Major | BT1497893 | Unable to deport previously referenced ISO of now-disabled partition |
1497385-3 | 3-Major | BT1497385 | F5OS SNMP IF-MIB::ifAlias missing from snmpwalk |
1491209 | 3-Major | BT1491209 | Non-root, local authentication fails when LDAP is configured with chase referrals and an invalid DNS server is configured |
1471673-1 | 3-Major | | Tenants may be in a failed state after downgrade from VELOS v1.7.1 to v1.7.0 and then back to 1.7.1★ |
1381053-5 | 3-Major | BT1381053 | Cluster IP is unavailable for some time during tenant reboot |
1332293-4 | 3-Major | BT1332293 | Tcpdump performed with an interface filter on VELOS or rSeries will show broadcast traffic from all interfaces |
1273129-4 | 3-Major | | TPM status may not reporting during PXE install |
1222721-2 | 3-Major | BT1222721 | Deletion of STP configuration using "no stp" is failing |
1102869-1 | 3-Major | BT1102869 | Link stats misrepresentation on interfaces in Autonegotiate mode when link goes down |
1028389 | 3-Major | | Tenant status/error messages in the partition CLI display are misleading |
1785481-2 | 4-Minor | | When the restconf-max-session-limit is exceeded, a more specific warning message should be displayed |
1730881-1 | 4-Minor | BT1730881 | QKview may truncate non-truncatable log files |
1730793-1 | 4-Minor | BT1730793 | Config-restore fails with an error: "tenant-console role cant be assigned to users other than tenant users"★ |
1490169-1 | 4-Minor | BT1490169 | Monitor Error Event logged on controller and partition |
1322245-1 | 4-Minor | BT1322245 | After downgrading from version 1.6.0 to 1.5.1 and then upgrading back to 1.6.0 both packages are installed and install status is set to none.★ |
1112317-2 | 4-Minor | BT1112317 | Null bytes or non-ascii characters are present in velos.log |
Known Issue details for F5OS-C v1.8.x
1962261 : The controller-manager pods can enter CrashLoopBackOff due to expired API server certificate★
Links to More Info: BT1962261
Component: F5OS-C
Symptoms:
After a controller restart, controller-manager pods enter CrashLoopBackOff state, if the API server certificate has expired.
Conditions:
API server certificate is expired and a controller is rebooted.
Impact:
The controller-manager pods are currently experiencing a recurring crash loop and new blades can not be added.
Workaround:
To check if cert is expired:
oc get secret apiserver-ssl -n kube-service-catalog -o jsonpath='{.data.tls\.crt}' | base64 --decode | openssl x509 -noout -enddate
As the root user:
docker exec -it orchestration_manager bash
ansible-playbook -v -i /tmp/omd/etc_ansible_hosts playbooks/openshift-service-catalog/config.yml
This script takes about 5 minutes to run and then the pods are fixed.
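The expiry check in the workaround prints a line of the form notAfter=&lt;date&gt;. The following Python sketch, offered as an optional convenience rather than part of the documented procedure, parses such a line and compares it with the current time. The function names are hypothetical, the sample date is made up, and the parsing assumes the usual English month abbreviations and a trailing GMT as printed by openssl x509 -enddate.

```python
from datetime import datetime, timezone


def parse_not_after(line):
    """Parse a line such as 'notAfter=Mar  3 10:15:30 2026 GMT' into a UTC datetime."""
    prefix = "notAfter="
    if not line.startswith(prefix):
        raise ValueError("expected a line starting with 'notAfter='")
    # Collapse the double space openssl uses before single-digit days.
    value = " ".join(line[len(prefix):].split())
    if not value.endswith(" GMT"):
        raise ValueError("expected a timestamp ending in 'GMT'")
    # %b relies on English month names (the default C locale).
    naive = datetime.strptime(value[: -len(" GMT")], "%b %d %H:%M:%S %Y")
    return naive.replace(tzinfo=timezone.utc)


def is_expired(line, now=None):
    """Return True if the certificate end date is at or before 'now'."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_not_after(line) <= now


# Made-up sample line for illustration.
sample = "notAfter=Mar  3 10:15:30 2026 GMT"
print(parse_not_after(sample).isoformat(), is_expired(sample))
```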
1959361-1 : When running a tenant with more than 72 VCPUs / cores, adminstall crashes
Component: F5OS-C
Symptoms:
When running a tenant with more than 72 VCPUs / cores, adminstall crashes.
Conditions:
ASM is provisioned and a tenant is running with more than 72 VCPUs / cores per blade.
Impact:
DOSL7 (BADOS) is not functioning. Core created.
Workaround:
None
1952797-1 : Partitions can leave stale tenant pods when controller configuration reset to default is issued
Component: F5OS-C
Symptoms:
Partition tenants that are configured with slots greater than max-nodes can fail to come back when resetting the controller configuration and restoring it back via save configuration for controller and partition.
Conditions:
Occurs typically when
- Partition with ID 1
- Tenant uses virtual slots that do not match the physical slot.
- Controller reset-to-default is issued
Impact:
Stale tenant pods for partition 1 tenants will still show after restoring the controller configuration and partition configuration, but the impacted tenants may not come back up fully (multi-node case).
Workaround:
- Bring partition tenants down if planning to do controller configuration reset-to-default.
or
- Manually delete default partition 1 after reset-to-default, before restoring the saved controller configuration. This should remove everything associated with the namespace before the config-restore happens.
Partitions with ID different than 1 should clear the namespace automatically, since they don't get recreated as part of reset-to-default.
1937881-1 : Telemetry exporter attribute values are not showing in CLI
Component: F5OS-C
Symptoms:
Telemetry exporter attribute values are not showing in CLI.
Conditions:
Occurs when user creates new exporter and attributes together.
Impact:
Telemetry attribute values will not be displayed in the CLI.
Workaround:
User can create exporter first and add attributes. This way, the issue will not be seen.
Alternatively, exporters and attributes can be added from the GUI.
1934645-1 : Logging does not work properly if wrong tls is configured for remote log servers
Component: F5OS-C
Symptoms:
Logging is halted
qkview generation fails
Conditions:
-- F5OS configured for remote log server authentication for secure log forwarding.
-- TLS settings are missing or incorrect
Impact:
F5OS logging will be stopped.
Workaround:
Correct the incorrect or missing TLS configuration for the remote logging servers.
1928137-1 : During partition software upgrade dagd process may crash and dump a core
Component: F5OS-C
Symptoms:
During partition software upgrade, the dagd process crashes and produces a core dump.
Conditions:
Partition software version is upgraded.
Impact:
This has no impact except for the core dump.
Workaround:
Prior to a partition software upgrade, manually move your tenants to the provisioned running state.
1927557-1 : Blades are not upgraded after partition upgraded to 1.8.1 from 1.8.0 EHF build
Component: F5OS-C
Symptoms:
Blades are still running on the previous version of F5OS even after the partition upgrade.
Conditions:
Control plane network connectivity to the blades has been lost due to incomplete internal trunk / VLAN programming.
Impact:
Blades report that they are not running the current version of F5OS software, and/or control plane connectivity may be lost between the CC and one or more blades over unprogrammed VLANs.
Workaround:
Recommended recovery path is to perform a staggered reboot of both System Controllers:
1. Reboot the standby system controller.
2. After the standby is rebooted, run the SC confd configuration command system redundancy go-standby.
3. After the go-standby completes, reboot the new standby System Controller.
1926417-1 : Traffic over a LAG not working after upgrade from v1.6.2 to v1.8.0 or v1.8.1★
Links to More Info: BT1926417
Component: F5OS-C
Symptoms:
After upgrading from v1.6.2 to v1.8.0 or v1.8.1, the internal LAG programming may become faulty, resulting in incorrect configuration of interfaces within the LAG. This issue disrupts the proper flow of traffic.
Conditions:
Upgrade from v1.6.2 to v1.8.0 or v1.8.1
Impact:
Traffic over the LAG doesn’t work correctly.
Workaround:
Reboot the blades hosting the members of the LAG.
1926413-1 : Traffic over a LAG not working after upgrade from v1.6.2 to v1.8.0 or v1.8.1★
Links to More Info: BT1926413
Component: F5OS-C
Symptoms:
After upgrading from v1.6.2 to v1.8.0 or v1.8.1, the internal LAG programming may become faulty, resulting in incorrect configuration of interfaces within the LAG. This issue disrupts the proper flow of traffic.
Conditions:
Upgrade from v1.6.2 to v1.8.0 or v1.8.1
Impact:
Traffic over the LAG does not work correctly.
Workaround:
Reboot the blades hosting the members of the LAG.
1921261-1 : Duplicate lag members in show interfaces interface lag output
Component: F5OS-C
Symptoms:
The show lag output may display multiple occurrences of the same members within a lag.
Conditions:
In some instances, after an upgrade, some members show up twice.
Impact:
No functional impact is identified at this time. This is considered to be cosmetic.
Workaround:
There is no workaround identified at this time.
1827869-1 : Partition upgrade or creation fails on controller★
Links to More Info: BT1827869
Component: F5OS-C
Symptoms:
Partition upgrade or creation fails due to missing partition_image volume.
vcc-confd - /confd/scripts/f5_confd_run_cmd show partitions install
# show partitions install
INSTALL INSTALL
BLADE OS SERVICE BLADE OS SERVICE INSTALL INSTALLING
NAME ID VERSION VERSION VERSION VERSION STATUS CONTROLLER
------------------------------------------------------------------------------------------
none - - - - - - -
controller-1 1 1.8.0-26321 1.8.0-26321 1.8.0-26321 1.8.0-26321 success -
controller-2 2 1.6.2-26579 1.6.2-26579 1.8.0-26321 1.8.0-26321 in-progress 2
You may see the following messages in /var/log/sw-util.log:
/usr/libexec/sw-mgmt/sw-util.sh nodename=controller-2 resize 1 10 15 10: priority=Error msgid=0x3501000000000074 msg=Failed to create LV for partition 1.
/usr/libexec/sw-mgmt/sw-util.sh nodename=controller-2 enable_partition 2 10 15 10: priority=Error msgid=0x3501000000000074 msg=Failed to create LV for partition 2
Conditions:
The partition upgrade or creation is unsuccessful because the partition_image volume is not present. This is a rare problem that may occur due to a mismatch in metadata while converting the LVM to VDO volume.
Impact:
The partition upgrade or creation is unsuccessful.
Workaround:
Perform a clean PXE installation of the affected system controller.
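To check whether a controller has logged the failure described for ID 1827869, the log can be scanned for the "Failed to create LV" message. The following is a minimal sketch, assuming the log lines follow the layout of the two samples shown under Symptoms; the function name and the regular expression are illustrative and not part of F5OS.

```python
import re

# Pattern derived from the two sample messages in this entry.
# The exact log layout is an assumption.
LV_FAILURE = re.compile(
    r"nodename=(?P<node>\S+)\s+(?P<action>\w+)\s.*"
    r"msg=Failed to create LV for partition (?P<partition>\d+)"
)

def find_lv_failures(log_text):
    """Return (node, action, partition) for each LV creation failure line."""
    hits = []
    for line in log_text.splitlines():
        match = LV_FAILURE.search(line)
        if match:
            hits.append((match["node"], match["action"], int(match["partition"])))
    return hits

sample = (
    "/usr/libexec/sw-mgmt/sw-util.sh nodename=controller-2 resize 1 10 15 10: "
    "priority=Error msgid=0x3501000000000074 msg=Failed to create LV for partition 1.\n"
    "/usr/libexec/sw-mgmt/sw-util.sh nodename=controller-2 enable_partition 2 10 15 10: "
    "priority=Error msgid=0x3501000000000074 msg=Failed to create LV for partition 2\n"
)
print(find_lv_failures(sample))
```

In practice the text would come from reading /var/log/sw-util.log on the affected controller.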
1820613-1 : BX520 Port LED solidly illuminated indicating link up while system software reports link down.
Component: F5OS-C
Symptoms:
The BX520 Port LED may be solidly illuminated indicating link up while system software reports link down. This is typically a transient condition during initial port bringup. If it persists, it could be an indication of a problem with the fiber or at the link partner.
Conditions:
The BX520 Port LED is illuminated solid when the associated port has achieved RX alignment. The system software indicates port status UP when the BX520 port has achieved RX Alignment AND the link partner has also signaled it has achieved RX Alignment through the 802.3 Remote Fault Indicator protocol.
It is expected that there may be transient cases of this during port bringup. If it persists, it can be an indication that the BX520 was able to achieve RX alignment but the link partner was not.
Impact:
Differences in Link status as reported by HW LED and SW Status can cause confusion.
Workaround:
None
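The difference between the LED and the software status for ID 1820613 comes down to two conditions described under Conditions. The sketch below restates that rule as code; the function and parameter names are hypothetical and only model the described behavior, not the actual BX520 firmware logic.

```python
def led_is_solid(local_rx_aligned):
    """The port LED is solid once the BX520 port has achieved RX alignment."""
    return local_rx_aligned

def software_link_up(local_rx_aligned, partner_rx_aligned):
    """Software reports UP only when both ends have achieved RX alignment.

    The partner's state is learned through the 802.3 remote fault indication.
    """
    return local_rx_aligned and partner_rx_aligned

def led_and_software_disagree(local_rx_aligned, partner_rx_aligned):
    """True in the case this entry describes: LED solid, software link down."""
    return led_is_solid(local_rx_aligned) and not software_link_up(
        local_rx_aligned, partner_rx_aligned
    )

print(led_and_software_disagree(True, False))
print(led_and_software_disagree(True, True))
```

The only combination that produces the mismatch is local alignment without partner alignment, which is why a persistent mismatch points at the fiber or the link partner.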
1819873-1 : Tenant may not come to Running state after quick movement of slots between partitions
Component: F5OS-C
Symptoms:
After quick movement of a slot between different partitions, it is possible that tenants on that slot will not come back to the Running state.
Conditions:
This situation can occur if nodes are moved from current partition to another partition and then back to original partition.
Impact:
Tenant may not come to Running state.
Workaround:
Toggle the running-state of the tenant from deployed to configured and then back to deployed.
1818777-1 : When FIPS license is applied and telemetry enabled, some of the containers metrics will be missing in exporter data
Component: F5OS-C
Symptoms:
When a FIPS license is applied and telemetry is enabled, some container metrics are intermittently not transmitted to the exporter.
Conditions:
Happens only when FIPS license is applied and telemetry is enabled and only when instrument type "all" or "container" is selected in F5OS.
Impact:
Some container metrics are intermittently not transmitted to the exporter.
Workaround:
None
1812497-3 : Restoring a backup with an SNMP user on a system with a different SNMP Engine ID will duplicate the SNMP user
Links to More Info: BT1812497
Component: F5OS-C
Symptoms:
If you restore a backup containing an SNMP user, but the SNMP user’s SNMP Engine ID does not match the current system, a new SNMP user will be created with the same name and the current system’s SNMP Engine ID. However, this is only seen when the database is later backed up.
Conditions:
-- Restoring a database backup that contains an SNMP user.
-- Doing the restore on a system with a different SNMP Engine ID.
Impact:
Two SNMP users with the same name (but different SNMP Engine IDs) are saved to subsequent backups. SNMP will not work.
Workaround:
Reconfigure the SNMP user authentication and privacy passwords after restoring the backup. SNMP will work after configuring passwords.
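A quick way to reason about the state described for ID 1812497 is to look for user names that appear with more than one SNMP Engine ID. The sketch below assumes the users have already been extracted into (name, engine ID) pairs; how that extraction is done, and the sample values, are hypothetical.

```python
from collections import defaultdict

def duplicate_snmp_users(users):
    """Return {name: [engine ids]} for names seen with more than one Engine ID.

    `users` is an iterable of (user_name, engine_id) pairs. Producing that
    list from a backup is outside the scope of this sketch.
    """
    engines_by_name = defaultdict(set)
    for name, engine_id in users:
        engines_by_name[name].add(engine_id)
    return {
        name: sorted(engine_ids)
        for name, engine_ids in engines_by_name.items()
        if len(engine_ids) > 1
    }

# Hypothetical example data: "monitor" exists under two Engine IDs.
example = [
    ("monitor", "80:00:00:00:01"),
    ("monitor", "80:00:00:00:02"),
    ("backup-op", "80:00:00:00:02"),
]
print(duplicate_snmp_users(example))
```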
1786385-1 : Libvirt core is generated on upgrade from F5OS-C 1.6.1 to F5OS-C 1.8.1
Component: F5OS-C
Symptoms:
A flawed libvirt core file is generated on blades intermittently during blade reboots such as upgrading from F5OS-C 1.6.1 to F5OS-C 1.8.1, partition disabling/enabling, and so on. However, the tenant remains healthy and functional.
Conditions:
Occurs intermittently during blade reboots such as upgrading from F5OS-C 1.6.1 to F5OS-C 1.8.1, partition disabling/enabling, and so on. However, the tenant remains healthy and functional.
Impact:
It has no impact. The libvirt core file is observed, but the tenant remains healthy and functional.
Workaround:
None
1785481-2 : When the restconf-max-session-limit is exceeded, a more specific warning message should be displayed
Component: F5OS-C
Symptoms:
If you try to establish a session on a system that exceeds the restconf-max-session-limit, the new session will be unsuccessful. Instead of the current generic error message "Authentication failed," a more precise message should be provided to explain that the authentication failed due to exceeding the restconf-max-session-limit, such as "You have exceeded the restconf-max-session-limit."
Conditions:
If restconf-max-session-limit, the limit on GUI sessions, is set, then the user will be able to establish a restconf-max-session-limit number of GUI sessions on a system.
If the user then attempts to start more than the restconf-max-session-limit number of sessions on the system, they will get the error message: “Authentication failed.”
Impact:
The error message does not provide precise details as to why the new session could not be established.
Workaround:
This is a cosmetic issue related to the contents of an error message.
If the creation of a new GUI session fails and the error message “Authentication Failed” is present, then the user needs to check the value of restconf-max-session-limit and ensure that they have not started too many GUI sessions on the system.
1784125-1 : Controller prompt stuck "Waiting for firmware status"
Component: F5OS-C
Symptoms:
The command line system prompt perpetually says "Waiting for firmware status" on both controllers.
Conditions:
This occurs when a user has issued the command to reset the confd database (reset-default-config) without rebooting the controllers.
Impact:
Message will stay until controllers are rebooted.
Workaround:
Reboot both controllers.
1772669 : Displayed Qkview file size can sometimes be indicated as negative
Component: F5OS-C
Symptoms:
If a qkview file exceeds 2.1 GB, its size may be indicated as negative when using the show system diagnostics qkview command (and others).
Conditions:
Qkview file exceeds 2.1 GB in size
Impact:
Cosmetic
Workaround:
None
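The 2.1 GB threshold in ID 1772669 is close to the largest value a signed 32-bit integer can hold (2,147,483,647). The sketch below shows how a larger byte count would read as negative if it were stored in such a field. That this is the actual cause is an assumption; the release note does not state it.

```python
def as_signed_32(value):
    """Interpret an integer as a two's-complement signed 32-bit value."""
    return ((value + 2**31) % 2**32) - 2**31

# A size just under the limit is unchanged; a size above it wraps negative.
print(as_signed_32(2_147_483_647))  # 2147483647
print(as_signed_32(2_200_000_000))  # -2094967296
```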
1754997-1 : Tenant instance may fail to come up after repeated blade reboots.
Links to More Info: BT1754997
Component: F5OS-C
Symptoms:
A tenant instance may fail to come up to running in the BIG-IP cluster after repeated reboots of the blade hosting the tenant instance.
In this case the blade will show as offline in the "show sys cluster" output.
---------------------------------------------------------------------------------------------------------
| Sys::Cluster Members
| ID Address Alt-Address Availability State Licensed HA Clusterd Reason
---------------------------------------------------------------------------------------------------------
| 1 :: :: offline enabled false unknown shutdown Slot Failed
| 2 :: :: available enabled true active running Run
| 3 :: :: unknown enabled false unknown shutdown Slot powered off or empty
| 4 :: :: unknown enabled false unknown shutdown Slot powered off or empty
| 5 :: :: unknown enabled false unknown shutdown Slot powered off or empty
| 6 :: :: unknown enabled false unknown shutdown Slot powered off or empty
| 7 :: :: unknown enabled false unknown shutdown Slot powered off or empty
| 8 :: :: unknown enabled false unknown shutdown Slot powered off or empty
Conditions:
Repeated reboots of the blade hosting the BIG-IP tenant instance.
Impact:
The affected tenant instance will be inoperable until the blade is rebooted again to recover.
Workaround:
Rebooting the blade while the instance is in the impacted state will restore the tenant instance.
1754769-1 : The third Openshift ETCD instance may not start up after a power cycle
Links to More Info: BT1754769
Component: F5OS-C
Symptoms:
Upon running the 'show cluster' command on the controller CLI, you will observe that the etcd-ha-running field is marked as false.
Conditions:
After a chassis power cycle followed by a controller failover, the third Openshift ETCD instance may fail to start. This is caused by a lock in the underlying database.
Impact:
You will not see any effect on tenants.
Workaround:
Initiate a controller failover.
1730881-1 : QKview may truncate non-truncatable log files
Links to More Info: BT1730881
Component: F5OS-C
Symptoms:
Qkview collects certain files that are not allowed to be truncated, even if you specify a maximum file size; however, certain non-truncatable files become truncated.
Conditions:
-- You run 'system diagnostics qkview capture filename <filename> maxfilesize <num>'
Impact:
Some diagnostics data may not be collected
Workaround:
Do not use the maxfilesize argument to limit file size.
1730793-1 : Config-restore fails with an error: "tenant-console role cant be assigned to users other than tenant users"★
Links to More Info: BT1730793
Component: F5OS-C
Symptoms:
Config-restore fails when restoring a backed-up configuration that contains a user entry with the tenant-console role but no tenant with the same name as the user.
Example:
system database config-restore name <config_file name>
A clean configuration is required before restoring to a previous configuration.
Please perform a reset-to-default operation if you have not done so already.
Proceed? [yes/no]: yes
Error: /oc-sys:system/aaa/authentication/f5-system-aaa:users/user{<user_name>}/config/role: tenant-console role cant be assigned to users other than tenant users.
Database config-restore failed.
Conditions:
1) After a software upgrade from v1.6.0 or earlier to v1.8.0, any additional tenant-console users (tenant-console users with no associated tenant) are carried over to v1.8.0. If the configuration is then backed up, reset to default, and restored, an error occurs because version 1.8.0 does not support tenant-console users without an associated tenant.
2) If a tenant is deleted without clearing the associated tenant-console user entry, a later config-backup and reset-to-default will result in a failed config-restore because version 1.8.0 does not support tenant-console users without an associated tenant.
Impact:
Unable to restore the configurations after performing reset-to-default.
Workaround:
Remove the tenant-console user entry (the one without a corresponding tenant) from the config backup file, and then perform the configuration restore with the modified config file.
1712009-2 : Attempting to perform a configuration restore, after downgrading from v1.8.0, makes the system inoperable★
Links to More Info: BT1712009
Component: F5OS-C
Symptoms:
After downgrading from v1.8.0 and performing a reset-to-default, ConfD fails to start.
Conditions:
Upgrade system from F5OS-A-1.4.0 or F5OS-C-1.6.0 and later to F5OS 1.8.0. Then downgrade and attempt to perform a config-restore to the prior version configuration.
Impact:
The system becomes inoperable, with no access to the CLI or UI. Interaction is restricted to a root-level bash login. Following a database reset, access is exclusively available through the serial console.
Workaround:
Perform the following steps for a successful configuration restore or reset-to-default operation following a version downgrade from 1.8.0.
=====================================================================================
F5 rSeries system's config-restore workaround after downgrading from v1.8.0
========================================================================
step-1: Log in to the command line interface (CLI) of the system using an account with root access.
step-2: Copy the below content to a new file f5_dyncfg_config_restor_fix.xml
<!-- File Begin -->
<!-- XML file content for fixing the config-restore issue. -->
<config xmlns='http://tail-f.com/ns/config/1.0'>
<confdConfig xmlns='http://tail-f.com/ns/confd_dyncfg/1.0'>
<restconf>
<transport>
<tcp>
<enabled>false</enabled>
</tcp>
</transport>
</restconf>
<webui>
<enabled>false</enabled>
<transport>
<tcp>
<enabled>true</enabled>
</tcp>
</transport>
</webui>
</confdConfig>
</config>
<!-- End of file -->
step-3: Move the file (f5_dyncfg_config_restor_fix.xml) created in step-2 to /var/F5/system/
step-4: Execute the below command.
docker exec -it system_manager /confd/bin/confd_load -U -c system -m -l /var/F5/partition/f5_dyncfg_config_restor_fix.xml
step-5: Delete the file /var/F5/system/f5_dyncfg_config_restor_fix.xml
System Controller’s config-restore workaround after downgrading from v1.8.0
===========================================================================
step-1: Log into the command line interface (CLI) of the Active controller using an account with root access.
step-2: Copy the below content to file f5_dyncfg_config_restor_fix.xml
<!-- File Begin -->
<!-- XML file content for fixing the config-restore issue. -->
<config xmlns='http://tail-f.com/ns/config/1.0'>
<confdConfig xmlns='http://tail-f.com/ns/confd_dyncfg/1.0'>
<restconf>
<transport>
<tcp>
<enabled>false</enabled>
</tcp>
</transport>
</restconf>
<webui>
<enabled>false</enabled>
<transport>
<tcp>
<enabled>true</enabled>
</tcp>
</transport>
</webui>
</confdConfig>
</config>
<!-- End of file -->
step-3: Move the file (f5_dyncfg_config_restor_fix.xml) created in step-2 to /var/F5/system/
step-4: Execute the below command.
docker exec -it vcc-confd confd_load -U -c system -m -l /var/F5/system/f5_dyncfg_config_restor_fix.xml
step-5: Delete the file /var/F5/system/f5_dyncfg_config_restor_fix.xml
Chassis Partition's config-restore workaround after Partition downgrading from 1.8.0
==================================================================================
step-1: Log in to the command line interface (CLI) of the blade using an account with root access.
step-2: Copy the below content to file f5_dyncfg_config_restor_fix.xml
<!-- File Begin -->
<!-- XML file content for fixing the config-restore issue. -->
<config xmlns='http://tail-f.com/ns/config/1.0'>
<confdConfig xmlns='http://tail-f.com/ns/confd_dyncfg/1.0'>
<restconf>
<transport>
<tcp>
<enabled>false</enabled>
</tcp>
</transport>
</restconf>
<webui>
<enabled>false</enabled>
<transport>
<tcp>
<enabled>true</enabled>
</tcp>
</transport>
</webui>
</confdConfig>
</config>
<!-- End of file -->
step-3: Move the file (f5_dyncfg_config_restor_fix.xml) created in step-2 to /var/F5/partition<id>/
step-4: Execute the below command.
docker exec -it partition<id>_manager confd_load -U -c system -m -l f5_dyncfg_config_restor_fix.xml
step-5: Delete the file /var/F5/system/f5_dyncfg_config_restor_fix.xml
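Because the same XML fragment is pasted by hand in each of the three procedures above, it may be worth confirming that the file is well formed and carries the intended flags before running confd_load. The following is a minimal sketch using the Python standard library; the helper name is hypothetical, and the check only validates the text of the file, not how ConfD will interpret it.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the confdConfig element in the fragment above.
DYNCFG_NS = {"dyn": "http://tail-f.com/ns/confd_dyncfg/1.0"}

FIX_XML = """<!-- File Begin -->
<config xmlns='http://tail-f.com/ns/config/1.0'>
<confdConfig xmlns='http://tail-f.com/ns/confd_dyncfg/1.0'>
<restconf><transport><tcp><enabled>false</enabled></tcp></transport></restconf>
<webui>
<enabled>false</enabled>
<transport><tcp><enabled>true</enabled></tcp></transport>
</webui>
</confdConfig>
</config>
<!-- End of file -->"""

def read_flags(xml_text):
    """Parse the fragment and return the three 'enabled' values as strings.

    Raises xml.etree.ElementTree.ParseError if the text is not well formed.
    """
    root = ET.fromstring(xml_text)

    def flag(path):
        node = root.find(path, DYNCFG_NS)
        return None if node is None else node.text.strip()

    base = "dyn:confdConfig/"
    return {
        "restconf_tcp": flag(base + "dyn:restconf/dyn:transport/dyn:tcp/dyn:enabled"),
        "webui": flag(base + "dyn:webui/dyn:enabled"),
        "webui_tcp": flag(base + "dyn:webui/dyn:transport/dyn:tcp/dyn:enabled"),
    }

print(read_flags(FIX_XML))
```

A ParseError here usually means a tag was lost or duplicated while copying the content.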
Follow the below steps to fix the system after it enters a failed state following a version downgrade from v1.8.0
=====================================================================================
To restore functionality, you must access a bash shell using an account with root access (most likely through the system's serial console) and delete the files in the "cdb/" directory and perform a restart. This action will erase all settings, including licensing and the system’s management IP.
Next, get a new license, configure the system management IP address, verify or reset the primary key, and initiate a configuration restoration using the previously saved backup.
If the system controller is reset using this method, the empty partitions must be recovered from backup and the tenants must then be restored.
If a partition experiences this type of failure and is cleared and reset, it must not be deleted or recreated in the system controller. This is because it will result in a mismatch of primary keys and the configuration restoration will not function properly.
1711105-1 : The presence of a /var/docker/config/platform.override.yml file causes the upgrade to hang/fail from versions earlier than F5OS 1.8.0★
Links to More Info: BT1711105
Component: F5OS-C
Symptoms:
If a platform.override.yml file exists from a version of F5OS prior to 1.8.0, platform-services will fail to start when the new software version boots.
This file is not part of the software distribution, and will only exist if an administrator created it after installation.
Conditions:
A platform.override.yml file exists with a version that is not '2.2'.
Impact:
Platform-services fails to start after reboot.
Workaround:
Prior to attempting to install F5OS 1.8.0 or later over an older version, make sure that there is no /var/docker/config/platform.override.yml file on either controller, on any blade, or on the appliance filesystem.
If the issue is encountered and platform-services does not start, remove the platform.override.yml and issue the command:
systemctl restart platform-services-deployment.service
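The pre-upgrade check for ID 1711105 can be sketched as a small script. This assumes the override file carries a top-level `version:` key in the style of a Docker Compose file, which the release note implies but does not show; the function names are hypothetical. The workaround itself is stricter and asks that no override file be present at all.

```python
import re
from pathlib import Path

# Path taken from this entry.
OVERRIDE_PATH = Path("/var/docker/config/platform.override.yml")

# Matches a top-level "version:" key with an optionally quoted value (assumed layout).
VERSION_LINE = re.compile(r"""^version:\s*['"]?([^'"\s#]+)""", re.MULTILINE)

def version_is_incompatible(yaml_text):
    """True if the text has no top-level version key or a version other than 2.2."""
    match = VERSION_LINE.search(yaml_text)
    return match is None or match.group(1) != "2.2"

def override_blocks_upgrade(path=OVERRIDE_PATH):
    """True if the override file exists and its version is not 2.2."""
    return path.is_file() and version_is_incompatible(path.read_text())

print(version_is_incompatible("version: '2.1'\nservices: {}\n"))
print(version_is_incompatible("version: '2.2'\nservices: {}\n"))
```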
1694317-1 : Tenant config changes may not occur if multiple tenants are changed at once.
Links to More Info: BT1694317
Component: F5OS-C
Symptoms:
If more than one tenant has its configuration changed and is toggled to the configured state and then back to deployed, some of the tenants may not have their configuration updated.
Conditions:
Multiple tenants exist, and config changes are made to more than 1 in rapid succession.
Impact:
Not all tenants will be redeployed with new configuration.
Workaround:
For any tenants that did not change their configuration, toggle them again: deployed->configured->deployed.
1692277-1 : Tenant is unreachable after changing the management VLAN
Links to More Info: BT1692277
Component: F5OS-C
Symptoms:
If the management VLAN for the tenant management interface is changed while the tenant is deployed, the management VLAN change will appear to be successful in both the running config and state output for the tenant, but the tenant will not be reachable on the reconfigured VLAN.
Conditions:
-- BIG-IP tenant deployed on F5OS
-- You change the management VLAN of the tenant
Impact:
Tenant is not reachable on the new VLAN, but the tenant state and the running configuration makes it look like the change was successfully applied.
Workaround:
There are two workarounds:
1. If the mgmt-vlan is already configured, transition the tenant from deployed to the provisioned or configured state and then back to deployed (that is, bounce the tenant).
2. Prior to configuring the tenant management VLAN, transition the tenant from the deployed to configured state, configure the management VLAN and redeploy the tenant.
1691661-1 : ASM performance drop of 5-10% after upgrading to 1.8.0
Links to More Info: BT1691661
Component: F5OS-C
Symptoms:
ASM performance drop will be seen after upgrading to 1.8.0 from 1.5.1.
Conditions:
Upgrading the system from 1.5.1 or older to 1.8.0 causes this issue.
Impact:
Performance drop of 5-10% for ASM.
Workaround:
None
1682441-1 : After simultaneous VELOS controller RMA, Openshift cluster needs to be reinstalled manually
Links to More Info: BT1682441
Component: F5OS-C
Symptoms:
If both VELOS system controllers are swapped simultaneously, Openshift may not start properly, and the system will not recover or be able to add new blades into the cluster.
The openshift log file (/var/log/openshift.log) will show this log message repeating:
Restarting openshift origin-node, controllers and api
Messages similar to these in /var/log/messages:
nodename=controller-1 2024-09-19 17:39:13.686057 C | etcdmain: listen tcp 100.65.3.52:2380: bind: cannot assign requested address
nodename=controller-2 2024-09-19 21:00:05.873025 C | etcdmain: listen tcp 100.65.3.51:2380: bind: cannot assign requested address
2024-09-24 06:59:16.591720 I | etcdmain: rejected connection from "100.65.3.52:39400" (error "remote error: tls: bad certificate", ServerName "")
Conditions:
Both system controllers are replaced simultaneously in a VELOS chassis.
Impact:
System remains unhealthy.
Workaround:
This issue should not occur if each controller is replaced one-at-a-time.
If both system controllers are swapped simultaneously, then once they have booted up, reinstall Openshift by doing the following:
1. Log into the active VELOS system controller as root, and run:
touch /var/omd/CLUSTER_REINSTALL
The VELOS system begins the OpenShift cluster reinstallation process. This operation can take 90 minutes or more to complete.
2. In order to check the progress of the rebuild, you can run the following command:
tail -F /var/log/openshift.log
1671781 : Lldp crash when chassis goes for reboot
Links to More Info: BT1671781
Component: F5OS-C
Symptoms:
Lldp might crash when a reboot is triggered on the chassis.
Conditions:
System controller(s) are rebooted.
Impact:
As the chassis reboots, an lldp core file may be present on the system controller. If the core occurred during the reboot, it does not cause any issue.
Workaround:
None
1627085-1 : QAT devices do not de-allocate after tenant deletion
Component: F5OS-C
Symptoms:
You see a stale tenant entry under "show cluster nodes nodes appliance-1 tenants tenant" table
Conditions:
This occurs rarely, after rebooting within 5-8 minutes of deleting a tenant.
Impact:
No functional impact but the 'show cluster nodes node <blade>' command may show tenants that have been previously deleted along with an associated QAT device name.
TENANT ASLA ASLA ASLA SLA SLA SLA
NAME QAT DEVICE NAME BDF MIN USED UTIL MIN USED UTIL
--------------------------------------------------------------------------
bigip1 qat_dev_vf08pf00_hi b5:02.0 2000 0 0 2000 0 0
qat_dev_vf08pf01_hi b6:02.0 2000 0 0 2000 0 0
qat_dev_vf08pf02_hi b7:02.0 2000 0 0 2000 0 0
qat_dev_vf09pf00_hi b5:02.1 2000 0 0 2000 0 0
qat_dev_vf09pf01_hi b6:02.1 2000 0 0 2000 0 0
qat_dev_vf09pf02_hi b7:02.1 2000 0 0 2000 0 0
Workaround:
Create a tenant with the same tenant name and delete it to remove the stale entry.
1623325-3 : VLAN groups or VLAN group members may be deleted on F5OS tenant
Links to More Info: BT1623325
Component: F5OS-C
Symptoms:
If using VLAN groups on a tenant running on an rSeries appliance or VELOS chassis, the system may delete the VLAN group or VLAN group members unexpectedly.
This will happen when configuration changes to the tenant are made in F5OS or if the interface members of the VLAN change state (for example, link down).
- If the VLAN groups are in a non-"Common" partition, any members of the VLAN group will be removed, but the VLAN group will remain.
- If the VLAN groups are in the Common partition, but are not referenced by higher-level objects, the VLAN group will be removed.
- If the VLAN groups are in the Common partition and are referenced by higher-level objects, the system will not delete the VLAN group, but will log messages similar to the following:
err mcpd[9181]: 01070623:3: The vlangroup (/Common/otters-vlangroup) is referenced by one or more virtual servers.
err chmand[4691]: 012a0003:3: hal_mcp_process_error: result_code=0x1070623 for result_operation=eom result_type=eom
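The three cases listed above for ID 1623325 can be summarized as a small decision function. This is only a restatement of the behavior described in this entry, with hypothetical names; it is not BIG-IP code.

```python
def vlan_group_outcome(in_common_partition, referenced_by_higher_level_object):
    """Describe what happens to a VLAN group when the issue is triggered."""
    if not in_common_partition:
        return "members removed, group remains"
    if not referenced_by_higher_level_object:
        return "group removed"
    return "group kept, error 01070623 logged"

for in_common in (False, True):
    for referenced in (False, True):
        print(in_common, referenced, "->", vlan_group_outcome(in_common, referenced))
```

Only the last case preserves the VLAN group, and the workaround for this entry relies on that case by referencing an unused VLAN group from a virtual server.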
Conditions:
- BIG-IP tenant running on rSeries appliance or VELOS chassis
- VLAN group configured in tenant, and not using virtual wire
Impact:
Traffic disrupted due to removal of VLAN group objects or VLAN group members.
Workaround:
To avoid this problem, define an unused VLAN group in the Common partition and assign it to the VLAN list for a virtual server.
tmsh create net vlan-group /Common/unused-vg
tmsh create ltm virtual /Common/unused-virtual vlans-enabled vlans add { unused-vg } description "Workaround for ID1623325"
tmsh save sys config
Note the use of "vlans-enabled" and adding the empty VLAN group to the virtual server's VLAN list. This means that the BIG-IP system will never actually process traffic via this virtual server, as it would only accept traffic to the virtual server that arrives over the VLAN group, but the VLAN group will never receive any actual traffic.
As a result of implementing this workaround, when the tenant processes any configuration updates from F5OS, the tenant will log error messages similar to the following:
err mcpd[10720]: 01070623:3: The vlangroup (/Common/unused-vg) is referenced by one or more virtual servers.
err chmand[6781]: 012a0003:3: hal_mcp_process_error: result_code=0x1070623 for result_operation=eom result_type=eom
1615849 : LAG interface ifAdminStatus always shows "up" in SNMP ifTable o/p despite the same not being populated in cli as admin up/down is not configurable
Links to More Info: BT1615849
Component: F5OS-C
Symptoms:
The LAG interface ifAdminStatus always shows "up". This status should be "up" only for interfaces of type ianaift_ethernetCsmacd.
Conditions:
1. Upgrade chassis to 1.8.0-14272
2. Create LACP interface
3. Configure SNMP
4. Run snmp walk from workstation
5. Check the ifAdminStatus value in the SNMP ifTable.
Impact:
Incorrect information is displayed for ifAdminStatus.
Workaround:
None
1615105-1 : Observing Active-Active status in both controller bash prompts for long period of time after reboot
Links to More Info: BT1615105
Component: F5OS-C
Symptoms:
The system status is reported as "Active" on both system controllers for ~115 seconds after system reboot.
Prompt will be updated correctly after ~115 seconds.
Conditions:
Reboot both system controllers.
Impact:
Both system controllers report Active status for ~115 seconds after the reboot.
Workaround:
None
1612429-1 : License installation is not working with HTTPS Proxy server
Links to More Info: BT1612429
Component: F5OS-C
Symptoms:
License installation is not working with SSL-enabled proxy server.
Conditions:
The SSL-enabled proxy server is unable to perform an SSL handshake when installing a license through a proxy server.
Impact:
License installation will fail with proxy server.
Workaround:
Install the license manually or use an HTTP proxy.
1591961-2 : Observing "Failed to send restarting msg to VF" errors during reboot
Component: F5OS-C
Symptoms:
This error “Failed to send restarting msg to VF” appears during reboot and causes a delay in reboot.
Conditions:
When two or more BIG-IP tenants are deployed.
Impact:
Delay in reboot time.
Workaround:
None
1579781-1 : Power supply controller firmware update can cause failover
Component: F5OS-C
Symptoms:
In some instances, the failure of a power supply controller firmware update or a hardware issue in the power supply controller may cause the system controllers to fail over.
Conditions:
When the firmware update fails or a hardware issue occurs in the power supply controller, the system controller can fail over multiple times.
Impact:
System controllers show unhealthy behavior and then fail over to the peer system controller.
Workaround:
None
1567497 : Compatibility verification failed during downgrade from 1.8.0 to 1.6.0★
Component: F5OS-C
Symptoms:
The ignore initial validation flag is not available in the 1.6.0 release of the schema. It has been enabled in later releases, starting from 1.6.2, due to potential compatibility check failures in the downgrade matrix.
Conditions:
Occurs when a system downgrades from F5OS-C 1.8.0 to F5OS-C 1.6.0.
Impact:
The downgrade may succeed, but with some intermittent failures.
Workaround:
Delete the allowed IPs configuration, and then trigger the downgrade.
1566917-1 : The ha-1-deployment pod may get restarted after HA setup and system upgrades
Component: F5OS-C
Symptoms:
When HA is configured on the BIG-IP Next tenants, a new pod named <tenant-name>ha-1-deployment-<replica-set-hash>-<pod-id> will be created in the tenant namespace.
In some cases, the pod restart count may be 1 or 5.
Conditions:
When HA is set up on BIG-IP Next tenants on rSeries and after upgrading F5OS 1.7.0 to F5OS 1.8.0 version.
Impact:
No functional impact. The pod will automatically transition to a running state.
Workaround:
None
1552921-1 : Password policy option reject-username set to false has no effect
Links to More Info: BT1552921
Component: F5OS-C
Symptoms:
When the administrator configures 'system aaa password-policy config reject-username false', F5OS will still reject passwords that contain the username.
Conditions:
System aaa password-policy config reject-username is set to false
Impact:
When a user tries to set or change a password containing their username in any part of the password, F5OS will reject that password.
Workaround:
Do not use passwords that contain the username.
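The gap between the intended and the observed behavior for ID 1552921 can be illustrated with a small model of the policy check. This is a hypothetical sketch: the function names are invented, and the case-insensitive substring match is an assumption about how "contains the username" is evaluated.

```python
def password_allowed_expected(username, password, reject_username):
    """Intended policy: reject only when reject-username is true."""
    contains_username = username.lower() in password.lower()
    return not (reject_username and contains_username)

def password_allowed_observed(username, password, reject_username):
    """Observed behavior in this entry: the setting is effectively ignored."""
    return password_allowed_expected(username, password, reject_username=True)

# With reject-username set to false, the intended result is acceptance,
# but the observed result is rejection.
print(password_allowed_expected("alice", "Alice-2024!x", reject_username=False))
print(password_allowed_observed("alice", "Alice-2024!x", reject_username=False))
```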
1550693-1 : Missing LACP ConfD events may lead to loss of connectivity to blade control plane
Component: F5OS-C
Symptoms:
If an LACP working member update is missed (either LACPD fails to send the update or switchd fails to receive it), control plane connectivity between the SC and blade may be lost.
Conditions:
Any event giving LACP a reason to change the working members of a control plane aggregation (for example, reboot, removal, or insertion of a blade or CC).
Impact:
Connectivity may be lost between the CC and one or more blades, and management port traffic may also be lost if management ports are aggregated.
Workaround:
Restart cc-switchd and cc-lacpd (in that order) on both SCs or reboot both SCs.
1505497-2 : During remote logging server configuration, selectors help menu does not display when using Tab key.
Component: F5OS-C
Symptoms:
While configuring the remote logging server, pressing the Tab key does not display the selectors help menu.
Conditions:
Pressing the Tab key at the selectors option while configuring the remote logging server.
Impact:
No help menu is displayed.
Workaround:
Use the ? key to get help in the selectors menu while configuring the remote server.
1497893-1 : Unable to deport previously referenced ISO of now-disabled partition
Links to More Info: BT1497893
Component: F5OS-C
Symptoms:
Upgrading a partition to a new ISO while the partition is disabled does not completely switch the partition’s OS from the previous version to the new one. As a result, the system fails to deport the previously referenced ISO file.
Conditions:
- Enable a partition with version A.
- Disable the partition.
- Upgrade the partition to new version B.
- Attempt to deport version A.
The deport of version A fails, claiming that it is still in use.
Impact:
You cannot remove the previously referenced ISO from the system.
Workaround:
Enable the partition that was upgraded to the new version, and then deport the previously referenced ISO.
1497385-3 : F5OS SNMP IF-MIB::ifAlias missing from snmpwalk
Links to More Info: BT1497385
Component: F5OS-C
Symptoms:
The following SNMP MIB OID from the IF-MIB table is missing on F5OS-A and F5OS-C:
1.3.6.1.2.1.31.1.1.1.18
Example snmpwalk result:
~ % snmpwalk -c public -v 2c 10.10.10.33 1.3.6.1.2.1.31.1.1.1.18
IF-MIB::ifAlias = No Such Instance currently exists at this OID
Conditions:
Running snmpwalk against the ifAlias OID, for example:
snmpwalk -c public -v 2c 10.10.10.33 1.3.6.1.2.1.31.1.1.1.18
IF-MIB::ifAlias = No Such Instance currently exists at this OID
Impact:
Cannot get results for MIB OID 1.3.6.1.2.1.31.1.1.1.18.
Workaround:
None
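For monitoring scripts that poll this OID, it may help to recognize the "No Such Instance" reply explicitly rather than treating it as a value. The following is a minimal sketch that inspects captured snmpwalk output; the function name is hypothetical, and it assumes the output text has already been collected by some other means.

```python
def oid_unavailable(snmpwalk_output: str) -> bool:
    """Return True if captured snmpwalk output reports a missing OID.

    Checks for the Net-SNMP "No Such Instance" / "No Such Object" replies.
    """
    markers = ("No Such Instance", "No Such Object")
    return any(marker in snmpwalk_output for marker in markers)


if __name__ == "__main__":
    sample = "IF-MIB::ifAlias = No Such Instance currently exists at this OID"
    if oid_unavailable(sample):
        print("ifAlias is not available on this agent")
```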
1491209 : Non-root, local authentication fails when LDAP is configured with chase referrals and an invalid DNS server is configured
Links to More Info: BT1491209
Component: F5OS-C
Symptoms:
Local and remote authentication to F5OS will time out and fail. Running commands as root may take 60 seconds before each command returns.
Conditions:
LDAP authentication is configured with chase-referrals set to true and an invalid or non-responsive DNS server is also configured.
Impact:
Users cannot successfully authenticate via the GUI. Local admin users cannot successfully authenticate. Logging in as root takes 2 minutes and many system commands will take at least 60 seconds to complete.
Workaround:
Set 'system aaa authentication ldap chase-referrals false' or ensure a working DNS server is always configured.
1490169-1 : Monitor Error Event logged on controller and partition
Links to More Info: BT1490169
Component: F5OS-C
Symptoms:
The system may periodically log platform-monitor errors in the velos.log on the controller:
<timestamp> controller-1 platform-monitor[8]: priority="Err" msg="Monitor Error Event" kind="service:monitor-error" error="Get \"http://localhost:10080/v3/qkviewd/health\": dial tcp [::1]:10080: connect: connection refused" interface="console-output"
<timestamp> controller-1 platform-monitor[8]: priority="Err" msg="Monitor Error Event" kind="service:monitor-error" error="ReadyRequest failed for 'platform-hal' @ 'tcp://127.0.0.1:1046', Inner -> 'receive timeout'" interface="console-output"
The VELOS system may also log these errors in the partition's velos.log:
<timestamp> controller-1(p2) platform-monitor[8]: priority="Err" msg="Monitor Error Event" kind="service:monitor-error" error="HealthRequest failed for 'partition2_tcpdumpd_manager' @ 'tcp://127.0.0.1:3510', Inner -> 'receive timeout'" interface="console-output"
<timestamp> 100.65.18.52 controller-2(p2) platform-monitor[9]: priority="Err" msg="Monitor Error Event" kind="service:monitor-error" error="HealthRequest failed for 'partition2_tcpdumpd_manager' @ 'tcp://127.0.0.1:3510', Inner -> 'receive timeout'" interface="console-output"
The platform-monitor events normally only occur on tcpdump monitors.
The log messages can occur even on an idle VELOS system.
Conditions:
VELOS system with partition enabled.
Impact:
The issue is transient and cosmetic.
Workaround:
None.
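Because these entries are cosmetic, it can be useful to separate them from other errors when reviewing velos.log. The sketch below parses the key="value" fields of a log line and flags the "Monitor Error Event" entries shown above. The function names are hypothetical, and the field layout is assumed from the sample lines in this issue only.

```python
import re

# Matches key="value" pairs, allowing backslash-escaped quotes inside values.
FIELD_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_fields(line: str) -> dict:
    """Extract the key="value" fields from one log line."""
    return {key: value for key, value in FIELD_RE.findall(line)}


def is_monitor_error(line: str) -> bool:
    """Return True for platform-monitor 'Monitor Error Event' entries."""
    fields = parse_fields(line)
    return (
        fields.get("msg") == "Monitor Error Event"
        and fields.get("kind") == "service:monitor-error"
    )


if __name__ == "__main__":
    sample = '''<timestamp> controller-1(p2) platform-monitor[8]: priority="Err" msg="Monitor Error Event" kind="service:monitor-error" error="HealthRequest failed for 'partition2_tcpdumpd_manager' @ 'tcp://127.0.0.1:3510', Inner -> 'receive timeout'" interface="console-output"'''
    print(is_monitor_error(sample), parse_fields(sample)["error"])
```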
1471673-1 : Tenants may be in a failed state after downgrade from VELOS v1.7.1 to v1.7.0 and then back to 1.7.1★
Component: F5OS-C
Symptoms:
Downgrading the VELOS software from v1.7.1 to v1.7.0 and then upgrading back to v1.7.1 may put the tenants into a failed state.
Conditions:
The VELOS v1.7.0 software is intended to be an initial manufacturing build from which the user upgrades to the released VELOS v1.7.1 version. Downgrading back to VELOS v1.7.0 is not supported.
Impact:
Tenants may be in a failed state.
Workaround:
None
1381053-5 : Cluster IP is unavailable for some time during tenant reboot
Links to More Info: BT1381053
Component: F5OS-C
Symptoms:
The Cluster IP/Floating IP becomes inactive, causing API calls to fail temporarily.
Conditions:
Intermittently when the system/tenant is rebooted.
When tenant running-state is toggled (deployed->configured->deployed).
Impact:
API calls fail temporarily. CM will not be able to get the status of the HA.
Workaround:
1. Log in to the rSeries device on which the current ACTIVE HA node is running.
2. Run the following command, substituting the appropriate values:
docker exec -it node-agent arping -q -c 5 -W 0.01 -U -P -I <tenant mgmt interface> -S <tenant mgmt VIP> <tenant mgmt VIP>
The tenant management interface can be found by running 'ip a s | grep mgmt' as root.
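The placeholder substitution in the command above can be sketched as follows. This is an illustrative helper only: the function name is hypothetical, the interface name and address in the usage example are made-up values, and the helper only builds the command string without running it.

```python
import ipaddress
import shlex


def build_garp_command(mgmt_interface: str, mgmt_vip: str) -> str:
    """Build the arping command from the workaround for the given values.

    Raises ValueError if mgmt_vip is not a valid IPv4 address.
    """
    ipaddress.IPv4Address(mgmt_vip)  # validation only
    argv = [
        "docker", "exec", "-it", "node-agent",
        "arping", "-q", "-c", "5", "-W", "0.01", "-U", "-P",
        "-I", mgmt_interface,
        "-S", mgmt_vip, mgmt_vip,
    ]
    return shlex.join(argv)


if __name__ == "__main__":
    # "mgmt0" and 192.0.2.10 are placeholder values, not real settings.
    print(build_garp_command("mgmt0", "192.0.2.10"))
```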
1332293-4 : Tcpdump performed with an interface filter on VELOS or rSeries will show broadcast traffic from all interfaces
Links to More Info: BT1332293
Component: F5OS-C
Symptoms:
When performing a tcpdump in VELOS or an rSeries appliance, a traffic capture limited to a specific interface will show broadcast traffic hitting other interfaces.
Conditions:
- VELOS platform or r5000 / r10000 / r12000 series appliance
- Running a packet capture on a specific interface (e.g. 1/1.0 or 1.0)
Impact:
This can cause confusion or impede troubleshooting when unexpected broadcast traffic, such as ARP or Miscabling Protocol traffic, is seen in a capture.
Workaround:
None
1322245-1 : After downgrading from version 1.6.0 to 1.5.1 and then upgrading back to 1.6.0 both packages are installed and install status is set to none.★
Links to More Info: BT1322245
Component: F5OS-C
Symptoms:
"PACKAGE INSTALLED" and "INSTALL STATUS" are none after downgrading from 1.6.0 to 1.5.1 and upgrading back to 1.6.0.
Conditions:
1. Install the optics package.
2. Downgrade from 1.6.0 to 1.5.1.
3. Upgrade back to 1.6.0.
Impact:
There is no functional impact.
The previously installed packages will still be available in the system, and optics will be updated.
Workaround:
There is no functional impact.
To refresh the displayed values, switch to an alternative package version and then revert to the previous package version.
The "PACKAGE INSTALLED" and "INSTALL STATUS" fields are then updated.
1321593 : Peer controller compatibility verification failed★
Links to More Info: BT1321593
Component: F5OS-C
Symptoms:
During a downgrade of the chassis from 1.6.0 to 1.5.1, the partition downgrade will succeed, but the controller downgrade may show a message indicating that compatibility verification failed. This is due to certain firewall rules missing on the controllers.
Conditions:
Intermittently in downgrades, controller configuration indicates that compatibility verification failed.
Impact:
The controller downgrade may intermittently report that compatibility verification failed.
Workaround:
1. Restart the iptables-config.service on both controllers:
$ systemctl restart iptables-config.service
2. Reboot the chassis
1273129-4 : TPM status may not be reported during PXE install
Component: F5OS-C
Symptoms:
The tpm-integrity-status parameter may incorrectly be displayed as "Unavailable" when running the ‘show components component platform’ command.
Conditions:
When performing a PXE install downgrade, the SIRR DB does not persist across OS updates. This leads to a mismatch between the SIRR and BIOS versions; the SIRR DB may not have information about the newer BIOS version, causing inconsistencies in TPM validation.
Impact:
This may impact TPM integrity validation.
Workaround:
Contact F5 support for further assistance and more details.
1222721-2 : Deletion of STP configuration using "no stp" is failing
Links to More Info: BT1222721
Component: F5OS-C
Symptoms:
"no stp" fails with the following error:
Aborted: 'stp rstp config' : IEEE Std 802.1Q-2018: A Bridge shall enforce the following relationships:
Due to this, the user cannot delete/disable STP with a single command.
Conditions:
On VELOS platforms, "no stp" fails with an error.
Impact:
The user will not be able to delete/disable the STP configuration with the single command "no stp".
Workaround:
All configurations other than the following can be deleted:
1) no stp rstp config
2) no stp stp config
3) no stp mstp config
1112317-2 : Null bytes or non-ascii characters are present in velos.log
Links to More Info: BT1112317
Component: F5OS-C
Symptoms:
Null bytes are created in the log files.
Conditions:
Abrupt restarts may cause this issue.
Impact:
Grep considers the log file as a binary file.
Workaround:
Use the ‘-a’ option with the grep command.
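Where a script rather than grep is used to search the log, a similar effect can be obtained by reading the file as bytes and tolerating the stray bytes. The sketch below is an assumption-laden illustration, not a replacement for grep: unlike 'grep -a', it removes null bytes outright and replaces undecodable bytes, and the function name is hypothetical.

```python
from typing import List


def grep_text(data: bytes, needle: str) -> List[str]:
    """Return the lines of a log (given as raw bytes) that contain needle.

    Null bytes are dropped and undecodable bytes are replaced, so the
    content is always handled as text.
    """
    text = data.replace(b"\x00", b"").decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if needle in line]


if __name__ == "__main__":
    # Made-up log content with null bytes in the middle, for illustration.
    raw = b'line one\n\x00\x00\x00priority="Err" msg="x"\nline three\n'
    for match in grep_text(raw, "Err"):
        print(match)
```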
1102869-1 : Link stats misrepresentation on interfaces in Autonegotiate mode when link goes down
Links to More Info: BT1102869
Component: F5OS-C
Symptoms:
When an interface is configured for autonegotiation and the link then goes down, the port-speed and duplex-mode attributes are not cleared and are still displayed.
Conditions:
This issue occurs when the interface is configured for autonegotiation mode, has port-speed and duplex-mode populated from a prior active connection, and the link subsequently goes down.
Impact:
Users might misinterpret the current state of the interface.
Workaround:
To accurately determine the link status, users should rely on the state oper-status field.
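A script that consumes interface state could apply the same rule by ignoring port-speed and duplex-mode unless oper-status reports the link as up. The following is a minimal sketch under stated assumptions: the state is assumed to be available as a dictionary keyed by the field names above, the "UP"/"DOWN" and speed values are illustrative, and the function name is hypothetical.

```python
def reported_link(state: dict) -> dict:
    """Return link details, hiding stale speed/duplex when the link is not up.

    Assumes oper-status is the string "UP" when the link is active.
    """
    is_up = str(state.get("oper-status", "")).upper() == "UP"
    return {
        "oper-status": state.get("oper-status"),
        "port-speed": state.get("port-speed") if is_up else None,
        "duplex-mode": state.get("duplex-mode") if is_up else None,
    }


if __name__ == "__main__":
    # Illustrative values: a down link that still shows stale attributes.
    stale = {"oper-status": "DOWN", "port-speed": "SPEED_10GB", "duplex-mode": "FULL"}
    print(reported_link(stale))
```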
1028389 : Tenant status/error messages in the partition CLI display are misleading
Component: F5OS-C
Symptoms:
A confusing and misleading status and error message is displayed within the 'show tenants' command when the partition services are not fully functional:
"Resource allocation failed - Verify node is synchronized with the partition"
"Tenant deployment will be processed when the blade synchronized with partition"
Conditions:
This is encountered with the CLI command 'show tenants' when the partition services are not fully operational on the blade.
Impact:
When you run the CLI command 'show tenants' and see this specific status and error message, it can be confusing and can obscure the actual problem with the tenant.
★ This issue may cause the configuration to fail to load or may significantly impact system performance after upgrade
For additional support resources and technical documentation, see:
- The F5 Technical Support website: https://www.f5.com/support/
- The MyF5 website: https://my.f5.com/manage/s/
- The F5 DevCentral website: https://community.f5.com/