yv3.5: bb: Implement Class-2 Cable Detection #279
Closed
DelphineChiu wants to merge 1 commit into
Contributor
@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
GoldenBug suggested changes May 11, 2022
Comment on lines +158 to +169
    if (status == IPMB_ERROR_FAILURE) {
        printf("Fail to post msg to InF_target 0x%x txqueue for addsel\n", msg->InF_target);
        SAFE_FREE(msg);
        return false;
    } else if (status == IPMB_ERROR_GET_MESSAGE_QUEUE) {
        printf("No response from InF_target 0x%x for addsel\n", msg->InF_target);
        SAFE_FREE(msg);
        return false;
    }
Contributor
Can we implement this as a switch statement instead?
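One way the requested change might look is sketched below. This is not the code that landed: `check_addsel_status` is a hypothetical helper, the `ipmb_error` values, the `IPMB_ERROR_SUCCESS` name, the `ipmi_msg` layout and the `SAFE_FREE` macro are stand-ins so the sketch compiles on its own, and the `default` branch is an addition that the original `if`/`else if` chain does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for the firmware's own definitions; the values and the
 * success code name are assumptions made only so the sketch compiles. */
typedef enum {
	IPMB_ERROR_SUCCESS = 0,
	IPMB_ERROR_FAILURE,
	IPMB_ERROR_GET_MESSAGE_QUEUE,
} ipmb_error;

typedef struct {
	uint8_t InF_target;
} ipmi_msg;

#define SAFE_FREE(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical helper: returns true when the add-SEL message was sent;
 * otherwise logs the reason, frees *msg and returns false. */
static bool check_addsel_status(ipmb_error status, ipmi_msg **msg)
{
	assert(msg != NULL && *msg != NULL);

	switch (status) {
	case IPMB_ERROR_SUCCESS:
		return true;
	case IPMB_ERROR_FAILURE:
		printf("Fail to post msg to InF_target 0x%x txqueue for addsel\n",
		       (*msg)->InF_target);
		break;
	case IPMB_ERROR_GET_MESSAGE_QUEUE:
		printf("No response from InF_target 0x%x for addsel\n",
		       (*msg)->InF_target);
		break;
	default:
		/* Not in the original snippet: other statuses are treated as errors. */
		printf("Unexpected status %d from InF_target 0x%x for addsel\n",
		       (int)status, (*msg)->InF_target);
		break;
	}

	/* Every error path shares one cleanup, as in the reviewed code. */
	SAFE_FREE(*msg);
	return false;
}
```

Putting the cleanup after the `switch` keeps the `SAFE_FREE` and `return false` pair in one place instead of repeating it per branch.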
0e2e5c5 to 0dfd6ea
Contributor
@DelphineChiu has updated the pull request. You must reimport the pull request before landing.
[Summary]
- Implement Class-2 Cable Detection for system present, system absent, cable absent and cable mismatch situations (Add SEL)
- Create common add SEL structure for BB BIC (BB BIC needs InF_target to determine which slot to send to)
- Cable Detection logic:
┌───────────────────────────┐ ┌───────────────────────────────────────────┐
│ BB BIC GPIO ISR A3/B7 │ │ Check Cable status ? Cable Absent : Next │
│ PRSNT_MB_BIC_SLOT3_BB_N_R ├───────────────>│ Check GPIO status ? System Absent : Next │
│ PRSNT_MB_BIC_SLOT1_BB_N_R │ │ Add System Present SEL │
└───────────────────────────┘ └───────────────────────────────────────────┘
┌───────────────────────────┐ ▲ │ ┌────────────────────────────────┐
│ BMC Send IPMI request │ │ │ │ Get GPIO SlotID │
│ NetFn: OEM 0x30 ├───────────Fn Call─────────┘ └──────Fn Return─────>│ Check SlotID ? Mismatch : Exit │
│ Command: OEM 0xCB (Cable) │ │ Note: Slot1 = 0x3, Slot3 = 0x1 │
└───────────────────────────┘ └────────────────────────────────┘
Note: Cable detection only examines the peer blade, because the local blade must be present if the BMC is operable.
- Known issue: Sometimes the BMC request or the GPIO interrupt calls the function twice, which produces duplicate SELs.

[Test Plan & Log]
1. Build code: Pass
2. Cable Detection using decoded BMC test firmware:
A. SLED cycle when system normal:
a. Action:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:48:16 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:22 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:22, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:22 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:22 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:22 healthd SLED Powered OFF at Fri Mar 9 04:35:22 2018
0 all 2018-03-09 04:35:22 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:35:15 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
B. BMC reboot after peer cable removed:
a. Action on upper blade:
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:37:22 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:37:41 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:41, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
-> 1 slot1 2018-03-09 04:39:14 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:14, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:39:14 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:39:14 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:39:14 healthd BMC Reboot detected - caused by reboot command
b. Action on lower blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:35:45 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:54 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:54, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
-> 1 slot1 2018-03-09 04:37:25 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:25, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:37:25 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:37:25 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:37:25 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:37:25 healthd BMC Reboot detected - caused by reboot command
C. BMC reboot after peer blade removed:
a. Remove upper blade & Reboot lower BMC:
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:35:45 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 04:37:15 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:37:15 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:37:15 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:37:15 healthd BMC Reboot detected - caused by reboot command
b. Remove lower blade & Reboot upper BMC:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:39:26 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:39:32 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:32, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 04:40:59 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:59, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:40:59 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:40:59 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:40:59 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:40:59 healthd BMC Reboot detected - caused by reboot command
D. BMC reboot when cable mismatch:
a. Exchange both cables & Reboot upper blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 05:54:13 log-util: User cleared all logs
-> 1 slot1 2018-03-09 05:55:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
-> 1 slot1 2018-03-09 05:55:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
0 all 2018-03-09 05:55:51 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 05:55:51 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 05:55:51 healthd BMC Reboot detected - caused by reboot command
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM sensor monitoring enabled
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 05:54:06 log-util: User cleared all logs
b. Exchange both cables & Reboot lower blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 05:58:10 log-util: User cleared all logs
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 06:00:55 log-util: User cleared all logs
-> 1 slot1 2018-03-09 06:02:15 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
-> 1 slot1 2018-03-09 06:02:15 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 06:02:16 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 06:02:16 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 06:02:16 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 06:02:16 healthd BMC Reboot detected - caused by reboot command
c. Upper blade inserted into Slot1, peer blade not present:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 05:58:10 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
d. Lower blade inserted into Slot3, peer blade not present:
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 06:10:27 log-util: User cleared all logs
-> 1 slot1 2018-03-09 06:11:45 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 06:11:45 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 06:11:45 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 06:11:45 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 06:11:45 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 06:11:45 healthd BMC Reboot detected - caused by reboot command
E. Hot-remove and hot-add peer blade:
a. Action on upper blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:39:12 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:39:25 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:39:55 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:55, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 04:40:08 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:08, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
b. Action on lower blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:36:28 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:36:38 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:38, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
-> 1 slot1 2018-03-09 04:36:48 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:48, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:42:30 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
F. Hot-remove and hot-plug peer cable:
a. Action on upper blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:36:55 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:21 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:21 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:37:23 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:37:36 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:36, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
-> 1 slot1 2018-03-09 04:37:47 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:47, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
b. Action on lower blade:
- Upper SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:40:15 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:40:21 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
-> 1 slot1 2018-03-09 04:40:35 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:35, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
- Lower SEL:
root@bmc-oob:~# log-util all --print
2018 Mar 09 04:43:29 log-util: User cleared all logs
-> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018
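The detection order from the summary's diagram, together with the Event Data bytes seen in the logs, can be sketched roughly as follows. This is an illustration only, not the firmware's code: `struct peer_state`, `detect_cable_events` and the `EVT_*` names are hypothetical, the high-nibble event codes (0x0, 0x1, 0x2, 0x3) are inferred from the logged values such as `(13FFFF)` and `(33FFFF)`, and skipping the SlotID check when the cable is absent is an assumption the logs neither confirm nor rule out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Event codes inferred from the high nibble of the first Event Data byte
 * in the logs; the names are hypothetical. */
enum cable_event {
	EVT_SYSTEM_PRESENT = 0x00, /* e.g. (03FFFF) slot3(peer slot) present */
	EVT_SYSTEM_ABSENT = 0x10,  /* e.g. (13FFFF) peer slot not detected */
	EVT_CABLE_ABSENT = 0x20,   /* e.g. (23FFFF) cable is not connected */
	EVT_SLOT_MISMATCH = 0x30,  /* e.g. (33FFFF) slot1 instead of slot3 */
};

/* Hypothetical snapshot of what the BB BIC would read from its GPIOs. */
struct peer_state {
	uint8_t peer_slot;   /* 1 or 3, reported in the low nibble */
	int cable_present;   /* non-zero when the peer cable is detected */
	int system_present;  /* non-zero when the peer blade is detected */
	int slot_id_matches; /* non-zero when the SlotID is the expected one */
};

/* Writes the first Event Data byte of each SEL to out[], in the order the
 * diagram implies, and returns how many SELs would be added (1 or 2). */
static size_t detect_cable_events(const struct peer_state *s, uint8_t out[2])
{
	size_t n = 0;

	assert(s != NULL && out != NULL);
	uint8_t slot = s->peer_slot & 0x0F;

	if (!s->cable_present) {
		/* Assumption: without a cable the SlotID check is skipped. */
		out[n++] = (uint8_t)(EVT_CABLE_ABSENT | slot);
		return n;
	}

	if (!s->system_present)
		out[n++] = (uint8_t)(EVT_SYSTEM_ABSENT | slot);
	else
		out[n++] = (uint8_t)(EVT_SYSTEM_PRESENT | slot);

	/* The logs show the mismatch SEL after both present and absent SELs. */
	if (!s->slot_id_matches)
		out[n++] = (uint8_t)(EVT_SLOT_MISMATCH | slot);

	return n;
}
```

With peer slot 3, cable present, blade absent and a SlotID mismatch, the sketch yields 0x13 followed by 0x33, which is the order of the two SELs logged in test D.c.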
0dfd6ea to 6205ae1
Contributor
@DelphineChiu has updated the pull request. You must reimport the pull request before landing.
Contributor
@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
facebook-github-bot pushed a commit that referenced this pull request May 12, 2022
Summary: fby3.5: bb: Support inform sled cycle
- Add an SEL to inform peer BMC that BB BIC will execute the power sled cycle.
Dependency: #279
Pull Request resolved: #280
Test Plan:
- Build code: Pass
- Inform peer BMC: Pass
Log:
1. Sled cycle on slot1.
- Check log on slot3
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 03
root@bmc-oob:~# log-util --print all
1 slot1 2018-03-09 04:36:24 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:24, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1 slot1 2018-03-09 04:35:23 power-util SERVER_POWER_ON successful for FRU: 1
0 all 2018-03-09 04:35:23 show_sys_config Abnormal - slot3 instead of slot1
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM sensor monitoring enabled
2. Sled cycle on slot3.
- Check log on slot1
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 01
root@bmc-oob:~# log-util --print all
2018 Mar 09 05:05:52 log-util: User cleared all logs
1 slot1 2018-03-09 05:08:13 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:08:13, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1 slot1 2018-03-09 04:35:23 power-util SERVER_POWER_ON successful for FRU: 1
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM sensor monitoring enabled
Reviewed By: garnermic
Differential Revision: D36321503
Pulled By: GoldenBug
fbshipit-source-id: cf2e0a29a0bd451eb126c9f360bc735f8486e06d