fby3.5: bb: Support inform sled cycle#280

Closed
DelphineChiu wants to merge 2 commits into facebook:main from Wiwynn:Sara/fby3.5-bb-Inform_sled_cycle

Conversation

@DelphineChiu

fby3.5: bb: Support inform sled cycle
Summary:

  • Add an SEL to inform peer BMC that BB BIC will execute the power sled cycle.

Dependency: #279

Test plan:

  • Build code: Pass
  • Inform peer BMC: Pass

Log:

  1. Sled cycle on slot1.
  • Check log on slot3
    root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
    9C 9C 00 10 31 F0 00 03

root@bmc-oob:~# log-util --print all
1 slot1 2018-03-09 04:36:24 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:24, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1 slot1 2018-03-09 04:35:23 power-util SERVER_POWER_ON successful for FRU: 1
0 all 2018-03-09 04:35:23 show_sys_config Abnormal - slot3 instead of slot1
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type supported = 0x35
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM sensor monitoring enabled

  2. Sled cycle on slot3.
  • Check log on slot1
    root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
    9C 9C 00 10 31 F0 00 01
    root@bmc-oob:~# log-util --print all
    2018 Mar 09 05:05:52 log-util: User cleared all logs
    1 slot1 2018-03-09 05:08:13 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:08:13, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
    1 slot1 2018-03-09 04:35:23 power-util SERVER_POWER_ON successful for FRU: 1
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:35:24 ncsid FRU: 6 PLDM sensor monitoring enabled
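
When checking logs like the ones above by script rather than by eye, the SEL lines can be picked out with a regular expression. The following is a minimal sketch, assuming the `log-util` line layout shown in this PR; the helper name `parse_sel` and the returned field names are hypothetical and not part of any OpenBMC tool.

```python
import re

# Pattern for the "SEL Entry" portion of a log-util line, as it appears in
# the logs of this PR. The layout is assumed from those logs only.
SEL_RE = re.compile(
    r"SEL Entry: FRU: (?P<fru>\d+), "
    r"Record: (?P<record>.+?) \((?P<record_type>0x[0-9A-Fa-f]+)\), "
    r"Time: (?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), "
    r"Sensor: (?P<sensor>\w+) \((?P<sensor_num>0x[0-9A-Fa-f]+)\), "
    r"Event Data: \((?P<event_data>[0-9A-Fa-f]{6})\) (?P<message>.*)$"
)


def parse_sel(line):
    """Return a dict describing the SEL entry in `line`, or None if absent."""
    match = SEL_RE.search(line.strip())
    if match is None:
        return None  # not a SEL line (e.g. ncsid or healthd output)
    entry = match.groupdict()
    entry["fru"] = int(entry["fru"])
    entry["sensor_num"] = int(entry["sensor_num"], 16)
    entry["event_data"] = bytes.fromhex(entry["event_data"])
    return entry
```

A test script could then assert that exactly one entry with sensor `POWER_DETECT` appears on the peer BMC after a sled cycle.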
KaidanWu-wiwynn and others added 2 commits May 11, 2022 11:52
[Summary]
- Implement Class-2 Cable Detection for system present, system absent, cable absent and cable mismatch situations (Add SEL)
- Create a common add-SEL structure for BB BIC (BB BIC needs InF_target to determine which slot to send the SEL to)
- Cable Detection logic:
		┌───────────────────────────┐                ┌───────────────────────────────────────────┐
		│ BB BIC GPIO ISR A3/B7     │                │ Check Cable status ?  Cable Absent : Next │
		│ PRSNT_MB_BIC_SLOT3_BB_N_R ├───────────────>│ Check GPIO status  ? System Absent : Next │
		│ PRSNT_MB_BIC_SLOT1_BB_N_R │                │ Add System Present SEL                    │
		└───────────────────────────┘                └───────────────────────────────────────────┘
		┌───────────────────────────┐                           ▲                     │                     ┌────────────────────────────────┐
		│ BMC Send IPMI request     │                           │                     │                     │ Get GPIO SlotID                │
		│ NetFn:   OEM 0x30         ├───────────Fn Call─────────┘                     └──────Fn Return─────>│ Check SlotID ? Mismatch : Exit │
		│ Command: OEM 0xCB (Cable) │                                                                       │ Note: Slot1 = 0x3, Slot3 = 0x1 │
		└───────────────────────────┘                                                                       └────────────────────────────────┘

		Note: Cable detection only examines the peer blade, because the local blade must be present if its BMC is operable

- Known issue: the BMC request or the GPIO interrupt may occasionally call the function twice, causing duplicate SELs
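
The detection flow in the diagram above can be sketched in a few lines. This is an illustrative model only, not the BB BIC firmware: the function name, its boolean inputs, and the event-byte layout (event type in the high nibble, peer slot in the low nibble) are assumptions inferred from the diagram and from the Event Data values in the test logs (01/03, 11/13, 21/23, 31/33). It also assumes the SlotID check runs after every return of the status check.

```python
from enum import IntEnum


class PeerEvent(IntEnum):
    # High nibble of the first Event Data byte, as inferred from the logs.
    PRESENT = 0x0
    ABSENT = 0x1
    CABLE_ABSENT = 0x2
    MISMATCH = 0x3


def detect_peer(cable_connected, peer_gpio_present, slot_id_matches, peer_slot):
    """Return the first Event Data byte of each SEL the flow would add."""
    events = []
    # Step 1: cable status, then GPIO status, otherwise system present.
    if not cable_connected:
        events.append(PeerEvent.CABLE_ABSENT)
    elif not peer_gpio_present:
        events.append(PeerEvent.ABSENT)
    else:
        events.append(PeerEvent.PRESENT)
    # Step 2: after the status check returns, compare the GPIO SlotID.
    if not slot_id_matches:
        events.append(PeerEvent.MISMATCH)
    return [(event << 4) | peer_slot for event in events]
```

For example, `detect_peer(True, True, False, 3)` yields the bytes `0x03` and `0x33`, which matches the pair of SELs logged in test D.a.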

[Test Plan & Log]
1. Build code:			Pass
2. Cable Detection using decoded BMC test firmware:
	A. SLED cycle when system normal:
		a. Action:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:48:16 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:22    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:22, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:22    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:22    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:22    healthd          SLED Powered OFF at Fri Mar  9 04:35:22 2018
				0    all      2018-03-09 04:35:22    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:15 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

	B. BMC reboot after peer cable removed:
		a. Action on upper blade:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:37:22 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:37:41    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:41, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:39:14    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:14, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:39:14    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:39:14    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:39:14    healthd          BMC Reboot detected - caused by reboot command

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:45 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:54    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:54, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:37:25    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:25, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:37:25    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:37:25    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:37:25    healthd          BMC Reboot detected - caused by reboot command

	C.	BMC reboot after peer blade removed:
		a. Remove upper blade & Reboot lower BMC:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:45 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:37:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:37:15    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:37:15    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:37:15    healthd          BMC Reboot detected - caused by reboot command

		b. Remove lower blade & Reboot upper BMC:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:26 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:39:32    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:32, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:40:59    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:59, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:40:59    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:40:59    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:40:59    healthd          BMC Reboot detected - caused by reboot command

	D. BMC reboot when cable mismatch:
		a. Exchange both cables & Reboot upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:54:13 log-util: User cleared all logs
			->	1    slot1    2018-03-09 05:55:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
			->	1    slot1    2018-03-09 05:55:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				0    all      2018-03-09 05:55:51    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 05:55:51    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 05:55:51    healthd          BMC Reboot detected - caused by reboot command
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM sensor monitoring enabled

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:54:06 log-util: User cleared all logs

		b. Exchange both cables & Reboot lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:58:10 log-util: User cleared all logs

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 06:00:55 log-util: User cleared all logs
			->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
			->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 06:02:16    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 06:02:16    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 06:02:16    healthd          BMC Reboot detected - caused by reboot command

		c. Upper blade inserted into Slot1, peer blade not present:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:58:10 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

		d. Lower blade inserted into Slot3, peer blade not present:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 06:10:27 log-util: User cleared all logs
			->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 06:11:45    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 06:11:45    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 06:11:45    healthd          BMC Reboot detected - caused by reboot command

	E. Hot-remove and hot-add peer blade:
		a. Action on upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:12 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:25 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:39:55    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:55, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:40:08    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:08, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:36:28 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:36:38    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:38, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:36:48    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:48, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:42:30 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

	F. Hot-remove and hot-plug peer cable:
		a. Action on upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:36:55 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:21    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:21 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:37:23 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:37:36    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:36, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:37:47    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:47, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:40:15 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:40:21    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:40:35    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:35, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:43:29 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
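
For reading the SLOT_PRESENT entries in the logs above, the first Event Data byte appears to split into two nibbles. The sketch below decodes it on that assumption; the mapping is inferred from the log messages in this test plan, not taken from the firmware source, and `decode_slot_present` is a hypothetical helper name.

```python
# High nibble -> condition, inferred from the messages printed in the logs.
KINDS = {
    0x0: "present",
    0x1: "not detected",
    0x2: "cable not connected",
    0x3: "slot mismatch",
}


def decode_slot_present(event_data_hex):
    """Decode a 6-hex-digit Event Data string such as '13FFFF'.

    Returns (condition, slot), where slot is the low nibble of byte 1.
    """
    first_byte = int(event_data_hex[:2], 16)
    condition = KINDS.get(first_byte >> 4, "unknown")
    slot = first_byte & 0x0F
    return condition, slot
```

With this reading, `13FFFF` decodes to `("not detected", 3)`, in line with the "slot3(peer slot) not detected" message.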
Summary:
- Add an SEL to inform peer BMC that BB BIC will execute the power sled cycle.

Test plan:
- Build code: Pass
- Inform peer BMC: Pass

Log:
1. Sled cycle on slot1.
- Check log on slot3
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 03

root@bmc-oob:~# log-util --print all
1    slot1    2018-03-09 04:36:24    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:24, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1    slot1    2018-03-09 04:35:23    power-util       SERVER_POWER_ON successful for FRU: 1
0    all      2018-03-09 04:35:23    show_sys_config  Abnormal - slot3 instead of slot1
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type supported = 0x35
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM sensor monitoring enabled

2. Sled cycle on slot3.
- Check log on slot1
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 01
root@bmc-oob:~# log-util --print all
2018 Mar 09 05:05:52 log-util: User cleared all logs
1    slot1    2018-03-09 05:08:13    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:08:13, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1    slot1    2018-03-09 04:35:23    power-util       SERVER_POWER_ON successful for FRU: 1
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type supported = 0x35
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM sensor monitoring enabled
@facebook-github-bot added the CLA Signed label May 11, 2022
@DelphineChiu

Author

Dependency: #279

@facebook-github-bot

Contributor

@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@SaraSYLin SaraSYLin deleted the Sara/fby3.5-bb-Inform_sled_cycle branch May 25, 2022 10:57