yv3.5: bb: Implement Class-2 Cable Detection #279

Closed
DelphineChiu wants to merge 1 commit into facebook:main from Wiwynn:kaidan/Implement_Class2_Cable_Detection

@DelphineChiu


[Summary]

  • Implement Class-2 Cable Detection for system present, system absent, cable absent and cable mismatch situations (Add SEL)

  • Create a common add-SEL structure for BB BIC (BB BIC needs InF_target to determine which slot to send the SEL to)

  • Cable Detection logic:
    ┌───────────────────────────┐                ┌───────────────────────────────────────────┐
    │ BB BIC GPIO ISR A3/B7     │                │ Check Cable status ?  Cable Absent : Next │
    │ PRSNT_MB_BIC_SLOT3_BB_N_R ├───────────────>│ Check GPIO status  ? System Absent : Next │
    │ PRSNT_MB_BIC_SLOT1_BB_N_R │                │ Add System Present SEL                    │
    └───────────────────────────┘                └───────────────────────────────────────────┘
    ┌───────────────────────────┐                           ▲                     │                     ┌────────────────────────────────┐
    │ BMC Send IPMI request     │                           │                     │                     │ Get GPIO SlotID                │
    │ NetFn:   OEM 0x30         ├───────────Fn Call─────────┘                     └──────Fn Return─────>│ Check SlotID ? Mismatch : Exit │
    │ Command: OEM 0xCB (Cable) │                                                                       │ Note: Slot1 = 0x3, Slot3 = 0x1 │
    └───────────────────────────┘                                                                       └────────────────────────────────┘

      Note: Cable detection only examines the peer blade, because the local blade must be present if its BMC is operable

  • Known issue: A BMC request or GPIO interrupt sometimes calls the detection function twice, which logs duplicate SELs
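The decision flow in the diagram can be sketched in C as below. This is only an illustration under stated assumptions, not the code in this PR: the struct, enum, and function names are hypothetical, the inputs are taken as already-decoded booleans rather than raw GPIO reads, and the event byte layout (event type in the high nibble, slot number in the low nibble) is inferred from the Event Data values in the test logs (01/03, 11/13, 21/23, 31/33).

```c
#include <stdbool.h>
#include <stdint.h>

/* Event types, inferred from the high nibble of the first Event Data byte
 * in the logs. The names are hypothetical. */
enum cable_event {
    EVT_SYSTEM_PRESENT = 0x0,
    EVT_SYSTEM_ABSENT  = 0x1,
    EVT_CABLE_ABSENT   = 0x2,
    EVT_SLOT_MISMATCH  = 0x3,
};

/* Hypothetical snapshot of the inputs the diagram checks. */
struct peer_inputs {
    bool cable_connected;     /* result of "Check Cable status" */
    bool peer_present;        /* result of "Check GPIO status" */
    uint8_t peer_slot;        /* 1 or 3, as printed in the SEL text */
    uint8_t slot_id_read;     /* raw value from "Get GPIO SlotID" */
    uint8_t slot_id_expected; /* raw value expected for this position */
};

/* Pack event type and slot number into one event data byte. */
static uint8_t event_data1(enum cable_event evt, uint8_t slot)
{
    return (uint8_t)(((unsigned)evt << 4) | (slot & 0x0Fu));
}

/* Common check used by both the GPIO ISR path and the BMC request path:
 * cable absent, else system absent, else system present. */
static uint8_t check_peer(const struct peer_inputs *in)
{
    if (!in->cable_connected)
        return event_data1(EVT_CABLE_ABSENT, in->peer_slot);
    if (!in->peer_present)
        return event_data1(EVT_SYSTEM_ABSENT, in->peer_slot);
    return event_data1(EVT_SYSTEM_PRESENT, in->peer_slot);
}

/* BMC request path: run the common check, then compare slot IDs after
 * the function returns. Writes one or two event bytes to out[] and
 * returns how many were written. */
static int handle_bmc_request(const struct peer_inputs *in, uint8_t out[2])
{
    int n = 0;
    out[n++] = check_peer(in);
    if (in->slot_id_read != in->slot_id_expected)
        out[n++] = event_data1(EVT_SLOT_MISMATCH, in->peer_slot);
    return n;
}
```

With the cable connected, the peer present, and mismatching slot IDs for peer slot 3, this sketch yields the bytes 0x03 and 0x33, the same pair of event bytes seen in the cable-mismatch test case.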

[Test Plan & Log]

  1. Build code: Pass

  2. Cable Detection using decoded BMC test firmware:
    A. SLED cycle when system normal:
    a. Action:
    - Upper SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 04:48:16 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 04:35:22 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:22, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:35:22 ncsid FRU: 6 PLDM sensor monitoring enabled
    0 all 2018-03-09 04:35:22 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 04:35:22 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 04:35:22 healthd SLED Powered OFF at Fri Mar 9 04:35:22 2018
    0 all 2018-03-09 04:35:22 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018

     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:35:15 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
    

    B. BMC reboot after peer cable removed:
    a. Action on upper blade:
    - Lower SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 04:37:22 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 04:37:41 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:41, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
    -> 1 slot1 2018-03-09 04:39:14 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:14, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:39:14 ncsid FRU: 6 PLDM sensor monitoring enabled
    0 all 2018-03-09 04:39:14 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 04:39:14 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 04:39:14 healthd BMC Reboot detected - caused by reboot command

     b. Action on lower blade:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:35:45 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:35:54    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:54, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
     	->	1    slot1    2018-03-09 04:37:25    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:25, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:37:25    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:37:25    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:37:25    healthd          BMC Reboot detected - caused by reboot command
    

    C. BMC reboot after peer blade removed:
    a. Remove upper blade & Reboot lower BMC:
    - Lower SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 04:35:45 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 04:35:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
    -> 1 slot1 2018-03-09 04:37:15 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:37:15 ncsid FRU: 6 PLDM sensor monitoring enabled
    0 all 2018-03-09 04:37:15 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 04:37:15 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 04:37:15 healthd BMC Reboot detected - caused by reboot command

     b. Remove lower blade & Reboot upper BMC:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:39:26 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:39:32    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:32, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
     	->	1    slot1    2018-03-09 04:40:59    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:59, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:40:59    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:40:59    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:40:59    healthd          BMC Reboot detected - caused by reboot command
    

    D. BMC reboot when cable mismatch:
    a. Exchange both cables & Reboot upper blade:
    - Upper SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 05:54:13 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 05:55:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
    -> 1 slot1 2018-03-09 05:55:51 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    0 all 2018-03-09 05:55:51 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 05:55:51 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 05:55:51 healthd BMC Reboot detected - caused by reboot command
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 05:55:51 ncsid FRU: 6 PLDM sensor monitoring enabled

     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 05:54:06 log-util: User cleared all logs
    
     b. Exchange both cables & Reboot lower blade:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 05:58:10 log-util: User cleared all logs
    
     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 06:00:55 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
     	->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 06:02:16    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 06:02:16    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 06:02:16    healthd          BMC Reboot detected - caused by reboot command
    
     c. Upper blade inserted into Slot1, peer blade not present:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 05:58:10 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
     	->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
    
     d. Lower blade inserted into Slot3, peer blade not present:
     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 06:10:27 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
     	->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 06:11:45    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 06:11:45    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 06:11:45    healthd          BMC Reboot detected - caused by reboot command
    

    E. Hot-remove and hot-add peer blade:
    a. Action on upper blade:
    - Upper SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 04:39:12 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 04:35:20 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
    0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:20 2018
    0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018

     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:39:25 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:39:55    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:55, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
     	->	1    slot1    2018-03-09 04:40:08    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:08, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
    
     b. Action on lower blade:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:36:28 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:36:38    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:38, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
     	->	1    slot1    2018-03-09 04:36:48    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:48, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
    
     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:42:30 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
    

    F. Hot-remove and hot-plug peer cable:
    a. Action on upper blade:
    - Upper SEL:
    root@bmc-oob:~# log-util all --print
    2018 Mar 09 04:36:55 log-util: User cleared all logs
    -> 1 slot1 2018-03-09 04:35:21 ipmid SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type supported = 0x35
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 0 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 2 version = 1.1.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 4 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM type 5 version = 1.0.0.0
    6 nic 2018-03-09 04:35:21 ncsid FRU: 6 PLDM sensor monitoring enabled
    0 all 2018-03-09 04:35:21 healthd ASSERT: Verified boot failure (3,35)
    0 all 2018-03-09 04:35:21 healthd Verified boot failure reason: U-Boot FIT did not contain the /keys node
    0 all 2018-03-09 04:35:21 healthd SLED Powered OFF at Fri Mar 9 04:35:21 2018
    0 all 2018-03-09 04:35:21 healthd SLED Powered ON at Fri Mar 9 04:34:27 2018

     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:37:23 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:37:36    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:36, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
     	->	1    slot1    2018-03-09 04:37:47    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:47, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
    
     b. Action on lower blade:
     	- Upper SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:40:15 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:40:21    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
     	->	1    slot1    2018-03-09 04:40:35    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:35, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
    
     	- Lower SEL:
     		root@bmc-oob:~# log-util all --print
     		2018 Mar 09 04:43:29 log-util: User cleared all logs
     	->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
     		6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
     		0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
     		0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
     		0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
    
@facebook-github-bot added the CLA Signed label on May 11, 2022
@facebook-github-bot

Contributor

@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Comment thread common/service/ipmi/ipmi.c Outdated
Comment on lines +158 to +169
	if (status == IPMB_ERROR_FAILURE) {
		printf("Fail to post msg to InF_target 0x%x txqueue for addsel\n", msg->InF_target);
		SAFE_FREE(msg);
		return false;
	} else if (status == IPMB_ERROR_GET_MESSAGE_QUEUE) {
		printf("No response from InF_target 0x%x for addsel\n", msg->InF_target);
		SAFE_FREE(msg);
		return false;
	}

Contributor

Can we implement this as a switch statement instead?
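One way the suggested switch could look is sketched below. It is a standalone illustration, not the revision that was pushed: the `ipmb_error` enum here is a stand-in with placeholder values (only the two error names come from the quoted snippet), `addsel_status_ok` is a hypothetical helper, and freeing `msg` is left to the caller so the example compiles without the project's `SAFE_FREE` macro.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the project's IPMB status type. The numeric values and
 * the SUCCESS member are placeholders for this sketch. */
typedef enum {
    IPMB_ERROR_SUCCESS = 0,
    IPMB_ERROR_FAILURE,
    IPMB_ERROR_GET_MESSAGE_QUEUE,
} ipmb_error;

/* Returns false for the two statuses the original if/else chain rejects,
 * true for anything else. The caller frees the message on false. */
static bool addsel_status_ok(ipmb_error status, uint8_t inf_target)
{
    switch (status) {
    case IPMB_ERROR_FAILURE:
        printf("Fail to post msg to InF_target 0x%x txqueue for addsel\n", inf_target);
        return false;
    case IPMB_ERROR_GET_MESSAGE_QUEUE:
        printf("No response from InF_target 0x%x for addsel\n", inf_target);
        return false;
    default:
        return true;
    }
}
```

The `default` branch keeps the behavior of the if/else chain, where any status other than the two listed errors falls through and processing continues.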

@KaidanWu-wiwynn KaidanWu-wiwynn force-pushed the kaidan/Implement_Class2_Cable_Detection branch from 0e2e5c5 to 0dfd6ea Compare May 12, 2022 01:09
@facebook-github-bot

Contributor

@DelphineChiu has updated the pull request. You must reimport the pull request before landing.

[Summary]
- Implement Class-2 Cable Detection for system present, system absent, cable absent and cable mismatch situations (Add SEL)
- Create a common add-SEL structure for BB BIC (BB BIC needs InF_target to determine which slot to send the SEL to)
- Cable Detection logic:
		┌───────────────────────────┐                ┌───────────────────────────────────────────┐
		│ BB BIC GPIO ISR A3/B7     │                │ Check Cable status ?  Cable Absent : Next │
		│ PRSNT_MB_BIC_SLOT3_BB_N_R ├───────────────>│ Check GPIO status  ? System Absent : Next │
		│ PRSNT_MB_BIC_SLOT1_BB_N_R │                │ Add System Present SEL                    │
		└───────────────────────────┘                └───────────────────────────────────────────┘
		┌───────────────────────────┐                           ▲                     │                     ┌────────────────────────────────┐
		│ BMC Send IPMI request     │                           │                     │                     │ Get GPIO SlotID                │
		│ NetFn:   OEM 0x30         ├───────────Fn Call─────────┘                     └──────Fn Return─────>│ Check SlotID ? Mismatch : Exit │
		│ Command: OEM 0xCB (Cable) │                                                                       │ Note: Slot1 = 0x3, Slot3 = 0x1 │
		└───────────────────────────┘                                                                       └────────────────────────────────┘

		Note: Cable detection only examines the peer blade, because the local blade must be present if its BMC is operable

- Known issue: A BMC request or GPIO interrupt sometimes calls the detection function twice, which logs duplicate SELs

[Test Plan & Log]
1. Build code:			Pass
2. Cable Detection using decoded BMC test firmware:
	A. SLED cycle when system normal:
		a. Action:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:48:16 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:22    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:22, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:22    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:22    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:22    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:22    healthd          SLED Powered OFF at Fri Mar  9 04:35:22 2018
				0    all      2018-03-09 04:35:22    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:15 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

	B. BMC reboot after peer cable removed:
		a. Action on upper blade:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:37:22 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:37:41    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:41, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:39:14    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:14, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:39:14    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:39:14    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:39:14    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:39:14    healthd          BMC Reboot detected - caused by reboot command

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:45 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:54    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:54, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:37:25    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:25, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:37:25    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:37:25    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:37:25    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:37:25    healthd          BMC Reboot detected - caused by reboot command

	C. BMC reboot after peer blade removed:
		a. Remove upper blade & Reboot lower BMC:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:35:45 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:37:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:37:15    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:37:15    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:37:15    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:37:15    healthd          BMC Reboot detected - caused by reboot command

		b. Remove lower blade & Reboot upper BMC:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:26 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:39:32    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:32, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:40:59    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:59, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:40:59    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:40:59    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:40:59    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:40:59    healthd          BMC Reboot detected - caused by reboot command

	D. BMC reboot with cable mismatch:
		a. Exchange both cables & Reboot upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:54:13 log-util: User cleared all logs
			->	1    slot1    2018-03-09 05:55:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
			->	1    slot1    2018-03-09 05:55:51    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:55:51, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				0    all      2018-03-09 05:55:51    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 05:55:51    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 05:55:51    healthd          BMC Reboot detected - caused by reboot command
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 05:55:51    ncsid            FRU: 6 PLDM sensor monitoring enabled

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:54:06 log-util: User cleared all logs

		b. Exchange both cables & Reboot lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:58:10 log-util: User cleared all logs

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 06:00:55 log-util: User cleared all logs
			->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
			->	1    slot1    2018-03-09 06:02:15    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:02:15, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 06:02:16    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 06:02:16    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 06:02:16    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 06:02:16    healthd          BMC Reboot detected - caused by reboot command

		c. Upper blade inserted into Slot1, peer blade not present:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 05:58:10 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (33FFFF) Abnormal - slot1 instead of slot3 Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

		d. Lower blade inserted into Slot3, peer blade not present:
			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 06:10:27 log-util: User cleared all logs
			->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 06:11:45    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 06:11:45, Sensor: SLOT_PRESENT (0xCB), Event Data: (31FFFF) Abnormal - slot3 instead of slot1 Assertion
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 06:11:45    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 06:11:45    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 06:11:45    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 06:11:45    healthd          BMC Reboot detected - caused by reboot command

	E. Hot-remove and hot-add peer blade:
		a. Action on upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:12 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:39:25 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:39:55    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:39:55, Sensor: SLOT_PRESENT (0xCB), Event Data: (13FFFF) Abnormal - slot3(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:40:08    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:08, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:36:28 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:36:38    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:38, Sensor: SLOT_PRESENT (0xCB), Event Data: (11FFFF) Abnormal - slot1(peer slot) not detected Assertion
			->	1    slot1    2018-03-09 04:36:48    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:48, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:42:30 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

	F. Hot-remove and hot-plug peer cable:
		a. Action on upper blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:36:55 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:21    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:21 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:37:23 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:37:36    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:36, Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF) Slot3 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:37:47    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:37:47, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion

		b. Action on lower blade:
			- Upper SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:40:15 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:40:21    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:21, Sensor: SLOT_PRESENT (0xCB), Event Data: (21FFFF) Slot1 cable is not connected to the baseboard Assertion
			->	1    slot1    2018-03-09 04:40:35    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:40:35, Sensor: SLOT_PRESENT (0xCB), Event Data: (01FFFF) slot1(peer slot) present Assertion

			- Lower SEL:
				root@bmc-oob:~# log-util all --print
				2018 Mar 09 04:43:29 log-util: User cleared all logs
			->	1    slot1    2018-03-09 04:35:20    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:35:20, Sensor: SLOT_PRESENT (0xCB), Event Data: (03FFFF) slot3(peer slot) present Assertion
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type supported = 0x35
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
				6    nic      2018-03-09 04:35:21    ncsid            FRU: 6 PLDM sensor monitoring enabled
				0    all      2018-03-09 04:35:21    healthd          ASSERT: Verified boot failure (3,35)
				0    all      2018-03-09 04:35:21    healthd          Verified boot failure reason: U-Boot FIT did not contain the /keys node
				0    all      2018-03-09 04:35:21    healthd          SLED Powered OFF at Fri Mar  9 04:35:20 2018
				0    all      2018-03-09 04:35:21    healthd          SLED Powered ON at Fri Mar  9 04:34:27 2018
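
When reading the logs above, the first Event Data byte of each SLOT_PRESENT (0xCB) entry appears to carry two fields: the high nibble tracks the situation (0 = present, 1 = not detected, 2 = cable not connected, 3 = mismatch) and the low nibble is a slot ID (1 or 3). That layout is inferred only from the log lines in this test plan, not taken from the firmware source, so treat the following as a minimal sketch for scanning `log-util` output. The names `EVENT_KINDS` and `decode_slot_present` are hypothetical helpers, not part of any OpenBMC tool.

```python
import re

# ASSUMPTION: byte layout inferred from the SEL lines in this test plan.
# High nibble = event kind, low nibble = slot ID (1 or 3 in the logs).
EVENT_KINDS = {
    0x0: "present",
    0x1: "absent",
    0x2: "cable_absent",
    0x3: "mismatch",
}

# Matches the SLOT_PRESENT sensor and captures the 3-byte Event Data field,
# e.g. "Sensor: SLOT_PRESENT (0xCB), Event Data: (23FFFF)".
EVENT_DATA_RE = re.compile(
    r"Sensor: SLOT_PRESENT \(0xCB\), Event Data: \(([0-9A-Fa-f]{6})\)"
)


def decode_slot_present(line):
    """Return (kind, slot_id) for a SLOT_PRESENT SEL line, else None."""
    match = EVENT_DATA_RE.search(line)
    if match is None:
        return None  # not a SLOT_PRESENT entry (ncsid, healthd, ...)
    first_byte = int(match.group(1)[:2], 16)
    kind = EVENT_KINDS.get(first_byte >> 4, "unknown")
    slot_id = first_byte & 0x0F
    return kind, slot_id
```

For example, a line containing `Event Data: (23FFFF)` decodes to `("cable_absent", 3)`, matching the "Slot3 cable is not connected to the baseboard" text. For mismatch entries, the low nibble matches the second slot named in the message ("slot1 instead of slot3" is logged with `33FFFF`).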
@KaidanWu-wiwynn KaidanWu-wiwynn force-pushed the kaidan/Implement_Class2_Cable_Detection branch from 0dfd6ea to 6205ae1 Compare May 12, 2022 01:12
@facebook-github-bot

Contributor

@DelphineChiu has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

Contributor

@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request May 12, 2022
Summary:
fby3.5: bb: Support inform sled cycle

- Add an SEL to inform peer BMC that BB BIC will execute the power sled cycle.

Dependency: #279

Pull Request resolved: #280

Test Plan:
- Build code: Pass
- Inform peer BMC: Pass

Log:
1. Sled cycle on slot1.
- Check log on slot3
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 03

root@bmc-oob:~# log-util --print all
1    slot1    2018-03-09 04:36:24    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 04:36:24, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1    slot1    2018-03-09 04:35:23    power-util       SERVER_POWER_ON successful for FRU: 1
0    all      2018-03-09 04:35:23    show_sys_config  Abnormal - slot3 instead of slot1
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type supported = 0x35
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM sensor monitoring enabled

2. Sled cycle on slot3.
- Check log on slot1
root@bmc-oob:~# bic-util slot1 0xE0 0x02 0x9C 0x9C 0x00 0x10 0xC0 0xF0
9C 9C 00 10 31 F0 00 01
root@bmc-oob:~# log-util --print all
2018 Mar 09 05:05:52 log-util: User cleared all logs
1    slot1    2018-03-09 05:08:13    ipmid            SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2018-03-09 05:08:13, Sensor: POWER_DETECT (0xE1), Event Data: (000000) SLED_CYCLE by BB BIC Assertion
1    slot1    2018-03-09 04:35:23    power-util       SERVER_POWER_ON successful for FRU: 1
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 NIC AEN Supported: 0x7, AEN Enable Mask=0x7
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type supported = 0x35
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 0 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 2 version = 1.1.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 4 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM type 5 version = 1.0.0.0
6    nic      2018-03-09 04:35:24    ncsid            FRU: 6 PLDM sensor monitoring enabled

Reviewed By: garnermic

Differential Revision: D36321503

Pulled By: GoldenBug

fbshipit-source-id: cf2e0a29a0bd451eb126c9f360bc735f8486e06d
@KaidanWu-wiwynn KaidanWu-wiwynn deleted the kaidan/Implement_Class2_Cable_Detection branch June 7, 2022 05:43
