main_v2 - fby3.5: cl:Added memory thermal trip event. #235

Closed
DelphineChiu wants to merge 8 commits into
facebook:main_v2 from
Wiwynn:Lora/main_v2/Add_memtrip_event

Conversation

@DelphineChiu


fby3.5: cl:Added memory thermal trip event.
Summary:

  • Added event log for memory thermal trip assert event.

Dependency: #234

Test Plan:

  • Build code: Pass

Log:
Memory thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio conf GPIO0_E_H 26 out
Configuring GPIO0_E_H pin 26
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 0
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:34:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (11FFFF) CPU/Memory thermal trip Assertion

SOC thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:35:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (00FFFF) SOC thermal trip Assertion
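
The two SEL entries above differ only in the first event-data byte. As a rough illustration, here is a minimal Python sketch of that mapping. It assumes only the two values that appear in these logs (0x11 and 0x00); the table and function names are hypothetical and are not the BMC's actual decoder.

```python
# Illustrative only: first event-data byte -> description, taken from the
# two SEL entries in this PR's logs. Other values are not covered here.
THERMAL_TRIP_EVENTS = {
    0x00: "SOC thermal trip",
    0x11: "CPU/Memory thermal trip",
}


def describe_event(event_data_hex):
    """Describe a 3-byte event-data string such as '11FFFF'."""
    data = bytes.fromhex(event_data_hex)
    if len(data) != 3:
        raise ValueError("expected exactly 3 event-data bytes")
    # Only the first byte distinguishes the two events seen in the logs.
    return THERMAL_TRIP_EVENTS.get(data[0], "unknown event 0x%02X" % data[0])


if __name__ == "__main__":
    for sample in ("11FFFF", "00FFFF"):
        print(sample, "->", describe_event(sample))
```

Any byte outside the table is reported as unknown rather than guessed.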

@facebook-github-bot added the CLA Signed label Apr 18, 2022
@DelphineChiu

Author

Dependency: #234

RenChen-wiwynn and others added 8 commits April 18, 2022 16:05
Summary:
- Support MP5990 sensor device

Test Plan:
1. Build code: pass
2. [MPS] Check GPIOA7 and the MP5990 configure register(38h and 46h)
3. [MPS] Get MP5990 sensor reading: pass
4. [ADI] Get ADM1278 sensor reading: pass

Log:
1. Class type: class-1, 1ou present status: false, 2ou present status: true, board revision: EVT3(EFUSE)
root@bmc-oob:/root# bic-util slot1 --get_gpio|grep "HSC_SET_EN_R"
7 HSC_SET_EN_R: 1
root@bmc-oob:/root# bic-util slot1 0x18 0x52 0x5 0x16 0x2 0x38
BF 01
root@bmc-oob:/root# bic-util slot1 0x18 0x52 0x5 0x16 0x2 0x46
46 00
root@bmc-oob:/root# sensor-util slot1|grep HSC
HSC Temp                     (0xE) :   25.00 C     | (ok)
HSC Input Vol                (0x29) :   12.31 Volts | (ok)
HSC Output Cur               (0x30) :    0.25 Amps  | (ok)
HSC Input Pwr                (0x39) :    0.00 Watts | (ok)

2. Class type: class-1, 1ou present status: false, 2ou present status: false, board revision: EVT3(EFUSE)
root@bmc-oob:/root# bic-util slot1 --get_gpio|grep "HSC_SET_EN_R"
7 HSC_SET_EN_R: 0
root@bmc-oob:/root# bic-util slot1 0x18 0x52 0x5 0x16 0x2 0x46
28 00
root@bmc-oob:/root# bic-util slot1 0x18 0x52 0x5 0x16 0x2 0x38
04 01
root@bmc-oob:/root# sensor-util slot1|grep HSC
HSC Temp                     (0xE) :   25.00 C     | (ok)
HSC Input Vol                (0x29) :   12.44 Volts | (ok)
HSC Output Cur               (0x30) :    0.25 Amps  | (ok)
HSC Input Pwr                (0x39) :    0.00 Watts | (ok)

3. Class type: class-1, 1ou present status: false, 2ou present status: false, board revision: POC
root@bmc-oob:/root# sensor-util slot3|grep HSC
HSC Temp                     (0xE) :   27.62 C     | (ok)
HSC Input Vol                (0x29) :   12.00 Volts | (ok)
HSC Output Cur               (0x30) :   10.52 Amps  | (ok)
HSC Input Pwr                (0x39) :  129.35 Watts | (ok)
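
The raw `bic-util` reads of registers 38h and 46h above each return two bytes. Assuming they follow the usual SMBus/PMBus word order (low byte first), which this commit message does not state, a minimal Python sketch for turning the printed bytes into a 16-bit value could look like this. The function name is hypothetical.

```python
def word_from_response(resp):
    """Convert a two-byte response such as 'BF 01' into an integer.

    Assumption: the first printed byte is the low byte (little-endian).
    """
    raw = bytes.fromhex(resp)  # fromhex ignores the space between bytes
    if len(raw) != 2:
        raise ValueError("expected exactly 2 bytes")
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    for sample in ("BF 01", "46 00", "28 00", "04 01"):
        print(sample, "-> 0x%04X" % word_from_response(sample))
```

Under that assumption, `BF 01` reads as 0x01BF and `04 01` as 0x0104.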
Summary:
- K_WORK_DELAYABLE_DEFINE needs a callback function with the function parameter type of "struct k_work *".
- util_spi.c had a missing header that needed to be included as well as an improperly formatted print statement.
- Mark card_type_1ou that is currently unused in Yv3.5 CL as unused to silence compiler warnings.
- Initialized status in fall through case.

Test Plan:
- Build code: Pass
Summary:
- Support TMP431 sensor device
  For EVT3 ADI system, the "HSC Temp" and "MB Outlet Temp" should be read from TMP431 chip.
  For EVT3 MPS system, the "HSC Temp" sensor value is read from MP5990 and "MB Outlet Temp" is read from TMP75.

Test Plan:
1. Build code: pass
2. [EVT3 MPS] Check "HSC Temp" and "MB Outlet Temp" sensor reading: pass
3. [EVT3 ADI] Check "HSC Temp" and "MB Outlet Temp" sensor reading: pass
4. [POC] Check "HSC Temp" and "MB Outlet Temp" sensor reading: pass

Log:
1. Class type: class-1, 1ou present status: false, 2ou present status: true, board revision: EVT3(ADI)
root@bmc-oob:~# sensor-util slot1|grep "HSC\|MB Outlet Temp"
MB Outlet Temp               (0x2) :   25.44 C     | (ok)
HSC Temp                     (0xE) :   23.81 C     | (ok)
HSC Input Vol                (0x29) :   12.00 Volts | (ok)
HSC Output Cur               (0x30) :    0.28 Amps  | (ok)
HSC Input Pwr                (0x39) :    3.53 Watts | (ok)

2. Class type: class-1, 1ou present status: false, 2ou present status: false, board revision: EVT3(MPS)
root@bmc-oob:~# sensor-util slot1|grep "HSC\|MB Outlet Temp"
MB Outlet Temp               (0x2) :   24.00 C     | (ok)
HSC Temp                     (0xE) :   25.00 C     | (ok)
HSC Input Vol                (0x29) :   12.22 Volts | (ok)
HSC Output Cur               (0x30) :    0.25 Amps  | (ok)
HSC Input Pwr                (0x39) :    6.00 Watts | (ok)

3. Class type: class-1, 1ou present status: false, 2ou present status: false, board revision: POC
root@bmc-oob:~# sensor-util slot3|grep "HSC\|MB Outlet Temp"
MB Outlet Temp               (0x2) :   33.00 C     | (ok)
HSC Temp                     (0xE) :   29.52 C     | (ok)
HSC Input Vol                (0x29) :   12.00 Volts | (ok)
HSC Output Cur               (0x30) :    8.82 Amps  | (ok)
HSC Input Pwr                (0x39) :  108.70 Watts | (ok)
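
The routing described in this commit's summary can be written as a small lookup table. The sketch below is Python for illustration only (the firmware itself is not Python), and the names `SENSOR_SOURCE` and `sensor_source` are hypothetical. POC boards are left out because the summary does not say which chips they use.

```python
# (board revision, sensor name) -> chip the reading comes from,
# as stated in the commit summary for EVT3 ADI and EVT3 MPS systems.
SENSOR_SOURCE = {
    ("EVT3 ADI", "HSC Temp"): "TMP431",
    ("EVT3 ADI", "MB Outlet Temp"): "TMP431",
    ("EVT3 MPS", "HSC Temp"): "MP5990",
    ("EVT3 MPS", "MB Outlet Temp"): "TMP75",
}


def sensor_source(board, sensor):
    """Return the chip that provides `sensor` on `board`."""
    try:
        return SENSOR_SOURCE[(board, sensor)]
    except KeyError:
        # Combinations the summary does not describe are rejected, not guessed.
        raise ValueError("no source listed for %s / %s" % (board, sensor))


if __name__ == "__main__":
    print(sensor_source("EVT3 ADI", "HSC Temp"))
    print(sensor_source("EVT3 MPS", "MB Outlet Temp"))
```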
Summary:
- Add Zephyr Kernel patches
1. drivers: ipmb: Extend the ipmb buffer
2. drivers: i2c: Correct the timeout time as 35ms
3. peci: aspeed: Avoid race condition of accessing peci device.

Test Plan:
1. Build code: pass
Summary:
- To avoid losing post codes, change the point at which get/send post code stops from post complete to host power off.
- Change the reference signal used by get/send post code.

Test Plan:
- Build code: Pass
- Test DC cycle/reset/BIC reboot: Pass

Log:
Before modifying the get/send post code stop time

root@bmc-oob:~# bic-util slot1 --get_post_code
util_get_postcode: returns 244 bytes
92 92 92 92 92 92 92 92 99 92 98 97 92 92 92 92
92 92 92 92 92 91 99 92 92 92 92 92 92 92 92 EF
96 95 94 94 94 94 94 94 94 94 94 94 94 94 94 94
94 94 92 91 79 70 68 61 4F 47 42 41 41 7F 00 00
15 0D 0C 0B 06 04 00 50 22 02 23 03 EE ED EC EB
E9 E6 AF AF AF BF 7E C6 CE BC BC BC BC BC CC 7E
DC CA CA B7 7E 70 7E D1 7E D1 7E D0 7E D0 7E D0
7E 7E BB BB BB BB BB BB BB BB BB BB CB 7E 70 70
70 B9 BA DB D9 DA C9 D7 B8 B8 B8 B8 B8 B8 B8 B8
B8 B7 B7 B7 B7 B7 B7 B7 B7 B9 70 D6 D2 7E D2 7E
7E BE BE 7E B7 70 70 7E 70 7E B7 B7 B6 B6 B7 B7
7E 7E 7E B6 B6 B7 B0 B6 B6 B6 B3 B3 B3 B3 B3 B3
B3 C6 B2 C5 B8 7E B4 7E B6 B1 B1 B1 7E 7E 70 7E
C2 7E B1 B1 70 C1 7E B0 CD 73 7E CF 7E B5 B0 AF
AF E5 E3 E4 E1 E0 E0 E0 AE AA A8 A9 A9 A7 A7 A7
A2 A2 A9 A9

After modifying the get/send post code stop time

root@bmc-oob:~# bic-util slot1 --get_post_code
util_get_postcode: returns 244 bytes
9C 9C 00 E3 E3 E3 AA 84 B1 AD D9 AD D9 AD 92 92
92 92 92 99 92 98 97 92 92 92 92 92 92 91 99 92
92 92 92 92 EF 96 95 94 94 94 94 94 94 94 94 94
94 94 94 94 92 91 79 70 68 61 4F 47 42 41 40 7F
00 7F 15 0D 0C 0C 06 04 00 50 22 02 23 03 EE ED
EC EB E9 E7 E6 AF AF AF BF 7E C6 CE BC BC BC BC
BC CC 7E DC CA CA B7 7E 70 7E D1 7E D1 7E D0 7E
D0 7E D0 7E 7E BB BB BB BB BB BB BB BB BB BB CB
7E 70 70 70 B9 BA DB D9 DA C9 D7 B8 B8 B8 B8 B8
B8 B8 B8 B8 B7 B7 B7 B7 B7 B7 B7 B7 B9 70 D6 D2
7E D2 7E 7E BE BE 7E B7 70 70 7E 70 7E B7 B7 B6
B6 B7 B7 7E 7E 7E B6 B6 B7 B0 B6 B6 B6 B3 B3 B3
B3 B3 B3 B3 C6 B2 C5 B8 7E B4 7E B6 B1 B1 B1 7E
7E 70 7E C2 7E B1 B1 70 C1 7E B0 CD 73 7E CF 7E
BF B0 AF AF E5 E3 E4 E1 E0 E0 E0 AE AA A8 A9 A9
A7 A7 A7 A2 A2 A9 A9
Summary:
- Fix the problem that the BB BIC cannot be reset.
The root cause is that Zephyr SDK04 changed how the wdt nodes are structured in the dts.
If wdt0 is not enabled first, the kernel cannot find wdt1.

Test plan:
- Build code: Pass
- Warm reset: Pass
- Cold reset: Pass

Log:
1. Check that warm reset completes successfully.
- Before fix
[BIC console]
uart:~$ kernel reboot warm
No device named wdt1.
Failed to reboot: spinning endlessly...

- After fix
[BIC console]
uart:~$ kernel reboot warm0)I00:00:00.000,000] <inf> usb_dc_aspeed: select ep[0x81] as IN endpoint
[00:00:00.000,000] <inf> usb_dc_aspeed: select ep[0x82] as IN endpoint
[00:00:00.000,000] <wrn> usb_dc_aspeed: pre-selected ep[0x1] as IN endpoint
[00:00:00.000,000] <wrn> usb_dc_aspee*** Booting Zephyr OS build v00.01.04-3-g32eed3dd510b  ***
Hello, wellcome to yv35 baseboard 2022.1.1
...

2. Check that cold reset completes successfully.
- Before fix
[BIC console]
uart:~$ kernel reboot cold
No device named wdt1.
Failed to reboot: spinning endlessly...

- After fix
[BIC console]
uart:~$ kernel reboot cold0)I00:00:00.000,000] <inf> usb_dc_aspeed: select ep[0x81] as IN endpoint
[00:00:00.000,000] <inf> usb_dc_aspeed: select ep[0x82] as IN endpoint
[00:00:00.000,000] <wrn> usb_dc_aspeed: pre-selected ep[0x1] as IN endpoint
[00:00:00.000,000] <wrn> usb_dc_aspee*** Booting Zephyr OS build v00.01.04-3-g32eed3dd510b  ***
Hello, wellcome to yv35 baseboard 2022.1.1
...
Summary:
- Added event log for memory thermal trip assert event.

Test Plan:
- Build code: Pass

Log:
Memory thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio conf GPIO0_E_H 26 out
Configuring GPIO0_E_H pin 26
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 0
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:34:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (11FFFF) CPU/Memory thermal trip Assertion

SOC thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:35:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (00FFFF) SOC thermal trip Assertion
@LoraLin1 force-pushed the Lora/main_v2/Add_memtrip_event branch from bedd9c0 to 0d5bf9f April 18, 2022 11:15
@facebook-github-bot

Contributor

@GoldenBug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Apr 19, 2022
Summary:
fby3.5: cl:Added memory thermal trip event.

- Added event log for memory thermal trip assert event.

Dependency: #234

Pull Request resolved: #235

Test Plan:
- Build code: Pass

Log:
Memory thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio conf GPIO0_E_H 26 out
Configuring GPIO0_E_H pin 26
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 0
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:34:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (11FFFF) CPU/Memory thermal trip Assertion

SOC thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:35:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (00FFFF) SOC thermal trip Assertion

Reviewed By: garnermic

Differential Revision: D35731220

Pulled By: GoldenBug

fbshipit-source-id: fec9fcb1fb8809f9e2245b9963e7bf139fcb8301
@DelphineChiu

Author

Closing the PR since the code has been merged.

facebook-github-bot pushed a commit that referenced this pull request Apr 19, 2022
Summary:
fby3.5: cl:Add read and write BIC register command

- Add OEM command to read and write BIC register

Dependency: #235

Pull Request resolved: #236

Test Plan:
- Build Code: Pass
- Command Test: Pass

Log:
read register 0x7e7b0300
[BIC console]
uart:~$ md 0x7e7b0300

[7e7b0300] 00000001
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x68 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x4
9C 9C 00 01 00 00 00

write 2 bytes 0x2211 to register 0x7e7b0300
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x69 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x2 0x11 0x22
9C 9C 00

read register 0x7e7b0300
[BIC console]
uart:~$ md 0x7e7b0300

[7e7b0300] 00002211
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x68 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x4
9C 9C 00 11 22 00 00

Reviewed By: garnermic

Differential Revision: D35731221

Pulled By: GoldenBug

fbshipit-source-id: e2042e36cb48736aa7fcea8833ab49987c97bce9
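
The byte layout of the read and write commands in the log above can be reconstructed from the examples: a first byte of 0xe0, a command byte (0x68 to read, 0x69 to write), the three bytes 9c 9c 00, the register address low byte first, a length, and for writes the data low byte first. The Python sketch below rebuilds those argument lists. It is inferred from the logged examples only, not from the firmware source; all names are hypothetical, and treating 9c 9c 00 as a fixed prefix is an assumption.

```python
import struct

FIRST_BYTE = 0xE0           # first argument passed to bic-util in the logs
CMD_READ_REG = 0x68         # command byte used for the read examples
CMD_WRITE_REG = 0x69        # command byte used for the write example
PREFIX = (0x9C, 0x9C, 0x00)  # assumed fixed prefix, echoed in every response


def read_reg_args(addr, length):
    """Arguments for reading `length` bytes from a 32-bit register address."""
    return [FIRST_BYTE, CMD_READ_REG, *PREFIX, *struct.pack("<I", addr), length]


def write_reg_args(addr, data):
    """Arguments for writing `data` (bytes, low byte first) to `addr`."""
    return [FIRST_BYTE, CMD_WRITE_REG, *PREFIX,
            *struct.pack("<I", addr), len(data), *data]


def decode_read(resp_hex):
    """Strip the echoed prefix and return the value (low byte first)."""
    resp = bytes.fromhex(resp_hex)
    if tuple(resp[:3]) != PREFIX:
        raise ValueError("unexpected response prefix")
    return int.from_bytes(resp[3:], "little")


def fmt(args):
    """Format arguments the way they would be typed after 'bic-util slot1'."""
    return " ".join("0x%02x" % b for b in args)


if __name__ == "__main__":
    print(fmt(read_reg_args(0x7E7B0300, 4)))
    print(fmt(write_reg_args(0x7E7B0300, (0x2211).to_bytes(2, "little"))))
    print("0x%08x" % decode_read("9C 9C 00 11 22 00 00"))
```

The formatter pads every byte to two hex digits, so the length prints as `0x04` rather than the `0x4` typed in the log; the byte values are the same.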
facebook-github-bot pushed a commit that referenced this pull request Apr 26, 2022
Summary:
fby3.5: cl:Added memory thermal trip event.

- Added event log for memory thermal trip assert event.

Dependency: #234

Pull Request resolved: #235

Test Plan:
- Build code: Pass

Log:
Memory thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio conf GPIO0_E_H 26 out
Configuring GPIO0_E_H pin 26
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 0
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:34:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (11FFFF) CPU/Memory thermal trip Assertion

SOC thermal trip event
[BIC console]
uart:~$ gpio get GPIO0_E_H 26
Reading GPIO0_E_H pin 26
Value 1
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 1
uart:~$ gpio conf GPIO0_A_D 20 out
Configuring GPIO0_A_D pin 20
uart:~$ gpio get GPIO0_A_D 20
Reading GPIO0_A_D pin 20
Value 0

[BMC console]
root@bmc-oob:~# log-util slot1 --print
SEL Entry: FRU: 1, Record: Standard (0x02), Time: 2022-03-31 00:35:09, Sensor: SYSTEM_STATUS (0x10), Event Data: (00FFFF) SOC thermal trip Assertion

Reviewed By: garnermic

Differential Revision: D35941588

Pulled By: GoldenBug

fbshipit-source-id: 95360b8fe627649f2358e84338d59379da69a5dd
facebook-github-bot pushed a commit that referenced this pull request Apr 26, 2022
Summary:
fby3.5: cl:Add read and write BIC register command

- Add OEM command to read and write BIC register

Dependency: #235

Pull Request resolved: #236

Test Plan:
- Build Code: Pass
- Command Test: Pass

Log:
read register 0x7e7b0300
[BIC console]
uart:~$ md 0x7e7b0300

[7e7b0300] 00000001
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x68 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x4
9C 9C 00 01 00 00 00

write 2 bytes 0x2211 to register 0x7e7b0300
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x69 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x2 0x11 0x22
9C 9C 00

read register 0x7e7b0300
[BIC console]
uart:~$ md 0x7e7b0300

[7e7b0300] 00002211
[BMC console]
root@bmc-oob:~# bic-util slot1 0xe0 0x68 0x9c 0x9c 0x00 0x00 0x03 0x7b 0x7e 0x4
9C 9C 00 11 22 00 00

Reviewed By: garnermic

Differential Revision: D35941583

Pulled By: GoldenBug

fbshipit-source-id: 477405a14f8ced5e79a56300342583f9a6479a3f
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

5 participants