Skip to content

fby3.5: cl: Modified IPMB architecture#47

Closed
rainlin024 wants to merge 1 commit into
facebook:mainfrom
rainlin024:fby3.5-Modified_IPMB_architecture
Closed

fby3.5: cl: Modified IPMB architecture#47
rainlin024 wants to merge 1 commit into
facebook:mainfrom
rainlin024:fby3.5-Modified_IPMB_architecture

Conversation

@rainlin024

Copy link
Copy Markdown

Summary:

  • There were some threads creating with CMSIS-OS2 API and which required fixed memory space for CMSIS-OS2.
    For Yv3.5, it requires IPMB TX and RX threads for four I2C buses and occupying too much memory to fix memory for maximum usage.
    Modified OS API from CMSIS-OS2 API to Zephyr kernel API, including threads, mutex, message queue.
    Removed memory reservation for CMSIS-OS2 stacks.
  • To reduce threads stack size, modified IPMB buffer from local variable to pointer and only allocate when process running.
  • Removed worker from common code, in that Zephyr kernel provided the service already.

Test plan:
-Build code: Pass
-Stress IPMB: Pass
Stressed get ME version for over thousands times.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 8, 2021
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@GoldenBug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Comment thread common/ipmi/ipmb.c Outdated
ipmi_msg_cfg *ptr_start = pnode;

osMutexAcquire(mutex_id[index], osWaitForever);
k_mutex_lock(&mutex_id[index], K_FOREVER);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really wait forever? It seems better to use a timeout as documented in the Zephyr doc at https://docs.zephyrproject.org/latest/reference/kernel/synchronization/mutexes.html

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For function insert node, find node and timeout handler, in timeout case, it risks a condition that BIC sending IPMB success but fail to record IPMB request package, which makes BIC take IPMB response as invalid package according to IPMI specification.
To avoid thread pending forever in mutex, we think we can add mutex status check. If BIC lock mutex fail due to one second timeout, will return directly and record a mutex lock fail in BIC console.
Is that make sense to you? Or do you think there is a better solution?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main issue is to not hang forever. Having a one second timeout and recording the lock failure is sufficient to meet this requirement of not hanging the BMC if the mutex is unable to be acquired.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we will fix it soon.

Comment thread common/ipmi/ipmb.c Outdated
ipmi_msg_cfg *ptr_start = pnode;

osMutexAcquire(mutex_id[index], osWaitForever);
k_mutex_lock(&mutex_id[index], K_FOREVER);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above...Shouldn't this timeout at some point?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comment in insert_node case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity, I agree with the same change as indicated in the insert_node case.

Comment thread common/ipmi/ipmb.c

while (1) {
osMessageQueueGet(ipmb_txqueue[ipmb_cfg.index], (void *)&current_msg_tx, NULL, osWaitForever); // Wait for OS queue send interrupt
current_msg_tx = (struct ipmi_msg_cfg *) malloc(sizeof(struct ipmi_msg_cfg));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a check here to ensure the memory was allocated?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add if memory NULL check.

Comment thread common/ipmi/ipmb.c
while (1) {
osMessageQueueGet(ipmb_txqueue[ipmb_cfg.index], (void *)&current_msg_tx, NULL, osWaitForever); // Wait for OS queue send interrupt
current_msg_tx = (struct ipmi_msg_cfg *) malloc(sizeof(struct ipmi_msg_cfg));
k_msgq_get(&ipmb_txqueue[ipmb_cfg.index], (ipmi_msg_cfg *)current_msg_tx, K_FOREVER); // Wait for OS queue send interrupt

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a timeout here?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In architecture design, Tx thread is only in charge of message sending and driver status checking.
So for mutex in Tx thread, we would like to keep it waiting for OS ISR at the moment there is another process requesting sending message by putting mqueue, instead of keeping polling mqueue. It helps a lot in reducing task switch and optimize BIC performance.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with leaving it as it is to reduce the task switching and optimize performance for this instance.

Comment thread common/ipmi/ipmb.c Outdated

} else { // TODO: else if (ipmb_cfg.Inf == I3C_IF)
printf("IPMB %d using not support interface: %x\n", ipmb_cfg.index, ipmb_cfg.Inf);
free(current_msg_tx);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a check that current_msg_tx is not NULL before freeing it?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add if memory NULL check.

Comment thread common/ipmi/ipmb.c
req_cfg.retries = 0;
/* Blocks here until is able put message in tx queue */
if (osMessageQueuePut(ipmb_txqueue[index], &req_cfg, 0, osWaitForever) != osOK) {
if (k_msgq_put(&ipmb_txqueue[index], &req_cfg, K_FOREVER) != osOK) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout instead of forever?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add mqueue status check and return corresponding ipmb_error for project layer handling.

Comment thread common/ipmi/ipmb.c Outdated
ipmb_error ipmb_send_response(ipmi_msg *resp, uint8_t index)
{
osMutexAcquire(mutex_send_res, osWaitForever);
k_mutex_lock(&mutex_send_res, K_FOREVER);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout vs. forever?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add timeout and mutex status check, returning ipmb_error_timeout in mutex timeout case.

Comment thread common/ipmi/ipmb.c
if (k_msgq_put(&ipmb_txqueue[index], &resp_cfg, K_FOREVER) != osOK) {
//free(resp_cfg);
osMutexRelease(mutex_send_res);
k_mutex_unlock(&mutex_send_res);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout vs. forever?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add mqueue status check and return corresponding ipmb_error for project layer handling for line 576.

Comment thread common/ipmi/ipmb.c Outdated
}

osMutexAcquire(mutex_id[index], osWaitForever);
k_mutex_lock(&mutex_id[index], K_FOREVER);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout vs. Forever?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comment in insert_node case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with making the same change as is mentioned in insert_node case comment.

Comment thread meta-facebook/yv35-cl/prj.conf Outdated
#CONFIG_CMSIS_V2_MSGQ_MAX_DYNAMIC_SIZE=32768
CONFIG_HEAP_MEM_POOL_SIZE=32768
CONFIG_CMSIS_V2_MSGQ_MAX_DYNAMIC_SIZE=16384
CONFIG_CMSIS_V2_MSGQ_MAX_DYNAMIC_SIZE=8000

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be set to 8192 to be consistent with the max sizes aligning with 4K boundaries?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we will check if any side effect with setting to 8192 and fix it.

@garnermic

Copy link
Copy Markdown
Contributor

I agree with the proposed changes and recommendations for changes (or in the case of the tx thread, leaving it as is). Please move forward with implementing these changes.

Summary:
- There were some threads creating with CMSIS-OS2 API and which required fixed memory space for CMSIS-OS2.
  For Yv3.5, it requires IPMB TX and RX threads for four I2C buses and occupying too much memory to fix memory for maximum usage.
  Modified OS API from CMSIS-OS2 API to Zephyr kernel API, including threads, mutex, message queue.
  Removed memory reservation for CMSIS-OS2 stacks.
- To reduce threads stack size, modified IPMB buffer from local variable to pointer and only allocate when process running.
- Removed worker from common code, in that Zephyr kernel provided the service already.

Test plan:
-Build code: Pass
-Stress IPMB: Pass
Stressed get ME version for over thousands times.
@rainlin024 rainlin024 force-pushed the fby3.5-Modified_IPMB_architecture branch from f1647c2 to aefdfe8 Compare November 11, 2021 02:21
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@rainlinWW has updated the pull request. You must reimport the pull request before landing.

@rainlin024

Copy link
Copy Markdown
Author

I will close this PR and PR new commit again.
Thanks for your review and suggestion.

@rainlin024 rainlin024 closed this Nov 11, 2021
@rainlin024 rainlin024 deleted the fby3.5-Modified_IPMB_architecture branch November 30, 2021 03:35
HungYi-Li pushed a commit to HungYi-Li/OpenBIC that referenced this pull request May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

4 participants