More fixes towards v1.0 #52

wnienhaus · 2021-09-29T18:55:45Z

Some more fixes to move us towards being able to assemble all the currently skipped examples from the binutils-esp32ulp test suite (to resolve #49).

This PR contains a few fixes - felt it was easier to send them as one PR, than 1 PR per commit.

The commits are independent of each other and can be reviewed one by one. They contain commit messages explaining what they each fix. Each commit also has a reference to the issue it fixed in the relevant binutils-esp32ulp test.

The last commits enables the tests that can now pass because of the fixes. 2 more files to go.

Offsets specified as immediate values are given in bytes but the underlying machine instruction needs a word offset. Thus, we need to convert immediate values to words by dividing by 4. This commit contributes to being able to eventually assemble the esp32ulp_all.s test from binutils-esp32ulp. It addresses this line: https://github.com/espressif/binutils-esp32ulp/blob/249ec34cc2c9574a86f3f86bbb175a863f988bcf/gas/testsuite/gas/esp32ulp/esp32/esp32ulp_all.s#L109

This was a previously untested and incorrectly handled case. There was already some offset manipulation to account for the extra instructions for handling non-native conditions. Because of the way symbol offsets are calculated by py-esp32-ulp, namely that extra instructions are already included in the offsets, the relative distance to a forward lying label is 1 instruction too much. Or put another way, when expanding a jump into 2 instructions the "current offset" refers to the first instruction, while it's only the second instruction that jumps to the intended label, so that difference must be accounted for. Thus, for symbols we need to always subtract 1 from the offset. For immediate values, we only subtract 1 for a backward jump.

binutil-esp32ulp accepts an undocumented 4th parameter for the ADC instruction, even though this parameter is unused as per binutils' current code. This commit contributes to being able to eventually assemble the esp32ulp_all.s test from binutils-esp32ulp. It addresses this line: https://github.com/espressif/binutils-esp32ulp/blob/249ec34cc2c9574a86f3f86bbb175a863f988bcf/gas/testsuite/gas/esp32ulp/esp32/esp32ulp_all.s#L174

So far line parsing made many assumptions on the format of lines. This commit makes line parsing more flexible and tolerable to different formats. The change is in how labels are identified. Now any symbol that is first on the line (excluding whitespace) and that is followed directly with a colon (:), will be identified as a label. Labels allow all characters that are valid for symbol names, including ._$. This commit contributes to being able to eventually assemble the esp32ulp_all.s test from binutils-esp32ulp. It addresses this line: [esp32ulp_all.s:2](https://github.com/espressif/binutils-esp32ulp/blob/249ec34cc2c9574a86f3f86bbb175a863f988bcf/gas/testsuite/gas/esp32ulp/esp32/esp32ulp_all.s#L2) (no space between colon and next character) and also this line: [esp32ulp_globals.s:92](https://github.com/espressif/binutils-esp32ulp/blob/249ec34cc2c9574a86f3f86bbb175a863f988bcf/gas/testsuite/gas/esp32ulp/esp32/esp32ulp_globals.s#L92) (label indented).

The last few commits/fixes allow these tests to now pass so we no longer need to skip them.

ThomasWaldmann · 2021-09-29T20:58:40Z

esp32_ulp/opcodes.py

@@ -619,7 +619,7 @@ def i_jump(target, condition='--'):
        raise ValueError("invalid flags condition")
    if target.type == IMM or target.type == SYM:
        _bx.dreg = 0
-        _bx.addr = get_abs(target)
+        _bx.addr = get_abs(target) if target.type == SYM else get_abs(target) >> 2  # bitwise version of "// 4"


I don't get this.

If it is a symbol, it directly uses its value (absolute address) else it uses an immediate offset // 4?

Considering everything else in the opcode is the same, how does the cpu know what to do? like whether it is an offset or an absolute address?

In both cases what the machine instruction gets is an absolute address (not relative offset). The absolute address is either the address of a label, or an absolute address specified as immediate value.

What is different in the two cases is that the assembler instruction must take an address given in bytes, while the machine instruction needs to contain the address in words. Because in py-esp32-ulp we already track label addresses in words internally, we do not need to divide those symbol addresses by 4. But for immediate values we need to (see: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/ulp_instruction_set.html#note-about-addressing).

Does that make sense, or did I miss what you asked?

However... I did just discover one more case for the JUMP instruction, where we don't properly divide by 4. So this current logic is not 100% perfect. It appears to be that it's not only IMM values that must be divided by 4, but also symbol values that we internally label with the ABS type (as opposed to REL). Those ABS type symbols are defined by .set. So maybe the logic here should really be: Is what was given to us a relocatable thing (label) or not. If yes, use as is, if not divide by 4.

This logic is starting to look valid to me, the more I think about it. It should give the same result for the cases currently handled correctly, but also for the case of symbols defined with .set. It would requires a bit of a bigger change though, because of expression evaluation, which potentially uses multiple symbols that could have different types. So I will try and see how binutils-esp32ulp handles mixing symbols types, and likely a valid solution could be to consider the result of an expression to be a REL type if any symbols in the expression were of that type, otherwise consider the result to be an ABS type. And then use ABS and REL here rather than SYM and IMM to decide whether to divide by 4.

thanks for explaining, please add a comment above this like:

# we track label addresses in 32bit words, but immediate values are in bytes and need to get divided by 4.

Maybe we could also have ABS1 (byte addresses), ABS4 (32bit word addrs), etc. so we better know what we have.

So, my last part about the REL vs ABS, I think is not relevant anymore. I believe that "one more case" I talked about above, is actually a bug in binutils-esp32ulp

I have now tested quite a bit more and stepped through binutils-esp32ulp a bit and found the following: whether the value (that was defined with .set) gets divided by 4 depends on whether the symbol is marked as "global" or not. If marked as global, then the value will be taken as-is, when not marked as global, the value will be divided by 4. I don't see how this could be intended behaviour.

The reason I started to get skeptical is that binutils-esp32ulp behaves correctly for JUMPS and JUMPR cases, as per description in the documentation: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/ulp_instruction_set.html#note-about-addressing. The documentation shows an example, where values, which come from .set'ed symbols, should not be converted from bytes to words. So the fact that the JUMP instruction then tried to divide the value of the symbol by 4 (in the case I tried to fix) seems inconsistent.

I think the issues comes up in binutils-esp32ulp, depending on where the value is actually "filled-in" into the instruction. For non-global symbols, the value is already set during the assembly stage (as), while in the global case, the symbol value is only "filled in" during the linking stage (ld) - it's 0 in the instruction before that stage. The implementation of this logic in the assembler and linker differ.

So that means, I would not change anything in our implementation (other than adding the comment you suggested).

I might file a bug report to the binutils-esp32ulp project.

Yeah, sounds strange, please file a bug.

I filed a bug report now: espressif/binutils-esp32ulp#22

tests/compat/jumps.S

esp32_ulp/assemble.py

tests/assemble.py

Using a regex to parse each line makes the code easier to read and reason about. It uses capture groups to extract the exact substrings we need, removing the need for extra whitespace trimming and string splitting. Benchmarking showed that it even performs slightly better on the ESP32, while raising the committed memory by only 64 extra bytes of memory compared with the previous algorithm (measured after garbage collect).

The comment clarifies that because we track label addresses in 32bit words, but that immediate values are in bytes, the code needs to divide immediate values by 4, while leaving symbols (label) addresses as-is.

wnienhaus · 2021-10-03T16:03:44Z

I have added the changes as new commits this time.

I will leave the JUMP instruction's SYM vs IMM handling as it is now, because as I described in one of the comments, I believe the "special case" I saw in one of the remaining test scripts from binutils-esp32ulp, is a bug in their implementation.

So after this PR is merged, there is only 1 more issue I think (I will create a new PR after this one is merged) before we can pass all the skipped bintutils-esp32ulp tests. (There are some cases I don't think we should cater for, or that we could discuss how to cater for, but I will describe that in the next PR).

ThomasWaldmann · 2021-10-04T09:45:58Z

Merged, thanks!

wnienhaus added 5 commits September 29, 2021 09:14

enable more skipped compat tests

ff7db42

The last few commits/fixes allow these tests to now pass so we no longer need to skip them.

ThomasWaldmann reviewed Oct 2, 2021

View reviewed changes

wnienhaus added 3 commits October 3, 2021 18:45

fix comments syntax

6fa631a

Add comment about dividing by 4

78e9812

The comment clarifies that because we track label addresses in 32bit words, but that immediate values are in bytes, the code needs to divide immediate values by 4, while leaving symbols (label) addresses as-is.

ThomasWaldmann merged commit 02e94ba into micropython:master Oct 4, 2021

wnienhaus mentioned this pull request Oct 7, 2021

Last fixes for v1 #53

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More fixes towards v1.0 #52

More fixes towards v1.0 #52

wnienhaus commented Sep 29, 2021

ThomasWaldmann Sep 29, 2021

wnienhaus Oct 3, 2021

ThomasWaldmann Oct 3, 2021

ThomasWaldmann Oct 3, 2021

wnienhaus Oct 3, 2021

ThomasWaldmann Oct 4, 2021

wnienhaus Oct 10, 2021

wnienhaus commented Oct 3, 2021

ThomasWaldmann commented Oct 4, 2021

More fixes towards v1.0 #52

More fixes towards v1.0 #52

Conversation

wnienhaus commented Sep 29, 2021

ThomasWaldmann Sep 29, 2021

Choose a reason for hiding this comment

wnienhaus Oct 3, 2021

Choose a reason for hiding this comment

ThomasWaldmann Oct 3, 2021

Choose a reason for hiding this comment

ThomasWaldmann Oct 3, 2021

Choose a reason for hiding this comment

wnienhaus Oct 3, 2021

Choose a reason for hiding this comment

ThomasWaldmann Oct 4, 2021

Choose a reason for hiding this comment

wnienhaus Oct 10, 2021

Choose a reason for hiding this comment

wnienhaus commented Oct 3, 2021

ThomasWaldmann commented Oct 4, 2021