Hot Linked Questions - Stack Overflow

56 votes

2 answers

26k views

gcc optimization flag -O3 makes code slower than -O2

I found the topic "Why is it faster to process a sorted array than an unsorted array?" and tried to run its code, and I see strange behavior. If I compile this code with the -O3 optimization flag it takes ...

Mike Minaev

2,132

asked Mar 5, 2015 at 10:17

16 votes

1 answer

5k views

Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?

I am disassembling this code on llvm clang Apple LLVM version 8.0.0 (clang-800.0.42.1): int main() { float a=0.151234; float b=0.2; float c=a+b; printf("%f", c); } I compiled with no -...

Stefano Borini

145k

asked Nov 18, 2018 at 23:16

12 votes

2 answers

3k views

Can modern x86 implementations store-forward from more than one prior store?

In the case that a load overlaps two earlier stores (and the load is not fully contained in the oldest store), can modern Intel or AMD x86 implementations forward from both stores to satisfy the load? ...

BeeOnRope

66.7k

asked Sep 9, 2017 at 22:45

1 vote

2 answers

3k views

Why bubble sort is not efficient?

I am developing backend project using node.js and going to implement sorting products functionality. I researched some articles and there were several articles saying bubble sort is not efficient. ...

Steven

687

asked May 25, 2020 at 7:43

10 votes

2 answers

1k views

What are the costs of failed store-to-load forwarding on x86?

What are the costs of a failed store-to-load forwarding on recent x86 architectures? In particular, store-to-load forwarding that fails because the load partly overlaps an earlier store, or because ...

BeeOnRope

66.7k

asked Sep 9, 2017 at 21:43

2 votes

3 answers

4k views

Why Java turns out to be faster than C++ in this simple BubbleSort benchmark example?

I hear from colleagues that C++ is faster than Java and when looking for top performance, especially for finance applications, that's the route to go. But my observations differ a bit. Can anyone ...

SpeedChaser

63

asked Jun 24, 2022 at 12:54

1 vote

1 answer

1k views

Possible to develop a game using assembly language? [closed]

I am wondering what it takes to develop a game in assembly language. For example, what are the limitations or advantages from using assembly language in game development? Also, are there any programs/...

Josh Lcs

55

asked Oct 11, 2014 at 17:24

-7 votes

3 answers

661 views

For comparison in speed between C and Assembly language using bubble sort, which of it in general is faster? [closed]

I have researched bubble sort speed differences between C and Assembly language, and found that code optimization is one factor. What other factors are there to consider for bubble sort speed ...

John

17

asked Oct 20, 2023 at 6:11

2 votes

0 answers

832 views

Big penalty of unaligned load and store on x86_64 [duplicate]

I thought unaligned access and write has got cheaper on recent x86_64 CPUs compared to the older ones. However, I recently found out that doing a series of unaligned load and stores can be a huge ...

xiver77

2,372

asked Jan 23, 2022 at 0:31

7 votes

0 answers

319 views

MSVC generating unnecessary complicated instructions

While benchmarking code involving std::optional<double>, I noticed that the code MSVC generates runs at roughly half the speed compared to the one produced by clang or gcc. After spending some ...

Sedenion

6,363

asked Jun 25, 2022 at 16:09
