1

I hope this question fits this site. As you may know, compilers can't detect undefined behaviour in C - though some tools (static analysis) can help you find it. My question is more empirical - I know an exact solution can't exist. Question:

  • Say a developer has tested an application intensively for several days and found no issues. How sure can he be that no undefined behaviour constructs are used in the application?
  • Consider the same situation as above, except the software has been deployed to a company that has been using it for several months, and no issues have been reported. In practice, how sure can one be in such a case that there is no undefined behaviour in the software?

PS. Extra: if there is undefined behaviour in function A() in a program, can the program behave well when it reaches A() and malfunction when it reaches function B(), which has no problems?

2
  • Interesting... consider a threaded program - how much can you test to prove the threads do not corrupt each other's state, given that such a bug would never be deterministic? Commented Jul 2, 2015 at 7:47
  • @gbjbaanb: I don't know - and I also don't know whether threads == UB in this testing context. Commented Jul 2, 2015 at 7:48

3 Answers

4

Many seemingly reasonable things in C (and C++) actually have undefined behavior and this is a common source of bugs in programs.

Moreover, there is no good way to determine whether a large-scale application is free of undefined behavior and thus not susceptible to breaking in the future.

Some techniques like Undefined Behavior Coverage (see also the KLEE project, which uses symbolic execution to try every possible path) have been proposed, but they don't quantify the "security level".

There are many good tools, but nothing that gives full confidence that your code won't break in the future (or at least a "security index").

The best you can do is be very careful, use good tools and hope for the best.


1
  • Yeah, but the talk here is about chances - I'd think it is reasonable that the more you test, the more your belief that the program is free of UB increases (even by some small proportion, say 0.01). Commented Jul 2, 2015 at 11:23
4

In both cases, testing will provide 0% additional certainty that no undefined behavior constructs are used in the application. Testing inherently does not have the ability to detect undefined behavior. Testing tests for wrong behavior. Undefined is very different from wrong.

"Undefined behavior constructs" are often deterministic (meaning we can know what happens, and the same thing always happens) if run in a well-specified environment. Many types of undefined behavior work similarly to the following: the program writes to a variable that it shouldn't write to, that variable is never read afterwards, and the program works correctly. From the perspective of an external test, there is no issue. Undefined behavior that gives a wrong result also exists, but is usually detected and fixed quickly, because the wrong result makes it easy to spot.

Such "harmless" undefined behavior is still seriously dangerous, because it relies on the aforementioned well-specified environment. Over the lifetime of a program, there are often some minor and some major changes to that environment. These include compiler upgrades, OS updates, and changes of CPU architecture (the results of going from multi-threaded single-core to multi-threaded multi-core were particularly nasty).

2
  • 1
    I would expect the 0% you mention to be an underestimate. Commented Jul 2, 2015 at 10:56
  • Okay, so let's say it doesn't detect 100% of the cases, right? If it detects 95%, though, isn't that better than 0%? Just because it can't detect every single case doesn't mean it's not valuable. And if it does detect some (and compilers can do this, so I know it's not true to say they detect none), then once those are fixed there are fewer instances of UB. Commented Oct 26, 2019 at 15:52
2

It is hard to say. Like any other bug, it can work for years until, one day, your application starts crashing after an apparently innocent change.

One such example is writing past the end of a buffer.

That is why the software needs to be tested at various levels (unit, functional, integration, etc.), checked with static analysis tools, and why those tests should be executed under memory checkers (like Valgrind, where possible). Then you can say that you have extended your safety net, and that the probability of bugs (including undefined behaviour) is reduced.


If there is undefined behaviour in function A() in program, can program behave well when it reaches A() and malfunction when it reaches function B() - which has no problems?

With undefined behaviour anything is possible, including the nasal demons.

As I said above, even a simple change can cause UB to misbehave in very bad ways. To make things worse, it doesn't have to trigger the same behaviour every time. It can appear to work in 99 executions and crash your application on the 100th.

8
  • Writing past a buffer is what I'd call potential undefined behaviour - if the input is small and there is no writing past the buffer, all is OK. I am, however, referring to cases which are 100% undefined behaviour, and to revealing them over time through testing. Commented Jul 2, 2015 at 6:56
  • 1
    @user30020 It is not a potential undefined behaviour - it is undefined behaviour. What exactly do you have in mind? Commented Jul 2, 2015 at 6:57
  • I mean that memcpy(dest,src,len) is not undefined behaviour if all three parameters are proper - len small enough, etc. But if the user supplies a len larger than the size of dest, that is undefined behaviour. I was referring to cases which are 100% UB (not like this example). Commented Jul 2, 2015 at 6:59
  • @user30020 So, what exactly are you asking? Commented Jul 2, 2015 at 7:03
  • Your buffer-writing example is not what I referred to, because that is potential undefined behaviour. My question concerns cases in the code that are 100% undefined behaviour - and how testing can reveal them. Commented Jul 2, 2015 at 7:04