Program

If the following program is run on Windows with a single command-line argument, it will crash:

# threading-crash.py
"""Reproduce a crash involving Qt and threading"""

from PyQt5 import QtCore

import sys
from threading import Thread

from typing import Optional


class WorkerManager(QtCore.QObject):
  # Signal emitted when thread is finished.
  worker_finished = QtCore.pyqtSignal()

  def start_worker(self) -> None:
    def worker() -> None:
      # Printing here is necessary for the crash to happen *reliably*,
      # though it still happens without it (just less often).
      print("Emitting worker_finished signal")

      self.worker_finished.emit()

    t = Thread(target=worker)
    t.start()


def run_test() -> None:
  # When using `mypy`, I cannot assign `None` to `app` at the end unless
  # the type is declared to be optional here.
  app: Optional[QtCore.QCoreApplication] = QtCore.QCoreApplication(sys.argv)
  assert(app)      # Pacify mypy.

  mgr = WorkerManager()

  def finished() -> None:
    # Terminate the `exec_` call below.
    assert(app)    # Pacify mypy.
    app.exit(0)

  # Make a queued connection since this is a cross-thread signal.  (This
  # is not necessary to reproduce the crash; auto does the same thing.)
  mgr.worker_finished.connect(
    finished, QtCore.Qt.QueuedConnection) # type: ignore

  # Start the worker thread, which will signal `finished`.
  mgr.start_worker()

  # Wait for the signal to be received.
  app.exec_()

  if len(sys.argv) == 1:
    # This fixes the crash!
    app = None


def main() -> None:
  for i in range(10):
    print(f"{i}: run_test")
    run_test()     # Crashes on the second call.


if __name__ == "__main__":
  main()


# EOF

Demonstration

On my system (and with the print call in worker) this program crashes or hangs 100% of the time in the second run_test call.

Example run:

$ python threading-crash.py CRASH
0: run_test
Emitting worker_finished signal
1: run_test
Emitting worker_finished signal
Segmentation fault
Exit 139

The exact behavior varies unpredictably; another example:

$ python threading-crash.py CRASH
0: run_test
Emitting worker_finished signal
1: run_test
Emitting worker_finished signal
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
  File "D:\opt\Python311\Lib\threading.py", line 1038, in _bootstrap_inner
Exit 127

Other possibilities include popping up an error dialog box ("The instruction at (hex) referenced memory at (hex)."), or just hanging completely.

In contrast, when run without arguments, thus activating the app = None line, it runs fine (even with a large iteration count like 1000):

$ python threading-crash.py
0: run_test
Emitting worker_finished signal
1: run_test
Emitting worker_finished signal
[...]
9: run_test
Emitting worker_finished signal

Other variations

Removing the print call in worker makes the crash happen less frequently, but does not eliminate it.

Joining the worker thread at the end of start_worker (so there is no concurrency) removes the crash.

Joining the worker after app.exec_() does not help; it still crashes. Calling time.sleep(1) there (with or without the join) also does not help. This means the crash happens even though there is only one thread running at the time.

Disconnecting the worker_finished signal after app.exec_() does not help.

Adding a call to gc.collect() at the top of run_test has no effect.

Using QtCore.QThread instead of threading.Thread also has no effect on the crash.

Question

Why does this program crash? In particular:

  • Why does it not crash when I reset app to None? Shouldn't that (or something equivalent) automatically happen when run_test returns?

  • Is this a bug in my program, or a bug in Python or Qt?

Why am I making multiple QCoreApplications?

This example is reduced from a unit test suite. In that suite, each test is meant to be independent of any other, so those tests that need it create their own QCoreApplication object. The documentation does not appear to prohibit this.

Versions, etc.

$ python -V -V
Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)]

$ python -m pip list | grep -i qt
PyQt5               5.15.11
PyQt5-Qt5           5.15.2
PyQt5_sip           12.17.0
PyQt5-stubs         5.15.6.0

I'm running this on Windows 10 Home. The above examples use a Cygwin shell, but the same thing happens under cmd.exe. This is all using the native Windows port of Python.


Further simplified

In comments, @ekhumoro suggested replacing the thread with a timer, and to my surprise, the crash still happens! (I was evidently misled by the highly non-deterministic behavior, not all of which I've shared.) Here is a more minimal reproducer (with most typing annotations also removed):

# threading-crash.py
"""Reproduce a crash involving Qt and (not!) threading"""

from PyQt5 import QtCore
import sys


class WorkerManager(QtCore.QObject):
  # Signal emitted... never, now.
  the_signal = QtCore.pyqtSignal()


def run_test() -> None:
  app = QtCore.QCoreApplication(sys.argv)
  mgr = WorkerManager()

  def finished() -> None:
    # This call is required since it keeps `app` alive.
    app.exit(0)

  # Connect the signal (which is never emitted) to a local closure.
  mgr.the_signal.connect(finished)

  # Start and stop the event loop.
  QtCore.QTimer.singleShot(100, app.quit)
  app.exec_()

  if len(sys.argv) == 1:
    # This fixes the crash!
    app = None # type: ignore


def main() -> None:
  for i in range(4):
    print(f"{i}: run_test")
    run_test()     # Crashes on the second call.


if __name__ == "__main__":
  main()


# EOF

Now, the key element seems to be that we have a signal connected to a local closure (the nested function finished) that holds a reference to the QCoreApplication.

If the signal is disconnected before exec_() (i.e., right after it was connected), then no crash occurs. (Of course, that is not a solution to the original problem, since in the original program, the point of the signal was to cause exec_() to return.)

If the signal is disconnected after exec_(), then the program still crashes; the closure lives on, apparently.

Comments
  • PyQt5 has QThread (PyQt5.QtCore.QThread) Commented Oct 4 at 16:07
  • @furas Good suggestion, thanks; but using QThread instead of threading.Thread did not change the observed behavior. Commented Oct 4 at 17:42
  • @ScottMcPeak In fact, there are cases for which it's not possible to recreate a new QApplication within the same process: most notably, when using the QtWebEngine module (I cannot find it right now, but there's a related bug on Qt's report system, and it's marked as "won't do", as it's related to the underlying Chromium engine). Similar issues happen when using threading, especially if you're not explicitly deleting objects that may delay the final shutdown/deletion of the QApplication (due to complex ownerships). If you really need multiple app instances, spawned multiprocessing is safer. Commented Oct 5 at 1:36
  • @ScottMcPeak I finally found it: a comment to QTBUG-128345 explains that specific case; also see my related answer. Yes, there's no absolute reference about completely forbidding the shut-down/recreation of a Q*Application, which is theoretically possible, but is a rare occurrence and should always be done under proper scrutiny. Using Qt from Python adds a further complication layer (also involving closure aspects noted by ekhumoro): simply put, while the suggestions above should theoretically work if properly » Commented Oct 7 at 2:45
  • @ScottMcPeak » applied, you're still "fidgeting" with relatively low level aspects of both Python and Qt. That should suffice in a perfect-world scenario, but be aware that there's not a 100% guarantee that it will always work. I understand your point about the slower speed when using mp, but reliability comes before "speed"; at the very least, you should always keep in mind that changes to Qt, PyQt (or PySide), Python or even your tests could introduce further unexpected aspects. The day one of your tests fails, the first thing to consider is if that test is done under proper premises. Commented Oct 7 at 2:57

1 Answer

The fundamental problem is that this code creates two QCoreApplication objects whose lifetimes overlap. Evidently, that condition leads to random crashes on Windows.

The overlapping lifetimes come about as a result of creating a cycle in the object graph that prevents garbage collection from cleaning them up. First, the run_test invocation creates a WorkerManager instance and has a local variable, mgr, pointing to it. Next, the WorkerManager has its the_signal signal connected to the finished closure. And the finished closure has a pointer back to the run_test invocation because that is necessary for it to be able to look up the local variable app.

The situation is depicted in the following object diagram:

[Figure: object graph diagram]

The reason setting app to None fixes the crash is that it breaks the link to the QCoreApplication object, thus allowing it to be destroyed. When I first discovered this, I found it surprising: I thought that run_test returning would be sufficient, having overlooked the fact that the finished closure is still alive at that point.
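The effect of resetting the variable can be seen without Qt at all. The following is only a pure-Python sketch, assuming CPython's reference counting; FakeApp and make_slot are hypothetical names invented for illustration, and nothing here reproduces the crash itself. It shows that a nested function keeps a captured local alive after the enclosing call returns, and that rebinding the local to None releases the object because the closure and the enclosing call share the same variable:

```python
import weakref


class FakeApp:
    """Hypothetical stand-in for QCoreApplication (no Qt involved)."""


def make_slot(clear_app):
    app = FakeApp()
    app_ref = weakref.ref(app)  # lets us observe the object's lifetime

    def finished():
        # Referring to `app` makes it a closure variable, stored in a
        # cell shared between this function and the enclosing call.
        return app

    if clear_app:
        # Rebinding the local replaces the contents of the shared cell,
        # so the closure no longer references the FakeApp either.
        app = None

    return finished, app_ref


slot_kept, ref_kept = make_slot(clear_app=False)
slot_cleared, ref_cleared = make_slot(clear_app=True)

# The first FakeApp is still alive because slot_kept's closure holds it.
print("kept alive by closure:", ref_kept() is not None)
# The second FakeApp was released as soon as `app` was rebound to None.
print("released after reset:", ref_cleared() is None)
```

In the real program the closure is held by the signal connection rather than by a module-level name, but the lifetime rule for the captured app variable is the same.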

Alternatively, if we disconnect the the_signal signal when it is received, then that breaks one edge of the cycle, allowing the rest of the cycle to be collected, which in turn allows the QCoreApplication to be destroyed.

As pointed out by @ekhumoro in a comment, there are at least two other ways to fix this. The first is to call exit(0) using QCoreApplication.instance() rather than using the app local variable, because then no closure is needed. The other is to connect the_signal directly to app.quit, which again avoids creating a closure.
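To see why looking the application up through instance() avoids the closure, here is a small pure-Python sketch. It is an analogy only: FakeApp, its instance() class method, and the two helper functions are hypothetical stand-ins, not PyQt APIs. A nested function that reads a local of its enclosing function carries a closure cell; one that reaches the object through a global lookup carries none:

```python
class FakeApp:
    """Hypothetical stand-in for QCoreApplication (no Qt involved)."""

    _instance = None

    def __init__(self):
        FakeApp._instance = self

    @classmethod
    def instance(cls):
        return cls._instance

    def exit(self, code=0):
        return code


def slot_via_local():
    app = FakeApp()

    def finished():
        # Uses the enclosing local, so `app` is captured in a cell.
        return app.exit(0)

    return finished


def slot_via_instance():
    FakeApp()

    def finished():
        # Uses only the global name FakeApp, so nothing is captured.
        return FakeApp.instance().exit(0)

    return finished


closure_slot = slot_via_local()
global_slot = slot_via_instance()

# One captured cell for the first slot, no closure for the second.
print("cells in closure_slot:", len(closure_slot.__closure__))
print("global_slot has closure:", global_slot.__closure__ is not None)
```

Both slots behave the same when called; they differ only in whether the function object itself holds a reference to the application.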

Interestingly, it does not work to connect the_signal to app.exit. app.exit has a default argument of zero, but evidently the way that's implemented in a case like this involves creating a closure in order to pass the default argument to the underlying function.

1 Comment

The answer regarding closures also suggests several other ways to fix the example. The two simplest would be: (1) connect the_signal to app.quit (which both eliminates the closure and avoids increasing the reference count of the connected slot); (2) call QCoreApplication.instance().exit(0) in the finished slot, which eliminates the closure and avoids increasing the reference count of app. It's important to understand that these exit-crash problems don't just apply to application objects: several other parts of the Qt framework exhibit similar behaviours.
