Thread State and the Global Interpreter Lock
Unless on a free-threaded build of CPython, the Python interpreter is generally not thread-safe. In order to support multi-threaded Python programs, there's a global lock, called the global interpreter lock or GIL, that must be held by a thread before accessing Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
As such, only a thread that holds the GIL may operate on Python objects or invoke Python's C API.
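The lost-update problem described above can be sketched outside of CPython. The following is a toy analogy only, assuming POSIX threads: `toy_object`, `toy_gil`, and the function names are hypothetical, and a plain mutex stands in for the GIL. It is not how CPython implements reference counting.

```c
#include <pthread.h>
#include <stddef.h>

#define TOY_ITERATIONS 100000

/* Hypothetical stand-in for an object's reference count (not PyObject). */
typedef struct {
    long refcnt;
} toy_object;

/* Hypothetical "global lock" playing the role of the GIL in this sketch. */
static pthread_mutex_t toy_gil = PTHREAD_MUTEX_INITIALIZER;

static void *
incref_many(void *arg)
{
    toy_object *obj = arg;
    for (int i = 0; i < TOY_ITERATIONS; i++) {
        pthread_mutex_lock(&toy_gil);
        /* Read-modify-write: without the lock, two threads could both read
           the same old value and one of the increments would be lost. */
        obj->refcnt++;
        pthread_mutex_unlock(&toy_gil);
    }
    return NULL;
}

/* Run two incrementing threads and return the final count. */
static long
toy_refcount_after_two_threads(void)
{
    toy_object obj = { 0 };
    pthread_t a, b;

    pthread_create(&a, NULL, incref_many, &obj);
    pthread_create(&b, NULL, incref_many, &obj);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return obj.refcnt;
}
```

With the lock held around each increment, the final count is exactly twice `TOY_ITERATIONS`. Removing the lock and unlock calls makes the result unpredictable, which is the failure mode the lock exists to prevent.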
In order to emulate concurrency, the interpreter regularly tries to switch
threads between bytecode instructions (see sys.setswitchinterval()).
This is why locks are also necessary for thread-safety in pure-Python code.
Additionally, the global interpreter lock is released around blocking I/O operations, such as reading or writing to a file. From the C API, this is done by detaching the thread state.
The Python interpreter keeps some thread-local information inside
a data structure called PyThreadState, known as a thread state.
Each thread has a thread-local pointer to a PyThreadState; a thread state
referenced by this pointer is considered to be attached.
A thread can only have one attached thread state at a time. Having an attached thread state is typically analogous to holding the GIL, except on free-threaded builds. On builds with the GIL enabled, attaching a thread state will block until the GIL can be acquired. However, even on builds with the GIL disabled, it is still required to have an attached thread state, as the interpreter needs to keep track of which threads may access Python objects.
Note
Even on the free-threaded build, attaching a thread state may block, as the GIL can be re-enabled or threads might be temporarily suspended (such as during a garbage collection).
Generally, there will always be an attached thread state when using Python's
C API, including during embedding and when implementing methods, so it's uncommon
to need to set up a thread state on your own. Only in some specific cases, such
as in a Py_BEGIN_ALLOW_THREADS block or in a fresh thread, will the
thread not have an attached thread state.
If uncertain, check if PyThreadState_GetUnchecked() returns NULL.
If it turns out that you do need to create a thread state, call PyThreadState_New()
followed by PyThreadState_Swap(), or use the dangerous
PyGILState_Ensure() function.
Detaching the thread state from extension code
Most extension code manipulating the thread state has the following simple structure:
Save the thread state in a local variable.
... Do some blocking I/O operation ...
Restore the thread state from the local variable.
This is so common that a pair of macros exists to simplify it:
Py_BEGIN_ALLOW_THREADS
... Do some blocking I/O operation ...
Py_END_ALLOW_THREADS
The Py_BEGIN_ALLOW_THREADS macro opens a new block and declares a
hidden local variable; the Py_END_ALLOW_THREADS macro closes the
block.
The block above expands to the following code:
PyThreadState *_save;
_save = PyEval_SaveThread();
... Do some blocking I/O operation ...
PyEval_RestoreThread(_save);
Here is how these functions work:
The attached thread state implies that the GIL is held for the interpreter.
To detach it, PyEval_SaveThread() is called and the result is stored
in a local variable.
By detaching the thread state, the GIL is released, which allows other threads
to attach to the interpreter and execute while the current thread performs
blocking I/O. When the I/O operation is complete, the old thread state is
reattached by calling PyEval_RestoreThread(), which will wait until
the GIL can be acquired.
Note
Performing blocking I/O is the most common use case for detaching
the thread state, but detaching is also useful around long-running
native code that doesn't need access to Python objects or Python's C API.
For example, the standard zlib and hashlib modules detach the
thread state when compressing or hashing data.
On a free-threaded build, the GIL is usually disabled, but detaching the thread state is still required, because the interpreter periodically needs to block all threads to get a consistent view of Python objects without the risk of race conditions. For example, CPython currently suspends all threads for a short period of time while running the garbage collector.
Warning
Detaching the thread state can lead to unexpected behavior during interpreter finalization. See Cautions regarding runtime finalization for more details.
APIs
The following macros are normally used without a trailing semicolon; look for example usage in the Python source distribution.
Note
These macros are still necessary on the free-threaded build to prevent deadlocks.
-
Py_BEGIN_ALLOW_THREADS
- Part of the Stable ABI.
This macro expands to
{ PyThreadState *_save; _save = PyEval_SaveThread();
Note that it contains an opening brace; it must be matched with a following Py_END_ALLOW_THREADS macro. See above for further discussion of this macro.
-
Py_END_ALLOW_THREADS
- Part of the Stable ABI.
This macro expands to
PyEval_RestoreThread(_save); }
Note that it contains a closing brace; it must be matched with an earlier Py_BEGIN_ALLOW_THREADS macro. See above for further discussion of this macro.
-
Py_BLOCK_THREADS
- Part of the Stable ABI.
This macro expands to
PyEval_RestoreThread(_save);
It is equivalent to Py_END_ALLOW_THREADS without the closing brace.
-
Py_UNBLOCK_THREADS
- Part of the Stable ABI.
This macro expands to
_save = PyEval_SaveThread();
It is equivalent to Py_BEGIN_ALLOW_THREADS without the opening brace and variable declaration.
Non-Python created threads
When threads are created using the dedicated Python APIs (such as the
threading module), a thread state is automatically associated with them.
However, when a thread is created from native code (for example, by a
third-party library with its own thread management), it doesn't have an
attached thread state.