Thread State and the Global Interpreter Lock
Unless on a free-threaded build of CPython, the Python interpreter is generally not thread-safe. In order to support multi-threaded Python programs, there's a global lock, called the global interpreter lock or GIL, that must be held by a thread before accessing Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
As such, only a thread that holds the GIL may operate on Python objects or invoke Python's C API.
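The reference-count hazard described above can be illustrated with a toy model in plain C. This is a minimal sketch, not CPython code: `refcount`, `big_lock`, `struct job`, and `run_increments` are hypothetical names, and a POSIX mutex stands in for the GIL. Because every increment happens while the lock is held, no update is lost.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an object's reference count (not CPython code). */
static long refcount = 0;
/* Hypothetical stand-in for the GIL: a single lock guarding shared state. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

struct job { long increments; };

static void *worker(void *arg)
{
    const struct job *job = arg;
    for (long i = 0; i < job->increments; i++) {
        pthread_mutex_lock(&big_lock);
        refcount++;   /* read-modify-write; safe only while the lock is held */
        pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* Run nthreads workers and return the final count, or -1 on error. */
long run_increments(int nthreads, long increments)
{
    enum { MAX_THREADS = 16 };
    pthread_t threads[MAX_THREADS];
    struct job job = { increments };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    refcount = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, worker, &job) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == nthreads ? refcount : -1;
}
```

Removing the lock/unlock pair turns `refcount++` into an unsynchronized read-modify-write, which is exactly the situation in which two concurrent increments may be recorded as one.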
In order to emulate concurrency, the interpreter regularly tries to switch
threads between bytecode instructions (see sys.setswitchinterval()).
This is why locks are also necessary for thread-safety in pure-Python code.
Additionally, the global interpreter lock is released around blocking I/O operations, such as reading or writing to a file. From the C API, this is done by detaching the thread state.
The Python interpreter keeps some thread-local information inside
a data structure called PyThreadState, known as a thread state.
Each thread has a thread-local pointer to a PyThreadState; a thread state
referenced by this pointer is considered to be attached.
A thread can only have one attached thread state at a time. An attached thread state is typically analogous to holding the GIL, except on free-threaded builds. On builds with the GIL enabled, attaching a thread state will block until the GIL can be acquired. However, even on builds with the GIL disabled, it is still required to have an attached thread state, as the interpreter needs to keep track of which threads may access Python objects.
Note
Even on the free-threaded build, attaching a thread state may block, as the GIL can be re-enabled or threads might be temporarily suspended (such as during a garbage collection).
Generally, there will always be an attached thread state when using Python's
C API, including during embedding and when implementing methods, so it's uncommon
to need to set up a thread state on your own. Only in some specific cases, such
as in a Py_BEGIN_ALLOW_THREADS block or in a fresh thread, will the
thread not have an attached thread state.
If uncertain, check if PyThreadState_GetUnchecked() returns NULL.
If it turns out that you do need to create a thread state, call PyThreadState_New()
followed by PyThreadState_Swap(), or use the dangerous
PyGILState_Ensure() function.
Detaching the thread state from extension code
Most extension code manipulating the thread state has the following simple structure:
Save the thread state in a local variable.
... Do some blocking I/O operation ...
Restore the thread state from the local variable.
This is so common that a pair of macros exists to simplify it:
Py_BEGIN_ALLOW_THREADS
... Do some blocking I/O operation ...
Py_END_ALLOW_THREADS
The Py_BEGIN_ALLOW_THREADS macro opens a new block and declares a
hidden local variable; the Py_END_ALLOW_THREADS macro closes the
block.
The block above expands to the following code:
PyThreadState *_save;
_save = PyEval_SaveThread();
... Do some blocking I/O operation ...
PyEval_RestoreThread(_save);
Here is how these functions work:
The attached thread state implies that the GIL is held for the interpreter.
To detach it, PyEval_SaveThread() is called and the result is stored
in a local variable.
By detaching the thread state, the GIL is released, which allows other threads
to attach to the interpreter and execute while the current thread performs
blocking I/O. When the I/O operation is complete, the old thread state is
reattached by calling PyEval_RestoreThread(), which will wait until
the GIL can be acquired.
Note
Performing blocking I/O is the most common use case for detaching
the thread state, but it is also useful to detach it around long-running
native code that doesn't need access to Python objects or Python's C API.
For example, the standard zlib and hashlib modules detach the
thread state when compressing or hashing data.
On a free-threaded build, the GIL is usually disabled, but detaching the thread state is still required, because the interpreter periodically needs to block all threads to get a consistent view of Python objects without the risk of race conditions. For example, CPython currently suspends all threads for a short period of time while running the garbage collector.
Warning
Detaching the thread state can lead to unexpected behavior during interpreter finalization. See Cautions regarding runtime finalization for more details.
APIs
The following macros are normally used without a trailing semicolon; look for example usage in the Python source distribution.
Note
These macros are still necessary on the free-threaded build to prevent deadlocks.
Py_BEGIN_ALLOW_THREADS
Part of the Stable ABI.
This macro expands to { PyThreadState *_save; _save = PyEval_SaveThread();. Note that it contains an opening brace; it must be matched with a following Py_END_ALLOW_THREADS macro. See above for further discussion of this macro.
Py_END_ALLOW_THREADS
Part of the Stable ABI.
This macro expands to PyEval_RestoreThread(_save); }. Note that it contains a closing brace; it must be matched with an earlier Py_BEGIN_ALLOW_THREADS macro. See above for further discussion of this macro.
Py_BLOCK_THREADS
Part of the Stable ABI.
This macro expands to PyEval_RestoreThread(_save);: it is equivalent to Py_END_ALLOW_THREADS without the closing brace.
Py_UNBLOCK_THREADS
Part of the Stable ABI.
This macro expands to _save = PyEval_SaveThread();: it is equivalent to Py_BEGIN_ALLOW_THREADS without the opening brace and variable declaration.
Non-Python created threads
When threads are created using the dedicated Python APIs (such as the
threading module), a thread state is automatically associated with them.
However, when a thread is created from native code (for example, by a
third-party library with its own thread management), it doesn't hold an
attached thread state.
If you need to call Python code from these threads (often this will be part of a callback API provided by the aforementioned third-party library), you must first register these threads with the interpreter by creating a new thread state and attaching it.
The most robust way to do this is through PyThreadState_New() followed
by PyThreadState_Swap().
Note
PyThreadState_New requires an argument pointing to the desired
interpreter; such a pointer can be acquired via a call to
PyInterpreterState_Get() from the code where the thread was
created.
For example:
/* The return value of PyInterpreterState_Get() from the
function that created this thread. */
PyInterpreterState *interp = thread_data->interp;
/* Create a new thread state for the interpreter. It does not start out
attached. */
PyThreadState *tstate = PyThreadState_New(interp);
/* Attach the thread state, which will acquire the GIL. */
PyThreadState_Swap(tstate);
/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */
/* Destroy the thread state. No Python API allowed beyond this point. */
PyThreadState_Clear(tstate);
PyThreadState_DeleteCurrent();
Warning
If the interpreter finalized before PyThreadState_Swap was called, then
interp will be a dangling pointer!
Legacy API
Another common pattern to call Python code from a non-Python thread is to use
PyGILState_Ensure() followed by a call to PyGILState_Release().
These functions do not work well when multiple interpreters exist in the Python
process. If no Python interpreter has ever been used in the current thread (which
is common for threads created outside Python), PyGILState_Ensure will create
and attach a thread state for the "main" interpreter (the first interpreter in
the Python process).
Additionally, these functions have thread-safety issues during interpreter
finalization. Using PyGILState_Ensure during finalization will likely
crash the process.
These functions are typically used as follows:
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */
/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);
Cautions about fork()
Another important thing to note about threads is their behaviour in the face
of the C fork() call. On most systems with fork(), after a
process forks only the thread that issued the fork will exist. This has a
concrete impact both on how locks must be handled and on all stored state
in CPython's runtime.
The fact that only the "current" thread remains
means any locks held by other threads will never be released. Python solves
this for os.fork() by acquiring the locks it uses internally before
the fork, and releasing them afterwards. In addition, it resets any
Lock objects in the child. When extending or embedding Python, there
is no way to inform Python of additional (non-Python) locks that need to be
acquired before or reset after a fork. OS facilities such as
pthread_atfork() would need to be used to accomplish the same thing.
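For a lock owned by native code, the pthread_atfork() approach might look like the following. This is a minimal POSIX-only sketch: `lib_lock`, the three handlers, `install_fork_handlers`, and `child_can_lock_after_fork` are hypothetical names, and a real library may prefer to reinitialize its lock in the child rather than unlock it.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock owned by a native library and unknown to Python. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take the lock before fork() so no other thread holds it mid-fork. */
static void before_fork(void)       { pthread_mutex_lock(&lib_lock); }
/* Release it again in the parent once fork() has returned. */
static void after_fork_parent(void) { pthread_mutex_unlock(&lib_lock); }
/* Release it in the child, where only the forking thread survives. */
static void after_fork_child(void)  { pthread_mutex_unlock(&lib_lock); }

/* Register the handlers once, for example during module initialization.
   Returns 0 on success or an error number on failure. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Fork and report whether the child could take the lock:
   1 = yes, 0 = no, -1 = fork or wait failed. */
int child_can_lock_after_fork(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        int ok = (pthread_mutex_trylock(&lib_lock) == 0);
        _exit(ok ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Because the handlers run around every fork() in the process, including those issued by os.fork(), the native lock is always in a known state in the child.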
Additionally, when extending or embedding Python, calling fork()
directly rather than through os.fork() (and returning to or calling
into Python) may result in a deadlock by one of Python's internal locks
being held by a thread that is defunct after the fork.
PyOS_AfterFork_Child() tries to reset the necessary locks, but is not
always able to.
The fact that all other threads go away also means that CPython's
runtime state there must be cleaned up properly, which os.fork()
does. This means finalizing all other PyThreadState objects
belonging to the current interpreter and all other
PyInterpreterState objects. Due to this and the special
nature of the "main" interpreter,
fork() should only be called in that interpreter's "main"
thread, where the CPython global runtime was originally initialized.
The only exception is if exec() will be called immediately
after.
High-level APIs
These are the most commonly used types and functions when writing multi-threaded C extensions.
type PyThreadState
Part of the Limited API (as an opaque struct).
This data structure represents the state of a single thread. The only public data member is:
PyInterpreterState *interp
This thread's interpreter state.
void PyEval_InitThreads()
Part of the Stable ABI.
Deprecated function which does nothing.
In Python 3.6 and older, this function created the GIL if it didn't exist.
Changed in version 3.9: The function now does nothing.
Changed in version 3.7: This function is now called by Py_Initialize(), so you don't have to call it yourself anymore.
Changed in version 3.2: This function cannot be called before Py_Initialize() anymore.
Deprecated since version 3.9.
PyThreadState *PyEval_SaveThread()
Part of the Stable ABI.
Detach the attached thread state and return it. The thread will have no thread state upon returning.
void PyEval_RestoreThread(PyThreadState *tstate)
Part of the Stable ABI.
Set the attached thread state to tstate. The passed thread state should not be attached, otherwise deadlock ensues. tstate will be attached upon returning.
Note
Calling this function from a thread when the runtime is finalizing will hang the thread until the program exits, even if the thread was not created by Python. Refer to Cautions regarding runtime finalization for more details.
Changed in version 3.14: Hangs the current thread, rather than terminating it, if called while the interpreter is finalizing.
PyThreadState *PyThreadState_Get()
Part of the Stable ABI.
Return the attached thread state. If the thread has no attached thread state (such as when inside a Py_BEGIN_ALLOW_THREADS block), then this issues a fatal error (so that the caller needn't check for NULL).
PyThreadState *PyThreadState_GetUnchecked()
Similar to PyThreadState_Get(), but doesn't kill the process with a fatal error if there is no attached thread state; it returns NULL instead. The caller is responsible for checking whether the result is NULL.
Added in version 3.13: In Python 3.5 to 3.12, the function was private and known as _PyThreadState_UncheckedGet().
- PyThreadState *PyThreadState_Swap(PyThreadState *tstate)