Introduction¶
The Application Programmer’s Interface to Python gives C and C++ programmers access to the Python interpreter at a variety of levels. The API is equally usable from C++, but for brevity it is generally referred to as the Python/C API. There are two fundamentally different reasons for using the Python/C API. The first reason is to write extension modules for specific purposes; these are C modules that extend the Python interpreter. This is probably the most common use. The second reason is to use Python as a component in a larger application; this technique is generally referred to as embedding Python in an application.
Writing an extension module is a relatively well-understood process, where a “cookbook” approach works well. There are several tools that automate the process to some extent. While people have embedded Python in other applications since its early existence, the process of embedding Python is less straightforward than writing an extension.
Many API functions are useful independent of whether you’re embedding or extending Python; moreover, most applications that embed Python will need to provide a custom extension as well, so it’s probably a good idea to become familiar with writing an extension before attempting to embed Python in a real application.
Language version compatibility¶
Python’s C API is compatible with C11 and C++11 versions of C and C++.
This is a lower limit: the C API does not require features from later C/C++ versions. You do not need to enable your compiler’s “c11 mode”.
Coding standards¶
If you’re writing C code for inclusion in CPython, you must follow the guidelines and standards defined in PEP 7. These guidelines apply regardless of the version of Python you are contributing to. Following these conventions is not necessary for your own third party extension modules, unless you eventually expect to contribute them to Python.
Include Files¶
All function, type and macro definitions needed to use the Python/C API are included in your code by the following line:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
This implies inclusion of the following standard headers: <stdio.h>,
<string.h>, <errno.h>, <limits.h>, <assert.h> and <stdlib.h>
(if available).
Note
Since Python may define some pre-processor definitions which affect the standard
headers on some systems, you must include Python.h before any standard
headers are included.
It is recommended to always define PY_SSIZE_T_CLEAN before including
Python.h. See Parsing arguments and building values for a description of this macro.
All user visible names defined by Python.h (except those defined by the included
standard headers) have one of the prefixes Py or _Py. Names beginning
with _Py are for internal use by the Python implementation and should not be
used by extension writers. Structure member names do not have a reserved prefix.
Note
User code should never define names that begin with Py or _Py. This
confuses the reader, and jeopardizes the portability of the user code to
future Python versions, which may define additional names beginning with one
of these prefixes.
The header files are typically installed with Python. On Unix, these are
located in the directories prefix/include/pythonversion/ and
exec_prefix/include/pythonversion/, where prefix and
exec_prefix are defined by the corresponding parameters to Python’s
configure script and version is
'%d.%d' % sys.version_info[:2]. On Windows, the headers are installed
in prefix/include, where prefix is the installation
directory specified to the installer.
To include the headers, place both directories (if different) on your compiler’s
search path for includes. Do not place the parent directories on the search
path and then use #include <pythonX.Y/Python.h>; this will break on
multi-platform builds since the platform independent headers under
prefix include the platform specific headers from
exec_prefix.
C++ users should note that although the API is defined entirely using C, the
header files properly declare the entry points to be extern "C". As a result,
there is no need to do anything special to use the API from C++.
Useful macros¶
Several useful macros are defined in the Python header files. Many are
defined closer to where they are useful (for example, Py_RETURN_NONE,
PyMODINIT_FUNC).
Others of a more general utility are defined here. This is not necessarily a
complete listing.
-
Py_CAN_START_THREADS¶
If this macro is defined, then the current system is able to start threads.
Currently, all systems supported by CPython (per PEP 11), with the exception of some WebAssembly platforms, support starting threads.
Added in version 3.13.
-
Py_GETENV(s)¶
Like getenv(s), but returns NULL if -E was passed on the command line (see PyConfig.use_environment).
Docstring macros¶
-
PyDoc_STRVAR(name, str)¶
Creates a variable with name name that can be used in docstrings. If Python is built without docstrings (--without-doc-strings), the value will be an empty string.
Example:
PyDoc_STRVAR(pop_doc, "Remove and return the rightmost element.");

static PyMethodDef deque_methods[] = {
    // ...
    {"pop", (PyCFunction)deque_pop, METH_NOARGS, pop_doc},
    // ...
};
Expands to PyDoc_VAR(name) = PyDoc_STR(str).
-
PyDoc_STR(str)¶
Expands to the given input string, or an empty string if docstrings are disabled (--without-doc-strings).
Example:
static PyMethodDef pysqlite_row_methods[] = {
    {"keys", (PyCFunction)pysqlite_row_keys, METH_NOARGS,
     PyDoc_STR("Returns the keys of the row.")},
    {NULL, NULL}
};
-
PyDoc_VAR(name)¶
Declares a static character array variable with the given name. Expands to static const char name[].
For example:
PyDoc_VAR(python_doc) = PyDoc_STR(
    "A genus of constricting snakes in the Pythonidae family native "
    "to the tropics and subtropics of the Eastern Hemisphere.");
General utility macros¶
The following macros are for common tasks not specific to Python.
-
Py_UNUSED(arg)¶
Use this for unused arguments in a function definition to silence compiler warnings. Example: int func(int a, int Py_UNUSED(b)) { return a; }.
Added in version 3.4.
-
Py_GCC_ATTRIBUTE(name)¶
Use a GCC attribute name, hiding it from compilers that don’t support GCC attributes (such as MSVC).
This expands to __attribute__((name)) on a GCC compiler, and expands to nothing on compilers that don’t support GCC attributes.
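How an attribute can be hidden from compilers that lack it may be easier to see in isolation. The following is a minimal sketch, not the definitions from Python.h: DEMO_GCC_ATTRIBUTE, DEMO_UNUSED and demo_first are hypothetical names, and the compiler check is a simplifying assumption.

```c
/* Hypothetical stand-ins for illustration; these are not CPython's macros. */
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_GCC_ATTRIBUTE(name) __attribute__((name))
#else
#  define DEMO_GCC_ATTRIBUTE(name)   /* expands to nothing elsewhere */
#endif

/* Rename the parameter so that accidental use fails to compile, and mark
   it as unused on compilers that understand the attribute. */
#define DEMO_UNUSED(name) demo_unused_##name DEMO_GCC_ATTRIBUTE(unused)

/* The second parameter is part of the signature but deliberately ignored. */
static int
demo_first(int a, int DEMO_UNUSED(b))
{
    return a;
}
```

Calling demo_first(1, 2) returns its first argument; the second one is accepted only to keep the signature stable.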
Numeric utilities¶
-
Py_ABS(x)¶
Return the absolute value of x.
The argument may be evaluated more than once. Consequently, do not pass an expression with side-effects directly to this macro.
If the result cannot be represented (for example, if x has INT_MIN value for int type), the behavior is undefined.
Corresponds roughly to ((x) < 0 ? -(x) : (x)).
Added in version 3.3.
-
Py_MAX(x, y)¶
-
Py_MIN(x, y)¶
Return the larger or smaller of the arguments, respectively.
Any arguments may be evaluated more than once. Consequently, do not pass an expression with side-effects directly to this macro.
Py_MAX corresponds roughly to (((x) > (y)) ? (x) : (y)).
Added in version 3.3.
-
Py_ARITHMETIC_RIGHT_SHIFT(type, integer, positions)¶
Similar to integer >> positions, but forces sign extension, as the C standard does not define whether a right-shift of a signed integer will perform sign extension or a zero-fill.
integer should be any signed integer type. positions is the number of positions to shift to the right.
Both integer and positions can be evaluated more than once; consequently, avoid directly passing a function call or some other operation with side-effects to this macro. Instead, store the result as a variable and then pass it.
type is unused and only kept for backwards compatibility. Historically, type was used to cast integer.
Changed in version 3.1: This macro is now valid for all signed integer types, not just those for which unsigned type is legal. As a result, type is no longer used.
-
Py_CHARMASK(c)¶
Argument must be a character or an integer in the range [-128, 127] or [0, 255]. This macro returns c cast to an unsigned char.
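The warning about repeated evaluation in the numeric macros above can be made concrete with a small standalone sketch. DEMO_ABS and DEMO_MAX are hypothetical stand-ins that copy the "corresponds roughly to" expansions quoted in this section; they are not the definitions from Python.h. The shift helper shows one portable way to force sign extension and is likewise only an illustration, not necessarily how Py_ARITHMETIC_RIGHT_SHIFT is implemented on a given platform.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins mirroring the rough expansions quoted above. */
#define DEMO_ABS(x)    ((x) < 0 ? -(x) : (x))
#define DEMO_MAX(x, y) (((x) > (y)) ? (x) : (y))

static int demo_calls = 0;

/* Returns -5 and records that it was called (a visible side effect). */
static int
demo_next_value(void)
{
    demo_calls++;
    return -5;
}

/* Unsafe: the argument text appears twice in the expansion, so the
   function runs once for the comparison and once for the result. */
static int
demo_abs_unsafe(void)
{
    return DEMO_ABS(demo_next_value());
}

/* Safe: evaluate once into a variable, then pass the variable. */
static int
demo_abs_safe(void)
{
    int value = demo_next_value();
    return DEMO_ABS(value);
}

/* Right shift that rounds toward negative infinity even if the compiler
   zero-fills when shifting negative values. */
static long
demo_arithmetic_right_shift(long value, int positions)
{
    assert(positions >= 0 && positions < (int)(sizeof(long) * CHAR_BIT));
    return value < 0 ? -1 - ((-1 - value) >> positions)
                     : value >> positions;
}
```

Both demo_abs_unsafe() and demo_abs_safe() return 5, but the first calls demo_next_value() twice and the second only once.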
Assertion utilities¶
-
Py_UNREACHABLE()¶
Use this when you have a code path that cannot be reached by design. For example, in the default: clause in a switch statement for which all possible values are covered in case statements. Use this in places where you might be tempted to put an assert(0) or abort() call.
In release mode, the macro helps the compiler to optimize the code, and avoids a warning about unreachable code. For example, the macro is implemented with __builtin_unreachable() on GCC in release mode.
In debug mode, and on unsupported compilers, the macro expands to a call to Py_FatalError().
A use for Py_UNREACHABLE() is following a call to a function that never returns but that is not declared _Noreturn.
If a code path is very unlikely but can be reached in exceptional cases, this macro must not be used. Examples are low memory conditions, or a system call that returns a value out of the expected range. In such cases, it’s better to report the error to the caller. If the error cannot be reported to the caller, Py_FatalError() can be used.
Added in version 3.7.
-
Py_SAFE_DOWNCAST(value, larger, smaller)¶
Cast value to type smaller from type larger, validating that no information was lost.
On release builds of Python, this is roughly equivalent to ((smaller) value) (in C++, static_cast<smaller>(value) will be used instead).
On debug builds (implying that Py_DEBUG is defined), this asserts that no information was lost with the cast from larger to smaller.
value, larger, and smaller may all be evaluated more than once in the expression; consequently, do not pass an expression with side-effects directly to this macro.
-
Py_BUILD_ASSERT(cond)¶
Asserts a compile-time condition cond, as a statement. The build will fail if the condition is false or cannot be evaluated at compile time.
Corresponds roughly to static_assert(cond) on C23 and above.
For example:
Py_BUILD_ASSERT(sizeof(PyTime_t) == sizeof(int64_t));
Added in version 3.3.
-
Py_BUILD_ASSERT_EXPR(cond)¶
Asserts a compile-time condition cond, as an expression that evaluates to 0. The build will fail if the condition is false or cannot be evaluated at compile time.
For example:
#define foo_to_char(foo) \
    ((char *)(foo) + Py_BUILD_ASSERT_EXPR(offsetof(struct foo, string) == 0))
Added in version 3.3.
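To see why a compile-time assertion can be used as an expression, and what a checked downcast verifies, consider the following standalone sketch. DEMO_BUILD_ASSERT_EXPR uses the classic negative-array-size trick; it is a hypothetical stand-in, and so are struct demo_record, demo_record_name and demo_narrow_to_i32. None of these names come from Python.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a false condition yields char[-1], which does not
   compile; a true condition yields sizeof(char[1]) - 1, which is 0. */
#define DEMO_BUILD_ASSERT_EXPR(cond) (sizeof(char[1 - 2 * !(cond)]) - 1)

struct demo_record {
    char name[16];
    int id;
};

/* Reinterpreting the record pointer as a string is only valid while `name`
   is the first member, so that assumption is checked at compile time. */
#define demo_record_name(rec) \
    ((char *)(rec) + \
     DEMO_BUILD_ASSERT_EXPR(offsetof(struct demo_record, name) == 0))

/* Narrowing cast that asserts, at run time, that no information was lost.
   With NDEBUG defined the check disappears and only the cast remains. */
static int32_t
demo_narrow_to_i32(int64_t value)
{
    int32_t narrowed = (int32_t)value;
    assert((int64_t)narrowed == value);
    return narrowed;
}
```

Moving the id member in front of name would turn the offsetof condition false and stop the build at the macro's use site, rather than failing at run time.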
Type size utilities¶
-
Py_ARRAY_LENGTH(array)¶
Compute the length of a statically allocated C array at compile time.
The array argument must be a C array with a size known at compile time. Passing an array with an unknown size, such as a heap-allocated array, will result in a compilation error on some compilers, or otherwise produce incorrect results.
This is roughly equivalent to:
sizeof(array) / sizeof((array)[0])
-
Py_MEMBER_SIZE(type, member)¶
Return the size of a structure (type) member in bytes.
Corresponds roughly to sizeof(((type *)NULL)->member).
Added in version 3.6.
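The two size computations above can be tried without the interpreter. In this minimal sketch, DEMO_ARRAY_LENGTH and DEMO_MEMBER_SIZE are hypothetical stand-ins that copy the rough equivalents quoted in this section, and struct demo_point and demo_primes are invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins mirroring the rough equivalents quoted above. */
#define DEMO_ARRAY_LENGTH(array)       (sizeof(array) / sizeof((array)[0]))
#define DEMO_MEMBER_SIZE(type, member) sizeof(((type *)NULL)->member)

struct demo_point {
    double x;
    double y;
    char tag[8];
};

/* A statically allocated array: its size is known at compile time. */
static const int demo_primes[] = {2, 3, 5, 7, 11};

/* The array itself is in scope here, so the length is computed correctly.
   Had the array been received as a `const int *` parameter, sizeof would
   measure the pointer instead, which is the misuse warned about above. */
static long
demo_sum_primes(void)
{
    long total = 0;
    for (size_t i = 0; i < DEMO_ARRAY_LENGTH(demo_primes); i++) {
        total += demo_primes[i];
    }
    return total;
}
```

The sizeof operand in DEMO_MEMBER_SIZE is never evaluated, so the null pointer is not dereferenced.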
Macro definition utilities¶
-
Py_FORCE_EXPANSION(X)¶
This is equivalent to
X, which is useful for token-pasting in macros, as macro expansions in X are forcefully evaluated by the preprocessor.
-
Py_STRINGIFY(x)¶
Convert x to a C string. For example, Py_STRINGIFY(123) returns "123".
Added in version 3.4.
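Forcing expansion matters because the preprocessor's # operator acts on its argument as written. The sketch below contrasts a one-level and a two-level stringifying macro; DEMO_XSTRINGIFY, DEMO_STRINGIFY and DEMO_VERSION_MAJOR are hypothetical names chosen for the illustration and are not part of Python.h.

```c
/* One level: `#x` stringifies the argument text exactly as written. */
#define DEMO_XSTRINGIFY(x) #x

/* Two levels: the argument is macro-expanded first, then stringified. */
#define DEMO_STRINGIFY(x) DEMO_XSTRINGIFY(x)

#define DEMO_VERSION_MAJOR 3

/* Holds "DEMO_VERSION_MAJOR": the macro name itself was stringified. */
static const char demo_direct[] = DEMO_XSTRINGIFY(DEMO_VERSION_MAJOR);

/* Holds "3": the macro was expanded before stringification. */
static const char demo_expanded[] = DEMO_STRINGIFY(DEMO_VERSION_MAJOR);
```

The extra level of indirection is what lets a value defined elsewhere, such as a version number, end up inside a string literal.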
Declaration utilities¶
The following macros can be used in declarations. They are most useful for defining the C API itself, and have limited use for extension authors. Most of them expand to compiler-specific spellings of common extensions to the C language.
-
Py_ALWAYS_INLINE¶
Ask the compiler to always inline a static inline function. The compiler can ignore it and decide to not inline the function.
Corresponds to always_inline attribute in GCC and __forceinline in MSVC.
It can be used to inline performance critical static inline functions when building Python in debug mode with function inlining disabled. For example, MSC disables function inlining when building in debug mode.
Blindly marking a static inline function with Py_ALWAYS_INLINE can result in worse performance (due to increased code size, for example). The compiler is usually better than the developer at the cost/benefit analysis.
If Python is built in debug mode (if the Py_DEBUG macro is defined), the Py_ALWAYS_INLINE macro does nothing.
It must be specified before the function return type. Usage:
static inline Py_ALWAYS_INLINE int random(void) { return 4; }
Added in version 3.11.
-
Py_NO_INLINE¶
Disable inlining on a function. For example, it reduces the C stack consumption: useful on LTO+PGO builds which heavily inline code (see bpo-33720).
Corresponds to the noinline attribute/specification on GCC and MSVC.
Usage:
Py_NO_INLINE static int random(void) { return 4; }
Added in version 3.11.
-
Py_DEPRECATED(version)¶
Use this to declare APIs that were deprecated in a specific CPython version. The macro must be placed before the symbol name.
Example:
Py_DEPRECATED(3.8) PyAPI_FUNC(int) Py_OldFunction(void);
Changed in version 3.8: MSVC support was added.
-
Py_LOCAL(type)¶
Declare a function returning the specified type using a fast-calling qualifier for functions that are local to the current file. Semantically, this is equivalent to
static type.
-
Py_LOCAL_SYMBOL¶
Macro used to declare a symbol as local to the shared library (hidden). On supported platforms, it ensures the symbol is not exported.
On compatible versions of GCC/Clang, it expands to
__attribute__((visibility("hidden"))).
-
Py_EXPORTED_SYMBOL¶
Macro used to declare a symbol (function or data) as exported. On Windows, this expands to __declspec(dllexport). On compatible versions of GCC/Clang, it expands to __attribute__((visibility("default"))). This macro is for defining the C API itself; extension modules should not use it.
-
Py_IMPORTED_SYMBOL¶
Macro used to declare a symbol as imported. On Windows, this expands to
__declspec(dllimport). This macro is for defining the C API itself; extension modules should not use it.
-
PyAPI_FUNC(type)¶
Macro used by CPython to declare a function as part of the C API. Its expansion depends on the platform and build configuration. This macro is intended for defining CPython’s C API itself; extension modules should not use it for their own symbols.
-
PyAPI_DATA(type)¶
Macro used by CPython to declare a public global variable as part of the C API. Its expansion depends on the platform and build configuration. This macro is intended for defining CPython’s C API itself; extension modules should not use it for their own symbols.
Outdated macros¶
The following macros were used to access features that have since been standardized in C11.
-
Py_ALIGNED(num)¶
Specify alignment to num bytes on compilers that support it.
Consider using the C11 standard _Alignas specifier over this macro.
-
Py_LL(number)¶
-
Py_ULL(number)¶
Use number as a long long or unsigned long long integer literal, respectively.
Expands to number followed by LL or LLU, respectively, but will expand to some compiler-specific suffixes on some older compilers.
Consider using the C99 standard suffixes LL and LLU directly.
-
Py_MEMCPY(dest, src, n)¶
This is a soft deprecated alias to memcpy(). Use memcpy() directly instead.
Deprecated since version 3.14: The macro is soft deprecated.
-
Py_VA_COPY¶
This is a soft deprecated alias to the C99-standard va_copy macro.
Historically, this would use a compiler-specific method to copy a va_list.
Changed in version 3.6: This is now an alias to va_copy.
Objects, Types and Reference Counts¶
Most Python/C API functions have one or more arguments as well as a return value
of type PyObject*. This type is a pointer to an opaque data type
representing an arbitrary Python object. Since all Python object types are
treated the same way by the Python language in most situations (e.g.,
assignments, scope rules, and argument passing), it is only fitting that they
should be represented by a single C type. Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
PyObject; only pointer variables of type PyObject* can be
declared. The sole exception is type objects: since these must never be
deallocated, they are typically static PyTypeObject objects.
All Python objects (even Python integers) have a type and a
reference count. An object’s type determines what kind of object it is
(e.g., an integer, a list, or a user-defined function; there are many more as
explained in The standard type hierarchy). For each of the well-known types there is a macro
to check whether an object is of that type; for instance, PyList_Check(a) is
true if (and only if) the object pointed to by a is a Python list.
Reference Counts¶
The reference count is important because today’s computers have a finite (and often severely limited) memory size; it counts how many different places there are that have a strong reference to an object. Such a place could be another object, or a global (or static) C variable, or a local variable in some C function. When the last strong reference to an object is released (i.e. its reference count becomes zero), the object is deallocated. If it contains references to other objects, those references are released. Those other objects may be deallocated in turn, if there are no more references to them, and so on. (There’s an obvious problem with objects that reference each other here; for now, the solution is “don’t do that.”)
Reference counts are always manipulated explicitly. The normal way is
to use the macro Py_INCREF() to take a new reference to an
object (i.e. increment its reference count by one),
and Py_DECREF() to release that reference (i.e. decrement the
reference count by one). The Py_DECREF() macro
is considerably more complex than the incref one, since it must check whether
the reference count becomes zero and then cause the object’s deallocator to be
called. The deallocator is a function pointer contained in the object’s type
structure. The type-specific deallocator takes care of releasing references
for other objects contained in the object if this is a compound
object type, such as a list, as well as performing any additional finalization
that’s needed. There’s no chance that the reference count can overflow; at
least as many bits are used to hold the reference count as there are distinct
memory locations in virtual memory (assuming sizeof(Py_ssize_t) >= sizeof(void*)).
Thus, the reference count increment is a simple operation.
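The mechanism described above can be modelled without the interpreter. The following is a toy sketch only: toy_object, toy_type, toy_incref, toy_decref and toy_new are hypothetical names, and the layout is far simpler than CPython's real object structures. It shows a cheap increment, a decrement that calls the deallocator found through the type when the count reaches zero, and a deallocator that releases the references held by a compound object.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_object;
typedef void (*toy_destructor)(struct toy_object *);

/* The deallocator is a function pointer stored in the type structure. */
typedef struct toy_type {
    const char *name;
    toy_destructor dealloc;
} toy_type;

typedef struct toy_object {
    long refcnt;
    const toy_type *type;
    struct toy_object *child;   /* strong reference, or NULL */
} toy_object;

static int toy_dealloc_count = 0;

/* Taking a reference is a plain increment. */
static void
toy_incref(toy_object *op)
{
    op->refcnt++;
}

/* Releasing one must check for zero and then call the deallocator. */
static void
toy_decref(toy_object *op)
{
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) {
        op->type->dealloc(op);
    }
}

/* Type-specific deallocator: release contained references, then free. */
static void
toy_dealloc(toy_object *op)
{
    if (op->child != NULL) {
        toy_decref(op->child);
    }
    toy_dealloc_count++;
    free(op);
}

static const toy_type toy_basic_type = {"toy", toy_dealloc};

/* Returns an object with one reference owned by the caller; the new object
   takes its own strong reference to child. Returns NULL if out of memory. */
static toy_object *
toy_new(toy_object *child)
{
    toy_object *op = malloc(sizeof(*op));
    if (op == NULL) {
        return NULL;
    }
    op->refcnt = 1;
    op->type = &toy_basic_type;
    op->child = child;
    if (child != NULL) {
        toy_incref(child);
    }
    return op;
}
```

Releasing the last reference to a parent created with toy_new(leaf) deallocates the parent and, if nothing else refers to it, the leaf as well.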
It is not necessary to hold a strong reference (i.e. increment the reference count) for every local variable that contains a pointer to an object. In theory, the object’s reference count goes up by one when the variable is made to point to it and it goes down by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn’t changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it. If we know that there is at least one other reference to the object that lives at least as long as our variable, there is no need to take a new strong reference (i.e. increment the reference count) temporarily. An important situation where this arises is in objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call.
However, a common pitfall is to extract an object from a list and hold on to it
for a while without taking a new reference. Some other operation might
conceivably remove the object from the list, releasing that reference,
and possibly deallocating it. The real danger is that innocent-looking
operations may invoke arbitrary Python code which could do this; there is a code
path which allows control to flow back to the user from a Py_DECREF(), so
almost any operation is potentially dangerous.
A safe approach is to always use the generic operations (functions whose name
begins with PyObject_, PyNumber_, PySequence_ or PyMapping_).
These operations always create a new strong reference
(i.e. increment the reference count) to the object they return.
This leaves the caller with the responsibility to call Py_DECREF() when
they are done with the result; this soon becomes second nature.
Reference Count Details¶
The reference count behavior of functions in the Python/C API is best explained
in terms of ownership of references. Ownership pertains to references, never
to objects (objects are not owned: they are always shared). “Owning a
reference” means being responsible for calling Py_DECREF on it when the
reference is no longer needed. Ownership can also be transferred, meaning that
the code that receives ownership of the reference then becomes responsible for
eventually releasing it by calling Py_DECREF() or Py_XDECREF()
when it’s no longer needed—or passing on this responsibility (usually to its
caller). When a function passes ownership of a reference on to its caller, the
caller is said to receive a