3. Data model
3.1. Objects, values and types
Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann’s model of a "stored program computer," code is also represented by objects.)
Every object has an identity, a type and a value. An object’s identity never
changes once it has been created; you may think of it as the object’s address in
memory. The is operator compares the identity of two objects; the
id() function returns an integer representing an object’s identity.
CPython implementation detail: For CPython, id(x) is the memory address where x is stored.
An object’s type determines the operations that the object supports (e.g., "does
it have a length?") and also defines the possible values for objects of that
type. The type() function returns an object’s type (which is an object
itself). Like its identity, an object’s type is also unchangeable.
[1]
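The identity and type rules described above can be checked directly. The following is a minimal sketch; the variable names are illustrative only, and the concrete integers returned by id() differ between runs and implementations.

```python
x = [1, 2, 3]
y = x            # a second name bound to the same object
z = [1, 2, 3]    # an equal value, but a separately created list

same_object = x is y                    # both names refer to one object
equal_but_distinct = (x == z, x is z)   # same value, different identity

identity_before = id(x)
x.append(4)                             # changing the value...
identity_stable = id(x) == identity_before   # ...does not change the identity

type_of_x = type(x)                     # the list type
type_is_object = isinstance(type_of_x, object)   # a type is itself an object
```

Note that is compares identities while == compares values, which is why z is equal to the original list without being the same object.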
The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. (The value of an immutable container object that contains a reference to a mutable object can change when the latter’s value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.) An object’s mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.
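The subtlety around immutable containers can be illustrated with a tuple that holds a list. This is a small sketch with illustrative names: the tuple's value appears to change, yet the set of objects it refers to stays fixed.

```python
inner = [1, 2]
t = (inner, "fixed")

ids_before = tuple(id(item) for item in t)
inner.append(3)                 # mutate the list that the tuple refers to
ids_after = tuple(id(item) for item in t)

# The tuple itself refuses to be rebound to different objects.
try:
    t[0] = []
except TypeError:
    rebinding_rejected = True
else:
    rebinding_rejected = False
```

After the append, t compares equal to ([1, 2, 3], "fixed"), but the identities of its items are the same as before.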
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
CPython implementation detail: CPython currently uses a reference-counting scheme with (optional) delayed
detection of cyclically linked garbage, which collects most objects as soon
as they become unreachable, but is not guaranteed to collect garbage
containing circular references. See the documentation of the gc
module for information on controlling the collection of cyclic garbage.
Other implementations act differently and CPython may change.
Do not depend on immediate finalization of objects when they become
unreachable (so you should always close files explicitly).
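The behavior described for CPython can be observed with a reference cycle and a weak reference. This sketch assumes CPython; other implementations may collect at different times, so the comments describe CPython only. The Node class is hypothetical and exists just for the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""


was_enabled = gc.isenabled()
gc.disable()                      # keep the automatic cycle detector out of the way
try:
    a = Node()
    b = Node()
    a.partner = b                 # a and b now reference each other
    b.partner = a
    probe = weakref.ref(a)        # lets us observe a without keeping it alive

    del a, b                      # unreachable, but reference counts are not zero
    alive_after_del = probe() is not None

    gc.collect()                  # run the cycle detector explicitly
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()
```

On CPython the cycle survives the del statement and disappears only after gc.collect(), which is one reason not to rely on immediate finalization.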
Note that the use of the implementation’s tracing or debugging facilities may
keep objects alive that would normally be collectable. Also note that catching
an exception with a try…except statement may keep
objects alive.
Some objects contain references to "external" resources such as open files or
windows. It is understood that these resources are freed when the object is
garbage-collected, but since garbage collection is not guaranteed to happen,
such objects also provide an explicit way to release the external resource,
usually a close() method. Programs are strongly recommended to explicitly
close such objects. The try…finally statement
and the with statement provide convenient ways to do this.
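Both ways of releasing an external resource can be sketched with an ordinary file. The scratch file here is created with tempfile purely for illustration; its path is not part of the original text.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this example.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# Option 1: try...finally closes the file even if the write raises.
f = open(path, "w")
try:
    f.write("hello\n")
finally:
    f.close()
closed_by_finally = f.closed

# Option 2: the with statement closes the file when the block exits.
with open(path) as g:
    content = g.read()
closed_by_with = g.closed

os.remove(path)
```

The with form is usually shorter, but both guarantee that close() runs without waiting for garbage collection.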
Some objects contain references to other objects; these are called containers. Examples of containers are tuples, lists and dictionaries. The references are part of a container’s value. In most cases, when we talk about the value of a container, we imply the values, not the identities of the contained objects; however, when we talk about the mutability of a container, only the identities of the immediately contained objects are implied. So, if an immutable container (like a tuple) contains a reference to a mutable object, its value changes if that mutable object is changed.
Types affect almost all aspects of object behavior. Even the importance of
object identity is affected in some sense: for immutable types, operations that
compute new values may actually return a reference to any existing object with
the same type and value, while for mutable objects this is not allowed. E.g.,
after a = 1; b = 1, a and b may or may not refer to the same object
with the value one, depending on the implementation, but after c = []; d =
[], c and d are guaranteed to refer to two different, unique, newly
created empty lists. (Note that c = d = [] assigns the same object to both
c and d.)
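The difference between immutable and mutable types with respect to identity can be sketched as follows. The result for the two integers is deliberately left unchecked, because it depends on the implementation.

```python
a = 1
b = 1
ints_shared = a is b        # may be True or False; do not rely on either

c = []
d = []
distinct_lists = c is not d  # guaranteed: two newly created lists

e = f = []                   # one list, bound to two names
shared_list = e is f
e.append("x")                # visible through f as well
```

Because e and f name the same list, the append through e is also seen through f, while c and d remain independent.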
3.2. The standard type hierarchy
Below is a list of the types that are built into Python. Extension modules (written in C, Java, or other languages, depending on the implementation) can define additional types. Future versions of Python may add types to the type hierarchy (e.g., rational numbers, efficiently stored arrays of integers, etc.), although such additions will often be provided via the standard library instead.
Some of the type descriptions below contain a paragraph listing "special attributes." These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future.
- None
This type has a single value. There is a single object with this value. This object is accessed through the built-in name None. It is used to signify the absence of a value in many situations, e.g., it is returned from functions that don’t explicitly return anything. Its truth value is false.
- NotImplemented
This type has a single value. There is a single object with this value. This object is accessed through the built-in name NotImplemented. Numeric methods and rich comparison methods should return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.) Its truth value is true.
See Implementing the arithmetic operations for more details.
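The fallback to the reflected operation can be sketched with a small numeric wrapper. The Meters class is hypothetical and serves only to show when NotImplemented is returned.

```python
class Meters:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented      # let the interpreter try other.__radd__

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented


total = Meters(2) + Meters(3)      # handled by Meters.__add__
reflected = 5 + Meters(1)          # int cannot add a Meters, so __radd__ runs

# When neither side implements the operation, the interpreter raises TypeError.
try:
    Meters(1) + "a"
except TypeError:
    unsupported_raises = True
else:
    unsupported_raises = False
```

Returning NotImplemented, rather than raising, gives the other operand a chance to handle the operation.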
- Ellipsis
This type has a single value. There is a single object with this value. This object is accessed through the literal ... or the built-in name Ellipsis. Its truth value is true.
- numbers.Number
These are created by numeric literals and returned as results by arithmetic operators and arithmetic built-in functions. Numeric objects are immutable; once created their value never changes. Python numbers are of course strongly related to mathematical numbers, but subject to the limitations of numerical representation in computers.
Python distinguishes between integers, floating point numbers, and complex numbers:
- numbers.Integral
These represent elements from the mathematical set of integers (positive and negative).
There are two types of integers:
- Integers (int)
These represent numbers in an unlimited range, subject to available (virtual) memory only. For the purpose of shift and mask operations, a binary representation is assumed, and negative numbers are represented in a variant of 2’s complement which gives the illusion of an infinite string of sign bits extending to the left.
- Booleans (bool)
These represent the truth values False and True. The two objects representing the values False and True are the only Boolean objects. The Boolean type is a subtype of the integer type, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.
The rules for integer representation are intended to give the most meaningful interpretation of shift and mask operations involving negative integers.
- numbers.Real (float)
These represent machine-level double precision floating point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating point numbers.
- numbers.Complex (complex)
These represent complex numbers as a pair of machine-level double precision floating point numbers. The same caveats apply as for floating point numbers. The real and imaginary parts of a complex number z can be retrieved through the read-only attributes z.real and z.imag.
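A few of the numeric properties listed above can be checked in a short sketch. The variable names are illustrative; the abstract classes come from the standard numbers module.

```python
import numbers

big = 2 ** 100                     # integers grow as needed
neg_shift = -8 >> 1                # sign bits extend to the left
neg_mask = -1 & 0xFF               # -1 acts like an endless run of 1 bits

bool_is_int = issubclass(bool, int)
true_plus_true = True + True       # Booleans behave like 0 and 1
bool_text = str(True)              # except when converted to a string

z = complex(3, 4)
parts = (z.real, z.imag)           # read-only float attributes

kinds = (
    isinstance(1, numbers.Integral),
    isinstance(1.5, numbers.Real),
    isinstance(z, numbers.Complex),
)
```

The shift and mask results follow from the 2's complement view of negative integers described above.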
- Sequences
These represent finite ordered sets indexed by non-negative numbers. The built-in function len() returns the number of items of a sequence. When the length of a sequence is n, the index set contains the numbers 0, 1, …, n-1. Item i of sequence a is selected by a[i].
Sequences also support slicing: a[i:j] selects all items with index k such that i <= k < j. When used as an expression, a slice is a sequence of the same type. This implies that the index set is renumbered so that it starts at 0.
Some sequences also support "extended slicing" with a third "step" parameter: a[i:j:k] selects all items of a with index x where x = i + n*k, n >= 0 and i <= x < j.
Sequences are distinguished according to their mutability:
- Immutable sequences
An object of an immutable sequence type cannot change once it is created. (If the object contains references to other objects, these other objects may be mutable and may be changed; however, the collection of objects directly referenced by an immutable object cannot change.)
The following types are immutable sequences:
- Strings
A string is a sequence of values that represent Unicode code points. All the code points in the range U+0000 - U+10FFFF can be represented in a string. Python doesn’t have a char type; instead, every code point in the string is represented as a string object with length 1. The built-in function ord() converts a code point from its string form to an integer in the range 0 - 10FFFF; chr() converts an integer in the range 0 - 10FFFF to the corresponding length 1 string object. str.encode() can be used to convert a str to bytes using the given text encoding, and bytes.decode() can be used to achieve the opposite.
- Tuples
The items of a tuple are arbitrary Python objects. Tuples of two or more items are formed by comma-separated lists of expressions. A tuple of one item (a "singleton") can be formed by affixing a comma to an expression (an expression by itself does not create a tuple, since parentheses must be usable for grouping of expressions). An empty tuple can be formed by an empty pair of parentheses.
- Bytes
A bytes object is an immutable array. The items are 8-bit bytes, represented by integers in the range 0 <= x < 256. Bytes literals (like b'abc') and the built-in bytes() constructor can be used to create bytes objects. Also, bytes objects can be decoded to strings via the decode() method.
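Indexing, slicing and the three immutable sequence types described above can be sketched together. The sample values are arbitrary.

```python
a = "python"
first = a[0]               # item 0
middle = a[1:4]            # items with index k where 1 <= k < 4
stepped = a[0:6:2]         # extended slicing: indices 0, 2, 4

# Strings: code points are length-1 strings, converted with ord() and chr().
code_point = ord("\u00e9")
top_char = chr(0x10FFFF)   # highest code point, still a string of length 1
encoded = "\u00e9".encode("utf-8")
decoded = encoded.decode("utf-8")

# Tuples: the comma, not the parentheses, forms the tuple.
pair = 1, 2
singleton = (1,)
not_a_tuple = (1)          # just the integer 1
empty = ()

# Bytes: items are integers in the range 0 <= x < 256.
data = b"abc"
first_byte = data[0]
text = data.decode("ascii")
```

A slice of a string is again a string, and its index set starts at 0 regardless of where the slice began.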
- Mutable sequences
Mutable sequences can be changed after they are created. The subscription and slicing notations can be used as the target of assignment and del (delete) statements.
There are currently two intrinsic mutable sequence types:
- Lists
The items of a list are arbitrary Python objects. Lists are formed by placing a comma-separated list of expressions in square brackets. (Note that there are no special cases needed to form lists of length 0 or 1.)
- Byte Arrays
A bytearray object is a mutable array. Bytearray objects are created by the built-in bytearray() constructor. Aside from being mutable (and hence unhashable), byte arrays otherwise provide the same interface and functionality as immutable bytes objects.
The extension module array provides an additional example of a mutable sequence type, as does the collections module.
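Using subscriptions and slices as assignment and del targets can be sketched for both intrinsic mutable sequence types. The values are arbitrary.

```python
items = [0, 1, 2, 3, 4, 5]
items[0] = "zero"               # subscription as an assignment target
items[1:3] = ["a", "b", "c"]    # slice assignment may change the length
del items[-1]                   # del removes the selected item

buf = bytearray(b"abc")
buf[0] = ord("A")               # items are integers, as with bytes
buf.extend(b"de")
frozen = bytes(buf)             # immutable copy with the same contents

# Being mutable, a bytearray cannot be hashed.
try:
    hash(buf)
except TypeError:
    bytearray_hashable = False
else:
    bytearray_hashable = True
```

Converting to bytes produces an immutable, hashable object when one is needed, for example as a dictionary key.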
- Set types
These represent unordered, finite sets of unique, immutable objects. As such, they cannot be indexed by any subscript. However, they can be iterated over, and the built-in function len() returns the number of items in a set. Common uses for sets are fast membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
For set elements, the same immutability rules apply as for dictionary keys. Note that numeric types obey the normal rules for numeric comparison: if two numbers compare equal (e.g., 1 and 1.0), only one of them can be contained in a set.
There are currently two intrinsic set types:
- Sets
These represent a mutable set. They are created by the built-in set() constructor and can be modified afterwards by several methods, such as add().
- Frozen sets
These represent an immutable set. They are created by the built-in frozenset() constructor. As a frozenset is immutable and hashable, it can be used again as an element of another set, or as a dictionary key.
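The common set uses named above can be sketched briefly. The sample values are arbitrary.

```python
unique = set([3, 1, 3, 2, 1])     # duplicates are removed
has_two = 2 in unique             # membership test
mixed = {1, 1.0}                  # 1 == 1.0, so only one of them is kept

evens = {2, 4, 6}
intersection = unique & evens
union = unique | evens
difference = unique - evens
symmetric = unique ^ evens

unique.add(10)                    # a set can be modified after creation

fs = frozenset({1, 2})
nested = {fs}                     # a frozenset can be an element of a set
lookup = {fs: "pair"}             # or a dictionary key

# A mutable set is unhashable and cannot be nested this way.
try:
    {set()}
except TypeError:
    set_in_set_rejected = True
else:
    set_in_set_rejected = False
```

The operators used here have method equivalents such as intersection() and union().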
- Mappings
These represent finite sets of objects indexed by arbitrary index sets. The subscript notation a[k] selects the item indexed by k from the mapping a; this can be used in expressions and as the target of assignments or del statements. The built-in function len() returns the number of items in a mapping.
There is currently a single intrinsic mapping type:
- Dictionaries
These represent finite sets of objects indexed by nearly arbitrary values. The only types of values not acceptable as keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant. Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (e.g., 1 and 1.0) then they can be used interchangeably to index the same dictionary entry.
Dictionaries are mutable; they can be created by the {...} notation (see section Dictionary displays).
The extension modules dbm.ndbm and dbm.gnu provide additional examples of mapping types, as does the collections module.
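The key rules for dictionaries can be sketched as follows. The keys and values are arbitrary.

```python
d = {"a": 1}
d["b"] = 2                   # subscription as an assignment target
del d["a"]                   # and as a del target

d[1] = "int key"
d[1.0] = "float key"         # 1 == 1.0, so this replaces the same entry
size = len(d)                # entries: "b" and 1

# Keys that contain lists are rejected, because they cannot be hashed.
try:
    d[[1, 2]] = "x"
except TypeError:
    list_key_rejected = True
else:
    list_key_rejected = False

try:
    d[([1], 2)] = "x"        # a tuple containing a list is rejected too
except TypeError:
    nested_list_key_rejected = True
else:
    nested_list_key_rejected = False

d[(1, 2)] = "tuple ok"       # a tuple of immutable items is acceptable
```

The entry stored under 1 keeps its original key object while taking the newer value, so lookups with either 1 or 1.0 return "float key".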
- Callable types
These are the types to which the function call operation (see section Calls) can be applied:
- User-defined functions
A user-defined function object is created by a function definition (see section Function definitions). It should be called with an argument list containing the same number of items as the function’s formal parameter list.
Special attributes:
Attribute   Meaning
__doc__     The function’s documentation string, or None if unavailable; not inherited by subclasses (writable)
__name__    The function’s name
