3. 資料模型

3.1. 物件、數值和型別

物件 是 Python 為資料的抽象表示方式。一個 Python 程式當中的所有資料皆由物件或物件之間的關係來呈現。程式碼也都是以物件呈現的。

每個物件都有一個識別性、型別,和數值。物件的識別性在物件建立後永遠不會改變;你也可以把它想成是該物件在記憶體中的位址。is 運算子會比較兩個物件的識別性是否相同;id() 函式則會回傳代表一個該物件的識別性的整數。

在 CPython 當中,id(x) 就是 x 所儲存在的記憶體位址。

一個物件的型別決定了該物件所支援的操作(例如「它有長度嗎?」),也同時定義該型別的物件能夠擁有的數值。type() 函式會回傳一個物件的型別(而該型別本身也是一個物件)。如同它的識別性,一個物件的型別 (type) 也是不可變的。[1]

某些物件的數值可被改變,這種物件稱作「可變的」(mutable);建立後數值不能變更的物件則稱作「不可變的」(immutable)。(不可變的容器物件中如果包含對於可變物件的參照,則後者的數值改變的時候前者的數值也會跟著一起改變;這種時候該容器仍會被當成是不可變的,因為它包含的物件集合仍然無法變更。因此可變或不可變嚴格說起並不等同於數值是否能被改變,它的定義有其他不明顯的細節。)一個物件是否為可變取決於它的型別;舉例來說,數字、字串和 tuple 是不可變的,而字典與串列則是可變的。

物件永遠不會被明示的摧毀;但當它們變得不再能夠存取的時候可能會被作為垃圾回收。每個實作都能延後垃圾回收或是乾脆忽略它 --- 垃圾回收如何進行完全取決於各個實作,只要沒有被回收的物件仍是可達的。

CPython 目前使用一種參照計數的方案,並提供可選的循環連結垃圾延遲偵測,這個方案會在大部分物件變得不可存取時馬上回收它們,但不保證能夠回收包含循環參照的垃圾。關於控制循環垃圾回收的資訊請見 gc 模組的說明文件。其他實作的行為不會相同,CPython 也有可能改變,因此請不要仰賴物件在變得不可存取時能夠馬上被最終化(亦即你應該總是明確關閉檔案)。

請注意,使用一個實作的追蹤或除錯工具可能會讓原本能夠回收的物件被維持存活。也請注意,使用 try...except 陳述式來抓捕例外也可能會讓物件維持存活。

某些物件包含對於「外部」資源的參照,像是開啟的檔案或是視窗。基本上這些資源會在物件被回收時釋放,但因為垃圾回收不保證會發生,這種物件也會提供明確釋放外部資源的方式 --- 通常是 close() method。強烈建議各個程式明確關閉這種物件。try...finally 陳述式與 with 陳述式提供進行明確關閉的方便手段。

某些物件包含對於其他物件的參照;這種物件被叫做「容器」。容器的範例有 tuple、串列與字典。這些參照是容器的數值的一部分。通常當我們提到容器的數值的時候,我們指的是其中包含的物件的數值,而不是它們的識別性;但當我們提到容器是否可變的時候,我們指的是直接包含在其中的物件的識別性。因此,如果一個不可變的容器(像一個 tuple)包含對於可變物件的參照,該可變物件被變更時該容器的數值也會跟著變更。

型別幾乎影響物件行為的所有面向。就連物件識別性的重要性某種程度上也受型別影響:對於不可變的型別,計算新數值的操作可能其實會回傳一個某個相同型別且相同數值的現存物件的參照;對於可變型別這則不會發生。舉例來說,在進行 a = 1; b = 1 之後,ab 可能會參照同一個物件,也可能不會,取決於所使用的實作。這是因為 int 是不可變的型別,因此 1 的參照可以重複利用。這個行為取決於所使用的實作,因此不應該依賴它,但在進行物件識別性測試的時候還是需要注意有這件事情。而在進行 c = []; d = [] 之後,cd 則保證會參照兩個不同、獨特、且新建立的空白串列。(請注意,e = f = [] 則會將同一個物件同時指派給 ef。)

3.2. 標準型別階層

Below is a list of the types that are built into Python. Extension modules (written in C, Java, or other languages, depending on the implementation) can define additional types. Future versions of Python may add types to the type hierarchy (e.g., rational numbers, efficiently stored arrays of integers, etc.), although such additions will often be provided via the standard library instead.

Some of the type descriptions below contain a paragraph listing 'special attributes.' These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future.

3.2.1. None

這個型別只有一個數值。只有一個物件有這個數值。這個物件由內建名稱 None 存取。它用來在許多情況下代表數值不存在,例如沒有明確回傳任何東西的函式就會回傳這個物件。它的真值是 false。

3.2.2. NotImplemented

這個型別只有一個數值。只有一個物件有這個數值。這個物件由內建名稱 NotImplemented 存取。數字方法和 rich comparison 方法應該在沒有為所提供的運算元實作該操作的時候回傳這個數值。(直譯器接下來則會依運算子嘗試反轉的操作或是其他的後備方案。)它不應該在預期布林值的情境中被計算。

更多細節請見 實作算術操作

在 3.9 版的變更: Evaluating NotImplemented in a boolean context was deprecated.

在 3.14 版的變更: 在預期布林值的情境中計算 NotImplemented 現在會引發 TypeError。它先前會計算為 True,並自 Python 3.9 起會發出 DeprecationWarning

3.2.3. Ellipsis

這個型別只有一個數值。只有一個物件有這個數值。這個物件由文本 ... 或內建名稱 Ellipsis 存取。它的真值是 true。

3.2.4. numbers.Number

These are created by numeric literals and returned as results by arithmetic operators and arithmetic built-in functions. Numeric objects are immutable; once created their value never changes. Python numbers are of course strongly related to mathematical numbers, but subject to the limitations of numerical representation in computers.

The string representations of the numeric classes, computed by __repr__() and __str__(), have the following properties:

  • They are valid numeric literals which, when passed to their class constructor, produce an object having the value of the original numeric.

  • The representation is in base 10, when possible.

  • Leading zeros, possibly excepting a single zero before a decimal point, are not shown.

  • Trailing zeros, possibly excepting a single zero after a decimal point, are not shown.

  • A sign is shown only when the number is negative.

Python distinguishes between integers, floating-point numbers, and complex numbers:

3.2.4.1. numbers.Integral

These represent elements from the mathematical set of integers (positive and negative).

備註

The rules for integer representation are intended to give the most meaningful interpretation of shift and mask operations involving negative integers.

There are two types of integers:

Integers (int)

These represent numbers in an unlimited range, subject to available (virtual) memory only. For the purpose of shift and mask operations, a binary representation is assumed, and negative numbers are represented in a variant of 2's complement which gives the illusion of an infinite string of sign bits extending to the left.

Booleans (bool)

These represent the truth values False and True. The two objects representing the values False and True are the only Boolean objects. The Boolean type is a subtype of the integer type, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.

3.2.4.2. numbers.Real (float)

These represent machine-level double precision floating-point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating-point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating-point numbers.

3.2.4.3. numbers.Complex (complex)

These represent complex numbers as a pair of machine-level double precision floating-point numbers. The same caveats apply as for floating-point numbers. The real and imaginary parts of a complex number z can be retrieved through the read-only attributes z.real and z.imag.

3.2.5. Sequences

These represent finite ordered sets indexed by non-negative numbers. The built-in function len() returns the number of items of a sequence. When the length of a sequence is n, the index set contains the numbers 0, 1, ..., n-1. Item i of sequence a is selected by a[i]. Some sequences, including built-in sequences, interpret negative subscripts by adding the sequence length. For example, a[-2] equals a[n-2], the second to last item of sequence a with length n.

The resulting value must be a nonnegative integer less than the number of items in the sequence. If it is not, an