A Python dataclass lets you define classes for storing data with less boilerplate. Use @dataclass to generate .__init__(), .__repr__(), and .__eq__() automatically. Dataclasses allow you to create classes quickly, but you can also add defaults, custom methods, ordering, immutability, inheritance, and even slots.
By the end of this tutorial, you’ll understand that:
- Mutable defaults use field(default_factory=...), while simple defaults use field(default=...) or an inline value.
- Type hints are required for fields, but they’re not enforced at runtime. Use typing.Any when needed.
- Ordering comes from @dataclass(order=True) and can be customized by computing a .sort_index in .__post_init__().
- Immutability is enabled with frozen=True, yet nested mutable fields can still change if their types allow it.
- Inheritance rules mean that a subclass can’t add a non-default field after base-class fields that have defaults.
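As a quick preview of these points, here’s a minimal sketch. The Player, Team, Base, and Child classes are hypothetical names made up for illustration, and this is only one way to demonstrate each behavior:

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(order=True)
class Player:  # hypothetical example class
    # Computed in __post_init__; listed first so it dominates comparisons.
    sort_index: int = field(init=False, repr=False)
    name: str
    score: int = 0                            # simple inline default
    tags: list = field(default_factory=list)  # fresh list per instance

    def __post_init__(self):
        self.sort_index = self.score


@dataclass(frozen=True)
class Team:  # hypothetical example class
    name: str
    members: list = field(default_factory=list)


ann = Player("Ann", score=10)
bob = Player("Bob", score=3)
ranking = [p.name for p in sorted([ann, bob])]  # ordered by score via sort_index

team = Team("Red")
team.members.append("Ann")  # allowed: the list itself is still mutable
try:
    team.name = "Blue"      # rejected: the instance is frozen
    rename_blocked = False
except FrozenInstanceError:
    rename_blocked = True

# Type hints are not checked at runtime, so this does not raise.
odd_team = Team(name=42)


# A non-default field after a defaulted base-class field fails at class creation.
@dataclass
class Base:
    x: int = 0


try:
    @dataclass
    class Child(Base):
        y: int
    child_defined = True
except TypeError:
    child_defined = False
```

Here, ranking places Bob before Ann because the lower score sorts first, rename_blocked ends up True, and child_defined ends up False.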
When working through this tutorial, you’ll also get to compare dataclasses with Python’s namedtuple and attrs and identify when each option fits best.
Python’s Dataclass in a Nutshell
Since Python 3.7, Python has shipped with the @dataclass decorator, which lets you create data classes. A data class is a class that mainly holds data, although there aren’t really any restrictions on what else it can contain. You create one by applying the decorator to a class with annotated fields:
from dataclasses import dataclass


@dataclass
class DataClassCard:
    rank: str
    suit: str
A data class comes with basic functionality already implemented. For instance, you can instantiate, print, and compare data class instances straight out of the box:
>>> queen_of_hearts = DataClassCard('Q', 'Hearts')
>>> queen_of_hearts.rank
'Q'
>>> queen_of_hearts
DataClassCard(rank='Q', suit='Hearts')
>>> queen_of_hearts == DataClassCard('Q', 'Hearts')
True
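To confirm that this behavior comes from methods the decorator wrote for you, you can inspect the class. The following is a small sketch, not part of the original example, that looks up the generated special methods in the class namespace and lists the fields with dataclasses.fields():

```python
from dataclasses import dataclass, fields


@dataclass
class DataClassCard:
    rank: str
    suit: str


# These special methods were added by the decorator, not written by hand.
generated = [
    name for name in ("__init__", "__repr__", "__eq__")
    if name in vars(DataClassCard)
]

# fields() lists the annotated attributes that the decorator picked up.
field_names = [f.name for f in fields(DataClassCard)]

queen = DataClassCard("Q", "Hearts")
print(generated)    # ['__init__', '__repr__', '__eq__']
print(field_names)  # ['rank', 'suit']
print(repr(queen))  # DataClassCard(rank='Q', suit='Hearts')
```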
Compare that to a regular class. A minimal regular class would look something like this:
class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit
While this is not much more code to write, you can already see signs of the boilerplate pain: rank and suit are both repeated three times simply to initialize an object. Furthermore, if you try to use this plain class, you’ll notice that the representation of the objects is not very descriptive, and that one queen of hearts doesn’t compare equal to another, because a plain class falls back on comparing object identity:
>>> queen_of_hearts = RegularCard('Q', 'Hearts')
>>> queen_of_hearts.rank
'Q'
>>> queen_of_hearts