Data Classes in Python 3.7 (And Above)

Data Classes in Python (Guide)

by Geir Arne Hjelle
Reading time estimate: 26m
Tags: intermediate, python, stdlib

A Python dataclass lets you define classes for storing data with less boilerplate. Use @dataclass to generate .__init__(), .__repr__(), and .__eq__() automatically. Dataclasses allow you to create classes quickly, but you can also add defaults, custom methods, ordering, immutability, inheritance, and even slots.

By the end of this tutorial, you’ll understand that:

  • Mutable defaults use field(default_factory=...), while simple defaults use field(default=...) or an inline value.
  • Type hints are required for fields, but they’re not enforced at runtime. Use typing.Any when needed.
  • Ordering comes from @dataclass(order=True) and can be customized by computing a .sort_index in .__post_init__().
  • Immutability is enabled with frozen=True, yet nested mutable fields can still change if their types allow it.
  • Inheritance rules prohibit a non-default field in a subclass from following defaulted base-class fields.
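As a rough preview of the points above, here is a minimal sketch. The names Deck, Card, Hand, Position, Capital, and RANKS are hypothetical and chosen only for illustration. It shows one way these features can look, not the only way to use them:

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any

# Hypothetical rank order, used only to compute a sort key.
RANKS = "2 3 4 5 6 7 8 9 10 J Q K A".split()

@dataclass
class Deck:
    cards: list = field(default_factory=list)  # mutable default: new list per instance
    name: str = "standard"                     # simple inline default
    owner: Any = None                          # typing.Any when no narrower hint fits

@dataclass(order=True)
class Card:
    # Listed first so the generated comparison methods look at it first.
    sort_index: int = field(init=False, repr=False)
    rank: str
    suit: str

    def __post_init__(self):
        self.sort_index = RANKS.index(self.rank)

@dataclass(frozen=True)
class Hand:
    cards: list = field(default_factory=list)

# Each Deck gets its own list, so changing one does not affect the other.
first, second = Deck(), Deck()
first.cards.append(Card("Q", "Hearts"))
print(second.cards)  # []

# Type hints are not checked at runtime: an int is accepted for name.
odd_deck = Deck(name=42)

# Ordering follows sort_index, so an ace ranks above a queen.
ace_beats_queen = Card("A", "Spades") > Card("Q", "Hearts")

# frozen=True blocks rebinding the attribute, but the list itself can still change.
hand = Hand()
hand.cards.append(Card("2", "Clubs"))
try:
    hand.cards = []
    rebinding_blocked = False
except FrozenInstanceError:
    rebinding_blocked = True

# A non-default field after a defaulted base-class field is rejected
# when the subclass is defined.
@dataclass
class Position:
    name: str
    lon: float = 0.0

try:
    @dataclass
    class Capital(Position):
        country: str
    definition_failed = False
except TypeError:
    definition_failed = True
```

Each of these features is a separate option, so you can adopt only the ones a given class needs.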

When working through this tutorial, you’ll also get to compare dataclasses with Python’s namedtuple and attrs and identify when each option fits best.

Python’s Dataclass in a Nutshell

Since version 3.7, Python has shipped with the @dataclass decorator, which lets you create data classes. A data class is a class that typically contains mainly data, although there aren’t really any restrictions. You create one with the @dataclass decorator, as follows:

Language: Python
from dataclasses import dataclass

@dataclass
class DataClassCard:
    rank: str
    suit: str

A data class comes with basic functionality already implemented. For instance, you can instantiate, print, and compare data class instances straight out of the box:

Language: Python
>>> queen_of_hearts = DataClassCard('Q', 'Hearts')
>>> queen_of_hearts.rank
'Q'
>>> queen_of_hearts
DataClassCard(rank='Q', suit='Hearts')
>>> queen_of_hearts == DataClassCard('Q', 'Hearts')
True
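To see where this behavior comes from, you can inspect the class after decoration. The sketch below redefines DataClassCard so that it runs on its own. The variable names generated, field_names, and init_params are illustrative only:

```python
import inspect
from dataclasses import dataclass, fields

@dataclass
class DataClassCard:
    rank: str
    suit: str

# The decorator adds these methods directly to the class namespace.
generated = [
    name for name in ("__init__", "__repr__", "__eq__")
    if name in vars(DataClassCard)
]
print(generated)  # ['__init__', '__repr__', '__eq__']

# fields() lists the fields collected from the annotations, in definition order.
field_names = [f.name for f in fields(DataClassCard)]
print(field_names)  # ['rank', 'suit']

# The generated __init__ takes one parameter per field.
init_params = list(inspect.signature(DataClassCard).parameters)
print(init_params)  # ['rank', 'suit']

# The generated __eq__ compares field values, so different ranks are unequal.
cards_differ = DataClassCard("Q", "Hearts") != DataClassCard("K", "Hearts")
```

Because the methods are ordinary attributes of the class, you can still override any of them by defining your own version in the class body.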

Compare that to a regular class. A minimal regular class would look something like this:

Language: Python
class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

While this is not much more code to write, you can already see signs of the boilerplate pain: rank and suit are each repeated three times just to initialize an object. Furthermore, if you try to use this plain class, you’ll notice that the representation of the objects is not very descriptive, and that a queen of hearts does not compare equal to another queen of hearts, because a plain class falls back to comparing object identity:

Language: Python
>>> queen_of_hearts = RegularCard('Q', 'Hearts')
>>> queen_of_hearts.rank
'Q'
>>> queen_of_hearts