Dataclasses
19th Nov 2024
I was trying to write an article on dataclasses [1] and I looked up an example on the official website [2].
Easy enough on the face of it, but it led me down quite a rabbit hole. The @dataclass decorator generates “special methods” for a class, based on the annotated fields declared in the class body.
So if I have a class “Superhero” with the decorator:
from dataclasses import dataclass


@dataclass
class Superhero:
    """
    A class of superheroes and properties about their appearance
    in Marvel movies
    """
    name: str
    superpower: str
    appearances: int = 0

    def beckon(self):
        print(f"{self.name} with the super power - {self.superpower} has appeared in Marvel movies {self.appearances} times")


superhero = Superhero("Aquaman", "Underwater breathing", 2)
superhero.beckon()
It effectively means that:
The __init__, __repr__ and __eq__ methods are created for it automatically, and more (such as the ordering methods) can be requested through the decorator's parameters.
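A minimal sketch of what those generated methods do, using a trimmed-down copy of the class (no docstring, no beckon). The second instance and the "Thor" values are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Superhero:
    name: str
    superpower: str
    appearances: int = 0


a = Superhero("Aquaman", "Underwater breathing", 2)
b = Superhero("Aquaman", "Underwater breathing", 2)

# The generated __repr__ shows the class name and every field.
print(repr(a))  # Superhero(name='Aquaman', superpower='Underwater breathing', appearances=2)

# The generated __eq__ compares two instances field by field.
print(a == b)  # True

# The generated __init__ falls back to the declared default.
print(Superhero("Thor", "Lightning").appearances)  # 0
```

None of these three methods was written by hand; the decorator built them from the annotated fields.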
Here’s something cool though.
I created one more class, this time without dataclasses -
class Superhero_nodc:
    """
    A class of superheroes and properties about their appearance
    in Marvel movies
    """
    def __init__(self, name, superpower, appearances=0):
        self.name = name
        self.superpower = superpower
        self.appearances = appearances

    def beckon(self):
        print(f"{self.name} with the super power - {self.superpower} has appeared in Marvel movies {self.appearances} times")
Now when I check the attributes of each of these classes using -
attr_dc = dir(Superhero)
attr_nodc = dir(Superhero_nodc)
Knowing what dataclasses do, the attributes of Superhero weren’t surprising: alongside the dunder methods every class inherits from object (__init__, __repr__, __eq__, __hash__ and so on), dir lists the class’s own beckon method and a handful of dataclass-specific names.
But when we look at which attributes Superhero has that the vanilla class Superhero_nodc lacks:
>>> from pprint import pprint  # For pretty printing
>>> pprint([i for i in attr_dc if i not in attr_nodc])
['__annotations__',
 '__dataclass_fields__',
 '__dataclass_params__',
 '__match_args__',
 '__replace__',
 'appearances']
The dataclass fields and params are understandable, and I will look into why __annotations__ and __replace__ show up, but the thing that caught my attention was the presence of appearances.
Turns out that the default values set in the definition of a dataclass end up as class attributes and thereby populate the __dict__ of the class. name and superpower have no defaults, so they exist only on instances.
Hence appearances shows up when dir is called on the class.
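Here is a small sketch that checks this directly. It reuses trimmed-down copies of the two classes (docstrings and beckon left out), and the variable name only_in_dc is just an illustrative choice.

```python
from dataclasses import dataclass, fields


@dataclass
class Superhero:
    name: str
    superpower: str
    appearances: int = 0


class Superhero_nodc:
    def __init__(self, name, superpower, appearances=0):
        self.name = name
        self.superpower = superpower
        self.appearances = appearances


# Names that dir() reports for the dataclass but not for the plain class.
only_in_dc = sorted(set(dir(Superhero)) - set(dir(Superhero_nodc)))
print(only_in_dc)

# The default value lives on the class itself ...
print("appearances" in vars(Superhero))  # True
print(Superhero.appearances)             # 0

# ... while fields without a default leave no class attribute behind.
print("name" in vars(Superhero))         # False

# All three are still dataclass fields.
print([f.name for f in fields(Superhero)])  # ['name', 'superpower', 'appearances']
```

The exact contents of only_in_dc depend on the Python version, which is why the sketch prints it without claiming a fixed result.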
I also learnt about the interactive help that the Python REPL offers along the way (type help() at the prompt, or help(Superhero) for a single object).
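Outside the REPL, roughly the same help text can be captured as a string. The sketch below uses pydoc.render_doc with the plain-text renderer; the shortened docstring and the trimmed class are assumptions made to keep the example small.

```python
import pydoc
from dataclasses import dataclass


@dataclass
class Superhero:
    """A class of superheroes and their Marvel movie appearances."""
    name: str
    superpower: str
    appearances: int = 0


# In the REPL you would type help(Superhero); render_doc returns
# comparable text as a string instead of paging it to the screen.
text = pydoc.render_doc(Superhero, renderer=pydoc.plaintext)
print(text)
```

The printed page includes the docstring and the generated methods, which is a handy way to see what the decorator added.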

Also, here’s a neat little experiment I ended up doing: comparing the attributes of a dataclass and a regular class using tabs in Textual [3]!
