EverythingPython

Dataclasses

19th Nov 2024

I was trying to write an article on dataclasses [1] and I looked up an example on the official website [2].

Easy enough on the face of it, but it led me into quite a rabbit hole. A dataclass has "special methods" generated for it when the @dataclass decorator is applied to the class.

So if I have a class "Superhero" with the decorator:

from dataclasses import dataclass

@dataclass
class Superhero:
    """
    A class of superheroes and properties about their appearance
    in Marvel movies
    """
    name: str
    superpower: str
    appearances: int = 0

    def beckon(self):
        print(f"{self.name} with the super power - {self.superpower} has appeared in Marvel movies {self.appearances} times")


superhero = Superhero("Aquaman", "Underwater breathing", 2)
superhero.beckon()

It effectively means that the __init__, __repr__ and __eq__ methods (among others) are created for the class automatically.
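A quick way to see those generated methods at work is a small sketch like the one below. It redefines a trimmed-down Superhero (no docstring, no beckon) so that it runs on its own, and the variable names are only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Superhero:
    name: str
    superpower: str
    appearances: int = 0


# __init__ is generated from the annotated fields, in declaration order.
aquaman = Superhero("Aquaman", "Underwater breathing", 2)
same_hero = Superhero("Aquaman", "Underwater breathing", 2)

# __repr__ is generated too, so the field values show up when printing.
hero_repr = repr(aquaman)
print(hero_repr)

# __eq__ compares instances field by field rather than by identity.
print(aquaman == same_hero)
print(aquaman is same_hero)
```

Without the decorator, the last comparison pair would both be False, because a plain class falls back to identity comparison.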


Here’s something cool though.

I created one more class, this time without dataclasses -

class Superhero_nodc():
    """
    A class of superheroes and properties about their appearance
    in Marvel movies
    """
    def __init__(self, name, superpower, appearances=0):
        self.name = name
        self.superpower = superpower
        self.appearances = appearances

    def beckon(self):
        print(f"{self.name} with the super power - {self.superpower} has appeared in Marvel movies {self.appearances} times")

Now I check the attributes of each of these classes using -

attr_dc = dir(Superhero)
attr_nodc = dir(Superhero_nodc)

Knowing what dataclasses do, the attributes of Superhero weren't surprising. On Python 3.13 (the exact names vary by version), dir(Superhero) gives -

['__annotations__', '__class__', '__dataclass_fields__', '__dataclass_params__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__firstlineno__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__match_args__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__replace__', '__repr__', '__setattr__', '__sizeof__', '__static_attributes__', '__str__', '__subclasshook__', '__weakref__', 'appearances', 'beckon']

The interesting part is which attributes Superhero has that the vanilla class Superhero_nodc does not -

>>> from pprint import pprint # For pretty printing
>>> pprint([i for i in attr_dc if i not in attr_nodc])
['__annotations__',
 '__dataclass_fields__',
 '__dataclass_params__',
 '__match_args__',
 '__replace__',
 'appearances']

The dataclass fields and params are understandable, and I will look into why __annotations__ and __replace__ show up, but the thing that caught my attention was the presence of appearances.

Turns out that a default value set in the definition of a dataclass field stays on the class as a class attribute, and so it lands in the __dict__ of the class. That is why appearances shows up when dir is called, while name and superpower, which have no defaults, do not.
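This can be checked directly with a small sketch. The class is a trimmed-down copy of Superhero, and the variable names are made up for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Superhero:
    name: str
    superpower: str
    appearances: int = 0


class_namespace = vars(Superhero)  # the class's __dict__

# Only the field that has a default is stored as a class attribute.
has_appearances = "appearances" in class_namespace
has_name = "name" in class_namespace
print(has_appearances, has_name)

# The default is readable from the class itself, no instance needed.
print(Superhero.appearances)

# All three fields are still tracked by the dataclass machinery.
field_names = [f.name for f in fields(Superhero)]
print(field_names)
```

Fields without a default only leave an entry in __annotations__, which is why they never reach the output of dir on the class.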


I also learnt about the interactive help that the Python REPL offers (via help()) along the way.

Also, here's a neat little experiment I ended up doing: comparing the attributes of a dataclass and a regular class using tabs in Textual [3]!


References

#TIL