Python access property of class

I had a programming interview recently, a phone-screen in which we used a collaborative text editor.

I was asked to implement a certain API, and chose to do so in Python. Abstracting away the problem statement, let’s say I needed a class whose instances stored some data and some other_data.

I took a deep breath and started typing. After a few lines, I had something like this:

class Service(object):
    data = []

    def __init__(self, other_data):
        self.other_data = other_data
    ...

My interviewer stopped me:

  • Interviewer: “That line: data = []. I don’t think that’s valid Python?”
  • Me: “I’m pretty sure it is. It’s just setting a default value for the instance attribute.”
  • Interviewer: “When does that code get executed?”
  • Me: “I’m not really sure. I’ll just fix it up to avoid confusion.”

For reference, and to give you an idea of what I was going for, here’s how I amended the code:

class Service(object):

    def __init__(self, other_data):
        self.data = []
        self.other_data = other_data
    ...

As it turns out, we were both wrong. The real answer lay in understanding the distinction between Python class attributes and Python instance attributes.

Note: if you have an expert handle on class attributes, you can skip ahead to use cases.

Python Class Attributes

My interviewer was wrong in that the above code is syntactically valid.

I too was wrong in that it isn’t setting a “default value” for the instance attribute. Instead, it’s defining data as a class attribute with value [].
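A minimal sketch of what that means in practice, written for Python 3. The add method is a hypothetical stand-in for the elided body of Service, not part of the interview code. Because data lives on the class, every instance that appends to it mutates the same list:

```python
class Service(object):
    data = []  # class attribute: created once, when the class body runs

    def __init__(self, other_data):
        self.other_data = other_data

    def add(self, item):
        # Hypothetical helper for illustration only.
        # self.data resolves to Service.data, so this mutates the shared list.
        self.data.append(item)


s1 = Service(["a"])
s2 = Service(["b"])
s1.add(1)

print(s2.data)                # [1] -- s2 sees s1's change
print(s1.data is s2.data)     # True -- one list, owned by the class
print("data" in s1.__dict__)  # False -- nothing is stored on the instance
```

Moving the assignment into __init__, as in the amended version, gives each instance its own list instead.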

In my experience, Python class attributes are a topic that many people know something about, but few understand completely.

Python Class Variable vs. Instance Variable: What’s the Difference?

A Python class attribute is an attribute of the class (circular, I know), rather than an attribute of an instance of a class.

Let’s use a Python class example to illustrate the difference. Here, class_var is a class attribute, and i_var is an instance attribute:

class MyClass(object):
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var

Note that all instances of the class have access to class_var, and that it can also be accessed as a property of the class itself:

foo = MyClass(2)
bar = MyClass(3)

foo.class_var, foo.i_var
## 1, 2
bar.class_var, bar.i_var
## 1, 3
MyClass.class_var
## 1

[...]

        if len(self.data) >= self.limit:
            raise Exception("Too many elements")
        self.data.append(e)

MyClass.limit
## 10

We could then create instances with their own specific limits, too, by assigning to the instance’s limit attribute.

foo = MyClass()
foo.limit = 50
## foo can now hold 50 elements—other instances can hold 10

This only makes sense if you want your typical instance of MyClass to hold 10 elements or fewer. If you're giving all of your instances different limits, then limit should be an instance variable. (Remember, though: take care when using mutable values as your defaults.)
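As a sketch of the default-value idea, here is a self-contained bounded list in Python 3. The class name BoundedList and the add method are assumptions chosen for illustration. The limit is a class attribute that acts as the default, and assigning to an instance's limit shadows it for that instance only:

```python
class BoundedList(object):
    limit = 10  # class-level default shared by every instance

    def __init__(self):
        self.data = []  # per-instance list, so contents are not shared

    def add(self, e):
        # self.limit finds an instance-level limit first, then the class default
        if len(self.data) >= self.limit:
            raise Exception("Too many elements")
        self.data.append(e)


default = BoundedList()
roomy = BoundedList()
roomy.limit = 50  # creates an instance attribute that shadows the class default

print(BoundedList.limit, default.limit, roomy.limit)  # 10 10 50

for i in range(10):
    default.add(i)
try:
    default.add(10)  # the eleventh element exceeds the default limit
except Exception as exc:
    print(exc)  # Too many elements
```

Note that the immutable default (an integer) is safe to share, while the mutable list is created per instance in __init__.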

  • Tracking all data across all instances of a given class. This is sort of specific, but I could see a scenario in which you might want to access a piece of data related to every existing instance of a given class.

    To make the scenario more concrete, let’s say we have a Person class, and every person has a name. We want to keep track of all the names that have been used. One approach might be to iterate over the garbage collector’s list of objects, but it’s simpler to use class variables.

    Note that, in this case, all_names will only be accessed as a class variable, so the mutable default is acceptable.

    class Person(object):
        all_names = []

        def __init__(self, name):
            self.name = name
            Person.all_names.append(name)

    joe = Person('Joe')
    bob = Person('Bob')
    print(Person.all_names)
    ## ['Joe', 'Bob']
    

    We could even use this design pattern to track all existing instances of a given class, rather than just some associated data.

    class Person(object):
        all_people = []

        def __init__(self, name):
            self.name = name
            Person.all_people.append(self)

    joe = Person('Joe')
    bob = Person('Bob')
    print(Person.all_people)
    ## [<__main__.Person object at 0x...>, <__main__.Person object at 0x...>]
    
  • Performance (sort of… see below).

  • Under-the-hood

    Note: If you're worrying about performance at this level, you might not want to be using Python in the first place, as the differences will be on the order of tenths of a millisecond. Still, it's fun to poke around a bit, and it helps for illustration's sake.

    Recall that a class’s namespace is created and filled in at the time of the class’s definition. That means that we do just one assignment—ever—for a given class variable, while instance variables must be assigned every time a new instance is created. Let’s take an example.

    def called_class():
        print("Class assignment")
        return 2

    class Bar(object):
        y = called_class()

        def __init__(self, x):
            self.x = x

    ## "Class assignment"

    def called_instance():
        print("Instance assignment")
        return 2

    class Foo(object):
        def __init__(self, x):
            self.y = called_instance()
            self.x = x

    Bar(1)
    Bar(2)
    Foo(1)
    ## "Instance assignment"
    Foo(2)
    ## "Instance assignment"
    

    We assign to Bar.y just once, but instance_of_Foo.y on every call to __init__.

    As further evidence, let’s use the Python disassembler:

    import dis

    class Bar(object):
        y = 2

        def __init__(self, x):
            self.x = x

    class Foo(object):
        def __init__(self, x):
            self.y = 2
            self.x = x

    dis.dis(Bar)
    ##  Disassembly of __init__:
    ##  7           0 LOAD_FAST                1 (x)
    ##              3 LOAD_FAST                0 (self)
    ##              6 STORE_ATTR               0 (x)
    ##              9 LOAD_CONST               0 (None)
    ##             12 RETURN_VALUE

    dis.dis(Foo)
    ## Disassembly of __init__:
    ## 11           0 LOAD_CONST               1 (2)
    ##              3 LOAD_FAST                0 (self)
    ##              6 STORE_ATTR               0 (y)

    ## 12           9 LOAD_FAST                1 (x)
    ##             12 LOAD_FAST                0 (self)
    ##             15 STORE_ATTR               1 (x)
    ##             18 LOAD_CONST               0 (None)
    ##             21 RETURN_VALUE
    

    When we look at the byte code, it’s again obvious that Foo.__init__ has to do two assignments, while Bar.__init__ does just one.

    In practice, what does this gain really look like? I’ll be the first to admit that timing tests are highly dependent on often uncontrollable factors and the differences between them are often hard to explain accurately.

    However, I think these small snippets (run with the Python timeit module) help to illustrate the differences between class and instance variables, so I’ve included them anyway.

    Note: I’m on a MacBook Pro with OS X 10.8.5 and Python 2.7.2.

    Initialization

    10000000 calls to `Bar(2)`: 4.940s
    10000000 calls to `Foo(2)`: 6.043s
    

    The initializations of Bar are faster by over a second (about 18% less time), so the difference here appears to be real rather than measurement noise, although a single run cannot establish statistical significance.

    So why is this the case? One speculative explanation: we do two assignments in Foo.__init__, but just one in Bar.__init__.

    Assignment

    10000000 calls to `Bar(2).y = 15`: 6.232s
    10000000 calls to `Foo(2).y = 15`: 6.855s
    10000000 `Bar` assignments: 6.232s - 4.940s = 1.292s
    10000000 `Foo` assignments: 6.855s - 6.043s = 0.812s
    

    Note: There’s no way to re-run your setup code on each trial with timeit, so we have to reinitialize our variable on each trial. The second pair of times represents the first pair with the previously calculated initialization times deducted.
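For readers who want to reproduce measurements of this kind, here is a rough sketch using timeit in Python 3. The exact invocation used for the numbers above is not shown in this article, so the statements, the setup string, and the reduced call count below are assumptions. Absolute timings will differ by machine and Python version:

```python
import timeit

# Both classes are defined in the setup string so each statement can see them.
setup = """
class Bar(object):
    y = 2

    def __init__(self, x):
        self.x = x

class Foo(object):
    def __init__(self, x):
        self.y = 2
        self.x = x
"""

n = 100000  # far fewer calls than the 10,000,000 above, to keep the run short

init_bar = timeit.timeit("Bar(2)", setup=setup, number=n)
init_foo = timeit.timeit("Foo(2)", setup=setup, number=n)

# Each trial must create a fresh instance, so initialization time is included.
assign_bar = timeit.timeit("Bar(2).y = 15", setup=setup, number=n)
assign_foo = timeit.timeit("Foo(2).y = 15", setup=setup, number=n)

print("Bar init:   %.4fs" % init_bar)
print("Foo init:   %.4fs" % init_foo)
print("Bar assign: %.4fs (initialization deducted)" % (assign_bar - init_bar))
print("Foo assign: %.4fs (initialization deducted)" % (assign_foo - init_foo))
```

With so few calls the deducted figures are noisy and can even come out negative; raising n or taking the best of several runs with timeit.repeat gives steadier numbers.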

    From the above, it looks like Foo only takes about 60% as long as Bar to handle assignments.

    Why is this the case? One speculative explanation: when we assign to Bar(2).y, we first look in the instance namespace (Bar(2).__dict__['y']), fail to find y, and then look in the class namespace (Bar.__dict__['y']) before making the proper assignment. When we assign to Foo(2).y, we do half as many lookups, as we immediately assign to the instance namespace (Foo(2).__dict__['y']). Another plausible factor is that the assignment adds a new key to the instance dictionary for Bar, but only overwrites an existing key for Foo.
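The namespaces involved can be inspected directly. This sketch (Python 3.7 or later, where dictionaries keep insertion order) reuses the Bar and Foo definitions from above and shows what each instance's __dict__ holds before and after the assignment. It illustrates where the attribute ends up, not how long the assignment takes:

```python
class Bar(object):
    y = 2

    def __init__(self, x):
        self.x = x


class Foo(object):
    def __init__(self, x):
        self.y = 2
        self.x = x


bar = Bar(2)
foo = Foo(2)

print(bar.__dict__)  # {'x': 2} -- y lives only on the class
print(foo.__dict__)  # {'y': 2, 'x': 2}

bar.y = 15  # adds a new key to bar's instance namespace
foo.y = 15  # overwrites the key that __init__ already created

print(bar.__dict__)  # {'x': 2, 'y': 15}
print(foo.__dict__)  # {'y': 15, 'x': 2}
print(Bar.y)         # 2 -- the class attribute is untouched
```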

    In summary, though these performance gains won’t matter in reality, these tests are interesting at the conceptual level. If anything, I hope these differences help illustrate the mechanical distinctions between class and instance variables.

    In Conclusion

    Class attributes seem to be underused in Python; a lot of programmers have different impressions of how they work and why they might be helpful.

    My take: Python class variables have their place within the school of good code. When used with care, they can simplify things and improve readability. But when carelessly thrown into a given class, they’re sure to trip you up.

    Appendix: Private Instance Variables

    One thing I wanted to include but didn’t have a natural entrance point…

    Python doesn’t have private variables so-to-speak, but another interesting relationship between class and instance naming comes with name mangling.

    In the Python style guide, it’s said that pseudo-private variables should be prefixed with a double underscore: ‘__’. This is not only a sign to others that your variable is meant to be treated privately, but also a way to prevent access to it, of sorts. Here’s what I mean:

    class Bar(object):
        def __init__(self):
            self.__zap = 1

    a = Bar()
    a.__zap
    ## Traceback (most recent call last):
    ##   File "<stdin>", line 1, in <module>
    ## AttributeError: 'Bar' object has no attribute '__zap'

    ## Hmm. So what's in the namespace?
    a.__dict__
    ## {'_Bar__zap': 1}
    a._Bar__zap
    ## 1
    

    Look at that: the instance attribute __zap is automatically prefixed with the class name to yield _Bar__zap.

    While still settable and gettable using a._Bar__zap, this name mangling is a means of creating a ‘private’ variable as it prevents you and others from accessing it by accident or through ignorance.

    Edit: as Pedro Werneck kindly pointed out, this behavior is largely intended to help out with subclassing. In the PEP 8 style guide, they see it as serving two purposes: (1) preventing subclasses from accessing certain attributes, and (2) preventing namespace clashes in these subclasses. While useful, variable mangling shouldn’t be seen as an invitation to write code with an assumed public-private distinction, such as is present in Java.
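The subclassing point can be illustrated with a short Python 3 sketch. The class names Base and Child and the attribute __token are hypothetical. Because each class mangles the name with its own prefix, the two attributes coexist on one instance without clashing:

```python
class Base(object):
    def __init__(self):
        self.__token = "base"  # stored as _Base__token

    def base_token(self):
        return self.__token  # compiled as self._Base__token


class Child(Base):
    def __init__(self):
        super(Child, self).__init__()
        self.__token = "child"  # stored as _Child__token, so no clash

    def child_token(self):
        return self.__token  # compiled as self._Child__token


c = Child()
print(c.base_token())      # base
print(c.child_token())     # child
print(sorted(c.__dict__))  # ['_Base__token', '_Child__token']
```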

    How do I see properties of an object in Python?

    Use Python's dir() to print an object's attributes. One of the easiest ways to inspect a Python object's attributes is the dir() function. It is built into Python, so there's no need to import any libraries.
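As a brief illustration in Python 3, reusing the MyClass example from earlier: dir() reports every name reachable from the object, including class attributes and inherited methods, while vars() returns only the instance's own namespace.

```python
class MyClass(object):
    class_var = 1

    def __init__(self, i_var):
        self.i_var = i_var


foo = MyClass(2)

# dir() returns a sorted list of names; filter out the underscore-prefixed ones
public_names = [name for name in dir(foo) if not name.startswith("_")]
print(public_names)  # ['class_var', 'i_var']

# vars() returns the instance namespace, i.e. foo.__dict__
print(vars(foo))  # {'i_var': 2}
```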

    How do you access the data members of a class in Python?

    In Python, we use the dot (.) operator to access the members of a class.

    Do Python classes have properties?

    Yes. In Python, a property can be defined on a class using the built-in property() function. A property provides an interface to instance attributes: it encapsulates them behind getter and setter methods, similar to getters and setters in Java or properties in C#.
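A minimal sketch of property() in Python 3. The Person class, the _name attribute, and the validation rule are assumptions made for illustration. The property object is itself a class attribute, which ties it back to the main topic of this article:

```python
class Person(object):
    def __init__(self, name):
        self._name = name  # underlying instance attribute

    def _get_name(self):
        return self._name

    def _set_name(self, value):
        if not value:
            raise ValueError("name must not be empty")
        self._name = value

    # property(fget, fset) builds a managed attribute stored on the class
    name = property(_get_name, _set_name)


joe = Person("Joe")
print(joe.name)  # Joe
joe.name = "Joseph"  # routed through _set_name
print(joe.name)  # Joseph
print(isinstance(Person.__dict__["name"], property))  # True
```

The same thing is more commonly written with the @property decorator; the function-call form is shown here because it is the one the answer above describes.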
