Classes and Objects

Classes are fun!

OOPs

OOPs! I tried to learn about classes and all I got were these lousy objects.

Source tutorial

'Objects', 'classes', 'object-oriented'...you've likely heard these terms banging around in the coding/Python universe. If you're like me in a past life, you have a sense that they're all related in some way, but you're not exactly sure how, and you're not exactly sure what they are or why they are useful. Have no fear! It's actually pretty simple, so much so that you've already been working with objects and probably didn't even know it. To wit, let's start with something we already know about: lists.

In [ ]:
puppers = ['Shea', 'Barney', 'Daisy']

So what can we say about puppers? First of all, it is a particular instance of a more general thing called a list. Second, it contains some data (some pups). And finally, it has methods associated with it:

In [ ]:
puppers.append('Luna')
print(puppers)
puppers.sort()
print(puppers)

SURPRISE! puppers is an object. Objects are named 'things' in your code that have those properties: they can contain data and methods, and they are members of broader classes of 'things'. The terminology we use is that the types are called classes and the specific things you make are called objects or instances.

So in our example, puppers is an object, and it is an instance of the class list.
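You can check this for yourself with Python's built-in type() and isinstance() functions. A minimal sketch, rebuilding the puppers list from above:

```python
# Rebuild the list from the example above.
puppers = ['Shea', 'Barney', 'Daisy']

# type() reports the class an object was made from.
print(type(puppers))

# isinstance() asks whether an object is an instance of a given class.
print(isinstance(puppers, list))

# Methods are defined on the class, so every list gets append, sort, etc.
print(hasattr(list, 'append'))
```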

What is object-oriented programming (OOP)?

So it turns out you've already been working with objects and classes. All the 'things' you know and love (all the lists, dicts, sets, ndarrays, pandas DataFrames, etc. that you've made) have secretly been objects, members of their respective classes. The incredibly simple idea behind 'object-oriented programming' is:

Instead of being limited to the types of objects that the makers of Python created, what if we could create our own classes that would serve the specific needs of our programs?

And that's really it. Object-oriented programming is just coding in which you make your own classes. Now we just need to learn about classes in Python...

NOTE ON TERMINOLOGY: You'll variably hear the terms 'Object-oriented programming', its acronym 'OOP', and 'classes'. They're all the same thing. In fact C++, which is the object-oriented version of the classic C language, was originally called "C with classes". All these terms really just mean using classes in code.
MINOR NOTE ON TERMINOLOGY: The alternative to object-oriented programming ('normal' coding) is often referred to as 'procedural' programming.

Classes in Python

OK, so let's imagine we are trying to write some code about our lab's dogs. To start with, we might want to store a bunch of information about them. We might want to store, say, each dog's name, breed, weight, age, and floofiness. To do this, one solution would be to make separate lists for name, breed, weight, age and floofiness, and each dog gets its own index:

In [ ]:
# Make lists for attributes.
name = []
breed = []
weight = []
age = []
floof = []

# Add Shea's attributes to each list.
name.append('Shea')
breed.append('Collie')
weight.append(60)
age.append(10)
floof.append(100)

# Recall floof entry for Shea.
print(floof[0])

This solution works, but then we have to remember the index for each dog. Another solution might be to make a dictionary of dictionaries, with the dog name keying the first dict:

In [ ]:
# Make a dictionary of dogs.
dogs = {}

# Make a new dictionary for Shea.
dogs['Shea']= {}

# Populate Shea's dictionary.
dogs['Shea']['name'] = 'Shea'
dogs['Shea']['breed'] = 'Collie'
dogs['Shea']['weight'] = 60
dogs['Shea']['age'] = 10
dogs['Shea']['floof'] = 100

# Access attributes by dictionary lookup.
print(dogs['Shea']['floof'])

These solutions are both fine—they work! Many of us have written complicated code using data structures just like these. But imagine for a second that Mike tells us on Friday that COVID has made him realize that life is short, and we're going to drop these silly fruit flies and everyone is switching to projects focused on dogs. Now we are all writing code about dogs all day. You might find that you keep needing the data structure above over and over in different scripts, and you get tired of either re-writing it or copying and pasting it from all over your last script. So you get the clever idea to go ahead and organize your code, so that every time you create a new dog you automatically make this data structure, so you only have to copy and paste one section of code (you're shooting for 'portability'):

In [1]:
# Initialize data for a new dog.
def initialize_dog(name, puppers, breed='', weight=0, age=0, floof=0):
    puppers[name] = {}
    puppers[name]['name'] = name
    puppers[name]['breed'] = breed
    puppers[name]['weight'] = weight
    puppers[name]['age'] = age
    puppers[name]['floof'] = floof

# Make a new dict for dogs
puppers = {}

# Add data for Shea and Luna using our new function.
initialize_dog('Shea', puppers, 'Collie', 60, 10, 100)
initialize_dog("Luna", puppers, 'Bichon Frise', 20, 5, 80)

# Access data through dict lookup.
print(puppers['Shea']['name'])
print(puppers['Luna']['breed'])
Shea
Bichon Frise

This is a pretty good solution! It organizes your code in a logical way, and it increases its portability. Congratulations! You've (just about) invented classes. To get all the way there, we go ahead and decide we are going to formally create a new type of 'thing' in Python to represent a dog. A new Python class! To do this, we have to use the incredibly complicated process of writing the keyword class and the name we want for our new class:

In [ ]:
class GoodBoy:
    pass

shea = GoodBoy()
type(shea)

Note about style: Python classes are typically named with CamelCase, capitalizing the first letter of each word, without underscores.

Class attributes

We've made our first class. How do we populate it with useful things? Here it's useful to go back to the distinction between classes and objects. Remember: the class is the type of thing (e.g., a list), and the instance is a specific thing of that type that you've made (e.g., puppers). Attributes (which is the name for data within classes) can belong to either the class or the instance. This sounds a little complicated, so let's clear it up: a class attribute is something that every member of the class will have. For instance, all dogs are mammals, so we can make a class attribute to provide this information:

In [3]:
class GoodBoy:
    # Class attributes
    phylo_class = 'mammal'

# Class attributes are available from instances...
shea = GoodBoy()
print(shea.phylo_class)

# Class attributes can also be accessed directly from the class.
print(GoodBoy.phylo_class)
mammal
mammal

To make a class attribute, we simply write a standard variable assignment within the class. Note that we access class attributes using the dot notation, which is familiar to us by now.
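Because a class attribute is stored once on the class, every instance sees the same value. Here is a small sketch of that sharing, plus one gotcha: assigning through an instance creates a separate instance attribute that hides the class attribute for that one object. The replacement values ('Mammalia', 'very good mammal') are made up for illustration.

```python
class GoodBoy:
    # Class attribute: stored once, on the class itself.
    phylo_class = 'mammal'

shea = GoodBoy()
luna = GoodBoy()

# Both instances look the value up on the class.
print(shea.phylo_class, luna.phylo_class)   # mammal mammal

# Changing the attribute on the class changes what every instance sees.
GoodBoy.phylo_class = 'Mammalia'
print(shea.phylo_class, luna.phylo_class)   # Mammalia Mammalia

# Assigning through an instance creates an instance attribute that
# shadows the class attribute for that one object only.
shea.phylo_class = 'very good mammal'
print(shea.phylo_class, luna.phylo_class, GoodBoy.phylo_class)
```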

Instance attributes

phylo_class belongs to GoodBoys in general. What about attributes that don't belong to all dogs, but to a specific dog, say name or weight? These are called instance attributes. To make instance attributes, we first have to create an instance of the class. To do this, we use the __init__ method. This isn't anything complicated: __init__ is just the function that gets called whenever you create a new instance of the class. You never have to call __init__ yourself, since it is automatically called whenever you create a new instance of the class. It's basically the initialize_dog function we wrote a minute ago:

In [6]:
class GoodBoy:
    # Class attributes
    phylo_class = 'mammal'
        
    # Initializer. In English: whenever we make a new dog, do the following...
    def __init__(self, name):
        self.name = name
        self.breed = ''
        self.age = 0
        self.weight = 0
        self.floof = 0

shea = GoodBoy('Shea')
print(shea.name)
shea.floof = 100
print(shea.floof)
Shea
100

self, explained:

What's going on with self??? self is how we refer to the instance of a class within our code defining the class. It solves the problem that we want to work on instances of the class (e.g., shea, or barney, or luna), but in our class code we don't yet know the name of the instance. self is a stand-in for the name of the instance, so you can mentally replace self with shea when thinking about how this code will operate when you make shea. self is always the first argument to the __init__ method, and we use self + dot notation to do things specifically to the instance that gets created out of our class.

So in the code above, when we wrote shea = GoodBoy('Shea'), Python created a new GoodBoy object and passed it to the class's __init__ method as self. The method then assigned 'Shea' to the name attribute of that specific object, which we stored in the variable shea (rather than to all GoodBoys; only shea gets the name 'Shea').

NOTE: when calling the method you ignore self and just start with the second argument (if there is one). Python automatically supplies the object as the first argument "behind the scenes".
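To see what "behind the scenes" means, you can make the same call both ways. This is only a sketch: get_name is a hypothetical method added for the demonstration, not part of the GoodBoy class built in this tutorial.

```python
class GoodBoy:
    def __init__(self, name):
        self.name = name

    # Hypothetical method, just for this demonstration.
    def get_name(self):
        return self.name

shea = GoodBoy('Shea')

# The usual call: Python passes shea in as self for us.
usual = shea.get_name()

# The same call spelled out by hand: we pass the instance ourselves.
explicit = GoodBoy.get_name(shea)

print(usual, explicit)   # Shea Shea
```

You would almost never write the second form in real code; it is only here to show where self comes from.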

Whew! Let's pause for a mini-review of what we've learned:

  • We made a new class of Python thing called a GoodBoy which represents dogs.
  • We gave a class attribute to GoodBoy that tells us that all dogs are mammals.
  • We used the __init__ method to initialize individual instances of GoodBoys, and assigned instance attributes like name and floof that belong to individual GoodBoys (not all GoodBoys).

We've made a nifty container object. And this is just bare bones. All of our attributes here are simple strings or ints, but they can be ANYTHING. Attributes can be lists, ndarrays, dataframes, functions, or other objects (!). But attributes are only half the story on classes. Remember from looking at our old friend the humble list, Python objects don't just store data but can also do stuff like, for lists, append and sort. How do we make our classes be not just containers for data but capable of actions? For that, we need to implement methods.
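As a quick illustration that attributes really can be anything, here is a sketch in which one attribute is a list and another is a second GoodBoy. The tricks and best_friend attributes are hypothetical names invented for this example.

```python
class GoodBoy:
    def __init__(self, name):
        self.name = name
        self.tricks = []          # a list as an attribute
        self.best_friend = None   # will hold another GoodBoy object

shea = GoodBoy('Shea')
luna = GoodBoy('Luna')

# The list attribute behaves like any other list.
shea.tricks.append('sit')
shea.tricks.append('roll over')

# An attribute can point at another object, and dots can be chained.
shea.best_friend = luna

print(shea.tricks)             # ['sit', 'roll over']
print(shea.best_friend.name)   # Luna
print(luna.tricks)             # [] (each dog gets its own list)
```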

Instance methods

Happily, we don't have to learn anything new to build methods in our classes. Methods is just a term we use for functions when they occur in classes, and we make them the EXACT same way, with the def keyword. The literal ONLY thing we have to do differently is that an instance method (which belongs to a specific instance of the class) has to take self as its first argument. That's it! Otherwise you just write a function like you always do. Here let's write an instance method introducing our good boys:

In [ ]:
class GoodBoy:
    # Class attributes
    phylo_class = 'mammal'
        
    # Initializer
    def __init__(self, name, breed, age, weight, floof):
        self.name = name
        self.breed = breed
        self.age = age
        self.weight = weight
        self.floof = floof
    
    # Instance method
    def introduce(self):
        if (self.floof > 50):
            floofter = "I am a floofter"
        else:
            floofter = "I am not a floofter"
        
        print("Hi! My name is {} and I am a very good boy! I am a {}, I am {} years old in human years, I weigh {} pounds, and {}!".format(self.name, self.breed, str(self.age), str(self.weight), floofter))

# Initialize Shea as a GoodBoy
shea = GoodBoy('Shea', 'Collie',10, 60, 100)
# Call method using dot notation.
shea.introduce()

Instance methods: exactly like normal Python functions with self as first argument.

Class methods

Instance methods, as the name suggests, belong to the instance. What does that mean? Primarily, it means that we have to create a member of that class in order to call it. What if we want to create a method for the class that doesn't require creating an instance? For example, what if we wanted a function that gives some basic information about dogs that isn't specific to any one dog? For this, we use class methods:

In [ ]:
class GoodBoy:
    # Class attributes
    phylo_class = 'mammal'
        
    # Initializer
    def __init__(self, name, breed, age, weight, floof):
        self.name = name
        self.breed = breed
        self.age = age
        self.weight = weight
        self.floof = floof
    
    # Instance method
    def introduce(self):
        if (self.floof > 50):
            floofter = "I am a floofter"
        else:
            floofter = "I am not a floofter"
        
        print("Hi! My name is {} and I am a very good boy! I am a {}, I am {} years old in human years, I weigh {} pounds, and {}!".format(self.name, self.breed, str(self.age), str(self.weight), floofter))

    @classmethod
    def information(cls):
        info = 'The dog is a member of the genus Canis (canines), which forms part of the wolf-like canids, and is the most widely abundant terrestrial carnivore. The dog was the first species to be domesticated, and has been selectively bred over millennia for various behaviors, sensory capabilities, and physical attributes. Their long association with humans has led dogs to be uniquely attuned to human behavior and has earned them the distinction of being mans best friend. Additionally, they are all very good boys.\n'
        print(info)

# Call class method directly from class.
GoodBoy.information()
shea = GoodBoy('Shea', 'Collie',10, 60, 100)
shea.introduce()
print('')
# Call class method from a class instance.
shea.information()

Class methods are marked by the @classmethod decorator, which must occur on the line preceding the function. They take a class (cls) as their first argument. Just like self, you don't have to supply cls when you call these functions; Python takes care of it behind the scenes. Note that class methods can be called directly from the class (GoodBoy.information()) or from an instance of the class (shea.information()). Class methods 'belong' to the class generally and to instances.

BONUS MATERIAL: A third type of method you can create in classes is the @staticmethod which takes no special arguments and is indeed just a regular function that simply makes sense to package with the class (often 'utility' functions). Like class methods, static methods can be called from the class or instantiated objects. Read more about class methods vs. static methods.
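For a concrete picture of a static method, here is a minimal sketch. The to_dog_years function is a hypothetical utility, and the factor of 7 is the folk rule of thumb rather than real biology.

```python
class GoodBoy:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    # Hypothetical utility function: no self and no cls.
    # It only uses the argument it is given.
    @staticmethod
    def to_dog_years(human_years):
        return human_years * 7

# Call from the class, no instance needed.
print(GoodBoy.to_dog_years(2))        # 14

# Call from an instance; Python does not pass the instance in.
shea = GoodBoy('Shea', 10)
print(shea.to_dog_years(shea.age))    # 70
```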

Quick Review of methods in Python classes:

  • Methods is just what we call functions inside classes, and they are created using the def keyword like all Python functions.
  • Instance methods take self as their first argument and can only be called from instances of the class.
  • Class methods are marked by the @classmethod decorator, take cls as a first argument, and can be called from the class or from an instance.

That's most of what we need to get started with Python classes. There's just one more piece...

Inheritance and Python classes

Let's say we're coding along, and we decide that floofters are really distinct enough that we want to make a new class just for them. This class will need most of what we put into GoodBoy, but will add a few things. Do we have to copy everything from GoodBoy and paste it into Floofter? Nope (copying and pasting code is almost never the right answer!). We can use object inheritance!

In [ ]:
class Floofter(GoodBoy):
    pass

shea = Floofter('Shea', 'Collie',10, 60, 100)
shea.introduce()
print(type(shea))

By putting GoodBoy in the parentheses in the class construction statement, we told Python that Floofter is a child of the parent class GoodBoy. This means that any Floofter we create will have all the attributes and methods that we already made for GoodBoy, without us having to copy a single line of code! As we can see above, shea can still do all the GoodBoy things (initialize, introduce), but Python now says shea's type is Floofter.

Here's a more formal description of object inheritance:

Inheritance is the process by which one class takes on the attributes and methods of another. Newly formed classes are called child classes, and the classes that child classes are derived from are called parent classes. The child class inherits all of the functionality (attributes and methods) of its parent classes, and in the case of any conflicts, the child class wins: Changes in the child class override anything in the parent class. The new child class extends or overrides functionality of the parent class. This is incredibly powerful, as it means that instead of having to rewrite a class every time we want to add or remove or change a little bit, we can just create a child class and make the changes to it, preserving everything we already created in the parent.
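The "child class wins" rule can be sketched in a few lines. The speak and fetch methods below are hypothetical stand-ins, kept tiny so the lookup behavior is easy to see.

```python
class GoodBoy:
    def speak(self):
        return 'Woof!'

    def fetch(self):
        return 'Fetching!'

class Floofter(GoodBoy):
    # Same name as the parent method, so this version wins for Floofters.
    def speak(self):
        return 'Floofy woof!'

shea = Floofter()
print(shea.speak())   # Floofy woof! (child version)
print(shea.fetch())   # Fetching! (inherited unchanged from GoodBoy)

# A Floofter still counts as a GoodBoy.
print(isinstance(shea, GoodBoy))       # True
print(issubclass(Floofter, GoodBoy))   # True
```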

Let's play with Floofter to give it some new functionality. Notice we didn't give our new class an __init__ method. Python is smart, and will search "up the chain" of an object's parentage until it finds one. In this case, it (correctly) used the __init__ method from GoodBoy. Let's make a new __init__ just for Floofter that will override the parent method. All that floof means we might be more interested in coat properties, so let's add some of those to our initialization. Our furry friends will probably want to tell people about their floof, so we will also add a method to help them with that:

In [ ]:
class Floofter(GoodBoy):
    def __init__(self, name, breed, age, weight, floof, fcolors, flength, ftexture, shed):
        self.name = name
        self.breed = breed
        self.age = age
        self.weight = weight
        self.floof = floof
        self.fcolors = fcolors
        self.flength = flength
        self.ftexture = ftexture
        self.shed = shed
    
    def describe_floof(self):
        colorstring = self.fcolors[0]
        for color in self.fcolors[1:]:
            colorstring = colorstring + ' and ' + color
        shedstring = 'I shed like crazy'
        if not self.shed:
            shedstring = "I do not shed"
            
        print("My floof is {}, about {} inches long, of a {} texture, and {}!".format(colorstring, str(self.flength), self.ftexture, shedstring))
shea = Floofter('Shea', 'Collie',10, 60, 100, ['brown', 'white'], 2, 'babysoft', True)
shea.introduce()
shea.describe_floof()

Let's take a look at how inheritance has worked here. First, our new shea still has the introduce method from GoodBoy, but now has a new method to describe his floofiness that is only available to the child class. We also overrode the __init__ method of GoodBoy. To check this, let's try to initialize without the new arguments:

In [ ]:
shea = GoodBoy('Shea', 'Collie',10, 60, 100)
shea.introduce()
shea2 = Floofter('Shea', 'Collie',10, 60, 100)

These 5 arguments will work with a GoodBoy, but not with a Floofter, because we gave it a new __init__ method which demands 9 arguments. This is a critical concept so I'll emphasize again:

Child classes override parents!

That's the critical thing you need to remember about Python object inheritance. I think that's sufficient for this intro, but I should note that inheritance can get pretty complex if you want, usually because objects can inherit from multiple parents (called multiple inheritance) and you have to understand how conflicts between these classes are resolved. There are some good descriptions here and here for those interested.
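For a taste of how conflicts are resolved with multiple parents, here is a small sketch. The Swimmer and WaterDog classes are hypothetical and exist only to show the lookup order, which Python stores in the class's __mro__ attribute (method resolution order).

```python
# Hypothetical classes, used only to illustrate the lookup order.
class GoodBoy:
    def speak(self):
        return 'Woof!'

class Swimmer:
    def speak(self):
        return 'Splash!'

    def swim(self):
        return 'Paddle paddle'

# Parents are searched left to right, so GoodBoy wins the speak() conflict.
class WaterDog(GoodBoy, Swimmer):
    pass

barney = WaterDog()
print(barney.speak())   # Woof!
print(barney.swim())    # Paddle paddle

# The full search order is stored on the class.
print([c.__name__ for c in WaterDog.__mro__])
```

Swapping the order of the parents in the class statement would make Swimmer's speak() win instead.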

To summarize object inheritance in Python:

  • We make a class the child of another class by putting that parent in the parentheses after our class construction statement like so: class Name(Parent):
  • A child class inherits all the attributes and methods of the parent class.
  • In conflicts between child and parent, the child always supersedes (beats) the parent.

One last thing: let's get super

The last thing I want to touch on, just because Mike used it in his gff parser, is Python's super() function. Let's say we want to change our information function to include more information specific to floofters. We want to keep calling this method 'information' to retain consistency for the user. But we also want to keep the option of calling the shorter general dog information. Child methods override parent methods...how can we get the parent method back after we've overridden it? That's where super() comes in!

In [ ]:
class Floofter(GoodBoy):
    def __init__(self, name, breed, age, weight, floof, fcolors, flength, ftexture, shed):
        self.name = name
        self.breed = breed
        self.age = age
        self.weight = weight
        self.floof = floof
        self.fcolors = fcolors
        self.flength = flength
        self.ftexture = ftexture
        self.shed = shed
    
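    def describe_floof(self):
        colorstring = self.fcolors[0]
        for color in self.fcolors[1:]:
            colorstring = colorstring + ' and ' + color
        shedstring = 'I shed like crazy'
        if not self.shed:
            shedstring = "I do not shed"

        print("My floof is {}, about {} inches long, of a {} texture, and {}!".format(colorstring, str(self.flength), self.ftexture, shedstring))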
    
    # New information method updates text.
    @classmethod
    def information(cls):
        info = 'The dog is a member of the genus Canis (canines), which forms part of the wolf-like canids, and is the most widely abundant terrestrial carnivore. The dog was the first species to be domesticated, and has been selectively bred over millennia for various behaviors, sensory capabilities, and physical attributes. Their long association with humans has led dogs to be uniquely attuned to human behavior and has earned them the distinction of being mans best friend. Additionally, they are all very good boys.\n\nAmong all the very good boys, floofters are the very best-loved. They are much sought after by humans, especially tiny humans, and bring them many years of great joy.\n'
        print(info)
    
    # New method calls the information function of the parent object.
    @classmethod
    def information_brief(cls):
        super().information()

print("Call 1:")
Floofter.information()
print('\nCall 2:')
Floofter.information_brief()

super() returns a temporary proxy object that forwards method calls to the parent class, giving you access to all of the parent's functionality (including overridden parts!) from inside the child class. Pretty cool! super() is also useful within class methods generally as a means to call methods from the parent when you haven't instantiated an object yet.

A fair question to ask is why we can't just explicitly create a temporary GoodBoy object to get access to parent methods. The answer is that we can! That'll work. The advantages of super() are that it's a bit more elegant (whatever that means) and that when you get into more complicated patterns of inheritance, it can be advantageous not to explicitly name the parent but rather to generically call "whatever the parent is". That's pretty advanced stuff though, so for now let's just say using super() to call parent functions from within a child object is nifty and Pythonic and will make you look like you know what you're doing! For the curious, here's a lesson from RealPython that goes into some depth about super.
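One very common use of super() is inside a child's __init__, so the child doesn't have to repeat the parent's attribute assignments. Here is a sketch of how the Floofter initializer could be written that way; GoodBoy is trimmed down to its initializer to keep the example short.

```python
class GoodBoy:
    def __init__(self, name, breed, age, weight, floof):
        self.name = name
        self.breed = breed
        self.age = age
        self.weight = weight
        self.floof = floof

class Floofter(GoodBoy):
    def __init__(self, name, breed, age, weight, floof, fcolors, flength, ftexture, shed):
        # Let the parent initializer set the attributes it already knows about.
        super().__init__(name, breed, age, weight, floof)
        # Only the floof-specific attributes are set here.
        self.fcolors = fcolors
        self.flength = flength
        self.ftexture = ftexture
        self.shed = shed

shea = Floofter('Shea', 'Collie', 10, 60, 100, ['brown', 'white'], 2, 'babysoft', True)
print(shea.name, shea.breed, shea.fcolors)
```

If GoodBoy's initializer later gains a new attribute, Floofter picks it up without copying any lines.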

Why classes?

So why do we need classes? Do we need classes? The answer is probably no: object-oriented programming is just a style. Put another way: OOP isn't for computers—it's for humans. The computer is ultimately getting the same instructions whether you write it in an object-oriented or procedural (or any other) style. OOP exists for the people who write and read code as a useful way to organize and understand programs.

So what are the advantages of classes? When you read intros to OOP, they always emphasize that it makes programming match the way you look at the world already. We have a concept of what a “car” is, and within that car we know what a 1995 Toyota Celica is, and we know that Stad’s busted-ass-because-he’s-a-forever-postdoc 1995 Toyota Celica is a specific instance of that kind of car. So object-oriented approaches match programming abstraction to human abstraction.

That’s probably true, but honestly I always roll my eyes a little. I mean, high level languages are already pretty abstracted and human-readable, and it's not that hard to think in lists and dataframes. For my (admittedly limited) usage I have found the main advantage of classes to be that it’s simply a convenient way to group code and data when programming. I constantly end up with related things in my code. For example, for a single confocal movie, I have a bunch of data: the image stack itself, information about the experiment (date, fly line, microscope, labeled gene, developmental stage…), information extracted from the movie (nuclear mask, MS2 spot mask, intensity profiles of spots). I also have a bunch of functions that I wrote to work on these data (nuclei segmenter, MS2 spot segmenter, normalization functions…). There are lots of ways I could organize this code, but a very convenient way to do so is to create a Movie class and populate movie objects with all the things I need.
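To make that concrete, here is a toy sketch of what such a Movie class might look like. Everything in it is hypothetical: the attribute names, the method names, and the use of nested lists as stand-ins for real image data are all assumptions for illustration, not the actual analysis code.

```python
class Movie:
    # Hypothetical container for one confocal movie.
    def __init__(self, stack, date, fly_line):
        self.stack = stack              # image data: a list of 2D frames
        self.date = date                # experiment metadata
        self.fly_line = fly_line
        self.mean_intensities = None    # filled in by measure_intensity()

    def n_frames(self):
        return len(self.stack)

    def measure_intensity(self):
        # Mean pixel value of each frame, stored on the object for later use.
        self.mean_intensities = [
            sum(sum(row) for row in frame) / sum(len(row) for row in frame)
            for frame in self.stack
        ]
        return self.mean_intensities

# Two tiny 2x2 "frames" standing in for a real image stack.
frames = [[[0, 2], [4, 6]], [[1, 1], [1, 1]]]
movie = Movie(frames, '2020-05-01', 'example-line')

print(movie.n_frames())            # 2
print(movie.measure_intensity())   # [3.0, 1.0]
```

The data, the metadata, and the functions that work on them all travel together in one object.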

I'm sure I've massively under-used and poorly understood the full power of classes. Hopefully this is a decent introduction, and you all will go off, start using classes, and learn lots of wonderful new powerful things about them. Happy coding!