From the Dec 12, 2014 SoMa Tech Talk series.
Steve Howard presents on dependency injection, composition, and global state. Slides available here.
01:26 SH: So, as I am kind of alluding to, these techniques are universally applicable with a little asterisk. The asterisk is that if you’re not working in an object-oriented system or language or paradigm then this isn’t applicable, but that’s probably not you. And if your system is extremely performance intensive then this might not be applicable because these techniques do have performance implications, but that’s probably not you either. These techniques are related, which I will explain. And these techniques will make your code better, which is why you should care. What’s all the fuss about? What is the fuss? Is there fuss? Yes, there is fuss. Here’s a quote from a paper written in 1986. I didn’t even know they had computers back then. “Encapsulation has many advantages, so and so and so. Unfortunately, in most object-oriented languages, the introduction of inheritance severely compromises encapsulation.” So, 27 years ago. Thank you. 27 years ago, here’s this guy saying this in a paper.
02:34 SH: Let’s fast forward, I think, about 15 years, and Josh Bloch, one of the creators of the Java language. I know here’s Java again and you’re all like, “Oh, get that off the screen. Eew! Evil!” But he wrote this book called Effective Java which is actually a really good book even if you don’t write Java. And he has a whole chapter called Favor Composition Over Inheritance and he says, “To avoid this fragility… ” He closes the chapter, “To avoid this fragility, use composition and forwarding instead of inheritance. Not only are wrapper classes more robust than subclasses, they are also more powerful.” And as an example of fragility he includes this example, if you wanted to make this instrumented HashSet, it just acts like a HashSet except it counts how many objects have ever been added to this set. This looks like a pretty simple cookie-cutter textbook implementation. It looks fine. It works. Done. Check. Move on. But this actually doesn’t work. When you call addAll with a collection of objects, they’ll all get double counted because as it turns out the implementation of addAll in HashSet calls add. And when the superclass implementation calls add, it will call back into your subclass’s overridden implementation and your stuff will get counted again.
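The double counting described above can be reproduced in a few lines of Python. This is only a sketch: `SimpleSet` is a hypothetical base class standing in for Java's HashSet (Python's built-in `set` does not route `update` through `add`), and all names here are invented for illustration.

```python
class SimpleSet:
    """Hypothetical base class standing in for Java's HashSet."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        self._items.add(item)

    def add_all(self, items):
        # Implementation detail: add_all happens to be built on top of add.
        for item in items:
            self.add(item)

    def __len__(self):
        return len(self._items)


class InstrumentedSet(SimpleSet):
    """Tries to count insertions by subclassing."""

    def __init__(self):
        super().__init__()
        self.add_count = 0

    def add(self, item):
        self.add_count += 1
        super().add(item)

    def add_all(self, items):
        items = list(items)
        self.add_count += len(items)
        super().add_all(items)  # calls back into our overridden add()


s = InstrumentedSet()
s.add_all(["a", "b", "c"])
print(len(s), s.add_count)  # three elements, but a count of 6
```

Removing the increment in `add_all` would fix the count today, but only for as long as the base class keeps implementing `add_all` in terms of `add`.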
03:43 SH: Now, of course you couldn’t have known that because you don’t know what the implementation of a HashSet looks like and you shouldn’t have to, but what’s happened here is by subclassing, you’ve coupled your implementation to the details of the implementation of the superclass. And you could say, “Well now, I know that. I can just take out this line where I add to the count in addAll and that’s fine. My implementation works.” And that’s true. It would work then. But if in Java 9, they change the implementation of HashSet, which they should be allowed to do because they’re not changing the interface, then your implementation would break. So, that’s this fragility. It’s this tight coupling. And we go forward another 11 years, and here is Rob Pike who is one of the main creators of the Go language. And in this document, “Go At Google”, he says, “We argue that this compositional style of system construction has been neglected by the languages that push for design by type hierarchy. Type hierarchies result in brittle code. Blah, blah, blah encourages composition over inheritance.”
04:40 SH: So, the Go language itself was designed to encourage this. In fact, they went further than encouraging it. You just can’t really do inheritance in Go because they felt inheritance was just bad, not worth the trouble. Just don’t do it. We’re not even gonna let you. And then, just to cover my hip languages bases, here’s a quote from Effective Scala, a document published by Twitter, by Marius Eriksen whom I don’t really know much about. And he says, “Use dependency injection for program modularization and, in particular, prefer composition over inheritance, for this leads to more modular and testable programs.” And then, I love this last sentence here, “When encountering a situation requiring inheritance, ask yourself, ‘How would you structure the program if the language lacked support for inheritance?’” This rule I have found to be one of the most helpful rules in all of the little rules about coding that I’ve ever come across. This has really led me to many design insights.
05:36 SH: Okay. Now that I’ve cited other people talking about this who have much more authority than I do, let’s talk about the details. Inheritance, hopefully you’re familiar with. Otherwise, this talk is going to be hard to follow. But in object-oriented programming, you make a subclass. You’re inheriting from a class. You inherit all of its methods. They all now apply to your subclass. You can override some of those methods, you can call into the superclass methods or superclass methods could call in to you, as we saw with the instrumented HashSet example. This is inheritance. Here’s an example from real code, it’s from the Python SocketServer module, which is in the Standard Library. We have this abstract BaseServer class and then there’s a TCPServer concrete subclass, which you can implement, and then there’s this UnixStreamServer subclass for Unix sockets and then there’s a UDPServer subclass for UDP and then a UnixDatagramServer for datagram Unix sockets. And then over here, this is in the BaseHTTPServer module, also in the Standard Library. So there’s this HTTPServer that subclasses TCPServer, overrides some of the details. These servers were all meant for you actually to use them, you subclass them and you fill in some details that were left about how to actually handle data. And the HTTPServer gives you some different things to fill out when you subclass it.
06:54 SH: This is a little bit of an interesting example because servers have these cross-cutting concerns where there’s like, streaming versus datagram and then there’s sock INET versus sock UNIX or something INET versus something UNIX, and then there’s the protocol with HTTP. And there’s actually a comment in the Python Library that this is a bad class design, but it’s optimized for the common case, which is TCPServer. So I’m being a little unfair, but it is a real example.
07:20 SH: And then composition, which maybe you’re not as familiar with, but it’s a very, very simple concept, it’s simpler than inheritance. Composition is just, I have an object and it has references to some other objects that it uses and maybe those have references to other objects that they use. So there’s no complicated inheritance in all this. There’s just this class calls into this class, and that’s all there is. This basically predates inheritance because even in very simple languages, you can do this. Back in C you could do this. But it’s very simple so it’s often overlooked.
07:54 SH: But here is a theoretical example of how you could do this whole server thing with composition instead of inheritance, which I came up with about an hour ago, so it’s just strictly illustrative. But maybe we have the server class and now it’s not an abstract class anymore, it’s a concrete class you would actually instantiate. But it delegates work to other classes. So instead of overriding it to fill in the details of the socket type, you pass it an instance of some socket type class, and I’ve used this dotted line to show. First of all, this dashed line says it has a reference to an instance of this, it’s not a subclass anymore. This dotted line, sorry that I used UML but I hate UML, this dotted line says this is just an interface. There’s no actual socket type class, it’s just an interface. And then we have, like, TCP socket or UDP socket that implements that interface, and we pass an instance of one of those to the server. And that’s how we customize it. We don’t need to make subclasses. And then it also delegates to some handler implementation, for how to actually handle the data.
08:53 SH: And then if we wanna make an HTTP Server, maybe we’ll just have it sort of wrap the server or have a reference to the server. And then likewise, it might have a reference to some handler implementation, which will be different because this handler will handle things like GET and POST. This will handle things like some data came in. It’s a little illustration of composition.
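As a rough Python sketch of the composed server design described above: every class name and method here (`TCPSocketType`, `Server.process`, `HTTPRequestAdapter`, and so on) is hypothetical, and the "sockets" only pass bytes around so the example stays self-contained.

```python
class TCPSocketType:
    """Hypothetical socket-type object; a real one would wrap the socket module."""

    def receive(self, raw):
        return raw  # a stream: pass the bytes through unchanged


class UDPSocketType:
    """Hypothetical datagram variant fulfilling the same interface."""

    def receive(self, raw):
        return raw[:512]  # pretend datagrams are truncated (illustrative only)


class Server:
    """A concrete class, customized by the objects it is given."""

    def __init__(self, socket_type, handler):
        self._socket_type = socket_type
        self._handler = handler

    def process(self, raw):
        data = self._socket_type.receive(raw)
        return self._handler.handle(data)


class HTTPRequestAdapter:
    """Turns 'some data came in' into GET/POST calls on an HTTP handler."""

    def __init__(self, http_handler):
        self._http_handler = http_handler

    def handle(self, data):
        parts = data.decode("ascii").split()
        if len(parts) < 2:
            return b"400 Bad Request"
        method, path = parts[0], parts[1]
        if method == "GET":
            return self._http_handler.get(path)
        if method == "POST":
            return self._http_handler.post(path)
        return b"405 Method Not Allowed"


class HTTPServer:
    """Holds a reference to a Server instead of subclassing it."""

    def __init__(self, socket_type, http_handler):
        self._server = Server(socket_type, HTTPRequestAdapter(http_handler))

    def process(self, raw):
        return self._server.process(raw)


class HelloPages:
    """Hypothetical HTTP-level handler."""

    def get(self, path):
        return b"GET " + path.encode("ascii")

    def post(self, path):
        return b"POST " + path.encode("ascii")


http_server = HTTPServer(TCPSocketType(), HelloPages())
print(http_server.process(b"GET /hello HTTP/1.1"))
```

Swapping `TCPSocketType()` for `UDPSocketType()` changes the transport without touching `Server` or `HTTPServer`, which is the customization-by-passing-objects idea from the talk.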
09:13 SH: Okay. So here’s what they teach you in school where everyone learned about inheritance and composition. They’re techniques for taking these concepts that you have in your head, maybe concepts from the real world, and mapping them into code in this nice taxonomic mapping. So inheritance models an is-a relationship, like a car is an automobile and a truck is an automobile. And then composition models a has-a relationship, like a car has four wheels and a car has a motor. So that makes sense, that’s what the real world was like. That’s how we model these taxonomies and it’s a very natural fit for lots of problems. And it’s very seductive for us, scientists and engineers, because we love breaking the world down by these taxonomies into these classes of things and categorizing everything and then it feels very good. The problem is, practice doesn’t work out that nice and practice is not about this conceptual mapping. It’s about achieving modularity. Modularity is splitting up code into smaller pieces that we can think about and deal with in isolation and being able to reuse pieces, common pieces, in different contexts.
10:26 SH: Inheritance and composition both achieve modularity. But inheritance, and here I have to break inheritance into two different concepts. There is interface inheritance, which we talked about in the composition example. Interface inheritance means an object fulfills some… Several objects fulfill the same interface. And here, an interface is not necessarily some language thing like a Java interface. It’s just a contract, a general idea of a contract. This exists in any language whether there’s formal interfaces or not. It’s a contract that says, “Here are my methods and if you call these things, this is how they’ll behave. And if you tell me to store some data and then you ask me to load the data, I’ll give you the same data back and maybe it goes to a database, maybe it goes to S3, it doesn’t matter.” It’s a contract.
11:05 SH: Interface inheritance is good, it’s a cornerstone of polymorphism and object oriented languages. It’s great, you should use it. Implementation inheritance is what we talked about at the beginning. Implementation inheritance is when a subclass actually gets the methods, the implemented methods down from the superclass. This is what most people think of when they talk about inheritance. This is what leads to all these problems with fragility, with brittle code. The crux of the issue is that implementation inheritance creates a two-way coupling. Your subclass can call into a superclass, and the superclass, as we saw in the HashSet example, can call back into your subclass. They’re calling into each other. There’s a two-way coupling between the implementation details of your two classes. It also tends to create this kind of action at a distance effect where when a class calls some superclass method, it could call up the chain, like 10 levels to some superclass way up here, which could then call something that calls back down the chain to your subclass.
12:01 SH: It could call over into some mixin that was brought in, so you get this action at a distance that makes the code very hard to follow, hard to debug, hard to reason about. Composition on the other hand, which is maybe less obvious once you’ve been trained to think in the inheritance way, tends to lead to superior solutions as I will argue over and over in this presentation, and the main reason, in contrast to inheritance, is that it creates a one-way coupling. When you have a reference to some object, and you call it, you are calling into that object, that object is not calling back into you unless you explicitly tell it to, by passing it some kind of callback, in which case you are defining a two-way contract, but it’s two separate contracts, and it’s much more well defined, and it’s clear that you’re doing it because you have to pass a reference in for the callback. So, it tends to make one-way couplings, which are much simpler and much less prone to fragility, and I will illustrate the rest of those points in further slides.
12:58 SH: So, last point on this, I wanna show this little tidbit from the Python Thread class documentation. Real excerpt from the real code. “There are two ways to specify the activity: by passing a callable object or by overriding the run method in a subclass.” There’s inheritance. “No other methods except for the constructor should be overridden in a subclass. In other words, only override the ‘init’ and ‘run’ methods of the class,” and then a few paragraphs down, “If the subclass overrides the constructor, it must make sure to invoke the base class constructor before doing anything else to the thread.” These instructions seem a little bit nitpicky, like they’re really giving your subclass a lot of opportunities to mess up in subtle ways that might be hard to debug. This is fragility.
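The two ways of specifying a thread's activity mentioned in that excerpt look roughly like this with the standard `threading` module. The `work` function, `Worker` class, and `results` list are made up for illustration.

```python
import threading

results = []


def work():
    results.append("callable")


# Composition style: hand Thread the callable it should run.
t1 = threading.Thread(target=work)


# Inheritance style: override run(), and remember the documented rule
# that the base constructor must be invoked before anything else.
class Worker(threading.Thread):
    def __init__(self, label):
        super().__init__()
        self.label = label

    def run(self):
        results.append(self.label)


t2 = Worker("subclass")

for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

print(sorted(results))
```

The callable version has no ordering rule to forget; in the subclass version, setting attributes before `super().__init__()` or overriding another method is exactly the kind of subtle mistake the documentation warns about.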
13:39 SH: Is this text big enough? Good. Tell me if it gets too small. Here’s a more concrete example. Take a look at this top example. We’ve got this event tracking client class in some system that uses lots of inheritance, and it subclasses JSON HTTP client. Alright, I got an idea of what this is doing. And it’s got this ‘send’ method, and then it calls self.build_message here. Where is self.build_message? Where is the build method? Sorry, where is the build_message function defined? Event tracking client right here somewhere a few lines up, down, maybe. Maybe it’s up in JSON HTTP client. That’d make sense. Maybe it’s in JSON HTTP client’s parent or its parent or its parent or some mixin that that thing brought in or some other class that it multiply inherited. Maybe it’s in a subclass of event tracking client that you don’t even know about, could be anything. Really could be anywhere. So, you’re reading this code or you’re debugging this code or you’re trying to reason about it, there’s not a lot of constraints here. That method could be almost anywhere.
14:42 SH: In contrast, in the bottom example, same class, same method. No inheritance here, and say we’re in a system that kind of eschews inheritance generally, we call self.build_message. It’s not in the superclass, there’s no superclass. If we assume it’s not in a subclass because we’re not doing a lot of inheritance, then I know that that build_message method is right here in event tracking client. I just scroll up, scroll down. I can do a Control F or Control S or whatever it is in Vim, and find it, and that’s it. End of story. I could be in the GitHub web browser, and I can find it. No problem. I don’t need tags, all that stuff. And then, the send message function, I can see it’s calling in on some JSON HTTP client member, so I have a better idea of where that’s going.
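The slide code is not part of the transcript, so the following is only a guess at what the composition version might look like. The class names follow the spoken ones; the method signatures, the URL, and `FakeJSONHTTPClient` are assumptions.

```python
import json


class FakeJSONHTTPClient:
    """Hypothetical stand-in for the JSON HTTP client; records instead of sending."""

    def __init__(self):
        self.posted = []

    def post(self, url, body):
        self.posted.append((url, body))
        return 200


class EventTrackingClient:
    """No superclass: every self.* method used below is defined in this class."""

    def __init__(self, json_http_client, url):
        self._json_http_client = json_http_client
        self._url = url

    def send(self, event_name, properties):
        message = self.build_message(event_name, properties)
        # The call out to the HTTP layer is visibly a call on a member object.
        return self._json_http_client.post(self._url, message)

    def build_message(self, event_name, properties):
        return json.dumps(
            {"event": event_name, "properties": properties}, sort_keys=True
        )


http_client = FakeJSONHTTPClient()
tracker = EventTrackingClient(http_client, "https://example.invalid/events")
status = tracker.send("signup", {"plan": "free"})
print(status, http_client.posted[0][1])
```

A reader looking for `build_message` has exactly one place to search, and the hop to the HTTP client is explicit in `send`.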
15:31 SH: So, in summary on that point, it boils down to coupling. Loose coupling means there are fewer direct dependencies from object to object, and there are fewer methods used on the dependencies. So, less getting your fingers into the different implementation details of that method. Loose coupling is annoying as a programmer because it makes you think harder. It’s easy to just say “All the classes get to access the database. All the classes get to access the authentication system.” It’s harder if you have to say “Well, wait a minute. Do all of these classes need to access the database? Maybe just these do, and then these can access those.” It makes you design more carefully, but it makes you come up with a better design that’s going to have a lot of positive downstream effects. Some of which I’ll talk about.
16:13 SH: It makes the system easier to reason about. It makes the system easier to create seams in. This is a term I’m borrowing from Miško Hevery. A seam is when you have an object or a collection of objects, and you wanna draw a line that splits this collection of objects off from the rest of the dependencies in your system, so you can then maybe stub them out with fakes or mocks for testing, and what you need to do is take a collection of objects in some system and create a seam where you stub out everything else that connects them to the outside world so they can be tested in isolation. The fewer of those lines that are going out, the easier it’s going to be to create these seams. Seams are very good for testing, critical for testing. And the point is that composition leads to looser coupling than inheritance.
16:59 SH: Okay. Let’s talk about dependency management. I’ve been talking a lot about dependencies, so clearly this dependency management thing must be important. So, this is a slightly orthogonal topic, but very closely related, I would say. Dependency management just refers to, you have some program, you have different objects that depend on other objects. How are you gonna represent the dependencies among objects and manage them? I’m gonna talk about three main styles: So-called static cling, which I’m borrowing from this Google Testing Blog entry. Singleton, which I’m sure many of you here are familiar with, and dependency injection, which you will be very familiar with if you’re not already. Here’s our little bit of example code. We’ve got some MyApp Class that’s like a web app handler in Python, and it has this get response method that takes the request, and then we’re gonna return a response and somehow, we need to render this template. We’ve got this hello.html template, we need to render it somehow. Here’s an example of static cling.
17:58 SH: We have this templates module, it imports Jinja, which is a third party library for template rendering. Its details aren’t important. The important thing is: It does some stuff with this third party library to set up our template rendering and then it defines this render function at module global level, that takes in the information and uses that third party library to render the template. And then, here in myapp.py, I will import that templates module and here’s my app class and that place where I had “somehow render.” I’m just gonna call the render function in the templates module, like a global function, and then down here in main when I construct it… Yeah that’s visible. In main, when I construct it, I just construct the app and no problem, nothing to worry about. Seems simple. This is probably the first thing one would reach for, especially coming right out of school. This is simple. It works. It’s fine, but it causes a lot of problems down the road, in particular with testability. The first thing that tends to come up when I wanna test this MyApp class is that I don’t want it to actually go to the file system and render this template. I wanna stub that out, maybe, and there’s no way to do that here. You could monkey patch in Python. I’m not gonna go there in this presentation.
19:13 SH: So, a lot of people reach for Singleton when they learn about that. And Singleton means I’m gonna make this class, this template manager class, and I’m gonna take all that logic for interacting with the third party templating library and wrap it up in that class. And then I’m going to create this one instance of it, this global instance of the manager. And then things don’t look too different for the rest of the app. When I go to call it, instead of calling this global function, I call this method on this global object and nothing changes in main. So that’s pretty simple and this is a good step forward. It seems like a very small change but it is a good step forward because now I can substitute out that manager object with some other instance, so if I’m testing I could substitute manager with some object that’s like a fake template manager and that’s great, made progress, but it could be a lot better.
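A compact sketch of those first two styles, static cling and singleton, side by side. To stay runnable without third-party packages it uses `string.Template` from the standard library as a stand-in for Jinja; the class names, the `request` dictionary, and `run_with_fake_manager` are assumptions, not the code from the slides.

```python
from string import Template

_TEMPLATES = {"hello.html": Template("<h1>Hello, $name!</h1>")}


# Static cling: a module-level function that callers are bound to by name.
def render(template_name, **context):
    return _TEMPLATES[template_name].substitute(**context)


class StaticClingApp:
    def get_response(self, request):
        return render("hello.html", name=request["name"])


# Singleton: the same logic wrapped in a class, with one global instance.
class TemplateManager:
    def render(self, template_name, **context):
        return _TEMPLATES[template_name].substitute(**context)


manager = TemplateManager()


class SingletonApp:
    def get_response(self, request):
        return manager.render("hello.html", name=request["name"])


class FakeTemplateManager:
    def render(self, template_name, **context):
        return "rendered " + template_name


def run_with_fake_manager():
    """What a test can now do: swap the global instance, then restore it."""
    global manager
    real = manager
    manager = FakeTemplateManager()
    try:
        return SingletonApp().get_response({"name": "Ada"})
    finally:
        manager = real


print(StaticClingApp().get_response({"name": "Ada"}))
print(run_with_fake_manager())
```

The swap works, but it relies on mutating shared global state and on remembering to restore it afterwards.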
20:10 SH: It could be a lot better with dependency injection. So in this case in the templates module, I’ve just made this one small change, two small changes. One, the template manager now accepts its third party dependency as an argument to the constructor. So while this doesn’t look that different, you can see the template manager class now never actually calls the Jinja2 module directly, it just accepts this object in its constructor that it’s going to use. And then the application in myapp.py does a very similar thing, it accepts the template manager object that it’s going to need. It never actually calls into the templates module itself, it just accepts the argument of the object it needs to use and uses that. Then the big changes are down here in main.py in my entry point script. When the application first runs, it imports all these modules, it goes and constructs the Jinja environment for the third party library. It goes and constructs the template manager and passes in the Jinja environment, then it goes and constructs the application, passing in the template manager.
21:14 SH: So all that work of constructing all these different objects and wiring them up together happens right here in one place in the entry point and all the other classes are just defined to say “Here are the dependencies I have, they’re arguments in my constructor, this is what I need to operate, you give me those when you make me, you have to, I require them in my constructor. You give me those and then I’ll operate with those.” This has a lot of benefits, but I wanna take a second to let this sink in. This is dependency injection. It’s not a framework, it’s not a container, it’s not an annotation, it’s none of that stuff. It’s a style of encoding dependencies and the key thing is that each object declares, usually as parameters to its constructor, all the other objects it needs to operate. You have to pass them in, it doesn’t go looking for them, it just asks you to give them to it.
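A minimal sketch of the dependency injection version of the same app. `DictEnvironment` is a hypothetical stand-in for the third-party template environment, built on `string.Template` so the example runs with only the standard library; the real slides used Jinja, and none of these signatures come from the talk.

```python
from string import Template


class DictEnvironment:
    """Hypothetical stand-in for the third-party template environment."""

    def __init__(self, sources):
        self._sources = sources

    def get_template(self, name):
        return Template(self._sources[name])


class TemplateManager:
    """Declares its third-party dependency as a constructor argument."""

    def __init__(self, environment):
        self._environment = environment

    def render(self, template_name, **context):
        template = self._environment.get_template(template_name)
        return template.substitute(**context)


class MyApp:
    """Declares the template manager it needs; never imports or looks it up."""

    def __init__(self, template_manager):
        self._template_manager = template_manager

    def get_response(self, request):
        return self._template_manager.render("hello.html", name=request["name"])


def main():
    # All construction and wiring happens here, in the entry point.
    environment = DictEnvironment({"hello.html": "<h1>Hello, $name!</h1>"})
    template_manager = TemplateManager(environment)
    return MyApp(template_manager)


app = main()
print(app.get_response({"name": "Ada"}))
```

A test can now build `MyApp` with any object that has a `render` method, with no globals to patch or restore.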
22:00 SH: They can also be arguments to methods, but more frequently they are constructor arguments. Cool. Here’s a lesson if you haven’t already figured it out. Composition is better than inheritance in most cases. Dependency injection is better than Singleton which is better than static cling by a lot. And composition and dependency injection play nicely together. This is the slide you should burn into your brain and never forget. I’ve used a single greater than for composition over inheritance because inheritance is actually still really useful, it’s a great tool. It still has its place in your toolbox and you will still find yourself using it, but you should reach for composition probably more often than you do. And static binding still has some place, like logging is a very common place where it will be statically bound. It doesn’t help a whole lot to do dependency injection of logging, some people do it, but static cling tends to be way overused.
22:58 SH: So what are some of the benefits? There’s readability, take a look at that first class. We see this kind of stuff all the time. Come across some class called transaction helper and I don’t know, is that like the database helper class or some kind of like credit card processor or what is that? Well, just a tiny change in that second snippet there in the dependency injection style, we see that the transaction helper takes in three objects, a user manager, a credit card processor and a shopping cart. I don’t really know what this code is. I mean I just made it up, it’s nothing. But from looking at that code, now I have a much better idea what transaction helper is. I’m pretty sure it’s something that takes your shopping cart items in some shopping application and actually runs the credit card transactions with the card processor. Just from adding those parameters to the constructor, I have a much better idea if I’m reading this code, what this class does, it’s a huge help. And then, from earlier, here’s this event tracking client again where we saw that in a composition style of object-oriented programming, when you’re tracing through code and you see a method, you can go find that method much more easily. And if you’ve ever read through code, which hopefully you have, then you know that usually what it consists of is, “Here’s this method, oh what does this do, I need to go find that method.” So it’s very helpful.
24:19 SH: There’s extensibility. Here are some more examples from that Effective Java example I showed earlier. If you write this instrumented set in a composition style, where it’s a wrapper instead of a subclass, now I can use it to wrap the original built-in set, which is probably what I originally wrote it for. But I could wrap this other tree set object that I needed to use for some reason, or I have this crazy sparse integer set in my program. I can wrap that, too, as long as it fulfills the set interface. So now it’s more reusable.
24:48 SH: We can also temporarily wrap existing objects. So maybe I have some function that takes a set, passed in, that was already constructed and used elsewhere in the program. I can temporarily wrap it and then it will be instrumented for the duration of my function and in any function calls that come out of this function, it’ll be wrapped. And then when this function returns, the wrapping goes away. So we get this temporary instrumentation of existing objects at run time, which you can’t really do with inheritance.
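A sketch of the wrapper version in Python, covering both points: it wraps any object that offers `add` and `update`, and it can be applied temporarily to a set that already exists. The function names `load_defaults` and `import_and_count` are invented for illustration.

```python
class InstrumentedSet:
    """Counts insertions by wrapping any set-like object (composition)."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self.add_count = 0

    def add(self, item):
        self.add_count += 1
        self._wrapped.add(item)

    def update(self, items):
        items = list(items)
        self.add_count += len(items)
        # However the wrapped object implements update(), it cannot call
        # back into this wrapper, so nothing is counted twice.
        self._wrapped.update(items)

    def __contains__(self, item):
        return item in self._wrapped

    def __len__(self):
        return len(self._wrapped)


def load_defaults(target):
    """Some downstream code that only needs add() and update()."""
    target.update(["a", "b"])
    target.add("c")


def import_and_count(existing):
    # Wrap an already-constructed set for the duration of this call only.
    counted = InstrumentedSet(existing)
    load_defaults(counted)
    return counted.add_count


shared = {"z"}
print(import_and_count(shared), sorted(shared))
```

After `import_and_count` returns, `shared` is an ordinary set again; only the code that ran inside the call saw the instrumented wrapper.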
25:12 SH: So that’s nice. Testability, this is usually the main thing that people tout as an advantage of dependency injection. And you can see why, pretty easily. Most people who have written tests have seen something like this. You make this user object, you call save, you run your test and then you get this weird error about a database server not being able to be looked up. And you’re like, “What, I just made a user object and saved it. Where did this database stuff come from?” And as Miško Hevery puts this, the user class lied. It lied about its interface. It said, “You give me an email and password and then you can save me.” But that wasn’t true, it needed a database too. Its interface didn’t tell you that. It misled you.
25:50 SH: So in dependency injection, it doesn’t mislead you. We could say maybe that the user-save method takes a database, because it needs a database, so then your tests will know, “Oh, I have to pass in a database. Well I’ll construct a fake one. Now my test will be fine.”
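In Python, the honest version of that interface might look like the sketch below. `FakeDatabase` and its `insert` method are hypothetical; the point is only that `save` names the database it needs.

```python
class FakeDatabase:
    """Hypothetical in-memory fake with the one method User needs."""

    def __init__(self):
        self.rows = []

    def insert(self, table, row):
        self.rows.append((table, row))


class User:
    def __init__(self, email, password):
        self.email = email
        self.password = password

    def save(self, database):
        # The dependency is visible in the signature, so a test knows
        # it has to supply one (real or fake).
        database.insert("users", {"email": self.email})


db = FakeDatabase()
User("ada@example.com", "secret").save(db)
print(db.rows)
```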
26:05 SH: It can help you with debugging. Very common thing, we’ll set some global flag to enable debug logging in our database, in our program. That’s nice, but maybe there’s a huge program with tons of things doing database calls and I just want to debug the calls coming from the transaction manager. Well, with composition and dependency injection style, I make my database and I’ll pass that database into the user manager and a thousand other classes. But when I make the transaction manager I’ll pass it this instrumented or wrapped logging database and that way all the calls that come from my transaction manager will get this extra logging, while the rest of the program won’t do that logging. So that’s nice, gives us more flexibility.
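A small sketch of that selective logging, under the assumption of a `Database` class with a single `query` method; every class and SQL string here is made up for illustration.

```python
class Database:
    """Hypothetical database client."""

    def query(self, sql):
        return "result of: " + sql


class LoggingDatabase:
    """Wraps a database and records every query that passes through it."""

    def __init__(self, wrapped, log):
        self._wrapped = wrapped
        self._log = log

    def query(self, sql):
        self._log.append(sql)
        return self._wrapped.query(sql)


class UserManager:
    def __init__(self, database):
        self._database = database

    def find(self, user_id):
        return self._database.query("SELECT * FROM users WHERE id = %d" % user_id)


class TransactionManager:
    def __init__(self, database):
        self._database = database

    def charge(self, user_id, cents):
        return self._database.query(
            "INSERT INTO charges VALUES (%d, %d)" % (user_id, cents)
        )


debug_log = []
database = Database()
user_manager = UserManager(database)  # gets the plain database
transaction_manager = TransactionManager(LoggingDatabase(database, debug_log))

user_manager.find(7)
transaction_manager.charge(7, 1299)
print(debug_log)  # only the transaction manager's query was logged
```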
26:42 SH: And lastly and probably most importantly, using dependency injection and composition, I find, improves the design of your program. I talked about this a bit already. But maybe we’ve got this, and this is real code, sadly, I wrote it. Real pipeline handler, whatever that does, looks pretty innocent here, but actually this code is in dependency injection style and you look at this constructor and it’s got this shard manager and this writer factory and this DNS cache and the stat keeper. All these things and it’s like, “Whoa!” In dependency injection style, it’s pretty obvious that this class probably should be refactored into smaller pieces. Maybe it’s gotten a little too big.
27:21 SH: Let’s talk about movin’ out west. So, you’re working on your code and it’s all static binding and singletons. Maybe it’s not even singletons, it’s all just static calls. And it’s got… You can’t test it and it’s hard to debug and it’s hard to understand and we’re gonna have to go pretty fast through this. And you’re like, “Well I wish these were all in dependency injection style. But I don’t know how to move it over. It’s like too late now. And you know it’s not my fault. There was a guy who was here before me. He did all this stuff. I just parroted it.” We’ve all been there. But what can you do?
27:59 SH: Well, actually you can fix things and I will show you how, quickly. Step one, make one shared place for all your dependencies. One global store for all your dependencies and let’s call it wyoming.py, because we’re talking about state. So make wyoming.py and I got my session factory and I’ve got my template manager and I’ve got all my other objects, as global variables. Then, in your application entry point, when you’re building everything, initialize that shared place. So it’s here that I am going to construct the SQLAlchemy object and put it into Wyoming. Construct the template environment. Put it into Wyoming.
28:42 SH: And then finally, go… And this is the hard part. Go out into your code and find all those sites that were accessing the Jinja library and the SQLAlchemy library and making all these static calls and replace them with references to Wyoming. So all those calls to your dependencies, you want to replace with references to Wyoming. This is a big change, but you can make it incrementally. You can change one class at a time or just part of a class at a time. You can slowly move and that’s important ’cause refactoring must be incremental or you will just break everything, which you don’t want to do.
29:15 SH: So once you’ve done all this, you’ve moved to singleton. Now Wyoming contains all of your singletons. An important thing to remember is when you’re doing this process, don’t put newly created dependencies into Wyoming. Wyoming is not the right answer, it’s just an intermediate home on the way to the right answer, which is dependency injection. So if you start this process and then you say, “Oh, we’re gonna start using Braintree to process credit cards and we’ve got this Braintree proxy object.” Don’t put that into Wyoming. Just use dependency injection from the beginning. Wyoming is just for the existing statically bound dependencies.
29:50 SH: Then once we start down that road, then we can move forward with the final step, which is getting to dependency injection. And step one there is to go through all those objects that were accessing Wyoming, and take those Wyoming accesses and just move them up into the constructor, store them as references in instance variables, and then all the rest of the code just accesses those instance variables. So, Wyoming is only accessed from the constructor. You go around and do this in all your classes and once you’ve got that, you can go a step further, take those Wyoming references that were in your constructor, and actually make them parameters, instead of Wyoming accesses. So now this MyApp object doesn’t access Wyoming at all, anymore. It doesn’t access any statically bound things at all. It just says I need a session factory, I need a template manager. So here’s where we’ve introduced dependency injection, which is great.
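The two constructor steps can be sketched in one file as follows. Here `wyoming` is a `types.SimpleNamespace` standing in for the wyoming.py module, and `EchoTemplateManager`, the lambda session factory, and the two class names are hypothetical placeholders for the real SQLAlchemy and template objects.

```python
import types

# Stand-in for wyoming.py: one global store for legacy dependencies.
wyoming = types.SimpleNamespace(session_factory=None, template_manager=None)


class EchoTemplateManager:
    """Hypothetical template manager used only for this sketch."""

    def render(self, template_name, **context):
        return "%s for %s" % (template_name, context["user"])


class MyAppViaWyoming:
    """Intermediate step: Wyoming is touched only in the constructor."""

    def __init__(self):
        self._session_factory = wyoming.session_factory
        self._template_manager = wyoming.template_manager

    def get_response(self, request):
        session = self._session_factory()
        return self._template_manager.render("hello.html", user=session["user"])


class MyApp:
    """Next step: the same dependencies become constructor parameters."""

    def __init__(self, session_factory, template_manager):
        self._session_factory = session_factory
        self._template_manager = template_manager

    def get_response(self, request):
        session = self._session_factory()
        return self._template_manager.render("hello.html", user=session["user"])


def main():
    # The entry point initializes Wyoming, then passes its contents in.
    wyoming.session_factory = lambda: {"user": "ada"}
    wyoming.template_manager = EchoTemplateManager()
    return MyApp(wyoming.session_factory, wyoming.template_manager)


app = main()
print(app.get_response(request=None))
```

Only the constructor differs between the two classes, which is what lets the change be made one class at a time.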
30:38 SH: And so now, down here, when we construct the app, that’s where we access Wyoming. That’s where we pull the statically bound global variables and pass them in. And this will be a long process, where usually… MyApp won’t have been made by main, it will have been made by some other object. So, you move the Wyoming calls into that other object’s construction site, and then you move them into that class’s constructor and pull them out into its instantiation sites and so on.
31:03 SH: But eventually, you’ll pull all of your Wyoming accesses right here, into your entry point, which is a great place to be, because then you can observe. You don’t need Wyoming, anymore. Your entry point was initializing Wyoming with all these global objects, and then it’s passing them in. You don’t need it. You can just have your entry point, initialize all the objects, and pass them straight in and then you don’t need any global references to them anymore. And this is very typical in dependency injection style code, is your entry point code will build a whole bunch of environments and wire them all up to each other. And then that’s it. From then on, there’s no building dependencies or accessing them statically. They’re all just in instance variables. Farewell, Wyoming.
31:46 SH: And that is all the time I have. These are people whose ideas I built on. I didn’t come up with any of this on my own. And our own Chris Mueller and Eric Naeseth gave me early feedback on this. So, thank you to them. Also Chris coined the Wyoming term, I believe. So, congrats. That is all.