How (and why) Tomcat uses custom Class Loaders to load web applications

What are Class Loaders?

When the JVM needs to use a class (e.g. to execute an instance method), it first has to find that class file and then load its contents. The JVM uses class loaders for this task: they locate the class files somewhere (e.g. in a local directory specified on the CLASSPATH) and load the Java classes defined inside them.

  • JVMs usually come bundled with several class loaders that load different kinds of classes –
    • Bootstrap class loader – loads all classes from the core Java API (e.g. String, Integer)
    • Standard Extensions class loader – loads the classes downloaded and installed as part of standard extensions ($JAVA_HOME/jre/lib/ext). This applies to Java 8 and earlier; since Java 9 its place is taken by the platform class loader
    • Class path class loader – loads classes from the directories and JARs specified on the classpath (this includes classes belonging to the running application)
  • In addition to these, Java applications may create their own custom class-loaders by extending the java.lang.ClassLoader class (e.g. Tomcat uses custom class-loaders to load web applications)
  • Each class which has been loaded inside the JVM is associated with the class loader that loaded it. E.g. you can get the classloader that loaded the String class by calling String.class.getClassLoader()
  • Class-loaders are tied to each other in a parent-child fashion –

[Figure: class loader hierarchy]

  • When a class-loader is asked to load a class, it first asks its parent class-loader to load that class (and the parent class-loader asks its own parent, and so forth). If a parent class-loader can find and load the class, it does so and the flow ends. Only if the parent can’t does the child class-loader try to load the class itself. This way, class-loaders usually “delegate” class-loading requests to their parent first
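
The parent-child chain described above can be inspected from any running program. The following is a minimal sketch; the class name LoaderChainDemo and the helper chainOf are made up for illustration, and the loader names it prints depend on the JVM version in use.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /**
     * Walks from the given loader up through its parents and returns the
     * class name of each loader. The bootstrap loader is represented by
     * null in the Java API, so it is added as a descriptive last entry.
     */
    static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            chain.add(current.getClass().getName());
        }
        chain.add("bootstrap (represented as null)");
        return chain;
    }

    public static void main(String[] args) {
        // Core API classes come from the bootstrap loader, so this is null.
        System.out.println(String.class.getClassLoader());

        // Application classes are loaded further down the hierarchy.
        chainOf(LoaderChainDemo.class.getClassLoader()).forEach(System.out::println);
    }
}
```

The first line printed is null because the bootstrap loader has no ClassLoader object of its own. The remaining lines list the loaders from the application's loader upwards.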

Benefits of the class-loader architecture

  • It protects the boundaries of trusted libraries
    • When a class references another class, the JVM uses the same class loader that loaded the referencing class to load the referenced class. E.g. ConcurrentHashMap is loaded by the bootstrap class-loader, and internally it references many other classes (e.g. java.util.Arrays). While resolving those dependencies, the JVM uses the bootstrap loader, which reliably loads those classes from the core API library only. Nobody can put a malicious version of the Arrays class on the class-path and trick the JVM into using that instead
  • It creates separate namespaces inside the JVM
    • Classes loaded by a class-loader are visible to its child class-loaders, but NOT to its sibling class loaders. This way, classes loaded by class-loaders that are not in the same hierarchy actually belong to different namespaces. They cannot see/detect each other and cannot interact with each other (unless explicit security configuration is done)

[Figure: class visibility across the class loader hierarchy]

    • Given this, it’s possible to load the same class multiple times by using multiple class-loaders. This also means that to identify a class in a running JVM, its fully qualified name is NOT enough – we need to know the class-loader as well
    • We can apply a separate security policy to the classes loaded by each class-loader. This gives us a powerful way to segregate and secure classes loaded from different libraries according to how much we trust each of those libraries
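
To make the namespace point concrete, here is a small sketch that loads the same class through two sibling class-loaders. NamespaceDemo, Greeter and loadInFreshNamespace are hypothetical names, and the sketch assumes the compiled classes live in a directory or JAR on disk, so that a code source URL is available.

```java
import java.net.URL;
import java.net.URLClassLoader;

class NamespaceDemo {
    /** Hypothetical stand-in for a class shipped by some library. */
    static class Greeter {}

    /** Loads Greeter through a brand-new loader whose only parent is the bootstrap loader. */
    static Class<?> loadInFreshNamespace() {
        // Assumption: this class was compiled to a directory or JAR on disk.
        URL codeLocation = Greeter.class.getProtectionDomain().getCodeSource().getLocation();

        // A null parent means "bootstrap only". The bootstrap loader cannot
        // find Greeter, so after delegating the new loader defines it itself.
        // The loader is deliberately left open for the lifetime of the demo.
        URLClassLoader loader = new URLClassLoader(new URL[] { codeLocation }, null);
        try {
            return loader.loadClass(Greeter.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Greeter not found at " + codeLocation, e);
        }
    }

    public static void main(String[] args) {
        Class<?> first = loadInFreshNamespace();
        Class<?> second = loadInFreshNamespace();

        // Same fully qualified name in both namespaces.
        System.out.println(first.getName().equals(second.getName()));
        // Yet the JVM treats them as two unrelated classes.
        System.out.println(first == second);
        System.out.println(first.getClassLoader() == second.getClassLoader());
    }
}
```

Under the stated assumption this prints true, false, false: the names match, but each loader produced its own Class object. That is why a class is identified by its name together with its class-loader.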

How Tomcat benefits from this architecture

  • The Tomcat server and the applications installed on it all run in a single JVM (in a non-clustered environment)
  • Tomcat uses a well-crafted class-loader hierarchy to load the server code, the code shared by all applications and each application’s code –

[Figure: Tomcat class loader hierarchy]

  • Class-loaders used by Tomcat are as follows (exact details depend on the JVM being used) –
    • Bootstrap – loads the core Java API and the standard extensions
    • System – normally loads the classes specified on the CLASSPATH. However, Tomcat’s standard startup scripts ignore the CLASSPATH entirely and build a system class-loader which looks at JARs in the $CATALINA_HOME/bin directory. These JARs are the building blocks of the Tomcat server
    • Common – loads the classes that are shared by all applications ($CATALINA_BASE/lib and $CATALINA_HOME/lib). These are visible to the Tomcat server as well. As a good practice, application classes should not be placed here. Each application should include the classes it needs in its own WAR file
    • Web-application class-loaders – load the web applications installed on the server. One class-loader is created per web application. Since these class-loaders are siblings of each other, they can’t see the classes loaded by each other
  • When a web application class-loader is asked to load a class, it does not always delegate to the parent first. Instead, it tries to load the class itself first (with some exceptions, e.g. the core Java API and Java EE API classes). This way, a web application can override the common classes shared by all applications
  • Since Tomcat uses a separate class-loader per application, each application gets sandboxed in a separate namespace. This comes with the following benefits –
    • Classes loaded by an application cannot adversely impact other applications, and vice versa
    • Applications can load classes with the same names. Even better, those classes may come from different versions of a library (e.g. Spring 2 vs. Spring 3). Thus, applications do not get into dependency conflicts with each other
    • Reloading classes for one application doesn’t impact others
    • You can install a separate security policy per application depending on the level to which you trust the application
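
The “try the application first, then the parent” order used for web applications can be sketched with a small custom loader. This is an illustration of the general child-first technique only, not Tomcat’s actual implementation; ChildFirstDemo, ChildFirstLoader, Library and loadThroughChild are hypothetical names, and the sketch assumes the compiled classes live in a directory or JAR on disk.

```java
import java.net.URL;
import java.net.URLClassLoader;

class ChildFirstDemo {
    /** Hypothetical class that is visible to the parent AND packaged with the "application". */
    static class Library {}

    /** Looks in its own URLs before delegating, except for core java.* classes. */
    static class ChildFirstLoader extends URLClassLoader {
        ChildFirstLoader(URL[] urls, ClassLoader parent) {
            super(urls, parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null && !name.startsWith("java.")) {
                    try {
                        loaded = findClass(name); // search our own URLs first
                    } catch (ClassNotFoundException notLocal) {
                        // Not packaged with the application; fall back to the parent below.
                    }
                }
                if (loaded == null) {
                    return super.loadClass(name, resolve); // standard parent-first path
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }
    }

    /** Loads the named class through a fresh child-first loader. */
    static Class<?> loadThroughChild(String className) {
        // Assumption: this class was compiled to a directory or JAR on disk.
        URL codeLocation = Library.class.getProtectionDomain().getCodeSource().getLocation();
        ChildFirstLoader child =
                new ChildFirstLoader(new URL[] { codeLocation }, ChildFirstDemo.class.getClassLoader());
        try {
            return child.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        Class<?> overridden = loadThroughChild(Library.class.getName());
        // The child's own copy wins even though the parent can also see Library.
        System.out.println(overridden.getClassLoader().getClass().getSimpleName());
        System.out.println(overridden == Library.class);
        // Core API classes are still delegated upwards.
        System.out.println(loadThroughChild("java.lang.String") == String.class);
    }
}
```

Under the stated assumption this prints ChildFirstLoader, false, true. The application’s copy of Library shadows the parent’s copy, while java.lang.String still comes from the bootstrap loader.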

References

  • https://tomcat.apache.org/tomcat-8.0-doc/class-loader-howto.html
