Engineering Fitness

Instrumenting Hibernate Connection Providers

Note: this article is written with the assumption that the reader has existing knowledge of the Hibernate framework. For an introduction to Hibernate, see http://hibernate.org/orm.

Code for this post may be downloaded here

Background

Many of the Java web applications we develop at Fitbit leverage Hibernate as an object-relational mapping (ORM) framework. Hibernate provides many abstractions around pluggable services used by the framework. In most situations, there is no need to go beyond the default implementations. One common exception is the ConnectionProvider, which Hibernate uses to both open and close connections to the database. Here’s a simplified diagram of how the Session and ConnectionProvider interact:

[Figure: component diagram of the interaction between Session and ConnectionProvider]

The ConnectionProvider abstraction lends itself to intercepting the opening and closing of connections through some other implementation. This can be leveraged to add connection pooling without affecting the way the physical database connections are opened or closed. We use the popular c3p0 connection pool library, for which Hibernate provides an out-of-the-box provider implementation. Although c3p0 and Hibernate expose their own sets of metrics, they do so only through JMX, and therefore lack support for “true” min/max/mean and percentile data. Our need to improve visibility into our application’s behavior led us to implement our own ConnectionProvider that introduced a listener abstraction around connection events.

The provided metrics lack true min/max values and percentiles because they are read as instantaneous snapshots of JMX attributes rather than accumulated in in-memory (time-framed) histograms. We encountered a few production incidents involving connection pools filling up. At the time, we did not yet have the histograms added by our implementation, so we could not accurately assess how long connections were held or the true min/max number of in-use connections over a period of time. Implementing our own ConnectionProvider that wraps the c3p0 implementation allowed us to add our metrics and monitoring hooks in a single place.
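To illustrate why polled snapshots can miss the true maximum, here is a minimal sketch. The InUseWindow class and its sampled(..) helper are hypothetical names invented for this example and are not part of our provider; the sketch only compares observing every change of an in-use count against observing every n-th value.

```java
import java.util.List;

/** Hypothetical helper: tracks the min/max of the in-use connection count it is shown. */
final class InUseWindow {
    private int min = Integer.MAX_VALUE;
    private int max = Integer.MIN_VALUE;

    /** Records one observation of the in-use connection count. */
    void record(int inUse) {
        min = Math.min(min, inUse);
        max = Math.max(max, inUse);
    }

    int min() { return min; }

    int max() { return max; }

    /**
     * Feeds every n-th value (starting with the first) into a new window.
     * every == 1 mimics event-driven recording; larger values mimic periodic polling.
     */
    static InUseWindow sampled(List<Integer> values, int every) {
        InUseWindow window = new InUseWindow();
        for (int i = 0; i < values.size(); i += every) {
            window.record(values.get(i));
        }
        return window;
    }
}
```

With the in-use counts 2, 3, 20, 4, 3, 2, polling every third value only ever sees 2 and 4, so the burst to 20 connections goes unnoticed, while recording every change captures it.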

The first metric we added was a measure of how long a connection was held before being released back to c3p0. Although this metric has many uses, we found it critical in assessing possible root causes of complete connection pool exhaustion. An exhausted connection pool may be due to overuse and/or misconfiguration, and it is difficult to distinguish one from the other without this metric. A spike in the 98th-percentile hold time suggests an issue with the code, whereas no change in the trend of this metric suggests that the problem is more likely in the pool configuration or the request load. Using our new listener abstraction, implementing this metric was as simple as creating a new class that uses a stopwatch to measure the time between acquiring and closing connections.
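For readers who want to see what a percentile over recorded hold times looks like, the following is a minimal sketch using the nearest-rank method. The HoldTimePercentiles class is a hypothetical name for this example; it is not how our metrics library computes percentiles, and a production histogram would typically use a bounded, time-framed reservoir instead of sorting every sample.

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentiles over recorded connection hold times. */
final class HoldTimePercentiles {

    /**
     * Returns the smallest sample such that at least p percent of all samples
     * are less than or equal to it (nearest-rank definition).
     */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        // rank is 1-based; multiply before dividing to limit floating-point error
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

For example, with 97 hold times of 10 ms and 3 of 5000 ms, the median stays at 10 ms while the 98th percentile jumps to 5000 ms, which is the kind of spike that points at slow code paths holding connections.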

As time went on, more and more code was added to our custom provider until the class became unwieldy. As the provider became crammed with more and more metrics and diagnostics features, it became difficult to not only read and understand the code, but also to test the new features in isolation. It no longer felt maintainable by anybody but those most familiar with that code. A major refactoring was necessary.

Design Decisions

Several features were now hooking into various phases of Hibernate’s connection lifecycle. This lent itself perfectly to the Event Listener pattern. Interception of the lifecycle phases could translate to callback invocations on registered listeners. Although we only cared to intercept the getConnection() and closeConnection(..) methods, some features needed to execute code before and/or after acquiring (or closing) a connection. This led to four unique events for ConnectionProviderListeners, each grouped under its own sub-interface.

  • PreConnectionAcquisitionListener
  • PostConnectionAcquisitionListener
  • PreConnectionCloseListener
  • PostConnectionCloseListener

The term “acquired” was chosen to be more generalizable between providers using a connection pool and those dealing with physical connections themselves.
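The dispatch idea behind these listener types can be sketched roughly as follows. This is a simplified illustration, not the actual InstrumentedConnectionProvider: every type name below is a hypothetical stand-in, a generic type C and a Supplier replace java.sql.Connection and the delegate provider so the example runs on the JDK alone, and only the acquisition side is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins for the listener types described in this article.
interface ProviderListener {}

interface PreAcquisitionListener extends ProviderListener {
    void beforeAcquisition();
}

interface PostAcquisitionListener<C> extends ProviderListener {
    void afterAcquired(C connection);
    void afterAcquisitionFailed(Throwable exc);
}

/** Wraps a delegate and translates each acquisition into listener callbacks. */
final class ListenerDispatchSketch<C> {
    private final Supplier<C> delegate;
    private final List<ProviderListener> listeners = new ArrayList<>();

    ListenerDispatchSketch(Supplier<C> delegate) {
        this.delegate = delegate;
    }

    void addListener(ProviderListener listener) {
        listeners.add(listener);
    }

    @SuppressWarnings("unchecked")
    C getConnection() {
        // pre-acquisition: no connection exists yet, so none is passed
        for (ProviderListener l : listeners) {
            if (l instanceof PreAcquisitionListener) {
                ((PreAcquisitionListener) l).beforeAcquisition();
            }
        }
        C connection;
        try {
            connection = delegate.get();
        } catch (RuntimeException e) {
            // post-acquisition (failure): report the exception, then rethrow it
            for (ProviderListener l : listeners) {
                if (l instanceof PostAcquisitionListener) {
                    ((PostAcquisitionListener<C>) l).afterAcquisitionFailed(e);
                }
            }
            throw e;
        }
        // post-acquisition (success): the connection is now available
        for (ProviderListener l : listeners) {
            if (l instanceof PostAcquisitionListener) {
                ((PostAcquisitionListener<C>) l).afterAcquired(connection);
            }
        }
        return connection;
    }
}

/** Example listener that subscribes to both acquisition phases and records what it sees. */
final class RecordingListener implements PreAcquisitionListener, PostAcquisitionListener<String> {
    final List<String> events = new ArrayList<>();

    @Override public void beforeAcquisition() { events.add("before"); }
    @Override public void afterAcquired(String connection) { events.add("acquired:" + connection); }
    @Override public void afterAcquisitionFailed(Throwable exc) { events.add("failed:" + exc.getMessage()); }
}
```

Because a listener opts into events by implementing only the sub-interfaces it needs, a feature that only cares about failures never has to stub out the other callbacks. The close-side events would follow the same shape.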

Two new Hibernate properties were added to support configuration of the InstrumentedConnectionProvider:

  • “hibernate.connection.delegate_provider_class” specifies the provider implementation class name that should be wrapped and would otherwise be specified in the “hibernate.connection.provider_class” property
  • “hibernate.connection.provider_listener_classes” specifies a comma-delimited list of fully qualified class names, one for each connection provider listener that should be subscribed to the provider, in the order they should be invoked

The InstrumentedConnectionProvider is responsible for instantiating and managing the classes specified in these properties. While there are drawbacks to this approach, it fits with existing Hibernate conventions and keeps the InstrumentedConnectionProvider free from external dependencies.
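As a rough idea of what instantiating listeners from the “hibernate.connection.provider_listener_classes” value could involve, here is a minimal sketch. The ListenerClassLoaderSketch class is a hypothetical name for this example, and it assumes each listener class has a public no-argument constructor; the real provider would additionally verify that each instance implements ConnectionProviderListener.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: instantiates the classes named in a comma-delimited property value. */
final class ListenerClassLoaderSketch {

    /**
     * Instantiates each class in the list via its no-argument constructor,
     * preserving the order in which the names were configured.
     */
    static List<Object> instantiateAll(String commaDelimitedClassNames) {
        List<Object> instances = new ArrayList<>();
        if (commaDelimitedClassNames == null) {
            return instances;
        }
        for (String raw : commaDelimitedClassNames.split(",")) {
            String className = raw.trim();
            if (className.isEmpty()) {
                continue; // tolerate stray commas and whitespace
            }
            try {
                instances.add(Class.forName(className).getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(
                        "Cannot instantiate listener class: " + className, e);
            }
        }
        return instances;
    }
}
```

Failing fast with the offending class name makes a misconfigured property visible at startup instead of surfacing later as missing metrics.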

Implementation Details

Each sub-interface of ConnectionProviderListener has callbacks that receive parameters that are available at the time of the event. In the case of pre-acquisition there will not be a reference to any java.sql.Connection available yet, since it has not yet been acquired.

/**
 * Listener type that is invoked immediately prior to acquiring a new connection
 * from the underlying connection provider implementation (e.g. c3p0).
 */
public interface PreConnectionAcquisitionListener extends ConnectionProviderListener {
 
   /**
    * Callback invoked immediately prior to attempting to acquire a connection.
    *
    * @param connectionProvider the connection provider being used to acquire the new
    * connection
    */
   void beforeConnectionAcquisition(InstrumentedConnectionProvider connectionProvider);
}

Similarly, the post-acquisition callback has the connection available if it succeeds and the exception if it fails.

/**
 * Listener type that listens only to post-connection acquisition events, which includes
 * both successful acquisitions and failures.
 */
public interface PostConnectionAcquisitionListener extends ConnectionProviderListener {
 
   /**
    * Callback invoked immediately after <strong>successfully</strong> acquiring a
    * connection.
    *
    * @param connectionProvider the connection provider invoking this callback
    * @param connection the acquired connection
    */
   void afterConnectionAcquired(InstrumentedConnectionProvider connectionProvider,
                     Connection connection);
 
   /**
    * Callback invoked when an exception occurs (<strong>failure</strong>) during
    * connection acquisition.
    *
    * @param connectionProvider the connection provider invoking this callback
    * @param exc the exception thrown
    */
   void afterConnectionAcquisitionFailed(InstrumentedConnectionProvider
           connectionProvider, Throwable exc);
}

Listener in Action

The ConnectionPerformanceMetricListener class adds support for measuring how long connections have been held before they are closed. This requires callbacks both after acquiring a connection and prior to closing that connection.

public class ConnectionPerformanceMetricListener implements
   PostConnectionAcquisitionListener, PreConnectionCloseListener

Hibernate reuses a connection provider across multiple threads, so any state information in the listener has to be thread-local to avoid interfering with measurements being taken for a different thread (and Connection). There is also a possibility that the ConnectionProvider will have its getConnection() method called more than once prior to releasing one of the acquired connections. Keeping track of the “acquisition depth” is necessary to ensure that we do not begin using a thread-local timer that is still in use. This would occur if two (or more) getConnection() calls are made before the first timing is measured in the closeConnection(..) call.

/**
 * Thread-local stopwatch that is used to measure the time a connection is held for
 * when acquired from the owning provider prior to being closed/released.
 */
private final ThreadLocal<Stopwatch> connectionUsageStopwatch = 
 new ThreadLocal<Stopwatch>() {
   @Override
   protected Stopwatch initialValue() {
       // use the ticker that was provided so we can mock timings
       return Stopwatch.createUnstarted(ticker);
   }
};
 
/**
 * Keeps track of the depth in connection acquisition calls to the specific
 * ConnectionProvider this listener is attached to. Since we are using thread-locals 
 * here, we cannot support nested connection acquisitions, and this counter ensures
 * that we do not attempt to do so.
 */
private final ThreadLocalCounter connectionAcquisitionDepth = new ThreadLocalCounter();
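The ThreadLocalCounter class is not shown in this article. A minimal sketch of what such a counter might look like follows; ThreadLocalCounterSketch and its readFromNewThread(..) helper are hypothetical, and only the three method names (incrementAndGet, decrementAndGet, getValue) are taken from the listener code in this post.

```java
/** Hypothetical per-thread counter; the actual ThreadLocalCounter may differ. */
final class ThreadLocalCounterSketch {
    // a one-element array gives each thread its own mutable int without boxing
    private final ThreadLocal<int[]> value = ThreadLocal.withInitial(() -> new int[1]);

    int incrementAndGet() {
        return ++value.get()[0];
    }

    int decrementAndGet() {
        return --value.get()[0];
    }

    int getValue() {
        return value.get()[0];
    }

    /** Reads the counter from a fresh thread, to show that threads do not share a value. */
    static int readFromNewThread(ThreadLocalCounterSketch counter) {
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = counter.getValue());
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

No synchronization is needed because each thread only ever touches its own array, which is also why the depth tracked this way describes a single thread's acquisitions and nothing more.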

We allow the ticker to be configurable here so that unit tests do not require calls to Thread.sleep(long).

/**
 * The ticker to use for stopwatches in this class. Start with the system ticker but
 * allow any test code to override so that it can measure timings without requiring any
 * {@link Thread#sleep(long)} calls.
 */
private Ticker ticker = Ticker.systemTicker();
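To show how an injectable time source removes the need for sleeping in tests, here is a JDK-only sketch. It deliberately does not use the Stopwatch and Ticker classes from the listener code; InjectableStopwatch and FakeNanoClock are hypothetical analogs written for this example so that it runs without extra dependencies.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical stopwatch driven by an injectable nanosecond time source. */
final class InjectableStopwatch {
    private final LongSupplier nanoSource;
    private long startNanos;
    private long elapsedNanos;
    private boolean running;

    InjectableStopwatch(LongSupplier nanoSource) {
        this.nanoSource = nanoSource;
    }

    InjectableStopwatch reset() {
        elapsedNanos = 0;
        running = false;
        return this;
    }

    InjectableStopwatch start() {
        if (running) {
            throw new IllegalStateException("stopwatch is already running");
        }
        running = true;
        startNanos = nanoSource.getAsLong();
        return this;
    }

    InjectableStopwatch stop() {
        if (!running) {
            throw new IllegalStateException("stopwatch is not running");
        }
        elapsedNanos += nanoSource.getAsLong() - startNanos;
        running = false;
        return this;
    }

    long elapsed(TimeUnit unit) {
        long nanos = running
                ? elapsedNanos + nanoSource.getAsLong() - startNanos
                : elapsedNanos;
        return unit.convert(nanos, TimeUnit.NANOSECONDS);
    }
}

/** Manually advanced clock for tests; time only moves when advance(..) is called. */
final class FakeNanoClock implements LongSupplier {
    private long nowNanos;

    void advance(long amount, TimeUnit unit) {
        nowNanos += unit.toNanos(amount);
    }

    @Override
    public long getAsLong() {
        return nowNanos;
    }
}
```

Production code would pass System::nanoTime as the time source, while a test advances a FakeNanoClock by exactly 250 ms and asserts on the recorded duration without waiting.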

Timing the connection checkout can be accomplished by using a timer that is reset after acquiring a connection and recorded when the connection is being closed. A listener can easily do this by hooking into the afterConnectionAcquired(..) and beforeClosingConnection(..) events.

@Override
public void afterConnectionAcquired(InstrumentedConnectionProvider connectionProvider,
               Connection connection) {
   // reset and start the usage stopwatch now if this is the only connection
   // currently checked out from this provider on this thread (i.e. there is not
   // already one that has been acquired but *not* released)
   int depth = connectionAcquisitionDepth.incrementAndGet();
   if (depth == 1) {
       connectionUsageStopwatch.get().reset().start();
       connectionMetricReporter.recordConnectionAcquired(/*isTopLevel=*/true);
   } else {
       // record that we did this but do not consider it an 'acquisition' in the
       // metrics unless we are also timing it
       connectionMetricReporter.recordConnectionAcquired(/*isTopLevel=*/false);
   }
}
 
@Override
public void afterConnectionAcquisitionFailed(
   InstrumentedConnectionProvider connectionProvider, Throwable exc) {
   // simply record failure, no timing is necessary
   connectionMetricReporter.recordAcquisitionFailure(
               connectionAcquisitionDepth.getValue() == 0, exc);
}

You’ll notice that we are recording this “acquisition depth.” This counter is to ensure that we do not accidentally interfere with a timing that is in progress in cases where getConnection() might be called multiple times on one provider prior to closing the connections.  

@Override
public void beforeClosingConnection(Connection connection) {
   int depth = connectionAcquisitionDepth.decrementAndGet();
   // make sure we haven't released a connection more times than we have acquired one
   if (depth < 0) {
       throw new IllegalStateException("getConnection and closeConnection calls are " + 
           "not balanced. Is a connection being closed multiple times?");
   } else if (depth == 0) {
       // this marks the last time the user code held the connection, so we can halt
       // the usage duration stopwatch but wait until after the operation completes to 
       // record it
       long usageDurationMillis = connectionUsageStopwatch.get().stop().elapsed(TimeUnit.MILLISECONDS);
       connectionMetricReporter.recordConnectionClosed(/*isTopLevel=*/true, usageDurationMillis);
   } else {
       connectionMetricReporter.recordConnectionClosed(/*isTopLevel=*/false, -1);
   }
}

Similar to the acquisition events, the listener must hook into both the before- and after-close events so that it can measure the elapsed time during that operation. Since closing the connection is indicative of its consumer being done with it, this callback also stops and records the connection usage timing.

Conclusion

The InstrumentedConnectionProvider, as presented in this article, is the product of evolving Hibernate monitoring needs. Working with our monitoring code is much easier since refactoring our connection provider. Each feature is now isolated to its own listener class and can now be tested individually. We were able to eliminate the need for classpath visibility between the provider and our listener implementations by using Hibernate properties, allowing our solution to evolve into the generally usable implementation presented in this article.  The connection provider is just one of the many Hibernate abstractions that can be configured to use user code, empowering developers to write a variety of crosscutting monitoring features.

This article is the first in a series of posts about instrumenting Hibernate. The next article will cover Hibernate Transactions and discuss a long-running transaction detection mechanism created using the TransactionFactory interface.

About the Author

David Garson joined Fitbit in 2014 and has been working as a Site Reliability Engineer since the team was formed. David is a coding enthusiast who specializes in incident diagnostics and application visibility. He enjoys working with distributed systems and has a passion for creating and contributing to tools and frameworks relating to application visibility, extensibility, and configurability.
