What if CDI and JPA were friends?

Romain Manni-Bucau, 2016-11-02, 8 min and 18 sec read

Java EE 7 brought JPA and CDI integration, making it possible to get injections in entity listeners. This is a nice feature, but it doesn't change the programming model as much as CDI could. Let's see if we can give more power to the user by relying on CDI more structurally!

What JPA looks like today

Today JPA has two available modes:

RESOURCE_LOCAL
JTA

Strictly speaking, both could be used by anyone, but in practice they are often equivalent to:

Application
Container

Both provide access to an EntityManager, but what happens before that point differs a bit. Let's dig quickly into it.

Resource local

In terms of code, you get an EntityManagerFactory (and from it an EntityManager) using:

String unitName = "...";
Map<String, Object> programmaticConfig = ...;
EntityManagerFactory factory = Persistence.createEntityManagerFactory(unitName, programmaticConfig);

The pitfall of resource local mode is that you need to handle transactions yourself instead of relying on JTA, which has @Transactional (but you can write your own interceptor):

entityManager.getTransaction().begin();
try {
  doSomeWork(entityManager);
} catch (RuntimeException re) {
  entityManager.getTransaction().rollback();
  throw re;
}
entityManager.getTransaction().commit();

Side note: most of the time you use a library like DeltaSpike or Spring to hide this boilerplate.
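For readers not using such a library, the begin/commit/rollback sequence above can be written once in a small helper. This is only a minimal sketch, not what DeltaSpike or Spring actually do: `Tx` is a hypothetical stand-in mirroring the three `EntityTransaction` methods used above (so the snippet compiles without a JPA jar), and `TxTemplate` is an illustrative name.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for javax.persistence.EntityTransaction:
// only the three methods used in the snippet above are mirrored.
interface Tx {
    void begin();
    void commit();
    void rollback();
}

// Illustrative helper (not part of JPA): runs some work between begin() and commit().
final class TxTemplate {
    private TxTemplate() {
        // static helper, no instance needed
    }

    static <T> T inTransaction(final Tx tx, final Supplier<T> work) {
        tx.begin();
        final T result;
        try {
            result = work.get();
        } catch (final RuntimeException | Error e) {
            // the work failed: undo it and let the caller see the original failure
            tx.rollback();
            throw e;
        }
        // same ordering as the original snippet: commit only once the work succeeded
        tx.commit();
        return result;
    }
}
```

With a real EntityManager, a tiny adapter delegating to entityManager.getTransaction() could play the role of `Tx`, and the work would typically be a lambda calling persist() or find().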

This code works if you define a persistence.xml with something like:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
             xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
                       http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
  <persistence-unit name="demo" transaction-type="RESOURCE_LOCAL">
    <class>org.demo.User</class>
    <properties>
      <!--
      some properties to create the database,
      define the datasource connection (driver, url, user, ...),
      ....
      -->
    </properties>
  </persistence-unit>
</persistence>

JTA: easier but....

With JTA, the persistence.xml is similar but drops the datasource connection information in favor of a datasource name resolved through JNDI or a container-specific lookup:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
             xmlns="http://java.sun.com/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
                       http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
  <persistence-unit name="demo">
    <class>org.demo.User</class>
    <properties>
      <!--
      some properties to create the database,
      ....
      -->
    </properties>
  </persistence-unit>
</persistence>

The usage is then far easier since @PersistenceContext makes it automatically available:

@PersistenceContext
private EntityManager em;

Transactions are then completely delegated to JTA so either an EJB or @Transactional on the enclosing method works well.

What's the pitfall?

So what's the issue between JPA and CDI, and why a post about it?

The integration between CDI and JPA specifies that the JPA provider gets the BeanManager (CDI) from the properties passed to the provider, so that it can do the work it needs (injections). This is great, but CDI should be able to rely on JPA in its startup phase, while the JPA provider may do the injections/lookups when the EntityManagerFactory is created. Both use cases are perfectly valid for the spec and... incompatible (chicken and egg, you said?). Of course some lazy loading on the JPA side makes it smoother, but not all providers (Hibernate ;)) do that without extending their SPI.
The DataSource: it is configured either through properties in persistence.xml or through a container-defined name. This requires some DataSource facade to be able to switch the datasource, or a look into the container configuration (in its own format) to see which one is defined, which implies splitting the configuration of the application.
Not very visible in the previous examples, but scanning is another pitfall: CDI scans for beans but JPA will likely rescan (I didn't find a way to disable scanning with Hibernate without writing a custom Hibernate scanner, for instance). The integration could rely on CDI to get this scanning for free and avoid paying the scanning time twice.
By extension, we can say the same of most of the configuration (validation mode, cache mode, properties, etc.) located in the persistence.xml: the configuration is split, not programmatic, or provider dependent.

Is there a solution?

Before giving some ideas, it is important to state in black and white that this section is NOT part of the specification.

Now that things are clear, what would the idea be? Consider that the persistence.xml could be handled programmatically... wait, isn't that already the case? Yes, exactly! That's why this integration is not "finished".

The JPA specification exposes the PersistenceUnitInfo API for containers. It is mainly a representation of the persistence.xml file that allows configuring the provider programmatically.

Note: not all providers support running without a persistence.xml, but some do, which means that with this API you don't need any XML anymore :).

PersistenceUnitInfo or programmatic JPA configuration

The PersistenceUnitInfo can be seen as a JPA configuration provider. It holds the JPA configuration:

unit name
transaction type (RESOURCE_LOCAL or JTA)
the datasource(s)
managed classes (entities, mapped superclasses, embeddables, etc.)
validation mode
cache mode
unit properties (where to put the DDL for instance)
....

Once you have configured your PersistenceUnitInfo (or, to be clearer, once you have implemented this interface), you can call the PersistenceProvider to get an EntityManagerFactory:

EntityManagerFactory factory =
  provider.createContainerEntityManagerFactory(info, new HashMap());

Your question at this point is probably how to get a provider instance. You have two options. Either instantiate the implementation you use directly:

new PersistenceProviderImpl()
  .createContainerEntityManagerFactory(info, new HashMap());

or use reflection to get it from the info configuration:

final PersistenceProvider provider;
try {
  provider = PersistenceProvider.class.cast(
    Thread.currentThread().getContextClassLoader()
      .loadClass(info.getPersistenceProviderClassName()).newInstance());
} catch (final Exception e) {
  // keep the original failure as the cause to ease debugging
  throw new IllegalArgumentException(
    "Bad provider: " + info.getPersistenceProviderClassName(), e);
}
final EntityManagerFactory factory = provider
  .createContainerEntityManagerFactory(info, new HashMap());
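The reflective lookup can also be factored into a small, type-checked helper. The sketch below is an assumption about how one might write it, not code from any JPA provider: `Instantiator` and `newInstance` are illustrative names, and it uses `getDeclaredConstructor().newInstance()` because `Class.newInstance()` has been deprecated since Java 9. It only depends on the JDK, so it can be tried without a provider on the classpath.

```java
// Illustrative helper (hypothetical name): loads a class by name with the
// context class loader and checks it implements the expected type.
final class Instantiator {
    private Instantiator() {
        // static helper, no instance needed
    }

    static <T> T newInstance(final String className, final Class<T> expectedType) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // no context loader set on this thread: fall back on our own loader
            loader = Instantiator.class.getClassLoader();
        }
        try {
            final Class<?> clazz = Class.forName(className, true, loader);
            // requires an accessible no-arg constructor
            return expectedType.cast(clazz.getDeclaredConstructor().newInstance());
        } catch (final ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Bad provider: " + className, e);
        }
    }
}
```

In the JPA case the call would presumably look like Instantiator.newInstance(info.getPersistenceProviderClassName(), PersistenceProvider.class).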

What about this HashMap? The second parameter of createContainerEntityManagerFactory is a Map. It is intended to pass from the EE container to the JPA provider all the integration points which can be useful. Concretely, this is where you can (or not) pass the Bean Validation factory and the CDI BeanManager to use:

final EntityManagerFactory factory = provider.createContainerEntityManagerFactory(info, new HashMap() {{
  put("javax.persistence.bean.manager", beanManager);
  put("javax.persistence.validation.factory", validatorFactory);
}});
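The double-brace initialization above creates an anonymous HashMap subclass each time, which can also capture the enclosing instance. A plain factory method avoids that. The following is a sketch under assumptions: `IntegrationProperties` is a hypothetical name, and the parameters are typed as Object only so that the snippet compiles without the CDI and Bean Validation APIs (in a real application they would be a BeanManager and a ValidatorFactory). The two keys are the ones used above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative helper (hypothetical name): builds the integration map
// passed as second parameter of createContainerEntityManagerFactory.
final class IntegrationProperties {
    static final String BEAN_MANAGER = "javax.persistence.bean.manager";
    static final String VALIDATOR_FACTORY = "javax.persistence.validation.factory";

    private IntegrationProperties() {
        // static helper, no instance needed
    }

    // Both values are optional: a null one is simply not registered.
    static Map<String, Object> of(final Object beanManager, final Object validatorFactory) {
        final Map<String, Object> properties = new HashMap<>();
        if (beanManager != null) {
            properties.put(BEAN_MANAGER, beanManager);
        }
        if (validatorFactory != null) {
            properties.put(VALIDATOR_FACTORY, validatorFactory);
        }
        return properties;
    }
}
```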

Side note: this only shows the EntityManagerFactory instantiation because it is the hard part, but don't forget to call close() on the factory when the application shuts down. It avoids leaks and releases everything the provider holds, such as CDI instances.

Wire it to CDI: the scanning

To get the JPA components, a CDI extension could collect them with this observer (to put in a CDI Extension implementation):

void collectEntities(
    @Observes
    @WithAnnotations({Entity.class, MappedSuperclass.class, Embeddable.class})
    final ProcessAnnotatedType<?> jpa) {
  jpaClasses.add(jpa.getAnnotatedType().getJavaClass().getName());
}

This builds a jpaClasses list of all scanned entities (JPA components, actually). Then, when creating the EntityManagerFactory, you simply set the managed classes to this value and excludeUnlistedClasses to true (to prevent another JPA scan).
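As an illustration of the second half of that idea, here is a minimal sketch of a holder the observer could feed and a PersistenceUnitInfo implementation could delegate to. `ScannedJpaClasses` is a hypothetical name, and the thread-safe sorted set is an assumption in case the container fires scanning events from several threads. The two read methods mirror getManagedClassNames() and excludeUnlistedClasses() of PersistenceUnitInfo.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative holder (hypothetical name) for the classes found by the CDI scanning.
final class ScannedJpaClasses {
    // sorted + concurrent: no duplicates, stable order, safe for concurrent additions
    private final Set<String> names = new ConcurrentSkipListSet<>();

    // what the observer would call for each matching annotated type
    void add(final Class<?> type) {
        names.add(type.getName());
    }

    // candidate return value for PersistenceUnitInfo#getManagedClassNames()
    List<String> managedClassNames() {
        return Collections.unmodifiableList(new ArrayList<>(names));
    }

    // candidate return value for PersistenceUnitInfo#excludeUnlistedClasses():
    // everything is listed explicitly, so the provider should not scan again
    boolean excludeUnlistedClasses() {
        return true;
    }
}
```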

What about the DataSource?

The DataSource is probably the biggest issue in JPA today: it relies either on the persistence.xml or on the container. In the first case the application enforces the configuration and leaves operation teams without any real way to tune the datasource/pool. In the second case the application depends entirely on the container for connection pooling and on the operation teams for pool tuning. Depending on the kind of company you work in and the way the application is delivered, both options can add complexity for little gain. They also add the risk of bad performance at runtime if you don't control the server and the configuration the application will be deployed to.

So what would we need? A way to define the DataSource from the application, with the pooling library the application selected. That would make it possible to:

ensure the performance is good enough with the selected stack
ensure the configuration is consistent with the application (same "source" of configuration and patterns for entries)
ensure the configuration matches the application needs (can be dynamic, environment aware, updated at runtime, etc.)

If you check the PersistenceUnitInfo API, you will see that the DataSource itself is provided: the instance, not a name or properties. This means that, with the setup proposed above, you just need to pass a DataSource instance to your JPA provider.

Where to take this instance from? CDI of course. CDI is a bean registry. If you know how to look up an instance then you are done! Let's take the simplest case: a DataSource instance. It can be as simple as these lines:

Set<Bean<?>> beans = bm.getBeans(DataSource.class);
Bean<?> bean = bm.resolve(beans);
if (bean == null || !bm.isNormalScope(bean.getScope())) {
  // fail: no DataSource found or the scope is not compliant with JPA
}
DataSource ds = DataSource.class.cast(
  bm.getReference(
    bean, DataSource.class,
    bm.createCreationalContext(null)));

This code just does a standard lookup of a DataSource instance from the CDI BeanManager (bm). The only trick is to ensure you get a normal-scoped instance. Strictly speaking this is not mandatory and you could use a pseudo-scope or even a @Dependent instance, but if you use the DataSource directly elsewhere, or want to rely on CDI features like destroying and recreating an instance at runtime, you could run into trouble.

Side note: if you choose to go with a @Dependent instance, you will also need to release the CreationalContext when the application shuts down, to avoid leaks and to execute the "destroy" methods (@PreDestroy, @Disposes, ...).

Now that we have a DataSource, we just return it from our PersistenceUnitInfo and JPA will use the CDI instance.

Tip: this logic can also apply to the ValidatorFactory ;).

Putting it all together

What does it look like? It looks super complicated to use, no? Actually, the technical stack may not be obvious or intuitive, but the user code is:

@ApplicationScoped
public class JpaConfig {
    @Produces
    @ApplicationScoped
    public DataSource dataSource() {
        BasicDataSource source = new BasicDataSource();
        source.setDriver(new Driver());
        source.setUrl("jdbc:h2:mem:jpaextensiontest");
        // ...
        return source;
    }
}

We just need a producer for the datasource (it can get its config from injections like @ConfigProperty of DeltaSpike).
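To make the "environment aware" part concrete without pulling in DeltaSpike, here is a JDK-only sketch of how the producer could resolve its settings. Everything in it is an assumption made for illustration: the `DataSourceSettings` name, the `demo.datasource.*` keys, and the default user and password (the usual H2 defaults). Only the default URL comes from the producer above.

```java
import java.util.function.Function;

// Illustrative settings holder (hypothetical name and keys).
final class DataSourceSettings {
    static final String URL_KEY = "demo.datasource.url";
    static final String USER_KEY = "demo.datasource.user";
    static final String PASSWORD_KEY = "demo.datasource.password";

    final String url;
    final String user;
    final String password;

    private DataSourceSettings(final String url, final String user, final String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // The lookup abstracts the configuration source (system properties, a file, ...).
    static DataSourceSettings resolve(final Function<String, String> lookup) {
        return new DataSourceSettings(
            valueOrDefault(lookup.apply(URL_KEY), "jdbc:h2:mem:jpaextensiontest"),
            valueOrDefault(lookup.apply(USER_KEY), "sa"),
            valueOrDefault(lookup.apply(PASSWORD_KEY), ""));
    }

    // Convenience entry point: -Ddemo.datasource.url=... on the command line.
    static DataSourceSettings fromSystemProperties() {
        return resolve(System::getProperty);
    }

    private static String valueOrDefault(final String value, final String fallback) {
        return value == null || value.trim().isEmpty() ? fallback : value;
    }
}
```

The producer would then read the url, user and password fields when configuring the pool, so the application and its operation teams share a single source of configuration.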

Then we need a small extension with these responsibilities:

capture units to create (which will make them injectable)
add the transactional interceptor for RESOURCE_LOCAL units on beans getting an EntityManager injection, to ensure they can use it; the same interceptor would support activating/deactivating the transaction for performance reasons
(optional) if handling multiple units, retrieve the PersistenceUnitInfo instances to build from the CDI beans

In terms of end user code, it can look like:

@ApplicationScoped
public class UserService {
    @Inject
    @Unit(name = "test") // the extension adds units with this qualifier
    private EntityManager em;

    // tx by default
    public User save(final User user) {
        em.persist(user);
        return user;
    }

    @Jpa(transactional = false) // no tx
    public User find(final long id) {
        return em.find(User.class, id);
    }
}
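How could the interceptor decide whether to open a transaction? The sketch below shows one possible rule: transactional by default, unless @Jpa(transactional = false) is found on the method or, failing that, on its class. The annotation definition is an assumption (the post only shows its usage), and `TransactionDecision` is a hypothetical name. A real CDI interceptor would presumably get the Method from its InvocationContext.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Assumed shape of the @Jpa annotation used in the service above.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
@interface Jpa {
    boolean transactional() default true;
}

// Illustrative helper (hypothetical name) an interceptor could call per invocation.
final class TransactionDecision {
    private TransactionDecision() {
        // static helper, no instance needed
    }

    static boolean isTransactional(final Method method) {
        // the most specific configuration wins: method first
        final Jpa onMethod = method.getAnnotation(Jpa.class);
        if (onMethod != null) {
            return onMethod.transactional();
        }
        // then the declaring class, and finally "tx by default"
        final Jpa onType = method.getDeclaringClass().getAnnotation(Jpa.class);
        return onType == null || onType.transactional();
    }
}
```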

Conclusion

All right, ready to start? Almost: this post doesn't detail how to implement the extension, but if you are interested there is an experimental module in microwave (a microservice server powered by the OpenWebBeans community) implementing this idea. Check out the source code.

In any case, JPA could leverage CDI far more to become smoother and more modern, so don't hesitate to experiment and go further than the specifications do, to make your life and your operation teams' lives easier!
