Saturday, February 5, 2011

Hibernate hurdles

I've been using Hibernate for the better part of 10 years now (since Hibernate 2), and while I really like it, it still surprises me from time to time how complex things can get. This post highlights a few of these hurdles. I'm pretty sure other ORMs have the same or similar issues, but I'll use Hibernate here because that's what I know best. To demonstrate things I will use the classic Order/LineItem example:
@Entity
@Table(name = "T_ORDER")
public class Order {

 @Id
 @GeneratedValue
 private Long id;

 @ElementCollection
 @CollectionTable(name = "T_LINE_ITEM")
 private Set<LineItem> lineItems = new HashSet<LineItem>();

 public Long getId() {
  return id;
 }

 public Set<LineItem> getLineItems() {
  return lineItems;
 }
 
 public int getTotalPrice() {
  int total = 0;
  for (LineItem lineItem : lineItems) {
   total += lineItem.getPrice();
  }
  return total;
 }
}

@Embeddable
public class LineItem {

 @ManyToOne(fetch = FetchType.LAZY)
 private Product product;
 
 private int price;
 
 @SuppressWarnings("unused")
 private LineItem() {
 }

 public LineItem(Product product, int price) {
  this.product = product;
  this.price = price;
 }
 
 public Product getProduct() {
  return product;
 }

 public int getPrice() {
  return price;
 }
}

@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@Table(name = "T_PRODUCT")
public abstract class Product {

 @Id
 @GeneratedValue
 private Long id;
 
 private String name;
 
 protected Product() {
 }
 
 protected Product(String name) {
  this.name = name;
 }
 
 public Long getId() {
  return id;
 }

 public String getName() {
  return name;
 }
}

@Entity
public class PcProduct extends Product {
 
 @SuppressWarnings("unused")
 private PcProduct() {
 }

 public PcProduct(String name) {
  super(name);
 }
}

@Entity
public class MacProduct extends Product {
 
 @SuppressWarnings("unused")
 private MacProduct() {
 }

 public MacProduct(String name) {
  super(name);
 }
}
That's pretty straightforward stuff. Assume we have an order stored in the database:
Product pc = new PcProduct("Dell XPS M1330");
session.save(pc);
Product mac = new MacProduct("Apple MacBook Pro");
session.save(mac);

Order order = new Order();
order.getLineItems().add(new LineItem(pc, 1000));
order.getLineItems().add(new LineItem(mac, 2000));
session.save(order);
Now let's look at some surprising pieces of code.

Beware of fetch joins

Suppose you have a business transaction processing orders involving Dell XPS M1330s:
List<Order> results = session.createQuery(
  "from Order as o join fetch o.lineItems as li where li.product.name = 'Dell XPS M1330'").list();
for (Order result : results) {
 System.out.println(result.getTotalPrice());
}
What do you think this prints out for our sample order containing both a Dell XPS M1330 and an Apple MacBook Pro? If you guessed 1000 you would be right... oopsie! The fetch in the query, combined with the criteria on the joined relation, causes the lineItems collection of the resulting Order entities to be incomplete! Unfortunately, there is no way to tell that those Order objects are crippled. Worse still, they will end up in the Hibernate session, so you might inadvertently end up using them later on in the same transaction, unaware that they are damaged goods. This can lead to very hard-to-diagnose bugs, because the cause of the problem (the join fetch) and the effects (incorrect results) might be far apart in the code. It's interesting to note that normal joins (so not using fetch) do not have this adverse side effect. So my advice is:
Be very wary of fetch joins. Only use them if you are sure the resulting objects will not be reused later on in the same session for different purposes.
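For comparison, here's roughly what the same query looks like with a plain join instead of a fetch join (a quick sketch against the mapping above, not something I've profiled). The join is now only used for filtering, so the lineItems collection of each resulting Order is loaded in full when it is first accessed:
List<Order> results = session.createQuery(
  "select distinct o from Order as o join o.lineItems as li where li.product.name = 'Dell XPS M1330'").list();
for (Order result : results) {
 // the collection is initialized completely here, so this prints 3000 for the sample order
 System.out.println(result.getTotalPrice());
}
Of course this trades the single fetch query for an extra query per order to initialize lineItems, which may or may not matter for your use case.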

Beware of inheritance mapping

Let's look at another order processing transaction:
long orderId = ...;
Order order = (Order) session.get(Order.class, orderId);
Product product = order.getLineItems().iterator().next().getProduct();

PcProduct pcProduct = (PcProduct) session.get(PcProduct.class, product.getId());

System.out.println(pcProduct.getId().equals(product.getId()));
System.out.println(pcProduct == product);
If you run this piece of code, Hibernate will spit out a warning:
1463 [main] WARN org.hibernate.engine.StatefulPersistenceContext.ProxyWarnLog
 Narrowing proxy to class com.ervacon.order.PcProduct - this operation breaks ==
And indeed, the output is true / false, confirming that the session can no longer guarantee that primary key equality is equivalent to reference equality (==). This is a dangerous situation that could again lead to hard-to-diagnose bugs, especially if some parts of the code (e.g. the lineItem.getProduct() call) are only executed in certain situations. There are other problems with mapping inheritance hierarchies (e.g. the fact that instanceof is unreliable), so my advice here is:
When using Hibernate, avoid mapping inheritance hierarchies. There are often better ways to factor your code that do not require mapping an inheritance hierarchy to the database, for instance by introducing a common interface to facilitate polymorphism.
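To sketch that last suggestion (the ProductKind enum and the kind column are made up for illustration, not necessarily how you'd model a real catalog), the hierarchy above could be flattened into a single mapped class, with the PC/Mac distinction kept as plain data and any type-specific behavior handled by unmapped code behind a common interface:
@Entity
@Table(name = "T_PRODUCT")
public class Product {

 @Id
 @GeneratedValue
 private Long id;

 private String name;

 // plain data instead of a mapped subclass per product type
 @Enumerated(EnumType.STRING)
 private ProductKind kind;

 protected Product() {
 }

 public Product(String name, ProductKind kind) {
  this.name = name;
  this.kind = kind;
 }

 public Long getId() {
  return id;
 }

 public String getName() {
  return name;
 }

 public ProductKind getKind() {
  return kind;
 }
}

public enum ProductKind {
 PC, MAC
}
With only one concrete Product class mapped, session.get() and the proxy behind lineItem.getProduct() always agree on the type, so the narrowing warning above simply cannot occur.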

There are other non-obvious head-scratchers you run into when using an object-relational mapper like Hibernate. If you know about an interesting one, I'd love to hear about it!

Thursday, November 11, 2010

To reuse or not to reuse, that's the question

It's clear that code duplication is a very bad code smell. As a result you use things like the extract method refactoring to keep your code DRY. This raises the question: can this be taken too far? Should all code duplication be avoided at all times?

Let me illustrate. Assume you have a bit of code like this:
public static void notNull(Object obj) throws IllegalArgumentException {
 if (obj == null) {
  throw new IllegalArgumentException("Argument cannot be null");
 }
}
This method is clearly designed for reuse. It's a small building block of code that can be used throughout a piece of software to avoid duplicating null-checking code in several places. In this case there are also no real downsides to consider.

Now consider a somewhat higher level piece of (admittedly extremely contrived) code:
public void doSomeProcessing(String str) {
 checkPreconditions(str);
 ...
}

private void checkPreconditions(String str) {
 Check.notNull(str);
 if (!"A".equals(str) && !"B".equals(str)) {
  throw new IllegalArgumentException("Argument can only be A or B");
 }
}
As it stands right now the private checkPreconditions() method is not designed for reuse. If another component in the system happens to require the exact same preconditions, should you try to reuse it? In some situations the answer to this question is clearly: yes! If the new component intrinsically requires the same preconditions, it's natural to reuse the method and highlight the fact that the two components have some functional relation. However, in other situations the fact that the two components do the same precondition checks might be purely accidental. In other words: there is no functional relation between the two components. In that case, refactoring the code so that both components can reuse the checkPreconditions() method comes with a few downsides you should consider:
  • By reusing the method, the refactored code now potentially communicates a relationship between the two components, where there really is none.
  • If the preconditions of the two components are in reality unrelated, it would not be uncommon for the preconditions of one of the components to change, while the other component's preconditions remain the same. When making this change in the refactored code, you need to realise you can't just change the common checkPreconditions() method. Instead, a more complex code change will be required (see the sketch after this list).
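To make that second point concrete, here is a sketch of the kind of change that tends to creep in once the components drift apart (the allowC flag is made up for illustration): the shared method either grows component-specific parameters like this, or has to be split up again.
private void checkPreconditions(String str, boolean allowC) {
 Check.notNull(str);
 // one component now also accepts "C", the other still only accepts "A" or "B"
 if (!"A".equals(str) && !"B".equals(str) && !(allowC && "C".equals(str))) {
  throw new IllegalArgumentException("Argument can only be A or B" + (allowC ? " or C" : ""));
 }
}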
In my experience DRY code is certainly something to strive for. However, avoiding code duplication at all costs also comes at a cost!

Saturday, October 30, 2010

Installing ING Home'Bank and Belgian eID on Ubuntu

I spent some time today upgrading my laptop from Ubuntu 9.04 (Jaunty Jackalope) to 10.04 LTS (Lucid Lynx), since 9.04 reached its end of life earlier this month. Overall this was a very smooth process, except for two specific apps that I require: ING Home'Bank and Belgian eID.

In an attempt to help people who also need to install one of these apps on a recent Ubuntu version, here is how I did it:

Home'Bank
ING is phasing out its Home'Bank security module, so the download is ages old and references a bunch of old libraries.

Start by downloading the HomeBank333.deb and simply install it as you would any other .deb. The next thing to do is to pin the Home'Bank version in Synaptic to avoid this bug: 380511. To do that, start Synaptic, find the homebank package and select Package > Lock Version.

If you try to run /opt/HomeBank/HBSecurity, it will complain about 3 missing libraries:
  1. libtiff.so.3 is missing -- simply sym-link it:
    cd /usr/lib
    sudo ln -s libtiff.so.4 libtiff.so.3
  2. libexpat.so.0 is missing -- install and sym-link it:
    sudo apt-get install libexpat1
    cd /usr/lib
    sudo ln -s /lib/libexpat.so.1 libexpat.so.0
  3. libstdc++libc6.2-2.so.3 is missing -- this is an old Dapper Drake package that you can still download and install
Once these steps have been completed you should be able to run /opt/HomeBank/HBSecurity and start using Home'Bank.

Belgian eID
This is actually straightforward. A .deb package is provided for recent Ubuntu versions that installs and runs without a hitch. The only problem on my machine was that my smart card reader was not being recognized (an Alcor Micro Corp. EMV Certified Smart Card Reader in my case -- run lsusb in a terminal to find out what type of smart card reader you have). To rectify that, I had to install the libccid package:
sudo apt-get install libccid
Once that is done you can simply follow the install instructions provided in a PDF to configure Firefox to use the new certificates, and you should be up and running!

Thursday, October 7, 2010

ThreadLocal naming conventions

Since Sonar is now part of our normal build platform on the project I'm currently doing, we're keeping a close eye on the violations it identifies. One rule we violate in a couple of places is the CheckStyle constant naming rule. This rule simply says that the names of all static final variables should be ALL_UPPERCASE_WITH_UNDERSCORES. This is of course a well-known Java coding convention. Still, it feels a little unnatural when applied to ThreadLocals, which are technically static final variables, but whose values are neither immutable nor global in scope the way typical constant values are. This makes code using ThreadLocals look a bit weird if you use normal Java constant naming, e.g. compare the following:
private static final ThreadLocal<Date> TIME_FRAME = new ThreadLocal<Date>();
...
TIME_FRAME.set(myDate);

private static final ThreadLocal<Date> timeFrame = new ThreadLocal<Date>();
...
timeFrame.set(myDate);
For me, using normal variable naming conventions for ThreadLocals seems to better communicate their role and intended usage in the code. Does anybody have an idea what the official Java naming conventions for ThreadLocals are?
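In the meantime, one way to sidestep the question entirely (just a sketch, not an official convention) is to hide the ThreadLocal behind a small holder class, so the awkwardly named field never shows up at the call sites:
import java.util.Date;

public final class TimeFrame {

 // the ThreadLocal is an implementation detail; callers only see TimeFrame.set()/get()
 private static final ThreadLocal<Date> HOLDER = new ThreadLocal<Date>();

 private TimeFrame() {
 }

 public static void set(Date date) {
  HOLDER.set(date);
 }

 public static Date get() {
  return HOLDER.get();
 }
}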

Friday, August 27, 2010

Killer tool: Sonar

Today a colleague at work showed Sonar to me, and I must say that I was really impressed! Sonar is an open source code quality analysis tool that uses a number of popular Java code analyzers like PMD, CheckStyle, FindBugs and Cobertura under the hood, and presents the metrics calculated by all of these in a consistent and integrated way. It also keeps track of all the metrics data in a database so you can see how your code quality evolves.

One part of Sonar is a Web front-end for the metrics database, and the other part is a Maven plugin that runs all code analyzers and pumps the collected data into that database. (As an aside: imagine how much harder it would be to develop a tool like this if there wasn't a dominant build solution in the Java space like Maven.)


As a package Sonar just ticks all the right boxes. It's really easy to get up and running (starting its DB/Web server and running mvn clean install sonar:sonar in your Maven project is all there is to it), feels absolutely solid, has a friendly UI and looks very polished overall.

Warmly recommended!

Wednesday, July 21, 2010

How much damage can a Java Error do?

In Java there is a clear distinction between recoverable Exceptions and fatal Errors. It's common for applications to have a catch-all block so that they can continue running even if unexpected problems occur:
while (true) {
 try {
  // execute application logic
 } catch (Throwable t) {
  // log exception and/or display to the user
 }
}
This will catch both Exceptions and Errors. Although it's fine to catch an Exception and have the application carry on, this is typically not a good idea when an Error occurs!

I recently saw a bug in an application that caused a StackOverflowError. However, because of a catch-all block the application tried to recover from the problem and carry on. As it turns out this caused all kinds of instability and seemingly unrelated problems: connection leaks, ThreadLocal corruption and other such joys.

If you think about it, this makes sense. A StackOverflowError indicates that the JVM ran out of stack space to properly invoke methods. Of course this has implications for things like finally blocks, and certainly for method invocations in those finally blocks! If you can no longer rely on the proper execution of your finally blocks, you end up in a world of pain with things like messed-up ThreadLocals. There are similar considerations for OutOfMemoryError, and certainly for the other VirtualMachineErrors.
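Here's a contrived sketch of the kind of thing that can go wrong (the names are made up, and whether it actually misbehaves depends on how much stack is left when the error hits). The finally block does run while the StackOverflowError unwinds, but the cleanup is itself a method invocation, so near the bottom of the stack it can blow up again and leave the ThreadLocal holding a stale value on a (typically pooled) thread:
public class CleanupUnderStackOverflow {

 private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<String>();

 public static void process(String user, int depth) {
  CURRENT_USER.set(user);
  try {
   process(user, depth + 1); // runaway recursion, eventually overflows the stack
  } finally {
   // runs during unwinding, but this call needs stack frames of its own and
   // may fail with another StackOverflowError, silently skipping the cleanup
   CURRENT_USER.remove();
  }
 }
}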

In summary: Errors can do a lot of damage! Don't keep the application running if you encounter an Error. Instead, try to log the problem and die. Only recover in case of Exceptions.
while (true) {
 try {
  // execute application logic
 } catch (Exception e) {
  // log error and/or display to the user
  // => recover
 } catch (Error e) {
  // log error and/or display to the user
  // => exit JVM
 }
}

Tuesday, June 1, 2010

Bitemporality in Python

A few years ago I published a Java framework to deal with bitemporal issues in rich domain models. The code is freely available under a BSD-style license.

As a first piece of non-trivial Python programming I coded up an equivalent framework in Python 2.5, available here. The framework allows you to set up a class with bitemporal properties that track the history and evolution of the value assigned to those properties. Here's a small example that should give you a feel for what you can do:
from bitemporal import *

class Person:

 def __init__(self, name):
  self.name = name
  self.address = BitemporalProperty()

p = Person("John Doe")
p.address.assign("Smallville", Interval(date(1975, 4, 3), date.max))
p.address.assign("Bigtown", Interval(date(1994, 8, 26), date.max))

print p.address.now() # Bigtown
print p.address.on(date(1980, 1, 1)) # Smallville

I'm looking for feedback on this code, which right now is only about 300 lines long. Is this code pythonic? Or is this Java-style Python code? If so, what areas need to be improved? Personally, I have doubts about at least a few things:
  • The type checks in the TimeFrame class (and a few other classes) feel iffy:
    @classmethod
    def set_reference(cls, value):
     if type(value) == date:
      cls.reference = datetime.combine(value, time())
     else:
      cls.reference = value
    
    But how else do I ensure that a timeframe always holds a datetime?
  • The BitemporalWrapper class feels Java-like. It decorates another object with bitemporal information and functionality. Is there a better way to do this in a dynamically typed language?
  • I considered using @property for the assign() method on the BitemporalProperty class, but the problem is that you often want to pass in a validity interval, which rules out use of "=".
Does anybody have any tips how this code can be improved?