Implementing a bi-directional association in Java

2011-12-18

It occurred to me that many Java developers don't really understand the semantics, consequences, and implementation of a bi-directional association between objects. In this article I would like to discuss a proper way of implementing such a relationship, using a classic example: the relationship between a Department and an Employee. This "works-at" relationship is depicted in the UML diagram below.

UML association

A UML association is implemented in Java using fields. That's what an association means: the relation becomes part of the state of the objects. This is in contrast with a dependency, where one object uses the other but does not keep it as part of its state. The object that is used is, for example, passed into a method as an argument, or is the return value of a method that is called.
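To make the contrast concrete, here is a minimal sketch. Department and Employee are cut down to what the contrast needs, and BadgePrinter is a hypothetical class invented purely for this illustration.

```java
// Minimal stand-in for the article's Department class.
class Department {
  private final String name;

  Department(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}

class Employee {
  private final String name;
  // Association: the Department is stored in a field, so it is part of
  // the Employee's state.
  private final Department department;

  Employee(String name, Department department) {
    this.name = name;
    this.department = department;
  }

  String getName() {
    return name;
  }

  Department getDepartment() {
    return department;
  }
}

// Hypothetical class: it depends on Employee but keeps no reference to it.
class BadgePrinter {
  // Dependency: the Employee arrives as an argument and is only used for
  // the duration of this call.
  String print(Employee employee) {
    return employee.getName() + " (" + employee.getDepartment().getName() + ")";
  }
}
```

Employee has an association with Department (a field), whereas BadgePrinter merely has a dependency on Employee (a method argument).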

In this case the Department needs a Collection field holding the employees. We name this field using the role name of the other end. The Employee, in turn, gets a field referencing the Department.

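import java.util.HashSet;
import java.util.Set;

public class Department {
  // Named after the role of the other end of the association
  private Set<Employee> staff = new HashSet<>();
  ...
}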

public class Employee {
  private Department department;
  ...
}

If this were a uni-directional association, one of the two sides would not have such a field.

Why is there a relationship?

This is perhaps also a good time to back-pedal a little and ask ourselves why we have this association at all. Why do we implement the association "Employee works at Department"? You might reply along the lines of: "because that is reality, that's the way it is: an employee works at a department". But this is not enough in the case of object-oriented programming.

Terms, facts and business rules

The answer above is correct within a business model, where you define business semantics (terms and their relationships) and describe the business rules. In the world of business rules, there is a mantra that often boils down to: "rules build on facts, facts build on terms".

To give you an example: Perhaps there is a rule that states: "A passenger must be 18 years old to book a flight". This is a rule based on the fact "passenger books flight", which is made up of the business concepts (in the form of terms), named "passenger" and "flight".

In a business model for our employees we would therefore have the fact: Employee works at Department (for rules such as "An Employee works at exactly one Department").

So why did this fact make it into our IT model? To answer that, ask why objects need to know each other at all. The answer is that an object needs the behaviour (or data) of the other object. In other words, it invokes methods on it.

The key aspect of an object is that it has behaviour (which creates the need for state, which is changed by this behaviour). For an object to exercise its behaviour it might need to collaborate with other objects. This is where a relationship is born (an association, an aggregation, or simply a dependency).

So somewhere there is a use case with a sequence diagram in which an employee instance invokes a method on a department object. Moreover, in the case of a bi-directional association, there is also one in which a department object invokes a method on an employee object. If no such method invocation occurs, there would be a uni-directional association, or no association at all (even though the fact exists in your business model, there was simply no reason for it to make it into the IT domain model).

What's the problem?

So let's go back to the subject of this article. What is the problem with a bi-directional association? Well, unlike a relation in a relational database, a bi-directional association is stored on two ends. To illustrate the problem:

// Let's assume "Jennifer" currently works at the "IT" department
Department sales = repository.getDepartment(...);
Department it = repository.getDepartment(...);
Employee jennifer = repository.getEmployee(...);
jennifer.getDepartment(); // returns IT
jennifer.setDepartment(sales);

This code on its own does not work: only the employee's end of the association has changed. You must also add Jennifer to the sales department:

// Never write set methods like these (they are here just to illustrate the problem :)
sales.getStaff().add(jennifer);

That's really bad: you placed a "button" on employee you may press (the setDepartment method), but above the button you have to place a warning sign saying: "if you press this button, you must also press this and that button". That's like putting a vending machine in a building for cold drinks. When you want your favourite drink, you have to press one button and then you must press another button, otherwise the machine explodes! That does not sound very safe - however that's how many people implement software.

Anyway, the code above still does not do the trick: you would also have to remove the employee from her previous department, IT:

// We need to press a third button
it.getStaff().remove(jennifer);
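The inconsistency is easy to reproduce. The sketch below uses deliberately naive versions of both classes and replaces the repository with plain in-memory objects. The name field and the InconsistencyDemo class are assumptions made for this illustration only.

```java
import java.util.HashSet;
import java.util.Set;

// Deliberately naive: plain accessors, nothing keeps the two ends in sync.
class Department {
  private final String name;
  private final Set<Employee> staff = new HashSet<>();

  Department(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }

  Set<Employee> getStaff() {
    return staff;
  }
}

class Employee {
  private Department department;

  Department getDepartment() {
    return department;
  }

  void setDepartment(Department department) {
    this.department = department;
  }
}

class InconsistencyDemo {
  public static void main(String[] args) {
    Department it = new Department("IT");
    Department sales = new Department("Sales");
    Employee jennifer = new Employee();

    // Consistent starting point: both ends agree that Jennifer works at IT.
    jennifer.setDepartment(it);
    it.getStaff().add(jennifer);

    // Press only the first "button".
    jennifer.setDepartment(sales);

    System.out.println(jennifer.getDepartment().getName());  // prints "Sales"
    System.out.println(sales.getStaff().contains(jennifer)); // prints "false"
    System.out.println(it.getStaff().contains(jennifer));    // prints "true"
  }
}
```

After the single setter call, the employee claims to work at Sales, while the two departments still say she works at IT.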

A solution (sort of)

As you can see, it is very easy to blow up your system. What we need to do is think in a more object-oriented way: in terms of behaviour and responsibilities.

First we need to ask ourselves: "who are we going to make responsible for managing this relation" and "what behaviour makes the connection between these objects"?

Let's say that in this example employees are "hired" into and "fired" from departments, which is also how they move from one department to another. We can now decide to make these methods part of the Department class (thereby making the department responsible for managing the relation). Let's start with the Department class:

public void hire(Employee employee) throws HumanResourceException {
  staff.add(employee);
  // still missing something
}

public void fire(Employee employee) throws HumanResourceException {
  if (!staff.contains(employee)) throw new HumanResourceException("Employee does not work at this department");
  staff.remove(employee);
  // still missing something
}

This looks very straightforward. But what about the state of the employee? The state of the employee instance must be updated as well. How is the department going to change the state of the employee?

This is where it gets tricky (and ugly in Java). We need to add methods to the Employee class to get/set the department value, but(!) these methods should not be called by anything other than the department.

How can we accomplish this in Java? How can we limit the scope of a method to one other class? Well, you can't! (Well, you could analyse the call stack programmatically.) We need to limit the scope as much as possible and somehow make it clear that these buttons should only be pressed by department objects.
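For the curious, the call-stack analysis mentioned in passing could look roughly like the sketch below. It is not what this article recommends. The helper requireCallerIsDepartment is a hypothetical name, and the check assumes the stack frames appear in the usual order.

```java
class Employee {
  private Department department;

  Department getDepartment() {
    return department;
  }

  void _changeDepartment(Department department) {
    requireCallerIsDepartment();
    this.department = department;
  }

  // Hypothetical guard: rejects any caller other than the Department class.
  private static void requireCallerIsDepartment() {
    // Frame 0 is this helper, frame 1 is the guarded method,
    // frame 2 is whoever called the guarded method.
    StackTraceElement[] stack = new Throwable().getStackTrace();
    String caller = stack.length > 2 ? stack[2].getClassName() : "";
    if (!caller.equals(Department.class.getName())) {
      throw new IllegalStateException(
          "Only Department may call this method, not " + caller);
    }
  }
}

class Department {
  void hire(Employee employee) {
    employee._changeDepartment(this);
  }
}
```

This works, but it is fragile and pays the cost of capturing a stack trace on every call, so it is rarely worth it.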

Well, let's first make the scope "package wide" and prefix the methods with an underscore. There seems to be an unwritten law which states that "methods with underscores are used for internal purposes".

void _changeDepartment(Department department) {
  assert this.department == null;
  this.department = department;
}

void _removeFromItsDepartment() {
  assert this.department != null;
  this.department = null;
}

The first method, "_changeDepartment", reeks of bad code. True, but there is no other way. In its defence, it is only used internally by the department and is not a button that should be pressed by other objects (although objects inside the same package still can; we have no control over that).
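Putting the pieces together, the two "still missing something" comments in hire and fire could be filled in roughly as follows. This is a sketch, not necessarily what the downloadable demo code does. The definition of HumanResourceException, the guard against hiring someone who already has a department, and the read-only getStaff are all assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Assumed to be a checked exception; the article does not show its definition.
class HumanResourceException extends Exception {
  HumanResourceException(String message) {
    super(message);
  }
}

class Department {
  private final Set<Employee> staff = new HashSet<>();

  public void hire(Employee employee) throws HumanResourceException {
    // Assumption: an employee must be fired from the old department first.
    if (employee.getDepartment() != null) {
      throw new HumanResourceException("Employee already works at a department");
    }
    staff.add(employee);
    employee._changeDepartment(this); // keep the other end in sync
  }

  public void fire(Employee employee) throws HumanResourceException {
    if (!staff.contains(employee)) {
      throw new HumanResourceException("Employee does not work at this department");
    }
    staff.remove(employee);
    employee._removeFromItsDepartment(); // keep the other end in sync
  }

  // Read-only view, so callers cannot bypass hire and fire.
  public Set<Employee> getStaff() {
    return Collections.unmodifiableSet(staff);
  }
}

class Employee {
  private Department department;

  public Department getDepartment() {
    return department;
  }

  // Package-private: meant to be called by Department only.
  void _changeDepartment(Department department) {
    assert this.department == null;
    this.department = department;
  }

  // Package-private: meant to be called by Department only.
  void _removeFromItsDepartment() {
    assert this.department != null;
    this.department = null;
  }
}
```

Moving Jennifer from IT to sales then becomes it.fire(jennifer) followed by sales.hire(jennifer). Both ends of the association are updated inside those two calls, so there is no third button to forget.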

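Notice the use of assert. I am a big fan of assert, for many reasons. For one, it helps the users of your classes (your co-developers and future developers of the system) to understand the constraints and intended behaviour. Keep in mind that the JVM only evaluates assert statements when assertions are enabled, for example with the -ea flag.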

Below is a UML diagram depicting the final design.

You can download the demo code for this, together with a unit test (a Specs2 specification), here (you can run the test with Maven).

This article does not necessarily reflect the technical opinion of EDC4IT, but purely that of the writer. If you want to discuss this content, please send us an email at support@edc4it.com.