04.19.08
Abuses of DTO pattern in Java world
Basics
I have to go on a detour for a bit (from the elegant visitor to the strategy pattern to DI …) to something most Java people are familiar with: the Data Transfer Object, most commonly known as a DTO. The reasons for this detour are related to something I have been playing with for a while (I’ll elaborate in later posts). First of all, keeping with my tradition of referring to definitions, here’s what Wikipedia has to say about a DTO:
Data Transfer Objects (DTO), formerly known as Value Objects or VO, are a design pattern used to transfer data between software application subsystems. DTOs are often used in conjunction with Data Access Objects to retrieve data from a database.
The difference between Data Transfer Objects and Business Objects or Data Access Objects is that DTOs do not have any behavior except for storage and retrieval of its own data (accessors and mutators).
Not much insight, but it explains the idea in a nutshell. The data transfer object was born to reduce the number of calls over the wire in a typical multi-tiered architecture. Usually, at the server end there would be an assembler that takes care of populating the data. If any of you worked in the pre-EJB 3.0 days, you know why DTOs were used. Basically, the idea was that an entity bean would expose a method that copies its entire state into an object that can be serialized. This object could then be used by any client in whatever manner needed. Typically the code would be something like:
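    public abstract class CustomerEntityBean implements javax.ejb.EntityBean {

        // Accessor for the first name (assumed abstract here, as in EJB 2.x CMP beans).
        public abstract String getFirstName();

        // Copies the bean's state into a serializable DTO for the client.
        public CustomerDTO getCustomerDTO() {
            // Prepare a customer DTO with the first name.
            return new CustomerDTO(this.getFirstName());
        }
    }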
Nothing special is going on there: the CustomerDTO constructor is called with the value of the bean's getFirstName( ) method to populate the DTO. In EJB 3 you are saved from doing all this (or so they say) because your entities are detachable (in the sense that they are not ALWAYS managed by the container), so essentially they behave like DTOs when in the “detached” state.
There are DTOs and then there are Value Objects. The two terms are often used interchangeably, but they are not the same thing. Two DTOs are equal only if they are the same object (object identity), whereas two value objects are equal if their state is equal. For example, two customer value objects are equal if their customer id and/or customer name are equal. The equality of value objects can be based on all properties, or on the subset of properties that uniquely identifies each of them (by overriding equals).
A good example of a value object is a date: something like 11/11/2011 is a value object because, as you can see, it is:
- Immutable
- Equal to every copy of itself: if I have multiple copies of this date, all the copies are equal
A value object should always override the equals method (and, with it, hashCode); refer to Value Object for more details. A DTO, on the other hand, is just used as a mutable object, and its equality is based on object identity rather than state, as stated earlier (compare CustomerVO with CustomerDTO, where VO = Value Object). This example is admittedly lousy, because ideally I would want to keep my value objects very simple, like a date (as opposed to a Customer).
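To make the distinction concrete, here is a minimal sketch of the two kinds of equality. The class names CustomerVO and CustomerDTO come from the text above; the fields, constructors and the demo class are assumptions made purely for illustration.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical demo class, only here to exercise the two types below.
class EqualityDemo {
    public static void main(String[] args) {
        CustomerVO a = new CustomerVO(42L, "Ada");
        CustomerVO b = new CustomerVO(42L, "Ada");
        System.out.println(a.equals(b)); // true: same state

        CustomerDTO x = new CustomerDTO("Ada");
        CustomerDTO y = new CustomerDTO("Ada");
        System.out.println(x.equals(y)); // false: different objects
    }
}

// Value object: immutable, equality defined by state.
final class CustomerVO {
    private final long customerId;
    private final String name;

    CustomerVO(long customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }

    long getCustomerId() { return customerId; }
    String getName() { return name; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CustomerVO)) return false;
        CustomerVO that = (CustomerVO) other;
        return customerId == that.customerId && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(customerId, name);
    }
}

// DTO: mutable, serializable carrier that keeps the identity-based
// equals inherited from Object.
class CustomerDTO implements Serializable {
    private static final long serialVersionUID = 1L;
    private String firstName;

    CustomerDTO(String firstName) { this.firstName = firstName; }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
}
```

Whether a DTO should also override equals is a design choice; the sketch simply follows the identity-based view described in this post.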
Apart from all of this, to keep things even clearer: throughout this blog I will use the term Model for the representation of the “actual” data to be retrieved or persisted. In Hibernate terms, the Model is a “Model” of what lies in the database or what will lie in the database (talk about puns).
Abuses
Throughout its long history the DTO has been used for all kinds of purposes. In the EJB days the mantra was to use DTOs to make expensive three-tier calls easier, but times have changed since the prehistoric EJB 1.x days. While it makes a lot of sense why DTOs were needed originally, today people mostly abuse the term and the pattern in their projects. (I am speaking from experience, of course; I don’t claim that everyone is doing this, before someone jumps the gun on me.) Some of the common blunders / mistakes / abuses are:
- Using DTOs to pass the data within the tier
- Using DTOs interchangeably with Value Objects
- Using DTOs to compose other DTOs and finally using one huge DTO for display (it sounds horrifying and unreal, but I have seen this happen)
Let’s tackle these abuses one by one. First, DTOs are quite often passed around simply because a method call has too many parameters. Grouping the parameters may be the more object-oriented approach, but it doesn’t justify using a DTO. If there are too many independent parameters, they probably belong together in one Model class, which can then be passed around. Only when the method being called lives in a separate tier (let’s assume for now that the tiers are not even in different VMs) should a DTO be considered.
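As a rough illustration of the first point, the sketch below groups three loose parameters into a plain Model class that stays within one tier. Address, AddressFormatter and their fields are hypothetical names chosen for the example, not something taken from a real project.

```java
// Hypothetical demo class showing the call site.
class ParameterModelDemo {
    public static void main(String[] args) {
        Address home = new Address("1 Main St", "Springfield", "12345");
        System.out.println(AddressFormatter.format(home));
    }
}

// Plain Model class: groups parameters that always travel together.
// It is not a DTO, because it never has to cross a tier boundary.
class Address {
    private final String street;
    private final String city;
    private final String postalCode;

    Address(String street, String city, String postalCode) {
        this.street = street;
        this.city = city;
        this.postalCode = postalCode;
    }

    String getStreet() { return street; }
    String getCity() { return city; }
    String getPostalCode() { return postalCode; }
}

class AddressFormatter {
    // Before: format(String street, String city, String postalCode)
    // After: one Model parameter, still a call within the same tier.
    static String format(Address address) {
        return address.getStreet() + ", " + address.getCity() + " " + address.getPostalCode();
    }
}
```

Nothing here needs to be Serializable, which is a hint that no transfer is going on.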
The second one is pretty self-explanatory: Value Objects, although immutable, are often used interchangeably with DTOs. This is not a pain most of the time, but as a purist you’d want to see Value Objects used for what they are.
Finally, in a couple of cases I have seen people compose DTOs within a DTO to transfer and retrieve some data as a matter of convenience. Although it is not harmful as such, one should refrain from composing DTOs within a DTO (as opposed to a Model within a DTO). A DTO can hold any number of models and any number of lists, but adding DTOs within a DTO tells me that the outer DTO is being used as a shuttle to transfer the inner DTO because of some stringent need. I don’t think that plays well with good design.
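A minimal sketch of what “a Model within a DTO” could look like follows. OrderDTO, Customer and OrderLine are hypothetical names used only for illustration; the point is that the DTO carries models and a list of models directly, with no inner DTO.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class that assembles the DTO on the "server side".
class OrderDtoDemo {
    public static void main(String[] args) {
        OrderDTO dto = new OrderDTO(new Customer("Ada"));
        dto.addLine(new OrderLine("SKU-1", 2));
        dto.addLine(new OrderLine("SKU-2", 1));
        System.out.println(dto.getCustomer().getName() + " ordered "
                + dto.getLines().size() + " lines");
    }
}

// Models: representations of the "actual" data.
class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String name;
    Customer(String name) { this.name = name; }
    String getName() { return name; }
}

class OrderLine implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String sku;
    private final int quantity;
    OrderLine(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
    String getSku() { return sku; }
    int getQuantity() { return quantity; }
}

// DTO: carries models and a list of models across tiers, no nested DTOs.
class OrderDTO implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Customer customer;
    private final ArrayList<OrderLine> lines = new ArrayList<>();

    OrderDTO(Customer customer) { this.customer = customer; }

    Customer getCustomer() { return customer; }
    void addLine(OrderLine line) { lines.add(line); }
    List<OrderLine> getLines() { return Collections.unmodifiableList(lines); }
}
```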
It is sometimes hard to point out why the DTO can be such an anti-pattern (I know a lot of folks believe that it is; I am not one of them). I have been working on a project that uses a very interesting commons framework, one that takes the DTO pattern back to its simplest, most elegant and purest roots (which is why I am proud to be part of this project). I’ll elaborate on this later. How we write our code is a matter of personal taste; however, thinking about the intended use of what you are doing can go a long way toward simplifying things.
Regards
Vyas, Anirudh
R. Goodwin said,
August 21, 2008 at 7:06 pm
Hi Anirudh,
An interesting article.
What do you think of a DTO situation like this:
1. A web tier (on one server) constructs a Search object containing a number of search parameters, some of which may or may not be set, depending on search complexity.
2. It calls a data service tier session (on another server) passing the Search object.
3. Data service performs search, adds search results and total results (for paging options), then returns this Search object back to the web tier.
I would say the DTO pattern makes sense here because all the related details are grouped, and the session interface signature is nice and simple as there’s no need to overload with all the various possibilities for the different search parameters.
Administrator said,
August 21, 2008 at 8:01 pm
You are right. In fact, I hope to do a write-up on this soon; we are using the same concept in our current project, where a DTO essentially holds the search criteria. Users set the criteria in the UI and the DTO is passed through the tiers. The search criteria could be a string or a model (a POJO, a Java bean, or anything that is Serializable), and the search results could be a collection of models. This definitely makes sense.
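A rough sketch of such a search DTO might look like the following. SearchDTO, FakeSearchService and the sample data are all hypothetical, and the in-memory service merely stands in for a remote data service tier.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class playing the role of the web tier.
class SearchDemo {
    public static void main(String[] args) {
        SearchDTO<String, String> search = new SearchDTO<>("vyas");
        search.setPageSize(2);
        SearchDTO<String, String> answer = FakeSearchService.execute(search);
        System.out.println(answer.getResults() + " of " + answer.getTotalResults());
    }
}

// DTO carrying the criteria one way, and results plus total count back.
class SearchDTO<C extends Serializable, R extends Serializable> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final C criteria;
    private int pageSize = 10;
    private int totalResults;
    private final ArrayList<R> results = new ArrayList<>();

    SearchDTO(C criteria) { this.criteria = criteria; }

    C getCriteria() { return criteria; }
    int getPageSize() { return pageSize; }
    void setPageSize(int pageSize) { this.pageSize = pageSize; }
    int getTotalResults() { return totalResults; }
    void setTotalResults(int totalResults) { this.totalResults = totalResults; }
    List<R> getResults() { return results; }
    void setResults(List<R> page) { results.clear(); results.addAll(page); }
}

// Stand-in for the data service tier; a real one would sit behind a remote call,
// where the DTO would come back as a serialized copy rather than the same instance.
class FakeSearchService {
    private static final List<String> NAMES =
            Arrays.asList("goodwin", "vyas anirudh", "vyas kumar", "vyas r");

    static SearchDTO<String, String> execute(SearchDTO<String, String> search) {
        List<String> matches = new ArrayList<>();
        for (String name : NAMES) {
            if (name.contains(search.getCriteria())) {
                matches.add(name);
            }
        }
        // Total count for paging, plus only the first page of results.
        search.setTotalResults(matches.size());
        int end = Math.min(search.getPageSize(), matches.size());
        search.setResults(matches.subList(0, end));
        return search;
    }
}
```

The session interface stays at a single method no matter how many optional criteria are added to the DTO later.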
Regards
Vyas, Anirudh
Mark said,
August 23, 2008 at 2:37 am
Then it is a Search object or a Search results object, not a DTO. For Search Results objects, they can be Projects or Report objects. They might seem the same, but they are not.
Administrator said,
August 23, 2008 at 2:47 am
A DTO, taken literally, is a Data Transferring Object. Here the search criteria are passed through the tiers, some business processing is done in the business tier, and finally the data that was fetched or persisted based on the search / operation criteria is set in the DTO to be “transferred” back across the tiers to the UI tier that initiated the request in the first place.
Mark, could you elaborate on your definition of a DTO and delve a little deeper into DTO vs. Search object?
Regards
Vyas, Anirudh
R. Goodwin said,
September 2, 2008 at 8:07 pm
Hi Anirudh,
You’re not by chance using Lucene for the search and Wicket for the presentation?
I’ve been trying (rather unsuccessfully) to integrate Wicket and a search data tier via IDataProvider, but have found that they are not compatible.
The current Wicket paging mechanism expects queries for data and total results (count query) to be executed separately.
Fine if using a DAO and a database.
But of course a Lucene search is executed once to locate both results and totals for the paging.
If you’re interested, have a look at the Apache Jira ticket I created:
https://issues.apache.org/jira/browse/WICKET-1784
I don’t think Igor (lead Wicket developer) is convinced that using a DTO object to pass search and paging parameters, and to return results, between Wicket and the data service tier is a widely used approach.
Was wondering if you have a view on this?
Many thanks.
Anirudh Vyas said,
September 14, 2008 at 7:12 pm
Someone asked me to cite sources that treat DTOs and Value Objects as different. For details, please read the P of EAA book by Martin Fowler and refer to page 487.
The idea that DTOs and Value Objects are different is fairly common in the J2EE community, as per my understanding.
Anirudh Vyas said,
September 14, 2008 at 7:18 pm
Yes; one of our colleagues discussed this at length with Igor (the query for data and the count query being separate). Their opinion seems to be that, for a framework, that is what makes sense; I don’t completely agree.
In a common J2EE environment a DTO is a POJO or a Java bean; the difference is only a matter of purist view, in my opinion. When I use a DTO to represent, say, an Order, call it OrderDTO, and then use the same object as the persistence entity, the TO’s purpose becomes polluted. A TO should do only what it is supposed to do; beyond transfer, the tier has to take over. In my opinion, actual DB entities should be represented as Models in a Java / J2EE environment and then passed through the tiers using a DTO (which would mimic the data or state of the Model, or could in fact contain the Model itself).
This in my opinion leads to better Separation of concerns.
Regards
Vyas, Anirudh