Abuses of the DTO pattern in the Java world
Basics
I have to go on a detour for a bit (from the elegant visitor to the strategy pattern to DI …) to something that most Java people are familiar with: the Data Transfer Object, most commonly known as a DTO. The reasons for this detour are related to something I have been playing with for a while (I’ll elaborate in later posts). First of all, keeping with my tradition of referring to definitions, here’s what Wikipedia has to say about a DTO:
Data Transfer Objects (DTO), formerly known as Value Objects or VO, are a design pattern used to transfer data between software application subsystems. DTOs are often used in conjunction with Data Access Objects to retrieve data from a database.
The difference between Data Transfer Objects and Business Objects or Data Access Objects is that DTOs do not have any behavior except for storage and retrieval of its own data (accessors and mutators).
Not much insight, but still, it explains it in a nutshell. The data transfer object was born to reduce the number of calls over the wire in a typical multi-tiered architecture. Usually at the server end there would be an assembler that takes care of populating the data. If any of you worked with EJB before 3.0, you know why DTOs were used. Basically, the idea was that an entity bean would expose a method that copies its entire state into an object that could be serialized. This object could then be used by any client in whatever manner needed. Typically the code would be something like:
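// Declared abstract so the sketch compiles without the EntityBean lifecycle methods.
public abstract class CustomerEntityBean implements javax.ejb.EntityBean {
    // Accessor for the bean's first name; the implementation is not shown here.
    public abstract String getFirstName();

    // Copy the bean's state into a DTO that can be handed to clients.
    public CustomerDTO getCustomerDTO() {
        // Prepare a customer DTO with the first name.
        return new CustomerDTO(this.getFirstName());
    }
}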
Nothing special going on in there, except that the CustomerDTO constructor is called with the result of the bean’s getFirstName() method to populate the DTO. In EJB 3 you are saved from doing all this (or so they say) because your entities are detachable (in the sense that they are not ALWAYS managed by the container), so essentially they behave like DTOs when in the “detached” state.
There are DTOs, and then there are things called Value Objects. The two terms are often used interchangeably, although they are not the same thing. For two DTOs, equality is based on object identity, whereas two value objects are equal if their state is equal. By this I mean, for example, that two customers are equal if their customer id and/or customer name are equal. Value object equality can be dictated by all properties or by the subset of properties that uniquely identifies each object (by overriding equals).
A good example of a value object would be a date. Something like 11/11/2011 is a value object because, as you can make out, it is:
- Immutable
- If I have multiple copies of this date, all the copies would be equal
A value object should always override the equals method (refer to: Value Object for more details). A DTO, on the other hand, is just used as a mutable object; its equality is based on object identity rather than state, as stated earlier (CustomerVO vs. CustomerDTO, VO = Value Object). This example is admittedly lousy, because ideally I would want to keep my value objects very simple, like a Date (as opposed to a Customer).
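The distinction above can be sketched in a few lines of Java. This is only an illustration under my own assumptions: the fields of CustomerVO and CustomerDTO are made up for the example, and java.time.LocalDate stands in for the date because it is an immutable JDK type (java.util.Date, by contrast, is mutable).

```java
import java.io.Serializable;
import java.time.LocalDate;
import java.util.Objects;

// Hypothetical value object: immutable, equality based on state.
final class CustomerVO {
    private final long id;
    private final String name;

    CustomerVO(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CustomerVO)) return false;
        CustomerVO that = (CustomerVO) other;
        return id == that.id && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }
}

// Hypothetical DTO: a mutable carrier that keeps Object's identity-based equals.
class CustomerDTO implements Serializable {
    private long id;
    private String name;

    long getId() { return id; }
    void setId(long id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class ValueVsDtoDemo {
    public static void main(String[] args) {
        // Two value objects with the same state are equal.
        System.out.println(new CustomerVO(1L, "Ann").equals(new CustomerVO(1L, "Ann")));

        // Two DTOs with the same state are still two different objects.
        CustomerDTO a = new CustomerDTO();
        CustomerDTO b = new CustomerDTO();
        a.setId(1L);
        b.setId(1L);
        System.out.println(a.equals(b));

        // An immutable date behaves like a value object out of the box.
        System.out.println(LocalDate.of(2011, 11, 11).equals(LocalDate.of(2011, 11, 11)));
    }
}
```

Whenever equals is overridden, hashCode has to be overridden consistently, which is why the value object defines both.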
Apart from all of this, let’s keep things even more clear: throughout this blog I will use the term Model for the representation of “actual” data to be retrieved or persisted. In Hibernate terms, the Model is a “Model” of what lies in the database or what will lie in the database (talk about puns).
Abuses
Throughout its long history the DTO has been utilized for all kinds of purposes. In the EJB days the mantra was usually to use them to make expensive three-tiered calls easier, but times have changed since the prehistoric EJB 1.x days. While it makes a lot of sense why DTOs were needed originally, in today’s world people mostly abuse the term and the pattern in their projects. (I am speaking from experience, of course; I don’t claim that everyone is doing this, before someone jumps the gun on me.) Some of the common blunders / mistakes / abusive usages are:
- Using DTOs to pass the data within the tier
- Using DTOs interchangeably with Value Objects
- Using DTOs to compose other DTOs and finally using one huge DTO for display (it sounds horrifying and dreamy, but I have seen this happen)
Let’s tackle these abuses one by one. As the first one says, DTOs are quite often passed around simply because a method call has too many parameters. While grouping them might be a more object-oriented approach, it doesn’t justify using a DTO. If there are too many parameters sitting independently, maybe they should all be composed into one Model class and then passed around. Only if the method being called exists in a separate tier (let’s assume for now that the tiers are not even in different VMs) should a DTO be considered.
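As a rough sketch of the first point, here is what composing loose parameters into one Model class might look like for a call that stays within a tier. ShippingRequest, ShippingService and the pricing rule are all hypothetical names and numbers invented for the illustration.

```java
// Hypothetical model grouping parameters that always travel together.
final class ShippingRequest {
    private final String city;
    private final String postalCode;
    private final double weightKg;
    private final boolean express;

    ShippingRequest(String city, String postalCode, double weightKg, boolean express) {
        this.city = city;
        this.postalCode = postalCode;
        this.weightKg = weightKg;
        this.express = express;
    }

    String getCity() { return city; }
    String getPostalCode() { return postalCode; }
    double getWeightKg() { return weightKg; }
    boolean isExpress() { return express; }
}

class ShippingService {
    // Before: quote(String city, String postalCode, double weightKg, boolean express)
    // After: callers in the same tier pass one model; no DTO is involved.
    double quote(ShippingRequest request) {
        // Made-up pricing rule: flat fee plus a per-kilogram charge, doubled for express.
        double base = 5.0 + 2.0 * request.getWeightKg();
        return request.isExpress() ? base * 2 : base;
    }
}
```

Nothing here crosses a tier boundary, so the grouped parameters are just a model; a DTO would only enter the picture if quote lived in a separate tier.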
The second one is pretty self-explanatory: value objects, being immutable, are often used interchangeably with DTOs. This is not a pain most of the time, but as a purist you’d want to see value objects being used for what they are.
Finally, in a couple of cases I have seen people compose DTOs within a DTO to transfer some data and retrieve it as a matter of convenience. Although it is not harmful in any sense, one should refrain from composing DTOs within a DTO (as opposed to a Model within a DTO). A DTO can have any number of models and any number of lists, but if we are adding DTOs within a DTO, it tells me that we want to use the outer DTO as a shuttle to transfer the inner DTO because of some stringent need. I don’t think that plays out well with good design.
It is sometimes hard to point out why the DTO can be such an anti-pattern (I know a lot of folks believe that it is, but I am not one of them). I have been working on a project that uses a very interesting commons framework that glorifies the DTO pattern down to its simplest, most elegant and pure roots (which is why I am proud to be part of this project). I’ll elaborate on this later. It’s a matter of personal taste how we write our code; however, thinking about the intended use of what you are doing can go a long way in simplifying things for us.
Regards
Vyas, Anirudh
Hi Anirudh,
An interesting article.
What do you think of a DTO situation like this:
1. A web tier (on one server) constructs a Search object containing a number of search parameters, some of which may or may not be set, depending on search complexity.
2. It calls a data service tier session (on another server) passing the Search object.
3. Data service performs search, adds search results and total results (for paging options), then returns this Search object back to the web tier.
I would say the DTO pattern makes sense here because all the related details are grouped, and the session interface signature is nice and simple as there’s no need to overload with all the various possibilities for the different search parameters.
You are right. In fact, I hope to do a write-up on this soon; we are using the same concept in our current project, where a DTO essentially holds the search criteria. Users set the criteria for the search in the UI, and the DTO is passed through the tiers. The search criteria could be a string or a model (or a POJO or a Java bean or anything that is Serializable), and the search results could be a collection of the models received. This definitely makes sense.
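A minimal sketch of the search scenario discussed above, assuming a single optional keyword criterion and plain strings standing in for the result models. SearchDTO and InMemorySearchService are hypothetical names, and the in-memory list is only a stand-in for the real data service tier.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical DTO: criteria travel to the data tier, results travel back.
class SearchDTO implements Serializable {
    private String keyword;                 // optional criterion; null means "not set"
    private int pageNumber;                 // zero-based page index
    private int pageSize = 10;
    private List<String> results = new ArrayList<>(); // strings stand in for models
    private int totalResults;               // total matches, used for paging options

    String getKeyword() { return keyword; }
    void setKeyword(String keyword) { this.keyword = keyword; }
    int getPageNumber() { return pageNumber; }
    void setPageNumber(int pageNumber) { this.pageNumber = pageNumber; }
    int getPageSize() { return pageSize; }
    void setPageSize(int pageSize) { this.pageSize = pageSize; }
    List<String> getResults() { return results; }
    void setResults(List<String> results) { this.results = results; }
    int getTotalResults() { return totalResults; }
    void setTotalResults(int totalResults) { this.totalResults = totalResults; }
}

// Hypothetical data service; an in-memory list stands in for the real store.
class InMemorySearchService {
    private final List<String> catalog;

    InMemorySearchService(List<String> catalog) {
        this.catalog = new ArrayList<>(catalog);
    }

    // One call fills in both the requested page and the total count.
    SearchDTO search(SearchDTO search) {
        List<String> matches = new ArrayList<>();
        for (String item : catalog) {
            if (search.getKeyword() == null || item.contains(search.getKeyword())) {
                matches.add(item);
            }
        }
        int from = Math.min(search.getPageNumber() * search.getPageSize(), matches.size());
        int to = Math.min(from + search.getPageSize(), matches.size());
        search.setResults(new ArrayList<>(matches.subList(from, to)));
        search.setTotalResults(matches.size());
        return search;
    }
}
```

The session interface stays at a single search(SearchDTO) method no matter how many criteria are added later, which is the signature benefit mentioned in the comment.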
Regards
Vyas, Anirudh
Then it is a Search object or a Search results object, not a DTO. For Search Results objects, they can be Projects or Report objects. They might seem the same, but they are not.
A DTO, when taken literally, should mean a Data Transferring Object. Here the search criteria are passed through the tiers, some business processing is done on the business tier, and finally the data that was fetched or persisted based on the search / operation criteria is set in the DTO to be “transferred” across the tiers back to the UI tier that initiated the request in the first place.
Could you elaborate more on your definition of a DTO, Mark, and delve a little deeper into DTO vs. a Search object?
Regards
Vyas, Anirudh
Hi Anirudh,
You’re not by chance using Lucene for the search and Wicket for the presentation?
I’ve been trying (rather unsuccessfully) to integrate Wicket and a search data tier via IDataProvider, but have found that they are not compatible.
The current Wicket paging mechanism expects queries for data and total results (count query) to be executed separately.
Fine if using a DAO and a database.
But of course a Lucene search is executed once to locate both results and totals for the paging.
If you’re interested, have a look at the Apache Jira ticket I created:
https://issues.apache.org/jira/browse/WICKET-1784
I don’t think Igor (lead Wicket developer) is convinced that using a DTO for passing search and paging parameters and returning results between Wicket and the data service tier is a widely used approach.
Was wondering if you have a view on this?
Many thanks.
One of the guys asked me to cite some sources who believe DTOs and Value Objects are different. For details, please read the P of EAA book by Martin Fowler and refer to page 487.
The idea of DTO vs Value Object being different is fairly common in J2EE community as per my understanding.
Yes; one of our colleagues had a discussion about this at length with Igor (about the query for data and the count being separate), and their opinion seems to be that as a framework that’s what makes sense to them; I don’t completely agree.
In a common J2EE environment a DTO is a POJO or a Java bean; the difference is only a matter of a purist view, in my opinion. When I use a DTO to represent, say, an Order and call it OrderDTO, and then use the same object for persistence, the TO’s purpose becomes polluted. A TO should do only what it’s supposed to do; beyond transfer, the tier has to take over. In my opinion actual DB entities should be represented as Models in Java / J2EE environment and then passed around through tiers using a DTO (which would mimic the data of Model or state of the Model or could in fact contain the model itself.)
This in my opinion leads to better Separation of concerns.
Regards
Vyas, Anirudh
Can you elaborate on this statement in more detail, or provide an example of such a concept?
“actual DB entities should be represented as Models in Java / J2EE environment and then passed around through tiers using a DTO (which would mimic the data of Model or state of the Model or could in fact contain the model itself.)”
Well, what I meant is:
A Model is essentially a model of a DB entity in the Java / programmer environment. For example, a Customer table may be mimicked as a CustomerModel in the Java app.
Now, a DTO’s purpose is to transfer data or state changes in a model across the tiers, so we could have something called CustomerDTO that emulates the CustomerModel’s attributes (like id, name, DOB, age, address) but is only used for transferring the change of state in CustomerModel.
Consider a scenario where a customer’s address has to be changed. I change the customer’s address on the CustomerModel, use a CustomerDTO to reflect those changes, transfer this state change across the tiers using that TO, and finally, at the data access tier, use the TO’s information to update a CustomerModel. This CustomerModel is then used to persist the data to the DB using either JDBC or some ORM mechanism … This makes my app clean, because now I have a clear separation of concerns, with a transfer object functioning as a transfer object (for transferring state) and a Model being used to emulate a DB entity (a Customer table being represented as CustomerModel in the application).
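The address-change scenario above could be sketched roughly as follows. The classes carry only id and address to keep the example short, CustomerDao is a hypothetical name, and a HashMap stands in for JDBC or an ORM.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model mirroring a row of the Customer table.
class CustomerModel {
    private final long id;
    private String address;

    CustomerModel(long id, String address) {
        this.id = id;
        this.address = address;
    }

    long getId() { return id; }
    String getAddress() { return address; }
    void setAddress(String address) { this.address = address; }
}

// Hypothetical transfer object: carries the changed state across tiers, nothing more.
class CustomerDTO implements Serializable {
    private final long id;
    private final String address;

    CustomerDTO(long id, String address) {
        this.id = id;
        this.address = address;
    }

    long getId() { return id; }
    String getAddress() { return address; }
}

// Hypothetical data access tier; a map stands in for JDBC or an ORM.
class CustomerDao {
    private final Map<Long, CustomerModel> table = new HashMap<>();

    void insert(CustomerModel model) {
        table.put(model.getId(), model);
    }

    CustomerModel findById(long id) {
        return table.get(id);
    }

    // Apply the state carried by the DTO to the persisted model.
    void update(CustomerDTO dto) {
        CustomerModel model = table.get(dto.getId());
        if (model == null) {
            throw new IllegalArgumentException("Unknown customer id: " + dto.getId());
        }
        model.setAddress(dto.getAddress());
    }
}
```

The DAO's public update method accepts only the transfer object, while the model stays the thing that is actually persisted.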
Regards
Vyas, Anirudh
Hi Anirudh,
read the comment below and wondered how you would approach my problem of modeling a cycling route when you said:-
“One should refrain from composing DTOs within a DTO”
I have a route object that has a name and a distance. It also has a list of waypoint objects that have names, x and y coordinates.
How can I send the route data down the wire? Are you saying that the DTO must be of a single dimension like a list? At the moment I send a route DTO that also contains a list of waypoint DTOs. So it’s multidimensional. Are you saying that I need to send twice, the route DTO once and then the list of waypoint DTOs?
Hi Anirudh
Much appreciation for sharing your wisdom and spreading software craftsmanship. Also to everyone else for the comments. I am still battling to get to grips with design concepts.
My question is: why not use the Model itself to transfer data? What are the benefits of this clean design and separation of concerns in this case? Why do you filter out the behavior?
Thanks
Tebogo
Well, we are not filtering out behavior, but we are making a clear distinction between a model and a DTO.
A DTO’s behavior is limited to state and data transfer; a model is what persistable entities look like outside of the database (in a web application, for example). So if we are talking about persisting Data Transfer Objects, then we are not working with DTOs; we are really working with models.
It might be argued that DTOs and Models sometimes duplicate state in order to transfer it, but if that gives me a clear separation of concerns, I’ll do it. What am I gaining in the end? A truly portable tier: my data access tier, for example, deals only with DTOs and nothing else, and so do the other tiers in other applications, which makes plugging the DAO tier into some other application fairly easy. For me, DAOs are highly “fine” and reusable component pieces, just like a SQL DDL script would be (although the comparison is crazy … but hey).
Regards
Vyas, Anirudh
Paul Said :
Hi Anirudh,
read the comment below and wondered how you would approach my problem of modeling a cycling route when you said:-
“One should refrain from composing DTOs within a DTO”
I have a route object that has a name and a distance. It also has a list of waypoint objects that have names, x and y coordinates.
How can I send the route data down the wire? Are you saying that the DTO must be of a single dimension like a list? At the moment I send a route DTO that also contains a list of waypoint DTOs. So it’s multidimensional. Are you saying that I need to send twice, the route DTO once and then the list of waypoint DTOs?
My Reply:
OK, thanks for asking that. No, that is not what I meant. To be clear: yes, you compose the DTO of ALL the data that you need, but don’t use a waypoint DTO; use a list of Waypoint models (which maintain the state of the waypoints) to transfer the data across.
Think of it this way: you are really using a DTO to transfer model state or some other state. If this DTO serves as a container for DTOs, it implies to me that there is another DTO or list of DTOs that could transfer the other model states. (You can do this, and it would be very efficient, which is why I will post some stuff on this subject soon enough … but there has to be a very unique and different approach to it.)
Sorry I wasn’t able to reply any sooner; work and open source projects are keeping me busy.
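To make the suggestion concrete, here is a small sketch of a single route DTO that carries a list of Waypoint models and goes down the wire in one piece. The class names and fields follow Paul's description (name, distance, waypoints with names and x/y coordinates), but the code itself is my own assumption, and Java serialization into a byte array is used only as a stand-in for the wire.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model holding the state of one waypoint.
class Waypoint implements Serializable {
    private final String name;
    private final double x;
    private final double y;

    Waypoint(String name, double x, double y) {
        this.name = name;
        this.x = x;
        this.y = y;
    }

    String getName() { return name; }
    double getX() { return x; }
    double getY() { return y; }
}

// One DTO for the whole route: it contains models, not further DTOs.
class RouteDTO implements Serializable {
    private final String name;
    private final double distance;
    private final List<Waypoint> waypoints;

    RouteDTO(String name, double distance, List<Waypoint> waypoints) {
        this.name = name;
        this.distance = distance;
        this.waypoints = new ArrayList<>(waypoints); // defensive copy
    }

    String getName() { return name; }
    double getDistance() { return distance; }
    List<Waypoint> getWaypoints() { return Collections.unmodifiableList(waypoints); }
}

class RouteTransferDemo {
    // Simulates one trip over the wire: serialize once, deserialize once.
    static RouteDTO roundTrip(RouteDTO dto) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dto);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RouteDTO) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Transfer failed", e);
        }
    }
}
```

The route and all of its waypoints travel in a single call, so there is no need to send twice.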
Regards
Vyas, Anirudh
[...] Before we move on to discuss Service Facades, it might be worth to have a look at Anirudh Vyas’s blog on common abuses of the DTO pattern. [...]
Wow. aniruhdvyas.com is treat.
DAO objects expose the complete domain information: basically a mapping of the database (using Hibernate in my case). This includes ‘sensitive’ information (like passwords, security rules, etc.).
Internally, a service requires access to all the domain information (using a relational model or an object-oriented model), but can’t expose all this information in the results without transformation.
In my current project, each published object has attached security information related to whoever is accessing it, like “you are the owner” or “you are a friend” or “you can write”… the DTO representation is completely different from how this information is stored in the repository (DAO layer).
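A rough sketch of that kind of transformation, assuming a user entity with a password hash and a published profile that only says whether the viewer is the owner. UserEntity, UserProfileDTO and UserProfileService are hypothetical names, not taken from the project described above.

```java
import java.io.Serializable;

// Hypothetical domain entity as the DAO layer loads it, sensitive fields included.
class UserEntity {
    private final long id;
    private final String displayName;
    private final String passwordHash;

    UserEntity(long id, String displayName, String passwordHash) {
        this.id = id;
        this.displayName = displayName;
        this.passwordHash = passwordHash;
    }

    long getId() { return id; }
    String getDisplayName() { return displayName; }
    String getPasswordHash() { return passwordHash; }
}

// Hypothetical published DTO: no password, plus a flag computed per viewer.
class UserProfileDTO implements Serializable {
    private final String displayName;
    private final boolean viewerIsOwner;

    UserProfileDTO(String displayName, boolean viewerIsOwner) {
        this.displayName = displayName;
        this.viewerIsOwner = viewerIsOwner;
    }

    String getDisplayName() { return displayName; }
    boolean isViewerIsOwner() { return viewerIsOwner; }
}

class UserProfileService {
    // The transformation step at the end of a service method:
    // the entity stays inside the service, only the DTO is exposed.
    UserProfileDTO toDto(UserEntity entity, long viewerId) {
        return new UserProfileDTO(entity.getDisplayName(), entity.getId() == viewerId);
    }
}
```

The password hash never leaves the service layer, and the same entity yields a different DTO depending on who is asking.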
Not all projects are ‘simple web projects’ where the ‘view’ can transform the data (i.e. hiding sensitive information): Ajax projects (or web-service-oriented projects) require the client side to access service layer methods (web services or simple JSON REST services…)… the simplest way is to say: “OK, we have a standard DTO model coupled with the service layer API…” and the consequence is “all client-side programming (including web view rendering) is based on this service-oriented model”.
The ‘bad’ part of this solution is the overhead of transformations (each service method has, at least, a TransformToDto at the end).
The ‘good’ part of this solution is that services can merge into a unified model data from more than one source, each with its own domain representation (DAO level).
The best of this point of view is: internal representation of data is not coupled with Service exposed model and this ‘isolation(sensible to who is accessing and not ‘relational’