Mike Keith's blog on Java and Persistence
There was a recent announcement made by a consortium of companies about SCA and SDO that can be found here. I have since been getting a few questions about what this means for EJB 3.0 and what the differences between SDO and EJB 3.0 are. There are also some misunderstandings and misinterpretations about what this implies with respect to continued support of the Java EE platform by these companies.
First off, to ensure the message is crystal clear, this does not mean that any of these companies are abandoning or reducing their support for the Java platform. SCA and SDO are complementary technologies to Java EE, not competitors to it. They are geared towards SOA architectures in general as opposed to Java-specific web services, so in many ways they are generalizations of the model but with concrete API's. EJB components fit well into the SCA component model, and EJB persistence would do well to sit underneath the binding layer of SDO at the Data Access Service (DAS) layer. The companies that are involved in these specifications have all shown their strong support for Java EE and to the best of my knowledge (I can't speak for all of the companies, of course) have no plans to change that level of support.
Here are a few of the notable differences between the specs:
· SDO is the basis for an overall architecture while the EJB 3 Persistence API is simply what its name implies -- a persistence technology. SDO is the basis for an architecture because it does not cover all of the components that would be required for a complete platform. It seems to have been designed more for the upper application architecture layers, like SOA, but leaves all of the specific underlying technologies (except for XML!) unspecified. Services such as Data Access Services are mentioned, but are abstract. Other things such as transactions are not even mentioned. EJB 3, although not at the lowest level as something like JDBC, is definitely lower down on the technology stack.
· SDO is driven by a meta-model and has a very reflection-oriented look and feel. Its API's on DataObjects and DataGraphs offer reflective access of type and property metadata as well as concrete instance data. EJB has no meta-level API's. Entities are concrete and used by applications as simple unadorned Java objects, offering whatever they natively support as part of their domain level API.
· In SDO a great deal of effort, and most of the specification, goes into mapping between XML and the data objects. This is in keeping with its SOA focus and its attempts to be interoperable. Mappings to and from the XML may be generated in both directions. EJB does not make any special considerations for interoperability, except as defined by underlying protocols such as IIOP and CORBA that may be used by the application for communicating the objects.
· Objects in SDO are self-managing because they are always wrapping the actual domain data, even when they are detached. A DataObject or DataGraph is the surface object so wherever the object or object graph goes the operations are getting invoked on those wrapper objects. EJB entities are managed only when they are attached to an EntityManager. Only when operating on managed entities are the objects potentially wrapped. When objects become reattached is when the processing comes back into play to calculate changes and so forth. The difference is that SDO retains the heavier but more controlled objects throughout their entire life cycles, whereas EJB 3 entities are POJO's of the simplest kind if they become unmanaged.
· Object traversal in SDO is done using XPath queries. This provides the abstraction to navigate through the various wrappers without having to issue all of the unwrapping calls at each level. EJB 3 objects are POJO's and are thus directly traversed using the domain API. The SDO model actually feels a little like the OODB's of old. There is no querying of objects except by traversal, as part of an XPath query or directly from one object to the next. It is unclear what the specified way of causing objects to be loaded on demand during traversal is (whether the DataGraph is supposed to go back to the DAS to get DataObjects that are not loaded).
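To make the traversal difference a little more concrete, here is a minimal sketch. The DynamicObject class is a hypothetical stand-in that I made up to mimic a generic wrapper with path-based access; it is not the SDO API, and the slash-separated path is only a rough analogue of an XPath expression. The Staff, Division and Supervisor classes are equally invented plain domain objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a generic, wrapper-style data object.
// This is NOT the SDO API; it only mimics path-based property access.
class DynamicObject {
    private final Map<String, Object> properties = new HashMap<>();

    DynamicObject set(String name, Object value) {
        properties.put(name, value);
        return this;
    }

    // Resolve a slash-separated path such as "division/supervisor/name".
    Object get(String path) {
        Object current = this;
        for (String step : path.split("/")) {
            if (!(current instanceof DynamicObject)) {
                throw new IllegalArgumentException(
                        "Cannot navigate to '" + step + "' in " + path);
            }
            current = ((DynamicObject) current).properties.get(step);
        }
        return current;
    }
}

// Plain domain classes, traversed through their own API.
class Supervisor {
    private final String name;
    Supervisor(String name) { this.name = name; }
    String getName() { return name; }
}

class Division {
    private final Supervisor supervisor;
    Division(Supervisor supervisor) { this.supervisor = supervisor; }
    Supervisor getSupervisor() { return supervisor; }
}

class Staff {
    private final Division division;
    Staff(Division division) { this.division = division; }
    Division getDivision() { return division; }
}

class TraversalSketch {
    // Wrapper style: one path expression walks through every wrapper.
    static String viaPath() {
        DynamicObject supervisor = new DynamicObject().set("name", "Ada");
        DynamicObject division = new DynamicObject().set("supervisor", supervisor);
        DynamicObject staff = new DynamicObject().set("division", division);
        return (String) staff.get("division/supervisor/name");
    }

    // POJO style: the domain API is the traversal mechanism.
    static String viaDomainApi() {
        Staff staff = new Staff(new Division(new Supervisor("Ada")));
        return staff.getDivision().getSupervisor().getName();
    }

    public static void main(String[] args) {
        System.out.println(viaPath());
        System.out.println(viaDomainApi());
    }
}
```

Both calls reach the same value. The wrapper version trades compile-time checking of the navigation for a uniform, metadata-friendly access path, while the POJO version is checked by the compiler but tied to the concrete domain classes.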
( Dec 21 2005, 12:35:30 AM EST ) Permalink Comments 
For those who know me I'm sure the title was enough. You already know what I am going to run on about in this blog entry and are probably just reading on to find out exactly how vigorously I am going to spew.
The simplest definition of metadata that I can think of is data that describes other data. Like regular data it must be persisted somewhere in order for it to endure beyond the program that created it. It can appear in many forms; in some cases it can be found stored in the database, in others it may be in the file system. Back in the Smalltalk days metadata about classes was stored as metaclasses and persisted as part of the runtime image itself. To summarize, it is the function, not the form, of the data that makes it metadata.
Okay, let's move on to annotations and how JSR 175 is supposed to change the way that we program in Java.
So what is the problem that annotations solve? Well, since Java code is really just another type of data it turns out that we often need data to describe it as well. Traditionally we have used various formats for storing and associating the metadata with the Java code, and obviously the one most commonly used within the last few years has been XML files. In XML the metadata is typically structured inside of multiple levels of XML elements that include the names of the Java artifacts, and this is all in order to link the two together. The problem is that the metadata and the Java code that it describes are separated from each other. This introduces all sorts of problems, from persistent storage of the metadata to retrieving it and associating it at processing time, or runtime, or whatever time the two need to be linked. Annotations introduced a way to couple them within the language format so that they are stored in the same place, loaded at the same time and accessible whenever the Java program is. We no longer need to go out into an entirely different language, grammar and syntax and embed the metadata within a myriad of XML elements that exist solely to tell us which Java artifacts the metadata applies to.
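As a small illustration of that coupling, the sketch below declares an annotation and reads it back through reflection. The @StoredIn annotation and the AnnotatedEmployee class are hypothetical names invented for this example; only the java.lang.annotation types and Class.getAnnotation come from the JDK.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation, invented for illustration only.
@Retention(RetentionPolicy.RUNTIME)   // keep the metadata available at runtime
@Target(ElementType.TYPE)             // allow it on classes and interfaces
@interface StoredIn {
    String table();
}

// The metadata travels with the class it describes.
@StoredIn(table = "EMP")
class AnnotatedEmployee {
    long id;
    String name;
}

class AnnotationSketch {
    // Read the table name straight from the class; no external file,
    // no element naming the class it applies to.
    static String tableFor(Class<?> type) {
        StoredIn meta = type.getAnnotation(StoredIn.class);
        return meta == null ? null : meta.table();
    }

    public static void main(String[] args) {
        System.out.println(tableFor(AnnotatedEmployee.class));
    }
}
```

The XML equivalent would need at least one element just to say which class the table name belongs to, which is exactly the linking overhead described above.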
So if annotations are so great then why am I discontented? Because I can't use them. In particular, I can't use them for anything other than their simplest and most obvious usage. I want to use them for metadata that applies to a group of classes. I want the same ease of use at the application level that I get at the class level.
It really scrapes my elbows that there was such a lame attempt at supporting annotation metadata at the level that applies to multiple classes. The so-called package-info.java mechanism is too crippled and clumsy to be usable, and the result is that it has left such a bad taste in everybody's mouth that they want to spit whenever they get reminded of anything related to annotations above the level of classes. It doesn't help that Java does not have any notion of a module, or enclosing configuration-friendly wrapper around bunches of Java code. Combine these together and it becomes hard to dispute that Java is doing a pretty lousy job accommodating application-level metadata.
The fact that Java has failed to provide a decent mechanism to support application-level annotations by no means implies that application-level annotations are a bad idea. Remember that annotations are just a way of specifying metadata. In fact, they are a better format than XML if the metadata is already coupled to Java code and does not need to be portable to other platforms outside of Java, which is almost all of the time. There are lots of other reasons why people might want to use annotations instead of XML, ranging from development processes that are not geared for configuration management and version controlling of XML files to having different editing environments that are required for managing XML and Java. Some people have had such bad XML experiences that they shun it whenever possible. Regardless of the reasons, annotations are a much neater, more consistent way of adding metadata to Java programs that integrates perfectly with programming in Java. The code completion of annotations inside the Java editor, the brevity of their format and the compile-time checking that comes for free already combine to make them a better environment for metadata programming than XML.
There are two main complaints that I have heard from critics of application-level annotation metadata. The first is that annotations should not be used for a group of application-level classes because there is no reasonable Java artifact on which to tack the annotations. This is true, of course, but quite beside the point. While the metadata ideally should be on a Java module no such module exists right now (although I believe it is being kicked off as a JSR even now). There is absolutely nothing stopping us from using whatever artifact we choose to store the metadata on as long as we have designated it as such. A class of our choosing will do the job for now... a no worse suggestion than using one for a global Persistence bootstrap point. It's no more or less a correct use for a class. The idealist may scoff at this saying that it is not the *correct* use of annotations, but my response is that just because the perfect target for the annotations does not exist does not mean that we can't adapt to the shortfall that exists in Java. The same idealist probably also criticizes the use of an interface to group a number of shared constants together, but that does not mean that it is not useful as a constant pool. When we don't see exactly what we need in Java most people won't say, "Since Java does not have what I need then I should probably not do it at all." We have a need or a design in mind and we use whatever tools are at our disposal to accomplish it. In this case we want to be able to use annotations to specify application-level metadata. We can make it work quite easily, so why shouldn't we?
The second argument that I have heard is that application metadata is not coupled to the code and therefore should not be specified as annotations on code. This is then always followed by the comment that you would have to recompile the annotated class in order to change the metadata. In the first case it is incorrect, while in the second it is both erroneous and inconsistent.
Application metadata can be divided up into two types of metadata, that which is tightly coupled to the code, and that which is loosely coupled to the code. As an example of tightly coupled metadata consider the EJB Persistence API where currently the @Entity annotation must have the access member set to FIELD if mappings are to be defined on the member fields instead of on properties. Setting this for all of the classes in the application will ease development considerably for the many people that like to use direct field access instead of the getter/setter property accessors, but changing the application level default definitely affects each and every class that did not explicitly specify the access mode. Ditto for the named queries which are defined for the entire application and referenced within classes that call em.createNamedQuery("MyFavoriteQueryName").
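Here is a rough sketch of what a designated class carrying application-level metadata could look like. Everything in it is an assumption made for illustration: @AppDefaults, @AppNamedQuery, @Access, the AccessMode enum and the PayrollApplication carrier class are hypothetical names and are not part of the EJB 3.0 specification or of any product.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

enum AccessMode { FIELD, PROPERTY }

// Hypothetical application-level annotations; none of these are in any spec.
@Retention(RetentionPolicy.RUNTIME)
@interface AppNamedQuery {
    String name();
    String query();
}

@Retention(RetentionPolicy.RUNTIME)
@interface AppDefaults {
    AccessMode access();
    AppNamedQuery[] namedQueries() default {};
}

// Hypothetical class-level override of the application default.
@Retention(RetentionPolicy.RUNTIME)
@interface Access {
    AccessMode value();
}

// The designated carrier class: it exists only to hold application metadata.
@AppDefaults(
    access = AccessMode.FIELD,
    namedQueries = @AppNamedQuery(name = "MyFavoriteQueryName",
                                  query = "SELECT e FROM Employee e"))
class PayrollApplication { }

class Badge { }                  // picks up the application default

@Access(AccessMode.PROPERTY)
class Paycheck { }               // explicitly overrides it

class AppMetadataSketch {
    // A class-level setting wins; otherwise fall back to the carrier class.
    static AccessMode accessFor(Class<?> entity, Class<?> carrier) {
        Access override = entity.getAnnotation(Access.class);
        if (override != null) {
            return override.value();
        }
        return carrier.getAnnotation(AppDefaults.class).access();
    }

    // Look up an application-wide named query by its name.
    static String queryNamed(String name, Class<?> carrier) {
        for (AppNamedQuery q : carrier.getAnnotation(AppDefaults.class).namedQueries()) {
            if (q.name().equals(name)) {
                return q.query();
            }
        }
        return null;
    }
}
```

Changing the access value on PayrollApplication changes the result for every class that did not state its own, which is the tight coupling described above. The carrier class is no more or less a correct use of a class than any other bootstrap point.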
There can also be metadata that is only loosely coupled to the code. An example of this kind of metadata would be something like the run-as security role for a method. This could potentially change without changing the code, meaning it is not tightly coupled to the code, although some code in the method might well access resources that assume a particular user and would fail if the wrong user tried to execute it.
If you have not spotted the inconsistency yet it is that this same loosely-coupled metadata that is not supposed to be in an application-level annotation because it will require recompilation happens to already be present as a @RunAs annotation on the bean class, even though it will require a recompile of the class. The reason? First, because it is almost always the same person that does the metadata as the one that does the development, and putting it on the bean class is just easier than having to go into a separate XML file to do it. Second, (this is the erroneous part for those following along at home) in practice it makes no difference that we have to recompile the annotated class before redeploying the application. The reality of it is that there is no practical difference between clicking an IDE button to recompile and insert a class into an archive, and clicking a button to insert an XML file into an archive. Both require updating the application archive and redeploying the application. There is no justification for saying that this is okay for class metadata, but not for application metadata, though. They are equivalent in both senses.
Now if you took away from this the fact that I think Java annotations are the greatest thing since the Turing machine then I have obviously failed to get my point across. Java annotations still have problems -- lots of 'em. While they do provide a basic level of support for specifying metadata they do not support many of the things that they really ought to have. That's not the object of this rant, however, and I have already griped about those things here.
It's not chic these days to say that application-level annotations are good, and I may be added to a blacklist somewhere now that I have made my opinions public, but to me annotations are a means of specifying metadata by storing the metadata on Java artifacts. Essentially they are post-it notes for Java. Why do people insist that every post-it note I put on the fridge has to be about the freakin' fridge?
( Jun 07 2005, 07:52:37 PM EDT ) Permalink Comments 
Wednesday April 20, 2005
I am Canadian. I have only been to Britain once, but I still know enough about Britain to pronounce Worcestershire correctly. The real question is what do I do when I hear others saying it incorrectly? Do I just ignore them and go on, inwardly laughing at them and their blunder, or do I correct them?
Well, I confess that I am a corrector. I do tell people how to pronounce Worcestershire correctly, not because I want to appear international or love to correct people, but because if I were them I would want to be told. Basic golden rule type of stuff. And, yes, I also tell people about the long pink thread stuck on their shoulder, the black mark that they rubbed on their cheek after changing the toner, and even on occasion the skirt that is caught in the pantyhose (that one is definitely a little trickier to do properly, though).
Every now and then a statement gets made somewhere that EJB 3.0 is just a copy of Hibernate, or even worse, that EJB 3.0 is Hibernate. Claims like these are typically made in innocence by uninformed people whose vision is obscured by the narrow scope of their own experience. It seems that few people actually know enough to even correct the propagation of this fallacy, and those that do know are not doing it. Rather than hanging my head and feeling guilty about this apparent inconsistency in my character I have determined to right my wrong, or at least do what is within my power to spread the truth. This blog entry is the first step in my repentance process.
The root of the fallacy is that Hibernate was the first free O/R mapping software that came even close to providing enough functionality and features to solve some of the problems that real world applications were facing. Lots of developers caught on to this and began to use Hibernate successfully for prototypes and small projects. For the majority of these developers this was their first and only experience using O/R mapping software to solve the O/R impedance mismatch problem.
Some of these people have never gone on to using the full-blown over-the-counter mapping products and do not even realize that these products exist or that they provide all of the features of Hibernate, and in some cases more. Their understanding of the O/R mapping and persistence concepts relates only to Hibernate and the Hibernate API's.
Cue the mandate to make EJB a useful and relevant specification. Linda DeMichiel, to her credit, realized that as the EJB spec lead she needed to fashion EJB 3.0 after existing and successful products instead of adopting the usual ivory tower approach to specification development. To do this she invited members from Oracle TopLink and JBoss Hibernate, the two most successful O/R mapping products ever, people from the top selling application servers, users/consultants from different backgrounds and eventually SolarMetric, the only JDO vendor with a real customer base as far as I know, to participate in the spec. All of the members are combining to produce a spec that will not only make session beans and MDBs easier to develop and use, but also proffer a persistence standard that will please the user community that had previously rejected the EJB standard in favour of the proprietary POJO persistence vendors.
As we progressed and began specifying the persistence layer it was obvious that we had similar features at the 80-90% functionality level that EJB is trying to achieve. These features include:
EntityManager - A transaction-level artifact that references, maintains identity and manages the objects in a given transaction. JDO calls this a PersistenceManager, Hibernate calls this a Session. TopLink calls this a UnitOfWork. These are all very close in scope, purpose and API.
Named queries - Queries must be able to be pre-defined and bound to a name for later retrieval and execution. These are called named queries in all of TopLink, Hibernate and JDO.
Native queries - Native SQL queries that allow the application to specify the query criteria in SQL. These are called SQL queries in all of TopLink, Hibernate and JDO.
Callback Listeners - The ability to define a class or method that will get invoked when a given event occurs. TopLink calls these event listeners, Hibernate and JDO call them life cycle callbacks.
Detaching/Reattaching objects - Objects can leave the scope of the EntityManager that controls them. They can also be reattached to the same or a different EntityManager through the use of the merge API call on the EntityManager. TopLink offers a series of merge calls, the most basic one being mergeClone. Hibernate has saveOrUpdateCopy and JDO has a couple of flavours of attachCopy call on the PersistenceManager.
O/R Mapping Types - All of the direct and relationship mapping types that are fundamental to mapping object state to relational database tables. These are all supported by Hibernate, TopLink and JDO. I won't go through all of the names (one-to-one, etc.. they are all pretty standard), but although some of the names differ a little bit from one to the other the functionality is pretty much the same and what you would expect.
Embedded Objects - Objects that have no persistent identity of their own but depend upon their parent object for identity. JDO calls them embedded objects, TopLink calls them aggregates and Hibernate calls them components.
The list could go on, but hopefully people get the idea. The important features in EJB 3.0 are stock persistence features that anybody that has used multiple persistence products should recognize. The best part is that by standardizing these features the design patterns (actually they are more like "feature patterns", but nobody has written a book about feature patterns, yet :-) will be able to be used and referenced in ways that span products.
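Two of the features in that list, identity management and reattaching detached objects, can be sketched in a few lines. TinyEntityManager below is a hypothetical, drastically simplified stand-in that I am assuming purely for illustration; it is not the EJB 3.0 EntityManager, a Hibernate Session, a JDO PersistenceManager or a TopLink UnitOfWork, and no database is involved.

```java
import java.util.HashMap;
import java.util.Map;

// Plain entity with no framework dependencies.
class Customer {
    final long id;
    String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical, in-memory stand-in for the transaction-level artifact
// that each product names differently.
class TinyEntityManager {
    private final Map<Long, Customer> managed = new HashMap<>();

    void persist(Customer c) {
        managed.put(c.id, c);
    }

    // Identity guarantee: the same id always yields the same instance.
    Customer find(long id) {
        return managed.get(id);
    }

    boolean contains(Customer c) {
        return managed.get(c.id) == c;
    }

    // Copy the state of a detached instance onto the managed one and
    // return the managed instance; the argument itself stays detached.
    Customer merge(Customer detached) {
        Customer target = managed.get(detached.id);
        if (target == null) {
            target = new Customer(detached.id, detached.name);
            managed.put(target.id, target);
        } else {
            target.name = detached.name;
        }
        return target;
    }
}
```

A caller that changes a detached copy and then calls merge gets back the managed instance with the new state, while the copy it passed in remains an unmanaged plain object. Whatever the call is named in a given product, that is the shape of the feature.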
Note that I am not comparing the different features offered by these products. That is not what this is about. The point is that there are similarities and that those similarities are getting enshrined in a specification. This is the biggest win for vendors and developers.
Having said this some of the Hibernate/JBoss customers will recognize that there are some similarities as well in some of the API names. This is not a problem for most of us since they represent the feature as well as any other name would, and Gavin happened to have been the first one to write it up and propose it. (Unless there is something actually wrong with a proposed name there is no reason to turn it down.) It doesn't mean that the feature was modelled after Hibernate, just that the guy from Hibernate happened to be the one to propose the name for the feature that everyone already had.
Hibernate 3 users may recognize more similarities than ever because Hibernate has decided to add these to the base Hibernate product and expose them within the core API. From a migration standpoint this may be problematic for them but that is certainly their prerogative and I applaud any product that makes its own proprietary API's look more like the standard. So as Hibernate evolves it turns out that it is actually modelling itself after EJB 3.0, not the other way around.
Finally, and maybe this is just pride speaking here, but I also have to confess feeling just a tinge of personal insult given that I have expended a substantial portion of my own time and effort toiling over the specification issues. Saying that we simply copied Hibernate would be trivializing that time and work, especially when I know full well that the conclusions that we arrived at are in most cases either the best solution, or the best possible solution given the circumstances. The spec should look a lot like Hibernate, TopLink and JDO. If it didn't then we would not have done a very good job since the whole point of this was to make use of our experience and standardize it.
So next time somebody says or writes that EJB 3.0 is modelled after Hibernate don't just inwardly laugh at them, or roll your eyes and feel sorry for them for their naivete. Please correct them. It's embarrassing for them and they would want you to tell them. I know I would.
( Apr 20 2005, 02:35:38 PM EDT ) Permalink Comments 
Tools are essential for a technology to mature. Without them it stays in the realm of being accessible only to the experts and usable only by those in the upper experience echelons. I had regular arguments with a friend who repeatedly claimed that O/R mapping tools were not required. In the end he at least conceded that if a product wants to be mainstream it has to have graphical tools to enable the sorts of development that can already be done using API's and XML configuration. This is critical to being able to support the types of developers that don't like to wallow in XML, or managers who don't trust their developers to do so :-).
EJB 3.0 has now reached this stage. With Oracle's recent announcement that it is leading the Eclipse project to provide the EJB 3.0 O/R mapping tools as part of the Web Tools Project a standard persistence tools platform is being formed. This will provide the infrastructure for meeting the EJB 3.0 goal of making this a technology that entry-level developers can understand and feel comfortable using. The learning curve to use EJB just got a lot shorter.
Hard to believe, but some people are still really missing the point. The EJB Persistence API is now set to be the standard for persistence. *All* of the major O/R vendors are on board and participating in the expert group and acknowledging it as the standard. TopLink and Hibernate, the two leading O/R mapping products, already have support for EJB 3.0 and Kodo, the only JDO vendor that has any real market share in JDO-land, is working on it as well. The age of proprietary mapping descriptors is over, at least for the vast majority of applications.
Of course there will always be some proprietary mapping features that go beyond the spec, and the proposal discusses that these will be able to be plugged in by different vendors as they feel so inclined. There will probably always be a need for proprietary features, the trick is just to ensure that they are done in a conscious way and are harnessed in a well-defined application space. Then if the requirement to move to another vendor comes along the difficulty level will be easy to diagnose.
Any way you look at it, the Eclipse proposal is going to be good for EJB 3.0 and for developers. Having a persistence development platform for the most commonly used IDE is going to provide the support that most people want.
Of course at Oracle we are still providing the support within JDeveloper. It will be well-integrated with TopLink and expose all of the deluxe features that make TopLink the coolest and most powerful O/R mapping framework on the planet. :-)
( Apr 14 2005, 11:58:05 AM EDT ) Permalink Comments 
Thursday March 17, 2005
After speaking at TSS on migration I have gotten a few requests from people asking me to write up the presentation in a paper format for people to be able to read and get more details on the subject. I definitely have to do that at some point, but for now Oracle is hosting a couple of webinars on EJB 3 that will help people to start picturing how this can be achieved. The first one is a basic intro to EJB 3 and the second one will be focusing specifically on migration. See the Java Online Seminar site for more details.
I really believe that migration is the key to getting to EJB 3.0. I know this sounds trite, but I mean migration in the implementation sense, not the general sense.
Most consultants/developers/practitioners have to work on and maintain existing systems, but will still want to find ways to incorporate EJB 3 practices and features. Offering a stepwise migration path from existing products towards EJB 3 will allow those features to be integrated into legacy systems, which can then slowly become more and more EJB 3 compliant.
If EJB 3 represents freedom from vendor lock-in then if I were an IT manager (and I'm not, nor do I have any aspirations to be one, so that might actually disqualify my moccasin switching) and I wanted to extend the lifetime of my application then I would obviously want my application to be compliant with the standard. No self-respecting IT professional would ever take on the job of taking an application, ripping it to pieces and rewriting it entirely to a new specification. The chances of succeeding are minimal at best and the chances of regression are probably nigh on 100%. As with any successful migration, moving existing systems to EJB 3 should be done in an incremental fashion.
The granularity of the steps might be arguable, and that is the sort of stuff that I discuss in migration talks (one of which, incidentally, has been accepted for presentation at JavaOne this year).
( Mar 17 2005, 01:11:48 PM EST ) Permalink Comments 
Friday March 04, 2005
Today was kind of a long-awaited day, not because I have to speak at TheServerSide Java Symposium today, but because I along with a number of other Oracle developers have been anxious to be able to share the progress that we have made on EJB 3.0. While this work has been fun, it has been somewhat frustrating because it could not be released until the timing was right. Today the timing was right, and Oracle announced our new EJB 3.0 technology preview.
This preview is really a landmark release for a few reasons:
1. It is the first commercial application server release that showcases the next generation of standardized persistence.
2. It enables actual unit testability of CMP entities using JUnit or any other test framework outside the server.
3. It provides support for migrating from EJB 2.x to EJB 3.0.
The preview can be downloaded for free here.
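To give a feel for what testing an entity outside the server means, here is a minimal sketch. The Account class and its checks are hypothetical and are not taken from the preview; the checks are written as plain static methods instead of JUnit test methods only so that the example stays self-contained.

```java
// A plain entity class: no container interfaces, no lookups, no deployment.
class Account {
    private long balance;

    long getBalance() {
        return balance;
    }

    void deposit(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount;
    }
}

class AccountCheck {
    // The kind of check a JUnit test method would make.
    static boolean depositAddsToBalance() {
        Account account = new Account();
        account.deposit(50);
        account.deposit(25);
        return account.getBalance() == 75;
    }

    // Invalid input should be rejected before the state changes.
    static boolean rejectsNonPositiveDeposit() {
        try {
            new Account().deposit(0);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(depositAddsToBalance() && rejectsNonPositiveDeposit());
    }
}
```

Because the entity is created with new and needs nothing from a container, the same checks can run from an IDE, a build script or any test framework.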
( Mar 04 2005, 05:21:05 PM EST ) Permalink Comments 
Tuesday February 08, 2005
Wednesday January 12, 2005
Saturday November 27, 2004
Sunday November 21, 2004
Okay. I admit defeat. I can no longer remain silent.
After a year of resisting the impulse to create a blog something happened to me that I simply could not hold back. It wasn't that I didn't want to blog, only that I was afraid of the time commitment and the responsibility that I was worried I would take upon myself.
But alas, the camel's back got broken today as I attended a keynote by Tim Bray at the Colorado Software Summit. Despite complaining to him afterwards I could not satisfy my frustration about some of the things that he said, and I felt that if I did not let it out then I would be in danger of combusting. This seemed to be the only venue available.
Tim's presentation was a good one, but he is obviously somebody that speaks a lot and has a bunch of polished pieces of material that he bangs together. Being a technical guy, and very accomplished I might add, he likes to bring things to a very technical level.
Where he really burned me was when he started talking about how O-R mapping was broken. Don't get me wrong, I was not angry at that. Everybody knows it is a broken idea, and something that we would rather not have to do. What got my britches bunched was that he proceeded to say how people shouldn't do it. This is not an acceptable solution, being that the only reason why people are doing it is because at this stage they have to. He, himself, said that some things were too late to change, and I really think that this is just one of those things. Too much data in relational databases and people that want to program in Java. They have to do something, and when I stood up during Q&A and told him so his idea that we all use JDBC was just too naive to be taken seriously. He obviously has never really programmed a real-live application lately and the triteness with which he dealt with the problem was indicative of this.
I have to admit, though, that he really did have a very useful and interesting idea for presenting that consisted of a long list of links that he visited in sequence and talked about. With the wireless in the room most people were able to follow the links and bookmark them individually, or the whole page from his website that he was working off of. Really useful as it leaves you with some concrete pointers of the interesting places to go to follow up on the things that he talked about. Turns out that he is a fellow Canadian, too, which I didn't know when I went up to him. Shame.
And so it begins...( Nov 20 2004, 02:43:34 PM EST ) Permalink Comments