Tuesday, April 7, 2009

Google AppEngine and GWT now a marriage made in heaven

The announcement that Google AppEngine now supports Java is incredible news. Not just because it opens the doors to running arbitrary JVM languages like Scala, JRuby, PHP, etc. on AppEngine, but because of the ability to wire up Java on the client and Java on the server through Google Web Toolkit. You can use all of your familiar Java tools for editing, debugging, testing, and packaging.

With the new system, you can write a POJO, add JPA or JDO annotations, and write server-side logic to persist these POJOs in either an RDBMS like MySQL, or in BigTable/AppEngine. Moreover, you can export your DAO or logic interfaces through GWT RPC, and call them directly from the client, seamlessly and painlessly.

Almost Painlessly


The one hitch you'll encounter as a GWT developer is trying to serialize or deserialize persistence-capable types. This is nothing new for GWT developers who have tried this with Hibernate before, and there are workarounds such as Hibernate4GWT. The problem occurs because the persistence classes are enhanced with an extra field to hold state, which enables them to work when detached from the persistence context. GWT RPC computes its own CRC based on the fields of a class in order to ensure compatibility between server and client, and the extra field causes the two sides to disagree.
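
To make the mismatch concrete, here is a minimal sketch of a field-based signature check. It is not GWT's actual algorithm; the classes PlainPojo and EnhancedPojo, the detachedState field, and the FieldSignature helper are all made up for illustration. It only shows that any checksum computed over a class's fields changes as soon as an enhancer adds one.

```java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.zip.CRC32;

// Hypothetical POJO as written in source code.
class PlainPojo {
    Long id;
    String name;
}

// The same POJO after a simulated enhancer pass: one extra state field.
class EnhancedPojo {
    Long id;
    String name;
    Object[] detachedState; // stands in for the hidden field an enhancer injects
}

class FieldSignature {
    // Illustrative only: a CRC32 over "type name;" entries for each declared field.
    static long of(Class<?> type) {
        Field[] fields = type.getDeclaredFields();
        // Sort by name so the result does not depend on reflection order.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        CRC32 crc = new CRC32();
        for (Field f : fields) {
            if (f.isSynthetic()) {
                continue; // ignore compiler or tooling generated fields
            }
            String entry = f.getType().getName() + " " + f.getName() + ";";
            crc.update(entry.getBytes(StandardCharsets.UTF_8));
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        System.out.println("plain:    " + of(PlainPojo.class));
        System.out.println("enhanced: " + of(EnhancedPojo.class));
    }
}
```

The two printed values differ, which is the same kind of disagreement a client compiled from the plain source would have with a server running the enhanced class.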

In general, when it comes to sending serialized ORM POJOs down the wire, I think it's a risky practice, because you're likely to pull in a lot more of the reachable object tree than you bargained for unless you're careful. A better approach might be to use DTOs based on ProtocolBuffers.
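
As a rough illustration of the reachable-object-tree problem, the sketch below uses plain Java serialization as a stand-in for GWT RPC (an assumption made so the example runs without GWT) and a hand-written DTO rather than ProtocolBuffers. ShopCustomer, ShopOrder, ShopOrderDto and GraphSize are hypothetical names. Serializing one order drags along its customer and every sibling order, while the flat DTO carries only what the client needs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ShopCustomer implements Serializable {
    String name;
    List<ShopOrder> orders = new ArrayList<>();

    ShopCustomer(String name) { this.name = name; }
}

class ShopOrder implements Serializable {
    long id;
    String item;
    ShopCustomer customer; // back-reference: serializing an order pulls in the customer

    ShopOrder(long id, String item, ShopCustomer customer) {
        this.id = id;
        this.item = item;
        this.customer = customer;
        customer.orders.add(this);
    }
}

// Flat transfer object: only the fields the client actually displays.
class ShopOrderDto implements Serializable {
    final long id;
    final String item;
    final String customerName;

    ShopOrderDto(long id, String item, String customerName) {
        this.id = id;
        this.item = item;
        this.customerName = customerName;
    }

    static ShopOrderDto from(ShopOrder order) {
        return new ShopOrderDto(order.id, order.item, order.customer.name);
    }
}

class GraphSize {
    // Number of bytes Java serialization writes for the value and everything reachable from it.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Builds one customer with siblingCount + 1 orders and returns the first order.
    static ShopOrder sampleOrder(int siblingCount) {
        ShopCustomer customer = new ShopCustomer("Ada");
        ShopOrder first = new ShopOrder(0, "item-0", customer);
        for (int i = 1; i <= siblingCount; i++) {
            new ShopOrder(i, "item-" + i, customer);
        }
        return first;
    }

    public static void main(String[] args) {
        ShopOrder order = sampleOrder(100);
        System.out.println("entity graph: " + serializedSize(order) + " bytes");
        System.out.println("flat DTO:     " + serializedSize(ShopOrderDto.from(order)) + " bytes");
    }
}
```

The exact byte counts depend on the serializer, but the gap grows with every object reachable from the entity, which is the risk with sending ORM POJOs directly.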

However, it is sometimes nice to do it if your POJOs are relatively flat and you want to rapidly prototype. If you insist on using your ORM'ed POJOs over RPC, there is a trick to making it work.

Making JDO/JPA enhanced classes work over GWT RPC


The first step to making things work is to disable detachable objects and tag your class as serializable.

@PersistenceCapable(identityType = IdentityType.APPLICATION, detachable = "false")
public class MyPojo implements Serializable {

}

This does two things. First, it tells the persistence engine that you'll be managing object identity, usually through a primary key, and secondly, when your POJO is accessed outside of a transaction/session, you want it to be transient not detached. A detached object remembers where it came from, so that after modifications, it can be reattached and merged back into the datastore. A transient object forgets that it once came from the datastore, and thus if you try to repersist it, you'll end up inserting a copy.

This has a major downside in terms of ease of use, but it does prevent the enhancer from injecting hidden fields into your class to manage detached state, and it is these hidden fields which break GWT RPC compatibility.

But I don't want copies!


In using transient objects, you'll break a desired design pattern, which is to fetch an object through RPC, modify its properties, and send the same object back through RPC to be merged into the datastore. A quick and dirty workaround is to use reflection to look up an attached object on the server, copy all of the persistent fields from the transient object, and then merge the persistent object. Here's a prototype class that does this (but not recursively, so it doesn't handle anything but primitive fields):

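import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

import javax.jdo.annotations.PrimaryKey;
import javax.persistence.Id;

public class PersistenceHelper {
    // Returns the value of the first field annotated with @PrimaryKey (JDO) or @Id (JPA).
    public static <T> Object findPrimaryKey(T tInstance) {
        if (tInstance == null) {
            return null;
        }
        for (Field l : tInstance.getClass().getDeclaredFields()) {
            if (l.getAnnotation(PrimaryKey.class) != null
                    || l.getAnnotation(Id.class) != null) {
                l.setAccessible(true);
                try {
                    return l.get(tInstance);
                } catch (IllegalArgumentException e) {
                    e.printStackTrace();
                    return null;
                } catch (IllegalAccessException e) {
                    e.printStackTrace();
                    return null;
                }
            }
        }
        // No key field found: fail loudly instead of returning the exception object.
        throw new IllegalArgumentException(
                "Class " + tInstance.getClass().getName()
                + " does not have a field annotated with @PrimaryKey or @Id");
    }

    // Copies every public setX/getX property from tInstance onto entity (not recursive).
    public static <T> void copyPersistentFields(Object entity, T tInstance)
            throws IllegalAccessException, NoSuchMethodException,
            InvocationTargetException {
        for (Method f : tInstance.getClass().getMethods()) {
            // The length check keeps charAt(3) safe for a method named exactly "set".
            if (f.getName().length() > 3 && f.getName().startsWith("set")
                    && Character.isUpperCase(f.getName().charAt(3))) {
                f.setAccessible(true);
                Method getter = tInstance.getClass()
                        .getMethod("get" + f.getName().substring(3));
                getter.setAccessible(true);
                f.invoke(entity, getter.invoke(tInstance));
            }
        }
    }
}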


The way you'd typically use this is as follows:

// em is a field that supplies the current EntityManager.
@SuppressWarnings("unchecked")
public <T> T mergeTransient(T tInstance) {
    EntityManager e = em.get();
    if (e.contains(tInstance)) {
        // already managed by this persistence context
        e.persist(tInstance);
        return tInstance;
    } else {
        Object primaryKey = PersistenceHelper.findPrimaryKey(tInstance);
        if (primaryKey != null) {
            Object entity = e.find(tInstance.getClass(), primaryKey);
            if (entity == null) {
                // nothing stored under this key yet, so insert
                e.persist(tInstance);
                return tInstance;
            } else {
                try {
                    // overwrite the attached entity with the values sent by the client
                    PersistenceHelper.copyPersistentFields(entity, tInstance);
                } catch (IllegalAccessException e1) {
                    throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.", e1);
                } catch (NoSuchMethodException e1) {
                    throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.", e1);
                } catch (InvocationTargetException e1) {
                    throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.", e1);
                }
                e.persist(entity);
                return (T) entity;
            }
        } else {
            // primary key may be null, assume insert
            e.persist(tInstance);
            return tInstance;
        }
    }
}
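
To try the find, copy, and persist flow without a datastore, here is a self-contained sketch under some loud assumptions: a HashMap stands in for the EntityManager, a made-up @Key annotation stands in for @PrimaryKey/@Id, and Note and InMemoryStore are hypothetical names. It is a simplified re-creation of the idea, not the code above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for @PrimaryKey / @Id so the sketch needs no JDO/JPA jars.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Key {}

// Hypothetical bean; a fresh instance plays the transient copy that came back over RPC.
class Note {
    @Key private Long id;
    private String text;

    public Note() {}
    public Note(Long id, String text) { this.id = id; this.text = text; }
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getText() { return text; }
    public void setText(String text) { this.text = text; }
}

// A HashMap plays the datastore; the instances it holds are the "attached" ones.
class InMemoryStore {
    private final Map<Object, Object> rows = new HashMap<>();

    static Object keyOf(Object bean) throws ReflectiveOperationException {
        for (Field f : bean.getClass().getDeclaredFields()) {
            if (f.isAnnotationPresent(Key.class)) {
                f.setAccessible(true);
                return f.get(bean);
            }
        }
        throw new IllegalArgumentException("No @Key field on " + bean.getClass().getName());
    }

    // Copies each public setX/getX property from source onto target (not recursive).
    static void copyProperties(Object target, Object source) throws ReflectiveOperationException {
        for (Method setter : source.getClass().getMethods()) {
            String name = setter.getName();
            if (name.startsWith("set") && name.length() > 3 && setter.getParameterCount() == 1) {
                Method getter = source.getClass().getMethod("get" + name.substring(3));
                setter.invoke(target, getter.invoke(source));
            }
        }
    }

    // Insert when the key is unknown, otherwise update the attached instance in place.
    Object merge(Object incoming) {
        try {
            Object key = keyOf(incoming);
            if (key == null) {
                // A real datastore would assign a key here; that is out of scope for this sketch.
                throw new IllegalArgumentException("This sketch requires a non-null key");
            }
            Object attached = rows.get(key);
            if (attached == null) {
                rows.put(key, incoming);
                return incoming;
            }
            copyProperties(attached, incoming);
            return attached;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not merge " + incoming, e);
        }
    }

    int size() { return rows.size(); }
}
```

Merging new Note(1L, "final") into a store that already holds a Note with id 1 updates the stored instance instead of adding a second row, which is the behavior the transient objects lose on their own.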

Less than ideal


After experimenting with this pattern, I've come to the conclusion that although it works, I don't feel warm and cozy serializing instances out of the datastore. I like to have full control over what I'm sending down to the client so I can optimize for size and speed. However, I don't want to write boilerplate synchronization code for DTOs. In a later article, I'll detail a pattern for using ProtocolBuffers with GWT and a DSL for terse/concise manipulation of them.

It's still awesome


Even though there are some issues surrounding using RPC seamlessly with the Datastore ORM layer, they are completely trumped by the time saved by not having to do ANY configuration AT ALL to deploy an application. Words cannot describe how much of a time saver this is. No messing around with apt-get. No editing Apache configs. No setting up log rotation and archiving. No dealing with backups. No bother with figuring out the right way to shard your db for your expected growth. No need to harden your own machines and firewalls. No need to provision anything but the application ID.

To be sure, there are still things you can't do on AppEngine. I can't run JNI. I can't launch threads. I can't use Java2D/JAI/JavaSound. And probably most relevant, I can't host long-running comet sessions. And if you really really need to do those, you can rent a server somewhere to do it.

However, the majority of applications don't use these capabilities, and for these developers, AppEngine is an epic win.

8 comments:

Anonymous said...

I agree. In addition, most hosting providers don't allow Java. Google has made it incredibly easy to use Java end-to-end. I have an article on this subject at http://blog.srinivasan.biz/2009/04/11/google-web-toolkit-and-google-app-engine-java-end-to-end

David Yu said...

I'd be interested on your approach on using protobuf from a datastore and using it with GWT rpc.
Keep em coming.

Cheers

phil said...

I'm very interested in how to get the RPC working with the GWT/Java and the App Engine database. Hope you can follow this one up .. thanks

Unknown said...

Nice article... I've had similar experiences with GAE. The thing about requiring non-detachable objects in order for serialization to work over RPC is really annoying, IMO.

Angelyx said...

What about using google gson to turn POJOs into JSON that effectively allows RPC of a JDO into a JSO? It's not very recursive-friendly, and does need a bit of boilerplate, so you could skip the gson and use your reflection methods to get all the getter names, convert into a standard naming convention, construct a JSON string to send down the wire, and eval into JS Overlay type that mirrors the serverside JDO?

Yah, you would have to copy and paste classes, and replace JDO annotations with native coding {nothing a little regex script couldn't automate}, but it would certainly remove all hidden fields.

I don't know, I'm not expert. Just another pauper giving away his last two cents... I'm working on this right now, so if you could save some time and set me straight, it would be greatly appreciated.

Angelyx said...

First off, Ray, thanks for this awesome hack; I hope it's just temporary, but in the meantime, it wasn't entirely compatible with my beans...
...So, I made a few modifications to use a @GetterPrefix annotation to make the copyPersistentFields method test for getName or isName.

Step one, an enum for is and get:
public enum MethodPrefix {
    GET("get"),
    IS("is");

    String x;

    private MethodPrefix(String x) {
        this.x = x;
    }

    @Override
    public String toString() {
        return x;
    }
}

Step two, the actual annotation:

package javax.jdo;

import ...;

@Target({ElementType.FIELD, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface GetterPrefix {
    public static final MethodPrefix Default = MethodPrefix.GET;
    MethodPrefix prefix() default MethodPrefix.GET;
}

NOTE! This annotation MUST be in the javax.jdo or org.datanucleus package to pass AbstractAnnotationReader.isSupportedAnnotation()

Step Three, create a prefix-getter method

private static String defaultPrefix = GetterPrefix.Default.toString();

public static String getPrefix(Method f) throws SecurityException, NoSuchMethodException {
    GetterPrefix x = f.getAnnotation(GetterPrefix.class);
    return x == null ? defaultPrefix : x.prefix().toString();
}

Step four, replace hardcoded "get" with method

Method getter = tInstance.getClass().getMethod("get" + f.getName().substring(3));

BECOMES

Method getter = tInstance.getClass().getMethod(
getPrefix(f) + f.getName().substring(3));

Just annotate your isName() with @GetterPrefix(prefix=MethodPrefix.IS)

Voila! Beans are happy!


...Thanks again, timelord.

DF said...

"I like to have full control over what I'm sending down to the client so I can optimize for size and speed. However, I don't want to write boilerplate sychronization code for DTOs. In a later article, I'll detail a pattern for using ProtocolBuffers with GWT and a DSL for terse/concise manipulation of them."

Yes, please, more detail on this. The thought of writing a @PersistenceCapable object with a bunch of fields, then a data transfer object (DTO) with those same damn fields, and constructors to pour fields from one to the other, seems horrible.

Anonymous said...

Hi Ray! Thanks for this and I'm looking forward to your upcoming article about using ProtocolBuffer!

Thanks!