Tuesday, December 8, 2009

GWT 2.0: So good it's ridiculous

GWT 2.0 was released today, and even though I've been using it for months and contributing to it, I didn't realize the significance of this release until I saw the Google Campfire event, because I've been exposed to each new feature incrementally, rather than all at once. If you take a step back, look at what has been accomplished since 1.7:

Development Mode: Who else but Google could build a plugin that runs on 3 different browsers on 3 different operating systems, deeply integrated with the Javascript VMs on IE, WebKit, and Firefox, that runs Java in a hybrid mode, where Java code runs in any JVM, and Javascript code is injected and ran in the browser as Javascript over a network connection? Do not underestimate the difficulty of doing this, especially when considering that Javascript can hold references to Java and vice-versa, causing problems for accurate garbage collection. Development Mode is an amazing productivity tool, made possible by some really awesome technology under the hood. Let's get a w00t! for the engineering that went into this.
UI Binder: Following the GWT mantra "Why do at runtime what you can do at compile time?" UIBinder is an extensible declarative language for constructing user interfaces, injecting those constructs into your Java code, and optimizing and bundling resources like stylesheets, images, and localized text. While it may seem like a throwback to use XML/XHTML in an era when HTML5 tagsoup and JSON are emerging as favorites, you have to consider what has been bought in exchange. UiBinder constructs HTML at compile time, which means it minimizes the number of DOM operations needed to get your UI up and running. It treats stylesheets as an intelligence resource, obfuscating and compressing them just as the GWT compiler does for JS, as well as providing local scope for CSS classes so that you can compose stylesheets written by different parties together without worrying about conflicts.
Google Eclipse Plugin: Technically, it's separate from GWT, but I'll list it here because of its tight integration with GWT. It provides intelligent integration between your Java source, JSNI methods, and UiBinder files. Again, this product alone would be a major event unto itself, but it's almost under-the-radar in Google's Campfire event.
Code Splitting: The value that whole-program compilation for JS brings to the table is no better demonstrated than it is here, by the ability of the GWT compiler to statically compute transitive callgraphs and dependencies from any callsite you desire, and split off that chunk of code for deferred loading. Ask yourself, if I handed you a hand-written large JS application like Maps, or GMail, or Y! Mail, how long would it take you to fully modularize all of the libraries used, get all of the dependencies exactly right (if you need feature F, load library F, but F needs G and H, so load libraries G and H). You can point to popular JS libraries using complex JS module loaders with dependency management, but keep in mind, that is all done by hand, on very mature libraries. GWT Code Splitting brings this functionality to mere mortals in just a few lines of code.
Constraint-Based Layout: Many modern JS libraries have punted on making CSS work for general compose-able widgets, and rely on JS to perform layout calculations for widgets. This works well, but has the downside of laggy performance, as the browser rendering engines are optimized for layout calculations, and often performing JS layout can cause excessive browser layouts. You'll see this in split panels or scroll panes sometimes as the divider or scrollbar lagging the mouse position. Joel Webber of the GWT team has finally cracked the code to make CSS based layouts work in a predictable fashion, and as a result, you can now compose arbitrary layouts in GWT with predictability, a productivity enhancement unto itself, even if some flexibility is lost, end-user performance and development speed are gained.
ClientBundle: Few people realize that GWT has been doing automated image sprites for a long time. ImageBundle, introduced in GWT 1.4, allowed GWT users to simply refer to images on disk, and the compiler would pack all of those images into a single file, transforming references to the images automatically into sprites. Then with 1.5/1.6, FooBundle, or ImmutableResourceBundle was introduced to the GWT Incubator. FooBundle/IRB took spriting to the next level by allowing all of your resources to be bundled into one or more files, or even inlined in your compiled JS as a gigantic data URI. GWT 2.0 introduces IRB into the core SDK, now called ClientBundle, it has been enhanced and polished. Simply put, ClientBundle allows GWT developers to reduce the number of HTTP requests for resources to just 1 or 2 which are cached forever, pack images optimally, optimize and minify CSS, and much more. Did you know that ClientBundle works with internationalization? That GWT ClientBundle can automatically flip images for Right-To-Left locales like Arabic, and flip CSS rules as well? Crazy awesome!
Optimized for Compression: I'm some what tooting my own horn on this feature, since I contributed it, but GWT 2.0 produces obfuscated Javascript in a way that no other tool on the planet does. In addition to renaming pretty much every variable in your entire program in a few symbols as possible, it actually re-orders code to increase gzip compression ratios. And that's just the beginning. The ability to just keep plugging in optimizations to this toolset is unbounded. Automated PNG optimization in the future? Check. New Deflate algorithms? Check. The list goes on.
HtmlUnit for Testability: You want testing? GWT's got you covered. Java tests? Check. Selenium tests? Check. Web-Mode test in real browser? Yep. But what's really new, is that GWT can run tests in a simulated browser environment thanks to HtmlUnit. This has two immediate benefits: 1) You can run your tests anywhere and not have to worry about prepping a test environment or platform specific details and 2) Tests can startup much quicker because they don't need to use a real browser. Again, it would seem like an almost irrelevent footprint if it weren't for the fact that a lot of effort is going into making this work correctly.
Speed Tracer, 'nuff said!: 1. It's awesome. 2. It's actually written in GWT. 3. It enables easy collaboration and sharing of traces. 4. It has a plugin mechanism for analyzing traces and providing automated warnings and suggestions like YSlow!. 5. It's awesome. 6. It's awesome!

But wait, there's more! These are just the main bullet points, there's lots of other changes under the hood that are also important, let's look at a few:

Story of Your Compile, Compile Reports: Think of it as Compile-Time SpeedTracer. SOYC gives you lots of detail about what each part of your app is contributing to the output in terms of size, and what the interdependencies are. This provides valuable feedback as to where to look to reduce bloat, or split-off code better. This would be the equivalent of a JS tool that looked at your jQuery application, and told you exactly how many bytes were used by each method in each jQuery plugin you're using, and which modules and functions pulled in the others. Think about it.
Better stack traces, and stack trace emulation for IE6: The GWT team simply isn't content with the status quo. When it doesn't exist, invent it. Internet Explorer does not provide proper Javascript stack frame traces. This means when something goes wrong in your application, it can be difficult to get a detailed report about what call sequence triggered it. GWT 2.0 contains the ability to switch on stack-trace emulation, which causes the compiler to insert line number, file, and callstack recording into every program method. When an error occurs in a GWT application compiled for IE6, you can actually reverse obfuscated emulated stack trace output back into real Java stack traces that tell you file and line number. Com'on, you're not excited yet?
Schedule Finally: Javascript is an event driven programming language. When the Javascript interpreter runs, it is the context of an event pumped through an event queue that enters the interpreter/VM through some handler callback, or default entry point. GWT 2.0 recognizes this fact by becoming the gatekeeper to every entry/exit of the application code, and by doing this, it can perform quite a few useful things. One is, it can implement the idea of an uncaught exception handler and enforce it, so that any exception that causes the JS interpreter to exit/yield can be diverted to a custom handler of your choice. More importantly, GWT can take control of how code is scheduled for execution, beyond setTimeout()/setInterval(). GWT 2.0 introduces the concept of Schedule Finally? What is it? Simply this: Defer execution of a piece of code to run after all other code in the current execution context is run, but before the next event is pumped. This is used, for example, by the GWT style injector, so that CSS style injection can be deferred to after the current event is handled, but run before the next is. This can allow efficient batching of operations that mutate the DOM, among other things.

I could go on, tons of compiler optimizations, Directly Evalable RPC, Conditional Deferred Binding Properties, HTML Standards Mode Support. One could make the argument, that almost everything in this list can be done, or has been done, elsewhere, or in parts of other tools. But that's why GWT is so beneficial. It unifies where other approaches fragment. Sure, you could combine your own code splitting, with your own handrolled JS loader, with someone else's image sprite tool, with another guy's test framework, and someone else's minifier, and this guy's trace analyzer, and that guy's widget toolset. And if you think you could achieve the same outcome, you'd be absolutely right.

GWT isn't about doing what isn't possible elsewhere. It's about making you productive doing it. It isn't about extolling the beauties of Java, it's about pragmatically recognizing the leverage you get from the existing toolchain, IDEs, test harnesses, and types. It isn't about hiding the DOM, CSS, and the browser from you. It's about allowing you to write your abstractions yourself more productively, just like you'd do in hand-coded JS anyway. Someday, there may even be the ability for GWT to compile other languages to Java, or even process JS with type annotations.

GWT makes productive, large scale web development possible, and enables next-gen applications to be built reliably.

-Ray

GWT Exporter 2.09 released

Improvements since last version:

Support for runtime method overloading.: Methods on the same class with identical names can be exported as a single Javascript method using run-time type-detection and dispatch. You pay a price in the form of a dispatch table encoded as a JSON object, but only for methods with overloads.
ExporterUtil::exportAll(): Feeling lazy? exportAll() will export every single exportable type in your classpath so that you don't have to invoke GWT.create() manually.
ExporterUtil::exportAllAsync(): All exports compiled to a separate JS fragment, loaded asynchronously, and invokes window.onexport() when done.
Exporter.export() deprecrated: Now simply invoking GWT.create(SomeClass.class) is sufficient for export.
Supports GWT 2.0 $entry function: $entry preserves Java exception handler behavior as well as the correct event-loop entry/exit, like running scheduleFinally commands on exit.
Requires GWT 2.0 RC2 or higher: Because $entry and code-splitting are now integrated features.

Structural Typing

I'm also about 90% done implementing Structural Types for exports, which allows idiomatic interoperability between JS and exported APIs. Structural Typing will be available in an upcoming 2.10 release real soon now.

The future

GWT Exporter was originally written for some dire needs I had, and the codebase is now over 2 years old. The latest additions have been difficult, as the codebase is getting messier. The 2.10 release will be the last release, afterwards, I will use the knowledge gained from the library to work on a rewrite for GWT core that may or may not make it into a future SDK distribution.

The goals for the future releases will be (1) Never pay for what you don't use, that is, exports should only bloat code if they absolutely must. 2) Provide flexible way to customize output for various idiomatic Javascript styles. You should be able to generate APIs that look like they were designed by hand. 3) Look at ways to process JSDocs and provide additional ways to automatically bridge JS and Java objects. 4) Improve error checking, diagnostic messages, and quality of auto generated JS docs.

GWT Exporter is available at http://code.google.com/p/gwt-exporter

Thursday, November 19, 2009

The Traveling Salesman Problem and Javascript Compression

In my last post, On Reducing the Size of Compressed Javascript, I covered a technique I had been researching on re-ordering Javascript code to enhance the efficiency of the gzip/deflate compressors. The basic approach is to come up with some way of measuring the distance between two functions, and then trying to group functions together by shortest distance.

A key question is, what's a good metric? I knew that the ideal metric would take into account how the LZ77 algorithm works, but for expediency, I went with a reasonable and well known string distance: the levenshtein distance.

GZip vs 7-zip

Another finding from my previous experiments was that 7-zip sports a much superior deflate implementation, it can compress a given input using the same algorithm as gzip with non-trivial size improvements. The problem is, 7-zip is implemented in C++, and I wanted to write a GWT Linker to automatically produce super-compressed Javascript.

In digging around for a Java 7-zip deflate implementation, I ran into this paper on Paul Sladen's page. titled: Compressing File Collections with a TSP-Based Approach. This paper published in 2004 details a similar technique of enhancing TAR compression of a collection of files by re-ordering the TAR to maximize compression. More on that later, but the paper itself has an obscure reference to something called the LZ Distance metric. Tracking down that paper "G. Cormode, M. Paterson, S. Sahinalp, and U. Vishkin. Communication complexity of doc- ument exchange. In Proc. of the ACM–SIAM Symp. on Discrete Algorithms, January 2000." leads us to the definition of LZ Distance.

LZ Distance

The LZ Distance between two strings x and y denoted by LZDIST(x,y) is as follows:

The minimum number of single characters or substrings of y or of the partially built string, which are required to build x from left to right.

As an example, consider x=aabcdefgabc and y=aaaaaaaaaaaaaaaa. We try to parse x in terms of y, and keep a running partially built string called 'p'. Here are the steps:

Prefix 'aa' of x is contained in y, so p = (aa), x = bcdefgabc
No prefix of x found in y or p, so just add 'b' p = (aa)(b), x = cdefgabc
No prefix of x found in y or p ,so just add 'c', p = (aa)(b)(c), x = defgabc
No prefix of x found in y or p ,so just add 'd', p = (aa)(b)(c)(d), x = efgabc
No prefix of x found in y or p ,so just add 'e', p = (aa)(b)(c)(d)(e), x = fgabc
No prefix of x found in y or p ,so just add 'f', p = (aa)(b)(c)(d)(e)(f), x = gabc
No prefix of x found in y or p ,so just add 'g', p = (aa)(b)(c)(d)(e)(f)(g), x = abc
'abc' is found at index '1' in p, so p = (aa)(b)(c)(d)(e)(f)(g)(abc)

Here, the LZDIST(x,y) = 8. But what of LZDIST(y, x)?

Prefix 'aa' of y is contained in x, so p = (aa), y = aaaaaaaaaaaaaa
Prefix 'aa' of y is contained in x, so p = (aa)(aa), y = aaaaaaaaaaaa
Prefix 'aaaa' of y is contained in p, p = (aa)((aa)(aaaa) y = aaaaaaaa
Prefix 'aaaaaaaa' of y is contained in p, so p = (aa)(aa)(aaaa)(aaaaaaaa)

LZDIST(y,x) = 4. Interesting, but what does this buy us?

LZDIST and GZIP

Recall that GZIP's deflate encoder uses an algorithm called LZ77 which looks at the current input tokens, and searches backwards for previous substrings that match the current input string, replacing them with back references. What LZDIST(x,y) gives us, is a measure of how many back references for X can be found in Y, that is, how X can be compressed by references to substrings in Y or in partially built decompression outputs.

Now you can see why this might help improve code re-ordering. If we can order Javascript functions in order to maximize back-references, we might be able to squeeze a little more out of gzip.

Clustering with LZDIST

Remember the earlier paper's title had the acronym TSP? This paper discusses how re-ordering can be viewed as an optimal tour on a graph which means that solving it is as hard as the Traveling Salesman Problem - NP-Complete. Uh oh, that sounds like trouble, but that's only NP Complete for the optimal solution. A good estimate on the true solution might also produce good results, and this paper shows that a greedy tour heuristic produces up to a 10% benefit in compression.

For expediency reasons, I decided not to implement a TSP model at this point, but went with simple pair-based greedy approach detailed in my previous blog post, substituting LZDIST for Levenshtein Distance. I also implemented an LZ-like greedy algorithm where I keep a running string which is the concatenation of all of the functions which have been added to the output, and use LZDIST to find the next closest function to append within a 32k window. This mimics how a gzip compressor might see the input.

Results for GWT Showcase Application

Technique	Size	Ratio	Relative Improvement (vs previous row)
No Compression	196778	100%
gzip -9	64401	32.7%	32.7%
7zip	61500	31.5%	4.6%
gzip -9 + levensthein	61409	31.2%	4.7%
7zip + levenshtein	58536	29.7%	4.7%
7zip + lzdist	57923	29.4%	1.1%
LZMA	51423	26.1%	11.3%

Looking at relative compression, levenshtein vs lzdist, the difference 57923/58536 = 1.1% which seems worthwhile to me. In a large script, this can save 1-2 kilobytes. I threw in LZMA which boasts 12% better compression in this instance than DEFLATE.

Notes on implementation

The LZDIST function is expensive to compute in a naive fashion. It is similar to the longest common substring problem. The brute force version can be on the order of O(n^3), while a dynamic programming version can run in O(n^2). To make this run efficiently and not irritate your users, you need to use a Generalized Suffix Tree approach. Compressors like 7-zip are already using Suffix trees to implement the LZ77 search algorithm, so those of you looking for hints to do an efficient implementation should take a look at the LZMA SDK (both C++ and Java versions are available)

Conclusion

Utilizing clustering compression techniques, LZ distance, and 7-zip, it is possible to reduce the GWT Showcase application from 64401 bytes to 57923 bytes, a savings of about 10.1%.

-Ray

Monday, August 17, 2009

On Reducing the Size of Compressed Javascript (by up to 20%)

Recently, I've been studying ways of reducing the download size of Javascript applications produced by Google Web Toolkit, while preserving or improving startup time. There are a number of ways to do this, the first of which is to transform Javascript code into a form that is naturally more succinct, while eliminating unused code, or deferring the load of some code until later.

Unfortunately, there is a tradeoff inherent in some of the fancier ways of making JS more succinct that use run-time code generation. This is put to good use in libraries like jQuery. For a large application, like Google Wave, this would increase startup time. Another technique is to teach a compiler even better optimizations for reducing redundant code, work which is ongoing in Google Web Toolkit.

Other attempts to reduce JS size have looked at ways to remove extraneous whitespace, shorten identifiers, and tokenize/pack JS statements, and while the former two are win-win one-way transforms, the latter has the trade-off requirement of being a reversible transform, meaning that it has to be decoded by an additional JS stage after load, a step that hurts startup performance.

Instead, I was drawn to a reversible transform the browser already includes support for: gzip compression, and decided to ask the question: what effect does the large-scale structure of the JS output code have on the DEFLATE algorithm of GZIP which is used to serve up compressed script? The answer it turns out, is substantial.

Reversible Transforms

If we didn't care about startup performance, we could spend all the time in the world unpacking JS code by including a custom tailored compressor, perhaps the LZMA (Lempel-Ziv Markov Chaining) algorithm or PPM (Prediction by Partial Matching), but unfortunately, these algorithms would be very slow to run in Javascript.

That leaves us with the built in GZIP compression that most web browsers include support for. The question is, can we improve compression while remaining compatible with the browser's decoder? There is an existing example of improving GZIP by injecting a reversible transform: bzip2 and the Burrows-Wheeler Transform.

Which brings up an interesting idea, can we do something like BWT's sort but for Javascript, in a way that doesn't require an extra pass to 'undo' the sorting?

Deflate: A digression

Before answering that question, it would be helpful to look at some of the restrictions of the deflate algorithm, and how code ordering could affect the outcome.

The deflate algorithm is a combination of two compression algorithms: LZ77 and Huffman coding. Huffman coding is a variable length code technique where symbols are replaced with codes based on their frequency of occurrence. So for example, the most common letter in the English alphabet is the letter e, so replacing e with a short code, but giving q, the least frequent character, a longer code, leads to shorter text overall. Huffman encoding is order independent, so that 'eq' is the same length as 'qe'.

LZ77 on the other hand, is a sliding window compression algorithm based on replacing strings with backwards references to previous strings in the input. For example, the string "this is a test" contains the substring 'is' repeated twice in a row, separated by a space, so that the second occurance of 'is' can be replaced with a length (2 characters, and a backwards distance (-3 positions), called the length-distance pair. The compressor typically scans backwards in the input within a certain window (e.g. 8,192 characters or 32,768 characters) looking for matches and then encoding them as length-distance pairs. The compressor has some freedom as to how hard it will search for a match before giving up (something I'll get to later).

One important effect of the sliding window limit is that if two Javascript functions with common substrings are separated by more than this distance, they cannot be matched.

But DEFLATE has another trick up its sleeve. It encodes the output of the LZ77 algorithm using Huffman encoding, but uses one Huffman tree for the character literals and length codes, and another Huffman tree for the backwards distances.

Which suggests another potential gain is to try and arrange for the backwards distances to be both small, and frequently the same, so as it produce shorter Huffman codes.

A sort with no undo

Thus far, intuition would tell us that if we could rearrange the input in order to bring more closely matched text closer together, we might be able to push up compression ratios, but how to do this without something to reverse the sort? Fortunately, unlike BWT, we are not working on plain text, but machine readable program code. We already know of something that rearranges code and moves it around, it's called a compiler!

We would not want a one-way text compression sort which say, brings Hamlet's prologue, climax, and epilogue together and randomly rearranges the rest, but your browser has no problem running your Javascript code if function foo is declared after bar, or before it. Thus, as long as two statements do not have order-dependencies, we can arrange them freely, in fact, all top-level function declarations can be rearranged arbitrarily.

Sort by Clustering

So, we've finally arrived at the point in which we have to devise our sort algorithm. What we want to do is, for any function Foo ensure that the best-match for this function in the whole program appears closest within the sliding window that the compressor will use. But bringing two functions that are most similar together in a greedy algorithm fashion won't necessarily produce the optimal result, as it is possible that by moving function Bar closer to function Foo, you've moved it away from lots of other functions that were good matches as well. As an example, consider these strings:


"Hello World"
"Hello World is a common test string used."
...several thousand strings later including the phrase "hello world" ...
"Common test strings used are metasyntactic variables like Foo and Bar."
"Variables like Foo and Bar are very common."

If we selectively moved the second string close to the first, we might prevent it from matching the other good matches later, especially if the window fills up.

One idea I started to think about was to repurpose Document Clustering techniques towards code. Document Clustering is commonly used in information retrieval systems to find related documents. Typically, a document is encoded using some technique to measure word importance, such as representing each word by its term frequency inverse document frequency. Then, any two documents can be compared by some distance metric, for example, taking the tf-idf weightings of terms as a vector in N-space and computing the cosine between them.

In this case, we'd let each function be a separate document, and the entire program be like the corpus of documents. We'd then choose some encoding to weigh Javascript grammar nodes by importance in a way that would produce good LZ77 matches, and then proceed in a bottom-up clustering fashion. First, we'd construct all the pairs of functions which match best. Pick a function, pair it with its best match, call that Cluster 1. Pick another function, pair it with its best match, call that Cluster 2, and so on. After this procedure is done, pick a Cluster, and find its nearest Cluster (according to some metric) and pair them up in a Cluster of 4 functions. After that's done, pair up 4-Clusters into Clusters of 8, and so on, until the final cluster encompasses the whole program.

What's a good metric?

Ideally, a good metric for comparing two functions would take into account the way the GZIP compressor searches for matches and encodes them. It is an interesting theoretical question, but for practical implementation purposes, I needed something that performs reasonably well, now. One algorithm that is pretty good at finding strings of rough similarity, even in the presence of noise, is the dynamic programming edit-distance algorithm. It's deployed widely in one variant or another in the bioinformatics industry for gene sequence alignment (Smith-Waterman, HMMR, etc), but the version commonly used for general CS work is the Levenshtein Distance.

Results

Taking Levenshtein Distance as my metric, I produced a greedy variant as a patch to the GWT compiler. The greedy variant does not implement bottom-up clustering, but instead, sorts all functions by length first (suggested by GWT team member Lex Spoon), and then performs a linear scan over the sorted functions, picking the best match each time to the previous output. The source code is here. Remarkably, even this simple algorithm produces nice gains? How much? Well, if you do nothing else, it produces a 5-7% gain in GWT's Showcase application when compressed with gzip -9. But there's more that we can do!

Optimizing GWT's Obfuscation

When the GWT compiler is executed in obfuscated mode, it renames every single Javascript identifier in the whole program except for foreign Javascript. Ideally, you want the shortest id possible. Up until recently, GWT limited the first character of an identifier to an alphabet of 32-characters, and for strings of length 2 or more, it used base-64 characters. However, due to a clever patch by GWT community member Andriasyan (Issue #2448), the first character can actually be chosen from a base-54 alphabet. This has the effect of shrinking output size by up to 1.75% prior to compression.

We're not done yet! The GWT compiler has other tricks up its sleeve. It performs the renaming from the bottom-most scopes upwards, letting each scope reuse variable identifiers as they become free. However, it unfortunately did not insure that identifiers were picked in a stable order. Thus, a function of 3 variables could be declared as function foo(a,b,c){ or as function foo(b,c,a){. Obviously, this would lead to suboptimal compression since every function of 3 variables should have the same suffix (a,b,c){. The effect of making obfuscated identifier allocation have a stable sort order combined with the base-54 patch produces an incredible gain of 10.5% when compressed with gzip -9.

Choosing a different GZIP implementation

The deflate algorithm actually gives some leeway to the compressor implementor in terms of how matches are found. ZLIB's implementation on which GZIP is based is actually not the best implementation, although it might be the best patent unencumbered one. Rather, the inventor of the LZMA algorithm has his own DEFLATE implementation in his 7-zip utility, which produces 4% better output than gzip by my estimates.

Combining base-54/base-64 obfuscated identifier encoding, stable sort-order for identifier allocation, my greedy clustering-by-edit-distance sort algorithm, and 7-zip as a gzip-compatible compressor, yields an incredible 21% reduction of the Showcase application. On a large 500k Javascript application, this means an additional 100k bandwidth is saved, with no performance penalty!

Conclusion

A general purpose technique (Cromwell Clustering Transform? (CCT) :)) for compilers to rearrange code for compression efficiency (vs say, cache locality) has been presented, which achieves non-trivial compression efficiency gains in Javascript output from the GWT compiler. Some of these techniques can also be applied to hand-written Javascript as well and included in 3rd party JS minification utilities.

Addendum

From reading a description of the algorithm, it may be hard to visualize. Here is sample output from GWT's Showcase application:


function lU(a){gU();while(bU){bU.a.a.a.p=false;$wnd.alert(s5b+a+t5b);bU=bU.b}cU=null}
function vR(a){qR();while(lR){lR.a.b.a.p=false;$wnd.alert(s5b+a+t5b);lR=lR.b}mR=null}
function QR(a){LR();while(GR){GR.a.b.a.p=false;$wnd.alert(s5b+a+t5b);GR=GR.b}HR=null}
function mS(a){hS();while(cS){cS.a.b.a.p=false;$wnd.alert(s5b+a+t5b);cS=cS.b}dS=null}
function KS(a){FS();while(AS){AS.a.b.a.p=false;$wnd.alert(s5b+a+t5b);AS=AS.b}BS=null}
function gT(a){bT();while(YS){YS.a.b.a.p=false;$wnd.alert(s5b+a+t5b);YS=YS.b}ZS=null}
function PT(a){KT();while(FT){FT.a.b.a.p=false;$wnd.alert(s5b+a+t5b);FT=FT.b}GT=null}
function JU(a){EU();while(zU){zU.a.b.a.p=false;$wnd.alert(s5b+a+t5b);zU=zU.b}AU=null}
function fV(a){aV();while(XU){XU.a.a.a.p=false;$wnd.alert(s5b+a+t5b);XU=XU.b}YU=null}
function DV(a){yV();while(tV){tV.a.a.a.p=false;$wnd.alert(s5b+a+t5b);tV=tV.b}uV=null}

For any function, note that the one immediately following it contains large numbers of common substrings of length 3 or greater.

Thursday, May 14, 2009

Gwt Query and Timefire @ Google I/O

If you're attending Google I/O this year (and you should, it's not to late to Register, $50 for Academia!) I'm doing two sessions. The first:
Progressively Enhance Ajax Applications with Google Web Toolkit and GQuery

Google Web Toolkit
Ray Cromwell
Thu 1:15p - 2:15p
Don't throw away your Web 1.0 websites just yet. In this session, you'll learn how to take your existing websites, and layer AJAX on top of them using GQuery, a jQuery style library for GWT. Learn how GQuery benefits performance over existing solutions, increases productivity, and reduces defects by leveraging the GWT tool chain.

The second:
Building Applications With Google APIs

Google Web Toolkit
Ray Cromwell
Thu 3:45p - 4:45p
Google offers a wide variety of APIs in many domains that together form a complete platform, from authentication and authorization, cloud computing, and social networking, to visualization, mobile computing, and Google Web Toolkit. In this talk, we will walk though a complex application that integrates many APIs together, how each can solve a different need in your application, how you can share code between GWT, Android, and App Engine, and how you can monetize your application with Google Checkout.

In the first presentation, I'll detail the revamped and relaunched GWT Query library, a jQuery clone for GWT featuring fast implementation, terse syntax, type safety, and a few other tricks. If you've ever wanted to use GWT, but didn't want to program in a Desktop Widget metaphor, this talk is for you.

The second presentation sounds stale (I couldn't write more exciting copy, honest!) but it is infact, the story of Timefire, a software-as-a-service visualization/analytics platform we are building that leverages a number of Google APIs, from AppEngine, to GWT, to Android, OpenSocial, Checkout, and beyond. If you've got a need, Google has an API for it, and in my opinion, what they have is looking more and more like a complete platform, from client to cloud.

So if this sounds interesting, mark Thursday on your calendars, and come see the amazing things you can do with Google Web Toolkit.

-Ray

Friday, April 24, 2009

GWT RPC over arbitrary transports: Uber RPC

A frequent request that pops up in the GWT groups is how to run GWT RPC over non-XHR transport mechanisms, like script-tag injection, cross-domain POSTs, or OpenSocial Gadget's makeRequest().

There are a few ways to do this, including patching GWT itself, but the path of least resistance would be a module that you could inherit which would provide this functionality, without compromising or changing your client code in any way.

Step 1: Overwrite GWT ServiceInterfaceProxyGenerator

In order to handle alternate transport mechanisms, we need to take over generation of the RPC client stub that is created by the builtin generators, to do this, we add the following entries to a module file:


             class="com.google.gwt.user.rebind.rpc.UberServiceInterfaceProxyGenerator">
                        class="com.google.gwt.user.client.rpc.RemoteService"/>

This tells GWT to invoke our UberServiceIntefaceProxyGenerator whenever someone calls GWT.create(RemoteService.class). There are some package-protected methods we need access to, so we arrange for our class to be in the com.google.gwt.user.rebind.rpc package. Next, we want to modify the generator to allow substitution of arbitrary client stub superclasses.


<define-property name="gwt.rpc.proxySuperclass"
                     values="org_timepedia_uberrpc_client_RpcServiceProxy"/>
<set-property name="gwt.rpc.proxySuperclass"
                  value="org_timepedia_uberrpc_client_RpcServiceProxy"/>

This construct is used to pass a compile time parameter to the generator as to which class will be used as a client stub, org.timepedia.uberrpc.client.RpcServiceProxy. Here is our proxy generator source, which exists merely to redirect to UberProxyCreator


public class UberServiceInterfaceProxyGenerator extends Generator {

  @Override
  public String generate(TreeLogger logger, GeneratorContext ctx,
      String requestedClass) throws UnableToCompleteException {

    logger.log(TreeLogger.WARN, "Running UberProxyCreator", null);
    
    TypeOracle typeOracle = ctx.getTypeOracle();
    assert (typeOracle != null);

    JClassType remoteService = typeOracle.findType(requestedClass);
    if (remoteService == null) {
      logger.log(TreeLogger.ERROR, "Unable to find metadata for type '"
          + requestedClass + "'", null);
      throw new UnableToCompleteException();
    }

    if (remoteService.isInterface() == null) {
      logger.log(TreeLogger.ERROR, remoteService.getQualifiedSourceName()
          + " is not an interface", null);
      throw new UnableToCompleteException();
    }

    UberProxyCreator proxyCreator = new UberProxyCreator(remoteService);

    TreeLogger proxyLogger = logger.branch(TreeLogger.DEBUG,
        "Generating client proxy for remote service interface '"
            + remoteService.getQualifiedSourceName() + "'", null);

    return proxyCreator.create(proxyLogger, ctx);
  }
}

Step 2; Override the superclass of the generated client proxy stub

This source was mostly copied unchanged from the original, except for the line which calls UberProxyCreator. The bulk of the work is done there. Again, I copied the source, but made just a few changes to the routine which creates the SourceWriter


private SourceWriter getSourceWriter(TreeLogger logger, GeneratorContext ctx,
      JClassType serviceAsync) {
    JPackage serviceIntfPkg = serviceAsync.getPackage();
    String packageName = serviceIntfPkg == null ? "" : serviceIntfPkg.getName();
    PrintWriter printWriter = ctx
        .tryCreate(logger, packageName, getProxySimpleName());
    if (printWriter == null) {
      return null;
    }

    ClassSourceFileComposerFactory composerFactory
        = new ClassSourceFileComposerFactory(packageName, getProxySimpleName());

    String[] imports = new String[]{RemoteServiceProxy.class.getCanonicalName(),
        ClientSerializationStreamWriter.class.getCanonicalName(),
        GWT.class.getCanonicalName(), ResponseReader.class.getCanonicalName(),
        SerializationException.class.getCanonicalName()};
    for (String imp : imports) {
      composerFactory.addImport(imp);
    }

    String rpcSuper = null;
    try {
      // retrieve user-defined superclass from module file
      rpcSuper = ctx.getPropertyOracle()
          .getPropertyValue(logger, "gwt.rpc.proxySuperclass");
      if (rpcSuper != null) {
        rpcSuper = rpcSuper.replaceAll("_", ".");
      }
    } catch (Exception e) {
    }

    // allow defining a custom superclass to customize the RPC implementation
    composerFactory.setSuperclass(rpcSuper);
    composerFactory.addImplementedInterface(
        serviceAsync.getErasedType().getQualifiedSourceName());

    composerFactory.addImplementedInterface(
        serviceAsync.getErasedType().getQualifiedSourceName());

    return composerFactory.createSourceWriter(ctx, printWriter);
  }

Step 3: Write your own Proxy

This is where you come in, since you have to decide how you're going to transport the RPC payload, such as putting it as a URL GET parameter, a POST parameter, or using an OpenSocial container. Here is some skeleton code showing you how to override the doInvoke method. This example is pseudo-code for how you'd do it using gadgets.io.makeRequest() in an OpenSocial container.


public class RpcServiceProxy extends RemoteServiceProxy {

  protected GadgetRpcServiceProxy(String moduleBaseURL,
      String remoteServiceRelativePath, String serializationPolicyName,
      Serializer serializer) {
    super(moduleBaseURL, remoteServiceRelativePath, serializationPolicyName,
        serializer);
  }

  static boolean isReturnValue(String encodedResponse) {
     return encodedResponse.startsWith("//OK");
   }

   /**
    * Return true if the encoded response contains a checked
    * exception that was thrown by the method invocation.
    * 
    * @param encodedResponse
    * @return true if the encoded response contains a checked
    *         exception that was thrown by the method invocation
    */
   static boolean isThrownException(String encodedResponse) {
     return encodedResponse.startsWith("//EX");
   }

  public static final String RPC_PAYLOAD_PARAM="rpcpayload";
  
  @Override
  protected  Request doInvoke(
      final RequestCallbackAdapter.ResponseReader responseReader, String methodName, int invocationCount,
      String requestData, final AsyncCallback tAsyncCallback) {

    try {
      makeRequest(getServiceEntryPoint(), "text/x-gwt-rpc; charset=utf-8", requestData, new AsyncCallback() {
        public void onFailure(Throwable throwable) {
           tAsyncCallback.onFailure(new InvocationException("Unable to initiate the asynchronous service invocation -- check the network connection"));
        }

        public void onSuccess(String encodedResponse) {
          try {
            if(isReturnValue(encodedResponse)) {
              tAsyncCallback.onSuccess((T) responseReader.read(createStreamReader(encodedResponse)));
            }
            else if(isThrownException(encodedResponse)) {
              tAsyncCallback.onFailure((Throwable)responseReader.read(createStreamReader(encodedResponse)));
              
            }
            else {
              tAsyncCallback.onFailure(new InvocationException("Unknown return value type"));
            }
          } catch (SerializationException e) {
            tAsyncCallback.onFailure(new InvocationException("Failure deserializing object "+e));
          }
        }
      });
    } catch (Exception ex) {
      InvocationException iex = new InvocationException(
          "Unable to initiate the asynchronous service invocation -- check the network connection",
          ex);
      tAsyncCallback.onFailure(iex);
    } finally {
      if (RemoteServiceProxy.isStatsAvailable()
          && RemoteServiceProxy.stats(RemoteServiceProxy.bytesStat(methodName,
          invocationCount, requestData.length(), "requestSent"))) {
      }
    }
    return null;
  }

  private native void makeRequest(String serviceEntryPoint, String contentType,
      String requestData, AsyncCallback tAsyncCallback) /*-{
      var params = {};
      params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.TEXT;
      
      params[gadgets.io.RequestParameters.AUTHORIZATION]=gadgets.io.AuthorizationType.SIGNED;
      params[gadgets.io.RequestParameters.METHOD]=gadgets.io.MethodType.GET;
      
      gadgets.io.makeRequest(serviceEntryPoint+"?"+@org.timepedia.uberrpc.client.UberRpcServiceProxy::RPC_PAYLOAD_PARAM+"="+encodeURIComponent(requestData), function(resp) { 
          if(resp.errors && resp.errors.length > 0) {
            tAsyncCallback.@com.google.gwt.user.client.rpc.AsyncCallback::onFailure(Ljava/lang/Throwable;)(null)
          }
          else {
          tAsyncCallback.@com.google.gwt.user.client.rpc.AsyncCallback::onSuccess(Ljava/lang/Object;)(resp.text);
          }
      }, params);
  }-*/;
}

Step 4: Modify RemoteServiceServlet

The last step is to modify RemoteServiceServlet so that it understands the new transport formats you've devised. Here's an example of one that would handle the incoming OpenSocial makeRequest(). This one handles GET or POST requests with the incoming payload as a form parameter.


public class GadgetServiceServlet extends RemoteServiceServlet {

  @Override
  protected void doGet(HttpServletRequest httpServletRequest,
      HttpServletResponse httpServletResponse)
      throws ServletException, IOException {
    doPost(httpServletRequest, httpServletResponse);
  }

  @Override
  protected String readContent(HttpServletRequest httpServletRequest)
      throws ServletException, IOException {
    String str = httpServletRequest.getMethod().equals("POST") ? RPCServletUtils
        .readContentAsUtf8(httpServletRequest, false) : httpServletRequest
        .getParameter(RpcServiceProxy.RPC_PAYLOAD_PARAM);
    String ustr = URLDecoder.decode(str);
    return ustr;
  }
}

To use, you'd just subclass this servlet.

The Sky's the Limit

Once you get to this point, you can start imagining all of the cool things you can do. Cross-domain POSTs using the window.name trick. OAuth signed RPC requests, verified in the servlet, done by OpenSocial containers. FaceBook integration. RPC over JSON, Protocol Buffers, Thrift. I'm trying to cobble together some of this stuff for a future UberRPC module, but due to talks I'm giving at the upcoming Google I/O, I'm a little too swamped to make them release worthy at this point.

Much credit goes to Alex Epshteyn for his original proposal on the GWT Contributors list, which I picked up (over an inferior method I had been using to make Gadgets work), and integrated into a more graceful override of the default RPC behavior.

If you're going to Google I/O, I will be doing two talks this year. One on Progressive Enhancement using GWT and a new library I've written, and a second on Building an Application on Google's Open Stack, which is a walkthrough of a sophisticated GWT app which leverages about a dozen Google APIs.

-Ray

Tuesday, April 7, 2009

Small update to GWT Exporter

I recently discovered a pretty critical bug in GWT Exporter that can cause an infinite loop doing export. This was fixed in version 2.06 which available in the trunk and maven repository.

-Ray

Google AppEngine and GWT now a marriage made in heaven

The announcement that Google AppEngine now supports Java is incredible news. Not just because it opens the doors to running arbitrary JVM languages like Scala, JRuby, PHP, etc on AppEngine, but because of the ability to wire up Java on the client, and Java on the server, through Google Web Toolkit. You can use all of your familiar Java tools for editing, debugging, testing, and packaging.

With the new system, you can write a POJO, add JPA or JDO annotations, and write server-side logic to persist these POJOs in either a RDBMS like MySQL, or in BigTable/AppEngine. Moreover, you can export your DAO or logic interfaces through GWT RPC, and call them directly from the client, seamlessly, and painlessly.

Almost Painlessly

The one hitch you'll encounter as a GWT developer is trying to serialize or deserialize persistence capable types. This is nothing new for GWT developers who have tried this with Hibernate before, and there are workarounds such as Hibernate4GWT. This problem occurs because the persistence classes are enhanced with an extra field to hold state which enables them to work when detached from the persistence context. GWT RPC computes its own CRC based on the fields of a class in order to ensure compatibility between server and client and the extra field causes problems.

In general, when it comes to sending serialized ORM POJOs down the wire, I think it's a risky practice, because you're likely to pull in a lot more of the reachable object tree than you bargained for unless you're careful. A better approach might be to use DTOs based on ProtocolBuffers.

However, it is sometimes nice to do it if your POJOs are relatively flat and you want to rapidly prototype. If your insist on using your ORM'ed POJOs over RPC, there is a trick to making it work.

Making JDO/JPA enhanced classes work over GWT RPC

The first step to making things work is to disable detachable objects and tag your class as serializable.


@PersistenceCapable(identityType = IdentityType.APPLICATION, detachable = "false")
public class MyPojo implements Serializable {

}

This does two things. First, it tells the persistence engine that you'll be managing object identity, usually through a primary key, and secondly, when your POJO is accessed outside of a transaction/session, you want it to be transient not detached. A detached object remembers where it came from, so that after modifications, it can be reattached and merged back into the datastore. A transient object forgets that it once came from the datastore, and thus if you try to repersist it, you'll end up inserting a copy.

This has a major downside in terms of ease of use, but it does prevent the enhancer from injecting hidden fields into your class to manage detached state, and it is these hidden fields which break GWT RPC compatibility.

But I don't want copies!

In using transient objects, you'll break a desired design pattern, which is to fetch an object through RPC, modify its properties, and send the same object back through RPC to be merged into the datastore. A quick and dirty work around is to use reflection to lookup an attached object on the server, copy all of the persistent fields from the transient object, and then merge the persistent object. Here's a prototype class that does this (but not recursively, so it doesn't handle anything but primitive fields):


public class PersistenceHelper {
  public static  Object findPrimaryKey(T tInstance) {
    if (tInstance == null) {
      return null;
    }
    for (Field l : tInstance.getClass().getDeclaredFields()) {
      if (l.getAnnotation(PrimaryKey.class) != null
          || l.getAnnotation(Id.class) != null) {
        l.setAccessible(true);
        try {
          return l.get(tInstance);
        } catch (IllegalArgumentException e) {
          e.printStackTrace();
          return null;
        } catch (IllegalAccessException e) {
          e.printStackTrace();
          return null;
        }
      }
    }
    return new IllegalArgumentException(
        "Class " + tInstance.getClass().getName()
            + " does not have a method called getId()");
  }

  public static  void copyPersistentFields(Object entity, T tInstance)
      throws IllegalAccessException, NoSuchMethodException,
      InvocationTargetException {
    for (Method f : tInstance.getClass().getMethods()) {
      if (f.getName().startsWith("set") && Character
          .isUpperCase(f.getName().charAt(3))) {
       f.setAccessible(true);
        Method getter = tInstance.getClass()
            .getMethod("get" + f.getName().substring(3));
        getter.setAccessible(true);
        f.invoke(entity, getter.invoke(tInstance));
      }
    }
  }
}

The way you'd typically use this is as follows:


public  T mergeTransient(T tInstance) {
    EntityManager e = em.get();
    if(e.contains(tInstance)) {
      e.persist(tInstance);
      return tInstance;
    } else {
      Object primaryKey = PersistenceHelper.findPrimaryKey(tInstance);
      if(primaryKey != null) {
        Object entity = e.find(tInstance.getClass(), primaryKey);
        if(entity == null) {
          e.persist(tInstance);
          return tInstance;
        }
        else {
          try {
            PersistenceHelper.copyPersistentFields(entity, tInstance);
          } catch (IllegalAccessException e1) {
            e1.printStackTrace();
            throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.");
          } catch (NoSuchMethodException e1) {
            throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.");
          } catch (InvocationTargetException e1) {
            throw new IllegalArgumentException("Can't copy fields from transient class to persistent class.");
          }
          e.persist(entity);
          return (T) entity;
        }
      } else {
        // primary key may be null, assume insert
        e.persist(tInstance);
        return tInstance;
      }
    }
  }

Less than ideal

After experimenting with this pattern, I've come to the conclusion that although it works, I don't feel warm and cozy serializing instances out of the datastore, I like to have full control over what I'm sending down to the client so I can optimize for size and speed. However, I don't want to write boilerplate sychronization code for DTOs. In a later article, I'll detail a pattern for using ProtocolBuffers with GWT and a DSL for terse/concise manipulation of them.

It's still awesome

Even though there are some issue surrounding using RPC seamlessly with the Datastore ORM later, it completely trumps the time saved not having to do ANY configuration AT ALL to deploy an application. Words cannot describe how much of a time saver this is. No messing around with apt-get. No editing Apache configs. No Setting up log rotation and archiving. No dealing with backups. No bother with figuring out the right way to shard your db for your expected growth. No need to harden your own machines and firewalls. No need to provision anything but the application ID.

To be sure, there are still things you can't do on AppEngine. I can't run JNI. I can't launch threads. I can't use Java2D/JAI/JavaSound. And probably most relevant, I can't host long-running comet sessions. And if you really really need to do those, you can rent a server somewhere to do it.

However, the majority of applications don't use these capabilies, and for these developers, AppEngine is an epic win.

Thursday, April 2, 2009

GWT's type system is more powerful than Java

This may come as a shock to some people. Isn't GWT just Java syntax you say? Yes, it is, and it does not extend the Java grammar in anyway. Yet, it is nonetheless true that GWT is more powerful than Java.

The Overlay Type

The reason why GWT is more powerful is because it is actually a unification of two type systems, Java and Javascript. And while it seems that these are relatively walled off from one another, there is a bridge that unites the two, and that is the GWT Overlay Type system.

The overlay system in essence, permits the 'overlay' or attachment of Java types to Javascript objects. And since you can pass Java objects into Javascript and back, this means you can in fact, overlay types on Java as well.

Categories and Extension Methods

Some languages have a facility called Categories or Extension Methods, modern examples include Objective-C, Groovy, and C#. A category allows you to pretend as if an existing reference implements additional methods, when it actually doesn't.

In reality, they are syntax sugar for invocation of static utility functions. That is:


SomeType x = new SomeType();
SomeTypeUtils.doSomething(x);

becomes


SomeType x = new SomeType();
x.doSomething();

Transparently, behind the scenes, the compiler rewrites the invocation of x.doSomething() into SomeTypeUtils.doSomething(x). GWT's JavaScriptObject overlays are essentially Categories, as the compiler transparently rewrites methods that appear to exist on an Overlay into static method calls on the underlying JavaScriptObject. This is one reason why JSO methods have to be effectively final, as there is no polymorphism allowed.

This all sounds very interesting and theoretical, but what's the practical benefit?

Type-Safe Enumerations with Zero Overhead

Let's say you want to write a method that can set the display property of an element to one of the legal values, and only the legal values. Traditional approaches would include using a Java enum and writing a setter method that accepts only this type:


enum DisplayType {
   BLOCK, INLINE, TABLE, NONE;
}

public static void setDisplay(Element e, DisplayType t) {
   e.getStyle().setProperty("display", t.name().toLowerCase());
}

Unfortunately, this will generate a lot of bloat, since each enum value is a class, the class must be initialized by a static initializer, and the ultimate CSS property value string has to be obtained through method calls that might not inline because of polymorphic dispatch.

Think about what we want here. Don't we really just want to create a subclass of java.lang.String, create a bunch of this String subclass constants, and write a method to accept that type? Unfortunately, java.lang.String is final. You can't subclass it in Java.

But you can in GWT, and I'll show you how!

Turn a String into a JSO


public class DisplayType extends JavaScriptObject {
   protected DisplayType() {}
   public native static DisplayType make(String str) /*-{
      return str;
   }-*/;

   public native String value() /*-{
      return this;
   }-*/;
}

For brevity, I left out the extra code to support Hosted Mode. In hosted mode, you must wrap the 'str' argument in an array, e.g. [str] and fetch it using return this[0], but that's not important. What this code is effectively doing is casting a String into a DisplayType and imposing an additional categorical method on this reference, which is value(). Keep in mind, the value() is never actually attached to the prototype of the underlying Javascript object.

To use,


DisplayType NONE = DisplayType.make("none");
DisplayType BLOCK = DisplayType.make("block");
DisplayType INLINE = DisplayType.make("inline");
DisplayType TABLE = DisplayType.make("TABLE");

Now, what is the output of the GWT compiler when compiling element.getStyle().setProperty("display", BLOCK.value())? Here it is:


element.style['display']='block';

In fact, even calling the setDisplay() method with various DisplayType enums results in inlined assignments to the element.style property with no method calls!

Adding methods to Numbers

In Groovy, Scala, and some languages, you can even add methods to primitive integers. Using GWT overlay types, you can even do this!


 public class Int extends JavaScriptObject {
    protected Int() {}
    public static native Int make(int x) /*-{
      return x;
    }-*/;
    
    final public native int value() /*-{
      return this;
    }-*/;
    
    final public Int square() {
      return make(value() * value());
    }
  }

 int val = (int) Duration.currentTimeMillis();
 Int x = Int.make(val);
 Int sq = x.square();
 Window.alert(String.valueOf(sq.value()));

And what do you think the generated code looks like? Try this:


val = (new Date()).getTime();
x = val;
sq = x * x;
$wnd.alert('' + sq);

There you have it. I successfully added a method to a primitive 'int' called square() without using a wrapper, and with no overhead whatsoever. This opens the doors to implementing primitive 'wrappers' for GWT for int, double, float, etc, that are not wrappers at all and have no overhead, which would be very useful in many circumstances where Integer, Double, Float, in java.lang are too heavy.

So, can we conclude from this that GWT is more powerful than Java? ;-)

-Ray

Thursday, March 12, 2009

Relaxing constraints on GWT.create()

Recently I've been playing around with some GWT ideas that really cry out for a more liberal deferred binding system. Currently, GWT imposes the restriction that deferred binding can only happen through the GWT.create() method. There's a couple of problems with this:

Can't narrow type signature for custom library

Can't create a method to decorate the returned result

Can't parameterized or override the bindings at callee site

To illustrate this, I've compiled some motivation examples.

RPC requests signed by OAuth


interface MyService extends RemoteService<MyServiceAsync> { ... }
// example call
OAuth.withSignature(MyService.class).method1(arg1, arg2, asyncCallback);


public class OAuth {
  @GwtCreate  // method must be static, class parameter must be final
  public static <S, T extends RemoteService<S>> S withSignature(final Class<T> service) {
        // GWT.create with non-literal ONLY allowed 
        // if enclosing method is @GwtCreate and 
       // variable statically resolvable to literal param
        S async = GWT.create(service);  
        ((ServiceDefTarget)async).setRequestBuilderCallback(new OAuthRequestBuilderSigner());
        return async;
  }
}

Currently today, it would look like this:


MyServiceAsync foo = GWT.create(MyService.class);
RequestBuilder rb = foo.method1(arg1, arg2, asyncCallback);
OAuthRequestBuilderSigner.sign(rb);
rb.send();

The tight coupling at the callsite also makes it difficult to swap out implementations easy (like using AuthSub, or one of the other 4 Google Friend Connect auth techniques). An alternative is to make a separate subtype of each with a custom generator, e.g.

MyServiceOAuth extends MyService, GWT.create(MyServiceOAuth.class)

I'd argue that the above gets cumbersome with multiple services and multiple authentication types. I've taken some liberties above by adding a type parameter to RemoteService to make the return type statically resolvable, as well as allowing a global callback mechanism on ServiceDefTarget for RequestBuilder override. Note that type-safety is ensured, one can't call this method with something that is not a RemoteService, and it will be flagged at edit-time in your IDE.

An EasyMock library for Hosted/Web Mode


Subscriber mock = GMock.mock(Subscriber.class);
publisher.add(subscriber);
//...
GMock.replay(mock);

public class GMock {
  // selectively override module binding rules ONLY for this method
  @GwtCreate(generator=com.gmock.rebind.GMockGenerator.class)
  public static <T> T mock(Class<T> toMock) {
        return GWT.create(toMock);
  }
}

In this method, the mock() method acts like GWT.create() except that it overrides the current binding rules, forcing the specified generator.

GWT Exporter


Exporter.export(Foo.class);

public class Exporter {
   @GwtCreate
   public static <T extends Exportable> void export(Class<T> exportable) {
       ExporterImpl ximpl = GWT.create(exportable);
       ximpl.export();
   }
}

Note, the type safety, one can't try to Export a non-exportable class. This will be flagged at edit time in the IDE.

GIN/Guice dependency injection


Processor pimpl = Gin.inject(Processor.class, TestProcessorModule.class);

public class Gin {
  @GwtCreate(generator = com.google.gin.rebind.GInjectorGenerator)
  public static <T, S extends AbstractGinModule> 
      T inject(Class<T> interf, @GwtCreateParam Class<S> module) {
         return GWT.create(interf, module);
  }
}

This one is more controversial, but allows GWT.create() to be parameterized by literal metadata that is available to the generator. This is semantically equivalent to what we have today:


@GinModule(TestProcessorModule.class)
interface MyTestInjector extends GInjector {
   Processor getProcessor();
}

but without the need to actually write the interface.

Combing compile-time and run-time parameters

Final example, mix-and-match both compile-time variables and run-time variables:


OAuth.withSignature(MyService.class, debugMode ? "/debugService" : "/MyService").method1(arg1, arg2, asyncCallback);

public class OAuth {
  @GwtCreate  // method must be static, class parameter must be final
  public static <S, T extends RemoteService<S>> 
        S withSignature(final Class<T> service, String endPoint) {
        // GWT.create with non-literal ONLY allowed if enclosing method is 
        // @GwtCreate and variable statically resolvable to literal param
        S async = GWT.create(service);  
        ((ServiceDefTarget)async).setRequestBuilderCallback(new OAuthRequestBuilderSigner());
        ((ServiceDefTarget)async).setServiceEntryPoint(endPoint);
        return async;
  }
}

Friday, March 6, 2009

Spring cleaning and the blog

To try and improve readability and reach, I made a few changes today:

1) Re-enabled SyntaxHighlighter for code snippets
2) Widened the width of the main content to 600px (for code blocks)
3) removed Twitter and added Google Friend Connect
4) enabled CAPTCHA to reduce comment span
5) tweak some colors so the subsections are easier to read.

If you have any other advice for improving the Look and Feel, leave a comment.

-Ray

Thursday, March 5, 2009

Structural Typing for GWT and Javascript

In a previous post I described a type of impedance mismatch between Javascript and Java idioms that makes the GWT Exporter still less than ideal for supporting Javascript programmers. A short illustrative (but contrived) example:


public class Customer implements Exportable {
   private String firstName, lastName;
   @Export String getFirstName() { 
      return firstName; 
    }

   @Export void setFirstName(String fn) {
      firstName = fn;
   }

   @Export String getLastName() { 
      return lastName; 
    }

   @Export void setLastName(String ln) {
      lastName = fn;
   }
}

Today, you may use this export in Javascript like this:


var cust = new Customer();
cust.setFirstName("Ray");
cust.setLastName("Cromwell");
Processor.doSomething(cust);

The example is contrived because you could use a constructor, but with more complex objects with nested types, you wouldn't use a constructor, but either a builder pattern, or inject the types after construction.

Javascript developers however don't work in the world of Javabean interfaces, they prefer easy construction of configuration/builder information via object literals:


Processor.doSomething({firstName: "Ray", lastName: "Cromwell"})

Moreover, when passing in say, a bind of event callbacks, they'd prefer to write:


foo.addEvents({
    click: function(e) { ... }
    move: function(e) { ... }
    drop: function(e) { ... }
});

The challenge is to seemlessly bridge this idiomatic mismatch between JS and Java GWT code without actually having to write much bridging code or adapters.

Structural Typing

Java is a manifestly typed language. All types have to be declared, and type checking is done by explicit hierarchy. Javascript is a dynamically typed language, with essentially no type checking at all. Orthogonal to this is the concept of Structural Typing. Haskell, ML, and Scala are all examples of languages which support structural typing. Structural typing was also planned for Javascript 2 before it got killed.

So what's a structural type? Recall the Customer example from the previous section. It was a class with two fields, firstName, and lastName, both Strings. If Java supported a structural type system, I could declare a method in two ways:


public void process(Customer cust) { ... }

public void process({firstName: String, lastName: String} cust) { ... }

In this invented syntax, the cust parameter to the second process() function is an anonymous type, we don't know its real name. However, we are stating that as long as it consists of two fields named "firstName", and "lastName", and the types are both Strings, then we can access these fields and treat it like a Customer (although it may not be one)

Hmm....I smell an idea...

Structural Type Exports

What if I rewrite the Customer POJO class with a @StructuralType annotation:


@Export
@StructuralType
public class Customer implements Exportable { ... }

The dispatch code for an exported process() function could then look like this:


$wnd.Processor.prototype.process = function(cust) {
  if(cust instanceof $wnd.Customer && isExportedInstance(cust)) {
    // JSNI dispatch to @Processor::process(LCustomer;)(unwrap(cust));
   }
  else if(typeof(cust) == 'object') {
      // cust is not an instance of the Customer POJO, but an object
      if(cust.firstName != undefined && cust.lastName != undefined) {
          var scust = new Customer();
          scust.setFirstName(cust.firstName);
          scust.setLastName(cust.lastName); 
          @Processor::process(unwrap(scust));
      }
   }
}

The GWT compiler would auto-inject this structural type check and initialization code simply by annotating a parameter or return type involved with @StructuralType

Refinements to the idea

Taking this a step further, one could override the expected type literal field names to be checked


@SType("fn")
public void setFirstName(String firstName) { ... }
@SType("ln")
public void setLastName(String lastName) { ... }

which would allow the object literal to be specified as {fn: "Ray",ln: "Cromwell"}. Another extension would allow partial matches to succeed with default values supplied:


@Optional("Ray")
public void setFirstName(String firstName) { ... }

which would reduce the structural type to just a lastName field for matching purposes, but would allow the specification of firstName to be supplied and injected into the setter as "Ray" if it wasn't present.

Structurally Typed Interfaces

GWT Exporter already supports closure conversion for single-method Java interfaces. That is, if the following interface:


public interface MyCallback {
     public void go(Object arg);
}

occurs as a type parameter in an exported method, the Javascript developer may supply function(arg) { ... } and the GWT generated bridge code will automatically convert this into a GWT object instance that implements the MyCallback interface type. What it cannot support is the example given earlier:


foo.addEvents({
    click: function(e) { ... }
    move: function(e) { ... }
    drop: function(e) { ... }
});

which would represent something like:


public interface MyEventHandler {
    public void onClick(Event e);
    public void onMove(Event e);
    public void onDrop(Event e);
}

However, using structural typing conventions, and an @StructuralType annotation on an interface, we can auto-convert object literals containing multiple function closures into a Java interface for GWT.

I am still working out the details of the full overload resolution algorithm for overloaded function types, but some version of this proposal will make it into GWT Exporter 3.0.

-Ray

GWT Exporter 2.05 released

The release adds two new features:

1) arrays are return types or parameters supported, that is:


@Export public double[] getFoo() { }
@Export public SomeExportable[] getBar() {}
@Export public void setFoo(double[] foo) { }
@Export public void setBar(SomeExportable[] bar) { }

now works, in both Web mode and Hosted Mode.

2) Export Overlays implemented. This means you can now Export classes you don't own.


@ExportPackage("gwt")
public interface WindowExport extends ExportOverlay<Window> {
   @Export void alert(String s);
}

will export the GWT Window class into the namespace $wnd.gwt.Window, and export the alert() method of this class.

The JARs are downloadable from: http://timefire-repository.googlecode.com/svn/mavenrepo/org/timepedia/exporter/gwtexporter/2.05/

Tuesday, March 3, 2009

What's next for GWT Exporter?

I have learned over the past year that a lot of Chronoscope users like to use the Javascript API, but the current API is unfortunately, a small sliver of the full power the GWT API offers. In dealing with increasing demand for more exported functions, I have run into some basic limitations that GWT-Exporter has that make it less than ideal for exporting idiomatic Javascript APIs.

Exporting Classes you don't control

Since Chronoscope made the switch to the new GWT Incubator/GWT 1.6 Event handling system, I have been unable to export event handler functions, because the HandlerRegistration type returned by the GWT event system is not an interface, making it difficult to wrap. In general, if you don't own a class, you can't add Exporter annotations to it, which is problematic because there are lots of types you don't control.

A new mechanism will be added in an upcoming GWT Exporter release, I dub Export Overlays, which will work something like this:


@ExportPackage("gwt")
public interface WindowExport extends ExportOverlay<Window> {
    @Export void alert(String message);
}

In the above code, we are exporting the GWT Window class in the Javascript package "gwt", but only exposing the alert() method. This will result in an exported wrapper in $wnd.gwt.Window with the method published on $wnd.gwt.Window.prototype.alert.

Fine control over Javascript->Java type conversions

Javascript doesn't support overloaded method dispatch as a language feature, but most Javascript libraries do use heavily overloaded functions with variable arguments and variable argument types. Currently, GWT Exporter supports Javascript overloading by arity, but not by type, so two Java methods of the same name and identical arity cannot be exported without overriding the exported name.

Some of this will be mitigated by adding in automatic type checking for handling overloaded dispatch, but more is needed. Consider the following Javascript API:


plot.addHandler({
    onMove: function(event) {  }
    onHover: function(event) { }
    onFocus: function(event) { }
});

This is somewhat idiomatic Javascript, leveraging object literals to pass in multiple bindings. The current approach is to have 3 methods, addMoveHander, addHoverHandler, addFocusHandler, but this doesn't feel quite right to some Javascript programmers. Another problem is illustrated below:


@Export
public class Foo implements Exportable {
   public void setFoo(Date date) { }
}

A Javascript developer would naturally want to write:


var date = new Date();
foo.setFoo(date);

But this won't work, because there is no automatic conversion in GWT Exporter between JS Date and java.util.Date. I could add this (as was done with JS String and java.lang.String), but that still leaves open the problem of the inability to pass Javascript types from third party Javascript libraries into GWT methods.

Type Converter injection

What I'm currently looking at is a mechanism to declare Javascript fragments on Java methods which can type-convert arguments between the exported wrapper, before they invoke the underlying GWT method. Something like this:


@Export
public class Foo implements Exportable {
  public void setFoo(@Convert(JsType.DATE, "new java.util.Date($_arg.getTime()$)") Date foo) { }   
}

The syntax is still subject to change of course, but the idea is that you supply an annotation on each param, using an enum (STRING, FUNCTION, ARRAY, OBJECT, etc) which corresponds to the return value of the 'typeof' operator, or you use a special INSTANCE version which allows you to specify a match according to 'instanceof'. Next, you supply a Java expression), including inline JSNI code (here shown in between $ $). The GWT Exporter then arranges to generate code to check the type, and inject your fragment before calling the underlying method.

Another possibility is to add JSNI methods to Export Overlays which allow arbitrary conversion code to be specified and injected into the wrapper prior to invocation of methods.

By combining this with the Export Overlay concept, it might be possible to deal with the idiomatic JS Literal pattern, while keeping JSNI methods out of classes that may not want them, or that you do not control.

Primitive arrays an oversight

Finally, GWT Exporter currently does not support export of parameters or return types which are native Java arrays, which is a hole in the export type system. This will be added in a near future release.

-Ray

Thursday, February 19, 2009

Small demo of Timelord

Timefire is building a set of social collaboration and visualization tools to help people understand data better, and part of this has been the development of Chronoscope/Timescope, but an even larger part of this effort has been the development of an end user application that permits users to play with data, annotate it, and share their activity with friends.

Timelord is the codename of our application, which runs as an embeddable widget, a Google Gadget, a Spreadsheet Gadget, or as a native Android application. It leverages OpenSocial when possible, to allow users to share activities, but also integrates with other contact sources, like GData Contacts API, Android address book, etc.

One of the more important features of Timelord is the ability for users to create their own "mini-presentations" which script tours of the data. These mini-presentations are simply a special comment syntax, like Wiki-text or BBCode, to instruct the chart to perform actions which would normally be done programmatically, like adding markers, or zooming.

A cool feature, I think, is the ability to do 2-way synchronization between these mini-presentations and YouTube video, which is demonstrated in the screencast below, courtesy of our friend Al Gore, and the YouTube Player API.

Take it away, Mr Gore (best if watched in HD):

Watch it in HD here

UPDATE: YouTube seems to have disappeared my HD version, so there is an HD version mirrored at http://vimeo.com/3292149

I don't think I have to mention that this is all done in GWT? :)

-Ray

Friday, January 30, 2009

GWT Exporter now generates Javascript documentation

I added a rough implementation of a Javadoc Doclet (my first!) to the exporter trunk. It is an extension of the Standard Doclet, so it not only generates Javadoc for your GWT classes, but automatically emits a file called "jsdoc.html" that includes documentation for every @Export in your source base. This allows you to automatically produce documentation for Javascript programmers from GWT source. I use the Maven Javadoc Plugin to execute this, but you can use ant tasks or the command line as well.

It's quite ugly at the moment, no CSS stylesheet, but I'm working on a basic one. I'm happing to take any CSS donations from those who want to check out the trunk and play with it.

Code project is : http://code.google.com/p/gwt-exporter/

-Ray

Tuesday, January 27, 2009

Screencast of Timescope, the commercial version of Chronoscope

Hi Folks,
I've uploaded a screencast of Timescope, which is the commercial version of Chronoscope, to Vimeo. This is a sneak peak at some of the features in the upcoming release of Timescope, as well as Chronoscope 1.0. The chief differences between Timescope and Chronoscope, are a more accurate multiresolution algorithm that preserves data better when zooming out, new marker and renderer types (CandleStick charts for example), new marker types like Image, Axis, and Callout markers, iPhone support, a native Android version, as well as a Chart Server.

The screencast is at: http://vimeo.com/2983154, I'd embed it here, but it has to be watched in HD in order to see what's going on.

-Ray

Tuesday, December 8, 2009

Structural Typing

The future

Thursday, November 19, 2009

GZip vs 7-zip

LZ Distance

LZDIST and GZIP

Clustering with LZDIST

Results for GWT Showcase Application

Notes on implementation

Conclusion

Monday, August 17, 2009

Reversible Transforms

Deflate: A digression

A sort with no undo

Sort by Clustering

What's a good metric?

Results

Optimizing GWT's Obfuscation

Choosing a different GZIP implementation

Conclusion

Addendum

Thursday, May 14, 2009

Friday, April 24, 2009

Step 1: Overwrite GWT ServiceInterfaceProxyGenerator

Step 2; Override the superclass of the generated client proxy stub

Step 3: Write your own Proxy

Step 4: Modify RemoteServiceServlet

The Sky's the Limit

Tuesday, April 7, 2009

Almost Painlessly

Making JDO/JPA enhanced classes work over GWT RPC

But I don't want copies!

Less than ideal

It's still awesome

Thursday, April 2, 2009

The Overlay Type

Categories and Extension Methods

Type-Safe Enumerations with Zero Overhead

Turn a String into a JSO

Adding methods to Numbers

Thursday, March 12, 2009

RPC requests signed by OAuth

An EasyMock library for Hosted/Web Mode

GWT Exporter

GIN/Guice dependency injection

Combing compile-time and run-time parameters

Friday, March 6, 2009

Thursday, March 5, 2009

Structural Typing

Structural Type Exports

Refinements to the idea

Structurally Typed Interfaces

Tuesday, March 3, 2009

Exporting Classes you don't control

Fine control over Javascript->Java type conversions

Type Converter injection

Primitive arrays an oversight

Thursday, February 19, 2009

Friday, January 30, 2009

Tuesday, January 27, 2009

Followers

Blog Archive

About Us