Friday, January 4, 2008

Google Gears Image Manipulation API not ambitious enough

Google Gears is a very subversive and disruptive technology IMHO, and I mean that in a good way. Google has the muscle to extend native browser functionality in a cross-browser way by sneaking extensions into Gears. Of course, Adobe and Microsoft can do this as well (Flash and Silverlight), but the difference is that Google is offering AJAX-level building-block extensions to browser functionality, not an alternative environment-within-the-browser-environment like the ones Flash, Silverlight, and Java applets provide.

Gears is steadily building momentum, and I'm sure many of Google's properties will soon support offline or enhanced modes using it, which will tend to make it a 'must-have' plugin, hopefully achieving 80-90% penetration in the future. So, before the vast majority of people start using the plugin, let's try to be as ambitious with the extension functionality as we can, so that when the rush-to-install happens, people will be getting a version with very rich functionality as the base. We want to avoid having to check the Gears version all over the place and forcing people to upgrade the plugin continually ("what, oh, your Gears version 1.21 doesn't have the image.shear() function, you need Gears 1.25 for that...").

Case in point, the proposed Gears Image Manipulation API. Granted, it's just starting, but I'd like to offer some upfront suggestions before this thing gets finalized.

Don't duplicate Canvas, extend it


The proposed API adds resize() and crop() operations to an image object, along with the ability to turn images back into blobs. The resize() and crop() operations can be done today with JS Canvas, but only WHATWG Canvas allows you to turn an image back into data. I don't think this API goes anywhere near far enough to justify its existence.

Also, the biggest pain with using Canvas today is that every browser but IE supports it, so why not start by implementing a cross-browser, offscreen Gears version of the WHATWG Canvas API?

But don't stop there. WHATWG Canvas lacks text rendering and image rendering that obeys affine transforms, two of the big complaints against the existing Canvas. The Web 2.0 world will love anyone who can get such an extended cross-browser canvas widely deployed.

So start with an off-screen WHATWG Canvas API, add text rendering (at least 90-degree rotated text would be nice), plus a drawImageWithTransform() that obeys transforms.
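
To make the drawImageWithTransform() request concrete, here is a rough TypeScript sketch of what an image draw that obeys an affine transform has to do: walk the destination pixels, map each one back through the inverse matrix, and sample the source. The `Affine` and `Raster` types, the function signature, and the nearest-neighbour sampling are all my own assumptions for illustration; nothing here is an actual Gears or WHATWG API.

```typescript
// Hypothetical types for illustration; not part of Gears or the WHATWG spec.
// Matrix in Canvas order: x' = a*x + c*y + e, y' = b*x + d*y + f.
interface Affine { a: number; b: number; c: number; d: number; e: number; f: number; }

// Single-channel (grayscale), row-major raster to keep the sketch short.
interface Raster { width: number; height: number; data: Uint8ClampedArray; }

// Invert the affine matrix so destination pixels can be mapped back to the source.
function invert(m: Affine): Affine {
  const det = m.a * m.d - m.b * m.c;
  if (det === 0) throw new Error("transform is not invertible");
  return {
    a: m.d / det,
    b: -m.b / det,
    c: -m.c / det,
    d: m.a / det,
    e: (m.c * m.f - m.d * m.e) / det,
    f: (m.b * m.e - m.a * m.f) / det,
  };
}

// Assumed signature: draws src into dst under transform m (nearest neighbour).
function drawImageWithTransform(src: Raster, dst: Raster, m: Affine): void {
  const inv = invert(m);
  for (let y = 0; y < dst.height; y++) {
    for (let x = 0; x < dst.width; x++) {
      const sx = Math.round(inv.a * x + inv.c * y + inv.e);
      const sy = Math.round(inv.b * x + inv.d * y + inv.f);
      // Destination pixels that map outside the source are left untouched.
      if (sx >= 0 && sx < src.width && sy >= 0 && sy < src.height) {
        dst.data[y * dst.width + x] = src.data[sy * src.width + sx];
      }
    }
  }
}
```

With `{ a: 0, b: 1, c: -1, d: 0, e: 0, f: 0 }` this is a 90-degree rotation; shear and non-uniform scale fall out of the same loop, which is the point of asking for general affine support rather than just rotate. A real implementation would want bilinear or better filtering instead of nearest neighbour.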

Resize, Flip, and Crop aren't enough


Anyone looking to build a client-side photo-manipulation library will want more than just image scaling, cropping, compositing, and flipping. They need the ability to run convolution kernels, lookup tables, and rescale/colorspace transforms as well. The most common operations people want to run on photographs, like contrast/brightness enhancement, sharpen/unsharpen, and conversion to black and white or sepia, all use these.
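
As a rough illustration of how little machinery these two primitives need, here is a TypeScript sketch of an NxN kernel filter and a 256-entry lookup table applied to a single-channel image. The function names (`convolve`, `applyLookup`, `contrastTable`) and the clamp-to-edge border handling are my own assumptions, not part of any proposed Gears API.

```typescript
// Hypothetical helpers for illustration; none of these exist in Gears or Canvas.
// All functions work on a single-channel, row-major image.

// Apply an N x N kernel (N odd). The kernel is applied unflipped, which is
// identical to true convolution for symmetric kernels such as sharpen or blur.
function convolve(
  data: Uint8ClampedArray, width: number, height: number,
  kernel: number[], n: number,
): Uint8ClampedArray {
  if (n % 2 !== 1 || kernel.length !== n * n) {
    throw new Error("kernel must be N x N with odd N");
  }
  const half = (n - 1) / 2;
  const out = new Uint8ClampedArray(data.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      for (let ky = 0; ky < n; ky++) {
        for (let kx = 0; kx < n; kx++) {
          // Border handling: clamp to the nearest edge pixel.
          const sy = Math.min(height - 1, Math.max(0, y + ky - half));
          const sx = Math.min(width - 1, Math.max(0, x + kx - half));
          sum += data[sy * width + sx] * kernel[ky * n + kx];
        }
      }
      out[y * width + x] = sum; // Uint8ClampedArray rounds and clamps to 0..255
    }
  }
  return out;
}

// Map every pixel value through a 256-entry table.
function applyLookup(data: Uint8ClampedArray, table: Uint8ClampedArray): Uint8ClampedArray {
  if (table.length !== 256) throw new Error("table must have 256 entries");
  return data.map((v) => table[v]);
}

// Build a contrast/brightness table: gain scales around mid-gray, bias shifts.
function contrastTable(gain: number, bias: number): Uint8ClampedArray {
  const table = new Uint8ClampedArray(256);
  for (let i = 0; i < 256; i++) table[i] = gain * (i - 128) + 128 + bias;
  return table;
}

// A common 3x3 sharpen kernel.
const SHARPEN = [0, -1, 0, -1, 5, -1, 0, -1, 0];
```

Sharpening is then `convolve(pixels, w, h, SHARPEN, 3)`, and a brightness/contrast tweak is `applyLookup(pixels, contrastTable(1.2, 10))`. Sepia and black-and-white conversions need a per-pixel color matrix across channels, which this single-channel sketch leaves out.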

Don't forget RAW and EXIF/image metadata


In addition, the ability to open RAW files, manipulate exposure compensation, and extract image metadata would be a huge boon. An offline Gears photo album would be much cooler if EXIF info could be extracted as images are imported.

My own wish list for Gears Image API

  • Implements WHATWG CanvasRenderingContext2D as Base
  • Adds text rendering, with at minimum 90-degree rotation support
  • Image composition that obeys affine transforms (not just rotate)
  • Convolution and Lookup operators (NxN square kernels, N at least up to 5)
  • Support opening RAW images
  • Support read (write would be good too!) access to image metadata


Objections?


The first objection that will be raised is bloat in the plugin. Leaving aside the fact that Flash delivered outstanding capability in a slim plugin, the proposed libgd implementation already has many of the core functions needed to satisfy the proposed richer API, and I'm sure it would not be hard for Google engineers to implement the rest. Convolves and Lookups aren't rocket science.

A likely second objection and real hiccup will be achieving cross-platform, antialiased, internationalized text rendering with rotations. I don't have an answer for this one, only that users want it. Almost everyone I've talked to who does client-side rendering wants this.

A third issue is simply complexity and time to market. Resize, Flip, Rotate, and Crop are trivial to implement as simple libgd glue code, whereas full support of WHATWG semantics would require significantly more time, although some of this might be mitigated by stealing WebKit's or Gecko's implementation and hacking it into Gears.

Fourth, dealing with large photographs, especially RAWs, will bring memory issues to bear (unless a smart, tile-cache-oriented system is used), and could become a point of denial-of-service against the Gears plugin if care isn't taken.
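
For what it's worth, the tile idea can be sketched in a few lines of TypeScript: refuse images over a pixel budget up front, then process the rest one fixed-size tile at a time so that only a tile's worth of pixels needs to be resident. The names `withinPixelBudget` and `tileRects`, the tile size, and the budget are all hypothetical; a real tile cache would also need eviction and on-demand decoding.

```typescript
// Hypothetical sketch; the names and limits are assumptions for illustration.
interface Tile { x: number; y: number; width: number; height: number; }

// Cheap guard against oversized images: reject before decoding anything.
function withinPixelBudget(width: number, height: number, maxPixels: number): boolean {
  return width > 0 && height > 0 && width * height <= maxPixels;
}

// Split an image into tiles of at most tileSize x tileSize, row by row.
// Tiles on the right and bottom edges are trimmed to fit the image.
function tileRects(imageWidth: number, imageHeight: number, tileSize: number): Tile[] {
  if (tileSize <= 0) throw new Error("tileSize must be positive");
  const tiles: Tile[] = [];
  for (let y = 0; y < imageHeight; y += tileSize) {
    for (let x = 0; x < imageWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, imageWidth - x),
        height: Math.min(tileSize, imageHeight - y),
      });
    }
  }
  return tiles;
}
```

An operation like a lookup table can then run over `tileRects(w, h, 256)` one tile at a time. Convolutions need each tile padded by half the kernel width so that neighbouring pixels are available at the seams.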

Regardless of these objections, I'd still urge Google to go for it. Make client-side image processing and visualization a major part of the next Gears API. Please!

-Ray

5 comments:

boots said...

I'm on the Gears team and I just wanted to let you know that Gears autoupdates itself. So there is really no worry about getting all the good features in before a major spike in installs. We can update all those users at any time.

Thanks for the suggestions otherwise. Good stuff to consider.

If you want to chat more, or just lurk, we're at http://groups.google.com/group/google-gears-eng.

Unknown said...

Another good feature would be cropping an image with a freehand shape, like the lasso tool in Photoshop.

Unknown said...

Another good feature in image manipulation would be cropping an image with a freehand shape, like the lasso tool in Photoshop.

Eugene Lazutkin said...

Just read your post today, and I am impressed! So far it is the most comprehensive proposal to extend Canvas with a real image manipulation API. My writeup on 2D/3D web graphics is published here: http://docs.google.com/Doc?id=d764479_12f6qrmdd7. I am evaluating the possibility of implementing my proposal in Google Gears; any feedback and/or counter-proposals are welcome.