Monday, April 21st, 2008

What is the future of Ajax applications talking to the data tier?

Category: Editorial

I have just posted an article on my personal blog about the new attack on the RDBMS. The post talks about the new thinking around data in the cloud and on the Web. It starts by recalling that this isn’t the first time the RDBMS has been attacked: the OODBMS attack came earlier and didn’t do too well. Then it gets into the cloud-y Web:

SQL is an enterprise victory that managed to make its way into the consumer Web and application space. A lot of people knew SQL, and it seemed obvious to have a LAMP stack or a Java / .NET stack backed by an RDBMS.

Is this really the right choice for Web applications? Why was Rails so successful? It was due to the productivity gain. How much of that is due to ActiveRecord vs. the other Action* pieces that make up Rails? I would argue a large percentage. Working with the database was actually a big pain in the tuches. ActiveRecord together with migrations helped a lot. It gave us a nice middle man between a full ORM and the SQL that we know and …. know.

What if the database piece didn’t need to be that painful? The source of the pain can be the paradigm shift between the various worlds, but also a huge part of it is scalability. When you have to scale your website, it can be fairly easy to make your application stateless, and then the bottleneck becomes the poor database. This is when you break out the master / slave relationships, think about partitioning of the application, and caching layers (Tangosol Coherence, memcached). Now you have to really think about an architecture ;)

Google had to do this thinking a long time ago, as they obviously have to scale their applications to a huge degree. Scaling the fairly read-only search operation is one thing, but as soon as you get to read-write operations you have a lot more of a headache. Scaling an MMORPG astounds me: it has to be that real-time while the world is constantly changing. Wow. At least there is the separation of locations (world X can be this cluster of machines).

Now we get to Bigtable, the engine that Google built to scale in the cloud. Amazon has their new SimpleDB, and there are others.

What these guys are all doing is revisiting the database story. Maybe it is time to think about whether an RDBMS is the no-brainer choice.

When Google App Engine launched, I thought there would be a lot of people saying “oh man, I just want MySQL instead of this new thing”. I barely heard that, and instead heard more thoughts along the lines of “It is great to be able to use the scalable database that Google uses internally.” In fact, when you start using it and see that it is schema-less, it is a bit of a relief. You can build your model, and even use an Expando to be highly dynamic with the data in the backend. You go along your way, iterating on your code and model, and you don’t have to spend time working on up and down migration methods. Doesn’t that remind you a little of the OODBMS dreams? But this time it is fast and scalable!
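The schema-less point can be sketched in a few lines of plain Python. This is not the App Engine datastore API: `Entity` and `store` are hypothetical stand-ins that only mimic the idea of records whose properties can differ from one to the next.

```python
import json


class Entity:
    """Hypothetical schema-less record: any keyword argument becomes a property."""

    def __init__(self, **props):
        self.__dict__.update(props)

    def to_json(self):
        return json.dumps(self.__dict__, sort_keys=True)


store = []  # stand-in for the datastore; holds serialized records

# First iteration of the model.
store.append(Entity(title="Hello", author="dion").to_json())

# A later iteration adds a property. No migration runs and the
# older record is left untouched.
store.append(Entity(title="Again", author="dion", tags=["ajax", "data"]).to_json())

for raw in store:
    record = json.loads(raw)
    # Readers have to tolerate records written before the property existed.
    print(record["title"], record.get("tags", []))
```

The trade-off is visible in the last loop: the migration work does not vanish entirely, it moves into the code that reads the data.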

Resting on the Couch

With the interest in Bigtable via App Engine pushing thought, we also have CouchDB pushing from the other end. The end that says, what would a RESTful approach to a database be?

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.

JSON built in. JavaScript right there. A database built for the Web?
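As a rough sketch of what "RESTful HTTP/JSON" means in practice, the Python below builds (but deliberately does not send) the two requests for storing and fetching a document. The server address, the database name `posts` and the document id are assumptions made up for illustration; check the CouchDB documentation for the details of your version.

```python
import json
from urllib.request import Request

BASE = "http://localhost:5984"    # assumed: a CouchDB server on its default local port
DB = "posts"                      # hypothetical database name
DOC_ID = "data-tier-editorial"    # hypothetical document id

# The document is just JSON; there is no table definition to write first.
doc = {"title": "Ajax and the data tier", "tags": ["ajax", "data"]}

# Storing a document is an HTTP PUT of JSON to /<database>/<document id>.
put_req = Request(
    "%s/%s/%s" % (BASE, DB, DOC_ID),
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# Reading it back is a GET on the same URL.
get_req = Request("%s/%s/%s" % (BASE, DB, DOC_ID), method="GET")

# Nothing is sent here; with a server running, urllib.request.urlopen(req)
# would perform each request.
for req in (put_req, get_req):
    print(req.get_method(), req.full_url)
```

The same two requests could be issued from any HTTP client, which is the appeal: the database speaks the protocol the Web already uses.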

It is great to see new ideas and thought about the storage of data. The RDBMS isn’t going anywhere of course. There are still a ton of tools out there for it and legacy code, and we all know that:

Data stays where it lies.

It is much easier to implement a new application talking to the old datastore than to migrate the datastore itself. It is like taking out the foundation. SQL is also getting new life in places.

SQLite

I recently saw an application that used GWT on the client, and JavaScript on the server, which reminded me of my comic above. I wonder if we may end up with another flip, with SQL being used in the client, and other systems like CouchDB, Bigtable, etc. being used in the enterprise / on the server.

It is happening on the client. SQLite seems to be everywhere. Your operating system, phone, browser, applications, everywhere. I bet I have around 20 SQLite engines on my system right now, and growing. Why is this happening? Well, instead of coming up with your own data format, parser, and search engine, why not just use SQLite and be done? It is very fast and perfect for single-user mode, so everyone is a winner.
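Here is roughly what "just use SQLite and be done" looks like, sketched with the `sqlite3` module from Python's standard library. The `bookmarks` table and its rows are made up for illustration, and an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# ":memory:" keeps everything in RAM; pass a file path instead to persist it.
conn = sqlite3.connect(":memory:")

# The "data format" is one CREATE TABLE statement.
conn.execute(
    "CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT, title TEXT)"
)

# Hypothetical application data.
conn.executemany(
    "INSERT INTO bookmarks (url, title) VALUES (?, ?)",
    [
        ("http://example.com/ajax", "Ajaxian"),
        ("http://example.com/sqlite", "SQLite home"),
    ],
)
conn.commit()

# The "search engine" is a query; no hand-written parser needed.
rows = conn.execute(
    "SELECT title FROM bookmarks WHERE title LIKE ? ORDER BY title",
    ("%Ajax%",),
).fetchall()
print(rows)

conn.close()
```

No server to install and no file format to design, which goes a long way toward explaining why it turns up inside so many client applications.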

So, SQL has a looooong future ahead of it, but it will be interesting to see how the RDBMS weathers the latest storm.

Geoff Hendrey, of NextDB.net, emailed me discussing a similar issue and how he thinks that:

The database access issue is the “elephant in the room” as far as Ajax apps are concerned. It’s a very hot topic, evolving rapidly, and related to cloud computing and DAAS (SimpleDB, LongJump, S3, Blist, NextDB.net, BigTable, etc).

Geoff is going to be at Web 2.0 Expo talking on the subject.

What are your thoughts on Ajax and the data tier?

ASIDE: I will be giving a joint talk with Ryan Stewart of Adobe there too, so come say hi, and ping me on Twitter with any thoughts.

Posted by Dion Almaer at 8:56 am
4 Comments

As far as databases for the web go, I really think Persevere is worth mentioning as well. DBs like CouchDB and Amazon’s S3 are not really safe to use directly from web clients/Ajax, since they don’t have the necessary security model; they are intended more for use from the backend. Persevere can be used by Ajax, and Persevere can also integrate existing SQL tables and make them available through RESTful HTTP/JSON, with JSONPath queries, and SSJS as well.

Comment by kriszyp — April 21, 2008

I’ve been thinking about this a lot (more) lately as I’ve been fighting with Active Record (in Rails) over the last week to make it mesh with my (good) relational database schema. The following thoughts are larval.

I don’t think the problem is the relational database. I don’t really see the difference between a database and the file system in one way: both are ways to persist data on a disk and both have an API of sorts to do that. An ACID-compliant database offers all sorts of goodies that are worthwhile for peace of mind. Our application can use the API for the persistent store to serialize any data. It’s the application’s job to figure out how to use that API, and hopefully it will be some nice clean code that does that.

I think the problem is using a high-level ORM to map a schema that it wasn’t designed to map. The higher the level of abstraction, the more difficult it is to work with if the low-level details it is abstracting change. Active Record works well with single tables and well with some types of table relationships. For other types of relationships that it doesn’t support (e.g. class table inheritance) it takes a great deal of massaging to make it work. I’m sure it took a great deal of massaging to get it to work with the relationships it supports out of the box. And all this could be done with a few lines of SQL…literally. It would be fine if Active Record worked well with most schemas, or with all the schemas a particular developer has, but that just isn’t the case. It works well most of the time but seems to fail at least somewhere in a project.

SQL is a very high-level but still flexible API for working with stored data. Active Record is a high-level, less flexible API for working with stored data. Perhaps the ORM experiment failed and the RDBMS experiment won? I’m thinking about going back to straight SQL.

Comment by PeterMichaux — April 21, 2008

I agree with PeterMichaux. I have used several abstraction layers for databases. Sometimes using SQL directly saves a lot of time compared with abstraction layers such as Active Record.

For large or medium-scale web applications, it is good to have an abstraction layer such as Active Record. But sometimes a plain SQL query is much easier and quicker.

And as we use PHP a lot, MySQL seems to be the first option all the time. That is only from a developer’s point of view.

If someone with a database management background could give us some ideas, that would be great.

Comment by chalia — April 21, 2008

Hi,

Have a look at Olivier Dedieu’s thesis, written 8 years ago.

It is the starting point of a very powerful Java CMS named Jalios JCMS. It is a product and a platform (available since 2002) used to build CMS/Portal/GED Web applications.

The idea is to load an object graph in memory, available to developers through a powerful search engine… There is no SQL, you just follow the graph: Channel.getPublication("c_1234").getAuthor().getGroups()[1].getName()

Jalios is a growing French company not yet known in the US. More on Wikipedia.

Comment by nextone — April 22, 2008
