Sunday, April 15, 2012

How to start a new Java Web Project

My background
I'm currently working as an external consultant for a bank, where I advise customers on how to integrate their services on the J2EE platform. I'm not a fully fledged developer, but I have broad knowledge of large-scale datacenters and hosting platforms from a technical/architectural and also from a management perspective. I have also developed some small apps in the past to keep my technical knowledge up to date. Lately, while reading some docs, I discovered that much has changed in the Java world outside the large industries that are slow to adopt new technology.
The way databases are queried also seems to be changing. The traditional pattern keeps all data separated and combines it in one query through joins. The newer pattern is optimized for the query: the data is delivered from one "table", with the compromise of having redundant data in different "table"-like structures. In Cassandra, for example, these structures are called Columns, ColumnFamilies and SuperColumns.
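To make the idea concrete, here is a minimal sketch in plain Java of how such a structure can be pictured: a row key that points to a sorted set of columns. This is only a toy model for illustration, not Cassandra's actual API; the class name `ColumnFamilySketch` and the column names are my own assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of a column family (an assumption for illustration, not a Cassandra client):
// each row key maps to its own sorted set of columns (column name -> value).
class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    // Adds or overwrites one column in the row identified by rowKey.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<String, String>()).put(column, value);
    }

    // One lookup returns everything stored under the row key; no join is involved.
    SortedMap<String, String> getRow(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<String, String>());
    }
}

class ColumnFamilyDemo {
    public static void main(String[] args) {
        ColumnFamilySketch postsByUser = new ColumnFamilySketch();
        // The author's display name is stored redundantly next to the post,
        // so reading the row needs no second lookup in a "users" table.
        postsByUser.put("alice", "2012-04-15:author", "Alice");
        postsByUser.put("alice", "2012-04-15:title", "How to start a new Java Web Project");
        System.out.println(postsByUser.getRow("alice"));
    }
}
```

The price of this layout is the redundancy mentioned above: if the display name changes, every row that carries a copy of it has to be updated.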
I have been thinking about how I would start a new, fully modern Java web project from scratch and which technologies would fit today's demands and best practices. And that's where my journey through the new technology jungle started.
Requirements for my new web project
My requirements for a new application:
  • I want to use one of the new, modern databases such as Cassandra or AppEngine to study the new data modelling methods that allow me to grow to whatever data volume and replication is needed, based on a master/master database concept.
  • I want to have an interactive Ajax based frontend that can respond as a web page and also as a mobile app giving the user a rich and easy to use interface.
  • I want to use Java, as I'll need a Java background for my future activities as a consultant.
Based on these requirements, the following frontend/backend candidates have emerged.
    Frontends/Backends - Possible Candidates
    I have been looking at the following possible candidates:
    • JSF / J2EE Beans / Cassandra
    • backbone.js / Jersey REST / Cassandra
    • GWT / Jersey REST / Cassandra or AppEngine
    JSF / J2EE Beans / Cassandra
    JSF (JavaServer Faces) is currently the standard used in J2EE Java development. It is a very nice and simple MVC (Model View Controller) framework which I personally really like. JSF has all the comforts of a server-side language, and if you know the concepts of MVC it'll take no time to build a very nice looking frontend. The integration with Managed Beans is also very simple and straightforward. There are also very good Ajax libraries out there. I have been looking at the amazingly simple PrimeFaces. JSF is also extremely well documented, with examples to be found everywhere.
    This was my first choice because of the gentle learning curve when coming from a background in Ruby on Rails.
    I started to write a small JSF application and wanted to integrate Cassandra as my backend database. So the next question was which driver to take. For Java I have discovered the following potential candidates:
    • Hector (currently the most active community and advanced client)
    • Kundera (a full JPA integration and also very active community)
    • CQL, which aims to become the new standard and will support JDBC in future. (Support is currently only given to DataStax enterprise customers, even when writing to the community mailing list. As I am not an enterprise user, this solution was not yet usable for me, as I want to learn how to do things by myself.)
    Cassandra is currently stable and used by large companies around the world, but it is still very young, and many different individuals are writing different kinds of drivers which still lack various features. But it's getting there. A big gap with all these drivers is the documentation. When I talk about documentation I don't mean the Javadocs. Of course those are available. But real-world examples explaining how to do things in mid- to large-size projects are missing entirely. This is what I loved in the past about Ruby on Rails. You could always rely on Railscasts. But now I'm bound to Java and that's what I want to use. And at least the people on the Hector and Kundera mailing lists are very responsive and helpful.
    I managed to get Kundera JPA running very quickly, and it integrated perfectly with Managed Beans. However, people on the Hector mailing list pointed out to me that this comes with a high payload and is therefore not recommended for a lightweight environment. Still, I think for J2EE extremists this is the way to go.
    Pros
    • Fully J2EE compatible with Kundera JPA integration
    • J2EE standards can be used
    • Only Java knowledge is required 
    • Well documented
    • SEO is taken care of
    Cons
    • High Payload
    • Can only be used with Java (Serialization J2EE) when not using REST
    On the other hand, I got the advice that one should fully decouple the frontend from the backend, ideally using REST/JSON. For Java, nearly all of the people on the Hector mailing list pointed out that the current best practice is to use Jersey (JAX-RS). I was also introduced to new JavaScript-based frontends like backbone.js that would enable me to write frontends that are fully independent of the backend. This makes sense, as it enables us to build any kind of client application: regardless of whether you use JavaFX, Flash, JavaScript or another type of frontend, it will be able to store and load information over REST. This brought me to the next adventure with backbone.js.
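To illustrate what "decoupled over REST/JSON" means, here is a minimal sketch of a backend that only speaks JSON over HTTP. Note that it deliberately uses the JDK's built-in `com.sun.net.httpserver` instead of Jersey, so that it runs without extra dependencies; the class `NoteEndpoint`, the path `/notes/1` and the port are hypothetical names chosen for this example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical JSON endpoint; a stand-in for what a Jersey resource would provide.
class NoteEndpoint {

    // Builds the JSON by hand to stay dependency-free (only backslashes and quotes
    // are escaped); a real service would use a JSON library.
    static String noteJson(int id, String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"id\":" + id + ",\"text\":\"" + escaped + "\"}";
    }

    // Starts an HTTP server that answers GET /notes/1 with a JSON document.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/notes/1", NoteEndpoint::handle);
        server.start();
        return server;
    }

    private static void handle(HttpExchange exchange) throws IOException {
        if (!"GET".equals(exchange.getRequestMethod())) {
            exchange.sendResponseHeaders(405, -1); // method not allowed, no body
            exchange.close();
            return;
        }
        byte[] body = noteJson(1, "hello").getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().set("Content-Type", "application/json");
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }
}

class NoteEndpointDemo {
    public static void main(String[] args) throws IOException {
        NoteEndpoint.start(8080); // assumed port; any free port works
        System.out.println("Try: http://localhost:8080/notes/1");
    }
}
```

Any client that can issue an HTTP GET and parse JSON can consume this endpoint, which is the point of the decoupling: the backend does not need to know what kind of frontend is calling it.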
    backbone.js / Jersey REST / Cassandra
    With the background from the JSF journey, I took a look at backbone.js (used for MVC-based JavaScript), which integrates with underscore.js (for additional functionality) and jQuery. I was asking myself how I would structure large applications and found require.js and the following article. I followed the article and got a fully working example (without Jersey). I also read the following article regarding jQuery integration and Jersey. However, I never got the example working properly and was also not able to integrate backbone.js and require.js with this example. The reason is that I have only basic knowledge of JavaScript development.
    Debugging in general worked fine with Firebug, but I got to the point where only the HTML was displayed and there were no JavaScript errors at all (how would I debug without seeing any errors?). I spent a lot of time (at least for me, too much) debugging problems.
    My personal impression is that backbone.js is a really well thought-out JavaScript variant of an MVC application which includes all the necessary pieces, but all the dependencies like underscore.js, require.js and jQuery make it difficult to maintain for somebody without really deep JavaScript knowledge. Also, jQuery and require.js are developed independently, which could lead to compatibility issues. It would be nice to see one common, well-documented JavaScript framework arise out of all these different scripts, one that is structured in the same direction with common best practices and the possibility to update to new versions with everything integrated. I think this could give it a big boost, especially with "PHP" or "Ruby on Rails" like documentation and samples.
    Pros
    • A very lightweight, flexible and fast way to code rich interfaces
    • Practically no limits as it is pure JavaScript
    • Can be used for any kind of mobile and web based apps
    • Integrates well with REST/JSON based server side apps
    • Partly well documented
    Cons
    • Harder to debug problems as it depends on individual libraries
    • Much knowledge in JavaScript is required
    • Different browsers may need to be handled individually within the JavaScript code
    • SEO has to be handled over Snapshots
    Because of the issues I had, I was not at the end of my studies, and a few more nights brought me to GWT (Google Web Toolkit), which is pure Java that is converted into JavaScript. It is one framework that does not depend on many libraries. It supports different permutations for all browsers, and it's Java. So the learning curve for the framework itself is steep, but less steep than learning JavaScript.
    GWT / Jersey REST / Cassandra or AppEngine
    So my conclusion (which is a personal one based on the conclusions above) is that I want to use GWT with a Jersey REST backend and a Cassandra or AppEngine database.
    One of the big problems with having the data stored on supplier platforms such as Google is that the suppliers usually don't provide much support for moving data away in case a change of supplier is necessary. I have to say, however, that much time and many resources can be saved if a supplier is managing the whole infrastructure, including data replication.
    The good thing about REST interfaces is that the data could actually be exported over REST and inserted into a new database such as Cassandra. With this in mind, I'm no longer very worried about storing my data on AppEngine. The good thing about AppEngine is that my data will automatically be replicated to other datacentres and I don't need to concentrate on server/database maintenance and replication. So the REST interfaces have to be structured for the "worst case scenario", and a plan has to be made accordingly for how to move the application and data to a provider-independent infrastructure.
    On the GWT side, since GWT generates JavaScript, the frontend will also run on any platform independent of Google.
    Pros
    • One framework to handle JavaScript and all different browsers
    • Only Java knowledge is required
    • The permutations for the different browsers are generated automatically
    Cons
    • It takes some time to learn the framework
    • Using AppEngine is provider dependent
    • SEO has to be handled over Snapshots 
    • JSNI may need to be used for additional non-existing JavaScript functionality
    • Java knowledge is required
    My technology stack in the end... Spring & jQuery & MongoDB

    I have learnt some important things:
    • Don't let yourself get confused by all these hyped technologies. Depending on mature frameworks is usually best.
    • GWT is great and new and for sure a very capable framework for writing web applications. But when it comes to search engine optimization and similar concerns it is still very complicated. If you want to create a standard website and don't expect hundreds of thousands of users in the beginning, you don't really need it.
    • Spring is well known for its bean configuration via DI. Combined with Spring Data, which is responsible for database access, Spring Templates to minimize coding, and the support for certain NoSQL databases, it is a powerful backend.
    • Spring MVC is also a very nice, robust and stable framework that has had full REST support since version 3.
    • I have tried Spring Roo but decided not to use it. First of all, the REST integration didn't work properly with the current version 3.2.1, and as soon as you have certain special cases that are not covered by Roo, it'll get difficult to maintain code using Roo.
    • Spring would also support integrating GWT or the AppEngine Datastore if required and AppEngine is supporting Spring.
    • By layering the application into a Service, Repository and Domain architecture, it is not too difficult to change the database backend.
    • MongoDB is supported by Spring and not only contains support for small data records but also for files or large data using their GridFS. All data can be distributed over several servers. This makes MongoDB my favorite choice for persistence.
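The layering point above can be sketched in plain Java. This is a minimal illustration under my own assumptions, not Spring or Spring Data code: the names `Customer`, `CustomerRepository`, `InMemoryCustomerRepository` and `CustomerService` are hypothetical, and the in-memory map merely stands in for a real backend such as MongoDB.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain layer: a plain object with no persistence details.
final class Customer {
    private final String id;
    private final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
}

// Repository layer: the only contract the rest of the application sees.
interface CustomerRepository {
    void save(Customer customer);
    Optional<Customer> findById(String id);
}

// Stand-in backend (an assumption for this sketch); a MongoDB- or
// Cassandra-backed class would implement the same interface.
final class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<String, Customer> store = new HashMap<>();

    @Override
    public void save(Customer customer) {
        store.put(customer.getId(), customer);
    }

    @Override
    public Optional<Customer> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }
}

// Service layer: business logic that depends only on the repository interface.
final class CustomerService {
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    void register(String id, String name) {
        repository.save(new Customer(id, name));
    }

    String displayName(String id) {
        return repository.findById(id).map(Customer::getName).orElse("unknown");
    }
}

class LayeringDemo {
    public static void main(String[] args) {
        CustomerService service = new CustomerService(new InMemoryCustomerRepository());
        service.register("42", "Alice");
        System.out.println(service.displayName("42"));
    }
}
```

Because `CustomerService` only knows the interface, changing the database backend means writing one new repository class and passing it to the constructor; the service and domain code stay untouched.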
    The question marks with PaaS and IaaS services
    • As a customer of such a service I would like to have cost transparency and control. This however is not possible, as most services are billed on a per-hour and usage basis, and one would have to run many tests to actually compare whether the costs are really lower than those of a platform that you maintain yourself. I have seen that it takes much time to actually figure out the real costs on a monthly basis and how they grow.
    • With Google, for example, the pricing of Frontend, Discount and Backend instances is very opaque. On the mailing lists there were also many people complaining that their costs were higher than expected when using the platform in production. For Google, of course, it's the customer's fault, because the customers could have developed their application in a different way that is optimized for AppEngine.
    • With Rackspace and Amazon the costs are more transparent when looking at the managed Cloud Database offers, but the costs are also comparatively high compared to running the infrastructure yourself.
    • I think it makes sense to evaluate those services properly when enough money is around and a product is already successful. For other products or startups I would still recommend using one or two servers with backup capabilities, depending of course on the demand of the customers.
    To learn Spring and how to connect to MongoDB there are only two books that you need:
    For MongoDB there is enough documentation on their website. Their documentation definitely deserves an award. Just look at the cumbersome and wimpy documentation of other tools.

