Archive for the ‘programming’ Category


Oh boy… I’ve been struggling for a couple of weeks trying to replace JetBrains’ IntelliJ IDEA with Eclipse for a GWT/Maven project. To make things more challenging, I’m relatively new to Java, Linux, Maven, GWT, Hibernate, Artifactory, Eclipse and IntelliJ to begin with. I’m competent enough at the primary skill every programmer needs (Googling), but I still haven’t been able to get Eclipse to debug the client-side GWT code without unplugging Maven. I could do it in IntelliJ, but that was part of the problem… IntelliJ’s built-in runner would compile Java that shouldn’t compile. It would happily sail right past missing classes (because they were misnamed) and multiple methods with the same erasure in the same scope… yikes! Hey, I appreciate the “it just works” experience, but my definition of “works” definitely doesn’t include compensating for code defects that would never run outside of the IDE.

Well, thanks to AdWords’ exemplary intent matching, I kept seeing little reminders to try NetBeans 7. I asked a few Java pros for their opinion of it, and most had tried it years ago and written it off as less than useful. If that’s accurate, then I’m here to tell you, NetBeans has grown up quite a bit since then. I grabbed release 7 and a couple of plugins, read no more than three paragraphs of instructions on it, watched the first minute of a GWT tutorial, then started fiddling. An hour later, I had Maven, Artifactory, GWT and Tomcat all cooperating nicely with each other, and I was stepping through client-side asynchronous methods in its debugger. NetBeans works from the same source files as Eclipse and IntelliJ without breakage, and the command-line Maven builds work fine too.

Sweet! Maybe now I can stop panicking over the complete lack of anything familiar in my environment and get back to cranking out platforms!

Oh, and did I mention the Git and Jira integration? I know, right?

Dear Google: Please Fork Java

From the Lemming Technology Blog:

October 6, 2010 — Jason

This is an open-letter to anyone at Google with a vested-interest in the long-term survival of Java as a language; and as a viable platform.

I tend to agree.

HBase vs Cassandra: why we moved (via Bits and Bytes | Dominic Williams)

Passing along an interesting post from Bits and Bytes; Dominic’s take is (in part) that the two take different approaches to Big Data: Cassandra is more amenable to online, interactive data operations, while Hadoop (which HBase is built on) is geared more towards data warehousing, offline index building and analytics.

My team is currently working on a brand new product – the forthcoming MMO This has given us the luxury of building against a NOSQL database, which means we can put the horrors of MySQL sharding and expensive scalability behind us. Recently a few people have been asking why we seem to have changed our preference from HBase to Cassandra. I can confirm the change is true and that we have in fact almost completed porting our c … Read More

via Bits and Bytes | Dominic Williams

What Would the Holy Grail of ORM Look Like?

Recent experiences and articles I’ve read have got me thinking about ORM again, and trying to conceive what the perfect one would look like (when it’s not custom matched to a specific set of patterns that I control).

The Microsoft Data Access Block was one of the first frameworks I used to make boilerplate data operations easier. Incidentally, it also led me down the evil path of exposing data access methods as static methods. I evaluated both the Entity Framework and LINQ to SQL for a large green-field project and neither was up to snuff at the time. I’ve recently migrated to Java development on Linux and gotten my fingers into Hibernate, enough to conclude that I hate it with a vengeance. Come to think of it, I’ve never seen an ORM framework that I thought fully did the job, so I ended up going with the roll-your-own approach on that last project. That’s fine (maybe even superior) when you fully control the access/retrieval/update/delete patterns. But what criteria would make a new tool stand out for general adoption?

First, the short shopping list: stored procedure support is a biggie for me, as are batch saves, client-side filtering/sorting, awareness of new/clean/dirty/deleted objects (optimize wire traffic by not sending clean objects, and let me process multiple insert/update/delete operations in the same batch), intelligent awareness and automatic management of datetime properties like created/modified, and the ability to do soft deletes (set a ‘deleted’ or persistence status property, and omit those objects from standard fetches).
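
To make the new/clean/dirty/deleted tracking and the soft-delete idea concrete, here is a minimal sketch. Every name in it (PersistState, Customer, SaveBatch) is made up for illustration and does not come from any real ORM; it only shows the bookkeeping, not actual persistence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical persistence states; the names are illustrative only.
enum PersistState { NEW, CLEAN, DIRTY, DELETED }

// A plain entity that tracks its own state so a batch can skip clean objects.
class Customer {
    private String name;
    private PersistState state = PersistState.NEW;

    Customer(String name) { this.name = name; }

    String getName() { return name; }
    PersistState getState() { return state; }

    void setName(String name) {
        this.name = name;
        // A NEW object stays NEW until it has been saved once.
        if (state == PersistState.CLEAN) state = PersistState.DIRTY;
    }

    // Soft delete: flag the object instead of removing it.
    void markDeleted() { state = PersistState.DELETED; }

    // Called by the (hypothetical) data layer after a successful save.
    void markClean() { state = PersistState.CLEAN; }
}

class SaveBatch {
    // Only objects that need a round trip go over the wire.
    static List<Customer> pending(List<Customer> all) {
        List<Customer> out = new ArrayList<>();
        for (Customer c : all) {
            if (c.getState() != PersistState.CLEAN) out.add(c);
        }
        return out;
    }

    // Standard fetches omit soft-deleted objects.
    static List<Customer> visible(List<Customer> all) {
        List<Customer> out = new ArrayList<>();
        for (Customer c : all) {
            if (c.getState() != PersistState.DELETED) out.add(c);
        }
        return out;
    }
}
```

A real tool would presumably generate this bookkeeping rather than make you hand-write it; the point is only that the state lives in plain, visible code.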

A few additional things I look for, some a little unorthodox:

Put nullability checks in get methods that return nullable collection types. I’m sick of seeing null reference exceptions when people try to render a child list that’s not populated — they’re ugly and they disrupt debugging sessions.
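
A minimal sketch of that kind of getter, assuming a hypothetical Order entity whose child list may never have been fetched:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical entity; Order and its "lines" child list are illustrative names.
class Order {
    // Stays null when the child list was never populated.
    private List<String> lines;

    // Never hands a null collection to the caller; an unpopulated list reads as empty.
    List<String> getLines() {
        return lines == null ? Collections.<String>emptyList() : lines;
    }

    void setLines(List<String> lines) { this.lines = lines; }
}
```

Rendering code can then loop over getLines() without a null check, whether or not the children were loaded.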

Let me generate extended enums (Java) or enum-type classes (.NET, bad idea to inherit from enums there) from stored data (e.g. in tables somehow flagged as being an application enum). Look at the classic Java “Planets” enum example for a use case. This helps keep typo-prone string-based lookups out of the codebase.
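
As a sketch of what such generated output might look like, assume a hypothetical order_status lookup table with id and label columns. The table, the constants and the fromId helper are all invented for illustration:

```java
// What a generator might emit from a hypothetical "order_status" lookup table.
// Each constant carries the row's key and display label.
enum OrderStatus {
    PENDING(1, "Pending"),
    SHIPPED(2, "Shipped"),
    CANCELLED(3, "Cancelled");

    private final int id;
    private final String label;

    OrderStatus(int id, String label) {
        this.id = id;
        this.label = label;
    }

    int getId() { return id; }
    String getLabel() { return label; }

    // Map a stored key back to its constant; unknown keys fail loudly.
    static OrderStatus fromId(int id) {
        for (OrderStatus s : values()) {
            if (s.id == id) return s;
        }
        throw new IllegalArgumentException("Unknown order status id: " + id);
    }
}
```

Code can then compare against OrderStatus.SHIPPED instead of the string "Shipped", so a typo becomes a compile error.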

Don’t push me into the entity:table paradigm. Maybe some entities are more easily used by exposing a few foreign properties on them (like names/labels that correspond to foreign keys). That facilitates much terser code and reduced IO. It’s not that hard to handle this, either; make those properties read-only and omit them from saves. Voila!

Give me smart “GetBy” parameter inference. Good candidates are primary keys, foreign keys, indexes/unique keys (including compound ones), and primary keys of other entities that have a foreign key to this one. Bonus points for letting me browse the ancestor hierarchy and create GetBy methods for, e.g., grandchildren by grandparent, without having to fetch the intermediate (parent) first if I’m not going to show it. Similarly, give me delete-by-id and delete-by-instance methods.

Add “stale instance” checks to prevent overwriting more recent changes by others. (Huge bonus points if you can actually fetch the newer remote changes and merge them with the local ones when no conflicts exist.)
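
One common way to implement a stale-instance check is a version number that is bumped on every save. The sketch below assumes that approach (it is not the only one) and uses an in-memory map as a stand-in for a real backing store; Article, ArticleStore and StaleInstanceException are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Thrown when someone else saved a newer version first.
class StaleInstanceException extends RuntimeException {
    StaleInstanceException(String msg) { super(msg); }
}

// Hypothetical versioned record; "version" is bumped on every successful save.
class Article {
    final int id;
    final String body;
    final int version;

    Article(int id, String body, int version) {
        this.id = id;
        this.body = body;
        this.version = version;
    }
}

// In-memory stand-in for a real backing store.
class ArticleStore {
    private final Map<Integer, Article> rows = new HashMap<>();

    Article fetch(int id) { return rows.get(id); }

    // Saves only if the caller's copy is based on the currently stored version.
    Article save(Article local) {
        Article current = rows.get(local.id);
        if (current != null && current.version != local.version) {
            throw new StaleInstanceException(
                "Article " + local.id + " is at version " + current.version
                + " but the local copy is based on version " + local.version);
        }
        Article saved = new Article(local.id, local.body, local.version + 1);
        rows.put(saved.id, saved);
        return saved;
    }
}
```

The merge-when-no-conflict bonus is deliberately left out; this only covers refusing the overwrite.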

Provide an easily swapped-out data provider interface – don’t tie me to any specific backing store. This is a tall order, since it requires multi-way type mapping, plus decoupling and isolation of all provider-specific options and settings, and a backing-store-agnostic controller layer on top of the data layer. Controllers deal with business intentions, but often must translate those into provider-specific language. This means controllers must support pluggable or dynamically loaded data providers, without built-in knowledge of all of the types or options they use (probably via the Mediator or Adapter patterns).
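
A very small sketch of the shape this could take, with invented names throughout (DataProvider, InMemoryProvider, UserController). It sidesteps the hard parts mentioned above (type mapping and provider-specific options) and only shows a controller that depends on an interface rather than on a store:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical provider contract: the only thing a controller is allowed to see.
interface DataProvider<K, V> {
    Optional<V> findById(K id);
    void save(K id, V value);
    void delete(K id);
}

// One interchangeable implementation; a relational or NoSQL-backed provider
// would implement the same interface.
class InMemoryProvider<K, V> implements DataProvider<K, V> {
    private final Map<K, V> rows = new HashMap<>();

    public Optional<V> findById(K id) { return Optional.ofNullable(rows.get(id)); }
    public void save(K id, V value) { rows.put(id, value); }
    public void delete(K id) { rows.remove(id); }
}

// The controller expresses business intent and knows nothing about the store.
class UserController {
    private final DataProvider<Integer, String> provider;

    UserController(DataProvider<Integer, String> provider) {
        this.provider = provider;
    }

    void rename(int id, String newName) {
        if (!provider.findById(id).isPresent()) {
            throw new IllegalArgumentException("No user with id " + id);
        }
        provider.save(id, newName);
    }

    String nameOf(int id) { return provider.findById(id).orElse(null); }
}
```

Swapping the backing store then means passing a different DataProvider to the constructor; the controller code does not change.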

Do not introduce any dependencies into POCOs/POJOs – for example, Hibernate forces its own annotations/attributes into the persistable classes, which makes them unusable in, e.g., GWT client code. Now I need to duplicate entity code in DTOs, and to create converter classes, for no other reason than to have a dependency-free clone of my entities. It’s wasteful, it promotes code bloat, and it introduces opportunity for error.

Similarly, facilitate serialization-contract injection – I’m sick of being unable to use the same entity for e.g. XML, binary, JSON and protobuf just because I need to serialize it in different ways (e.g. deep vs shallow, or using/skipping setters that contain logic). Why do my serialization preferences need to be written in stone into my entities? (Nobody does this well yet, IMO, and it’s not easy either.)

Those last two are biggies: Putting control statements into annotations/attributes is an egregious violation of SOC. Serialization, data access and RPC frameworks all want you to embed their control flags into your entity layer. Enough already! My entity layer is just that… a collection of dumb objects. Give me an imperative way to tell your framework what to do with my objects, or go home.

All code generation should be done at design time (as opposed to during build or at runtime) – for Pete’s sake, stop slowing down my builds and adding more JIT operations to my running app. (Do I need to mention that dynamically generated SQL is evil? And have you seen what ugly dynamic SQL Hibernate spits out?) Also, give me code where I can see the fetch/save/ID-generation/default-value-on-instantiation semantics without looking through 8 different files to trace it. The longer I code and the bigger the projects & teams I work on, the more I favor imperative approaches over declarative or aspect-based ones; whether I want the 3rd generation descendants to be fetched — whether lazily or eagerly — is a function of where I am in the app and what I’m doing, not of the entities themselves.

Don’t force a verbose new configuration syntax on me; use enumerations and flags that are in visible, static code, and write them with inline documentation so that explanations are visible in javadoc popups and Visual Studio mouseover tips. Pass those enum/flag values to DAO constructors and methods to control, for example, whether to re-fetch after save, what descendants to save or fetch along with the parent, etc.
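
Here is a rough sketch of the enum-and-flags style, assuming a hypothetical InvoiceDao and SaveOption enum. The DAO just returns a description of what it would do instead of touching a database, so the example stays self-contained:

```java
import java.util.Set;

/** Hypothetical options a caller can pass to a DAO save call. */
enum SaveOption {
    /** Re-read the row after saving so generated values are picked up. */
    REFETCH_AFTER_SAVE,
    /** Save direct children together with the parent. */
    INCLUDE_CHILDREN,
    /** Save grandchildren as well as children. */
    INCLUDE_GRANDCHILDREN
}

// Hypothetical DAO; it only describes the work a real one would execute.
class InvoiceDao {
    String save(String invoiceId, Set<SaveOption> options) {
        StringBuilder plan = new StringBuilder("save " + invoiceId);
        if (options.contains(SaveOption.INCLUDE_CHILDREN)) plan.append(" +children");
        if (options.contains(SaveOption.INCLUDE_GRANDCHILDREN)) plan.append(" +grandchildren");
        if (options.contains(SaveOption.REFETCH_AFTER_SAVE)) plan.append(" then refetch");
        return plan.toString();
    }
}
```

A call such as dao.save("INV-1", EnumSet.of(SaveOption.INCLUDE_CHILDREN)) keeps the decision at the call site, and the javadoc comments on each constant show up in IDE popups.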

Am I being too demanding? Am I missing some biggies? Programmers, let me know your thoughts!

Discarding or Rolling Back Changes in Git

When moving to Git for version control, I was amazed at how much trouble people have trying to revert a file or project to a previous state, and even more so at the variety of solutions I saw. People try (and recommend) everything from surgical to nuclear approaches to this, e.g. git checkout …, git rebase …, git revert …, git stash or branch and then discard …, or even deleting your entire working directory and re-cloning the repository! Yet with many of these, people would still end up with unwanted changes left in their working copy! One problem is that certain commands are only appropriate for changes that have already been committed, while others are for changes that have not (ones that are only staged in the index or sitting in the working tree).

When I have a version I want to roll back to, I don’t like having to sort through what’s committed and what’s uncommitted; I just want to get back to that version. I’m all about finding something that works reliably and repeatedly in a way that I understand. git checkout <start_point> <path> is the “something” that seems easiest to me for reverting specific files back to specific previous states, and so far this approach has never left me with undesired changes remaining in my working copy.

Here’s the skinny…

First, get a simple list of the last few commits (7 in this example) to the file in question:

~/projects/myproj$ git log -7 --oneline src/main/java/settings/datasources.xml

Output (newest to oldest):

74106b9 Renamed PROD database
db05364 Changed root password
0d56c8b Renamed QA database
efc7eb0 Changed some hibernate mappings
97e68fe Added comments
a2c492f Fixed xml indentation
c1b0310 Wrecked xml indentation

Let’s say the two most recent commits (74106b9 and db05364) were erroneous. Then using the syntax “git checkout <start_point> <path>” you would just do:

~/projects/myproj$ git checkout 0d56c8b src/main/java/settings/datasources.xml

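All done! The file in your working copy and index now matches commit 0d56c8b; commit it when you’re ready to record the rollback.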

Have other tricks for making “rollbacks” easier? Let me know in the comments!

Happy coding.
