Archive for the ‘ orm ’ Category

What Would the Holy Grail of ORM Look Like?

Recent experiences and articles I’ve read have got me thinking about ORM again, and trying to conceive what the perfect one would look like (when it’s not custom matched to a specific set of patterns that I control).

The Microsoft Data Access Block was one of the first frameworks I used to make boilerplate data operations easier. Incidentally, it also led me down the evil path of exposing data access methods as static methods. I evaluated both the Entity Framework and LINQ to SQL for a large green-field project and neither were up to snuff at the time. I’ve recently migrated to Java development on Linux and gotten my fingers into Hibernate — enough to conclude that I hate it with a vengeance. Come to think of it, I’ve never seen an ORM framework that I’ve thought fully did the job, so I ended up going with the roll-your-own approach on that last project. That’s fine — maybe even superior — when you fully control the access/retrieval/update/delete patterns. But what criteria would make a new tool stand out for general adoption?

First, the short shopping list: Stored procedure support is a biggie for me, as are batch saves, client-side filtering/sorting, awareness of new/clean/dirty/delete objects (optimize wire traffic by not sending clean objects, and let me process multiple insert/update/delete operations in the same batch), intelligent awareness and automatic management of datetime properties like created/modified, and the ability to do soft deletes (set a ‘deleted’ or persistence status property, and omit those from standard fetches).

A few additional things I look for, some a little unorthodox:

Put nullability checks in get methods that return nullable collection types. I’m sick of seeing null reference exceptions when people try to render a child list that’s not populated — they’re ugly and they disrupt debugging sessions.

Let me generated extended enums (Java) or enum-type classes (.NET, bad idea to inherit from enums there) from stored data (e.g. in tables somehow flagged as being an application enum). Look at the classic Java “Planets” enum example for a use-case. This helps keep typo-prone string-based lookups out of the codebase.

Don’t push me into the entity:table paradigm. Maybe some entities are more easily used by exposing a few foreign properties on them (like names/labels that correspond to foreign keys). That facilitates much terser code and reduced IO. It’s not that hard to handle this, either; make those properties read-only and omit them from saves. Voila!

Give me smart “GetBy” parameter inference. Good candidates are primary keys, foreign keys,indexes/unique keys (including compound ones), and primary keys of other entities that have a foreign key to this. Bonus points for letting me browse the ancestor hierarchy and create GetBy methods for, e.g. grandchildren by grandparent, without having to fetch the intermediate (parent) first if I’m not going to show it. Similarly, give me delete by id and delete by instance methods.

Add “stale instance” checks to prevent overwriting more recent changes by others. (Huge bonus points if you can actually fetch the newer remote changes and merge them with the local ones when no conflicts exist.)

Provide an easily-swapped out data provider interface – don’t tie me to any specific backing store. This is a tall order, since it requires multi-way type mapping, plus decoupling and isolation of all provider-specific options and settings, and a backing-store agnostic controller layer on top of the data layer. Controllers deal with business intentions, but often must translate those into provider-specific language. This means controllers must pluggably or dynamically support data providers, without built-in knowledge of all of the types or options they use (probably via the mediator http://en.wikipedia.org/wiki/Mediator_pattern or Adapter http://en.wikipedia.org/wiki/Adapter_pattern patterns.)

Do not introduce any dependencies into POCOs/POJOs – for example, Hibernate forces its own annotations/attributes into the persistable classes, which makes them unusable in, e.g., GWT client code. Now I need to duplicate entity code in DTOs, and to create converter classes, for no other reason than to have a dependency-free clone of my entities. It’s wasteful, it promotes code bloat, and it introduces opportunity for error.

Similarly, facilitate serialization-contract injection – I’m sick of being unable to use the same entity for e.g. XML, binary, JSON and protobuf just because I need to serialize it in different ways (e.g. deep vs shallow, or using/skipping setters that contain logic). Why do my serialization preferences need to be written in stone into my entities? (Nobody does this well yet, IMO, and it’s not easy either.)

Those last two are biggies: Putting control statements into annotations/attributes is an egregious violation of SOC. Serialization, data access and RPC frameworks all want you to embed their control flags into your entity layer. Enough already! My entity layer is just that… a collection of dumb objects. Give me an imperative way to tell your framework what to do with my objects, or go home.

All code generation should be done at design time (as opposed to during build or at runtime) – for Pete’s sake, stop slowing down my builds and adding more JIT operations to my running app. (Do I need to mention that dynamically generated SQL is evil? And have you seen what ugly dynamic SQL Hibernate spits out?) Also, give me code where I can see the fetch/save/ID-generation/default-value-on-instantiation semantics without looking through 8 different files to trace it. The longer I code and the bigger the projects & teams I work on, the more I favor imperative approaches over declarative or aspect-based ones; whether I want the 3rd generation descendants to be fetched — whether lazily or eagerly — is a function of where I am in the app and what I’m doing, not of the entities themselves.

Don’t force a verbose new configuration syntax on me; use enumerations and flags that are in visible, static code, and write them with inline documentation so that explanations are visible in javadoc popups and Visual Studio mouseover tips. Pass those enum/flag values to DAO constructors and methods to control, for example, whether to re-fetch after save,what descendants to save or fetch along with the parent, etc.

Am I being too demanding? Am I missing some biggies? Programmers, let me know your thoughts!

%d bloggers like this: