Opinionated on Java

Friday, July 15, 2011

What I've Learned at UberConf - Part 3

So, I'm finally back home in Victoria, BC. I'm extremely tired, but it was great to have participated in ÜberConf 2011. Unfortunately, I only got to go to 3 of the 4 sessions on Friday as I had to leave early to catch my plane. Here are some of the highlights:

Jenkins: I'm already somewhat familiar with Jenkins, as my team is actively using it for our builds. Though this workshop was introductory, it was still good as it got us using GitHub, Sonar and Nexus along with Jenkins. I enjoy these types of sessions as I still often get a little trick or two to make my life easier.

Messaging and Concurrency with Spring: As I've been using Spring for years, I'm already quite familiar with many of its modules and paradigms. Still, with the vastness of their framework it's tough to know it all.

In this session, Bruce Snyder highlighted some custom Spring executors that are configurable and provide alternatives to the Java ExecutorService. He also went in depth into setting up Spring to work with JMS and ActiveMQ, which is actually fairly simple. I'd like to try out Spring Integration and see if it makes sense to use in one of my upcoming projects. Overall, this wasn't a world-shattering session, but it was very informative.

Neo4j: Again with the NoSQL talks. I sat in on the first part of a workshop for Neo4j in order to get a fundamental understanding of what makes this graph database different from other databases. Like MongoDB, it is schema-less, but it is not document based: data is stored as nodes that carry properties. The big difference is that instead of de-normalizing and nesting documents, nodes have relationships to each other.

In truth, this is much more of an actual relational database than traditional relational databases are. For this reason, Jim Webber suggested that people should start referring to traditional relational databases as "Square Databases". I'm inclined to agree and will actively start spreading this meme.

I think Neo4j houses a fascinating paradigm and can have some great applications. Being a Doctor Who fan, I loved the sample database we got running in the workshop, which was a historical storage of facts around the TV show.
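To make the graph idea a bit more concrete, here's a minimal sketch in plain Java. This is not the Neo4j API; the class names (GraphNode, GraphRelationship) and the relationship types are made up for illustration, and the couple of Doctor Who facts are my own stand-ins rather than the workshop's actual data set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical classes for illustration only; this is not the Neo4j API.
class GraphNode {
    // Each node carries its own properties, a bit like a small document.
    final Map<String, Object> properties = new HashMap<>();
    // Relationships that start at this node.
    final List<GraphRelationship> outgoing = new ArrayList<>();

    GraphNode(String name) {
        properties.put("name", name);
    }

    // Connect this node to another one with a typed relationship.
    GraphRelationship relateTo(GraphNode target, String type) {
        GraphRelationship relationship = new GraphRelationship(this, target, type);
        outgoing.add(relationship);
        return relationship;
    }

    // Follow outgoing relationships of one type and collect the end nodes.
    List<GraphNode> follow(String type) {
        List<GraphNode> result = new ArrayList<>();
        for (GraphRelationship relationship : outgoing) {
            if (relationship.type.equals(type)) {
                result.add(relationship.end);
            }
        }
        return result;
    }
}

class GraphRelationship {
    final GraphNode start;
    final GraphNode end;
    final String type;

    GraphRelationship(GraphNode start, GraphNode end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

class GraphSketch {
    public static void main(String[] args) {
        GraphNode rose = new GraphNode("Rose Tyler");
        GraphNode doctor = new GraphNode("The Doctor");
        GraphNode dalek = new GraphNode("Dalek");

        rose.relateTo(doctor, "COMPANION_OF");
        dalek.relateTo(doctor, "ENEMY_OF");

        // Traverse from Rose to whoever she is a companion of.
        for (GraphNode node : rose.follow("COMPANION_OF")) {
            System.out.println(node.properties.get("name"));
        }
    }
}
```

The point of the sketch is that the relationship is a first-class thing you traverse, rather than a key you join on or a document you nest.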

Final Thoughts: Overall, I personally noticed three main themes at ÜberConf this year:

Continuous Delivery: With emphasis on better build tools (like Gradle), test automation tools (like Cucumber) and methodologies, there was a ton of focus on ensuring your software is always shippable.

Developing with a Team: Terrence Ryan gave a great keynote on Driving Technical Change. I noticed there were many workshops on working effectively within the context of your team, and how to best initiate forward progress on those teams.

NoSQL: There were a ton of talks addressing different tools in this space. With products like Cassandra, MongoDB, Solr and Neo4j available, there are a lot of different ways to address your data needs. The hard part is fully understanding the capabilities of these tools and when it is best to use one of them, or a traditional relational database. Database programming will change significantly over the next few years.

I'm also encouraged by what I observed at this conference, which helps confirm that I am currently working on a fantastic team. Many of the developers on my team are already pushing forward a ton of the ideas introduced at this conference. We've got guys pushing us toward Continuous Delivery and we're actively using many of the technologies and methods that were demonstrated this week.

As with any team, we still have a lot of room to grow, but we are clearly moving in the right direction. I'm very excited for the future at my company.

Thursday, July 14, 2011

What I've Learned at UberConf - Part 2

So it was another great day at ÜberConf on Thursday (though I must admit that I skipped supper and the party to go watch the Rockies pound the Brewers). Here are some of the highlights for me:

Solr: Erik Hatcher gave a great talk on this search engine today. I've worked with some search engines in my distant past, and I don't really know what to say except that this one shows some great promise. I'm hoping to have the opportunity to use Solr (or even Solandra) to prototype some business functionality at the place I work. They've done some exciting stuff, for sure.

DSLs: I'll admit that I went into Neal Ford's talk on practical uses for Domain Specific Languages as a bit of a skeptic. I think I still am, but I'm a bit more open-minded about them, and understand they do have their place. One perfect example is Cucumber, which seems to use a plain English DSL to determine how it should operate. It is also intriguing that there are some cool tools starting to become available that will actually help you generate DSLs, including syntax-checking editors. It's a bit mind-blowing, really.

JUnit Kung-Fu: I always love going to talks on unit testing, as they inevitably highlight some of the smells in my own practices. John Smart didn't fail. He provided some great insights that will really help improve my tests so they make sense to people other than myself. Some of his key points were around naming (e.g. name your test cases using "should" instead of "test") and making good use of the Hamcrest matching libraries. Great stuff.

Gradle: I'm in. I've piddled around with Gradle a few times in the past, but have felt it was still a bit immature for me to try to introduce to my company. That time has passed. I firmly believe Gradle is ready for prime time, especially with the tooling now available in Eclipse via STS. Hans Dockter has converted me, and I almost can't bear the thought of going back to work and using Maven.

Honestly, Gradle gives us all the out-of-the-box power we get with Maven, plus it provides us with a rich declarative Groovy-based DSL that gives us the power to actually make stuff happen. Forget the fact that it's Groovy instead of XML; even if that weren't true, it would still blow away both Maven and Ant by far. Gradle is ready for prime time.

Just one more day left. I can't wait to see what Friday holds.

Wednesday, July 13, 2011

What I've Learned at UberConf - Part 1

I've come down to Denver this week for ÜberConf 2011. After the first full day of sessions, it's been a great time with lots of learning opportunities. I thought I'd share some of the insights I've gotten so far.

JavaScript in HTML5: Tim Berglund did a great job highlighting some of the JavaScript APIs available in HTML5, showing some nice concrete examples. I was most fascinated by Canvas and its graphics rendering capabilities. JavaScript's future is clearly brighter than its past.

NoSQL and Distributed Data: Tim Berglund again, and he totally demystified this space. Before now, I'd heard a lot of buzz about NoSQL, but didn't really have the opportunity to fully understand what it was. We went over several different databases, where we learned that they are mostly quite simple. I'm looking forward to trying out MongoDB, where my data is stored as a denormalized JSON document.

I also attended a detailed intro to Cassandra, which has a great scaling strategy. Cassandra's modelling strategy is fairly simple as well. To oversimplify, it basically amounts to a hash table of hash tables. I'd encourage you to check it out.
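For what it's worth, here's how I picture that "hash table of hash tables" in plain Java. This is only a rough mental model, not the Cassandra client API: the class name ColumnFamilySketch and its put/get methods are hypothetical, and the sketch ignores distribution, replication, column ordering and everything else that makes Cassandra interesting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mental model only; this is not the Cassandra API.
// Outer map: row key -> row. Inner map: column name -> value.
class ColumnFamilySketch {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    // Store a value under (rowKey, column), creating the row on first use.
    void put(String rowKey, String column, String value) {
        Map<String, String> row = rows.get(rowKey);
        if (row == null) {
            row = new HashMap<>();
            rows.put(rowKey, row);
        }
        row.put(column, value);
    }

    // Look up a single column; returns null if the row or column is missing.
    String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        ColumnFamilySketch users = new ColumnFamilySketch();
        users.put("user-1", "name", "Alice");
        users.put("user-1", "city", "Victoria");
        // Rows don't have to share the same columns.
        users.put("user-2", "name", "Bob");

        System.out.println(users.get("user-1", "city"));
        System.out.println(users.get("user-2", "city"));
    }
}
```

Notice that nothing forces two rows to have the same set of columns, which is a big part of why the model feels so different from a relational table.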

Finally, I went to an intro to Hadoop, where frankly I got a bit lost. What I did get out of this talk is that it is probably best to use Hadoop through other APIs, such as Hive.

Architecture: I attended 2 great sessions by Ted Neward: Pragmatic Architecture and Architectural Kata. The Kata class is actually a workshop, where you have to work with a group of strangers to come up with an architecture for a complex fictional product, and then present it to the rest of the group. It's a difficult task, and quite fun.

I think my biggest takeaway for the day came from Ted's Pragmatic Architecture class, where he asked "When will you know if an architecture is bad?". The answer: "When you try to write code against it". The point is that it's important that software architects are involved in the development of code.

On several occasions I've seen architects who are former developers but are now some form of manager, or in another role where they aren't directly involved in writing the software. The big question with this approach is: how are you supposed to make sound decisions if you do not know the consequences of those decisions? How can you know if you're not doing the work?

So, for all you architects out there, the architectural committee doesn't need to be "the place programmers go to die" (another Ted Neward quote). Find a way to stay active in development or don't do the job. If you insist on doing it "old school", don't do it by yourself. It is vital to get feedback on any architecture from the developers.

That's all for today. I'll hopefully find the time to post about more exciting learning events over the next couple of days.

Saturday, June 4, 2011

Unit Testing, or Lack Thereof

It's probably been about 7 or 8 years since I was first introduced to JUnit and the concepts of TDD. Since those days, it's been clear to me that this is a vital part of professional software development. Yet, for some reason, until recently my coverage has not been at the levels it should be. In fact, I've noticed it's not just me. Over the years, even as I often hear those around me advocating for more unit testing, rarely have I seen a project with even close to 50% coverage.

Why is it that something we think is so fundamental to our jobs so often gets left out of our projects? I think there are a few reasons:

Unit testing is hard: The fact is that writing a unit test is often harder than writing the code itself. As soon as I know the problem that I need to solve I quickly have a picture of the algorithm in my head, and often it will be simple to just throw it together in code. In order to test it, I need to think through all the nuances and write a test for every situation under which that code might reasonably be executed. For example, let's say you were to create an equals method on an object that matched 2 fields. You could probably throw that together in a couple of minutes, but to unit test it, you've got several scenarios off the bat just accounting for null values.

Time constraints: The most common excuse I've heard for not writing unit tests is that developers are under a tight deadline to get something out. Because tests are not part of the deployable production system, they are treated as less critical and are often left out.

Tests are not written first: The mantra of TDD is that "before a single line of production code is written, a test must fail". This is something I personally haven't taken seriously over the course of my career, nor have most of the people I've worked with. I've often stated in the past that "I don't really care if the tests are written before or after, just as long as they're there". I now know this to be a big mistake for 2 reasons:
1) I'm more likely to skip writing the tests because of time constraints
2) The tests are being tailored to the code, instead of code written to make tests pass
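To put some code behind the equals example from the first point, here's a minimal sketch. The Person class and its two fields are hypothetical, and I'm using plain checks in a main method rather than JUnit so the example runs on its own; in a real project each scenario would be its own test case.

```java
import java.util.Objects;

// Hypothetical class for illustration: equality is based on two fields.
class Person {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        // instanceof is false for null, so this also covers the null case.
        if (!(other instanceof Person)) {
            return false;
        }
        Person that = (Person) other;
        // Objects.equals is null-safe on both sides.
        return Objects.equals(firstName, that.firstName)
                && Objects.equals(lastName, that.lastName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName);
    }
}

class PersonEqualsScenarios {
    // Tiny stand-in for a test framework assertion.
    static void check(boolean condition, String scenario) {
        if (!condition) {
            throw new AssertionError("Failed: " + scenario);
        }
    }

    public static void main(String[] args) {
        Person ada = new Person("Ada", "Lovelace");

        check(ada.equals(ada), "same instance");
        check(ada.equals(new Person("Ada", "Lovelace")), "same values");
        check(!ada.equals(new Person("Ada", "Byron")), "different last name");
        check(!ada.equals(null), "compared to null");
        check(!ada.equals("Ada Lovelace"), "compared to another type");
        check(new Person(null, "Lovelace").equals(new Person(null, "Lovelace")),
                "first name null on both sides");
        check(!new Person(null, "Lovelace").equals(ada),
                "first name null on one side only");
        check(!ada.equals(new Person("Ada", null)),
                "last name null on the other side");

        System.out.println("All scenarios passed");
    }
}
```

The method itself is a handful of lines, while the scenarios already outnumber it, and I haven't even checked that hashCode stays consistent with equals.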

When it comes time to point the finger on these things, we need to aim it squarely at ourselves. It's easy to make excuses like those I've listed above, but we have the responsibility as professionals to be better.

While writing tests is hard, there are tools and methods that make it easier. One of the really tough parts is mocking out dependencies, which often takes time to understand even from a conceptual perspective. I've tried several Java mocking frameworks, but have recently started using Mockito, which I find very easy to use and able to handle all of my mocking needs. There are also lots of good ideas for making Unit Testing easier in xUnit Test Patterns.

Yes, it does take more time to write code with Unit Tests than it does without, and yes, we are under pressure to get our products out as soon as possible. Here's a typical conversation I've heard:
Developer: I expect that this task will take me about 2 days
Project Manager: Really? That long? But it's just slapping one more screen into this existing product. You sure you can't do it quicker?
Developer: I suppose you're right, I could probably throw it together based on another page and have it done in 1 day.

Does this conversation sound familiar at all? I constantly see managers and stakeholders who are unhappy with estimates try to pressure developers into shorter ones. Just as often, I see developers who sugar-coat or lower their estimates in order to please others. As professionals, it is our responsibility to be clear about our estimates and account for unit testing and other quality factors in those estimates. We need to educate our stakeholders as to the benefits of unit testing and the consequences of not doing it. If you're not sure what those are, check out this page on Wikipedia.

I'll let you in on a little secret that I've discovered since I've earnestly started practicing TDD. Although it takes a while to write my unit tests, writing the production code is faster. In fact, it's a lot faster. Once my test is written, writing the code to make the test pass is usually dead simple. One big reason for this is that I rarely need to debug my code anymore. I write the code and run the test, which will either pass or fail. If it fails, I can quickly determine the cause and adjust my code. Now, if I am debugging, I find it's typically because I've made a mistake in my test.

In addition, I feel very comfortable making significant changes with very low risk. I can feel free to quickly make any refactorings I feel are needed without the risk of breaking anything.

Recently, I needed to refactor a very complex method that was designed to merge one set of data into a list of another set of data. It was a 150-line method with no pre-existing tests and had several levels of nesting. I decided that trying to modify the code would be unwise without having unit tests to verify it, so I set out to write them, which took maybe an hour or so. Once my tests were written I was clearer on the intent of the method and how to better accomplish it. I was then able to rewrite the method in about 5 minutes. It was now less than 10 lines long, as it was reduced to a single if statement inside a for loop.

Was this success story because I am such a great developer? No, it was simply that I had good tests that made sure a method did what it was supposed to do, so I was able to find the simplest way to do that without making the tests break. The more and more I practice TDD, the more of these types of successes I am having.

I've said a lot, but my point is this. We all know the value of unit testing, but we somehow end up falling short in our delivery. Until we stop with the excuses and make TDD our primary development practice, we will continue to fall short. It needs to be a habitual part of our daily work, and we need to be vocal about it in order to get buy-in from those around us. Having untested code simply isn't good enough. We need to be better.

Sunday, October 25, 2009

Should you use ORM?

Over the last few years, I've seen a heavy trend towards using Object Relational Mapping (ORM) libraries to help manage persistence in Java applications. Is this good or bad? Is ORM the way to go now? To answer that, I think we first need to answer a few questions:

1) What need is ORM addressing?

First and foremost, the primary goal is to map data to Java objects. This is done using some sort of definition which tells the library how to query (and write to) the database.

The other big piece is the "relational part" of the equation. Instead of using a foreign key field on a Java object, you would use another Java object to represent a relationship. You would then use an object setter and getter instead of a potentially meaningless key value in your Java code. This actually makes for fairly clear code when writing your business logic.
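Here's a small sketch of that difference in plain Java, with no ORM library involved. The class names (OrderRecord, Order, Customer) are hypothetical and exist only to contrast the two styles.

```java
// Hypothetical classes for illustration; no ORM library is used here.

// Foreign-key style: the object only carries a key value.
class OrderRecord {
    long id;
    long customerId; // meaningless on its own; needs another lookup
}

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// ORM style: the relationship is another Java object behind a getter and setter.
class Order {
    private Customer customer;

    Customer getCustomer() {
        return customer;
    }

    void setCustomer(Customer customer) {
        this.customer = customer;
    }
}

class RelationshipSketch {
    // Business logic reads naturally: no key lookups in sight.
    static String customerNameFor(Order order) {
        return order.getCustomer().getName();
    }

    public static void main(String[] args) {
        Order order = new Order();
        order.setCustomer(new Customer("Acme Ltd."));
        System.out.println(customerNameFor(order));
    }
}
```

With the foreign-key version, the business logic would first have to go and fetch the customer by id, which is exactly the plumbing an ORM takes off your hands.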

2) Are there other ways to address this?

Of course there are. As far as the basic data mapping goes, this can be readily addressed in a traditional Data Access Object (DAO) with a row mapping method. You could then use plain JDBC if you wish, except the overhead can be quite irritating. Spring does a good job managing this, if you extend their JdbcDaoSupport class. Alternatively, you could use a tool like iBatis to deal with data mapping.
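As a rough illustration of the plain JDBC flavour, here's a minimal DAO with a row mapping method. Everything specific in it is an assumption on my part: the Employee class, the employee table and its id, name and department columns are all hypothetical, and you would need to hand the DAO a real Connection to run the query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity matching a hypothetical "employee" table.
class Employee {
    final long id;
    final String name;

    Employee(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class EmployeeDao {
    private final Connection connection;

    EmployeeDao(Connection connection) {
        this.connection = connection;
    }

    // The row mapping method: turns the current row into an object.
    static Employee mapRow(ResultSet rs) throws SQLException {
        return new Employee(rs.getLong("id"), rs.getString("name"));
    }

    // Hand-written SQL plus the JDBC housekeeping that goes with it.
    List<Employee> findByDepartment(String department) throws SQLException {
        String sql = "SELECT id, name FROM employee WHERE department = ?";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, department);
            try (ResultSet rs = statement.executeQuery()) {
                List<Employee> employees = new ArrayList<>();
                while (rs.next()) {
                    employees.add(mapRow(rs));
                }
                return employees;
            }
        }
    }
}
```

Even in this tiny example most of the lines are statement and result set handling rather than mapping, which is the overhead I find irritating and the part that Spring's JDBC support takes care of for you.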

In order to address the relational aspect, one decent pattern I've seen is to introduce a layer of abstraction between your business logic and persistence layer. Essentially, you would create business objects to encapsulate your entity objects, which are just a direct mapping to a database object. You can then represent object relationships at this level.

Obviously, since these approaches require an additional business layer in your application, and manually coded SQL, the ORM approach can make Java coding simpler.

3) What's the downside of ORM?

XML! Well, that's not entirely accurate: In Hibernate, mappings used to be defined only in XML files. Managing your mappings could be quite complex and error-prone. I've seen a lot of time wasted just in the overhead of dealing with Hibernate XML. A full discussion on XML-based tools is best saved for another day.

Fortunately, Hibernate eventually decided to give you annotations. Now you can define your mappings simply by throwing a few annotations on your entity class. Even better, Hibernate implements JPA, so you don't actually have to explicitly use Hibernate at all, in most cases. JPA gives you a standard that several ORM libraries can implement. This makes it much simpler to change your ORM, for example, to TopLink if you choose.

A specific problem I have with ORMs is the query DSLs you need to use in place of SQL. Since you are dealing with objects instead of database tables, there are special languages, such as JPQL (in JPA) or HQL (in Hibernate), which your developers have to learn in order to do any non-standard queries.

In addition to the learning curve, these queries have the added problem that performance can be unpredictable. Since the query DSL will be translated by the ORM into SQL, it is difficult to know what SQL will actually be executed. Unless your ORM provides you some sort of flexibility in this regard, tuning can become a huge issue. While I don't advocate trying to fix performance bottlenecks before they are identified, there is wisdom in ensuring your design gives you flexibility in this regard.

Finally, there are the gotchas. You need a good understanding of session and transaction management, as well as dealing with caching and lazy-vs-eager loading. These things are manageable, but be warned!

4) Are there any other benefits?

With the maturity of ORMs, the tooling has also come a long way. Tools have become available that help to simplify your ORM implementation, even to the point of auto-generating your objects based on a database schema.

The benefits swing the other way, as well. With some ORM tools, you can define your objects and mappings and actually use these to generate your database schema. This can be a big win.

Final Answer: Should you use ORM?

It depends. Based on what I've seen, there is a place in Java development for object relational mapping. The biggest win is helping to simplify your design. That said, you also assume some risks and add overhead complexity to your application.

On a fairly simple project, ORM may be the ideal way to go. On larger, more complex projects, you might want to use a different approach with a slightly bigger design. Where appropriate, you may even want a hybrid using ORM for general purpose and a different approach for the more complex pieces. Of course, doing this comes with significant risks of its own.

The bottom line is that you need to ask yourself what is most appropriate for your project based on:

project complexity
project needs
the team's skill set
comfort level
cost vs benefit

Saturday, October 3, 2009

Keep it Simple, Stupid

Yes, I know it's overused, but it really can't be overstated. Many of today's common coding practices really boil down to the KISS principle. But, really what does that mean? I've noticed that the term simple means different things to different people. Here's what Merriam-Webster says: readily understood or performed

That's actually a pretty simple explanation. How does this apply to software development? Here's the bottom line: Design applications that are readily understood

In my years involved in development, a large amount of my time has been spent maintaining applications. Whether by my own machinations or those of others, I frequently see this seemingly easy rule violated, which just makes maintenance a pain-in-the-butt chore.

In writing this blog, my primary goal is to rail against overly complex application design and to promote methodologies that will result in more maintainable code. This will make you, whoever maintains your code, and whoever is paying you much happier people. My main areas of focus will be on usage of the Java language and related frameworks, as well as principles of software construction in general.

In closing this first entry, here's a couple of books I would suggest that really address these topics:

Code Complete: A Practical Handbook of Software Construction - Steve McConnell
Refactoring: Improving the Design of Existing Code - Martin Fowler

In Code Complete, Steve McConnell repeatedly states that "Software's Primary Technical Imperative is managing complexity". To paraphrase: Keep It Simple, Stupid!