Sujan Kapadia

Saturday, September 22, 2007

Unit Testing

Disclaimer: I'm no expert in testing, wouldn't even say I have much experience in it or do it consistently, but the times when I've done it, been disciplined about it, and tried to do it right, it really paid off, and that's why I'm writing this entry...

Verifying behavior / Demonstrating correctness

This is the obvious one. An automated test is a great way to verify that a program behaves as expected. An automated test is a great way to verify that a program continues to behave as expected. If expectations happen to change (and they always do), go ahead and modify the tests. Of course it's never just that easy but you get the point... You write a test to confirm your expectations of a program.

Notice how I said "your expectations"... "Yeah of course my program works, I know it'll return B when I pass in A"... And there's the inherent bias when writing your own tests. Reducing that bias is essential to effective testing, but I always find it difficult to do. Hehe maybe the more experience I get at unit testing, the more it will bias me towards NOT trusting my own code. Isn't that strange... as I start to trust my code less and less, I'll write more tests, and therefore my code will be better. I think test-first development helps to reduce the bias because you're writing test code first... before the production code has had a chance to bewitch you into thinking it's beautiful and perfect... but alas I find test-first development very hard to stick to after the first few days of development... kind of like going to the gym right after work...

So let's rephrase it: You write a test to confirm your expectations, the business analyst's expectations, the users' expectations, the runtime environment's expectations, etc. of the program. Poor little unit test, that's a lot for you to shoulder all by yourself... wouldn't it then be nice if everybody agreed on expectations... that's probably harder than the hardest programming problem out there.

Handle a range of inputs (expected and unexpected):

Unit tests help you to determine the range of input values that do work and compare them against the range of input values that should work. Which values should you be handling that are not currently working? What unexpected values can I send to my code? How does it respond? Null objects, boundary conditions, bad types, etc. As you do this research (I believe unit tests involve a lot of research), you can add more cases to your code, deal with unexpected values... which makes your code more robust. Can your code handle the values you expected to come from the external system? I think this is where unit tests shine. You can verify your code conforms to the interface agreed upon between you and the external system. This can save you from many embarrassing errors and arguments during integration tests.

As I write a unit test I'm always amazed at input values I forgot to consider...
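Here is a minimal sketch of what probing a range of inputs can look like. Everything in it is made up for illustration: the `QuantityParser` class, its 1 to 100 range, and the `rejects` helper are assumptions, not code from my project. A real test would normally live in a framework such as JUnit; plain Java is used here so the sketch stands on its own.

```java
// Hypothetical code under test: parses a quantity that must lie in [MIN, MAX].
class QuantityParser {
    static final int MIN = 1;
    static final int MAX = 100;

    // Returns the parsed quantity, or throws IllegalArgumentException
    // for null, non-numeric, or out-of-range input.
    static int parse(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("quantity is null");
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a number: " + raw, e);
        }
        if (value < MIN || value > MAX) {
            throw new IllegalArgumentException("out of range: " + value);
        }
        return value;
    }
}

// Hypothetical test helper: true when the parser refuses the input.
class QuantityParserCheck {
    static boolean rejects(String raw) {
        try {
            QuantityParser.parse(raw);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

The cases worth writing down are the boundaries ("1" and "100"), the values just outside them ("0" and "101"), and the unexpected ones: null, an empty string, and text that is not a number.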

Program consistency / Expose conflicting requirements and inconsistencies:

I believe unit tests can aid in exposing conflicting requirements or inconsistencies. Do two different parts of the program interpret the data the same way? Is a requirement driving one feature in the program compatible / consistent with one driving another feature that interacts with it? A lot of times, as you're developing and the spec is not fully complete (is it ever 100% complete?), different business analysts may tell you different things. Or a consultant may tell you an external system returns a set of values when in fact it returns something completely different. When the code under test is executed, these inconsistencies can quickly surface.

Just recently a consultant told me that a certain field in a message coming from an external system could be one of two values, both strings. I had written code that depends on the field and had defined constants for these values. Days later, when I finally got a sample message from the external system for testing (by which time I had completely forgotten about this field and its values), I ran it through a unit test and it failed. I looked at the message and saw that the field was not one of the two values; in fact the values weren't even strings, but integers! The consultant had very limited time (because working with developers was not billable time, but that's another story). I ended up having to do my own research about the external system and actually look at some of their code to find out the list of values for this field. I modified my code and then the unit test passed. If it had gone into production like this it would have failed miserably.

Now I'm not saying that unit tests are a panacea for miscommunication, but in certain instances they can help big time.
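To make the story concrete, here is a rough sketch of how such a field could be mapped so that a unit test run against a real sample message fails loudly on a mismatch. The `OrderStatus` name, its two constants, and the codes 1 and 2 are all invented for illustration; they are not the actual field or values from the external system.

```java
// Hypothetical status field: the external system sends integer codes,
// not the strings originally described.
enum OrderStatus {
    OPEN(1),
    CLOSED(2);

    private final int code;

    OrderStatus(int code) {
        this.code = code;
    }

    // Maps the raw field from a message to a constant.
    // Throws IllegalArgumentException for anything unexpected, so a test
    // fed a real sample message surfaces the mismatch immediately.
    static OrderStatus fromField(String rawField) {
        int parsed;
        try {
            parsed = Integer.parseInt(rawField);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Status field is not an integer: " + rawField, e);
        }
        for (OrderStatus status : values()) {
            if (status.code == parsed) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown status code: " + parsed);
    }
}
```

The point is not the enum itself but that the assumption about the field lives in one place, where a single test with real data can contradict it.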

Forces you to deal with exceptions (or your users will) / Forces you to eat your own dogfood:

Guess what... your code can throw exceptions... and it should... but nasty HTTP 500 errors, pages with stack traces that piss off the end user, threads silently dying, etc. are nightmares! Unit tests help you exercise different paths through code and therefore force you to think more about error handling. What will you do when the error occurs? Simply log it? Set a boolean flag? Return an error code? Throw an exception? I guess it's hard to explain this in words, but having to execute your own code helps you make these kinds of decisions.

Application exceptions: What application exceptions need to be caught? When do they actually occur? Which ones make sense and are useful to the user and which ones just unnecessarily interrupt program flow? How should they be named (and perhaps placed in an exception class hierarchy) so clients of your code understand what the error means and readability is increased? Remember those crazy things called use cases? Ideally a use case should describe exceptional and alternative flows. I will admit I've only formally dealt with use cases in academic situations (at work I've always worked with monolithic specs, no specs, email design discussions, but never formal use cases), but understanding at a system level what exceptions occur helps me to better define my domain classes and exceptions: it helps me express intent more clearly.

Runtime exceptions: Count your lucky stars if you get runtime exceptions during unit testing... Now you know some of the truly unexpected situations that your code allows. You can handle these directly (not usually the best way) or perform checks in your code so they don't occur (like checks for null objects). This directly helps to make your code more robust.

For example I'm writing some classes right now that map fields from one set of JAXB generated objects (that conform to an external company's schemas) to another set of JAXB generated objects (that conform to our schemas... you must be wondering why I'm not using stylesheets. Personally I'd rather deal with Java code than XSLT and I don't know yet if it will be worse in performance than stylesheets). If an XML element is empty, JAXB will convert that to a null value. As I was reading values from the JAXB object, I got a NullPointerException because I wasn't checking for nulls and my unit test blew up. Mind you this code is going to be executed within a web service, so I didn't want an uninformative SOAP fault to be returned.

So it made me think about the end user... Hmmm we don't want them to get a useless, generic SOAP fault... hmmm what is the root cause of this error? Certain fields are required and therefore can't be null. We somehow need to check for these required fields and throw an appropriate exception / fault that indicates what required fields are missing and what object they belong to. I was able to add the following method to my abstract base class and allow each of my mapping classes to override it:

protected void getRequiredFields(...) throws RequiredFieldMissingException

Now in each class I was able to place the code for the required fields in the overridden method. Then a template method in the abstract base class called this method before further processing. In one fell swoop all of my subclasses now have required field validation!! The template pattern is great when used appropriately.
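For readers who haven't seen the template pattern, here is a minimal sketch of the arrangement just described. Only the method name `getRequiredFields` and the exception name `RequiredFieldMissingException` come from my code; the parameter list, the `AbstractMapper`, `CustomerMapper`, and `ExternalCustomer` classes, and the mapping itself are assumptions made up to keep the example self-contained (no JAXB involved).

```java
import java.util.ArrayList;
import java.util.List;

// Exception naming the object and the fields that were missing.
// The constructor shape is an assumption for this sketch.
class RequiredFieldMissingException extends Exception {
    RequiredFieldMissingException(String objectName, List<String> missingFields) {
        super(objectName + " is missing required fields: " + String.join(", ", missingFields));
    }
}

// Hypothetical abstract base class for all mappers.
abstract class AbstractMapper<S, T> {
    // Template method: always validate required fields before mapping.
    public final T map(S source) throws RequiredFieldMissingException {
        getRequiredFields(source);
        return doMap(source);
    }

    // Hook that subclasses override; by default nothing is required.
    protected void getRequiredFields(S source) throws RequiredFieldMissingException {
    }

    protected abstract T doMap(S source);
}

// Stand-in for a generated source object whose empty elements become null.
class ExternalCustomer {
    final String id;
    final String name;

    ExternalCustomer(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical concrete mapper that declares its own required fields.
class CustomerMapper extends AbstractMapper<ExternalCustomer, String> {
    @Override
    protected void getRequiredFields(ExternalCustomer source) throws RequiredFieldMissingException {
        List<String> missing = new ArrayList<>();
        if (source.id == null) missing.add("id");
        if (source.name == null) missing.add("name");
        if (!missing.isEmpty()) {
            throw new RequiredFieldMissingException("ExternalCustomer", missing);
        }
    }

    @Override
    protected String doMap(ExternalCustomer source) {
        return source.id + ":" + source.name;
    }
}
```

Because `map` is final, a subclass cannot skip the validation step; it can only say which fields are required. A caller at the web service boundary could then translate the exception into a fault that names the missing fields instead of surfacing a NullPointerException.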

This unit test caused me to refactor my code to make it much more readable, maintainable, robust, extensible... All because of a NullPointerException, and it didn't take much time!!

Catch stupid mistakes before they become costly (and embarrassing) / Inadvertent testing:

try {
    doSomethingStupid();
} catch (StupidMistakes e) {
    logger.log(Level.INFO, "Stupid is as stupid does");
}

Null objects, improperly initialized objects, missing resources, wrong logical operator usage (I don't want to tell you how many times I've done this), infinite while loops, unintended recursion (what you don't think this can happen?), array out of bounds, class cast exceptions, you get the picture.

One of the benefits of writing unit tests is inadvertent testing. Sometimes you're testing one feature and it expects things to be in a certain state. When you run the test things horribly fail but not because the feature didn't work as expected... Because you forgot to set up some objects (preconditions) or a path in the code was exercised that you weren't expecting. These kinds of failures are great because they give you a better understanding of your code, help you to come up with more unit tests, and help you to specify the preconditions. Why did that object need to be created? Was it really necessary, can it be refactored - placed into another method or class?... It gets you thinking more and more...

Side Note 1: Any semi-competent developer should know how to build truth tables and finite state machines. They've always helped me clearly enumerate the different cases and write clear code.
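As a small illustration of the truth table idea (and of catching the wrong logical operator), here is a sketch that checks a boolean rule against every row of its table. The `canShip` rule and its two inputs are hypothetical, chosen only to keep the table short.

```java
// Hypothetical rule under test: ship only when the order is paid AND in stock.
class ShippingRules {
    static boolean canShip(boolean paid, boolean inStock) {
        return paid && inStock;
    }
}

// Checks the rule against an explicit truth table.
class TruthTableCheck {
    // Each row: paid, inStock, expected result.
    static final boolean[][] TABLE = {
        {false, false, false},
        {false, true,  false},
        {true,  false, false},
        {true,  true,  true},
    };

    // Counts the rows where the rule disagrees with the table.
    static int countMismatches() {
        int mismatches = 0;
        for (boolean[] row : TABLE) {
            if (ShippingRules.canShip(row[0], row[1]) != row[2]) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

If somebody accidentally types `||` instead of `&&`, two of the four rows disagree and the check reports it, which is exactly the kind of stupid mistake I keep making.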

Side Note 2: On another side note, I've noticed that management and customers seem to like "automated test suite" more than "unit tests"...

Scenario 1:
Wha wha what! You're writing your own unit tests?? Look at the Gantt chart, it says development here!! Anyway isn't that what we have a test team for? Wha wha what! You wrote more lines of test code than production code? I think that's against the law son...

Scenario 2:
So the automated test suite runs by itself AND tells me what works and what doesn't? I can go to the breakroom for coffee and be productive all at once? Why didn't we do this before? Anybody can run it? Yes (even you sir).

So next time you're working on a new project, tell your team lead or manager that you plan on having an automated test suite... It just sounds cooler and is an easier sell during meetings and presentations..

Do fonts affect the process of reading?

Do you think fonts affect how we interpret text? What we assign importance to? What we remember and what we don't after reading a body of text? For example, using bold and italics signals what the author thinks is important or most relevant, but what if you read the same text with different words italicized and bolded? Or read the entire text in a completely different font... Hmmmm....

Friday, September 21, 2007

Stream of Consciousness for the Week

Okay every week I will write one "stream of consciousness" entry where I don't do even a minimal amount of proofreading or editing... I start with an idea and take it wherever it leads me...

Can browsers take care of completing difficult URLs for us, or making it easier to identify sites? Where is the value in the URL these days? You can bookmark, create direct quick links on the browser window. Placing the URL on your business card, vehicle, company building: initially the URL gives you an identity - so initially a user has to enter a URL, the easier to remember the better... easier URLs that clearly identify the company can spread much more quickly, by word of mouth, etc. If somebody decides to quick link a URL, that means it's probably very important to them, something they visit very frequently. Does Firefox or Internet Explorer provide an option to send them data on what quick links a user has created? Can such a plugin be created? What prevents users from developing Firefox plugins that do "sneaky" things like this? Is it on the honor system?

Google has so much power because it presents links to relevant, high quality information quickly and early on... the URL is not so much important as Google deciding to display the link and where it displays it. So Google changed the economy of the Internet. Users don't want to enter or remember URLs... they are the only way to uniquely identify nodes on a graph, a one-to-one mapping (Google instead takes advantage of the "geometric" properties of a graph to provide amazing results).. We'll deal with the graph and its continually changing structure... you just focus on the information you want. This is why Wikipedia is so awesome too... They take care of the structure and representation.. you just focus on the information you want.. they take it one step further... you can change the information at will (in theory).... Graphs are natural structures... graphs / interconnectedness will form automatically in nature..

New knowledge that can be generated from existing knowledge / data cannot really be represented on a graph.. It doesn't exist yet and it is not known what directed edges it would have. The potential to generate new knowledge: different combinations of existing data, new connections between existing data (it depends on how far we want to break down the knowledge (really each node in a "knowledge" graph is itself a graph, recursive)). So there is a whole body of knowledge that can be discovered from existing knowledge... it already exists but we need to identify the combinations of nodes that represent this derived knowledge or break nodes themselves into graphs to find more basic units to assemble and disassemble. Then there is a whole body of knowledge that doesn't exist yet and hasn't happened yet (the event has not occurred in our temporal frame)... I don't know if I'm going anywhere with this...

Jokes that really aren't jokes and sarcasm

A colleague came by the cube this morning: "Is it done yet? Come on is it done yet? Keep typing, keep typing" (followed by a few chuckles and then a scary grin). People always ask this in a joking manner, but come on are they really joking? Sometimes I can't stand it when people mask their true feelings with a joke. Just come out and say what you want!!

I wonder how different cultures mask their feelings, how they make jokes... I think it'd be very interesting to study sarcasm in different cultures and languages. What words, expressions (verbal and facial) constitute sarcasm? I think by analyzing sarcasm one can get a glimpse of the kinds of things a culture values, what's important to them, what is dishonorable, what's forced upon them by their society. Sarcasm is a great way to release social tension... So then what builds up that social tension? Of course this is just a specific instance of using language as a window into culture and humanity.

I look back and wish I had followed my heart and gone into linguistics, Spanish, international studies, ethnology, ethnography...

Thesaurus.com is your friend!

When I'm stuck on naming classes, interfaces, etc., I quickly think of all the names that come to mind, go to thesaurus.com, and see what other synonyms exist. I think it's a great way to take a general word (abstract) and go to more specific words (concrete) and vice versa. Sometimes I also take a look at the antonyms; it helps me reinforce the meaning of the word. Knowing what a word doesn't mean is just as important :). It's scary how much language takes a part in shaping our ideas, but I digress...

FYI being stuck on something is not always a bad thing! It's a sign that you care about your craft. You're willing to spend cycles on coming up with solutions that may have a good return on investment. But this ain't a good thing if you have a demo coming up in a week!

Don't Keep All Of It In Your Head

For the better part of this afternoon I was stuck on discovering the appropriate abstractions, whether any of the abstractions were of actual value, and what to name my classes. I had identified a common set of operations, a common sequence of code... Initially I kept mentally pacing back and forth without producing anything of value...

My personal definition of an abstraction: a way to view the problem at a higher level; a common thread or concept in multiple problems; a pattern in the problem space; a class of operations that will follow a similar pattern or "recipe".

I was motivated by the following goals: immediate and future code reuse, decoupling, and cohesion (all of which lead to good -"ilities").

Some problems that arise when trying to achieve this:
1. Where to place the code to maximize the benefit of reuse
2. Where to place the code without breaking good OO principles
3. How to anticipate future needs
4. How to name a class and what it's responsible for (basically responsibilities should be evident from the name and the name should make sense when the responsibilities are enumerated)

So finally I was able to wrestle myself out of this and come up with a meaningful and useful abstraction by drawing some quick and dirty diagrams. Now I almost always like text better than diagrams because it usually has more substance. But I've found that diagrams help me to untangle ideas, separate concepts, focus, and organize my thoughts (whereas when I'm writing text it's easy for me to jumble up my thoughts, jump from one idea to another without any structure. For example right now I'm writing four different blog topics at the same time, I can't completely focus on one thing, I keep "forking" to use Unix parlance.).

Anyway when I started drawing a simple high level diagram of the major components and how they interact, I was able to more clearly see the responsibilities of each component and what the interactions truly meant. I realized that one of the components at a high level was just a generic "connector" to one of our web services and voila an abstraction was born. I was able to identify the boilerplate code and encapsulate it in a well-named abstract class and inherit from this for my specific operations. (BTW I'm purposely avoiding using UML terminology, why bother...)

Why does diagramming work so well? In short by making a diagram "You don't have to keep all of it in your head". If I were to add something to the C2 Wiki, it would be "DontKeepAllOfItInYourHead"... ha.. that makes it look more official!