GUI Testing Rehab: Can we start saying NO?

Testing GUIs has been hard, tedious, painful... just bad. But GUI tests have been an occupational hazard due to the lack of feasible alternatives.

There's a hard-earned confidence you get when you see a dancing UI twisting, turning... testing itself. And vendors smelled that from miles away... and then they homed in with tools. Over-simplified demos were given, influential people were influenced, buying decisions were made, tools were thrust on unsuspecting people... "The horror... the horror..."

But I digress. GUI tests have been problematic because:
  • Flaky: A test that has always passed fails for no apparent reason one fine day. Re-running it makes it pass (the flashing-lights anti-pattern), but now a different test fails intermittently. Trust goes down... tests get commented out... a dangerous path to tread.
  • Fragile: Vulnerable to UI/UX changes. A test breaks because someone turned a textbox into a combobox... or worse, someone redesigned a commonly used dialog. Time to throw someone into "the hole" again to record-n-replay / fix all those broken tests.
  • Time to develop/write: Writing UI tests = tedium. Getting them to "stabilize" takes a while. But "we use a record-n-replay tool!"... put a pin in that one.
  • Time to execute: Don't hold your breath; these tests take a while. Waiting for windows to pop up, scraping info out of controls, etc.
  • Quirky controls: There are always some automation party poopers - third-party controls that don't exhibit standard behavior, or that the tool simply refuses to "see". But the UI is already "done"... time to call in some specialists.
  • Vendor lock-in and specialists: Our resident expert has vanished without a trace... who can write/fix the tests? (Shrugs all around.) Instant emergency: "We surely can't swap tools now. How quickly can we hire someone who speaks ToolX?"
  • Misc dept: Handling failure/error dialogs so that one test doesn't block or wreck subsequent tests; tests sensitive to OS version, theme, screen resolution, etc.

"Enough!" you say. Is there any hope in this post at all?

Let's tackle them one at a time.
Fragility / UX sensitivity
What if we could extract named actions (a set of building blocks) that we could then use to build up our tests? Think Lego blocks (named actions) combining into models (tests), limited only by your imagination and time.

e.g. let's say I want to test that my (unnamed) mail client can receive emails:

    [Test]
    public void CanReceiveEmails()
    {
        testMailServer.Setup(DummyEmails).For(username, password);
        mailClient.Start();
        mailClient.AuthorizeOfflineMailStoreAccess(datafile_password);
        mailClient.LoginToMailServerAs(username, password);
        mailClient.SendAndReceiveAllFolders();
        var actualEmails = mailClient.GetUnreadEmails();
        Assert.That(actualEmails.Count, Is.EqualTo(DummyEmails.Count));
        // more comprehensive checks for message content...
        mailClient.Stop();
    }


So there, we have identified the actors in our test (I'll call them Drivers henceforth) and the corresponding keywords/actions that we need them to offer. How did that help us, you ask?

We have removed all traces of the UI from the test. So let's say LoginToMailServerAs changes from a modal application window to an inline standard widget provided by the specific mail server implementation. All I need to fix now is the implementation of the LoginToMailServerAs action, and all my tests stay unchanged.
Also, everyone can now invoke LoginToMailServerAs as a magic incantation without worrying about how it works... it just does!

Separate intent (WHAT you want to do) from implementation (HOW you're doing it): compared to a run-of-the-mill UI test, the above test is much more readable - easier to read, understand, and fix/maintain.

Time to write
It still takes time, but less and less as the store of named actions grows. Every keyword/action needs to be implemented only once... write once, use it wherever you need it.

We've also lowered the technical expertise needed to write a test. Given the "drivers" (cohesive clumps of named actions), the requisite tooling, and a brief walkthrough of the existing drivers, someone can discover the APIs needed to choreograph a specific script - a test. Focus on testing/thinking rather than automation/coding.
Vendor lock-in and Specialists
  • The decline of the specialists: "That looks almost like an xUnit-style test!" You're observant. Yes, you can leverage whatever your developers are using for unit tests - which means anyone can now write a test. No more dependency on specialists, no learning curve for mastering a proprietary tool, no magic-tool licenses to buy. More money to distribute among the team (that last part is still fiction... but I'd bet you'll have a really motivated team the day after :)
  • Encapsulate tools: The tool is bounded by the box exposing the keywords. No one outside this box (the driver) knows that you're using, say, White. This makes the tool replaceable and the choice of tool a reversible decision.

But how do we implement the HOW, i.e. the keywords? Enter the Drivers themselves.

You could use an open-source library like White (or an equivalent) that can launch or attach to a running instance of a GUI app, find windows/controls, and poke them - anything that helps you implement the ControlLocator role shown later.
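For instance, here's a minimal sketch of a driver exposing the LoginToMailServerAs keyword via White. The class name, the window title and the automation ids are assumptions about the app under test, and the namespaces are those of the TestStack.White incarnation of the library:

    using TestStack.White;
    using TestStack.White.UIItems;
    using TestStack.White.UIItems.Finders;
    using TestStack.White.UIItems.WindowItems;

    // Illustrative driver: encapsulates White so that no test ever sees it
    public class MailClientDriver
    {
        private readonly Application application;

        public MailClientDriver(Application application)
        {
            this.application = application;
        }

        public void LoginToMailServerAs(string username, string password)
        {
            // "Login", "username", "password" and "ok" are assumed
            // window titles / automation ids of the app under test
            Window login = application.GetWindow("Login");
            login.Get<TextBox>(SearchCriteria.ByAutomationId("username")).Text = username;
            login.Get<TextBox>(SearchCriteria.ByAutomationId("password")).Text = password;
            login.Get<Button>(SearchCriteria.ByAutomationId("ok")).Click();
        }
    }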


Flakiness 
It depends on your choice of UI controls and your automation library. e.g. with C#/WPF applications running on Windows, I've found White to be pretty robust - less than a 5% chance of White playing truant.

Enter VM/PM Tests
"That's it ??!! These are still UI Tests! What about writing all that nasty UI Automation code?" White has wrapped the nastiness within a bunch of control-wrapper types. (You could add your own too). However for special controls, you'd still need to get your hands dirty.

"But these tests still crawl!"

Beyond this, the target application has to be (re-?)structured or, as the self-righteous phrase goes, 'designed for testability'. Here's one idea that should work...

All of the remaining issues are due to the GUI. There are so many types of UI controls to automate, and waiting for windows and finding controls in large hierarchies takes time. What if I slice the UI out?
e.g. let's consider the Login named action (which involves bringing up the login dialog, entering the username and password, and clicking OK).

What if we design it such that the UI is thin (devoid of any code/logic) and merely "binds"/maps to fields and methods in a backing class? Updating a control then triggers the backing field to update and vice versa, and an action like clicking a button triggers the underlying method.
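To make that concrete, here's a minimal sketch of such a backing class for the Login dialog. The names are illustrative; a real WPF view would bind its textboxes to these properties and its OK button to a command that calls Ok():

    using System.ComponentModel;

    // Illustrative backing class that the thin Login view binds to
    public class LoginViewModel : INotifyPropertyChanged
    {
        private string username;
        public string Username
        {
            get { return username; }
            set { username = value; OnPropertyChanged("Username"); } // keeps textbox and field in sync
        }

        private string password;
        public string Password
        {
            get { return password; }
            set { password = value; OnPropertyChanged("Password"); }
        }

        public void Ok()
        {
            // the same logic the OK button triggers via its wiring
        }

        public event PropertyChangedEventHandler PropertyChanged;
        private void OnPropertyChanged(string name)
        {
            var handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs(name));
        }
    }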

This technique has been known for some time now (Presentation Model - Martin Fowler (2004) - or Microsoft's variation, MVVM, which leverages .NET's built-in data-binding to make WPF apps faster to develop).

The only thing the UI contains is the layout, the controls, and the wiring to the underlying class. (Even the wiring can be automatic if you move into the realm of advanced MVVM - look for Rob Eisenberg's MIX talk, which uses conventions to auto-bind.) More importantly, most of the code (and, as a corollary, most of the bugs) has moved into a testable class - the ViewModel / PresentationModel. The whole app is basically a symphony orchestrated by multiple presenters.

So instead of fidgeting with the UI, I can now just assign the desired values to the corresponding properties and invoke the OK method to simulate the whole login process. Much better - plain method calls. What if I could load the whole app from the ViewModel layer down in my test process? That'd be great.
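The VM flavor of the LoginToMailServerAs keyword could then look like this - no windows, no control hunting, just properties and a method call (assuming the illustrative LoginViewModel above and a ViewModelLocator like the one shown further down):

    // Inside a VM-flavored driver: same keyword, no UI involved
    public void LoginToMailServerAs(string username, string password)
    {
        var login = viewModelLocator.GetViewModel<LoginViewModel>();
        login.Username = username;
        login.Password = password;
        login.Ok(); // the same code path the OK button's wiring would take
    }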

Benefits:
  • Time to develop: No need to write UI automation code. Just call existing methods and set properties that the developers have already created as part of the implementation. Quick, simple, and easy.
  • Time to execute: No more flashing windows, looking for controls, and manipulating them. If you can load the whole app sans the UI layer within your test process, you are effectively creating a bunch of objects, toying with them, and then letting the garbage collector clean them up. It's way slower than a unit test (because you're using all the real services, data stores, devices, etc.) but faster than a traditional GUI test. (Presentation-intensive tests will show a bigger gap than tests that spend most of their time talking to a slow hardware device. YMMV.)
  • Quirky anti-automation controls: Buh-bye! Instead of grappling with a third-party tree/grid that doesn't want to be found, you can just reach into the VM/PM layer and grab the friendly in-memory collection (that the control binds to) from the corresponding ViewModel/Presenter.

But wait, it gets better...
  • Decoupled the testers from the implementation: As long as you give them some key information, the testers can start writing the tests.

    
    // What the UI-driving flavor of a Driver needs: find windows and controls
    public interface ControlLocator
    {
        Window GetWindow(string title);
        T GetControl<T>(Window parentWindow, string automationId) where T : Control;
    }

    // What the UI-less flavor needs: hand out ViewModels to poke at
    public interface ViewModelLocator
    {
        T GetViewModel<T>() where T : ViewModel;
    }
What testers need to know to implement the Drivers:
  1. For UI tests: ParentWindow + ControlType + unique ControlId
  2. For VM tests: ViewModelType + PropertyName/CommandName
(A White-backed sketch of the ControlLocator role follows this list.)
  • Enable test-first: Testers don't have to wait till the whole thing is implemented to write the tests. With record-and-replay style tools, you'd have to wait till the development team gives you a running application before beginning test automation. This is especially important for teams practicing one of the Agile methods - you could now enable them to move up to ATDD.
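As promised, here's what a White-backed take on the ControlLocator role might look like. This is a sketch: the namespaces are TestStack.White's, and White's Window/UIItem types stand in for the Window/Control placeholders in the interface above:

    using TestStack.White;
    using TestStack.White.UIItems;
    using TestStack.White.UIItems.Finders;
    using TestStack.White.UIItems.WindowItems;

    // Sketch: the ControlLocator role implemented with White
    public class WhiteControlLocator
    {
        private readonly Application application;

        public WhiteControlLocator(Application application)
        {
            this.application = application;
        }

        public Window GetWindow(string title)
        {
            // White polls the UIAutomation tree until the window shows up (or times out)
            return application.GetWindow(title);
        }

        public T GetControl<T>(Window parentWindow, string automationId) where T : UIItem
        {
            return parentWindow.Get<T>(SearchCriteria.ByAutomationId(automationId));
        }
    }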
So let's do a recap:

Current: We started at the top, where teams have heavy investments in GUI testing. These tests are work magnets: maintenance-heavy, sucking in team resources, with a high cost-to-benefit ratio compared to the PM/VM tests.

Target: By identifying reusable actions with intention-revealing names, we can construct tests much faster than before and at less cost (programmers will recognize this as the Extract Method refactoring).
Further, by peeling off the UI layer, we get a scriptable interface (an API, so to speak) to the target application. We can write most of the system-level tests without the UI... most teams still like to write some UI tests just as backup.

Stretch: Finally, IF a team resolves to write comprehensive unit tests (such that most bugs don't make it past the green section), uses VM tests to catch integration defects, and makes every defect an opportunity to fix the process, you could STOP writing UI tests altogether (James Shore is a proponent and seems to have had success with this). The time saved on UI automation can be put to better use: exploratory testing. Not all teams will get here... but if you make it, you'll never want to go back. You'd be able to deliver more features per unit time.

So what do you need to do?

  • Get enough user context to create a library of named actions, called keywords by some. Tests are written in terms of these keywords. Remember: what, not how. e.g. EnterUserNameField() or ClickLogin() is bad; ask "WHY?" to chunk up and you should reach Login(username, password) - see the sketch after this list.
  • Let testers step into the shoes of the user and shape this interface outside-in. Pair them with a good programmer to ensure you end up with a "discoverable API", i.e. one that's easy to figure out on your own given tooling support (e.g. IDE Intellisense).
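A quick before/after of that chunking-up idea (the driver object and its methods are hypothetical):

    // Bad: the HOW leaks into every test - these break when the dialog changes
    driver.EnterUserNameField("bob");
    driver.EnterPasswordField("secret");
    driver.ClickLogin();

    // Good: one intention-revealing keyword captures the WHAT
    driver.Login("bob", "secret");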

For UI-less tests,

  • Follow a design technique like MVP or MVVM. Minimize code in the UI... so that it's easier to test.
  • Ensure that you do not need the UI to start an instance of your SUT/application. Have a composition root (e.g. a Main() function where the app comes together).
  • Abstract out the user interaction. You can't pull a MessageBox or ShowDialog() out of thin air in ViewModel code; instead, you create a Role, e.g. User. The production implementation of User will probably pop up dialogs; when you need to test without the UI, you replace it with a fake object controlled by your test. (A sketch of the last two bullets follows this list.)
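Here's a minimal sketch of those last two ideas together - a User role plus a composition root. All the names (IUser, DialogUser, FakeUser, MailClientViewModel, AppRoot) are illustrative, not prescriptive:

    using System.Windows;

    // Role: how the app asks things of the human
    public interface IUser
    {
        bool ConfirmDiscardChanges();
    }

    // Production implementation: pops a real, blocking dialog
    public class DialogUser : IUser
    {
        public bool ConfirmDiscardChanges()
        {
            return MessageBox.Show("Discard unsaved changes?", "Confirm",
                MessageBoxButton.YesNo) == MessageBoxResult.Yes;
        }
    }

    // Test double: answers are scripted by the test; no UI is ever shown
    public class FakeUser : IUser
    {
        public bool AnswerToDiscardChanges = true;
        public bool ConfirmDiscardChanges() { return AnswerToDiscardChanges; }
    }

    // A ViewModel never conjures up dialogs; it talks to the User role
    public class MailClientViewModel
    {
        private readonly IUser user;
        public MailClientViewModel(IUser user) { this.user = user; }

        public void Close()
        {
            if (user.ConfirmDiscardChanges())
            {
                // discard changes and shut down
            }
        }
    }

    // Composition root: production Main() passes a DialogUser,
    // tests pass a FakeUser - no UI needed to bring up the app
    public static class AppRoot
    {
        public static MailClientViewModel Compose(IUser user)
        {
            // ...construct real services, stores and the ViewModel graph here...
            return new MailClientViewModel(user);
        }
    }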

Now, the preceding bullets are easier said than done for zillion-line legacy apps. For greenfield projects, though, I find this a very enticing alternative - there is no reason not to build it in, test by test. We've crossed out most of the perils of UI tests.

Feel free to question / enhance / criticize with objective reasons / list pros and cons... In the words of the Human Torch: "Flame on!"