
The Way We Work: Test Automation

Test automation includes any configuration or code that verifies the features of your system against expected outcomes. Automated testing unlocks the true value of a DevOps-focused software delivery pipeline. It improves flow, validates the functionality you are delivering, ensures that existing functionality still works as expected, identifies issues that suddenly appear in production (monitoring), and, most importantly, gives you the peace of mind to let the features flow.

I have worked on projects that did not include automated testing, and there was always that moment before deploying when you had to ask yourself: is this good enough to release? When did the testing happen, how much testing did we do, and what else in the environment has changed that might interfere with this release? There are so many questions and risks to weigh before deploying. The right approach to automated testing within a DevOps delivery pipeline can make that process so much easier.

Types of automated tests and where to include them in your pipeline

Several types of automated testing provide different benefits and have unique tradeoffs. You do not need 100% coverage of all types throughout your entire platform. Having too much testing will add to your initial development time and lead to more test maintenance time. Let us examine each one and focus on its benefits and tradeoffs. 

Unit tests 

Unit tests focus on the smallest level of code possible. These are tests that exist with your code and ensure that the unit (typically a component, method, class, or UI element) behaves as expected. Unit tests typically mock out external dependencies. 

Unit tests should be fast, typically a few milliseconds each, and should run with every code change. You should run your unit tests locally before you commit your code. In my opinion, the best way to optimize software delivery is to get feedback as quickly as possible, and unit tests are a huge time saver here. However, forcing unit tests onto every corner of the codebase can slow down development by making the tests brittle. If you must update your tests for most changes in your system, it is a sign that you have a testing problem or a design problem. This is a skill that can take a while to master.
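As a minimal sketch of what this looks like in practice, here is a hypothetical Python unit test in pytest style; the calculate_discount function, its 10%-over-100 rule, and the tax service are all invented for illustration:

```python
# test_pricing.py -- hypothetical example: the pricing rule and tax
# service are invented for illustration.
from unittest.mock import Mock

def calculate_discount(order_total: float, tax_service) -> float:
    """Apply a 10% discount to orders over 100, then add tax."""
    discounted = order_total * 0.9 if order_total > 100 else order_total
    return discounted + tax_service.tax_for(discounted)

def test_discount_applied_over_threshold():
    # The external tax service is mocked so the test stays fast and isolated.
    tax_service = Mock()
    tax_service.tax_for.return_value = 9.0
    assert calculate_discount(200.0, tax_service) == 189.0

def test_no_discount_at_or_below_threshold():
    tax_service = Mock()
    tax_service.tax_for.return_value = 5.0
    assert calculate_discount(100.0, tax_service) == 105.0
```

Each test runs in milliseconds because the one external dependency is mocked out.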

There is a practice called Test-Driven Development (TDD) that uses unit tests to help discover the design of your code: you start with the simplest possible implementation and grow it by adding tests one at a time, watching each new test fail, and then changing the code until the new test and all previous tests pass. This practice can feel very odd at first, but in some scenarios it can really improve the overall quality of your code.
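A single red-green step of that cycle might look like this sketch; the slugify helper is hypothetical:

```python
# Step 1 (red): write the test first. It fails because slugify does not
# exist (or does not handle this case) yet.
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# Step 2 (green): write just enough code to make the test pass.
def slugify(text: str) -> str:
    return text.strip().lower().replace(" ", "-")

# Step 3: repeat -- add the next test (say, punctuation handling), watch
# it fail, then extend slugify until the whole suite is green again.
```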

Unit tests are good for most code, but they are especially valuable if you have complex calculations or logic that are critical to how a unit of code performs. One place where I do not prefer unit tests is components that orchestrate a lot of dependencies together. You can write them, but the complexity of all the mocking is better handled with an integration or end-to-end test.

All systems should have unit tests. 

Integration tests 

While unit tests are focused on the smallest section of code that can be testable, integration tests are focused on scenarios where multiple units work together. Integration tests can mean multiple things, so I am going to break them into two categories – code-level integration tests and system-level integration tests. 

Code-level integration tests live in your code base right alongside your unit tests. These tests cover multiple units of code together to ensure that, when working in combination, they perform as expected.
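As a small sketch in the same style as the earlier examples, here is a code-level integration test; the parser and validator are hypothetical, and the point is that neither is mocked:

```python
def parse_order_line(line: str) -> dict:
    sku, qty = line.split(",")
    return {"sku": sku.strip(), "qty": int(qty)}

def validate_order(order: dict) -> bool:
    return bool(order["sku"]) and order["qty"] > 0

def test_parsed_line_passes_validation():
    # Both real units are exercised together, wired up exactly as
    # production code would use them -- no mocks involved.
    assert validate_order(parse_order_line("ABC-123, 2"))
```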

System-level integration tests are performed on a deployed subset of your ecosystem. For example, if you have a three-tier platform with a UI, an API, and a database, then you might have integration tests for each tier: API integration tests that exercise the API with the database behind it (real, blank, or replica), and UI integration tests that run the UI components against a mocked or blank API and back end.
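Here is a minimal sketch of a system-level API integration test, assuming a deployed test environment; the base URL, the /orders endpoint, and the response shape are all assumptions:

```python
import os
import requests

# Hypothetical staging environment; point this at your own deployment.
BASE_URL = os.environ.get("API_BASE_URL", "https://api.staging.example.com")

def test_create_and_fetch_order():
    # Exercise the real API with its real (test) database behind it.
    created = requests.post(
        f"{BASE_URL}/orders", json={"sku": "ABC-123", "qty": 2}, timeout=10
    )
    assert created.status_code == 201
    order_id = created.json()["id"]

    fetched = requests.get(f"{BASE_URL}/orders/{order_id}", timeout=10)
    assert fetched.status_code == 200
    assert fetched.json()["sku"] == "ABC-123"
```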

While unit tests give you comfort at the code level, integration tests start to give you comfort at the function or feature level. 

One frequent practice for system-level integration tests is BDD, or behavior-driven development. The idea is that the tests are written in the same language a user or product owner would use to describe their actions, with code behind each step that executes against the system, so that a user could read the tests to understand the flow of the system. If you have used behave, Gherkin, or SpecFlow, then you have seen frameworks that take this approach. When done correctly, these can be an excellent way for a new team member to learn the system from a user's perspective.
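A tiny sketch using behave (one of the frameworks named above): the scenario is written in Gherkin, and a Python step file binds each line to code. The checkout scenario and the context.app harness are hypothetical:

```gherkin
# features/checkout.feature
Feature: Checkout
  Scenario: A signed-in user completes a purchase
    Given a signed-in user with one item in their cart
    When they complete checkout
    Then they see an order confirmation
```

```python
# features/steps/checkout_steps.py
from behave import given, when, then

@given("a signed-in user with one item in their cart")
def step_signed_in_with_item(context):
    # context.app is a placeholder for your own test harness,
    # typically set up in features/environment.py.
    context.page = context.app.sign_in("test-user").add_to_cart("ABC-123")

@when("they complete checkout")
def step_complete_checkout(context):
    context.result = context.page.checkout()

@then("they see an order confirmation")
def step_see_confirmation(context):
    assert context.result.confirmation_visible()
```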

End-to-end tests 

End-to-end tests are executed against all the tiers of a system in a fully deployed environment and are generally for systems that have a user interface. They are built using a UI testing tool and cover the entire journey an end user would experience.

There are two kinds of UI testing products. The first uses code and frameworks that know how to interact with the user interface natively; Selenium, Playwright, and Cypress are common frameworks in this space. The second uses a tool that records and replays UI steps against the user interface, which is helpful for user interfaces built in a technology that does not expose APIs to interact with it directly. The code-based approach is usually more stable than the recorded one because you can build up reliable, reusable ways to interact with your system.
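As a sketch of the code-based approach, here is a minimal end-to-end test using Playwright's Python API; the URL, selectors, and expected heading are hypothetical:

```python
from playwright.sync_api import sync_playwright

def test_user_can_sign_in():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Drives the real UI, API, and database end to end.
        page.goto("https://app.staging.example.com/login")
        page.fill("#email", "test-user@example.com")
        page.fill("#password", "not-a-real-password")
        page.click("button[type=submit]")
        assert page.locator("h1").inner_text() == "Dashboard"
        browser.close()
```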

End-to-end tests are the slowest tests of them all, and depending on how many you have (and the performance of your system) they can take hours to run. I would recommend limiting your suite to something that runs in a short enough time that you are never tempted to skip it. In my experience, these tests are also the most likely to break or give false results (in the business we call these flaky tests).

End-to-end tests should be used strategically depending on your product as they need to be maintained and take time to execute. 

Depending on the system's architecture, either system-level integration tests or end-to-end tests give you confidence in your system as a whole.

Performance tests 

Performance tests gauge the performance of your deployed system by executing actions as multiple simulated users and recording the success rate and the time each action takes to complete.
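As one example of what a scripted performance test can look like, here is a small sketch using Locust, a Python load-testing tool; the endpoints, task weights, and wait times are assumptions:

```python
# locustfile.py -- run with: locust -f locustfile.py --host https://your-app
from locust import HttpUser, task, between

class ShopperUser(HttpUser):
    # Each simulated user pauses 1-5 seconds between actions.
    wait_time = between(1, 5)

    @task(3)  # browsing is weighted 3x more common than viewing a product
    def browse_catalog(self):
        self.client.get("/products")

    @task(1)
    def view_product(self):
        self.client.get("/products/ABC-123")
```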

Performance testing can be helpful if you have infrastructure or cost constraints and you expect a lot of users on day one of a release.

With the scaling capabilities of modern cloud infrastructure, I have seen more product teams skip performance testing and rely on cloud scaling, even at a potentially higher cost, to get features out to clients sooner. However, there are still business cases where scripted performance tests provide value.

Production monitoring tests/synthetics 

Synthetics run in production and perform the same action repeatedly, at either the system-level integration or the end-to-end level. A synthetic test is typically linked to an alert that informs the team that something is wrong with the deployed system.

Synthetic tests are useful for proactively monitoring the health of your live production system and alerting people when it degrades. With certain third-party tools, you can set up synthetic tests to run multiple times an hour from various locations around the world. I have used these many times to uncover system weak points and to help the team respond when something breaks before a client is affected.
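Dedicated monitoring products handle the scheduling, geography, and alerting for you, but the core idea fits in a few lines; this sketch assumes a hypothetical health endpoint and alert webhook:

```python
import time
import requests

HEALTH_URL = "https://app.example.com/health"        # hypothetical endpoint
ALERT_WEBHOOK = "https://hooks.example.com/alerts"   # hypothetical webhook

def run_check():
    try:
        response = requests.get(HEALTH_URL, timeout=5)
        response.raise_for_status()
    except requests.RequestException as exc:
        # The live system just failed the same action it normally
        # passes many times an hour -- page the team.
        requests.post(
            ALERT_WEBHOOK, json={"text": f"Synthetic check failed: {exc}"}, timeout=5
        )

if __name__ == "__main__":
    while True:
        run_check()
        time.sleep(300)  # repeat the same action every five minutes
```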

Regression tests 

Regression tests are any manual or automated tests that you execute before a release but do not include in your pipeline on every build. You may have a set of regression tests that cover legacy functionality, but it is preferable to fold these into your unit, integration, or end-to-end tests as appropriate. If you have a suite of these that is executed manually, I encourage you to put automating them on your backlog to remove that impediment to releasing faster.
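One small sketch of that migration: encode a manual regression check as an automated test and tag it so the pipeline can run the whole regression suite on every build. The scenario, the api_client fixture, and the expected header row are all hypothetical:

```python
import pytest

# Register the marker in pytest.ini ([pytest] markers = regression: ...)
# and run the suite in the pipeline with: pytest -m regression
@pytest.mark.regression
def test_legacy_csv_export_includes_header_row(api_client):
    response = api_client.get("/reports/export?format=csv")
    assert response.status_code == 200
    assert response.text.splitlines()[0] == "id,name,total"
```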

Manual, scripted, or recorded tests

There may be some tests that are either difficult to automate or are so critical that we always want to perform them manually. There is nothing wrong with this. It should be a business decision to weigh the tradeoffs of executing these manually instead of including them in your delivery pipeline. 

How we can help — Aligning your testing strategy with your goals and risks

At Green Leaf, our philosophy is to build only what provides business value to your product. We partner with you to design a testing strategy aligned with your corporate and product strategy and your acceptable level of risk. Here are the typical steps we take:

  1. First, we evaluate the risk of the product along with its specific goals to understand what the testing strategy needs to achieve.
  2. Second, we evaluate the existing product and platform, the existing testing approaches, and the potential risks the architecture contains, focusing especially on the integration points between the components of the platform.
  3. Then, we define a testing strategy and roadmap that aligns the product's risks and goals with the existing framework and code, ensuring you have the right mix of the different testing methods for your product. The roadmap is weighted so the most beneficial components come first. We then join and coach your teams to ensure these items are delivered efficiently and meet your goals.

Impact  

The purpose of the whole DevOps flow is to deliver resilient features to your users as quickly as possible with the highest quality. It can be measured against the four main DORA metrics for throughput and stability. Let us examine how automated testing can impact these: 

Throughput 

Change lead time 

  • Automated testing can assist the delivery lifecycle by eliminating the need for manual functional testing on new features 
  • Having automated regression tests can reduce or eliminate the additional manual testing needed, which gives you the confidence to release features as soon as the team completes them 

Deployment frequency 

  • Knowing that the latest merged code has been fully tested from your automated testing suite gives you the confidence to release what is completed without waiting for manual steps 

Stability 

Change fail percentage 

  • Integration and end-to-end tests, when run against deployed systems, reduce how often you deploy code that breaks a user workflow 

Failed deployment recovery time 

  • When you break production, you really have two choices: roll back the change that was deployed or deploy a patch release. Both approaches are aided by automated testing 
  • Having the previous build stored in your DevOps tool pipeline might allow you to roll back the change that caused the issue. 
  • If you decide to patch instead of rolling back, then having a suite of automated regression tests allows you to be confident that the patch release that was just merged is going to work and not break other existing functionality. I would recommend that your patch release always start with writing a test that recreates the issue seen in production (a small sketch follows below). 
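Here is a minimal sketch of that test-first patch flow; the empty-cart bug and the cart_total function are invented for illustration:

```python
def cart_total(items):
    """Order total with the single priciest item half off."""
    if not items:   # the fix: the original version crashed here,
        return 0.0  # calling max() on an empty cart
    discount = 0.5 * max(item["price"] for item in items)
    return sum(item["price"] * item["qty"] for item in items) - discount

def test_empty_cart_totals_zero():
    # Written first, straight from the production stack trace; the guard
    # above was then added to make it (and the rest of the suite) pass.
    assert cart_total([]) == 0.0
```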

Ready to get started?

In conclusion, automated testing is the key to unlocking elite levels of DevOps maturity. As our series continues, we'll shift focus from testing processes to the infrastructure behind cloud applications. Our next article will dive into cloud infrastructure and application hosting, and we will discuss how cloud components can further streamline getting your features into the hands of your customers in a timely, consistent, and high-quality way.

Green Leaf is here to help you overcome obstacles and unlock the full potential of your teams. Join us to discover how cloud environments are transforming the way we build, scale, and manage software infrastructure.