Kenneth's Corner

Engineering, data, and testing

I've been passionate about Test Engineering since learning about the SDET role when interviewing for (then later, preparing for) my first test-focused job at Microsoft. Since then, I've worked as a test lead at Amazon and as a Director of Test or QA at several Atlanta startups, where I've had the privilege of starting and/or maturing several teams.

How did I know if my teams were effective? Of course, I kept a close eye on test results, internal defect discovery, and field-found defects. However, effective test organizations do more than catch bugs. They provide organized, meaningful, and timely information about product behavior, and they help leadership understand the state of the product.

Aside from bugs and tests, what does a good Test organization look like?

High-functioning Test organizations don't just find bugs - they make the development team more efficient. Test is part of the development workflow. Test activity gives development teams confidence in their own output. In high-functioning Test organizations, developers rely on the Test team's output to know that new changes have the desired effect.

Quantitative properties:

  • Test failures: annotated, categorized by reason, linked to supporting issues
  • Test results: accessible and visible across the organization
  • Test cases: visible, discoverable; low false-positive rate; high degree of determinism; run automatically when code changes

Qualitative properties:

  • Tests added rapidly in response to new functionality, bugs, inquiries
  • Test process facilitates, rather than hinders, development by providing rapid, clear, actionable feedback

Test Organizations - Levels of Sophistication

  1. Unsophisticated: no automated tests; no testers; test/QA owned by developers
  2. Low sophistication: dedicated testers check the product for defects before the release
  3. Moderate sophistication: dedicated testers check products for defects and baseline specifications during the development cycle. Testers are assisted by automated test suites.
  4. High sophistication: tests continuously check the product for defects and baseline specifications. Automated tests highlight changes in product behavior as development teams make changes.

In summary, beyond catching bugs, a high-functioning Test organization supports development activities and helps leadership understand the state of the product.

Things I've unlearned in Test

- Posted in Testing

One of the things that stuck with me from my time at Amazon was a willingness to constantly evaluate and challenge (or re-confirm) generally accepted best practices. Even Amazon's own leadership principles carried the caveat that they applied "unless you know better." As a result, my approach to Test Engineering has changed quite a bit since my first years at Microsoft, and I realized I've "unlearned" some things. Below are a few quick examples.

The role of Test/QA is to find bugs and validate requirements

This view is too narrow and only captures the first part of the story: the visibility component. A high-functioning Test organization adds value in two ways: 1) Test adds visibility into how the product behaves, and 2) Test makes the development process more efficient.

Finding bugs and validating requirements is part of adding visibility; however, so is reporting behavior under different circumstances, tracking performance, and documenting product configurations, setup, upgrades, etc. Test Engineering should know as much as possible about what's going out the door, bugs and all, and should relentlessly seek to understand the product being tested and how customers use and see that product.

Test Engineering makes the development process more efficient by providing quick, helpful feedback and assisting with driving testability and testing upstream in the development process. With good testing, developers can move faster with higher confidence, and your team and your company win.

Test harnesses should be highly architected, structured, and organized

I've seen a lot of over-complicated and incomprehensible tests. It isn't easy to make tests clear and simple, but those are the tests that add the most value, so write tests in whichever way best achieves that. Simple tests are more easily democratized, ramp-up time is shorter, and failures caused by the tests themselves, rather than the product, are less likely.

Make it easy to write tests, and write simple tests. Don't over-architect.

Tests should be written in the language of the thing being tested

Tests should be written in the language(s) that best facilitate testing.

It's important to use an industry-standard test framework

There are often many competing test frameworks, and the choice matters less than it seems. What matters most is exercising the product through testing. Focus on coverage, not framework fluff.

Don't repeat yourself

I used to take this to an extreme, building broad test case generators to hit many input permutations with just a few lines of code. However, I'd often spend more time - usually much more - on the perfect test case generator than it would have taken to create a base set of inputs and copy/paste/edit until I had everything I needed. So, instead of having inputs close to the test case, they were abstracted away, took longer to write, and often had bugs in the generation code that took even more time to fix. Worst of all, I was the only one who could understand or maintain the code when it was done.

This isn't to say you shouldn't abstract standard functionality into libraries. But when it comes to testing, hiding the test's interactions with the product under test behind abstractions can hinder comprehension, take more time, and be fraught with enough peril to destroy any value the abstractions would otherwise add.
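
To make the trade-off concrete, here's a small, hypothetical sketch (normalize_phone and its inputs are made up for illustration, not from any real product). A handful of explicit, copy/paste/edited inputs sitting right next to the expectations is usually easier to read, extend, and debug than a clever generator:

# Hypothetical function under test - a stand-in for whatever you're actually testing.
def normalize_phone(raw)
  digits = raw.gsub(/\D/, '')
  digits.length == 10 ? "(#{digits[0, 3]}) #{digits[3, 3]}-#{digits[6, 4]}" : nil
end

# Inputs live right next to the expectations - easy to scan, easy to extend.
cases = {
  '404-555-0100'   => '(404) 555-0100',
  '(404) 555 0100' => '(404) 555-0100',
  '4045550100'     => '(404) 555-0100',
  '555-0100'       => nil, # too short
}

cases.each do |input, expected|
  actual = normalize_phone(input)
  puts "#{input.inspect} => #{actual.inspect} (expected #{expected.inspect})"
end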

Track your test cases in a test case tracker

Test case trackers are often very expensive - hundreds of dollars a year per user - and hold your work (the test cases) hostage. Additionally, time spent futzing with a test case repo is NOT spent testing or adding other value to your team and company.

Let your automation document your test cases. Focus on that.

Don't reinvent the wheel

I've watched highly talented engineers spend significantly more time wrangling and debugging third-party libraries that do 80% of what's needed (and are 80% things that aren't needed) than it would have taken to write the required functionality as custom code.

How much time have you spent in your career investigating how to do something in 3rd-party code that could have been coded directly without the library? If you do this a lot, you might need to "reinvent the wheel" more.

Conclusion

When it comes to testing, simpler is usually better, so much of what I've "unlearned" amounts to simplifying process and tooling. Make it easy to write tests, and remove impediments that keep you from testing early and often.

Why I built Serj

- Posted in Data

Validin started as a company with two data sources and a small amount of related IP: a huge database of IP and domain references from hundreds of places around the Internet, some of which generate thousands of artifacts a day; an absolutely enormous database of DNS associations going back to mid-2019; and the tooling to collect this data.

My initial plan with Validin was to find insightful ways to package the DNS data so it could be sold as a service. However, the scale of the data made it challenging to find appropriate and cost-effective tooling. By the numbers:

  • ~2TB/day of source PCAPs whittled down to ~30GB of data gathered every day, for years - in bz2, parquet-like format
  • ~2.5 billion fully-qualified domain names (FQDNs) with at-least daily DNS association refreshes in 3 categories
  • ~7.5 billion total associations gathered per day (more, if including bi-directional associations)

I considered a number of solutions for storing, indexing, and querying this data:

  • Traditional databases, like SQLite and PostgreSQL
  • Non-traditional databases, like MongoDB
  • Cloud databases, like AWS DocumentDB, GCP Bigtable, and AWS DynamoDB
  • Cloud serverless databases, like Athena
  • Semi-managed cloud big data platforms, like Amazon EMR

In each case, off-the-shelf solutions had some combination of at least two (and possibly ALL) of the following problems:

  • The solution was going to cost too much
    • Usually: WAY too much
    • Additionally, pricing was often VERY difficult to predict
  • The scale of my data couldn't easily be handled by the solution OR the latency between query and response would have been too high to put behind an API
  • The solution couldn't handle my questions without non-negligible engineering effort (e.g.: CIDR range queries)
  • The solution was non-portable - I'd be locked into a specific cloud provider

However, I'd spent the previous 3.5 years building, tweaking, and using custom, high-performance databases for my previous employer. I had some ideas. I wanted:

  • Type flexibility - support arbitrary data types and complex (multi-dimensional) associations at the platform layer
  • Support for hierarchical queries, like those needed for searching for keys within a domain zone, or for IPs within an arbitrary CIDR (sketched below)
  • Serialized backing files that are small enough to be stored uncompressed at rest, allowing them to be queried directly from the cloud
  • Low fleet maintenance overhead - support full query range with very limited static infrastructure so I only have to scale when needed
  • Move operational burden to the transform/serialization step, which is one of my core competencies

The above were turned into requirements, and Serj was born!
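
To make the hierarchical-query requirement concrete, here's a rough, hypothetical sketch in Ruby (illustrative only, not Serj's actual encoding): if each IPv4 address is stored under a fixed-width binary key, any CIDR block becomes a key prefix, so "find every key inside this CIDR" reduces to a prefix or range scan over sorted keys.

require 'ipaddr'

# "10.1.2.3" -> "00001010000000010000001000000011" (32 bits, zero-padded)
def ip_key(ip)
  IPAddr.new(ip).to_i.to_s(2).rjust(32, '0')
end

# "10.1.0.0/20" -> the first 20 bits of its key, i.e. a key prefix.
def cidr_prefix(cidr)
  ip, bits = cidr.split('/')
  ip_key(ip)[0, bits.to_i]
end

ips = ['10.1.2.3', '10.1.250.9', '192.168.0.1']
prefix = cidr_prefix('10.1.0.0/20')
in_range = ips.select { |ip| ip_key(ip).start_with?(prefix) }
puts in_range.inspect # => ["10.1.2.3"]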

Testing Best Practices

- Posted in Testing

In the ~10 years of my career that I dedicated to testing, I developed a set of testing best practices. These best practices assume an environment with automated baseline testing, but many also apply to manual testing and to traditional test frameworks.

Here are some of the best practices I've picked up over time:

Breadth over Depth

Generally, focus on broad coverage before edge cases. Possible counterpoint: prioritize coverage of known-buggy, historically problematic areas.

Prioritize by High-ROI

Prefer covering easy-to-test features of high value over difficult-to-test features of low consequence. This should be obvious, but the point is to think about tests explicitly in terms of value vs. effort.

How do you know what's high value? Tests of essential functionality (e.g., logging in, creating new accounts, essential happy-path product scenarios), and tests of historically problematic or fickle product features.

Start from high-ROI tests that are easy to write

Break it Down

Prefer many small, granular tests over a few large, all-encompassing tests. Granular tests are easier to debug and, when they fail, potentially much more informative than a monolithic test. When a single, granular test fails, you'll know right where to focus your energy.

Counterpoint: sometimes setup is so expensive that it's cost-prohibitive to keep tests granular. In this case, consider moving the environment setup into the framework, outside of the test cases.
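
Here's a small, hypothetical sketch of that counterpoint (the Environment class is a made-up stand-in for whatever your real setup is): pay for the expensive setup once, in the harness, and let many small checks share it while still reporting independently.

# Hypothetical stand-in for an expensive environment (deploys, seed data, etc.).
class Environment
  def initialize
    # Imagine minutes of work here rather than a single puts.
    puts 'expensive setup: deploy app, seed test data...'
  end

  def sign_in(user)
    "signed in as #{user}"
  end

  def create_item(name)
    "created #{name}"
  end

  def sign_out
    'signed out'
  end
end

env = Environment.new # the expensive setup happens once, in the harness

# Many small, granular checks share the environment but report independently.
checks = {
  'sign_in'     => -> { env.sign_in('alice') },
  'create_item' => -> { env.create_item('widget') },
  'sign_out'    => -> { env.sign_out },
}

checks.each { |name, check| puts "#{name}: #{check.call}" }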

Make Readable Test Output

Investigating failures is easier with intent and context embedded in the test output. It can be very helpful to know what's being tested and why. Include context, links to bugs/tickets, and any other information that will help you, or whoever debugs the test next, understand your tests.

For baseline testing: if it's good enough for a comment, it's probably good enough to be included in the baseline context.
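
As a hypothetical sketch (the scenario and expectations here are made up), a baseline test can write its own context into the captured output, so the baseline file explains itself when a diff shows up:

# Hypothetical test script whose stdout becomes the baseline file.
puts '# Scenario: sign-in with an expired password'
puts '# Why: regression guard for an expired-password bug (link the real ticket here)'
puts '# Expectation: request is rejected with a "password expired" error'

# Stand-in for a real call against the product under test.
response = { status: 401, error: 'PASSWORD_EXPIRED' }

puts "status=#{response[:status]} error=#{response[:error]}"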

Describe First, Then Judge

When creating baseline tests, capture the behavior of the system being tested first. Once you've captured behavior, carefully inspect the output before adding the captured behavior to the automated tests.

This could also be described as: document how the system behaves, then evaluate it for correctness.

Hastily Create Tests; Carefully Inspect Output

It's easy to miss subtly-incorrect behavior with baseline tests, so inspect it carefully. Help yourself by crafting the test output to make it easier to inspect (e.g., by adding context).

This is related to getting breadth first, but it also reflects the mindset shift from a traditional test framework approach to baseline testing.

Don't Log the World

Log verbosely only in the tests dedicated to that behavior. Too much unrelated baseline output can bury the signal you want in a test case. Example: if most test cases need to log in before the test scenario begins, verbosely logging sign-in behavior in every one of those test cases is redundant. Only verbosely log sign-in behavior in the granular tests dedicated to sign-in scenarios.
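
A hypothetical sketch of that sign-in example (sign_in here is a made-up helper, not real product tooling): the dedicated sign-in test asks for verbose output, while every other test treats sign-in as a quiet, one-line precondition.

# Made-up helper standing in for your real sign-in tooling.
def sign_in(user, verbose: false)
  steps = ['request sign-in page', "submit credentials for #{user}", 'receive session token']
  steps.each { |step| puts "sign-in: #{step}" } if verbose
  puts "signed in as #{user}"
end

# Dedicated sign-in test: the verbose detail is the point, so it belongs in the baseline.
sign_in('alice', verbose: true)

# Any other test: sign-in is only a precondition, so keep it to one line.
sign_in('alice')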

Avoid the Science Fair Project

Automation efforts should begin showing results quickly. This means catching bugs/behavior regressions quickly, and catching bugs that wouldn't have been caught if the automation hadn't been there.

This contrasts with the "science fair project" that I've observed several times when joining an existing, under-performing test team. You might have a "science fair project" if:

  • When the automation fails, the supporting teams presume a failure with the test cases, not a regression in the product (low "signal-to-noise")
  • There's no evidence that automation ever catches bugs
  • Coverage is shallow and difficult to extend
  • There are single points of failure on a team of testers for running or investigating test failures
  • Supporting teams (developers, PMs, other testers, etc) can see that the tests failed, but they have no ability to explain what failed or why

A very smart development manager I once worked with used to opine: "If you want more tests, make it easy to write tests". Tests should be as uncomplicated as possible. They should be high-visibility. They should be informative. Demonstrate your cleverness by how simple you can make the tests, not by making them complicated.

There are a lot of good arguments in favor of throwing away your traditional test framework. The strongest I've seen, however, is that it helps make it easy to add tests.

The same advice applies here: if you want more tests, make it easy to add tests. Build tools to interact with your product. Code an interaction. Capture the output, and you have a new baseline test!

By focusing on tools to interact with what you need to ship, and not on writing test cases to match the language of a test framework, you make it easier to add tests.

Baseline tests facilitate test writing by not locking testers into a framework. Sometimes, Python is the best language to use; other times, it could be Ruby. The best tool could even be curl or a shell script. Not being attached to a framework means you can write tests however is most natural. This makes testing easier. Easier test writing leads to more tests.

Emphasize the tooling for interacting with your system under test, not your test framework.

What is baseline testing in software?

Baseline testing refers to comparing the behavior of a component or system under test to a baseline of previous behavior. The behavior being compared can include API responses, command outputs, file contents, and even performance metrics.

The simplest baseline tests capture the literal output of a deterministic system under test as a "baseline." To test against the baseline, a test harness runs the same test that generated the original baseline, captures the output in the same way as the baseline, then runs a "diff" against the captured output and the original output. If there is a delta (a "diff"), the test fails. If there is no delta ("no diff"), the test passes.

Baseline testing essential workflow

The concept is simple enough that we can create an end-to-end example in just a few minutes.

Suppose we want to create a baseline test for the application hello_world.rb:

puts 'hello, world!'

Running the program generates:

$ ruby hello_world.rb 
hello, world!

We'd capture the output to save its baseline behavior:

$ ruby hello_world.rb > hello_world.rb.baseline

We now have a baseline in the file hello_world.rb.baseline. All we need to test this program against the baseline is to run it, pipe the output into diff, and observe that there is no diff:

$ ruby hello_world.rb | diff hello_world.rb.baseline -
$ echo $?
0

We now have a baseline test!

Now, let's use this baseline test to detect a change in the application hello_world.rb. We modify hello_world.rb to print an additional line:

puts 'hello, world!'
puts 'Updated!'

Running the program again and checking the output against the known baseline:

$ ruby hello_world.rb | diff hello_world.rb.baseline -
1a2
> Updated!
$ echo $?
1

There is a diff, and now the tester needs to answer one of two questions:

  1. Does the program have an error? Or,
  2. Does the baseline need to be updated?

If the change is intentional, the tester regenerates the baseline the same way it was first captured (ruby hello_world.rb > hello_world.rb.baseline), and the test passes again.

In its most essential form, that's it!