Property based testing

What is Property Based Testing?

Property-based tests make statements about the output of your code based on the input, and these statements are verified for many different possible inputs.
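To make the definition concrete, here is a minimal hand-rolled sketch in plain Java. It doesn’t use any of the libraries covered later; the PropertySketch class and its forAllInts helper are hypothetical names made up for illustration. It simply feeds random integers to a predicate and reports the first input that falsifies it.

```java
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical helper, not taken from any library mentioned in this post.
final class PropertySketch {

    // Runs the property against `trials` random ints; true means no counterexample was found.
    static boolean forAllInts(long seed, int trials, IntPredicate property) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            int input = random.nextInt();
            if (!property.test(input)) {
                System.out.println("Falsified by input: " + input);
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Holds for every int: the maximum of a number and itself is that number.
        System.out.println(forAllInts(42L, 100, x -> Math.max(x, x) == x));
        // Does not hold: odd numbers falsify it almost immediately.
        System.out.println(forAllInts(42L, 100, x -> x % 2 == 0));
    }
}
```

Real frameworks add much more on top of this loop (typed generators, reporting, reduction of failing inputs), but the core idea is the same: state a rule and let the machine look for counterexamples.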

The problem

Fortunately, testing is common practice nowadays. But tests can be hard to write, hard to maintain and hard to reason about. Even when we try our best, we sometimes forget to add the one case that makes the app crash at the worst possible moment.

Property based testing claims to help us fill this gap and make our tests easier to maintain. But I’ve also heard that it is harder to write and grasp. In this entry I’m reviewing how property based testing is handled in different programming languages, from more imperative ones like Java or Groovy to more functional ones like Clojure, Scala and finally Frege. During this journey I’ll try to figure out the patterns and best practices for taking advantage of property based testing.

Java

Assumptions ⇒ Theories/Properties ⇒ Proof
— Use case

For the Java sample I’ve tried junit-quickcheck. This library is meant to be used with JUnit tests.

dependencies
testCompile 'com.pholser:junit-quickcheck-core:0.5'
testCompile 'com.pholser:junit-quickcheck-generators:0.5'
testCompile 'junit:junit:4.12'
testCompile 'org.hamcrest:hamcrest-junit:2.0.0.0'

The sample is a silly example about loans. If somebody asks for a loan, the state of the loan varies depending on how much this person is asking for. The Java library builds on the concept of theories and assumptions.

  • 1st theory: If the loan is less than or equal to 200 EUR then the loan is ACCEPTED right away.

  • 2nd theory: If the loan is between 201 and 1000 EUR then it should be marked as PENDING until the request is studied in detail.

  • 3rd theory: All loans beyond 1000 EUR will be automatically REJECTED.
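The Loan and Supervisor classes used by the tests are not shown in this post, so the following is only a sketch of what they might look like, inferred from the three theories. Everything here is an assumption, including the choice to use a primitive double for the amount and to treat amounts strictly between 200 and 201 as PENDING.

```java
// Hypothetical classes: the post does not show its Java implementation of Loan or Supervisor.
final class LoanSketch {

    enum State { PENDING, ACCEPTED, REJECTED }

    static final class Loan {
        final State state;
        final double amount;

        Loan(State state, double amount) {
            this.state = state;
            this.amount = amount;
        }
    }

    static final class Supervisor {
        // Assumption: amounts strictly between 200 and 201 are treated as PENDING.
        Loan process(Loan loan) {
            if (loan.amount <= 200d) {
                return new Loan(State.ACCEPTED, loan.amount);
            }
            if (loan.amount <= 1000d) {
                return new Loan(State.PENDING, loan.amount);
            }
            return new Loan(State.REJECTED, loan.amount);
        }
    }

    public static void main(String[] args) {
        Supervisor supervisor = new Supervisor();
        // Print the resulting state for the amounts on each side of the two thresholds.
        for (double amount : new double[] {200d, 201d, 1000d, 1001d}) {
            Loan processed = supervisor.process(new Loan(State.PENDING, amount));
            System.out.println(amount + " -> " + processed.state);
        }
    }
}
```

The boundary values (200, 201, 1000, 1001) are exactly the places where a hand-written example test tends to be missing, which is why generating the whole range is attractive.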

Theories are annotated with the @Theory annotation. Here is the theory for the automatically accepted loans:

less than or equal to 200
@Theory public void automaticallyApproved( (1)
    @ForAll @InRange(minDouble = 0d, maxDouble = 200d) Double amount) { (2)

    Loan loan = new Loan(State.PENDING, amount); (3)

    assumeThat(loan.state, equalTo(State.PENDING));
    assumeThat(loan.amount, lessThanOrEqualTo(200d)); (4)

    Supervisor supervisor = new Supervisor();
    Loan processedLoan = supervisor.process(loan); (5)

    assertEquals(State.ACCEPTED, processedLoan.state); (6)
}
1 This is a @Theory
2 @ForAll: for every case in this theory (100 by default) the amount is provided by @InRange and will be a number between 0 and 200 (it doesn’t make sense to grant a loan of 0 EUR, right? :P)
3 Building a new loan with the provided amount
4 Assuming the new loan state is PENDING and the amount is less than or equal to 200
5 Processing that loan
6 The result should be an accepted loan

Then we can create theories for the remaining use cases:

between 201 and 1000
@Theory public void needsAFurtherStudy(
    @ForAll @InRange(minDouble = 201d, maxDouble = 1000d) Double amount) {

    Loan loan = new Loan(State.PENDING, amount);

    assumeThat(loan.state, equalTo(State.PENDING));
    assumeThat(loan.amount, allOf(
        greaterThan(200d),
        lessThanOrEqualTo(1000d)
    ));

    Supervisor supervisor = new Supervisor();
    Loan processedLoan = supervisor.process(loan);

    assertEquals(State.PENDING, processedLoan.state);
}

and…​

beyond 1000
@Theory public void automaticallyRejected(
    @ForAll @InRange(minDouble = 1001d, maxDouble = 20000d) Double amount) {

    Loan loan = new Loan(State.PENDING, amount);

    assumeThat(loan.state, equalTo(State.PENDING));
    assumeThat(loan.amount, allOf(
        greaterThan(1000d),
        lessThanOrEqualTo(20000d)
    ));

    Supervisor supervisor = new Supervisor();
    Loan processedLoan = supervisor.process(loan);

    assertEquals(State.REJECTED, processedLoan.state);
}

I could have created a generator for Loan instances, but it seemed overkill for such a small example. It was easier to use a predefined generator to feed a single Loan property.

Groovy

Know the output ⇒ Check that a certain set of inputs gives the right output
— Use case

To use property based testing with Groovy I’m using Spock as the testing framework and Spock Genesis, which provides a set of value generators.

dependencies
testCompile 'org.spockframework:spock-core:1.0-groovy-2.4'
testCompile 'com.nagternal:spock-genesis:0.3.0'
testCompile 'junit:junit:4.12'

This time we have a function that builds URIs. The final URI should follow these rules:

Rules
static final Pattern WORD            = ~/[a-z0-9-._\-]{1,8}/
static final Pattern OPTIONAL_SLASH  = ~/[\/]{0,1}/
static final Pattern OPTIONAL_WORD   = ~/($WORD){0,1}/

static final Pattern COMPLIANT_FRAGMENT  = ~/$OPTIONAL_SLASH$OPTIONAL_WORD$OPTIONAL_SLASH/
static final Pattern COMPLIANT_URI       = ~/s3:\/\/$WORD($COMPLIANT_FRAGMENT){1,10}/

Then we build the function whose output should match those rules:

Composer
static URI compose(String host, String bucket, String path) {
    String treatedRoot = bucket.endsWith('/') ? bucket : "$bucket/"
    String treatedPath = path.dropWhile { it == '/' }

    return URI.create("$host$treatedRoot$treatedPath")
}

And finally let’s run a test checking that function:

Test
@Unroll('Getting an URI from (h: #host, r: #root, p: #path)')
void 'composing URI fragments to get a full URI'() {
    when: 'composing all pieces'
    def uri = URIComposer.compose(host, root, path).toString()

    then: 'we should get a compliant URI'
    uri ==~ COMPLIANT_URI (1)

    where: 'possible values are'
    root << fragmentProperties.take(DEFAULT) (2)
    path << fragmentProperties.take(DEFAULT) (3)

    host = "s3://username"
}

StringGenerator getFragmentProperties() {
    return Gen.string(COMPLIANT_FRAGMENT)
}
1 The result should be a valid URI according to our rules
2 The bucket/root path should follow the valid fragment property
3 The rest of the path should follow the valid fragment property

Clearly, I could have missed many of the possible cases if I had written them manually. This way I’m taking advantage of the declared rules to generate a bunch of use cases for me.
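For readers more comfortable with Java, the same idea can be sketched without Spock. This is a hand-written approximation, not what Spock Genesis does internally: the UriComposerSketch class, its randomFragment generator and the propertyHolds helper are all hypothetical, and compose returns a plain String instead of a URI to keep the sketch small.

```java
import java.util.Random;
import java.util.regex.Pattern;

// Plain-Java approximation of the Groovy example; every name here is hypothetical.
final class UriComposerSketch {

    static final String WORD = "[a-z0-9._-]{1,8}";
    static final String FRAGMENT = "/?(" + WORD + ")?/?";
    static final Pattern COMPLIANT_URI =
            Pattern.compile("s3://" + WORD + "(" + FRAGMENT + "){1,10}");
    static final String WORD_CHARS = "abcdefghijklmnopqrstuvwxyz0123456789._-";

    // Mirrors the Groovy compose: make sure the bucket ends with a slash
    // and strip leading slashes from the path.
    static String compose(String host, String bucket, String path) {
        String treatedRoot = bucket.endsWith("/") ? bucket : bucket + "/";
        String treatedPath = path.replaceFirst("^/+", "");
        return host + treatedRoot + treatedPath;
    }

    // Hand-written generator for strings shaped like COMPLIANT_FRAGMENT:
    // optional slash, optional word of 1 to 8 characters, optional slash.
    static String randomFragment(Random random) {
        StringBuilder fragment = new StringBuilder();
        if (random.nextBoolean()) {
            fragment.append('/');
        }
        if (random.nextBoolean()) {
            int length = 1 + random.nextInt(8);
            for (int i = 0; i < length; i++) {
                fragment.append(WORD_CHARS.charAt(random.nextInt(WORD_CHARS.length())));
            }
        }
        if (random.nextBoolean()) {
            fragment.append('/');
        }
        return fragment.toString();
    }

    // Property: composing any two compliant fragments yields a compliant URI.
    static boolean propertyHolds(long seed, int trials) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            String root = randomFragment(random);
            String path = randomFragment(random);
            String uri = compose("s3://username", root, path);
            if (!COMPLIANT_URI.matcher(uri).matches()) {
                System.out.println("Falsified by root='" + root + "', path='" + path + "'");
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(propertyHolds(7L, 1000));
    }
}
```

The main difference from the Spock version is that here the generator has to be written by hand to mirror the regular expression, whereas Spock Genesis derives the values from the pattern itself.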

Scala

Assumptions ⇒ Theories/Properties ⇒ Proof
— Use case

Because I’m not yet used to Scala, I’ve taken the same example I did in Java and tried to translate it to Scala to see what it looks like.

So we also have a function to process a given Loan:

def process(loan: Loan) : Loan = loan.amount match { (1)
  case x if 0   until 201  contains x  => Loan(State.ACCEPTED, loan.amount) (2)
  case x if 201 until 1001 contains x  => Loan(State.PENDING,  loan.amount) (3)
  case _                               => Loan(State.REJECTED, loan.amount) (4)
}
1 For a given loan amount
2 If the amount is 0 <= amount <= 200
3 If the amount is 201 <= amount <= 1000
4 If the amount is anything else

And depending on the requested amount we should receive a different state. For amounts that are automatically accepted:

val acceptableLoans = for {
  amount <- Gen.chooseNum(0,200) (1)
} yield Loan(State.PENDING, amount) (2)

property("accepted loans") = forAll(acceptableLoans) { (loan: Loan) =>
  Supervisor.process(loan).state == State.ACCEPTED (3)
}
1 Using a number generator for getting amounts from 0 to 200 (both inclusive)
2 Building instances of Loan with State.PENDING and 0 <= amount <= 200
3 All provided loans, once processed, should be ACCEPTED

For those which are directly rejected:

val rejectableLoans = for {
  amount <- Gen.chooseNum(1001,2000) (1)
} yield Loan(State.PENDING, amount) (2)

property("rejected loans") = forAll(rejectableLoans) { (loan: Loan) =>
  Supervisor.process(loan).state == State.REJECTED (3)
}
1 Using a number generator for getting amounts from 1001 to 2000 (both inclusive)
2 Building instances of Loan with State.PENDING and 1001 <= amount <= 2000
3 All provided loans, once processed, should be REJECTED

Clojure

Check behavior
— Use case

Clojure has a fairly complete QuickCheck-like testing framework called test.check. It can be used standalone, but I’ll be using it within clojure.test thanks to the defspec macro.

dependencies
:dependencies [[org.clojure/clojure "1.7.0"]
               [org.clojure/test.check "0.9.0"]]

The Clojure example has to do with numbers. Let’s say I’m reading a CSV file with lines of numbers. Those lines may contain numbers or characters. I’m only interested in adding up all the numbers in each line.

What are the properties? Well, given a line with elements separated by commas…

  • Numbers are all elements minus the non-numeric values

  • Adding up all numeric elements should follow the commutativity rule

In our test we’re declaring that, for all possible values of a possibly empty vector of alphanumeric characters, once we join those values into a CSV-like string, the order of the included digits doesn’t matter: the outcome should remain the same.

Test
(ns qc.core-test
  (:require [clojure.test :refer :all]
            [qc.core :refer :all]
            [clojure.string :as str]
            [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]
            [clojure.test.check.clojure-test :as ct :refer (defspec)]))

(defn join-chars
  [chars]
  (str/join "," chars))

(defspec check-adding-up-numbers-from-line
  100 (1)
  (prop/for-all [v (gen/vector gen/char-alphanumeric)] (2)
                (let [line (join-chars v)
                      reversed-line (join-chars (reverse v))]
                  (= (sum-numbers line) (3)
                     (sum-numbers reversed-line))))) (4)
1 Number of iterations
2 Generators (generates vectors of alphanumeric characters, which may sometimes be empty)
3 Setting a sample line and a reversed version of that line
4 The result should be the same despite the order

A simple implementation of the required function:

Sum
(defn sum-numbers
  "Adds up all numbers within a CSV line expression"
  [line]
  (if (empty? line) 0 (1)
    (let [elements (str/split line #",")] (2)
      (->> elements
           (filter is-digit) (3)
           (map to-int) (4)
           (reduce +))))) (5)
1 Checking if the argument is empty
2 Splitting values
3 Filtering digits
4 Converting to integers
5 Adding up all integers
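The is-digit and to-int helpers are not shown here, so their exact behavior is unknown. As a rough comparison, here is how the same function and the same order-independence property might look in plain Java. The CsvSumSketch class is hypothetical, and treating an element as numeric only when it consists of 1 to 9 ASCII digits is my own assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical Java counterpart of the Clojure example.
final class CsvSumSketch {

    static final String ALPHANUMERIC =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Adds up all numeric elements of a CSV line, ignoring everything else.
    static int sumNumbers(String line) {
        if (line.isEmpty()) {
            return 0;
        }
        int sum = 0;
        for (String element : line.split(",")) {
            // Assumption: "numeric" means 1 to 9 ASCII digits, which always fits in an int.
            if (element.matches("[0-9]{1,9}")) {
                sum += Integer.parseInt(element);
            }
        }
        return sum;
    }

    // Property: reversing the order of the elements does not change the sum.
    static boolean orderDoesNotMatter(long seed, int trials) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            List<String> elements = new ArrayList<>();
            int size = random.nextInt(20); // may be 0, so empty lines are covered
            for (int j = 0; j < size; j++) {
                char c = ALPHANUMERIC.charAt(random.nextInt(ALPHANUMERIC.length()));
                elements.add(String.valueOf(c));
            }
            List<String> reversed = new ArrayList<>(elements);
            Collections.reverse(reversed);
            String line = String.join(",", elements);
            String reversedLine = String.join(",", reversed);
            if (sumNumbers(line) != sumNumbers(reversedLine)) {
                System.out.println("Falsified by line: " + line);
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sumNumbers("1,a,2,b,3"));
        System.out.println(orderDoesNotMatter(1L, 100));
    }
}
```

Note that the property never states what the sum should be. It only states that the order of the elements must not influence it.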

Frege

Reversible processes
— Use case

Finally, I will code a very simple example showing how to test a process that, applied twice, gives you back the original value. The most common example of this is reversing a string:

reverse string
module qc.Reverse where

-- Reverses a string
reverseString :: String -> String
reverseString = packed . reverse . toList

I’m defining two properties that should hold for the reverseString function:

properties
module qc.TestReverse where

import Test.QuickCheck
import qc.Reverse

applyTwice :: String -> Bool
applyTwice xs = ((reverseString . reverseString) xs) == xs

applyToOne :: Char -> Bool
applyToOne x = ((reverseString . packed) [x]) == packed [x]

reversible                 = property (applyTwice) (1)
noEffectToSingleCharacter  = property (applyToOne) (2)
1 reversible: the function applied twice to the same word should return the original value.
2 no effect on a single character: the function applied to a single character should return the same character.

In this Frege example I haven’t used any generator explicitly. Declaring the types of the functions passed to the property function is enough for the compiler to infer what type of values should be provided to test them.
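Java has no such type-driven inference, so the equivalent check needs an explicit generator. The sketch below is a hand-rolled version of the two properties. ReverseSketch and its helpers are hypothetical names, and the generator is deliberately limited to lowercase ASCII letters.

```java
import java.util.Random;

// Hypothetical Java counterpart of the two Frege properties.
final class ReverseSketch {

    static String reverseString(String value) {
        return new StringBuilder(value).reverse().toString();
    }

    // Explicit generator: lowercase ASCII strings of length 0..maxLength.
    static String randomLowercase(Random random, int maxLength) {
        int length = random.nextInt(maxLength + 1);
        StringBuilder value = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            value.append((char) ('a' + random.nextInt(26)));
        }
        return value.toString();
    }

    // Property 1: reversing twice gives back the original string.
    static boolean reversingTwiceIsIdentity(long seed, int trials) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            String original = randomLowercase(random, 30);
            if (!reverseString(reverseString(original)).equals(original)) {
                System.out.println("Falsified by: " + original);
                return false;
            }
        }
        return true;
    }

    // Property 2: reversing a single character has no effect.
    static boolean singleCharacterIsUnchanged(long seed, int trials) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            String single = String.valueOf((char) ('a' + random.nextInt(26)));
            if (!reverseString(single).equals(single)) {
                System.out.println("Falsified by: " + single);
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(reversingTwiceIsIdentity(3L, 100));
        System.out.println(singleCharacterIsUnchanged(3L, 100));
    }
}
```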

Conclusion

About languages and frameworks:

  • All languages and testing frameworks have generators, and most of them are pretty similar

  • Only the Groovy and Java examples lack the concept of a minimal failing sample (often called shrinking), but at least Groovy has the advantage of Spock, which helps a lot when defining the specification.
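To illustrate what a minimal failing sample means, here is a naive sketch of the idea for non-negative integers. It is not how any of the frameworks above implement it; the ShrinkSketch class and its shrink method are hypothetical. Starting from a failing input, it keeps trying smaller candidates (0, half the value, the value minus one) and moves to the first one that still falsifies the property.

```java
import java.util.function.IntPredicate;

// Hypothetical, naive shrinking for non-negative ints.
final class ShrinkSketch {

    // Returns a smaller input that still falsifies the property, or the
    // original input if no smaller falsifying candidate is found.
    static int shrink(int failing, IntPredicate property) {
        int current = failing;
        boolean improved = true;
        while (improved) {
            improved = false;
            int[] candidates = {0, current / 2, current - 1};
            for (int candidate : candidates) {
                boolean smaller = candidate >= 0 && candidate < current;
                if (smaller && !property.test(candidate)) {
                    current = candidate; // still failing, so keep shrinking from here
                    improved = true;
                    break;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // A wrong property: "no loan amount is ever greater than 1000".
        IntPredicate neverAboveLimit = amount -> amount <= 1000;
        // Suppose random generation found 18734 as a counterexample.
        System.out.println(shrink(18734, neverAboveLimit));
    }
}
```

With the property used in main, the search ends at 1001, the smallest amount that breaks the rule, which is far more informative in a failure report than an arbitrary large number.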

About use cases:

  • Some detected types:

    • Based on defined rules: generate value ranges to prove a set of pre-defined rules

    • Based on reversible processes: Some properties can be checked by applying a process twice and expecting the original value back: reverse is a good example of this.

    • Based on known output: Problems that require checking that a wide range of input values gives the expected output.

    • Aimed at testing behavior rather than a specific result: Like commutativity of sums (Clojure example).

  • A set of properties can be considered a specification. Both Groovy and Scala mimic that line of thought very well.

  • If properties are difficult to define, this may lead to non-deterministic checks and test results that differ between runs.

Well, I’ve just started and my feeling is that I’ve seen just the tip of the iceberg. I really like the idea of trying to define the general properties that a given function should obey and letting the testing framework provide a set of possible values to challenge those properties.