Some examples of how you can test what your code should do instead of how it does it.
Tests are one of the most important parts of a codebase. They can catch many bugs before they are released. They can ensure your programs work as expected.
But tests tied to implementation details can lock you into how the code was written, in unhelpful ways.
Testing behavior means testing what code does. Testing implementation means testing how code works.
When you write tests for behavior (the “what”), you can refactor your code (the implementation) with confidence without breaking the tests. The tests only break when behavior has changed: either unintentionally, which means you’ve introduced a bug, or intentionally, which means the tests need to be updated to the new behavior.
These types of tests are extremely valuable because a broken test means your code is broken.
Testing behavior guarantees your code works as expected (for the behaviors you have tested).
When you write tests for implementation (the “how”), tests break when you change code, even if the behavior didn’t change.
These types of tests aren’t very valuable because a broken test doesn’t mean your code is broken and a passing test doesn’t mean your code works as expected.
Testing implementation only guarantees your code was written as it was tested.
Let’s look at some real examples to better understand the difference between the two.
Although the examples are in JavaScript/React with Jest, the principles apply to any programming language, test framework, etc. Also, the examples are general guidelines, not hard rules to be taken as law.
import lodash from 'lodash'
import add from './add'
let addSpy
beforeEach(() => {
  addSpy = jest.spyOn(lodash, 'add')
})
afterEach(() => {
  addSpy.mockReset()
  addSpy.mockRestore()
})
test('takes two numbers and adds them together with lodash add', () => {
  const sum = add(5, 10)
  expect(addSpy).toHaveBeenCalledWith(5, 10)
})
This test is tied to implementation because it relies on the add function using lodash under the hood. It also has hidden side effects.
import add from './add'
test('can add', () => {
  const input = add(5, 10)
  const output = 15
  expect(input).toBe(output)
})
Now this test only checks behavior. The implementation of add can be changed with confidence without breaking the test. The test is also now self-contained, so changes can be made to it without breaking other tests.
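As a rough sketch of why this matters, the snippet below compares two hypothetical implementations of add in plain Node, without Jest. The names mathLib, addWithLib, addNative and countLibCalls are made up for illustration (mathLib stands in for lodash, and countLibCalls is a hand-rolled stand-in for a spy). The behavior check passes for both implementations, while the implementation check passes only for the one that happens to call the library.

```javascript
// Hypothetical helper library standing in for lodash.
const mathLib = { add: (a, b) => a + b }

// Implementation A delegates to the library.
const addWithLib = (a, b) => mathLib.add(a, b)

// Implementation B is a refactor with identical behavior.
const addNative = (a, b) => a + b

// Hand-rolled spy: counts calls to mathLib.add while fn runs,
// then restores the original function.
const countLibCalls = fn => {
  const original = mathLib.add
  let calls = 0
  mathLib.add = (...args) => {
    calls += 1
    return original(...args)
  }
  try {
    fn()
  } finally {
    mathLib.add = original
  }
  return calls
}

// Behavior check: only looks at the result.
const behaviorPasses = add => add(5, 10) === 15

// Implementation check: requires the library to be called.
const implementationPasses = add => countLibCalls(() => add(5, 10)) === 1

console.log(behaviorPasses(addWithLib), behaviorPasses(addNative)) // true true
console.log(implementationPasses(addWithLib), implementationPasses(addNative)) // true false
```

The refactor from addWithLib to addNative changes nothing a caller can observe, yet it flips the implementation check from passing to failing.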
Ok, now that we got that contrived example out of the way…here are some more realistic examples.
test('renders <UtilityNav /> with props'...
This test label relies on implementation: it names the “UtilityNav” component and React “props”.
test('can dismiss the onboarding guide'...
Now the test label focuses on the behavior we are checking, not on specifics such as file structure or libraries.
import React from 'react'
import { shallow } from 'enzyme'
import { mockStripeKey, unmockStripeKey } from 'fixtures/stripeKey'
import PromoCode from './PromoCode'
import Testimonials from './Testimonials'
import { Signup } from '.'
let component
let stripeMetaTag
const props = {
  cardHolderName: "Jane Doe",
  cardNumber: "4242424242424242",
}
beforeEach(() => {
  component = shallow(<Signup {...props} />)
  stripeMetaTag = mockStripeKey('stripe-key')
})
afterEach(() => {
  unmockStripeKey(stripeMetaTag)
})
test('renders a Testimonials component', () => {
  expect(component.find(Testimonials).exists()).toBe(true)
})
describe('when displayDiscount is true', () => {
  beforeEach(() => {
    const discountProps = {
      ...props,
      displayDiscount: true,
    }
    component = shallow(<Signup {...discountProps} />)
  })
  test('renders a PromoCode', () => {
    expect(component.find(PromoCode).exists()).toBe(true)
  })
})
The difficulty with this approach is that to debug or change a test like renders a PromoCode, you have to look through multiple levels of inheritance and mutations.
import React from 'react'
import { mount } from 'enzyme'
import { mockStripeKey, unmockStripeKey } from 'fixtures/stripeKey'
import { Signup } from '.'
test('has testimonials', () => {
  const stripeMetaTag = mockStripeKey('stripe-key')
  const signup = mount(
    <Signup
      cardHolderName="Jane Doe"
      cardNumber="4242424242424242"
    />
  )
  const input = signup.text()
  const output = 'This is one of my favorite apps'
  expect(input).toContain(output)
  unmockStripeKey(stripeMetaTag)
})
test('can use a promo code when enabled', () => {
  const stripeMetaTag = mockStripeKey('stripe-key')
  const signup = mount(
    <Signup
      cardHolderName="Jane Doe"
      cardNumber="4242424242424242"
      displayDiscount
    />
  )
  const input = signup.find('[data-testid="promoCode"]').exists()
  const output = true
  expect(input).toBe(output)
  unmockStripeKey(stripeMetaTag)
})
This isn’t as “DRY”. But I’ve found having explicit tests makes it much easier to update tests without breaking things because tests are self-contained. Having explicit tests like this means tests aren’t tied together with hidden state at multiple inheritance levels. Also, the tests now test behavior instead of implementation (other than the minimal required implementation detail of the data-testid).
If the repetition is cumbersome to you, you could put the repeating pieces in factory functions so that the output is still pure (has the same output every time you use it, instead of being mutated in inheritance levels).
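Here is a minimal sketch of such a factory, assuming a hypothetical makeSignupProps helper (it is not part of the original test suite). Each call returns a fresh object built from the same defaults, so one test's overrides or mutations can't leak into another.

```javascript
// Hypothetical factory for the Signup props used in the tests above.
// Every call builds a new object, so no state is shared between tests.
const makeSignupProps = (overrides = {}) => ({
  cardHolderName: 'Jane Doe',
  cardNumber: '4242424242424242',
  ...overrides,
})

const defaultProps = makeSignupProps()
const discountProps = makeSignupProps({ displayDiscount: true })

// Mutating one result does not affect later calls.
defaultProps.cardHolderName = 'Changed'

console.log(makeSignupProps().cardHolderName) // Jane Doe
console.log(discountProps.displayDiscount) // true
```

In a test you could then write mount(<Signup {...makeSignupProps({ displayDiscount: true })} />) and still read everything the test depends on in one place.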
I also think having some before/after blocks is totally fine as long as you keep it simple (maybe one level deep?). It’s when there are multiple levels of inheritance that it becomes difficult to understand and make changes.
expect(typeof startDateInput.prop('onChange')).toBe('function')
Static analysis tools like type checkers make these kinds of assertions easier and check them more deeply.
I’d suggest deleting these sorts of type checking tests and relying on a type checker instead (for example, TypeScript or Flow in JavaScript). For example:
type Props = {
  onChange: (newValue: DateTime) => void,
  ...
}
const DateInput = (props: Props) => ...
import React from 'react'
import { shallow } from 'enzyme'
let accountSelect
let accounts = getPopulatedAppState().accounts.data
test('renders a <BaseAccountSelect />', () => {
  accountSelect = shallow(
    <AccountSelect accounts={accounts} selectedAccountIds={['1', '2']} />
  )
  expect(accountSelect.find(BaseAccountSelect).length).toBe(1)
  const props = accountSelect.find(BaseAccountSelect).props()
  expect(props.accounts).toBe(accounts)
  expect(props.selected).toEqual(['1', '2'])
  expect(props.stackOptions).toBe(true)
})
The assertions in these kinds of tests (e.g. expect(props.accounts).toBe(accounts)) are testing that the following wire-up/syntax in React works:
<BaseAccountSelect accounts={accounts} ... />
Which is covered by type checking.
The same applies to style wire-up tests like this:
test('styles the component as unqueueable', () => {
  expect(body.props('className')).toContain(styles.unqueueable)
})
I’d suggest removing these types of tests because they have a high cost (can’t make changes to implementation without breaking them) and low value.
I think it would be more valuable to instead test specific behaviors. For example:
import React from 'react'
import { mount } from 'enzyme'
test('shows amount of selected accounts out of the total accounts available to select', () => {
  const accountSelect = mount(
    <AccountSelect
      accounts={{
        data: [
          {
            type: 'account',
            id: '1',
          },
          {
            type: 'account',
            id: '2',
          },
        ],
      }}
      selectedAccountIds={['2']}
    />
  )
  const input = accountSelect.text()
  // One of the two accounts passed in above is selected
  const output = '1/2'
  expect(input).toContain(output)
})
When we have UI code that has utility logic in it like this with related UI tests:
const hasMultipleTwitterAccounts = accounts
  && accounts.filter(
    account => account.platform === 'twitter'
  ).length > 1
...
<OneTwitterAccountPerContentAlert
  hasMultipleTwitterAccounts={hasMultipleTwitterAccounts}
/>
The logic with hasMultipleTwitterAccounts doesn’t have anything specific to UI in it, but it’s tied to React and Enzyme because it is in the component code.
We can pull the tests out of the UI and into their own utility module like this:
import hasMultipleTwitterAccounts from '.'
test("returns false when there aren't any accounts", () => {
  const input = hasMultipleTwitterAccounts([])
  const output = false
  expect(input).toBe(output)
})
test("returns false when there aren't any twitter accounts", () => {
  const input = hasMultipleTwitterAccounts([
    {
      platform: 'facebook',
    },
  ])
  const output = false
  expect(input).toBe(output)
})
test('returns false when there is one twitter account', () => {
  const input = hasMultipleTwitterAccounts([
    {
      platform: 'twitter',
    },
    {
      platform: 'facebook',
    },
  ])
  const output = false
  expect(input).toBe(output)
})
test('returns true when there are multiple twitter accounts', () => {
  const input = hasMultipleTwitterAccounts([
    {
      platform: 'twitter',
    },
    {
      platform: 'facebook',
    },
    {
      platform: 'twitter',
    },
  ])
  const output = true
  expect(input).toBe(output)
})
Then implement the utility module:
import type { Account } from 'types'
const hasMultipleTwitterAccounts = (accounts: Account[]) => {
  if (!accounts) {
    return false
  }
  const twitterAccounts = accounts.filter(
    account => account.platform === 'twitter'
  )
  return twitterAccounts.length > 1
}
export default hasMultipleTwitterAccounts
Then replace the logic in the UI with the utility module:
import hasMultipleTwitterAccounts from './hasMultipleTwitterAccounts'
...
<OneTwitterAccountPerContentAlert
  hasMultipleTwitterAccounts={hasMultipleTwitterAccounts(accounts)}
/>
Now the tests and logic aren’t tied to any UI / libraries.
import React from 'react'
import { shallow } from 'enzyme'
import Badge from 'typography/Badge'
import Header from '.'
test('has warning badge when there is an error', () => {
  const header = shallow(<Header errorCount={2} />)
  const badge = header.find(Badge)
  // Walks a fixed path through the rendered structure to reach the text
  expect(
    badge
      .children()
      .first()
      .children()
      .text()
  ).toBe('2')
})
With React and Enzyme, for example, using shallow rendering means you need to traverse a specific DOM structure with .children, .dive, .find(a), etc. This ties you to that DOM structure, so if you want to add another wrapper or change the a tag to a button, your tests will break.
import React from 'react'
import { mount } from 'enzyme'
import Header from '.'
test('has warning badge when there is an error', () => {
  const header = mount(<Header errorCount={42} />)
  const input = header.text()
  const output = '42'
  expect(input).toContain(output)
})
Now this test isn’t tied to DOM structure or specific components etc.
To get around DOM structure testing with React and Enzyme you can use mount and methods like .text(), .html(), .find('[data-testid="promoCode"]'), .find({ someProp: someValue }) etc. In theory, shallow rendering sounds cool but in practice it ties you to implementation.
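To illustrate the difference without pulling in Enzyme, here is a toy model in plain JavaScript. The node shape and the textOf and firstGrandchild helpers are assumptions made up for this sketch and are not Enzyme's API. Reading all of the text survives a change in markup, while walking a fixed path does not.

```javascript
// Toy stand-in for rendered output: nodes are strings or { tag, children }.
const textOf = node =>
  typeof node === 'string'
    ? node
    : (node.children || []).map(textOf).join('')

// Structure-bound lookup: assumes the text sits exactly two levels down.
const firstGrandchild = node => node.children[0].children[0]

const original = {
  tag: 'header',
  children: [{ tag: 'span', children: ['42'] }],
}

// Same visible output, but with an extra wrapper element.
const refactored = {
  tag: 'header',
  children: [
    { tag: 'div', children: [{ tag: 'strong', children: ['42'] }] },
  ],
}

console.log(textOf(original).includes('42')) // true
console.log(textOf(refactored).includes('42')) // true
console.log(firstGrandchild(original) === '42') // true
console.log(firstGrandchild(refactored) === '42') // false
```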
import React from 'react'
import { shallow } from 'enzyme'
import moment from 'moment'
import { DateCreated } from '.'
let dateCreated
let filter
let props
beforeEach(() => {
  filter = {}
  props = { filter, onFilterPatch: jest.fn() }
  dateCreated = shallow(<DateCreated {...props} />)
})
describe('when the input value changes', () => {
  describe('for the start date', () => {
    let startDateInput
    describe('by default', () => {
      let newStartDate
      beforeEach(() => {
        startDateInput = dateCreated.find({ placeholder: 'Pick start date' })
        newStartDate = moment('2017-09-24')
        startDateInput.simulate('change', newStartDate)
      })
      test('invokes onChange with a formatted startDate', () => {
        const formattedStartDate = newStartDate.format('YYYY-MM-DD')
        expect(props.onFilterPatch).toHaveBeenCalledWith({
          startDate: formattedStartDate,
        })
      })
    })
  })
})
import React from 'react'
import { mount } from 'enzyme'
import { createWaitForElement } from 'enzyme-wait'
import moment from 'moment'
import { DateCreated } from '.'
test('can change the start date', async () => {
  const dateCreated = mount(<DateCreated filter={{}} />)
  const changedStartDatetime = moment('2017-09-24')
  const startDateInput = dateCreated.find('[data-testid="dateInput"]')
  startDateInput.simulate('click')
  await createWaitForElement('[data-testid="openDateInput"]')(dateCreated)
  startDateInput.simulate('change', changedStartDatetime)
  await createWaitForElement('[data-testid="closedDateInput"]')(dateCreated)
  const input = dateCreated.text()
  const output = '2017-09-24'
  expect(input).toContain(output)
})
Now the test checks behavior (waiting on async React state changing) instead of function passing implementation with spies.
However, I’m not saying all spies are bad. If the behavior you are testing is that a function is called, a spy can be useful.
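For instance, when the contract of a unit is to notify its caller, the call itself is the behavior. Here is a small sketch in plain JavaScript with a hand-rolled spy. The names createSpy and changeStartDate are hypothetical and exist only for illustration; in a Jest suite, jest.fn() would play the role of createSpy.

```javascript
// Minimal hand-rolled spy that records the arguments of each call.
const createSpy = () => {
  const spy = (...args) => {
    spy.calls.push(args)
  }
  spy.calls = []
  return spy
}

// Hypothetical unit under test: its whole job is to report a
// formatted date to the callback it is given.
const changeStartDate = (date, onFilterPatch) => {
  onFilterPatch({ startDate: date.toISOString().slice(0, 10) })
}

const onFilterPatch = createSpy()
changeStartDate(new Date('2017-09-24T00:00:00Z'), onFilterPatch)

console.log(onFilterPatch.calls.length) // 1
console.log(onFilterPatch.calls[0][0].startDate) // 2017-09-24
```

Here the spy checks what the caller observes (one call, with a formatted date), not which helpers were used internally to produce it.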
If it is impossible to write a test with simple input => output, end-to-end tests might be a better fit. I wrote a blog post on how to write simple end-to-end tests with Puppeteer that provide huge coverage for little cost. Puppeteer is from the Chrome team and uses the official headless Chrome, so the tests can run in a real environment but without the brittleness and slowness of traditional end-to-end test setups.
Here is one example:
describe('logout', () => {
  test('can logout', async () => {
    await page.waitForSelector('[data-testid="userMenuButton"]')
    await page.click('[data-testid="userMenuButton"]')
    await page.waitForSelector('[data-testid="userMenuOpen"]')
    await page.click('[data-testid="logoutLink"]')
    await page.waitForSelector('[data-testid="userLoginForm"]')
  })
})
These types of tests can remove the need for many heavily mocked and brittle unit tests trying to accomplish the same thing. They also ensure these integrated pieces work together in a real environment, which is very valuable because you will be notified if any of the pieces stop working (e.g. your database or API is down, or a button no longer works).
Here are some good indicators that you are testing behavior instead of implementation:
The moral of the story is: each test should focus on a single behavior you don’t want to break 🙂