Skip to main content

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

Required fields*

Required fields*

Is it a bad practice to randomly-generate test data? [closed]

Since I've started using rspec, I've had a problem with the notion of fixtures. My primary concerns are this:

  1. I use testing to reveal surprising behavior. I'm not always clever enough to enumerate every possible edge case for the examples I'm testing. Using hard-coded fixtures seems limiting because it only tests my code with the very specific cases that I've imagined. (Admittedly, my imagination is also limiting with respect to which cases I test.)

  2. I use testing to as a form of documentation for the code. If I have hard-coded fixture values, it's hard to reveal what a particular test is trying to demonstrate. For example:

    describe Item do
      describe '#most_expensive' do
        it 'should return the most expensive item' do
          Item.most_expensive.price.should == 100
          # OR
          #Item.most_expensive.price.should == Item.find(:expensive).price
          # OR
          #Item.most_expensive.id.should == Item.find(:expensive).id
        end
      end
    end
    

    Using the first method gives the reader no indication what the most expensive item is, only that its price is 100. All three methods ask the reader to take it on faith that the fixture :expensive is the most expensive one listed in fixtures/items.yml. A careless programmer could break tests by creating an Item in before(:all), or by inserting another fixture into fixtures/items.yml. If that is a large file, it could take a long time to figure out what the problem is.

One thing I've started to do is add a #generate_random method to all of my models. This method is only available when I am running my specs. For example:

class Item
  def self.generate_random(params={})
    Item.create(
      :name => params[:name] || String.generate_random,
      :price => params[:price] || rand(100)
    )
  end
end

(The specific details of how I do this are actually a bit cleaner. I have a class that handles the generation and cleanup of all models, but this code is clear enough for my example.) So in the above example, I might test as follows. A warning for the feint of heart: my code relies heavily on use of before(:all):

describe Item do
  describe '#most_expensive' do
    before(:all) do
      @items = []
      3.times { @items << Item.generate_random }
      @items << Item.generate_random({:price => 50})
    end

    it 'should return the most expensive item' do
      sorted = @items.sort { |a, b| b.price <=> a.price }
      expensive = Item.most_expensive
      expensive.should be(sorted[0])
      expensive.price.should >= 50      
    end
  end
end

This way, my tests better reveal surprising behavior. When I generate data this way, I occasionally stumble upon an edge case where my code does not behave as expected, but which I wouldn't have caught if I were only using fixtures. For example, in the case of #most_expensive, if I forgot to handle the special case where multiple items share the most expensive price, my test would occasionally fail at the first should. Seeing the non-deterministic failures in AutoSpec would clue me in that something was wrong. If I were only using fixtures, it might take much longer to discover such a bug.

My tests also do a slightly better job of demonstrating in code what the expected behavior is. My test makes it clear that sorted is an array of items sorted in descending order by price. Since I expect #most_expensive to be equal to the first element of that array, it's even more obvious what the expected behavior of most_expensive is.

So, is this a bad practice? Is my fear of fixtures an irrational one? Is writing a generate_random method for each Model too much work? Or does this work?

Answer*

Cancel
3
  • Is the FixtureReplacement link broken? Commented Mar 12, 2009 at 11:28
  • Lots of excellent answers, but this one cut to the chase -- there's a better way to do what I want to do, and my test data doesn't have to be 'random' anymore. Commented Mar 12, 2009 at 16:09
  • bobocopy: It seems so. Odd, I think it was working yesterday. It's fixed now. Commented Mar 13, 2009 at 18:03