Background and Purpose
After the SimpleTest session on Wednesday morning I talked with sethcohn about possible solutions to some of the issues facing a full-scale implementation of unit test coverage. Some of the ideas are specific to the SimpleTest framework, but much could be applied to other testing frameworks. According to chx, it sounds like the SimpleTest framework is going to be used, which makes this information especially relevant.
To start off, the biggest unknown currently is how to go about implementing "function unit testing" (which I believe is the term): in other words, calling an individual function with different arguments and checking the results against the known correct output. Until now this has been virtually untested and, to my knowledge, not really thought about at all. Instead, "functional" testing has been worked on (which I was a part of).
By show of hands, a considerable number of people said they are interested in contributing to the testing challenge that Dries introduced in his keynote. If anywhere near that number of people contribute, it could create a very chaotic situation and put a lot of pressure on the SimpleTest maintainer(s). In an attempt to address this I have compiled some ideas and thoughts.
Function Unit Testing
First off, we need a standard method for naming and partitioning the function tests. This could be very similar to the current scheme used with the "functional" testing: for every include file, core file, etc. there would be one test file. It would be good to give each unit test file the same name as the file it tests, but with the extension changed to .test. This would provide a consistent naming method and remove some of the guesswork from creating unit tests. Removing the guesswork would increase the likelihood of attracting new contributors and would make the overall task simpler.
Since there is an enormous amount of work that needs to be done in order to get complete "function" coverage, or at least work towards it, there needs to be some way of listing out the tasks in order for them to be completed. I am proposing a script to automate several parts of this task.
A script could scan the files for all functions that will have SimpleTests written for them. This has several benefits, which I discuss below. The most apparent benefit would be a list of functions that could be published on d.o. and used by those who wish to contribute tests. The list could be refined with a weighting, initialized to the number of times each function is called throughout Drupal core. The SimpleTest maintainers, with suggestions from the community, could then re-order functions that deserve more or less priority. This would give contributors some sense of priority and a sense of accomplishment as things are moved off the list.
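As a rough sketch of what such a scanning script might look like (written here in Python for brevity; the real script could just as well be PHP), the following finds function definitions with a regular expression and weights each one by how often its name appears as a call. The helper name rank_functions, the sample file contents, and node_title are made up for illustration, and the regex approach is an assumption: it does not skip comments or strings, so the counts are only approximate.

```python
import re
from collections import Counter

# Matches PHP function definitions such as "function check_plain(".
DEFINITION = re.compile(r'^\s*function\s+&?\s*([A-Za-z_]\w*)\s*\(', re.MULTILINE)
# Matches any identifier followed by "(": covers calls and definitions alike.
CALL_LIKE = re.compile(r'\b([A-Za-z_]\w*)\s*\(')


def rank_functions(sources):
    """Return (name, file, call_count) tuples, most-called first.

    `sources` maps a file name to its PHP source text.
    """
    definitions = Counter()
    defined_in = {}
    for filename, text in sources.items():
        for name in DEFINITION.findall(text):
            definitions[name] += 1
            defined_in.setdefault(name, filename)

    occurrences = Counter()
    for text in sources.values():
        for name in CALL_LIKE.findall(text):
            if name in defined_in:
                occurrences[name] += 1

    # Every definition also matches CALL_LIKE once, so subtract those.
    ranked = [
        (name, defined_in[name], occurrences[name] - definitions[name])
        for name in defined_in
    ]
    ranked.sort(key=lambda row: (-row[2], row[0]))
    return ranked


# Made-up sample input, not real core source.
SAMPLE = {
    "includes/common.inc": (
        "<?php\n"
        "function check_plain($text) {\n"
        "  return htmlspecialchars($text);\n"
        "}\n"
        "function t($string) {\n"
        "  return check_plain($string);\n"
        "}\n"
    ),
    "modules/node/node.module": (
        "<?php\n"
        "function node_title($node) {\n"
        "  return t(check_plain($node->title));\n"
        "}\n"
    ),
}

if __name__ == "__main__":
    for name, filename, count in rank_functions(SAMPLE):
        print("%-12s %-26s %d" % (name, filename, count))
```

The call count is only the initial weight; maintainers would still adjust the order by hand as described above.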
Moving items off the list brings up another point of discussion. If any volume of tests starts to be committed, reviewing them will be time-consuming enough, never mind keeping the list synchronized. A solution would be to attach the discussion/patch node to the function/test it is associated with; when the issue is marked as committed/closed, the item would be removed from the list. This would create a nice workflow without spamming the issue queue ahead of time with a node for each function or set of functions.
The script could then go one step further and create stub test files that follow the naming standards and give contributors a nice base to start from. An approximate summary of what should be tested could also be created. The script could scan for some general test cases:
- Conditional logic
- Secondary functions
- Other patterns determined from looking at existing tests
From these it could determine the most common things that could be tested for each function and generate a test function stub with comments describing ideas for testing, giving contributors a place to start.
Progress
Progress could be measured in several ways.
- The method described above - node attachment
- Some sort of scanning script
- Attach a documentation tag like @test
The latter method could be implemented in addition to the first, which would provide a nice overall flow. The tag could then be interpreted by the API generation scripts to create a link that informs any later developer of the test location, so they can easily update the test when changing the original function. This would save developers from having to guess whether a test has been created for a function, and the documentation on api.d.o could also link to some sort of description of the test file and how to update tests.
Functional and Integration Testing
Lists of tasks could also be provided for both functional and integration testing. The lists could provide an overview of what tests need to be written and could be attached to nodes describing them in detail. These lists would most likely need to be created and managed by hand, but would be much smaller in comparison to the function testing list.
SimpleTest Theme
In all functional, and possibly integration, testing it is difficult to test formatted output when using the internal browser, to ensure that elements are output in the correct order or form. In addition, the theme layer is being indirectly tested.
A SimpleTest theme could be created in order to output the page in a machine-readable format. The theme should be fairly simple and could possibly use a combination of var_export and some block formatting. This theme could be toggled, or a function could be created that would allow the tests to switch themes.
The Drupal SimpleTest class could then be extended to support simple parsing of the page when using the SimpleTest theme. The data could then be stored in a hierarchical array, and convenience functions could be created for easy comparisons. The functions could be in the form of assertions. This would make interface testing a lot simpler and more reliable.
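The kind of convenience assertions I have in mind might look roughly like this sketch. It uses JSON as a stand-in for the theme's output format (the proposal above mentions var_export, which would need a PHP-side parser), and the page structure, path syntax, and function names are all hypothetical.

```python
import json


def parse_page(raw):
    """Parse the theme's machine-readable output into nested dicts/lists."""
    return json.loads(raw)


def element_at(page, path):
    """Walk a slash-separated path such as 'content/0/title'."""
    node = page
    for key in path.split("/"):
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node


def assert_element_equals(page, path, expected):
    """Assert that the element at `path` has the expected value."""
    actual = element_at(page, path)
    assert actual == expected, "%s: expected %r, got %r" % (path, expected, actual)


def assert_order(page, path, key, expected):
    """Assert that the items at `path` appear in the expected order."""
    actual = [item[key] for item in element_at(page, path)]
    assert actual == expected, "%s: expected order %r, got %r" % (path, expected, actual)


# Made-up example of what the theme might emit for a front page.
RAW = (
    '{"title": "Welcome", "content": ['
    '{"type": "node", "title": "First post"}, '
    '{"type": "node", "title": "Second post"}]}'
)

if __name__ == "__main__":
    page = parse_page(RAW)
    assert_element_equals(page, "title", "Welcome")
    assert_order(page, "content", "title", ["First post", "Second post"])
    print("page structure OK")
```

Because the assertions compare data rather than markup, a test would no longer depend on how a particular theme renders the page.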
Conclusion
I am very interested in implementing the above suggestions and can begin upon returning home this weekend. Any comments and suggestions are welcome. Some sort of consensus on whether or not this is something that should be done would also be nice.

Comments
list is good
editable list of functions broken down by functional area so folks can attach "in progress..." may be useful.
we may learn more as tests come in.
Automation +1
Excellent ideas. Automation will enable testing to be done on a much larger scale. The implementation proposed seems sound as well. Let me know if you want any help.
...And a little looking out for the other guy too. -- Mr. Smith
I started a discussion for
I started a discussion for adding @test to doxygen here: http://groups.drupal.org/node/9476
PHPUnit has some nice features I'd like to see in SimpleTest...
The two that come to mind immediately are
1) data providers: http://www.phpunit.de/pocket_guide/3.2/en/writing-tests-for-phpunit.html
2) Marking a test as incomplete: http://www.phpunit.de/pocket_guide/3.2/en/incomplete-and-skipped-tests.html
Project Started
I have started a project to implement many of the ideas mentioned here. It will not be completed until the coding standards have been discussed and I have time to implement them.
http://drupal.org/project/simpletest_unit
I'm not clear...
What exactly will you be doing? Creating a new module? Making mods to existing stuff? More, please :)
serialize became my friend but it's bad
Hi, I implemented a small amount of function unit testing pre-Xmas and quickly found that the hardest thing was testing for particular array return values. As a shortcut I used the serialize function and compared the two resulting strings.
While this felt like a bad idea because I was testing far too much detail, I found it was the only way to speed up tests.
To me it's not the functional unit testing that is hard; it's really that most of the data structures are uber-generic, i.e. nested arrays. And the interface for the data structures is only implied, not queryable, so it's hard to make mock values for testing purposes.
Arrays are great because then everything is the same, but for testing it makes things difficult. You end up creating a copy of the array that is returned, and often this is a hugely nested array or even a std object with arrays for values.