In within-subject study designs (also known as cross-over designs), subjects may be given sequences of treatments with the intent of studying the differences between the effects produced by individual treatments. One should keep in mind that such sequential testing always carries the risk that an earlier treatment affects the response to later ones (a carry-over effect). If there are reasons to expect such interference, within-subject designs should be avoided.

In the simplest case of a cross-over design, there are only two treatments and only two possible sequences in which to administer them (e.g. A-B and B-A). In nonclinical research and, particularly, in pharmacological studies, there is a strong trend to include at least three doses of a test drug and its vehicle. A Latin Square design is commonly used to allocate subjects to treatment conditions. The Latin Square is a very simple technique, but it is often applied in a way that does not result in proper randomization:

In the example above, each subject receives each of the four treatments over four consecutive study periods and, for any given study period, each treatment is equally represented. If there are more than four subjects participating in a study, then the above schedule is copied as many times as needed to cover all study subjects.
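The copy-and-repeat schedule described above can be sketched in a few lines of Python. This is only an illustration: it assumes the schedule is the usual cyclic square (each row is the previous row shifted by one position), and the function names `cyclic_latin_square` and `replicate_schedule` are hypothetical helpers, not part of any package.

```python
def cyclic_latin_square(treatments):
    """Build a cyclic Latin Square: row i is the treatment list rotated left by i."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]


def replicate_schedule(square, n_subjects):
    """Reuse the rows of the square in order until every subject has a sequence."""
    return [square[s % len(square)] for s in range(n_subjects)]


# Assumed example: four treatments labelled A-D, eight subjects.
square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
# A B C D
# B C D A
# C D A B
# D A B C

schedule = replicate_schedule(square, 8)  # subjects 5-8 repeat the sequences of subjects 1-4
```

Each row and each column contains every treatment exactly once, which is the defining Latin Square property, yet nothing in this procedure is random.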

Despite their apparent convenience (such schedules can be generated without any tools), the resulting allocation schedules are predictable and, even worse, are not balanced with respect to first-order carry-over effects (e.g., except for the first test period, D always comes after C). Therefore, such a Latin Square design is not an example of a properly conducted randomization procedure.
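The lack of balance can be checked by counting how often each treatment immediately follows each other treatment. The sketch below assumes the cyclic four-treatment schedule (ABCD, BCDA, CDAB, DABC); `carryover_counts` is a hypothetical helper written for this illustration.

```python
from collections import Counter


def carryover_counts(sequences):
    """Count ordered pairs (previous treatment, current treatment) across all sequences."""
    counts = Counter()
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[(prev, curr)] += 1
    return counts


# Assumed cyclic Latin Square schedule for treatments A-D.
cyclic = ["ABCD", "BCDA", "CDAB", "DABC"]
counts = carryover_counts(cyclic)

print(counts[("C", "D")])  # 3: D follows C in every sequence where D is not first
print(counts[("B", "D")])  # 0: D never follows B
print(sorted(counts))      # only 4 of the 12 possible ordered pairs ever occur
```

A design balanced for first-order carry-over would show the same count for all 12 ordered pairs.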

One solution would be to create a complete set of orthogonal Latin Squares. For example, when the number of treatments equals three, there are six (i.e. 3!) possible sequences: ABC, ACB, BAC, BCA, CAB, and CBA. If the sample size is a multiple of six, then all six sequences can be applied equally often. Because preclinical studies typically involve small sample sizes, this approach becomes problematic for larger numbers of treatments such as four, where there are already 24 (i.e. 4!) possible sequences.
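The growth in the number of sequences is easy to verify by enumerating all orderings. The following sketch uses Python's standard `itertools.permutations`; the helper name `all_sequences` is made up for this example.

```python
from itertools import permutations
from math import factorial


def all_sequences(treatments):
    """Return every possible ordering of the treatments as a string."""
    return ["".join(p) for p in permutations(treatments)]


three = all_sequences("ABC")
print(three)  # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']

four = all_sequences("ABCD")
print(len(four), factorial(4))  # 24 24
```

Using every sequence equally often therefore requires a sample size that is a multiple of 6 for three treatments and a multiple of 24 for four.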

A good alternative is the Williams design, which is a special case of a Latin Square in which every treatment follows every other treatment the same number of times:

The Williams design maintains all the advantages of the Latin Square but is also balanced with respect to first-order carry-over effects. There are six Williams squares possible in the case of four treatments. Thus, if there are more than four subjects, more than one Williams square would be applied (e.g. two squares for eight subjects).
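As an illustration, one Williams square for an even number of treatments can be built with the standard textbook construction: the first row follows the index pattern 0, 1, n-1, 2, n-2, ..., and each further row shifts every treatment by one. This is a minimal Python sketch, not the crossdes implementation; `williams_square` is a hypothetical helper, and the square it returns is only one of the possible Williams squares.

```python
from collections import Counter


def williams_square(treatments):
    """Build one Williams square (this sketch covers an even number of treatments only)."""
    n = len(treatments)
    if n % 2 != 0:
        raise ValueError("this sketch handles an even number of treatments only")
    # First row as indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Every further row adds 1 (mod n) to each index of the first row.
    return [[treatments[(idx + shift) % n] for idx in first] for shift in range(n)]


square = williams_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
# A B D C
# B C A D
# C D B A
# D A C B

# Balance check: every ordered pair of distinct treatments occurs exactly once.
pairs = Counter((p, c) for row in square for p, c in zip(row, row[1:]))
print(len(pairs), set(pairs.values()))  # 12 {1}
```

For an odd number of treatments, balance generally requires two such squares (the second with reversed sequences), which this sketch does not attempt.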

Constructing the Williams squares is not yet randomization. In studies based on within-subject designs, subjects are not randomized to treatments in the same sense as they are in a between-subject design. For a within-subject design, it is the treatment sequences that are randomized. In other words, after the Williams squares are constructed and selected, individual sequences are randomly assigned to the subjects.
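The random assignment step might look like the sketch below. It assumes the sequences have already been chosen (here one Williams square for treatments A-D, used twice for eight subjects as a simplification; rows from two different squares could be passed instead). The function `assign_sequences`, the subject labels, and the seed value are all illustrative assumptions.

```python
import random


def assign_sequences(subject_ids, sequences, seed=None):
    """Randomly assign pre-constructed treatment sequences to subjects.

    Every sequence is used equally often, so the number of subjects must be
    a multiple of the number of sequences.
    """
    if len(subject_ids) % len(sequences) != 0:
        raise ValueError("number of subjects must be a multiple of the number of sequences")
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    pool = list(sequences) * (len(subject_ids) // len(sequences))
    rng.shuffle(pool)
    return dict(zip(subject_ids, pool))


# Assumed inputs: one Williams square and eight hypothetical subject labels.
williams = ["ABDC", "BCAD", "CDBA", "DACB"]
subjects = [f"S{i}" for i in range(1, 9)]

allocation = assign_sequences(subjects, williams, seed=2024)
for subject, sequence in allocation.items():
    print(subject, sequence)
```

Recording the seed alongside the allocation table keeps the schedule reproducible and auditable, while the order in which sequences reach subjects remains unpredictable to the experimenter beforehand.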

For practical use, cross-over designs are supported by an R package called crossdes (see https://test2.paasp.net/wp-content/plugins/resource-center/r-scripts/).