Making design exploration software speak the language of engineers and not mathematicians has been a focus of development since the industry’s inception. Even so, our recent case study was typical in referencing the Latin hypercube design-of-experiments method, the radial basis function for generating a response surface model, the non-dominated sorting evolutionary algorithm to generate a Pareto front—all prompting this look into some of the quantitative methods that drive design space exploration.
DOE fundamentals recap—A designed experiment is a structured set of tests of a system or process. Integral to a designed experiment are response(s), factor(s) and a model.
- A response is a measurable result—fuel mileage (automotive), deposition rate (semiconductor), reaction yield (chemical process).
- A factor is any variable that the experimenter judges may affect a response of interest. Common factor types include continuous (may take any value on an interval; e.g., octane rating), categorical (having a discrete number of levels; e.g., a specific company or brand) and blocking (categorical, but not generally reproducible; e.g., automobile driver-to-driver variability).
- A model is a mathematical surrogate for the system or process.
- The experiment consists of exercising the model across some range of values assigned to the defined factors.
In deciding what values to use—more precisely, in deciding a strategy for choosing values—the goal is to achieve coverage of the design space that yields maximum information about its characteristics with least experimental effort, and with confidence that the set of points sampled gives a representative picture of the entire design space. Numerous sampling methods exist to do this: which to use depends on the nature of the problem being studied, and on the resources available—time, computational capacity, how much is already known about the problem.
In a helpful taxonomic discussion, Noesis Solutions observes that DOE methods can be classified into two categories: orthogonal designs and random designs. “The orthogonality of a design means that the model parameters are statistically independent. It means that the factors in an experiment are uncorrelated and can be varied independently. Widely used methods are fractional- and full-factorial designs, central composite designs and Box-Behnken designs.
Source: Noesis Solutions
“A factorial design has some disadvantages: initially it is usually unclear which factor is important and which is not. Since the underlying function is deterministic, there is a possibility that some of the initial design points collapse and one or more of the time-consuming computer experiments become useless. This issue is called the collapse problem. Most classic DOEs are only applicable to rectangular design regions. And the number of experiments increases exponentially with increasing number of levels.”
What of the other kind? Noesis: “A random design means that the model parameter values for the experiments are assigned on the basis of a random process, which is another widely used DOE method. The most commonly used random DOE method is the so-called Latin Hypercube Design (LHD).
Source: Noesis Solutions
“The collapse problem does not occur with LHDs. This is because if one or more factors appear not to be important, every point in the design still provides some information regarding the influence of the other factors on the response. In this way, none of the time-consuming computer experiments will turn out to be useless.”
Drill-down on some principal DOE methods
Examples of (a) random sampling, (b) full factorial sampling, and (c) Latin hypercube sampling, for a simple case of 10 samples (samples for τ ~ U(6, 10) and λ ~ N(0.4, 0.1) are shown). In random sampling, there are regions of the parameter space that are not sampled and other regions that are heavily sampled; in full factorial sampling, a random value is chosen in each interval for each parameter and every possible combination of parameter values is chosen; in Latin hypercube sampling, a value is chosen once and only once from every interval of every parameter (it is efficient and adequately samples the entire parameter space). Source: Hoare et al., Theoretical Biology and Medical Modelling, 2008.
- Full factorial designs—The experiment is run on every possible combination of the factors being studied. The most conservative of all design types, yielding the highest-confidence results, but at the highest cost in experimental resources. Sample size is the product of the numbers of levels of the factors: a factorial experiment with a two-level factor, a three-level factor and a four-level factor requires 2 × 3 × 4 = 24 runs. Too expensive to run in many if not most cases.
- Fractional factorial designs—Experiment consists of a subset (fraction) of the experiments that would have been run on the equivalent full factorial design. The subset is chosen to expose information about the most important features of the problem studied, using only a fraction of the experimental runs and resources of a full factorial design. Exploits the sparsity-of-effects principle that a system is usually dominated by main effects and low-order interactions, and thus only a few effects in a factorial experiment will be statistically significant.
- Latin hypercube designs—Latin hypercube sampling is a statistical method for generating a sample of plausible collections of parameter values from a multidimensional distribution. In statistical sampling, a square grid containing sample positions is a Latin square if (and only if) there is only one sample in each row and each column. A Latin hypercube is the generalization of this concept to an arbitrary number of dimensions, whereby each sample is the only one in each axis-aligned hyperplane containing it. When sampling a function of N variables, the range of each variable is divided into M equally probable intervals. M sample points are then placed to satisfy the Latin hypercube requirements; this forces the number of divisions, M, to be equal for each variable. This sampling scheme does not require more samples for more dimensions (variables); this independence is one of the main advantages of this sampling scheme. Another advantage is that random samples can be taken one at a time, remembering which samples were taken so far. (A short code sketch of this scheme follows this list.)
- Plackett-Burman designs—Used to identify the most important factors early in design exploration when complete knowledge about the system is often unavailable. An efficient screening method to identify the active factors in a design using as few experimental runs as possible.
- Central composite designs—Experimental design useful in response surface methodology for building a second-order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment. After the designed experiment is performed, linear regression is used, sometimes iteratively, to obtain results.
- Box-Behnken designs—A type of response surface design that does not contain an embedded factorial or fractional factorial design. Box-Behnken designs have treatment combinations that are at the midpoints of the edges of the experimental space and require at least three continuous factors. These designs allow efficient estimation of the first- and second-order coefficients. Because Box-Behnken designs often have fewer design points, they can be less expensive to run than central composite designs with the same number of factors. However, because they do not have an embedded factorial design, they are not suited for sequential experiments.
- Taguchi orthogonal arrays—Instead of having to test all possible combinations like the factorial design, the Taguchi method tests pairs of combinations. This allows for collection of the necessary data to determine which factors most affect product quality with a minimum amount of experimentation. The Taguchi method is best used when there is an intermediate number of variables (3 to 50) and few interactions between variables, and when only a few variables contribute significantly.
- Taguchi robust design arrays—Taguchi robust design is used to find the appropriate control factor levels in a design or a process to make the system less sensitive to variations in uncontrollable noise factors—i.e., to make the system robust.
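To make the Latin hypercube scheme above concrete, here is a minimal Wolfram Language sketch written for this article (an illustration, not any vendor's implementation; the helper name lhsSample is ours): each variable's range is cut into n equally probable intervals, a random permutation pairs the intervals across variables, and one point is drawn inside each interval.

(* lhsSample: n points in d dimensions on the unit hypercube. Each dimension is divided
   into n equal intervals; a random permutation of the interval indices pairs them across
   dimensions, and one uniform draw is placed inside each interval. *)
lhsSample[n_, d_] :=
  Transpose[Table[(RandomSample[Range[n]] - 1 + RandomReal[{0, 1}, n])/n, {d}]];

pts = lhsSample[10, 2];   (* 10 samples in 2 dimensions, as in the figure above *)
ListPlot[pts, PlotRange -> {{0, 1}, {0, 1}}]   (* each of the 10 rows and 10 columns of the grid holds exactly one point *)

Because every axis-aligned strip contains exactly one point, discarding a factor that turns out to be unimportant still leaves a well-spread sample in the remaining factors, which is the collapse-proof property Noesis describes above.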
Headline after the classic 1066 and All That
Latin Hypercube Sampling (LHC)
Jose Lopez-Collado
July 2015
After searching for a while, I finally found a paper describing LHC sampling (Swiler and Wyss 2004). LHC re-scales a random uniform variate within its domain so as to give better dispersion of the input numbers used to generate the pdf deviates. The Swiler & Wyss paper presents a detailed example of the algorithm, so anybody can check the results and the algorithm itself (pages 2-9 in the paper).
In essence, the sample size ss divides the sampling space into ss equally probable strata, and each uniform draw is then re-scaled to the limits of its stratum:

u_i' = u_i/ss + (i - 1)/ss,  for stratum i = 1, 2, ..., ss,

so the i-th re-scaled value always falls inside the interval ((i - 1)/ss, i/ss).
(* HERE IS THE MATHEMATICA CODE WITH COMMENTS *)
SetDirectory[NotebookDirectory[]];
wdist = WeibullDistribution[1.5, 3];
dname = wdist;
(* ss is the sample size*)
ss= 1500;
(*scaleu is the LHS re-scaling, very simple indeed! *)
scaleu[u_, i_, ss_] := u (1/ss) + ((i - 1)/ss);
(* set function as listable, capable of handling lists*)
SetAttributes[scaleu, Listable];
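(* Illustration added for this article: scaleu[0.5, 3, 10] evaluates to 0.25, i.e. the uniform draw 0.5 is mapped into the third of ten strata, the interval (0.2, 0.3) *)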
(*get a list of uniform random numbers*)
un = RandomVariate[UniformDistribution[{0, 1}], ss];
(*Get a sequence of integers, 1,2,3,... ,ss*)
strata = Range[ss];
(*Re-distribute the random numbers using LHC*)
usc = scaleu[un, strata, ss];
(* dname is the distribution to sample from; in this example it is wdist, a Weibull distribution with shape and scale parameters of 1.5 and 3 respectively *)
(* Obtain the list of random numbers using LHC: note that we use the inverse of the cumulative distribution function (CDF) to translate the re-scaled U values into deviates of the target distribution *)
pvL = Map[InverseCDF[dname, #] &, usc];
(* get some statistics, mean and standard deviation *)
mL = Mean[pvL]
2.70744
sdL = StandardDeviation[pvL]
1.83532
(* The next call is the conventional random sampling, NOT the LHC; it uses the built-in Mathematica function RandomVariate with the distribution name and sample size as arguments *)
pvR = RandomVariate[dname, ss];
(* get the same statistics: mean and standard deviation*)
mR = Mean[pvR]
2.62812
sdR = StandardDeviation[pvR]
1.7902
(* Draw the distributions, Latin Hypercube on the left, regular sampling on the right *)
GraphicsRow[{Histogram[pvL], Histogram[pvR]}]
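As a quick cross-check (an addition to the original post, not part of it), the exact moments of the Weibull(1.5, 3) distribution can be computed directly and compared with the sample estimates above:

(* Exact mean and standard deviation of the target distribution *)
Mean[wdist]                 (* ≈ 2.708 *)
StandardDeviation[wdist]    (* ≈ 1.839 *)

Both sampling schemes land close to these values here; across repeated runs, the LHC estimates typically scatter less, which is the point of the stratification.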