Optimal Experimental Design

Author: Jesús López-Fidalgo
Publisher: Springer Nature
Total Pages: 228
Release: 2023-10-14
Genre: Mathematics
ISBN: 3031359186

This textbook provides a concise introduction to optimal experimental design and efficiently prepares the reader for research in the area. It presents the common concepts and techniques for linear and nonlinear models as well as Bayesian optimal designs. The last two chapters are devoted to particular themes of interest, including recent developments and hot topics in optimal experimental design, and real-world applications. Numerous examples and exercises are included, some of them with solutions or hints, as well as references to the existing software for computing designs. The book is primarily intended for graduate students and young researchers in statistics and applied mathematics who are new to the field of optimal experimental design. Given the applications and the way concepts and results are introduced, parts of the text will also appeal to engineers and other applied researchers.

Numerical Approaches for Sequential Bayesian Optimal Experimental Design

Author: Xun Huan
Publisher:
Total Pages: 186
Release: 2015
Genre:
ISBN:

Experimental data play a crucial role in developing and refining models of physical systems. Some experiments can be more valuable than others, however. Well-chosen experiments can save substantial resources, and hence optimal experimental design (OED) seeks to quantify and maximize the value of experimental data. Common current practice for designing a sequence of experiments uses suboptimal approaches: batch (open-loop) design that chooses all experiments simultaneously with no feedback of information, or greedy (myopic) design that optimally selects the next experiment without accounting for future observations and dynamics. In contrast, sequential optimal experimental design (sOED) is free of these limitations. With the goal of acquiring experimental data that are optimal for model parameter inference, we develop a rigorous Bayesian formulation for OED using an objective that incorporates a measure of information gain. This framework is first demonstrated in a batch design setting, and then extended to sOED using a dynamic programming (DP) formulation. We also develop new numerical tools for sOED to accommodate nonlinear models with continuous (and often unbounded) parameter, design, and observation spaces. Two major techniques are employed to make solution of the DP problem computationally feasible. First, the optimal policy is sought using a one-step lookahead representation combined with approximate value iteration. This approximate dynamic programming method couples backward induction and regression to construct value function approximations. It also iteratively generates trajectories via exploration and exploitation to further improve approximation accuracy in frequently visited regions of the state space. Second, transport maps are used to represent belief states, which reflect the intermediate posteriors within the sequential design process. 
Transport maps offer a finite-dimensional representation of these generally non-Gaussian random variables, and also enable fast approximate Bayesian inference, which must be performed millions of times under nested combinations of optimization and Monte Carlo sampling. The overall sOED algorithm is demonstrated and verified against analytic solutions on a simple linear-Gaussian model. Its advantages over batch and greedy designs are then shown via a nonlinear application of optimal sequential sensing: inferring contaminant source location from a sensor in a time-dependent convection-diffusion system. Finally, the capability of the algorithm is tested for multidimensional parameter and design spaces in a more complex setting of the source inversion problem.

Optimal Experimental Design for Large-scale Bayesian Inverse Problems

Author: Keyi Wu (Ph. D.)
Publisher:
Total Pages: 0
Release: 2022
Genre:
ISBN:

Bayesian optimal experimental design (BOED), which includes active learning, Bayesian optimization, and sensor placement, provides a probabilistic framework to maximize the expected information gain (EIG) or mutual information (MI) for uncertain parameters or quantities of interest with limited experimental data. However, evaluating the EIG remains prohibitive for large-scale complex models due to the need to compute double integrals with respect to both the parameter and data distributions. In this work, we develop a fast and scalable computational framework to solve Bayesian optimal experimental design (OED) problems governed by partial differential equations (PDEs) with application to optimal sensor placement by maximizing the EIG. We (1) exploit the low-rank structure of the Jacobian of the parameter-to-observable map to extract the intrinsic low-dimensional data-informed subspace, and (2) employ a series of approximations of the EIG that reduce the number of PDE solves while retaining a high correlation with the true EIG. This allows us to propose an efficient offline–online decomposition for the optimization problem, using a new swapping greedy algorithm for both OED problems and goal-oriented linear OED problems. The offline stage dominates the cost and entails precomputing all components requiring PDE solves. The online stage optimizes sensor placement and does not require any PDE solves. We provide a detailed error analysis with an upper bound for the approximation error in evaluating the EIG for OED and goal-oriented OED linear cases. Finally, we evaluate the EIG with a derivative-informed projected neural network (DIPNet) surrogate for parameter-to-observable maps. With this surrogate, no further PDE solves are required to solve the optimization problem. We provide an analysis of the error propagated from the DIPNet approximation to the approximation of the normalization constant and the EIG under suitable assumptions.
We demonstrate the efficiency and scalability of the proposed methods for both linear inverse problems, in which one seeks to infer the initial condition for an advection–diffusion equation, and nonlinear inverse problems, in which one seeks to infer coefficients for a Poisson problem, an acoustic Helmholtz problem and an advection–diffusion–reaction problem. This dissertation is based on the following articles: A fast and scalable computational framework for large-scale and high-dimensional Bayesian optimal experimental design by Keyi Wu, Peng Chen, and Omar Ghattas [88]; An efficient method for goal-oriented linear Bayesian optimal experimental design: Application to optimal sensor placement by Keyi Wu, Peng Chen, and Omar Ghattas [89]; and Derivative-informed projected neural network for large-scale Bayesian optimal experimental design by Keyi Wu, Thomas O’Leary-Roseberry, Peng Chen, and Omar Ghattas [90]. This material is based upon work partially funded by DOE ASCR DE-SC0019303 and DE-SC0021239, DOD MURI FA9550-21-1-0084, and NSF DMS-2012453.
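
To make the cost of the EIG double integral mentioned in this abstract concrete, here is a minimal sketch of a plain nested Monte Carlo estimator. It is not the low-rank or DIPNet method of the dissertation. It assumes a hypothetical one-parameter toy model y = d·θ + ε with θ ~ N(0, 1) and ε ~ N(0, σ²), chosen only because its EIG has the closed form ½·log(1 + d²/σ²) to compare against. All function and parameter names are illustrative.

```python
import numpy as np


def log_likelihood(y, theta, d, sigma):
    """Log density of y given theta for the assumed toy model y = d*theta + noise."""
    return -0.5 * ((y - d * theta) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))


def eig_nested_mc(d, sigma=1.0, n_outer=2000, n_inner=2000, seed=0):
    """Nested Monte Carlo estimate of the expected information gain for design d."""
    rng = np.random.default_rng(seed)
    # Outer loop: draw (theta, y) pairs from the joint prior-predictive distribution.
    theta_outer = rng.standard_normal(n_outer)
    y = d * theta_outer + sigma * rng.standard_normal(n_outer)
    # Inner loop: fresh prior samples used to estimate the evidence p(y | d).
    theta_inner = rng.standard_normal(n_inner)
    log_lik = log_likelihood(y, theta_outer, d, sigma)
    inner = log_likelihood(y[:, None], theta_inner[None, :], d, sigma)
    # Log-mean-exp over the inner samples, stabilised by subtracting the row maximum.
    row_max = inner.max(axis=1, keepdims=True)
    log_evidence = row_max[:, 0] + np.log(np.mean(np.exp(inner - row_max), axis=1))
    return float(np.mean(log_lik - log_evidence))


def eig_exact(d, sigma=1.0):
    """Closed-form EIG for the linear-Gaussian toy model (standard normal prior)."""
    return 0.5 * np.log1p(d**2 / sigma**2)
```

Each estimate costs n_outer × n_inner likelihood evaluations. When every evaluation needs a PDE solve, this product is what makes direct evaluation prohibitive and motivates the approximations the abstract describes.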

On the Advancement of Optimal Experimental Design with Applications to Infectious Diseases

Author: David Price
Publisher:
Total Pages: 208
Release: 2015
Genre: Bayesian statistical decision theory
ISBN:

In this thesis, we investigate the optimal experimental design of some common biological experiments. The theory of optimal experimental design is a statistical tool that allows us to determine the optimal experimental protocol to gain the most information about a particular process, given constraints on resources. We focus on determining the optimal design for experiments where the underlying model is a Markov chain - a particularly useful stochastic model. Markov chains are commonly used to represent a range of biological systems, for example: the evolution and spread of populations and disease, competition between species, and evolutionary genetics. There has been little research into the optimal experimental design of systems where the underlying process is modelled as a Markov chain, which is surprising given their suitability for representing the random behaviour of many natural processes. While the first paper to consider the optimal experimental design of a system where the underlying process was modelled as a Markov chain was published in the mid-1980s, this research area has only recently started to receive significant attention. Current methods of evaluating the optimal experimental design within a Bayesian framework can be computationally inefficient or infeasible. This is due to the need for many evaluations of the posterior distribution, and thus, the model likelihood - which is computationally intensive for most non-linear stochastic processes. We apply an existing method for determining the optimal Bayesian experimental design to a common epidemic model, which has not been considered in a Bayesian framework previously. This method avoids computationally costly likelihood evaluations by implementing a likelihood-free approach to obtain the posterior distribution, known as Approximate Bayesian Computation (ABC). ABC is a class of methods which uses model simulations to estimate the posterior distribution.
While this approach to optimal Bayesian experimental design has some advantages, we also note some disadvantages in its implementation. Having noted some drawbacks associated with the current approach to optimal Bayesian experimental design, we propose a new method - called ABCdE - which is more efficient, and easier to implement. ABCdE uses ABC methods to calculate the utility of all designs in a specified region of the design space. For problems with a low-dimensional design space, it evaluates the optimal design in significantly less computation time than the existing methods. We apply ABCdE to some common epidemic models, and compare the optimal Bayesian experimental designs to those published in the literature using existing methods. We present a comparison of how well the designs - obtained from each of the different methods - perform when used for statistical inference. In each case, the optimal designs obtained via ABCdE are similar to those obtained via existing methods, and the statistical performance is indistinguishable. The main applications we consider are concerned with group dose-response challenge experiments. A group dose-response challenge experiment is an experiment in which we expose subjects to a range of doses of an infectious agent or bacteria (or drug), and measure the number that are infected (or, the response) at each dose. These experiments are routinely used to quantify the infectivity or harmful (or safe) levels of an infectious agent or bacteria (e.g., minimum dose required to infect 50% of the population), or the efficacy of a drug. We focus particularly on the introduction of the bacterium Campylobacter jejuni to chickens. C. jejuni can be spread from animals to humans, and is the species most commonly associated with enteric (intestinal) disease in humans.
By quantifying the dose-response relationship of the bacteria in chickens - via group dose-response challenge experiments - we can determine the safe levels of bacteria in chickens with the aim to minimise, or eradicate, the risk of transmission amongst the flock, and thus, to humans. Thus, accurate estimates of the dose-response relationship are crucial - and can be obtained efficiently by considering the optimal experimental design. However, the statistical analysis of most dose-response experiments assumes that the subjects are independent. Chickens engage in coprophagic activity (oral ingestion of faecal matter), and are social animals, meaning they must be housed in groups. Thus, oral-faecal transmission of the bacteria may be present in these experiments, violating the independence assumption and altering the measured dose-response relationship. We use a Markov chain model to represent the dynamics of these experiments, accounting for the latency period of the bacteria, and the transmission between chickens. We determine the optimal experimental design for a range of models, and describe the relationship between different model aspects and the resulting designs.
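
As an illustration of the likelihood-free idea this abstract describes, here is a minimal sketch of ABC rejection sampling. It is not the thesis's ABCdE algorithm or its Markov chain epidemic model. It assumes a hypothetical single-dose experiment in which each of n_subjects is infected independently with unknown probability p, with a Uniform(0, 1) prior on p. All names are illustrative.

```python
import numpy as np


def abc_rejection(observed_infected, n_subjects, n_sims=100_000, tolerance=0, seed=0):
    """Approximate posterior samples of p using simulations only (no likelihood calls)."""
    rng = np.random.default_rng(seed)
    # 1. Draw candidate parameters from the assumed Uniform(0, 1) prior.
    p_candidates = rng.uniform(0.0, 1.0, size=n_sims)
    # 2. Simulate one experiment per candidate (assumed independent-infection model).
    simulated = rng.binomial(n_subjects, p_candidates)
    # 3. Keep candidates whose simulated count is within `tolerance` of the observed count.
    accepted = np.abs(simulated - observed_infected) <= tolerance
    return p_candidates[accepted]


def exact_posterior_mean(observed_infected, n_subjects):
    """Mean of the Beta(k + 1, n - k + 1) posterior, available for this toy model only."""
    return (observed_infected + 1) / (n_subjects + 2)
```

With tolerance 0 and a discrete summary, the accepted draws come from the exact posterior, so for example `abc_rejection(7, 20).mean()` can be compared with `exact_posterior_mean(7, 20)`. In models with an intractable likelihood no such check exists, and the tolerance trades accuracy against acceptance rate.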