For the very latest, see autogenerated lists, like:



    2023 Global solutions in deep learning in sub-linear time
    • Stephen Becker. NeurIPS 2023 (best paper award) preprint
    2024 Online randomized interpolative decomposition with a posteriori error estimator for temporal PDE data reduction
    Abstract: Traditional low-rank approximation is a powerful tool to compress the huge data matrices that arise in simulations of partial differential equations (PDE), but suffers from high computational cost and requires several passes over the PDE data. The compressed data may also lack interpretability thus making it difficult to identify feature patterns from the original data. To address this issue, we present an online randomized algorithm to compute the interpolative decomposition (ID) of large-scale data matrices in situ. Compared to previous randomized IDs that used the QR decomposition to determine the column basis, we adopt a streaming ridge leverage score-based column subset selection algorithm that dynamically selects proper basis columns from the data and thus avoids an extra pass over the data to compute the coefficient matrix of the ID. In particular, we adopt a single-pass error estimator based on the non-adaptive Hutch++ algorithm to provide real-time error approximation for determining the best coefficients. As a result, our approach only needs a single pass over the original data and thus is suitable for large and high-dimensional matrices stored outside of core memory or generated in PDE simulations. We also provide numerical experiments on turbulent channel flow and ignition simulations, and on the NSTX Gas Puff Image dataset, comparing our algorithm with the offline ID algorithm to demonstrate its utility in real-world applications.
    • Angran Li, Stephen Becker, Alireza Doostan. [arXiv]
    2024 Variational Entropy Search for Adjusting Expected Improvement
    Abstract: Bayesian optimization is a widely used technique for optimizing black-box functions, with Expected Improvement (EI) being the most commonly utilized acquisition function in this domain. While EI is often viewed as distinct from other information-theoretic acquisition functions, such as entropy search (ES) and max-value entropy search (MES), our work reveals that EI can be considered a special case of MES when approached through variational inference (VI). In this context, we have developed the Variational Entropy Search (VES) methodology and the VES-Gamma algorithm, which adapts EI by incorporating principles from information-theoretic concepts. The efficacy of VES-Gamma is demonstrated across a variety of test functions and real datasets, highlighting its theoretical and practical utilities in Bayesian optimization scenarios.
    • Nuojin Cheng, Stephen Becker. [arXiv]
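As background to the entry above: under a Gaussian posterior, EI has a standard closed form. The sketch below is that textbook formula only, not the paper's VES or VES-Gamma methodology, and the function name is illustrative:

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) when maximizing:
    EI = E[max(f - f_best, 0)] = sigma * (z*Phi(z) + phi(z)), z = (mu - f_best)/sigma."""
    if sigma <= 0:
        return max(mu - f_best, 0.0)        # degenerate (noiseless) posterior
    z = (mu - f_best) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return sigma * (z * Phi + phi)
```

At `mu = f_best` this reduces to `sigma * phi(0)`, which is why EI still explores points with high posterior uncertainty.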
    2023 In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD
    Abstract: Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations. As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks. Additionally, performing inference at runtime requires non-trivial coupling of ML framework libraries with simulation codes. This work offers a solution to both limitations by simplifying this coupling and enabling in situ training and inference workflows on heterogeneous clusters. Leveraging SmartSim, the presented framework deploys a database to store data and ML models in memory, thus circumventing the file system. On the Polaris supercomputer, we demonstrate perfect scaling efficiency of the data transfer and inference costs up to the full machine size, thanks to a novel co-located deployment of the database. Moreover, we train an autoencoder in situ from a turbulent flow simulation, showing that the framework overhead is negligible relative to a solver time step and training epoch.
    • Riccardo Balin, Filippo Simini, Cooper Simpson, Andrew Shao, Alessandro Rigazzi, Matthew Ellis, Stephen Becker, Alireza Doostan, John A. Evans, Kenneth E. Jansen. [arXiv]
    2023 The Dependence of Parallel Imaging with Linear Predictability on the Undersampling Direction
    Abstract: Parallel imaging with linear predictability takes advantage of information present in multiple receive coils to accurately reconstruct the image with fewer samples. Commonly used algorithms based on linear predictability include GRAPPA and SPIRiT. We present a sufficient condition for reconstruction based on the direction of undersampling and the arrangement of the sensing coils. This condition is justified theoretically and examples are shown using real data. We also propose a metric based on the fully-sampled auto-calibration region which can show which direction(s) of undersampling will allow for a good quality image reconstruction.
    • Alex McManus, Stephen Becker, Nicholas Dwork. [arXiv]
    2022 Fast Algorithms for Monotone Lower Subsets of Kronecker Least Squares Problems
    Abstract: Approximate solutions to large least squares problems can be computed efficiently using leverage score-based row-sketches, but directly computing the leverage scores, or sampling according to them with naive methods, still requires an expensive manipulation and processing of the design matrix. In this paper we develop efficient leverage score-based sampling methods for matrices with certain Kronecker product-type structure; in particular we consider matrices that are monotone lower column subsets of Kronecker product matrices. Our discussion is general, encompassing least squares problems on infinite domains, in which case matrices formally have infinitely many rows. We briefly survey leverage score-based sampling guarantees from the numerical linear algebra and approximation theory communities, and follow this with efficient algorithms for sampling when the design matrix has Kronecker-type structure. Our numerical examples confirm that sketches based on exact leverage score sampling for our class of structured matrices achieve superior residual compared to approximate leverage score sampling methods.
    • Osman Malik, Yiming Xu, Nuojin Cheng, Stephen Becker, Alireza Doostan, Akil Narayan. [arXiv] [code]
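The leverage-score sampling this paper accelerates can be illustrated in its basic, unstructured form. A minimal sketch assuming exact leverage scores computed from a thin QR factorization, which is exactly the expensive step the paper's structured algorithms avoid; the function name and parameters are illustrative:

```python
import numpy as np

def leverage_score_sketch_lstsq(A, b, m, seed=0):
    """Approximately solve min ||Ax - b||_2 by sampling m rows of A with
    probability proportional to their (exact) leverage scores."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(A)                 # thin QR; leverage score i = ||Q[i, :]||^2
    lev = np.sum(Q**2, axis=1)
    p = lev / lev.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    w = 1.0 / np.sqrt(m * p[idx])          # importance-sampling reweighting
    SA, Sb = w[:, None] * A[idx], w * b[idx]
    return np.linalg.lstsq(SA, Sb, rcond=None)[0]
```

For a consistent (noiseless) system the sketched problem recovers the exact solution whenever the sampled rows have full column rank.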
    2021 On the computation of a non-parametric estimator by convex optimization
    Abstract: Estimation of linear functionals from observed data is an important task in many subjects. Juditsky & Nemirovski [The Annals of Statistics 37.5A (2009): 2278-2300] propose a framework for non-parametric estimation of linear functionals in a very general setting, with nearly minimax optimal confidence intervals. They compute this estimator and the associated confidence interval by approximating the saddle-point of a function. While this optimization problem is convex, it is rather difficult to solve using existing off-the-shelf optimization software. Furthermore, this computation can be expensive when the estimators live in a high-dimensional space. We propose a different algorithm to construct this estimator. Our algorithm can be used with existing optimization software and is much cheaper to implement even when the estimators are in a high-dimensional space, as long as the Hellinger affinity (or the Bhattacharyya coefficient) for the chosen parametric distribution can be efficiently computed given the parameters. We hope that our algorithm will foster the adoption of this estimation technique to a wider variety of problems with relative ease.
    • Akshay Seshadri, Stephen Becker. [arXiv]
    2022 Approximate maximum likelihood estimators for linear regression with design matrix uncertainty
    Abstract: In this paper we consider regression problems subject to arbitrary noise in the operator or design matrix. This characterization appropriately models many physical phenomena with uncertainty in the regressors. Although the problem has been studied extensively for ordinary/total least squares, and via models that implicitly or explicitly assume Gaussianity, less attention has been paid to improving estimation for regression problems under general uncertainty in the design matrix. To address difficulties encountered when dealing with distributions of sums of random variables, we rely on the saddle point method to estimate densities and form an approximate log-likelihood to maximize. We show that the proposed method performs favorably against other classical methods.
    • Richard Clancy, Stephen Becker. [arXiv]
    2021 High probability convergence and uniform stability bounds for nonconvex stochastic gradient descent
    Abstract: Stochastic gradient descent is one of the most common iterative algorithms used in machine learning. While being computationally cheap to implement, recent literature suggests it may have implicit regularization properties that prevent over-fitting. This paper analyzes the properties of stochastic gradient descent from a theoretical standpoint to help bridge the gap between theoretical and empirical results. Most theoretical results either assume convexity or only provide convergence results in mean, while this paper proves convergence bounds in high probability without assuming convexity. Assuming strong smoothness, we prove high probability convergence bounds in two settings: (1) assuming the Polyak-Łojasiewicz inequality and norm sub-Gaussian gradient noise and (2) assuming norm sub-Weibull gradient noise. In the first setting, we combine our convergence bounds with existing generalization bounds in order to bound the true risk and show that for a certain number of epochs, convergence and generalization balance in such a way that the true risk goes to the empirical minimum as the number of samples goes to infinity. In the second setting, as an intermediate step to proving convergence, we prove a probability result of independent interest. The probability result extends Freedman-type concentration beyond the sub-exponential threshold to heavier-tailed martingale difference sequences.
    • Liam Madden, Emiliano Dall'Anese, Stephen Becker. [arXiv] [code]
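A toy illustration of the first setting analyzed above: SGD on a function satisfying the Polyak-Łojasiewicz inequality, with additive Gaussian gradient noise. This is a minimal numerical experiment one could run, not the paper's analysis; the step size and noise level are arbitrary choices:

```python
import numpy as np

def sgd(grad, x0, step, iters, noise=0.1, seed=1):
    """Plain SGD with additive Gaussian gradient noise:
    x_{k+1} = x_k - step * (grad(x_k) + noise * xi_k)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x -= step * (grad(x) + noise * rng.standard_normal(x.shape))
    return x

# f(x) = 0.5*||x||^2 satisfies the PL inequality ||grad f||^2 >= 2*mu*f with mu = 1,
# so the iterates contract toward the minimizer up to a noise-induced floor.
x_final = sgd(lambda x: x, np.ones(10), step=0.1, iters=2000)
```

With a constant step size the iterates settle into a neighborhood of the minimizer whose radius scales with the noise level, matching the intuition behind high-probability convergence bounds.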

Journal papers

    [J45] 2024 Optimization of Array Encoding for Ultrasound Imaging
    Abstract: Objective: The transmit encoding model for synthetic aperture imaging is a robust and flexible framework for understanding the effect of acoustic transmission on ultrasound image reconstruction. Our objective is to use machine learning (ML) to construct scanning sequences, parameterized by time delays and apodization weights, that produce high quality B-mode images. Approach: We use an ML model in PyTorch and simulated RF data from Field II to probe the space of possible encoding sequences for those that minimize a loss function that describes image quality. This approach is made computationally feasible by a novel formulation of the derivative for delay-and-sum beamforming. We demonstrate these results experimentally on wire targets and a tissue-mimicking phantom. Main Results: When trained according to a given set of imaging parameters (imaging domain, hardware restrictions), our ML imaging model produces optimized encoding sequences that improve a number of standard quality metrics including resolution, field of view, and contrast, over conventional sequences. Significance: This work demonstrates that the set of encoding schemes that are commonly used represent only a narrow subset of those available. Additionally, it demonstrates the value for ML tasks in synthetic transmit aperture imaging to consider the beamformer within the model, instead of as purely post-processing.
    • Jacob Spainhour, Korben Smart, Stephen Becker, Nick Bottenus. Physics in Medicine and Biology. [DOI] [arXiv] [code]
    [J44] 2024 Bi-fidelity Variational Auto-encoder for Uncertainty Quantification
    Abstract: Quantifying the uncertainty of quantities of interest (QoIs) from physical systems is a primary objective in model validation. However, achieving this goal entails balancing the need for computational efficiency with the requirement for numerical accuracy. To address this trade-off, we propose a novel bi-fidelity formulation of variational auto-encoders (BF-VAE) designed to estimate the uncertainty associated with a QoI from low-fidelity (LF) and high-fidelity (HF) samples of the QoI. This model allows for the approximation of the statistics of the HF QoI by leveraging information derived from its LF counterpart. Specifically, we design a bi-fidelity auto-regressive model in the latent space that is integrated within the VAE's probabilistic encoder-decoder structure. An effective algorithm is proposed to maximize the variational lower bound of the HF log-likelihood in the presence of limited HF data, resulting in the synthesis of HF realizations with a reduced computational cost. Additionally, we introduce the concept of the bi-fidelity information bottleneck (BF-IB) to provide an information-theoretic interpretation of the proposed BF-VAE model. Our numerical results demonstrate that BF-VAE leads to considerably improved accuracy, as compared to a VAE trained using only HF data when limited HF data is available.
    [J43] 2023 QuadConv: Quadrature-Based Convolutions with Applications to Non-Uniform PDE Data Compression
    Abstract: We present a new convolution layer for deep learning architectures which we call QuadConv -- an approximation to continuous convolution via quadrature. Our operator is developed explicitly for use on unstructured data, and accomplishes this by learning a continuous kernel that can be sampled at arbitrary locations. In the setting of neural compression, we show that a QuadConv-based autoencoder, resulting in a Quadrature Convolutional Neural Network (QCNN), can match the performance of standard discrete convolutions on structured uniform data, as in CNNs, and maintain this accuracy on unstructured data.
    • Kevin Doherty, Cooper Simpson, Stephen Becker, Alireza Doostan. Journal of Computational Physics, to appear. [DOI] [arXiv] [code]
    [J42] 2023 Subsampling of Parametric Models with Bi-fidelity Boosting
    Abstract: Least squares regression is a ubiquitous tool for building emulators (a.k.a. surrogate models) of problems across science and engineering for purposes such as design space exploration and uncertainty quantification. When the regression data are generated using an experimental design process (e.g., a quadrature grid) involving computationally expensive models, or when the data size is large, sketching techniques have shown promise to reduce the cost of the construction of the regression model while ensuring accuracy comparable to that of the full data. However, random sketching strategies, such as those based on leverage scores, lead to regression errors that are random and may exhibit large variability. To mitigate this issue, we present a novel boosting approach that leverages cheaper, lower-fidelity data of the problem at hand to identify the best sketch among a set of candidate sketches. This in turn specifies the sketch of the intended high-fidelity model and the associated data. We provide theoretical analyses of this bi-fidelity boosting (BFB) approach and discuss the conditions the low- and high-fidelity data must satisfy for a successful boosting. In doing so, we derive a bound on the residual norm of the BFB sketched solution relating it to its ideal, but computationally expensive, high-fidelity boosted counterpart. Empirical results on both manufactured and PDE data corroborate the theoretical analyses and illustrate the efficacy of the BFB solution in reducing the regression error, as compared to the non-boosted solution.
    • Nuojin Cheng, Osman Malik, Yiming Xu, Stephen Becker, Alireza Doostan, Akil Narayan. SIAM/ASA Journal on Uncertainty Quantification (to appear). [DOI] [arXiv] [code]
    [J41] 2024 Versatile fidelity estimation with confidence
    Abstract: As quantum devices become more complex and the requirements on these devices become more demanding, it is crucial to be able to verify the performance of such devices in a scalable and reliable fashion. A cornerstone task in this challenge is quantifying how close an experimentally prepared quantum state is to the desired one. Here we present a method to construct an estimator for the quantum state fidelity that is compatible with any measurement protocol. Our method provides a confidence interval on this estimator that is guaranteed to be nearly minimax optimal for the specified measurement protocol. For a well-chosen measurement scheme, our method is competitive in the number of measurement outcomes required for estimation. We demonstrate our method using simulations and experimental data from a trapped-ion quantum computer and compare the results to state-of-the-art techniques. Our method can be easily extended to estimate the expectation value of any observable, such as entanglement witnesses.
    [J40] 2024 Theory of Versatile fidelity estimation with confidence
    Abstract: Estimating the fidelity with a target state is important in quantum information tasks. Many fidelity estimation techniques present a suitable measurement scheme to perform the estimation. In contrast, we present techniques that allow the experimentalist to choose a convenient measurement setting. Our primary focus lies on a method that constructs an estimator with nearly minimax optimal confidence interval for any specified measurement setting. We demonstrate, through a combination of theoretical and numerical results, various desirable properties for the method -- robustness against experimental imperfections, competitive sample complexity, and accurate estimates in practice. We compare this method with Maximum Likelihood Estimation and the associated Profile Likelihood method, a Semi-Definite Programming based approach, as well as a popular direct fidelity estimation technique.
    [J39] 2023 Modeling massive multivariate spatial data with the basis graphical lasso
    Abstract: We propose a new modeling framework for highly-multivariate spatial processes that synthesizes ideas from recent multiscale and spectral approaches with graphical models. The basis graphical lasso writes a univariate Gaussian process as a linear combination of basis functions weighted with entries of a Gaussian graphical vector whose graph is estimated from optimizing an ℓ1 penalized likelihood. This paper extends the setting to a multivariate Gaussian process where the basis functions are weighted with Gaussian graphical vectors. We motivate a model where the basis functions represent different levels of resolution and the graphical vectors for each level are assumed to be independent. Using an orthogonal basis grants linear complexity and memory usage in the number of spatial locations, the number of basis functions, and the number of realizations. An additional fusion penalty encourages a parsimonious conditional independence structure in the multilevel graphical model. We illustrate our method on a large climate ensemble from the National Center for Atmospheric Research's Community Atmosphere Model that involves 40 spatial processes.
    • Mitchell Krock, William Kleiber, Dorit Hammerling, Stephen Becker. Journal of Computational and Graphical Statistics. [DOI] [arXiv]
    [J38] 2023 Comparison of spatial encodings for ultrasound imaging
    Abstract: Ultrasound pulse sequencing and receive signal focusing work hand-in-hand to determine image quality. These are commonly linked by geometry, for example, using focused beams or plane waves in transmission paired with appropriate time-of-flight calculations for focusing. Spatial encoding allows a broader class of array transmissions but requires decoding of the recorded echoes before geometric focusing can be applied. Recent work has expanded spatial encoding to include not only element apodizations, but also element time delays. This powerful technique allows for a unified beamforming strategy across different pulse sequences and increased flexibility in array signal processing giving access to estimates of individual transmit element signals, but tradeoffs in image quality between these encodings have not been previously studied. We evaluate in simulation several commonly used time delay and amplitude encodings and investigate the optimization of the parameter space for each. Using the signal-to-noise ratio (SNR), point resolution, and lesion detectability, we found tradeoffs between focused beams, plane waves, and Hadamard weight encodings. Beams with broader geometries maintained a wider field of view after decoding at the cost of the SNR and lesion detectability. Focused beams and plane waves showed slightly reduced resolution compared to Hadamard weights in some cases, especially close to the array. We also found overall degraded image quality using random weight or random delay encodings. We validate these findings with experimental phantom imaging for select cases. We believe that these findings provide a starting point for sequence optimization and improved image quality using the spatial encoding approach for imaging.
    • Nick Bottenus, Jacob Spainhour, Stephen Becker. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 70(1) pp 52--63. [DOI] Cover artwork for this journal issue features our article.
    [J37] 2021 A study of scalar optically-pumped magnetometers for use in magnetoencephalography without shielding
    Abstract: Scalar optically-pumped magnetometers (OPMs) are being developed in small packages with high sensitivities. The high common-mode rejection ratio of these sensors allows for detection of very small signals in the presence of large background fields making them ideally suited for brain imaging applications in unshielded environments. Despite a flurry of activity around the topic, questions remain concerning how well a dipolar source can be localized under such conditions, especially when using few sensors. In this paper, we investigate the source localization capabilities using an array of scalar OPMs in the presence of a large background field while varying dipole strength, sensor count, and forward model accuracy. We also consider localization performance as the orientation angle of the background field changes. Our results are validated experimentally through accurate localization using a phantom virtual array mimicking a current dipole in a conducting sphere in a large background field. Our results are intended to give researchers a general sense of the capabilities and limitations of scalar OPMs for magnetoencephalography systems.
    • Richard Clancy, Vladislav Gerginov, Orang Alem, Stephen Becker, Svenja Knappe. Physics in Medicine and Biology. [DOI] [arXiv]
    [J36] 2021 Spectral estimation from simulations via sketching
    Abstract: Sketching is a stochastic dimension reduction method that preserves geometric structures of data and has applications in high-dimensional regression, low rank approximation and graph sparsification. In this work, we show that sketching can be used to compress simulation data and still accurately estimate time autocorrelation and power spectral density. For a given compression ratio, the accuracy is much higher than using previously known methods. In addition to providing theoretical guarantees, we apply sketching to a molecular dynamics simulation of methanol and find that the estimate of spectral density is 90% accurate using only 10% of the data.
    • Zhishen Huang, Stephen Becker. Journal of Computational Physics. [DOI] [arXiv]
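The core idea, that a Johnson-Lindenstrauss-type sketch approximately preserves the inner products defining time autocorrelation, can be sketched as follows. This is a generic Gaussian sketch of the spatial dimension for illustration; the paper's method and guarantees are more refined, and the function name is an assumption:

```python
import numpy as np

def sketched_autocorr(X, lag, m, seed=0):
    """Estimate the spatially averaged time autocorrelation at a given lag
    from an m-row Gaussian sketch of the spatial dimension of X (space x time)."""
    rng = np.random.default_rng(seed)
    n, T = X.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # E[S^T S] = I, so inner products survive
    Y = S @ X                                       # compressed data, m << n rows
    return np.sum(Y[:, :T - lag] * Y[:, lag:]) / (n * (T - lag))
```

Because `<S x, S y>` concentrates around `<x, y>`, the autocorrelation computed from the compressed data tracks the full-data value while storing only an m/n fraction of the rows.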
    [J35] 2021 A stochastic subspace approach to gradient-free optimization in high dimensions
    Abstract: We present a stochastic descent algorithm for unconstrained optimization that is particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The algorithm maps the gradient onto a low-dimensional random subspace of dimension l at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. Without requiring a full gradient, this mapping can be performed by computing l directional derivatives (e.g., via forward-mode automatic differentiation). We give proofs for convergence in expectation under various convexity assumptions as well as probabilistic convergence results under strong-convexity. Our method extends the well-known Gaussian smoothing technique to descent in subspaces of dimension greater than one, opening the doors to new analysis of Gaussian smoothing when more than one directional derivative is used at each iteration. We also provide a finite-dimensional variant of a special case of the Johnson-Lindenstrauss lemma. Experimentally, we show that our method compares favorably to coordinate descent, Gaussian smoothing, gradient descent and BFGS (when gradients are calculated via forward-mode automatic differentiation) on problems from the machine learning and shape optimization literature.
    • David Kozak, Stephen Becker, Alireza Doostan, Luis Tenorio. Computational Optimization and Applications. [DOI] [arXiv] This is an updated version of Stochastic Subspace Descent
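A minimal sketch of the random-subspace idea described above, using forward finite differences in place of forward-mode automatic differentiation for the l directional derivatives. The step size, subspace distribution, and scaling here follow the spirit rather than the letter of the paper:

```python
import numpy as np

def subspace_descent(f, x0, ell=2, step=0.05, h=1e-6, iters=500, seed=0):
    """Gradient-free descent: each iteration estimates the projection of the
    gradient onto a random ell-dimensional subspace via finite differences."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    d = x.size
    for _ in range(iters):
        P, _ = np.linalg.qr(rng.standard_normal((d, ell)))   # orthonormal random directions
        fx = f(x)
        dirderiv = np.array([(f(x + h * P[:, j]) - fx) / h for j in range(ell)])
        x -= step * (d / ell) * (P @ dirderiv)   # (d/ell) P P^T grad f is unbiased for grad f
    return x
```

Each iteration costs only ell + 1 function evaluations, which is the point of the method when f is expensive and gradients are unavailable.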
    [J34] 2021 Bounds for the tracking error of first-order online optimization methods
    Abstract: This paper investigates online algorithms for smooth time-varying optimization problems, focusing first on methods with constant step-size, momentum, and extrapolation-length. Assuming strong convexity, precise results for the tracking iterate error (the limit supremum of the norm of the difference between the optimal solution and the iterates) for online gradient descent are derived. The paper then considers a general first-order framework, where a universal lower bound on the tracking iterate error is established. Furthermore, a method using "long-steps" is proposed and shown to achieve the lower bound up to a fixed constant. This method is then compared with online gradient descent for specific examples. Finally, the paper analyzes the effect of regularization when the cost is not strongly convex. With regularization, it is possible to achieve a non-regret bound. The paper ends by testing the accelerated and regularized methods on synthetic time-varying least-squares and logistic regression problems, respectively.
    [J33] 2020 l_1-regularized maximum likelihood estimation with focused-spot illumination quadruples the diffraction-limited resolution in fluorescence microscopy
    Abstract: Super-resolution fluorescence microscopy has proven to be a useful tool in biological studies. To achieve more than two-fold resolution improvement over the diffraction limit, existing methods require exploitation of the physical properties of the fluorophores. Recently, it has been demonstrated that achieving more than two-fold resolution improvement without such exploitation is possible using only a focused illumination spot and numerical post-processing. However, how the achievable resolution is affected by the processing step has not been thoroughly investigated. In this paper, we focus on the processing aspect of this emerging super-resolution microscopy technique. Based on a careful examination of the dominant noise source and the available prior information in the image, we find that if a processing scheme is appropriate for the dominant noise model in the image and can utilize the prior information in the form of sparsity, improved accuracy can be expected. Based on simulation results, we identify an improved processing scheme and apply it in a real-world experiment to super-resolve a known calibration sample. We show an improved super-resolution of 60 nm, approximately four times beyond the conventional diffraction-limited resolution.
    • J. Xing, S. Chen, S. Becker, J.-Y. Yu, C. Cogswell. Optics Express, 28(26) pp 39413-39429. [DOI]
    [J32] 2020 Robust Least Squares for Quantized Data
    Abstract: In this paper we formulate and solve a robust least squares problem for a system of linear equations subject to quantization error in the data matrix. Ordinary least squares fails to consider uncertainty in the operator, modeling all noise in the observed signal. Total least squares accounts for uncertainty in the data matrix, but necessarily increases the condition number of the operator compared to ordinary least squares. Tikhonov regularization or ridge regression is frequently employed to combat ill-conditioning, but requires parameter tuning which presents a host of challenges and places strong assumptions on parameter prior distributions. The proposed method also requires selection of a parameter, but it can be chosen in a natural way, e.g., a matrix rounded to the 4th digit uses an uncertainty bounding parameter of 0.5e-4. We show here that our robust method is theoretically appropriate, tractable, and performs favorably against ordinary and total least squares.
    • Richard Clancy, Stephen Becker. Signal Processing, 176(107711). [DOI] [arXiv]
    [J31] 2021 Randomization of Approximate Bilinear Computation for Matrix Multiplication
    Abstract: We present a method for randomizing formulas for bilinear computation of matrix products. We consider the implications of such randomization when there are two sources of error: One due to the formula itself only being approximately correct, and one due to using floating point arithmetic. Our theoretical results and numerical experiments indicate that our method can improve performance when each of these error sources are present individually, as well as when they are present at the same time.
    • Osman Asif Malik and Stephen Becker. International Journal of Computer Mathematics: Computer Systems Theory (IJCM:CST), 6(1) pp 54--93. [DOI] [arXiv]
    [J30] 2020 Resolvability of Hamming Graphs
    Abstract: A subset of vertices in a graph is called resolving when the geodesic distances to those vertices uniquely distinguish every vertex in the graph. Here, we characterize the resolvability of Hamming graphs in terms of a constrained linear system and deduce a novel but straightforward characterization of resolvability for hypercubes. We propose an integer linear programming method to assess resolvability rapidly, and provide a more costly but definite method based on Gröbner bases to determine whether or not a set of vertices resolves an arbitrary Hamming graph. As proof of concept, we identify a resolving set of size 77 in the metric space of all octapeptides (i.e., proteins composed of eight amino acids) with respect to the Hamming distance; in particular, any octamer may be readily represented as a 77-dimensional real-vector. Representing k-mers as low-dimensional numerical vectors may enable new applications of machine learning algorithms to symbolic sequences.
    • Lucas Laird, Richard C. Tillquist, Stephen Becker, Manuel E. Lladser. SIAM J. Discrete Math, 34(4) pp 2063-2081. [DOI] [arXiv]
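The definition of a resolving set can be checked by brute force on small Hamming graphs; a sketch for the hypercube {0,1}^n with Hamming distance (the paper's ILP and Gröbner-basis methods are what make this tractable at scale):

```python
from itertools import product

def resolves(R, n):
    """True if every vertex of the hypercube {0,1}^n has a distinct
    vector of Hamming distances to the vertices in R."""
    ham = lambda u, v: sum(a != b for a, b in zip(u, v))
    signatures = {tuple(ham(v, r) for r in R) for v in product((0, 1), repeat=n)}
    return len(signatures) == 2 ** n   # all 2^n distance vectors must be distinct
```

For n = 3 the set {000, 001, 010} resolves the cube, while any two vertices leave at least one pair of vertices with identical distance vectors, consistent with the metric dimension of Q_3 being 3.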
    [J29] 2020 Fast Randomized Matrix and Tensor Interpolative Decomposition Using CountSketch
    Abstract: We propose a new fast randomized algorithm for interpolative decomposition of matrices which utilizes CountSketch. We then extend this approach to the tensor interpolative decomposition problem introduced by Biagioni et al. (J. Comput. Phys. 281, pp. 116-134, 2015). Theoretical performance guarantees are provided for both the matrix and tensor settings. Numerical experiments on both synthetic and real data demonstrate that our algorithms maintain the accuracy of competing methods, while running in less time, achieving at least an order of magnitude speed-up on large matrices and tensors.
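CountSketch itself is simple to state: each row of the matrix is added into one of m buckets with a random sign, in a single pass. A minimal sketch of the transform alone (the interpolative-decomposition machinery built on top of it is the paper's contribution):

```python
import numpy as np

def countsketch(A, m, seed=0):
    """One-pass CountSketch of A: row i is added to bucket h(i) with sign s(i),
    costing O(nnz(A)) work and producing an m x (cols of A) sketch."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)          # hash h: rows -> {0, ..., m-1}
    signs = rng.choice((-1.0, 1.0), size=n)       # random signs s
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)    # accumulate rows into buckets
    return SA
```

The transform is linear for a fixed hash and sign pattern, and it preserves Frobenius norms in expectation, which is what makes it usable as a fast subspace embedding.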
    [J28] 2020 Guarantees for the Kronecker Fast Johnson-Lindenstrauss Transform Using a Coherence and Sampling Argument
    Abstract: In the recent paper [Jin, Kolda & Ward, arXiv:1909.04801], it is proved that the Kronecker fast Johnson-Lindenstrauss transform (KFJLT) is, in fact, a Johnson-Lindenstrauss transform, which had previously only been conjectured. In this paper, we provide an alternative proof of this, for when the KFJLT is applied to Kronecker vectors, using a coherence and sampling argument. Our proof yields a different bound on the embedding dimension, which can be combined with the bound in the paper by Jin et al. to get a better bound overall. As a stepping stone to proving our result, we also show that the KFJLT is a subspace embedding for matrices with columns that have Kronecker product structure. Lastly, we compare the KFJLT to four other sketch techniques in numerical experiments on both synthetic and real-world data.
    • Osman Asif Malik and Stephen Becker. Linear Algebra and its Applications, 602(1) pp 120-127. [DOI] [arXiv]
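The KFJLT's efficiency on Kronecker vectors rests on the mixed-product property: the large sign-flip-plus-DFT transform of a ⊗ b factors into small transforms of a and b, so sampled entries can be computed without ever forming the long vector. A toy numerical check (hypothetical code using unitary DFTs; the paper's exact normalization may differ):

```python
import numpy as np

def fjlt_factor(x, rng):
    """One small FJLT factor: random sign flips followed by a unitary DFT."""
    d = rng.choice([-1.0, 1.0], size=x.size)
    return np.fft.fft(d * x) / np.sqrt(x.size), d

rng = np.random.default_rng(1)
a, b = rng.standard_normal(8), rng.standard_normal(16)
ua, d1 = fjlt_factor(a, rng)
ub, d2 = fjlt_factor(b, rng)

# Mixed-product property: (F1 D1 kron F2 D2)(a kron b) = (F1 D1 a) kron (F2 D2 b),
# so the big transform of kron(a, b) is available factor-wise.
F1 = np.fft.fft(np.eye(8)) / np.sqrt(8)
F2 = np.fft.fft(np.eye(16)) / np.sqrt(16)
big = np.kron(F1 * d1, F2 * d2) @ np.kron(a, b)

# Sampling m rows of the transform then needs only m products of small entries.
m = 6
i1 = rng.integers(0, 8, size=m)
i2 = rng.integers(0, 16, size=m)
print(np.allclose(ua[i1] * ub[i2], big[i1 * 16 + i2]))  # True
```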
    [J27] 2020 Nonstationary Modeling With Sparsity for Spatial Data via the Basis Graphical LassoAbstract: Many modern spatial models express the stochastic variation component as a basis expansion with random coefficients. Low rank models, approximate spectral decompositions, multiresolution representations, stochastic partial differential equations and empirical orthogonal functions all fall within this basic framework. Given a particular basis, stochastic dependence relies on flexible modeling of the coefficients. Under a Gaussianity assumption, we propose a graphical model family for the stochastic coefficients by parameterizing the precision matrix. Sparsity in the precision matrix is encouraged using a penalized likelihood framework. Computations follow from a majorization-minimization approach, a byproduct of which is a connection to the graphical lasso. The result is a flexible nonstationary spatial model that is adaptable to very large datasets. We apply the model to two large and heterogeneous spatial datasets in statistical climatology and recover physically sensible graphical structures. Moreover, the model performs competitively against the popular LatticeKrig model in predictive cross-validation, but substantially improves the Akaike information criterion score.
    • Mitchell Krock, William Kleiber, Stephen Becker. Journal of Computational and Graphical Statistics. [DOI] [arXiv] Previous arXiv version under the name "Penalized basis models for very large spatial datasets"
    [J26] 2020 Analyzing the super-resolution characteristics of focused-spot illumination approachesAbstract: Significance: It is commonly assumed that using the objective lens to create a tightly focused light spot for illumination provides a twofold resolution improvement over the Rayleigh resolution limit and that resolution improvement is independent of object properties. Nevertheless, such an assumption has not been carefully examined. We examine this assumption by analyzing the performance of two super-resolution methods, known as image scanning microscopy (ISM) and illumination-enhanced sparsity (IES). Aim: We aim to identify the fundamental differences between the two methods, and to provide examples that help researchers determine which method to utilize for different imaging conditions. Approach: We input the same image datasets into the two methods and analyze their restorations. In numerical simulations, we design objects of distinct brightness and sparsity levels for imaging. We use biological imaging experiments to verify the simulation results. Results: The resolution of IES often exceeds twice the Rayleigh resolution limit when imaging sparse objects. A decrease in object sparsity negatively affects the resolution improvement in both methods. Conclusions: The IES method is superior for imaging sparse objects with its main features being bright and small against a dark, large background. For objects that are largely bright with small dark features, the ISM method is favorable.
    • J-Y Yu, V. Narumanchi, S. Chen, J. Xing, S. Becker, C. Cogswell. J. of Biomedical Optics, 25(5), 056501. [DOI]
    [J25] 2020 Efficient Solvers for Sparse Subspace ClusteringAbstract: Sparse subspace clustering (SSC) clusters n points that lie near a union of low-dimensional subspaces. The SSC model expresses each point as a linear or affine combination of the other points, using either ℓ1 or ℓ0 regularization. Using ℓ1 regularization results in a convex problem but requires O(n2) storage, and is typically solved by the alternating direction method of multipliers which takes O(n3) flops. The ℓ0 model is non-convex but only needs memory linear in n, and is solved via orthogonal matching pursuit and cannot handle the case of affine subspaces. This paper shows that a proximal gradient framework can solve SSC, covering both ℓ1 and ℓ0 models, and both linear and affine constraints. For both ℓ1 and ℓ0, algorithms to compute the proximity operator in the presence of affine constraints have not been presented in the SSC literature, so we derive an exact and efficient algorithm that solves the ℓ1 case with just O(n2) flops. In the ℓ0 case, our algorithm retains the low-memory overhead, and is the first algorithm to solve the SSC-ℓ0 model with affine constraints. Experiments show our algorithms do not rely on sensitive regularization parameters, and they are less sensitive to sparsity misspecification and high noise.
    • Farhad Pourkamali-Anaraki, James Folberth, Stephen Becker. Signal Processing, vol 172. [DOI] [arXiv]
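In its simplest unconstrained ℓ1 form, the proximal gradient framework referred to above reduces to ISTA with entrywise soft-thresholding. A self-contained toy version (the paper's actual contribution, the prox under affine constraints and the ℓ0 variant, is omitted here; all parameters are illustrative):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximity operator of t * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_l1(X, y, lam, steps=500):
    """Proximal gradient (ISTA) for min_c 0.5||y - X c||^2 + lam ||c||_1."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    c = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ c - y)
        c = soft_threshold(c - grad / L, lam / L)
    return c

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 60))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5])  # y lies in the span of 3 columns
c = ista_l1(X, y, lam=0.1)
print(np.count_nonzero(np.abs(c) > 1e-6))  # a sparse representation of y
```

In SSC this subproblem is solved once per data point, with the point itself excluded from the dictionary.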
    [J24] 2020 Optimization and Learning with Information Streams: Time-varying Algorithms and ApplicationsAbstract: There is a growing cross-disciplinary effort in the broad domain of optimization and learning with streams of data, applied to settings where traditional batch optimization techniques cannot produce solutions at time scales that match the inter-arrival times of the data points due to computational and/or communication bottlenecks. Special types of online algorithms can handle this situation, and this article focuses on such time-varying optimization algorithms, with emphasis on Machine Learning and Signal Processing, as well as data-driven Control. Approaches for the design of time-varying or online first-order optimization methods are discussed, with emphasis on algorithms that can handle errors in the gradient, as may arise when the gradient is estimated. Insights on performance metrics and accompanying claims are provided, along with evidence of cases where algorithms that are provably convergent in batch optimization may perform poorly in an online regime. The role of distributed computation is discussed. Illustrative numerical examples for a number of applications of broad interest are provided to convey key ideas.
    • Emiliano Dall'Anese, Andrea Simonetto, Stephen Becker, Liam Madden. IEEE Signal Processing Magazine, vol 37(3), pp. 71--83. [DOI] [arXiv]
    [J23] 2019 Safe Feature Elimination for Non-Negativity Constrained Convex OptimizationAbstract: Inspired by recent work on safe feature elimination for 1-norm regularized least-squares, we develop strategies to eliminate features from convex optimization problems with non-negativity constraints. Our strategy is safe in the sense that it will only remove features/coordinates from the problem when they are guaranteed to be zero at a solution. To perform feature elimination we use an accurate, but not optimal, primal-dual feasible pair, making our methods robust and able to be used on ill-conditioned problems. We supplement our feature elimination problem with a method to construct an accurate dual feasible point from an accurate primal feasible point; this allows us to use a first-order method to find an accurate primal feasible point, then use that point to construct an accurate dual feasible point and perform feature elimination. Under reasonable conditions, our feature elimination strategy will eventually eliminate all zero features from the problem. As an application of our methods we show how safe feature elimination can be used to robustly certify the uniqueness of non-negative least-squares (NNLS) problems. We give numerical examples on a well-conditioned synthetic NNLS problem and on a set of 40,000 extremely ill-conditioned NNLS problems arising in a microscopy application.
    [J22] 2019 Improved Fixed-Rank Nystrom Approximation via QR Decomposition: Practical and Theoretical Aspects Abstract: The Nystrom method is a popular technique that uses a small number of landmark points to compute a fixed-rank approximation of large kernel matrices that arise in machine learning problems. In practice, to ensure high quality approximations, the number of landmark points is chosen to be greater than the target rank. However, for simplicity the standard Nystrom method uses a sub-optimal procedure for rank reduction. In this paper, we examine the drawbacks of the standard Nystrom method in terms of poor performance and lack of theoretical guarantees. To address these issues, we present an efficient modification for generating improved fixed-rank Nystrom approximations. Theoretical analysis and numerical experiments are provided to demonstrate the advantages of the modified method over the standard Nystrom method. Overall, the aim of this paper is to convince researchers to use the modified method, as it has nearly identical computational complexity, is easy to code, has greatly improved accuracy in many cases, and is optimal in a sense that we make precise.
    • Farhad Pourkamali-Anaraki, Stephen Becker. Neurocomputing, 363(21) pp 261--272. [DOI] [arXiv]
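A rough sketch of a QR-based fixed-rank Nystrom approximation in the spirit of this paper (toy code under my own reading of "orthonormalize the landmark columns, then truncate"; consult the paper for the exact algorithm and its guarantees):

```python
import numpy as np

def nystrom_fixed_rank(K, landmarks, r):
    """Fixed-rank Nystrom approximation of a PSD kernel matrix K:
    orthonormalize C = K[:, landmarks] via QR, eigendecompose the small
    matrix R pinv(W) R^T, and keep the top r eigenpairs."""
    C = K[:, landmarks]
    W = C[landmarks, :]                    # landmark-landmark block
    Q, R = np.linalg.qr(C)
    M = R @ np.linalg.pinv(W) @ R.T
    M = (M + M.T) / 2                      # symmetrize against roundoff
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(vals)[::-1][:r]       # top-r eigenpairs
    U = Q @ vecs[:, idx]
    return (U * vals[idx]) @ U.T

rng = np.random.default_rng(3)
Xd = rng.standard_normal((200, 3))
sqdist = np.sum((Xd[:, None] - Xd[None, :]) ** 2, axis=2)
K = np.exp(-0.1 * sqdist)                  # RBF kernel matrix
landmarks = rng.choice(200, size=20, replace=False)
Khat = nystrom_fixed_rank(K, landmarks, r=10)
err = np.linalg.norm(K - Khat) / np.linalg.norm(K)
print(round(err, 3))                       # relative Frobenius error
```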
    [J21] 2019 Stochastic Lanczos estimation of genomic variance components for linear mixed-effects modelsAbstract: Background Linear mixed-effects models (LMM) are a leading method in conducting genome-wide association studies (GWAS) but require residual maximum likelihood (REML) estimation of variance components, which is computationally demanding. Previous work has reduced the computational burden of variance component estimation by replacing direct matrix operations with iterative and stochastic methods and by employing loose tolerances to limit the number of iterations in the REML optimization procedure. Here, we introduce two novel algorithms, stochastic Lanczos derivative-free REML (SLDF_REML) and Lanczos first-order Monte Carlo REML (L_FOMC_REML), that exploit problem structure via the principle of Krylov subspace shift-invariance to speed computation beyond existing methods. Both novel algorithms only require a single round of computation involving iterative matrix operations, after which their respective objectives can be repeatedly evaluated using vector operations. Further, in contrast to existing stochastic methods, SLDF_REML can exploit precomputed genomic relatedness matrices (GRMs), when available, to further speed computation. Results Results of numerical experiments are congruent with theory and demonstrate that interpreted-language implementations of both algorithms match or exceed existing compiled-language software packages in speed, accuracy, and flexibility. Conclusions Both the SLDF_REML and L_FOMC_REML algorithms outperform existing methods for REML estimation of variance components for LMM and are suitable for incorporation into existing GWAS LMM software implementations.
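The "iterative and stochastic methods" above are built on matrix-free trace estimation. A minimal Hutchinson estimator illustrates the building block (only a toy; the paper's SLDF_REML and L_FOMC_REML algorithms combine such estimators with Lanczos quadrature and shift-invariance):

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes, rng):
    """Stochastic trace estimator: E[z^T A z] = tr(A) for Rademacher z.
    Only matrix-vector products with A are required, never A itself."""
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        total += z @ matvec(z)
    return total / num_probes

rng = np.random.default_rng(4)
G = rng.standard_normal((300, 300))
A = G @ G.T / 300                          # a PSD, relatedness-like matrix
est = hutchinson_trace(lambda v: A @ v, 300, 200, rng)
print(round(est, 1), round(np.trace(A), 1))  # estimate vs. exact trace
```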
    [J20] 2019 On Quasi-Newton Forward-Backward Splitting: Proximal Calculus and ConvergenceAbstract: We introduce a framework for quasi-Newton forward--backward splitting algorithms (proximal quasi-Newton methods) with a metric induced by diagonal ± rank-r symmetric positive definite matrices. This special type of metric allows for a highly efficient evaluation of the proximal mapping. The key to this efficiency is a general proximal calculus in the new metric. By using duality, formulas are derived that relate the proximal mapping in a rank-r modified metric to the original metric. We also describe efficient implementations of the proximity calculation for a large class of functions; the implementations exploit the piece-wise linear nature of the dual problem. Then, we apply these results to acceleration of composite convex minimization problems, which leads to elegant quasi-Newton methods for which we prove convergence. The algorithm is tested on several numerical examples and compared to a comprehensive list of alternatives in the literature. Our quasi-Newton splitting algorithm with the prescribed metric compares favorably against state-of-the-art. The algorithm has extensive applications including signal processing, sparse recovery, machine learning and classification to name a few.
    [J19] 2019 Adapting Regularized Low Rank Models for Parallel ArchitecturesAbstract: We introduce a reformulation of regularized low-rank recovery models to take advantage of GPU, multiple CPU, and hybridized architectures. Low-rank recovery often involves nuclear-norm minimization through iterative thresholding of singular values. These models are slow to fit and difficult to parallelize because of their dependence on computing a singular value decomposition at each iteration. Regularized low-rank recovery models also incorporate non-smooth terms to separate structured components (e.g. sparse outliers) from the low-rank component, making these problems more difficult. Using Burer-Monteiro splitting and marginalization, we develop a smooth, non-convex formulation of regularized low-rank recovery models that can be fit with first-order solvers. We develop a computable certificate of convergence for this non-convex program, and use it to establish bounds on the suboptimality of any point. Using robust principal component analysis (RPCA) as an example, we include numerical experiments showing that this approach is an order-of-magnitude faster than existing RPCA solvers on the GPU. We also show that this acceleration allows new applications for RPCA, including real-time background subtraction and MR image analysis.
    • Derek Driggs, Stephen Becker, Aleksandr Aravkin. SIAM J. Sci. Computing. [DOI] [arXiv] [code]
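The Burer-Monteiro-plus-marginalization idea can be illustrated in a few lines: marginalizing the sparse term out of an RPCA-style objective leaves a smooth Huber loss of the residual, which plain gradient descent can attack on the low-rank factors U, V. A toy sketch (hypothetical step sizes and initialization, not the paper's solver or its certificate of convergence):

```python
import numpy as np

def huber(r, mu):
    """Smooth loss obtained by marginalizing S out of
    0.5||R - S||_F^2 + mu*||S||_1."""
    return np.where(np.abs(r) <= mu, 0.5 * r ** 2, mu * np.abs(r) - 0.5 * mu ** 2)

def rpca_bm(M, r, mu=0.5, lr=2e-3, steps=500, seed=0):
    """Gradient descent on f(U, V) = sum huber(M - U V^T); returns the
    factors and the loss history."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((M.shape[0], r))
    V = 0.1 * rng.standard_normal((M.shape[1], r))
    losses = []
    for _ in range(steps):
        R = M - U @ V.T
        losses.append(huber(R, mu).sum())
        G = np.clip(R, -mu, mu)            # derivative of the Huber loss
        U = U + lr * G @ V                 # descent step on U
        V = V + lr * G.T @ U               # descent step on V
    return U, V, losses

rng = np.random.default_rng(7)
M = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 40))  # rank-3 data
U, V, losses = rpca_bm(M, r=3)
print(round(losses[0], 1), round(losses[-1], 1))  # loss decreases
```

Because the factors U, V are tall-skinny, no SVD is ever computed, which is what makes the formulation GPU-friendly.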
    [J18] 2018 Template Polyhedra and Bilinear OptimizationAbstract: In this paper, we study the template polyhedral abstract domain using connections to bilinear optimization techniques. The connections between abstract interpretation and convex optimization approaches have been studied for nearly a decade now. Specifically, data flow constraints for numerical domains such as polyhedra can be expressed in terms of bilinear constraints. Algorithms such as policy and strategy iteration have been proposed for the special case of bilinear constraints that arise from template polyhedra wherein the desired invariants conform to a fixed template form. In particular, policy iteration improves upon a known post-fixed point by alternating between solving for an improved post-fixed point against finding certificates that are used to prove the new fixed point. In the first part of this paper, we propose a policy iteration scheme that changes the template on the fly in order to prove a target reachability property of interest. We show how the change to the template naturally fits inside a policy iteration scheme, and thus, propose a scheme that updates the template matrices associated with each program location. We demonstrate that the approach is effective over a set of benchmark instances, wherein, starting from a simple predefined choice of templates, the approach is able to infer appropriate template directions to prove a property of interest. However, it is well known that policy iteration can end up “stuck” in a saddle point from which future iterations cannot make progress. In the second part of this paper, we study this problem further by empirically comparing policy iteration with a variety of other approaches for bilinear programming. These approaches adapt well-known algorithms to the special case of bilinear programs as well as using off-the-shelf tools for nonlinear programming. Our initial experience suggests that policy iteration seems to be the most advantageous approach for problems arising from abstract interpretation, despite the potential problems of getting stuck at a saddle point.
    • Jessica Gronski, Mohamed-Amin Ben Sassi, Stephen Becker, Sriram Sankaranarayanan. Formal Methods in System Design. [DOI]
    [J17] 2018 Achieving superresolution with illumination-enhanced sparsityAbstract: Recent advances in superresolution fluorescence microscopy have been limited by a belief that surpassing two-fold resolution enhancement of the Rayleigh resolution limit requires stimulated emission or the fluorophore to undergo state transitions. Here we demonstrate a new superresolution method that requires only image acquisitions with a focused illumination spot and computational post-processing. The proposed method utilizes the focused illumination spot to effectively reduce the object size and enhance the object sparsity and consequently increases the resolution and accuracy through nonlinear image post-processing. This method clearly resolves 70nm resolution test objects emitting ~530nm light with a 1.4 numerical aperture (NA) objective, and, when imaging through a 0.5NA objective, exhibits high spatial frequencies comparable to a 1.4NA widefield image, both demonstrating a resolution enhancement above two-fold of the Rayleigh resolution limit. More importantly, we examine how the resolution increases with photon numbers, and show that the more-than-two-fold enhancement is achievable with realistic photon budgets.
    • Jiun-Yann Yu, Stephen R. Becker, James Folberth, Bruce F. Wallin, Simeng Chen, and Carol J. Cogswell. Optics Express, vol 26(8) pp. 9850-9865. [DOI] Researchgate full text
    [J16] 2017 Preconditioned Data Sparsification for Big Data with Applications to PCA and K-meansAbstract: We analyze a compression scheme for large data sets that randomly keeps a small percentage of the components of each data sample. The benefit is that the output is a sparse matrix and therefore subsequent processing, such as PCA or K-means, is significantly faster, especially in a distributed-data setting. Furthermore, the sampling is single-pass and applicable to streaming data. The sampling mechanism is a variant of previous methods proposed in the literature combined with a randomized preconditioning to smooth the data. We provide guarantees for PCA in terms of the covariance matrix, and guarantees for K-means in terms of the error in the center estimators at a given step. We present numerical evidence to show both that our bounds are nearly tight and that our algorithms provide a real benefit when applied to standard test data sets, as well as providing certain benefits over related sampling approaches.
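The core sampling mechanism, minus the paper's randomized preconditioning step, is entrywise subsampling with 1/p rescaling so that the sparse output is an unbiased estimate of the original matrix (a toy illustration, not the analyzed pipeline):

```python
import numpy as np

def sparsify(A, p, rng):
    """Keep each entry of A independently with probability p, rescaled by
    1/p so that E[sparsify(A)] = A entrywise. Single pass, streamable."""
    mask = rng.random(A.shape) < p
    return np.where(mask, A / p, 0.0)

rng = np.random.default_rng(5)
A = rng.standard_normal((500, 100))
As = sparsify(A, 0.1, rng)
density = np.count_nonzero(As) / As.size   # about p = 0.1

# unbiasedness: averaging many independent sparsifications recovers A
avg = sum(sparsify(A, 0.1, rng) for _ in range(300)) / 300
rel = np.linalg.norm(avg - A) / np.linalg.norm(A)
print(round(density, 3), round(rel, 3))
```

The resulting 90%-sparse matrix makes downstream PCA or K-means far cheaper, which is the point of the scheme.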
    [J15] 2016 Efficient Adjoint Computation for Wavelet and Convolution Operators (Lecture Notes)Abstract: First-order optimization algorithms, often preferred for large problems, require the gradient of the differentiable terms in the objective function. These gradients often involve linear operators and their adjoints, which must be applied rapidly. We consider two example problems and derive methods for quickly evaluating the required adjoint operator. The first example is an image deblurring problem, where we must compute efficiently the adjoint of multi-stage wavelet reconstruction. Our formulation of the adjoint works for a variety of boundary conditions, which allows the formulation to generalize to a larger class of problems. The second example is a blind channel estimation problem taken from the optimization literature where we must compute the adjoint of the convolution of two signals. In each example, we show how the adjoint operator can be applied efficiently while leveraging existing software.
    • J. Folberth and S. Becker. IEEE Signal Processing Magazine, vol 33(6) 135-147. [DOI] [arXiv]
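The standard way to check a hand-derived adjoint like those in this article is the dot-product test <Ax, y> = <x, A'y>. For full 1-D convolution the adjoint is a "valid"-mode correlation; a generic NumPy example (not the article's wavelet or boundary-condition machinery):

```python
import numpy as np

def conv_forward(h, x):
    """Full linear convolution: (A x)[i] = sum_j h[i - j] x[j]."""
    return np.convolve(h, x)

def conv_adjoint(h, y):
    """Adjoint of full convolution with h: (A^T y)[j] = sum_i h[i - j] y[i],
    which is 'valid'-mode correlation of y with h."""
    return np.correlate(y, h, mode="valid")

rng = np.random.default_rng(6)
h = rng.standard_normal(7)           # filter
x = rng.standard_normal(50)          # signal
y = rng.standard_normal(50 + 7 - 1)  # vector in the output space

# dot-product ("adjoint") test: <A x, y> must equal <x, A^T y>
lhs = conv_forward(h, x) @ y
rhs = x @ conv_adjoint(h, y)
print(np.isclose(lhs, rhs))  # True
```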
    [J14] 2015 Designing Statistical Estimators That Balance Sample Size, Risk, and Computational CostAbstract: This paper proposes a tradeoff between computational time, sample complexity, and statistical accuracy that applies to statistical estimators based on convex optimization. When we have a large amount of data, we can exploit excess samples to decrease statistical risk, to decrease computational cost, or to trade off between the two. We propose to achieve this tradeoff by varying the amount of smoothing applied to the optimization problem. This work uses regularized linear regression as a case study to argue for the existence of this tradeoff both theoretically and experimentally. We also apply our method to describe a tradeoff in an image interpolation problem.
    • J. Bruer, J. Tropp, V. Cevher and S. Becker. IEEE J. Selected Topics in Signal Processing. [DOI] Final version
    [J13] 2014 Convex Optimization for Big Data: Scalable, randomized, and parallel algorithms for big data analyticsAbstract: This article reviews recent advances in convex optimization algorithms for Big Data, which aim to reduce the computational, storage, and communications bottlenecks. We provide an overview of this emerging field, describe contemporary approximation techniques like first-order methods and randomization for scalability, and survey the important role of parallel and distributed computation. The new Big Data algorithms are based on surprisingly simple principles and attain staggering accelerations even on classical problems.
    • Volkan Cevher, Stephen Becker, Mark Schmidt. IEEE Signal Processing Magazine, vol. 31, no. 5. [DOI] [arXiv]
    [J12] 2014 An Algorithm for Splitting Parallel Sums of Linearly Composed Monotone Operators, with Applications to Signal RecoveryAbstract: We present a new primal-dual splitting algorithm for structured monotone inclusions in Hilbert spaces and analyze its asymptotic behavior. A novelty of our framework, which is motivated by image recovery applications, is to consider inclusions that combine a variety of monotonicity-preserving operations such as sums, linear compositions, parallel sums, and a new notion of parallel composition. The special case of minimization problems is studied in detail, and applications to signal recovery are discussed. Numerical simulations are provided to illustrate the implementation of the algorithm.
    • Stephen Becker and Patrick L. Combettes. Journal of Nonlinear and Convex Analysis, vol. 15, no. 1, pp. 137--159. [arXiv]
    [J11] 2013 Improving IMRT delivery efficiency with iteratively reweighted L1-minimization for inverse planningAbstract:
    • Hojin Kim, Lei Xing, Ruijiang Li et al.. Medical Physics, vol. 40, 071719. [DOI]
    [J10] 2012 A Non-Uniform Sampler for Wideband Spectrally-Sparse EnvironmentsAbstract: We present a wide bandwidth, compressed sensing based non-uniform sampling (NUS) system with a custom sample-and-hold chip designed to take advantage of a low average sampling rate. By sampling signals non-uniformly, the average sample rate can be more than an order of magnitude lower than the Nyquist rate, provided that these signals have a relatively low information content as measured by the sparsity of their spectrum. The hardware design combines a wideband Indium-Phosphide (InP) heterojunction bipolar transistor (HBT) sample-and-hold with a commercial off-the-shelf (COTS) analog-to-digital converter (ADC) to digitize an 800 MHz to 2 GHz band (having 100 MHz of non-contiguous spectral content) at an average sample rate of 236 Msps. Signal reconstruction is performed via a non-linear compressed sensing algorithm, and the challenges of developing an efficient GPU implementation are discussed. Measured bit-error-rate (BER) data for a GSM channel is presented, and comparisons to a conventional wideband 4.4 Gsps ADC are made.
    • Michael Wakin, Stephen Becker, Eric Nakamura, Michael Grant, Emilio Sovero, Daniel Ching, Juhwan Yoo, Justin Romberg, Azita Emami-Neyestanak, Emmanuel Candès. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), vol. 2, no. 3, pp. 516--529. [DOI] link
    [J9] 2012 A Compressed Sensing Parameter Extraction Platform for Radar Pulse Signal AcquisitionAbstract: In this paper we present a complete (hardware/software) sub-Nyquist rate (× 13) wideband signal acquisition chain capable of acquiring radar pulse parameters in an instantaneous bandwidth spanning 100 MHz-2.5 GHz with the equivalent of 8 effective number of bits (ENOB) digitizing performance. The approach is based on the alternative sensing-paradigm of compressed sensing (CS). The hardware platform features a fully-integrated CS receiver architecture named the random-modulation preintegrator (RMPI) fabricated in Northrop Grumman's 450 nm InP HBT bipolar technology. The software back-end consists of a novel CS parameter recovery algorithm which extracts information about the signal without performing full time-domain signal reconstruction. This approach significantly reduces the computational overhead involved in retrieving desired information which demonstrates an avenue toward employing CS techniques in power-constrained real-time applications. The developed techniques are validated on CS samples physically measured by the fabricated RMPI and measurement results are presented. The parameter estimation algorithms are described in detail and a complete description of the physical hardware is given.
    • J. Yoo, C. Turnes, E. Nakamura, C. Le, S. Becker, E. Sovero, M. Wakin, M. Grant, J. Romberg, A. Emami-Neyestanak, and E. Candès. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), vol. 2, no. 3, pp. 626--638. [DOI] link
    [J8] 2011 Dynamical Behavior Near a Liquid-Liquid Phase Transition in Simulations of Supercooled WaterAbstract: We examine the behavior of the diffusion coefficient of the ST2 model of water over a broad region of the phase diagram via molecular dynamics simulations. The ST2 model has an accessible liquid–liquid transition between low-density and high-density phases, making the model an ideal candidate to explore the impacts of the liquid–liquid transition on dynamics. We locate characteristic dynamical loci in the phase diagram and compare them with the previously investigated thermodynamic loci. The low-density liquid phase shows a crossover from non-Arrhenius to Arrhenius behavior, signaling the onset of a crossover from fragile-to-strong behavior. We explain this crossover in terms of the asymptotic approach of the low-density liquid to a random tetrahedral network and show that the temperature dependence of the diffusion coefficient over a wide temperature range can be simply related to the concentration of defects in the network. Our findings thus confirm that the low-density phase of ST2 water is a well-defined metastable liquid.
    • P. H. Poole, S. R. Becker, F. Sciortino, and F. W. Starr. Journal of Physical Chemistry B, vol. 114, no. 48, pp. 14176--14183. [DOI]
    [J7] 2011 Templates for Convex Cone Problems with Applications to Sparse Signal RecoveryAbstract: This paper develops a general framework for solving a variety of convex cone problems that frequently arise in signal processing, machine learning, statistics, and other fields. The approach works as follows: first, determine a conic formulation of the problem; second, determine its dual; third, apply smoothing; and fourth, solve using an optimal first-order method. A merit of this approach is its flexibility: for example, all compressed sensing problems can be solved via this approach. These include models with objective functionals such as the total-variation norm, ||Wx||1 where W is arbitrary, or a combination thereof. In addition, the paper introduces a number of technical contributions such as a novel continuation scheme and a novel approach for controlling the step size, and applies results showing that the smooth and unsmoothed problems are sometimes formally equivalent. Combined with our framework, these lead to novel, stable and computationally efficient algorithms. For instance, our general implementation is competitive with state-of-the-art methods for solving intensively studied problems such as the LASSO. Further, numerical experiments show that one can solve the Dantzig selector problem, for which no efficient large-scale solvers exist, in a few hundred iterations. Finally, the paper is accompanied with a software release. This software is not a single, monolithic solver; rather, it is a suite of programs and routines designed to serve as building blocks for constructing complete algorithms.
    • S. Becker, E. Candès, M. Grant. Mathematical Programming Computation, vol 3 no 3 165--218. [DOI] [code] local version. An extended version appears as chapter 4 in my thesis. Typos: eq. 4.6a should have 1, not \sigma_k^{-1}t_k^{(1)}, in the second parameter of Trunc, and the two equations above should have z^{(1)} not z^{(2)} in the l-infinity norm constraint. Listing 6 should have the adjoint of W in line 4.
    [J6] 2011 NESTA: a fast and accurate first-order method for sparse recoveryAbstract: Accurate signal recovery or image reconstruction from indirect and possibly undersampled data is a topic of considerable interest; for example, the literature in the recent field of compressed sensing is already quite immense. Inspired by recent breakthroughs in the development of novel first-order methods in convex optimization, most notably Nesterov's smoothing technique, this paper introduces a fast and accurate algorithm for solving common recovery problems in signal processing. In the spirit of Nesterov's work, one of the key ideas of this algorithm is a subtle averaging of sequences of iterates, which has been shown to improve the convergence properties of standard gradient-descent algorithms. This paper demonstrates that this approach is ideally suited for solving large-scale compressed sensing reconstruction problems as 1) it is computationally efficient, 2) it is accurate and returns solutions with several correct digits, 3) it is flexible and amenable to many kinds of reconstruction problems, and 4) it is robust in the sense that its excellent performance across a wide range of problems does not depend on the fine tuning of several parameters. Comprehensive numerical experiments on realistic signals exhibiting a large dynamic range show that this algorithm compares favorably with recently proposed state-of-the-art methods. We also apply the algorithm to solve other problems for which there are fewer alternatives, such as total-variation minimization, and convex programs seeking to minimize the l1 norm of Wx under constraints, in which W is not diagonal.
    • S. Becker, J. Bobin and E. J. Candès. SIAM J. on Imaging Sciences (SIIMS), vol 4 no. 1. [DOI] [arXiv] [code] An extended version appears as chapter 2 in my thesis
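Nesterov's smoothing technique, the engine behind NESTA, replaces ||x||_1 by a Huber-type smooth surrogate that is differentiable with a (1/mu)-Lipschitz gradient and within mu/2 per coordinate of the true norm. A small generic sketch of the smoothed function and its gradient (not NESTA's full iterate-averaging and continuation scheme):

```python
import numpy as np

def smoothed_l1(x, mu):
    """Nesterov smoothing of ||x||_1: max over |u_i|<=1 of u.x - (mu/2)||u||^2,
    which evaluates to a Huber function of each coordinate."""
    a = np.abs(x)
    return np.sum(np.where(a <= mu, a ** 2 / (2 * mu), a - mu / 2))

def smoothed_l1_grad(x, mu):
    """Gradient of the smoothed l1 norm: the clipped (saturating) sign."""
    return np.clip(x / mu, -1.0, 1.0)

x = np.linspace(-2, 2, 9)
mu = 0.5
# the smoothed value never exceeds ||x||_1, and the gap is at most mu/2
# per coordinate, which is what makes the continuation strategy work
print(round(smoothed_l1(x, mu), 3), round(np.abs(x).sum(), 3))
```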
    [J5] 2010 Quantum state tomography via compressed sensingAbstract: We establish methods for quantum state tomography based on compressed sensing. These methods are specialized for quantum states that are fairly pure, and they offer a significant performance improvement on large quantum systems. In particular, they are able to reconstruct an unknown density matrix of dimension d and rank r using O(rd log^2 d) measurement settings, compared to standard methods that require d^2 settings. Our methods have several features that make them amenable to experimental implementation: they require only simple Pauli measurements, use fast convex optimization, are stable against noise, and can be applied to states that are only approximately low-rank. The acquired data can be used to certify that the state is indeed close to pure, so no a priori assumptions are needed. We present both theoretical bounds and numerical simulations.
    • Gross, Liu, Flammia, Becker and Eisert. Physical Review Letters, vol 105, number 15 (4 pages). [DOI] [arXiv]
    [J4] 2007 Relation between the Widom line and the breakdown of the Stokes-Einstein relation in supercooled waterAbstract: Supercooled water exhibits a breakdown of the Stokes–Einstein relation between the diffusion constant D and the alpha relaxation time τα. For water simulated with two different potentials, TIP5P and ST2, we find that the temperature of the decoupling of diffusion and alpha relaxation correlates with the temperature of the maximum in specific heat that corresponds to crossing the Widom line TW(P). Specifically, we find that our results for Dτα/T collapse onto a single “master curve” if temperature is replaced by T − TW(P). We further find that the size of the mobile molecule clusters (dynamical heterogeneities) increases sharply near TW(P). Moreover, our calculations of mobile particle cluster size <n(t*)>w for different pressures, where t* is the time for which the mobile particle cluster size is largest, also collapse onto a single master curve if T is replaced by T − TW(P). The crossover to a more locally structured low density liquid (LDL) as T → TW(P) appears to be well correlated both with the breakdown of the Stokes–Einstein relation and with the growth of dynamic heterogeneities. Our results are consistent with the possibility that the breakdown of the SE relation in supercooled water at low pressures is associated with the hypothesized liquid–liquid phase transition.
    • P. Kumar, S.V. Buldyrev, S.R. Becker, P.H. Poole, F.W. Starr, and H.E. Stanley. Proceedings of the National Academy of Sciences, Vol. 104, 9575-9579. [DOI]
    [J3] 2006 Fractional Stokes-Einstein and Debye-Stokes-Einstein Relations in a Network-Forming LiquidAbstract: We study the breakdown of the Stokes-Einstein (SE) and Debye-Stokes-Einstein (DSE) relations for translational and rotational motion in a prototypical model of a network-forming liquid, the ST2 model of water. We find that the emergence of fractional SE and DSE relations at low temperature is ubiquitous in this system, with exponents that vary little over a range of distinct physical regimes. We also show that the same fractional SE relation is obeyed by both mobile and immobile dynamical heterogeneities of the liquid.
    • Stephen R. Becker, Peter H. Poole, and Francis W. Starr. Physical Review Letters, vol. 97 no. 5. [DOI] self-hosted version
    [J2] 2005 The Dynamics of Falling DominoesAbstract:
    • Stan Wagon, Adrianne Pontarelli, Stephen Becker and William Briggs. UMAP Journal, (26)1 pp. 37-48.
    [J1] 2012 Sparse projections onto the simplexAbstract:
    • Anastasios Kyrillidis, Stephen Becker, Volkan Cevher. NIPS Workshop on Discrete Optimization in Machine Learning (DISCML), December. [arXiv]

Conference papers

    [C30] 2023 Versatile fidelity estimation with confidenceAbstract:
    • Akshay Seshadri, Martin Ringbauer, Rainer Blatt, Thomas Monz and Stephen Becker. QIP (Quantum Information Processing), presented as a poster, Ghent, February 6--10.
    [C29] 2022 Broadband Spectroscopic Imaging Using Dual Frequency Comb Spectroscopy and Compressive SensingAbstract:
    • E. Strong, S. Coburn, A. Anderson, R. Cole, J. Gopinath, S. Becker, G. Rieker. CLEO.
    [C28] 2022 A Strategy for Synthetic Aperture Sequence Design Using Numerical OptimizationAbstract: Modifying acoustic transmission represents one way of improving image quality for ultrasound imaging. This has been demonstrated in the transmit encoding model for synthetic aperture imaging [1], which allows for great flexibility in apodization and time delay of each array element. After decoding these recorded echoes, geometric focusing can be applied to produce an image whose quality depends highly on the specific encoding used. Commonly used encoding schemes, such as those which produce focused beams or planewaves, are often motivated from geometric principles. We note that these geometrically intuitive schemes represent only a narrow subset of all possible encoding schemes, and demonstrate that even small modifications to transmit parameters can lead to meaningfully improved resolution, field of view, and contrast in images. These modified encodings often cannot be derived through application of fundamental geometric principles. In response to these challenges, we propose a machine learning (ML) framework for sequence design that identifies a set of optimal transmit parameters, in the sense that they minimize some loss metric on beamformed images. Additionally, we describe critical implementation details that make backpropagation in our approach computationally feasible.
    • Jacob Spainhour, Stephen Becker, Nicholas Bottenus. IEEE International Ultrasonics Symposium, Oct 10-13, Venice. [DOI] [code]
    [C27] 2021 Superresolution photoacoustic tomography using random speckle illumination and second order momentsAbstract: Idier et al. [IEEE Trans. Comput. Imaging 4(1), 2018] propose a method which achieves superresolution in the microscopy setting by leveraging random speckle illumination and knowledge about statistical second order moments for the illumination patterns and model noise. This is achieved without any assumptions on the sparsity of the imaged object. In this paper, we show that their technique can be extended to photoacoustic tomography. We propose a simple algorithm for doing the reconstruction which only requires a small number of linear algebra steps. It is therefore much faster than the iterative method used by Idier et al. We also propose a new representation of the imaged object based on Dirac delta expansion functions.
    • Osman Asif Malik, Venkatalakshmi Vyjayanthi Narumanchi, Stephen Becker, Todd W. Murray. WASPAA conference. [arXiv]
    [C26] 2021 A Sampling Based Method for Tensor Ring DecompositionAbstract: We propose a sampling-based method for computing the tensor ring (TR) decomposition of a data tensor. The method uses leverage score sampled alternating least squares to fit the TR cores in an iterative fashion. By taking advantage of the special structure of TR tensors, we can efficiently estimate the leverage scores and attain a method which has complexity sublinear in the number of input tensor entries. We provide high-probability relative-error guarantees for the sampled least squares problems. We compare our proposal to existing methods in experiments on both synthetic and real data. Our method achieves substantial speedup -- sometimes two or three orders of magnitude -- over competing methods, while maintaining good accuracy. We also provide an example of how our method can be used for rapid feature extraction.
    • Osman Asif Malik, Stephen Becker. ICML. [arXiv]
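The leverage-score sampling that drives this method can be sketched on a plain overdetermined least-squares problem. This is only a minimal illustration with exact leverage scores and made-up sizes; the paper's contribution is estimating the scores efficiently for the structured tensor-ring subproblems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined system: n rows, d columns, sample down to m rows
n, d, m = 2000, 10, 200
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Exact leverage scores: squared row norms of an orthonormal basis for range(A)
Q, _ = np.linalg.qr(A)
lev = np.sum(Q**2, axis=1)        # these sum to d
p = lev / lev.sum()               # sampling probabilities

# Sample m rows with replacement; rescale to keep the estimator unbiased
idx = rng.choice(n, size=m, p=p)
scale = 1.0 / np.sqrt(m * p[idx])
A_s = A[idx] * scale[:, None]
b_s = b[idx] * scale

x_hat, *_ = np.linalg.lstsq(A_s, b_s, rcond=None)
print(np.linalg.norm(x_hat - x_true))  # error vs. the true coefficients
```

Solving the m-row sampled system is far cheaper than the full n-row solve, at a modest accuracy cost controlled by m.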
    [C25] 2020 Nuclear Norm Based Spectrum Estimation for Molecular Dynamic SimulationsAbstract: Molecular dynamic (MD) simulations are used to probe molecular systems in regimes not accessible to physical experiments. A common goal of these simulations is to compute the power spectral density (PSD) of some component of the system such as particle velocity. In certain MD simulations, only a few time locations are observed, which makes it difficult to estimate the autocorrelation and PSD. This work develops a novel nuclear norm minimization-based method for this type of sub-sampled data, based on a parametric representation of the PSD as the sum of Lorentzians. We show results on both synthetic data and a test system of methanol.
    • Shuang Li, Stephen Becker, Michael Wakin. In Asilomar conference on Signals, Systems and Computers. link
    [C24] 2021 Stochastic Gradient Langevin Dynamics with Variance ReductionAbstract: Stochastic gradient Langevin dynamics (SGLD) has gained the attention of optimization researchers due to its global optimization properties. This paper proves an improved convergence property to local minimizers of nonconvex objective functions using SGLD accelerated by variance reductions. Moreover, we prove an ergodicity property of the SGLD scheme, which gives insights on its potential to find global minimizers of nonconvex objectives.
    • Zhishen Huang, Stephen Becker. IJCNN2021 (International Joint Conference on Neural Networks). [DOI] [arXiv]
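The basic SGLD iteration (before any variance reduction) adds Gaussian noise scaled by the step size to a stochastic gradient step. A minimal sketch on a one-dimensional double-well objective, with illustrative (assumed) step size and inverse temperature:

```python
import numpy as np

rng = np.random.default_rng(1)

# Nonconvex objective with two minimizers at x = ±1: f(x) = (x^2 - 1)^2
grad = lambda x: 4 * x * (x**2 - 1)

eta, beta = 1e-3, 20.0   # step size and inverse temperature (illustrative values)
x = 2.0                  # start far from either minimizer
for _ in range(20000):
    noisy_grad = grad(x) + 0.1 * rng.standard_normal()   # stand-in for a stochastic gradient
    x = x - eta * noisy_grad + np.sqrt(2 * eta / beta) * rng.standard_normal()

print(abs(abs(x) - 1.0))  # the iterate typically hovers near one of the minimizers ±1
```

The injected noise is what distinguishes SGLD from plain SGD and is the source of its global exploration properties; variance reduction targets the stochastic-gradient term, not the Langevin noise.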
    [C23] 2020 Computational super-resolution microscopy: leveraging noise model, regularization and sparsity to achieve highest resolutionAbstract: We report progress in algorithm development for a computation-based super-resolution microscopy technique. Building upon previous results, we examine our recently implemented microscope system and construct alternative processing algorithms. Based on numerical simulation results, we evaluate the performance of each algorithm and determine the one most suitable for our super-resolution microscope.
    • Jian Xing, Simeng Chen, Stephen Becker, Jiun-Yann Yu and Carol Cogswell. SPIE BiOS, San Francisco, CA. [DOI]
    [C22] 2019 One-Pass Sparsified Gaussian MixturesAbstract: We present a one-pass sparsified Gaussian mixture model (SGMM). Given N data points in P dimensions, X, the model fits K Gaussian distributions to X and (softly) classifies each point to these clusters. After paying an up-front cost of O(NP log P) to precondition the data, we subsample Q entries of each data point and discard the full P-dimensional data. SGMM operates in O(KNQ) time per iteration for diagonal or spherical covariances, independent of P, while estimating the model parameters in the full P-dimensional space, making it one-pass and hence suitable for streaming data. We derive the maximum likelihood estimators for the parameters in the sparsified regime, demonstrate clustering on synthetic and real data, and show that SGMM is faster than GMM while preserving accuracy.
    [C21] 2019 Online Sparse Subspace ClusteringAbstract: This paper focuses on the sparse subspace clustering problem, and develops an online algorithmic solution to cluster data points on-the-fly, without revisiting the whole dataset. The strategy involves an online solution of a sparse representation (SR) problem to build a (sparse) dictionary of similarities where points in the same subspace are considered "similar," followed by a spectral clustering based on the obtained similarity matrix. When the SR cost is strongly convex, the online solution converges to within a neighborhood of the optimal time-varying batch solution. A dynamic regret analysis is performed when the SR cost is not strongly convex.
    [C20] 2019 Perturbed Proximal Descent to Escape Saddle Points for Non-convex and Non-smooth Objective FunctionsAbstract: We consider the problem of finding local minimizers in non-convex and non-smooth optimization. Under the assumption of strict saddle points, positive results have been derived for first-order methods. We present the first known results for the non-smooth case, which requires different analysis and a different algorithm.
    [C19] 2018 Low-rank Tucker decomposition of large tensors using TensorSketchAbstract: We propose two randomized algorithms for low-rank Tucker decomposition of tensors. The algorithms, which incorporate sketching, only require a single pass of the input tensor and can handle tensors whose elements are streamed in any order. To the best of our knowledge, ours are the only algorithms which can do this. We test our algorithms on sparse synthetic data and compare them to multiple other methods. We also apply one of our algorithms to a real dense 38 GB tensor representing a video and use the resulting decomposition to correctly classify frames containing disturbances.
    • S. Becker and O. Malik. NeurIPS. [code] paper
    [C18] 2018 Randomized Clustered Nystrom for Large-Scale Kernel MachinesAbstract: The Nystrom method is a popular technique for generating low-rank approximations of kernel matrices that arise in many machine learning problems. The approximation quality of the Nystrom method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we introduce a randomized algorithm for generating landmark points that is scalable to large high-dimensional data sets. The proposed method performs K-means clustering on low-dimensional random projections of a data set and thus leads to significant savings for high-dimensional data sets. Our theoretical results characterize the tradeoffs between accuracy and efficiency of the proposed method. Moreover, numerical experiments on classification and regression tasks demonstrate the superior performance and efficiency of our proposed method compared with existing approaches.
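The landmark-selection idea can be sketched end to end: cluster cheap random projections of the data, form landmarks from the resulting clusters, then build a standard Nyström approximation. This is one plausible reading with illustrative sizes, a hand-rolled Lloyd loop, and an RBF kernel; it is not the paper's exact algorithm or analysis.

```python
import numpy as np

rng = np.random.default_rng(2)

def sqdist(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    return (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T

# Synthetic data: n points in P dimensions
n, P, k, p_low = 500, 100, 10, 15
X = rng.standard_normal((n, P))

# Step 1: random projection down to p_low << P dimensions
G = rng.standard_normal((P, p_low)) / np.sqrt(p_low)
Y = X @ G

# Step 2: a few Lloyd (K-means) iterations on the cheap projected data
centers = Y[rng.choice(n, k, replace=False)]
for _ in range(10):
    labels = np.argmin(sqdist(Y, centers), axis=1)
    for j in range(k):
        if np.any(labels == j):
            centers[j] = Y[labels == j].mean(axis=0)

# Step 3: landmarks = per-cluster means in the ORIGINAL space
L = np.stack([X[labels == j].mean(axis=0) for j in range(k) if np.any(labels == j)])

# Step 4: standard Nystrom approximation of an RBF kernel from the landmarks
gamma = 1.0 / P
C = np.exp(-gamma * sqdist(X, L))    # n x k cross-kernel
W = np.exp(-gamma * sqdist(L, L))    # k x k landmark kernel
K_approx = C @ np.linalg.pinv(W) @ C.T

K = np.exp(-gamma * sqdist(X, X))
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(rel_err)
```

The expensive clustering happens entirely in p_low dimensions, which is where the savings for high-dimensional data come from.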
    [C17] 2017 Estimating Active Subspaces with Randomized Gradient SamplingAbstract: In this work, we present an efficient method for estimating active subspaces using only random observations of gradient vectors. Our method is based on the bi-linear representation of low-rank gradient matrices with a novel initialization step for alternating minimization.
    • S. Becker and F. Pourkamali-Anaraki. SIAM Workshop on Dimension Reduction, Pittsburgh, July 9-10. link or CU Scholar website
    [C16] 2017 Robust Partially-Compressed Least-SquaresAbstract: Randomized matrix compression techniques, such as the Johnson-Lindenstrauss transform, have emerged as an effective and practical way for solving large-scale problems efficiently. With a focus on computational efficiency, however, solution quality and accuracy are often sacrificed in the trade-off. In this paper, we investigate compressed least-squares problems and propose new models and algorithms that address the issue of error and noise introduced by compression. While maintaining computational efficiency, our models provide robust solutions that are more accurate--relative to solutions of uncompressed least-squares--than those of classical compressed variants. We introduce tools from robust optimization together with a form of partial compression to improve the error-time trade-offs of compressed least-squares solvers. We develop an efficient solution algorithm for our Robust Partially-Compressed (RPC) model based on a reduction to a one-dimensional search. We also derive the first approximation error bounds for Partially-Compressed least-squares solutions. Empirical results comparing numerous alternatives suggest that robust and partially compressed solutions are effectively insulated against aggressive randomized transforms.
    • S. Becker, B. Kawas, M. Petrik, K. N. Ramamurthy. AAAI. [arXiv]
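The classical "sketch-and-solve" baseline that this paper improves on is easy to demonstrate: compress both sides of a least-squares problem with a random matrix and solve the small system. The sketch below (illustrative sizes, Gaussian sketch) shows the accuracy gap versus the uncompressed solve that motivates the robust, partially-compressed models; it does not implement the RPC model itself.

```python
import numpy as np

rng = np.random.default_rng(3)

n, d, m = 5000, 20, 200
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Uncompressed least squares: the gold standard the paper compares against
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

# Johnson-Lindenstrauss-style compression: solve the m-row sketched problem
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sk = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

err_ls = np.linalg.norm(x_ls - x_true)
err_sk = np.linalg.norm(x_sk - x_true)
print(err_ls, err_sk)  # the sketched solve is cheaper but noticeably less accurate
```

The sketched solve costs O(mnd + md^2) instead of O(nd^2) dominated terms, which is the efficiency/accuracy trade-off the abstract refers to.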
    [C15] 2016 A Randomized Approach to Efficient Kernel ClusteringAbstract: Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement since memory scales as the square of the number of data points. We provide a new analysis of a class of approximate kernel methods that have more modest memory requirements, and propose a specific one-pass randomized kernel approximation followed by standard K-means on the transformed data. The analysis and experiments suggest the method is accurate, while requiring drastically less memory than standard kernel K-means and significantly less memory than Nystrom based approximations.
    • F. Pourkamali-Anaraki, S. Becker. IEEE GlobalSIP (Global Conference on Signal and Information Processing). [arXiv] Also available via CU Scholar website
    [C14] 2015 Efficient Dictionary Learning via Very Sparse Random ProjectionsAbstract: Performing signal processing tasks on compressive measurements of data has received great attention in recent years. In this paper, we extend previous work on compressive dictionary learning by showing that more general random projections may be used, including sparse ones. More precisely, we examine compressive K-means clustering as a special case of compressive dictionary learning and give theoretical guarantees for its performance for a very general class of random projections. We then propose a memory and computation efficient dictionary learning algorithm, specifically designed for analyzing large volumes of high-dimensional data, which learns the dictionary from very sparse random projections. Experimental results demonstrate that our approach allows for reduction of computational complexity and memory/data access, with controllable loss in accuracy.
    • F. Pourkamali-Anaraki, S. Becker, S. Hughes. SampTA, May. [DOI] [arXiv]
    [C13] 2015 General Optimization Framework for Robust and Regularized 3D FWIAbstract: Scarcity of hydrocarbon resources and high exploration risks motivate the development of high fidelity algorithms and computationally viable approaches to exploratory geophysics. Whereas early approaches considered least-squares minimization, recent developments have emphasized the importance of robust formulations, as well as formulations that allow disciplined encoding of prior information into the inverse problem formulation. The cost of a more flexible optimization framework is a greater computational complexity, as least-squares optimization can be performed using straightforward methods (e.g., steepest descent, Gauss-Newton, L-BFGS), whilst incorporation of robust (non-smooth) penalties requires custom changes that may be difficult to implement in the context of a general seismic inversion workflow. In this study, we propose a generic, flexible optimization framework capable of incorporating a broad range of noise models, forward models, regularizers, and reparametrization transforms. This framework seamlessly covers robust noise models (such as Huber and Student's t), as well as sparse regularizers, projected constraints, and Total Variation regularization. The proposed framework is also expandable --- we explain the adjustments that are required for any new formulation to be included. Lastly, we conclude with a few numerical examples demonstrating the versatility of the formulation.
    • S. Becker, L. Horesh, A. Aravkin, E. van den Berg, S. Zhuk. as extended abstract at EAGE (Madrid, Spain, June). [arXiv]
    [C12] 2014 Time--Data Tradeoffs by Aggressive SmoothingAbstract: This paper proposes a tradeoff between sample complexity and computation time that applies to statistical estimators based on convex optimization. As the amount of data increases, we can smooth optimization problems more and more aggressively to achieve accurate estimates more quickly. This work provides theoretical and experimental evidence of this tradeoff for a class of regularized linear inverse problems.
    • J. Bruer, J. A. Tropp, V. Cevher, S. Becker. NIPS, Montreal. link
    [C11] 2014 QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical ModelsAbstract: In this paper, we develop a family of algorithms for optimizing “superposition-structured” or “dirty” statistical estimators for high-dimensional problems involving the minimization of the sum of a smooth loss function with a hybrid regularization. Most of the current approaches are first-order methods, including proximal gradient or Alternating Direction Method of Multipliers (ADMM). We propose a new family of second-order methods where we approximate the loss function using quadratic approximation. The superposition structured regularizer then leads to a subproblem that can be efficiently solved by alternating minimization. We propose a general active subspace selection approach to speed up the solver by utilizing the low-dimensional structure given by the regularizers, and provide convergence guarantees for our algorithm. Empirically, we show that our approach is more than 10 times faster than state-of-the-art first-order approaches for the latent variable graphical model selection problems and multi-task learning problems when there is more than one regularizer. For these problems, our approach appears to be the first algorithm that can extend active subspace ideas to multiple regularizers.
    • C-J. Hsieh, I. Dhillon, P. Ravikumar, S. Becker, P. Olsen. NIPS. link
    [C10] 2014 A variational approach to stable principal component pursuitAbstract: We introduce a new convex formulation for stable principal component pursuit (SPCP) to decompose noisy signals into low-rank and sparse representations. For numerical solutions of our SPCP formulation, we first develop a convex variational framework and then accelerate it with quasi-Newton methods. We show, via synthetic and real data experiments, that our approach offers advantages over the classical SPCP formulations in scalability and practical parameter selection.
    • A. Aravkin, S. Becker, V. Cevher, P. Olsen. UAI. [arXiv]
    [C9] 2014 Joint low-rank representation and matrix completion under a singular value thresholding frameworkAbstract: Matrix completion is the process of estimating missing entries from a matrix using some prior knowledge. Typically, the prior knowledge is that the matrix is low-rank. In this paper, we present an extension of standard matrix completion that leverages prior knowledge that the matrix is low-rank and that the data samples can be efficiently represented by a fixed known dictionary. Specifically, we compute a low-rank representation of a data matrix with respect to a given dictionary using only a few observed entries. A novel modified version of the singular value thresholding (SVT) algorithm named joint low-rank representation and matrix completion SVT (J-SVT) is proposed. Experiments on simulated data show that the proposed J-SVT algorithm provides better reconstruction results compared to standard matrix completion.
    • C. Tzagkarakis, S. Becker and A. Mouchtaris. EUSIPCO. link
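The SVT operator that both the standard algorithm and the J-SVT variant build on is simply the proximal operator of the nuclear norm: compute an SVD and shrink the singular values. A minimal sketch of the operator, demonstrated here on low-rank denoising (the completion setting adds an observation mask on top of this; all sizes and the threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def svt(Z, tau):
    """Singular value thresholding: the prox of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)

# Rank-3 signal plus dense Gaussian noise
n, r = 60, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
Z = M + 0.5 * rng.standard_normal((n, n))

# Threshold on the order of the noise spectral norm (an assumed choice)
X = svt(Z, tau=8.0)
print(np.linalg.matrix_rank(X))  # thresholding typically restores a (near) low-rank matrix
```

Because soft-thresholding sets small singular values exactly to zero, iterating this operator inside a completion loop drives the iterates toward low rank.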
    [C8] 2014 Metric learning with rank and sparsity constraintsAbstract: Choosing a distance preserving measure or metric is fundamental to many signal processing algorithms, such as k-means, nearest neighbor searches, hashing, and compressive sensing. In virtually all these applications, the efficiency of the signal processing algorithm depends on how fast we can evaluate the learned metric. Moreover, storing the chosen metric can create space bottlenecks in high dimensional signal processing problems. As a result, we consider data dependent metric learning with rank as well as sparsity constraints. We propose a new non-convex algorithm and empirically demonstrate its performance on various datasets; a side benefit is that it is also much faster than existing approaches. The added sparsity constraints significantly improve the speed of multiplying with the learned metrics without sacrificing their quality.
    • B. Bah, S. Becker, V. Cevher and B. Gozcu. ICASSP. [DOI] link
    [C7] 2013 A proximal splitting method for inf-convolutive variational models in image recoveryAbstract:
    • S. Becker and P. L. Combettes. accepted in ICIP Melbourne.
    [C6] 2013 Scalable and accurate quantum tomography from fewer measurementsAbstract: Quantum tomography is a matrix recovery problem that is not amenable to nuclear-norm minimization. We review existing recovery strategies and propose new ones, with an emphasis on methods with provable guarantees. An outcome of this is a new method with guarantees that is more accurate and more scalable than previous methods.
    • Stephen Becker and Volkan Cevher. SPARS Lausanne Switzerland. link
    [C5] 2013 Randomized Singular Value ProjectionAbstract: Affine rank minimization algorithms typically rely on calculating the gradient of a data error followed by a singular value decomposition at every iteration. Because these two steps are expensive, heuristic approximations are often used to reduce computational burden. To this end, we propose a recovery scheme that merges the two steps with randomized approximations, and as a result, operates on space proportional to the degrees of freedom in the problem. We theoretically establish the estimation guarantees of the algorithm as a function of approximation tolerance. While the theoretical approximation requirements are overly pessimistic, we demonstrate that in practice the algorithm performs well on the quantum tomography recovery problem.
    • Stephen Becker, Volkan Cevher, Anastasios Kyrillidis. SampTA Bremen Germany. [arXiv] This arXiv version is an extended version (i.e., with proofs) of the 4 page SampTA proceedings version.
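The randomized approximation that replaces the exact per-iteration SVD can be sketched with a standard Halko-Martinsson-Tropp randomized truncated SVD, which is a building block of this style of method rather than the paper's full recovery algorithm. All sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

def randomized_svd(A, r, oversample=10):
    """Randomized rank-r SVD via a sketched range finder."""
    m, n = A.shape
    Omega = rng.standard_normal((n, r + oversample))
    Q, _ = np.linalg.qr(A @ Omega)                 # approximate basis for range(A)
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :r], s[:r], Vt[:r]

# Exactly rank-5 matrix: the randomized SVD recovers it to near machine precision
A = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 200))
U, s, Vt = randomized_svd(A, r=5)
rel = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(rel)
```

The cost is dominated by a few matrix-matrix products with a thin sketch, so memory and time scale with the target rank rather than the ambient dimension, which is the "space proportional to the degrees of freedom" idea in the abstract.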
    [C4] 2013 Sparse projections onto the simplexAbstract: Most learning methods with rank or sparsity constraints use convex relaxations, which lead to optimization with the nuclear norm or the l1-norm. However, several important learning applications cannot benefit from this approach as they feature these convex norms as constraints in addition to the non-convex rank and sparsity constraints. In this setting, we derive efficient sparse projections onto the simplex and its extension, and illustrate how to use them to solve high-dimensional learning problems in quantum tomography, sparse density estimation and portfolio selection with non-convex constraints.
    • Stephen Becker, Volkan Cevher, Christopher Koch, Anastasios Kyrillidis. ICML, Atlanta, (the v3 on arXiv is our expanded version of our NIPS conference version). [arXiv] For fun, see the ICML 2013 review boards. There is an earlier version of this paper on the JMLR website.
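One natural reading of the sparse simplex projection is greedy: keep the k largest entries and project them onto the simplex with the classic sort-based algorithm. The sketch below follows that reading (the sort-based subroutine is standard; the greedy composition is my paraphrase of the paper's scheme, not its verbatim algorithm):

```python
import numpy as np

def proj_simplex(v):
    """Euclidean projection onto {x >= 0, sum(x) = 1} via sorting."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(u) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def sparse_proj_simplex(v, k):
    """Keep the k largest entries of v, project them onto the simplex."""
    x = np.zeros_like(v)
    idx = np.argsort(v)[-k:]          # indices of the k largest entries
    x[idx] = proj_simplex(v[idx])
    return x

v = np.array([0.9, 0.1, -0.3, 0.5, 0.2])
x = sparse_proj_simplex(v, k=2)
print(x)  # → [0.7 0.  0.  0.3 0. ]
```

The result is feasible for both the simplex constraint and the sparsity constraint simultaneously, which is exactly the non-convex constraint combination the paper targets.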
    [C3] 2012 A quasi-Newton proximal splitting methodAbstract: A new result in convex analysis on the calculation of proximity operators in certain scaled norms is derived. We describe efficient implementations of the proximity calculation for a useful class of functions; the implementations exploit the piece-wise linear nature of the dual problem. The second part of the paper applies the previous result to acceleration of convex minimization problems, and leads to an elegant quasi-Newton method. The optimization method compares favorably against state-of-the-art alternatives. The algorithm has extensive applications including signal processing, sparse recovery and machine learning and classification.
    • Stephen Becker, M. Jalal Fadili. NIPS (Lake Tahoe) (received a "spotlight presentation" as one of the top 5% of submissions). [arXiv] official link
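The baseline that the quasi-Newton scaling accelerates is the plain proximal-splitting (forward-backward) iteration. A minimal unscaled sketch for l1-regularized least squares, with illustrative problem sizes (the paper's contribution is computing the prox in quasi-Newton scaled norms, which this sketch does not do):

```python
import numpy as np

rng = np.random.default_rng(6)

# l1-regularized least squares: min_x 0.5*||Ax - b||^2 + lam*||x||_1
A = rng.standard_normal((100, 50))
x_true = np.zeros(50)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(100)
lam = 1.0

soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)  # prox of t*||.||_1

L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
x = np.zeros(50)
for _ in range(500):
    x = soft(x - A.T @ (A @ x - b) / L, lam / L)   # forward-backward step

obj = 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum()
print(np.count_nonzero(x), obj)  # a sparse minimizer with much lower objective than x = 0
```

Replacing the scalar step 1/L with a (diagonal-plus-rank-one) quasi-Newton metric is what requires the new prox calculus derived in the paper.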
    [C2] 2012 A 100MHz-2GHz 12.5x sub-Nyquist Rate Receiver in 90nm CMOSAbstract: A fully-integrated, high-speed, wideband receiver called the random modulation pre-integrator is realized in IBM 90nm digital CMOS. It achieves an effective instantaneous bandwidth of 2GHz, with >54dB dynamic range. Most notably, the aggregate digitization rate is fs = 320 MSPS, 12.5x below the Nyquist rate. Signal recovery can be accomplished for any signal with a concise representation. The system is validated using radar pulses and tones as the input and recovering the time-domain waveforms.
    • J. Yoo, S. Becker, M. Loh, M. Monge, E. Candès, A. Emami-Neyestanak. RFIC, Montreal, Canada, June. [DOI] link
    [C1] 2012 Design and implementation of a fully integrated compressed-sensing signal acquisition systemAbstract: Compressed sensing (CS) is a topic of tremendous interest because it provides theoretical guarantees and computationally tractable algorithms to fully recover signals sampled at a rate close to their information content. This paper presents the design of the first physically realized fully-integrated CS based Analog-to-Information (A2I) pre-processor known as the Random-Modulation Pre-Integrator (RMPI) [1]. The RMPI achieves 2GHz bandwidth while digitizing samples at a rate 12.5x lower than the Nyquist rate. The success of this implementation is due to a coherent theory/algorithm/hardware co-design approach. This paper addresses key aspects of the design, presents simulation and hardware measurements, and discusses limiting factors in performance.
    • J. Yoo, S. Becker, M. Monge, M. Loh, E. Candès, A. Emami-Neyestanak. ICASSP (Kyoto, Japan, March 2012). [DOI] link

Book chapters

  • Dual Smoothing Techniques for Variational Matrix Decomposition, S. Becker and A. Aravkin, in “Robust Low-Rank and Sparse Matrix Decomposition: Applications in Image and Video Processing”, T. Bouwmans, N. Aybat, E. Zahzah, eds. CRC Press, 2016. arXiv

Selected Talks

Here I list a few talks that have video or slides or websites

Technical Reports


Abstract Since 2004, the field of compressed sensing has grown quickly and seen tremendous interest because it provides a theoretically sound and computationally tractable method to stably recover signals by sampling at the information rate. This thesis presents in detail the design of one of the world's first compressed sensing hardware devices, the random modulation pre-integrator (RMPI). The RMPI is an analog-to-digital converter (ADC) that bypasses a current limitation in ADC technology and achieves an unprecedented 8 effective number of bits over a bandwidth of 2.5 GHz. Subtle but important design considerations are discussed, and state-of-the-art reconstruction techniques are presented. Inspired by the need for a fast method to solve reconstruction problems for the RMPI, we develop two efficient large-scale optimization methods, NESTA and TFOCS, that are applicable to a wide range of other problems, such as image denoising and deblurring, MRI reconstruction, and matrix completion (including the famous Netflix problem). While many algorithms solve unconstrained l1 problems, NESTA and TFOCS can solve the constrained form of l1 minimization, and allow weighted norms. In addition to l1 minimization problems such as the LASSO, both NESTA and TFOCS solve total-variation minimization problems. TFOCS also solves the Dantzig selector and most variants of the nuclear norm minimization problem. A common theme in both NESTA and TFOCS is the use of smoothing techniques, which make the problem tractable, and the use of optimal first-order methods that have an accelerated convergence rate yet have the same cost per iteration as gradient descent. The conic dual methodology is introduced in TFOCS and proves to be extremely flexible, covering such generic problems as linear programming, quadratic programming, and semi-definite programming. A novel continuation scheme is presented, and it is shown that the Dantzig selector benefits from an exact-penalty property.
Both NESTA and TFOCS are released as software packages available freely for academic use.
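The smoothing-plus-accelerated-gradient theme can be illustrated on the simplest case: replace |t| with its Huber (Nesterov-smoothed) version, whose gradient is (1/mu)-Lipschitz, and run an accelerated first-order method. The sketch below does this for l1 denoising, where the exact answer is soft thresholding; all parameter values are illustrative and this is not the NESTA or TFOCS code.

```python
import numpy as np

rng = np.random.default_rng(7)

# Nesterov/Huber smoothing of |t|: f_mu(t) = t^2/(2 mu) for |t| <= mu, else |t| - mu/2.
# Its derivative is clip(t/mu, -1, 1), which is (1/mu)-Lipschitz.
def huber_grad(t, mu):
    return np.clip(t / mu, -1.0, 1.0)

# Smoothed l1 denoising: min_x 0.5*||x - b||^2 + lam * sum_i f_mu(x_i)
b = np.concatenate([np.zeros(50), 3.0 * np.ones(10)]) + 0.1 * rng.standard_normal(60)
lam, mu = 1.0, 0.01

L = 1.0 + lam / mu                  # Lipschitz constant of the smoothed gradient
x, y, t = np.zeros(60), np.zeros(60), 1.0
for _ in range(3000):               # accelerated (Nesterov-style) gradient iteration
    g = (y - b) + lam * huber_grad(y, mu)
    x_new = y - g / L
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
    y = x_new + ((t - 1) / t_new) * (x_new - x)
    x, t = x_new, t_new

# The exact (unsmoothed) solution is soft thresholding of b at level lam
exact = np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)
print(np.max(np.abs(x - exact)))   # close, up to O(mu) smoothing error
```

The trade-off is visible in L = 1 + lam/mu: a smaller mu gives a more faithful problem but a larger Lipschitz constant, which the accelerated O(1/k^2) rate helps absorb.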
Abstract This thesis presents the results of extensive simulations of the ST2 model of water designed to explore both thermodynamic and dynamic properties in supercooled states. Constant volume simulations were performed at 1,233 state points, with densities ranging from 0.80 g/cm3 to 1.20 g/cm3 in steps of 0.01 g/cm3; along these isochores, simulations were run at temperatures from 250 K to 400 K in 5 K intervals. Our results qualitatively reproduce many of the expected properties of supercooled water such as the density anomaly and a novel low-temperature phase transition between two liquid states. The Stokes-Einstein and Debye-Stokes-Einstein equations provide a simple hydrodynamic relation between viscosity and diffusion. These relations hold for simple liquids, but are known to fail near a glass transition. For ST2 water, we find that the Debye-Stokes-Einstein equation does not hold at low temperatures. Furthermore, the data indicate that rotational diffusion is enhanced at low temperatures relative to translational diffusion. We also uncover an unexpected connection between dynamics and thermodynamics; specifically, we find that the structural relaxation time τα along the critical isochore provides a precursor to the liquid-liquid phase transition at temperatures up to 150 K above the critical temperature. We observe a nearly identical signature in τα at a higher density which may indicate a second liquid-liquid phase transition, adding credibility to the recent idea that there may be more than one liquid-liquid critical point.

Student theses

Theses of my students (after 2019, CU stopped posting these online, so I’m posting local copies when I have them):

In the pipeline

  • Akshay Seshadri (PhD, CU Physics) 2023?
  • Kevin Doherty (PhD), 2023-4?
  • Jacob Spainhour (PhD), 2024?

PhD theses

Masters theses

Undergraduate theses

Professional Masters “culminating experience”

  • Austin Wagenknecht, 2022, Digital Signal Processing Methods in Music Information Retrieval
  • Jacob Tiede, 2021, culminating experience


Please support journals that allow authors to post their final papers.

  • SIAM “Information for Authors”. On their copyright transfer form, they state that the author retains the right to post the final paper on their academic website, as long as the SIAM copyright notice is posted and no fee is charged.
  • American Physical Society (APS) posting policy. APS publishes the Physical Review series. Summary: authors can post the paper as long as the citation and APS copyright are included, and no fee is charged. The copyright and citation look like: “Author names, journal title, volume number, page number (or article identifier), year of publication. Copyright (year) by the American Physical Society.”
  • IEEE Electronic Information Dissemination page. Authors can post a published paper, but need to include the IEEE copyright notice and full citation (see the link for the copyright notice).
  • The SHERPA/ROMEO website lists the policies of many publishers. The SPARC addendum is an addendum to copyright agreements that authors may request their publisher to sign.