Conference on Fast Direct Solvers

Department of Mathematics, Purdue University, Nov 9-11, 2018

Program and Abstracts

Conference location: STEW 314 (Stewart Center Room 314), Purdue University

 

Note: subject to change

Fri, Nov 9
Friday Session (Chair: Jie Shen)
2:00-2:10 Opening remark
2:10-3:00 Nikos Pitsianis, Aristotle University of Thessaloniki and Duke University
Sparse Dual of the Density Peaks Algorithm for Cluster Analysis of High-Dimensional Data
3:00-3:30 Leopold Cambier, Stanford University
A General Graph Sparsified Nested Dissection Algorithm Using Low-Rank Approximations
3:30-4:00 Xin Ye, University of Minnesota
Kernel Matrix Compression with Proxy Points
4:00-4:30 Coffee break
4:30-5:00 Ke Chen, University of Wisconsin-Madison
Random Sampling and Efficient Algorithms for Multiscale PDEs
5:00-5:30 Mirjeta Pasha, Kent State University
Linearized Bregman Iteration Method with Non-negativity Constraint Solution and Applications in Sparse Denoising
5:30-6:00 Xin Xing, Georgia Institute of Technology
Accelerated Interpolative Decomposition by the General Proxy Point Method
 
Sat, Nov 10
Saturday Session 1 (Chair: Alex Pothen)
9:00-9:50 Eric Michielssen, University of Michigan, Ann Arbor
Butterfly+ Fast Direct Solvers for Highly Oscillatory Problems
9:50-10:40 Hongkai Zhao, UC Irvine
Intrinsic Complexity and its Scaling Law: From Approximation of Random Vectors and Random Fields to High Frequency Waves
10:40-11:10 Coffee break
11:10-11:40 Manki Cho, Rochester Institute of Technology
Steklov Expansion Methods for Laplacian Boundary Value Problems
11:40-12:20 Xueping Zhao, University of South Carolina
A Second Order Fully-discrete Linear Energy Stable Scheme for a Binary Compressible Viscous Fluid Model
12:20-2:00 Lunch (on your own)
Saturday Session 2 (Chair: Guang Lin)
2:00-2:50 Xiaobai Sun, Duke University
Fast Column Subset Partition of a Sparse High-dimensional Data Interaction Matrix
2:50-3:20 Edgar Solomonik, University of Illinois at Urbana-Champaign
Communication-Avoiding Factorization Algorithms
3:20-3:50 Vasileios Kalantzis, IBM Research
Beyond Automated Multilevel Substructuring: Domain Decomposition with Rational Filtering
3:50-4:20 Coffee break
4:20-4:50 Ryan Chilton, MyraCore LLC
Direct Algorithms for Sparse Schur Complements and Inverses
4:50-5:20 Xiao Liu, Rice University
Direct Solution of Elliptic Problems Under Coefficient Updates
5:20-5:50 Yu Hong Yeung, Purdue University
AMPS: A Real-time Mesh Cutting Algorithm for Surgical Simulations
 
Sun, Nov 11
Sunday Session (Chair: Jianlin Xia)
8:30-9:20 Sabine Le Borne, Hamburg University of Technology
Direct Solvers for RBF Interpolation Problems
9:20-10:10 Dan Jiao, Purdue University
Accuracy Directly Controlled Fast Direct Solutions of General H^2-Matrices
10:10-10:40 Zhiping Mao, Brown University
A Fast Solver for Spectral Element Approximation Applied to Fractional Differential Equations Using Hierarchical Matrix Approximation
10:40-11:00 Coffee break
11:00-11:50 Ming Gu, UC Berkeley
Advanced Techniques for Low-Rank Matrix Approximation
11:50-12:20 Zhaopeng Hao, Worcester Polytechnic Institute
Fast Numerical Methods for Time Fractional and Space Fractional Equations

 

 

Abstracts

 


 

A General Graph Sparsified Nested Dissection Algorithm Using Low-Rank Approximations

Leopold Cambier, Stanford University

 

Solving large sparse and ill-conditioned linear systems remains a difficult task. On one hand, direct methods using nested dissection offer a robust approach, but they typically scale like O(N^2). Custom preconditioners are a good alternative but usually require expert knowledge about the problem. We position ourselves between those two approaches and use approximate factorizations. We present a sparsified nested dissection algorithm that works on any matrix with properties similar to discretizations of elliptic partial differential equations. We start with a nested dissection ordering and compute low-rank approximations between clusters. This allows us to sparsify all separators (i.e., reduce their size without introducing any fill-in) from the beginning, which reduces the factorization time dramatically to almost O(N). The result is an approximate factorization that can then be used as an efficient preconditioner. Its factorization time is near O(N) and it requires an almost constant number of iterations on a large range of problems. Furthermore, it can be applied to any matrix, is guaranteed not to need any pivoting for SPD problems, and its only parameter is the low-rank approximation tolerance. We present the algorithm in detail, how it builds upon nested dissection, and how one can sparsify separators. Then, we compare multiple variants and show that, using an appropriate block diagonal scaling, one can further improve the preconditioner's accuracy. Finally, we apply the algorithm to a wide range of regular and irregular problems, showing that we can effectively solve large problems at a fraction of the cost of direct methods.
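
The core compression step can be pictured with a small sketch (an illustration only, not the authors' implementation): the interaction block between a separator and the rest of the problem is numerically low rank, so a truncated SVD yields a small "coarse" basis for the separator unknowns, and the remaining unknowns can be eliminated without creating fill-in.

```python
# Illustrative sketch only: compress a separator's interaction block via a
# truncated SVD, the kind of low-rank approximation used inside the
# sparsified nested dissection factorization described in the abstract.
import numpy as np

def sparsify_separator(A_sep_nbr, tol=1e-8):
    """Return an orthonormal basis Q whose columns capture, to tolerance tol,
    the separator's interaction with its neighbors (the 'coarse' unknowns)."""
    U, s, _ = np.linalg.svd(A_sep_nbr, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))   # numerical rank at the given tolerance
    return U[:, :k]

# toy example: a smooth, well-separated interaction block (hence low rank)
x = np.linspace(0.0, 1.0, 80)[:, None]    # separator "points"
y = np.linspace(2.0, 3.0, 60)[None, :]    # neighbor "points"
A = 1.0 / (1.0 + (x - y) ** 2)
Q = sparsify_separator(A)
print(A.shape, "compressed to rank", Q.shape[1])
```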

 

Random Sampling and Efficient Algorithms for Multiscale PDEs

Ke Chen, University of Wisconsin-Madison

 

We describe an efficient framework for multiscale PDE problems that uses random sampling to capture low-rank local solution spaces arising in a domain decomposition framework. In contrast to existing techniques, our method does not rely on detailed analytical understanding of the specific multiscale PDE, in particular its asymptotic limits. Our framework is applied to two specific problems - a linear kinetic equation and an elliptic equation with oscillatory media - for which we recover the asymptotic-preserving scheme and numerical homogenization, respectively. Numerical results confirm the efficacy of our approach.
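
A rough illustration of the sampling idea (with a hypothetical local_solve routine standing in for the local PDE solves; this is not the speaker's code): probe a local solution operator with random data and extract a reduced basis from the snapshots.

```python
# Rough illustration with a hypothetical `local_solve` routine: probe a local
# solution map with random inputs and extract a low-rank basis from the samples.
import numpy as np

def sample_local_basis(local_solve, n_inputs, n_samples, tol=1e-8, seed=0):
    """Return an orthonormal basis capturing the sampled local solution space."""
    rng = np.random.default_rng(seed)
    snapshots = np.column_stack(
        [local_solve(rng.standard_normal(n_inputs)) for _ in range(n_samples)]
    )
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, : int(np.sum(s > tol * s[0]))]

# toy stand-in for a local solve: a fixed rank-3 linear map
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 30))
basis = sample_local_basis(lambda g: M @ g, n_inputs=30, n_samples=20)
print("captured local dimension:", basis.shape[1])   # expected: 3
```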

 

Direct Algorithms for Sparse Schur Complements and Inverses

Ryan Chilton, MyraCore LLC

 

There are many packages available for directly solving sparse A*x=b linear systems (ie packages that encapsulate sparse A=LU factorization and triangular x=L\U\b backsolution). But in this talk, we review less commonly available algorithms for (i) efficient computation of Schur complements involving sparse matrices and (ii) sampling arbitrary entries of the inverse of a sparse matrix. We show how these calculations can be naturally folded into skeletonization/compression algorithms (cross approximation and randomized projection in particular) to generate rank structured approximations to sparse Schur complements and inverses. These representations are particularly useful in the context of substructuring (DDM) solvers for FEM systems, and hybrid direct solvers for combined FEM+BEM (sparse+dense) systems. An open-source (GPL) implementation is made available at the author's website (myracore.com).
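
For reference, a generic sketch (not the package described in the talk) of forming the Schur complement S = D - C A^{-1} B of a sparse 2x2 block system by reusing a single sparse LU factorization of the leading block:

```python
# Generic sparse Schur complement sketch: S = D - C A^{-1} B,
# computed by factoring A once and solving against all columns of B.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n, m = 200, 5
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
B = sp.random(n, m, density=0.05, format="csc", random_state=0)
C = B.T.tocsc()
D = sp.identity(m, format="csc")

lu = splu(A)                            # sparse LU of A, computed once
AinvB = lu.solve(B.toarray())           # triangular solves for all columns of B
S = D.toarray() - C.toarray() @ AinvB   # small dense m-by-m Schur complement
print("Schur complement shape:", S.shape)
```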

 

Steklov Expansion Methods for Laplacian Boundary Value Problems

Manki Cho, Rochester Institute of Technology

 

Eigenfunction expansion methods have been studied in various ways to approximate solutions of PDEs. This talk will feature error estimates for the approximation of solutions of Laplace's equation with Dirichlet, Robin, or Neumann boundary conditions using the harmonic Steklov eigenfunctions. Based on the spectral theory of trace spaces, the solutions are represented in an orthogonal basis formed by the Steklov eigenfunctions. When the region is a rectangle, explicit formulae for the Steklov eigenfunctions are available, and both theoretical analysis and numerical experiments in this talk will illustrate the efficiency and accuracy of the Steklov expansion methods.
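
For orientation, the harmonic Steklov eigenproblem that generates the expansion basis can be stated as follows (standard formulation; the error estimates are the contribution of the talk):

```latex
% Harmonic Steklov eigenproblem on a bounded region \Omega with boundary \partial\Omega:
\begin{aligned}
\Delta s_j &= 0 && \text{in } \Omega,\\
\frac{\partial s_j}{\partial n} &= \lambda_j \, s_j && \text{on } \partial\Omega .
\end{aligned}
% The boundary traces of the s_j form an orthogonal basis of the trace space,
% and a harmonic solution is expanded as  u(x) \approx \sum_{j=1}^{N} c_j \, s_j(x),
% with the coefficients c_j determined by the Dirichlet, Neumann, or Robin data.
```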

 

Advanced Techniques for Low-Rank Matrix Approximation

Ming Gu, UC Berkeley

 

Low-rank matrix approximations have become a technique of central importance in large-scale data science. In this talk we discuss a set of novel low-rank matrix approximation algorithms that are tailored to all levels of accuracy requirements for maximum computational efficiency. These algorithms include spectrum-revealing matrix factorizations that are optimal up to dimension-dependent constants, and an efficient truncated SVD (singular value decomposition) that is accurate up to a given tolerance. We provide theoretical error bounds and numerical evidence that demonstrate the superiority of our algorithms over existing ones, and show their usefulness in a number of data science applications.
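
As a baseline for comparison (a textbook randomized sketch, not the speaker's new spectrum-revealing algorithms), a truncated SVD can be computed by combining a randomized range finder with a small deterministic SVD:

```python
# Textbook randomized truncated SVD (baseline sketch only): range finder + small SVD.
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)                    # approximate range of A
    U_small, s, Vh = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vh[:k, :]

# toy test on a matrix with rapidly decaying singular values
rng = np.random.default_rng(1)
A = rng.standard_normal((300, 200)) @ np.diag(0.5 ** np.arange(200)) \
    @ rng.standard_normal((200, 200))
U, s, Vh = randomized_svd(A, k=20)
print("relative error:",
      np.linalg.norm(A - U @ np.diag(s) @ Vh) / np.linalg.norm(A))
```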

 

Fast Numerical Methods for Time Fractional and Space Fractional Equations

Zhaopeng Hao, Worcester Polytechnic Institute

 

In this talk we discuss efficient numerical methods for solving fractional diffusion equations, in which the canonical integer-order differential operators are replaced by nonlocal fractional ones. In particular, we consider two model equations: (1) the multi-term time-fractional diffusion equation, which is used, e.g., to model viscoelastic materials; and (2) the elliptic space-fractional diffusion equation with the fractional Laplacian for anomalous diffusion, which arises in fluid dynamics and has long been observed. Due to the nonlocality and singularity of the fractional differential operators, fractional differential equations are more difficult to solve than their integer-order counterparts; special care must be taken in designing numerical methods, and fast solvers are required. For the time-fractional equation, we propose a finite difference method and a fast solver with linearithmic complexity. For the space-fractional one, we discuss regularity, a spectral method, and the corresponding solver.
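
For concreteness, the two model problems have the following generic forms (a hedged sketch of the standard formulations; the precise settings are those of the talk):

```latex
% (1) Multi-term time-fractional diffusion, with Caputo derivatives of
%     orders \alpha_m \in (0,1):
\sum_{m=1}^{M} c_m \, {}^{C}\!D_t^{\alpha_m} u(x,t) \;=\; \Delta u(x,t) + f(x,t).
%
% (2) Elliptic space-fractional diffusion with the fractional Laplacian,
%     \alpha \in (0,2), posed with an exterior condition:
(-\Delta)^{\alpha/2} u(x) = f(x) \ \text{ in } \Omega, \qquad u = 0 \ \text{ in } \Omega^{c}.
```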

 

Accuracy Directly Controlled Fast Direct Solutions of General H^2-Matrices

Dan Jiao, Purdue University

 

The dense matrix resulting from an integral equation (IE) based solution of Maxwell's equations can be compactly represented by an H^2-matrix. Given a general H^2-matrix, existing fast direct solutions involve approximations whose accuracy cannot be directly controlled. In this work, we propose new accuracy-controlled direct solution algorithms, including both factorization and inversion, for solving general H^2-matrices. For constant-rank H^2-matrices, the proposed direct solution has a strict O(N) complexity in both time and memory. For ranks that grow linearly with the electrical size, the complexity of the proposed direct solution is O(N log N) in factorization and inversion time, and O(N) in solution time and memory for solving volume IEs. Rapid direct solutions of electrodynamic volume IEs involving millions of unknowns have been obtained on a single CPU core with directly controlled accuracy. Comparisons with state-of-the-art H^2-based direct solvers have also demonstrated the advantages of the proposed direct solution in accuracy control, as well as in achieving better accuracy with much less CPU time.

 

Beyond Automated Multilevel Substructuring: Domain Decomposition with Rational Filtering

Vasileios Kalantzis, IBM Research

 

This talk discusses a rational filtering domain decomposition technique for the solution of large and sparse symmetric generalized eigenvalue problems. The proposed technique is purely algebraic and decomposes the eigenvalue problem associated with each subdomain into two disjoint subproblems. The first subproblem is associated with the interface variables and accounts for the interaction among neighboring subdomains. To compute the solution of the original eigenvalue problem at the interface variables we leverage ideas from contour integral eigenvalue solvers. The second subproblem is associated with the interior variables in each subdomain and can be solved in parallel among the different subdomains using real arithmetic only. Compared to rational filtering projection methods applied to the original matrix pencil, the proposed technique integrates only a part of the matrix resolvent while it applies any orthogonalization necessary to vectors whose length is equal to the number of interface variables. In addition, no estimation of the number of eigenvalues located inside the interval of interest is needed. Numerical experiments performed in distributed memory architectures illustrate the competitiveness of the proposed technique against rational filtering Krylov approaches.
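
The contour-integral ingredient mentioned above is, in its generic form, the approximation of the spectral projector onto the eigenvalues enclosed by a contour by a quadrature-discretized resolvent (standard rational filtering; the interface-restricted variant is the talk's contribution):

```latex
% Spectral projector onto the eigenvectors of (A, B) with eigenvalues inside
% the contour \Gamma, and its rational-filter approximation by an N_p-point quadrature:
P \;=\; \frac{1}{2\pi i} \oint_{\Gamma} (zB - A)^{-1} B \, dz
  \;\approx\; \sum_{j=1}^{N_p} \omega_j \, (z_j B - A)^{-1} B .
```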

 

Direct Solvers for RBF Interpolation Problems

Sabine Le Borne, Hamburg University of Technology

 

Scattered data approximation deals with the problem of producing a function that in some sense represents given (typically scattered) data and allows one to make predictions at other times/locations/parameter settings. Applications are quite diverse: surface reconstruction, image compression, and the numerical solution of PDEs (with their diverse applications), to name just a few.

In a scattered data interpolation problem, the interpolant is typically a linear combination of some radial basis functions (RBF). The coefficient vector c of the interpolant may be computed as the solution of a linear system Bc=y which results from enforcing the interpolation conditions for the given scattered data. While properties of the matrix B obviously depend on the choice of basis functions, several of the most commonly used approaches yield highly ill-conditioned, dense matrices B, resulting in a challenge to solve the linear system Bc=y, and hence to solve the scattered data interpolation problem. This talk deals with these challenges and some possible strategies for the solution of this system Bc=y.
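
To make the system concrete, here is a minimal sketch (generic RBF interpolation with a dense direct solve; the H-matrix techniques of the talk replace exactly this bottleneck) that assembles B for a Gaussian basis and solves Bc = y:

```python
# Minimal Gaussian RBF interpolation sketch: assemble B and solve B c = y densely.
import numpy as np
from scipy.spatial.distance import cdist

def rbf_interpolate(X, y, eps=3.0):
    """Gaussian RBF interpolant: B_ij = exp(-(eps*r_ij)^2), solve B c = y."""
    B = np.exp(-(eps * cdist(X, X)) ** 2)
    c = np.linalg.solve(B, y)            # dense O(n^3) solve -- the bottleneck
    return lambda Xnew: np.exp(-(eps * cdist(Xnew, X)) ** 2) @ c

# scattered 2D data sampled from a smooth function
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 1])
f = rbf_interpolate(X, y)
Xtest = rng.random((5, 2))
print(np.abs(f(Xtest) - np.sin(4 * Xtest[:, 0]) * np.cos(3 * Xtest[:, 1])))
```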

In particular, we study the application of techniques from the H-matrix framework both for the approximation of the system matrix B itself as well as for the construction of solvers. H-matrices provide a data-sparse matrix format that permits storage and matrix arithmetic in almost linear complexity. It turns out that several typical sets of basis functions from the (scattered data) literature lead to matrices B that fit into this framework, yielding a cost-effective approximation scheme to be illustrated in this talk.

 

Direct Solution of Elliptic Problems Under Coefficient Updates

Xiao Liu, Rice University

 

We propose a new direct solution method for elliptic partial differential equations (PDEs) that supports coefficient updates in subdomains. Factorization updates are expensive for existing direct factorization methods because changes propagate to larger subdomains. We bypass this restriction by introducing factors of certain exterior problems. The update cost for the proposed method is as small as the re-factorization cost in a small subdomain. The method is tested on the transmission problem for the Helmholtz equation and can also be applied to PDE-constrained optimization problems. This is joint work with Jianlin Xia and Maarten V. de Hoop.

 

A Fast Solver for Spectral Element Approximation Applied to Fractional Differential Equations Using Hierarchical Matrix Approximation

Zhiping Mao, Brown University

 

We develop a fast solver for the spectral element method (SEM) applied to the two-sided fractional diffusion equation on uniform, geometric and graded meshes. By approximating the singular kernel with a degenerate kernel, we construct a hierarchical matrix (H-matrix) to represent the stiffness matrix of the SEM and provide error estimates verified numerically. We can solve the H-matrix approximation problem efficiently using a hierarchical LU decomposition method, which reduces the computational cost to $O(R^2 N_d \log^2 N) + O(R^3 N_d \log N)$, where $R$ is the rank of the submatrices of the H-matrix approximation, $N_d$ is the total number of degrees of freedom and $N$ is the number of elements. However, we lose the high accuracy of the SEM. Thus, we solve the corresponding preconditioned system using the H-matrix approximation problem as a preconditioner, recovering the high-order accuracy of the SEM. The condition number of the preconditioned system is independent of the polynomial degree $P$ and grows with the number of elements, but at modest values of the rank $R$ it remains below order 10 in our experiments, which represents a reduction of more than 11 orders of magnitude from the unpreconditioned system; this reduction is larger for the two-sided fractional derivative than for the one-sided fractional derivative. The corresponding cost is $O(R^2 N_d \log^2 N)+O(R^3 N_d \log N)+O(N_d^2)$. Moreover, by using a structured mesh (uniform or geometric), we can further reduce the computational cost to $O(R^2 N_d\log^2 N) +O(R^3 N_d \log N)+ O(P^2 N\log N)$ for the preconditioned system. We present several numerical tests to illustrate the proposed algorithm using $h$ and $p$ refinements.

 

Butterfly+ Fast Direct Solvers for Highly Oscillatory Problems

Eric Michielssen, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor

 

Fast direct integral equation (IE) solvers for oscillatory problems governed by the Helmholtz and Maxwell's equations constitute an active area of research in applied mathematics and engineering. Direct solvers represent an attractive alternative to iterative schemes for problems that are inherently ill-conditioned and/or involve many excitations (right-hand sides). Present direct solvers leveraging hierarchical (semi-separable) matrix and related constructs rely on low-rank (LR) representations of blocks of discretized forward and inverse IE operators to rein in the CPU and memory requirements of traditional direct methods. Unfortunately, when applied to highly oscillatory problems, the computational complexity and memory requirements of these solvers suffer from a lack of LR compressibility of blocks of the discretized operators.

Recently, we developed a family of butterfly-enhanced direct IE solvers that circumvent this bottleneck. Contrary to LR schemes, butterfly representations are well-suited for compressing highly oscillatory operators and can be used to construct LU and Hierarchical Semi-Separable matrix-based fast direct solvers. The resulting solvers leverage randomized schemes to construct butterfly-compressed approximations of blocks obtained by adding, multiplying, and inverting previously butterfly-compressed matrices. Moreover, to render them applicable to both 2D and 3D scenarios, recursive extensions of classical butterfly structures, termed butterfly+, are called for. The CPU and memory complexities of these solvers are estimated and numerically validated to be O(N log^2 N) and at most O(N^1.5 log N), respectively. Applications of the proposed solvers to radar cross section analysis, wireless propagation in complex environments, and focusing through opaque media involving millions of spatial unknowns will be discussed.

 

Linearized Bregman Iteration Method with Non-negativity Constraint Solution and Applications in Sparse Denoising

Mirjeta Pasha, Kent State University

 

The solution of discrete ill-posed problems that arise in real-life applications has been the aim of many researchers. The Bregman method is a procedure that is very successful when sparsity of the desired solution is involved. The purpose of this work is to describe how to speed up the convergence of the linearized Bregman method and to impose non-negativity constraints on the solution. If we are not sure that the solution is sparse, we transform the problem into the wavelet domain and then apply the proposed iterations. The problem becomes a constrained optimization problem seeking the sparsest solution of minimal norm. We prove the convergence of the sequences that our method yields. Some numerical results are shown at the end. The implementation is done in Matlab.
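
For orientation, the plain (unaccelerated, unconstrained) linearized Bregman iteration for a sparse solution of Au = b is sketched below; the acceleration and the non-negativity constraint are the contributions of the talk and are not shown.

```python
# Plain linearized Bregman iteration for a sparse solution of A u = b
# (sketch only; the talk's acceleration and non-negativity constraint are omitted).
import numpy as np

def shrink(v, mu):
    """Soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)

def linearized_bregman(A, b, mu=5.0, delta=None, iters=2000):
    if delta is None:
        delta = 1.0 / np.linalg.norm(A, 2) ** 2    # step size <= 1/||A||^2
    v = np.zeros(A.shape[1])
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        v = v + A.T @ (b - A @ u)
        u = delta * shrink(v, mu)
    return u

# toy compressed-sensing style test: sparse signal, underdetermined A
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200)) / np.sqrt(80)
x_true = np.zeros(200)
x_true[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
u = linearized_bregman(A, A @ x_true)
print("recovery error:", np.linalg.norm(u - x_true))
```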

 

Sparse Dual of the Density Peaks Algorithm for Cluster Analysis of High-Dimensional Data

Nikos Pitsianis, Aristotle University of Thessaloniki and Duke University

 

Accelerating General Matrix Inverse Direct Solvers with Algorithmic and Software Advance

Olaf Schenk, Universita della Svizzera italiana

 

 

Communication-Avoiding Factorization Algorithms

Edgar Solomonik, University of Illinois at Urbana-Champaign

 

The parallel scalability of dense matrix factorizations is constrained by their communication and synchronization costs. We present new 3D algorithms that minimize interprocessor communication for QR factorization and the symmetric eigenvalue problem. We also present techniques that minimize synchronization cost for these problems. However, the synchronization and communication costs cannot be minimized simultaneously. Using a new general lower bound analysis technique, we formally derive this trade-off for parallel schedules that adhere to the basic dependency structure of matrix factorization algorithms. Nevertheless, the new communication-avoiding algorithms demonstrate practical speed-ups, and we discuss ongoing work on software library abstractions for such 3D algorithms.

 

Fast Column Subset Partition of a Sparse High-dimensional Data Interaction Matrix

Xiaobai Sun, Duke University

 

We introduce a formal description of the column subset partition problem abstracted from various existing algorithms for cluster analysis of data in an interpretable, high-dimensional attribute/feature space. We describe major advances over the past 50+ years and the remaining gaps, in multiple respects, between desirable and partial solutions, especially with high-dimensional data. We then present a robust, linearly scaling algorithm based on the principle and properties of local density peaks. First and foremost, the algorithm is robust with high-dimensional data in the presence of noise and uncertainty. It automatically detects the number of clusters of various sizes and shapes. The algorithm scales linearly with data size, regardless of the feature dimension. In several case studies, the algorithm renders higher accuracy at a much higher pace in comparison to state-of-the-art algorithms.
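
As background (the classical density-peaks quantities in the style of Rodriguez and Laio, not the speakers' sparse dual formulation), each point is scored by its local density rho and its distance delta to the nearest point of higher density; cluster centers are the points for which both are large:

```python
# Classical density-peaks scores, shown only as background for the talk's
# sparse dual reformulation of the same idea.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def density_peak_scores(X, d_c):
    D = squareform(pdist(X))                       # pairwise distances
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1)      # smoothed local density
    delta = np.zeros(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        delta[i] = D[i, higher].min() if higher.any() else D[i].max()
    return rho, delta                              # centers: large rho AND large delta

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.1, size=(100, 2)) for c in (0.0, 1.0, 2.0)])
rho, delta = density_peak_scores(X, d_c=0.2)
print("candidate cluster centers:", np.argsort(rho * delta)[-3:])
```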

 

Accelerated Interpolative Decomposition by the General Proxy Point Method

Xin Xing, Georgia Institute of Technology

 

Interpolative decomposition (ID) is a popular low-rank form widely used in structured matrix representations such as butterfly factorizations and hierarchical matrices. When dealing with a kernel matrix block, e.g., $K(X_0, Y_0)$ with a kernel function $K(x,y)$ and two point clusters $X_0$, $Y_0$, various methods have been proposed to accelerate the ID approximation compared to the more expensive QR-based algebraic approach. In this talk, we summarize the general proxy point method that converts the ID approximation problem for $K(X_0, Y_0)$ to that for $K(X_0, Y_p)$ with a manually selected proxy point set $Y_p$. A rigorous error analysis of the proxy point method is provided. Furthermore, based on the error analysis, we discuss the selection of the proxy point set $Y_p$, which should adapt to the kernel function and to the domains containing $X_0$ and $Y_0$.
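
A minimal sketch of the idea for the 1D Cauchy kernel (generic column-pivoted-QR ID with a hand-picked proxy circle; the talk's general method and its error analysis go well beyond this): the skeleton of $X_0$ is computed from the small block $K(X_0, Y_p)$ instead of the large block $K(X_0, Y_0)$, and the resulting interpolation matrix is reused for the far field.

```python
# Proxy-point ID sketch: compute a row ID from K(X0, Yp) only, then reuse it for K(X0, Y0).
import numpy as np
from scipy.linalg import qr, solve_triangular

def row_id(A, tol=1e-10):
    """Row ID: A ~= P @ A[idx, :], computed via column-pivoted QR of A^T."""
    Q, R, piv = qr(A.T, pivoting=True, mode="economic")
    d = np.abs(np.diag(R))
    k = int(np.sum(d > tol * d[0]))
    T = np.zeros((k, A.shape[0]), dtype=R.dtype)
    T[:, piv] = np.hstack([np.eye(k), solve_triangular(R[:k, :k], R[:k, k:])])
    return piv[:k], T.T

K = lambda X, Y: 1.0 / (X[:, None] - Y[None, :])      # Cauchy kernel 1/(x - y)

rng = np.random.default_rng(0)
X0 = rng.uniform(-0.5, 0.5, 200)                      # source cluster
Y0 = np.concatenate([rng.uniform(2, 5, 150), rng.uniform(-5, -2, 150)])  # far field
Yp = 1.2 * np.exp(2j * np.pi * np.arange(50) / 50)    # proxy points on a circle around X0

idx, P = row_id(K(X0, Yp))                            # skeleton chosen from proxies only
err = np.linalg.norm(K(X0, Y0) - P @ K(X0[idx], Y0)) / np.linalg.norm(K(X0, Y0))
print("skeleton size:", len(idx), " relative error:", err)
```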

 

Kernel Matrix Compression with Proxy Points

Xin Ye, University of Minnesota

 

It has long been known in potential theory that, for kernels evaluated at well-separated points, fast low-rank compression can be achieved via the use of proxy points. However, a simple algebraic understanding of the idea has been missing. In this work, we use contour integration to clearly justify the idea and provide rigorous accuracy bounds for the approximation. We also show how to choose optimal proxy points for the compression. The study is done in terms of certain typical cases, and can be generalized to many other cases. This work gives an efficient and reliable way of computing low-rank approximations to certain important matrices. This is joint work with Jianlin Xia and Lexing Ying.
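
In its simplest one-dimensional instance (the Cauchy kernel; the talk treats more general typical cases), the contour-integration justification reads:

```latex
% For K(x,y) = 1/(x-y), a point x enclosed by a contour \Gamma and a
% well-separated point y outside \Gamma, Cauchy's integral formula gives
\frac{1}{x-y} \;=\; \frac{1}{2\pi i}\oint_{\Gamma} \frac{dz}{(z-x)(z-y)}
\;\approx\; \sum_{j=1}^{N_p} \frac{w_j}{(z_j-x)(z_j-y)} ,
% where the quadrature nodes z_j on \Gamma play the role of the proxy points and
% the w_j are the quadrature weights; the right-hand side is a degenerate
% (separable) kernel, which is the source of the low-rank compression.
```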

 

AMPS: A Real-time Mesh Cutting Algorithm for Surgical Simulations

Yu Hong Yeung, Purdue University

 

We present the AMPS algorithm, a finite element solution method that combines principal submatrix updates and Schur complement techniques, well suited for interactive simulations of deformation and cutting of finite element meshes. Our approach features real-time solutions of the updated stiffness matrix systems to account for interactive changes in mesh connectivity and boundary conditions. Updates are accomplished through an augmented matrix formulation of the stiffness equations that maintains consistency with changes to the underlying model without refactorization at each time step. As changes accumulate over multiple simulation time steps, the augmented solution algorithm enables tens or hundreds of updates per second. Acceleration schemes that exploit sparsity, memoization, and parallelization allow the updates to be computed in real time. The complexity analysis and experimental results for this method demonstrate that it scales linearly with the problem size. Results for cutting and deformation of 3D elastic models are reported for meshes with node counts up to 50,000, and involve models of astigmatism surgery and the brain.
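
One generic way to see the augmented-matrix idea (a standard identity, not necessarily the exact AMPS formulation): if cutting changes the stiffness matrix by a local modification K -> K + E S E^T, the updated system can be solved through an augmented system that keeps the original factorization of K.

```latex
% If the update is \Delta K = E S E^{\mathsf T} with E tall and sparse and S invertible, then
\begin{pmatrix} K & E \\ E^{\mathsf T} & -S^{-1} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} f \\ 0 \end{pmatrix}
\quad\Longleftrightarrow\quad
(K + E S E^{\mathsf T})\, x = f ,
% so the factorization of K can be reused and only the small augmented part
% changes from one time step to the next.
```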

 

Intrinsic Complexity and its Scaling Law: From Approximation of Random Vectors and Random Fields to High Frequency Waves

Hongkai Zhao, UC Irvine

 

We characterize the intrinsic complexity of a set in a metric space by the least dimension of a linear space that can approximate the set to a given tolerance. This is dual to the characterization using Kolmogorov n-width, the distance from the set to the best n-dimensional linear space. We study the approximation of random vectors (via principal component analysis a.k.a. singular value decomposition) and random fields (via Karhunen–Loeve expansion) as well as the approximate separability of the Green’s function of the Helmholtz equation for high frequency waves. We provide lower bounds and upper bounds for the intrinsic complexity and its asymptotic scaling law.
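
In symbols, with the Kolmogorov n-width of a set S in a normed space H defined below, the intrinsic complexity described above is its inverse function: the least dimension of a linear space achieving a prescribed tolerance.

```latex
% Kolmogorov n-width of a set S in a normed space H:
d_n(S) \;=\; \inf_{\substack{V_n \subset H \\ \dim V_n = n}} \;\sup_{f \in S}\; \inf_{g \in V_n} \| f - g \| .
% Intrinsic complexity at tolerance \epsilon (least dimension of a linear
% space approximating S to within \epsilon):
n_{\epsilon}(S) \;=\; \min \{\, n : d_n(S) \le \epsilon \,\} .
```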

 

A Second Order Fully-discrete Linear Energy Stable Scheme for a Binary Compressible Viscous Fluid Model

Xueping Zhao, University of South Carolina

 

We present a linear, second-order, fully discrete numerical scheme on a staggered grid for a thermodynamically consistent hydrodynamic phase field model of binary compressible fluid flow mixtures derived from the generalized Onsager Principle. The hydrodynamic model not only possesses the variational structure, but also warrants mass and linear momentum conservation as well as energy dissipation. We first reformulate the model in an equivalent form using the energy quadratization method and then discretize the reformulated model in time with the Crank-Nicolson method to obtain a semi-discrete partial differential equation system. The numerical scheme so derived preserves mass conservation and the energy dissipation law at the semi-discrete level. Then, we discretize the semi-discrete PDE system on a staggered grid in space to arrive at a fully discrete scheme using a second-order finite difference method, which respects a discrete energy dissipation law. We prove the unique solvability of the linear system resulting from the fully discrete scheme. Mesh refinements and two numerical examples, on phase separation due to spinodal decomposition in two polymeric fluids and on interface evolution in a gas-liquid mixture, are presented to show the convergence property and the usefulness of the new scheme in applications.
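
The energy quadratization step follows the usual pattern (a generic sketch; the model's full free energy and hydrodynamic coupling are more involved): a non-quadratic bulk energy density is rewritten through an auxiliary variable so that the energy becomes quadratic and a linear, energy-stable Crank-Nicolson discretization is possible.

```latex
% Generic energy quadratization (EQ) sketch: for a free energy
% E[\phi] = \int_\Omega \tfrac{1}{2}|\nabla\phi|^2 + F(\phi)\,dx with F bounded below,
% introduce the auxiliary variable
q \;=\; \sqrt{\,2\big(F(\phi)+B\big)\,}, \qquad B > 0,
% so that E[\phi,q] = \int_\Omega \tfrac{1}{2}|\nabla\phi|^2 + \tfrac{1}{2}q^2 - B \, dx
% is quadratic in (\nabla\phi, q), and q obeys the linear-in-q evolution law
\frac{\partial q}{\partial t} \;=\; \frac{F'(\phi)}{\sqrt{2\big(F(\phi)+B\big)}}\,\frac{\partial \phi}{\partial t},
% which a Crank-Nicolson discretization turns into a linear, energy-stable scheme.
```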