Abstract

Reduced-order models (ROMs) are very popular for surrogate modeling of full-order computational fluid dynamics (CFD) simulations, allowing for real-time approximation of complex flow phenomena. However, their application to CFD models such as large eddy simulation (LES) and direct numerical simulation (DNS) is limited due to the highly chaotic and multi-scale nature of resolved turbulent flow. Due to the large amounts of noise present in small-scale turbulent structures, error propagation becomes a major issue, making long-term prediction of unsteady flow infeasible. While linear subspace methods like dynamic mode decomposition (DMD) can be used to pre-process turbulent flow data to remove small-scale structures, this often requires a very large number of modes and a non-trivial mode selection process. In this work, a ROM framework using Koopman \(\beta\)-variational autoencoders (\(\beta\)-VAEs) is introduced for reduced-order modeling of large-scale turbulence. The Koopman operator captures the variation of a non-linear dynamical system through a linear representation of state observables. By constraining the latent space of a \(\beta\)-VAE to grow linearly using a Koopman loss function, small-scale turbulent structures are filtered out in reconstructions of input data and latent variables are denoised in an unsupervised manner so that they can be sufficiently modeled over time. Combined with an LSTM ensemble for time series prediction of latent variables, the model is tested on LES flow past a Windsor body at multiple yaw angles, showing that the Koopman \(\beta\)-VAE can effectively denoise latent variables and remove small-scale structures from reconstructions while acting globally over multiple cases.

1 Introduction

Reduced-order models (ROMs) have become an indispensable tool in engineering design processes, allowing for accurate and rapid real-time approximations of full-order models (FOMs) of high-fidelity complex physical systems [1]. ROMs create a low-dimensional surrogate model of the FOM using a limited amount of high-fidelity training data. ROMs are trained during a computationally expensive offline stage, in which the FOM is run to collect training data and the surrogate model is then trained on these data. An inexpensive online stage involves rapidly evaluating the ROM at non-evaluated designs and/or timesteps. Approaches used to develop the surrogate model include linear subspace methods like the proper orthogonal decomposition (POD) [2] and dynamic mode decomposition (DMD) [3], as well as deep neural networks, particularly convolutional autoencoders (CAEs) [4]. CAEs allow for a non-linear relationship between the full-order states and a low-dimensional latent space, which can drastically lower the number of variables required for reasonable accuracy when compared to linear methods, particularly for highly non-linear problems [4]–[6].

In fluid dynamics, ROMs have been shown to offer very good accuracy for laminar flow [7], [8] and for problems utilizing Reynolds-averaged Navier–Stokes (RANS) turbulence models [9], [10]. However, ROMs face difficulty in producing accurate results for scale-resolving turbulent simulations such as large eddy simulation (LES) [11], [12]. Turbulent flow involves an energy cascade [13] where energy is transferred from large, coherent structures like vortices to increasingly smaller eddies that are chaotic in nature. At larger scales, the flow is organized and driven by external forces. As energy cascades down to smaller scales, flow interactions become increasingly disordered until viscous forces dominate and energy is dissipated. The highly non-linear, unsteady, multi-scale, and chaotic nature of scale-resolved turbulent flows makes them difficult to model accurately in a low-dimensional framework with either physics-based or data-driven models. When applied to LES, ROMs suffer from instability, a loss of accuracy over time, and limited generalizability. Predictions become less accurate over time due to error propagation, making ROMs unsuitable for long-term prediction of unsteady flow.

When using surrogate models in computational fluid dynamics (CFD), it is often the case that obtaining predictions of the large-scale flow behavior that is organized and recurring is sufficient to guide design optimization processes. The large-scale turbulent structures are coherent, contain most of the flow’s kinetic energy, and define the macroscopic properties of the flow. Conversely, the highly chaotic nature of small-scale turbulent structures can make them intractably difficult to model over time. This is particularly problematic when using time series models like long short-term memory (LSTM) or transformer neural networks [14], as the latent variables contain large amounts of noise, making it difficult to learn patterns even when using large models. While turbulent flow data can be pre-processed using DMD by selecting modes corresponding to large-scale structures, this process is often cumbersome due to the large number of modes required for reasonable accuracy and the difficulty in formulating a selection criterion for modes. Furthermore, DMD cannot be applied to multiple turbulent flow datasets simultaneously, hindering its use in design optimization.

The Koopman operator [15], [16] is a linear, infinite-dimensional operator that captures the evolution of a non-linear dynamical system using a linear representation of state observables. The Koopman operator is also instrumental in DMD [17], allowing for the extraction of spatio-temporal modes from data which are found as approximate eigenfunctions and eigenvalues of the Koopman operator. By constraining state observables to evolve linearly, the Koopman operator can allow for filtering out small-scale turbulent fluctuations while retaining large-scale coherent flow structures. In the context of autoencoders, the Koopman operator has previously been implemented to forecast sequential data such as time series [18], [19], utilizing the ability to model complex non-linear dynamics in a linearly evolving latent space. Koopman autoencoders have also been used for dynamical systems governed by physics. Lusch et al. [20] utilize the Koopman autoencoder in deep neural networks to extract eigenfunctions of dynamical systems including fluid flow by identifying non-linear coordinates on which the dynamics are globally linear through the use of Koopman-based loss functions. Otto et al. [21] impose a linear dynamics constraint on the latent space through the use of a simple loss function in order to extract Koopman eigenfunctions. A work by Nayak et al. [22] uses Koopman autoencoders with physics-informed constraints for reduced-order modeling of kinetic plasmas. Each of these works directly models the subsequent state using the autoencoder rather than reconstructing the input, and none focuses on highly chaotic problems. To the best of our knowledge, no prior work has used Koopman autoencoders for the purpose of filtering out small-scale structures from turbulent flow.

In this work, we introduce the use of Koopman \(\beta\)-variational autoencoders (\(\beta\)-VAEs) for unsupervised filtering of small-scale turbulent structures from input fields and denoising latent variables. This allows for the extraction of only large-scale coherent structures which correspond to a smoothly varying latent space that is able to be used for reduced-order modeling. \(\beta\)-VAEs are a probabilistic formulation of autoencoders that encourage latent variables to follow a target distribution, most often Gaussian, a useful property for attaining latent variables that are similar in both magnitude and structure. A Koopman loss function is used to constrain the latent variables to grow linearly in time, which also denoises them and as a result filters out small-scale chaotic fluctuations in reconstructions of input fields. A training procedure is also detailed that avoids overfitting to temporal patterns present within the training data. For time series forecasting of latent variables, LSTM ensemble models are used as they lead to improved robustness and stability over long time horizons [8]. The ROM is tested on LES flow past a Windsor body [23], a standardized automotive benchmark case, at 5 different yaw angles, highlighting the ability of the Koopman \(\beta\)-VAEs to act globally over multiple cases to smooth latent variables.

2 Methods

This section presents the methods used for the ROM implementing Koopman \(\beta\)-VAEs. Background information on \(\beta\)-VAEs is given, and the implementation of the Koopman loss function is described along with the training procedure for the neural network. The time series prediction component of the ROM, an LSTM ensemble utilizing bootstrap aggregation introduced by Halder et al. [8], is described briefly.

2.1 Koopman \(\beta\)-Variational Autoencoders

2.1.1 \(\beta\)-Variational Autoencoders

Standard convolutional autoencoders reconstruct the high-dimensional input state \(\boldsymbol{x} \in \mathbb{R}^{n_x \times n_y}\) using a combination of two individual feedforward neural networks, the encoder \(f_{\text{enc}}\) and decoder \(f_{\text{dec}}\),

\[f: \boldsymbol{\hat{x}} = f_{\text{dec}} \circ f_{\text{enc}}(\boldsymbol{x}),\]

where \(f_{\text{enc}}\) maps from the input to the latent vector \(\boldsymbol{z} \in \mathbb{R}^{k}\) and \(f_{\text{dec}}\) takes the latent vector as an input and provides an approximate reconstruction \(\boldsymbol{\hat{x}}\). Standard autoencoders impose no constraints or regularization on the latent space, allowing the encoder to map inputs to arbitrary regions without encouraging properties such as smoothness or continuity. The loss function typically used for standard CAEs in ROM applications is the mean square error (MSE),

\[\mathcal{L}_{\text{CAE}} = \mathcal{L}_{\text{MSE}}.\]

Variational autoencoders, first introduced by Kingma and Welling [24], incorporate a probabilistic framework. Rather than the encoder mapping inputs to a fixed latent vector, a probabilistic encoder \(q_{\phi}(\boldsymbol{z|x})\) maps the inputs to the parameters of a latent distribution, most often a multivariate Gaussian,

\[q_{\phi}(\boldsymbol{z|x}) = \mathcal{N}(\boldsymbol{z}; \boldsymbol{\mu}_{\phi}(\boldsymbol{x}), \boldsymbol{\sigma}^2_{\phi}(\boldsymbol{x})),\]

where \(\boldsymbol{\mu}_{\phi}\) and \(\boldsymbol{\sigma}^2_{\phi}\) are the mean and variance respectively. Instead of mapping to a single latent vector, the encoder in a VAE consists of two neural networks that output the mean and variance of the approximate posterior distribution over the latent variables. The decoder \(p_{\theta}(\boldsymbol{x|z})\) in VAEs remains deterministic, and latent vectors are sampled from the encoder outputs \(\boldsymbol{z} \sim \mathcal{N}(\boldsymbol{\mu}_{\phi}(\boldsymbol{x}), \boldsymbol{\sigma}^2_{\phi}(\boldsymbol{x}))\). Since sampling from the encoder’s distribution is non-differentiable and prevents backpropagation, a reparameterization trick is employed, expressing the sampled latent vector as

\[\boldsymbol{z} = \boldsymbol{\mu}_{\phi}(\boldsymbol{x}) + \boldsymbol{\sigma}_{\phi}(\boldsymbol{x})\epsilon,\]

where \(\epsilon \sim \mathcal{N}(0, I)\) is a source of randomness that allows for indirect sampling of \(\boldsymbol{z}\). The original work by Kingma and Welling can be consulted for further detail. By incorporating a probabilistic framework, VAEs allow for regularizing the latent variables to closely follow a target distribution, typically an isotropic Gaussian. Some desirable properties of isotropic Gaussians are that the variables are independent of each other and within the same numerical range. To encourage this, the loss function is augmented by the Kullback–Leibler (KL) divergence (\(D_{\text{KL}}\)), a measure of dissimilarity between two distributions, between the encoder distribution and the prior \(p(\boldsymbol{z}) = \mathcal{N}(0, I)\),

\[\mathcal{L}_{\text{VAE}} = \mathcal{L}_{\text{MSE}} + D_{\text{KL}}(q_{\phi}(\boldsymbol{z|x}), p(\boldsymbol{z})).\]

To control the impact of the KL divergence term and subsequently the level of regularization, \(\beta\)-VAEs were introduced by Higgins et al. [25], which simply weight the KL divergence term of the loss function by a regularization coefficient \(\beta\),

\[\mathcal{L}_{\beta\text{-VAE}} = \mathcal{L}_{\text{MSE}} + \beta D_{\text{KL}}(q_{\phi}(\boldsymbol{z|x}), p(\boldsymbol{z})).\]

In practice, \(\beta\) is often slowly and linearly increased from 0 to a maximum value \(\beta_{\text{max}}\) over a number of epochs after which it remains constant to allow for better stability during training [26]. \(\beta\)-VAEs have been used for a wide number of tasks, including learning interpretable generative latent factors for images [25], anomaly detection [27], and non-linear modal analysis of fluid flow [28].
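As a minimal sketch of the quantities defined in this subsection, the reparameterized sample and the \(\beta\)-VAE loss for a single sample can be written in plain Python. The function names are hypothetical, the encoder is assumed to output the log-variance rather than the variance (a common convention that is not stated above), and the KL term uses the standard closed form for a diagonal Gaussian against \(\mathcal{N}(0, I)\).

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence of N(mu, diag(sigma^2)) from N(0, I)."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def mse(x, x_hat):
    """Mean square error between a flattened input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def beta_vae_loss(x, x_hat, mu, log_var, beta):
    """MSE plus the beta-weighted KL divergence."""
    return mse(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Toy two-dimensional latent space (values chosen only for illustration).
rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, math.log(0.25)]
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
loss = beta_vae_loss([1.0, 2.0], [1.0, 1.0], mu, log_var, beta=1e-4)
```

The KL term vanishes only when the encoder outputs zero mean and unit variance, which is what pulls the latent variables toward a common numerical range.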

2.1.2 Koopman Loss Function

To retain only the large-scale coherent structures in the autoencoder reconstructions given raw turbulent flow data, the goal is to regularize the latent variables in a manner that filters out the small-scale chaotic fluctuations. If small-scale flow structures are retained, the latent variables become highly noisy and difficult to model. Encouraging the latent variables to evolve linearly according to a trainable Koopman operator matrix \(\boldsymbol{A} \in \mathbb{R}^{k \times k}\) can lead to latent variables that exhibit a smooth evolution over time, resulting in reconstructions that do not include noise from small-scale structures and are able to be sufficiently modeled. The Koopman operator constrains latent variables to evolve linearly in time,

\[\boldsymbol{z}_{t+1} = \boldsymbol{A}\boldsymbol{z}_{t},\]

where \(\boldsymbol{z}_{t}\) are the latent variables for a given temporal snapshot. The \(\beta\)-VAE loss function is augmented by a Koopman loss term \(\mathcal{L}_{\text{Koop}}\),

\[\mathcal{L}_{\text{Koop}} = \dfrac{\left\lVert\boldsymbol{z}_{t+1} - \boldsymbol{Az}_{t}\right\rVert^{2}} {\left\lVert\boldsymbol{z}_{t+1}\right\rVert^2}.\]

The loss term is the relative \(L^2\) error between the sampled latent vector at \(t+1\) and its Koopman approximation. A relative error is used so that the magnitudes of the latent variables are not arbitrarily driven to zero in order to minimize the loss. Augmenting the \(\beta\)-VAE loss function with the Koopman loss results in \(\mathcal{L}_{\text{K}\beta\text{-VAE}}\) given as

\[\mathcal{L}_{\text{K}\beta\text{-VAE}} = \mathcal{L}_{\text{MSE}} + \beta D_{\text{KL}}(q_{\phi}(\boldsymbol{z|x}), p(\boldsymbol{z})) + \alpha \mathcal{L}_{\text{Koop}},\]

where \(\alpha\) is a regularization weight for the Koopman loss. The Koopman operator matrix \(\boldsymbol{A}\) is initialized as the identity matrix and all of its elements are trainable. While a vanilla CAE can be used for our desired purpose, in practice it leads to poor stability of the Koopman operator during training. Since vanilla CAEs impose no regularization on the latent space, it is often the case that the magnitudes of the latent variables vary drastically and are sensitive to the initialization of the neural network. As a result, the elements of \(\boldsymbol{A}\) tend to overfit to these trends in order to minimize the loss and fail to denoise and smooth the latent variables. By using a \(\beta\)-VAE and constraining the latent variables to be similar in magnitude, the training process is more stable and the desired result is more easily attained. Independence of latent variables, which is often the reason \(\beta\)-VAEs are preferred, cannot be easily attained when implementing the Koopman loss. This would require that \(\boldsymbol{A}\) remain diagonal, which is too rigid a constraint when attempting to encourage linear growth over time.
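The Koopman loss can be illustrated with a small, self-contained sketch in plain Python. The helper names are hypothetical, and averaging the loss over the consecutive pairs of a temporally ordered batch is an assumption, since the reduction over a batch is not specified above. No safeguard is included for a zero-norm \(\boldsymbol{z}_{t+1}\).

```python
import math


def matvec(A, z):
    """Multiply the k x k matrix A (list of rows) by the vector z."""
    return [sum(a * v for a, v in zip(row, z)) for row in A]


def identity(k):
    """Identity matrix, the initial value of the trainable operator A."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def koopman_loss(A, z_t, z_next):
    """Relative squared L2 error between z_{t+1} and A z_t."""
    pred = matvec(A, z_t)
    num = sum((a - b) ** 2 for a, b in zip(z_next, pred))
    den = sum(a * a for a in z_next)
    return num / den


def mean_koopman_loss(A, Z):
    """Average the loss over consecutive pairs of a temporally ordered batch."""
    pairs = list(zip(Z[:-1], Z[1:]))
    return sum(koopman_loss(A, z0, z1) for z0, z1 in pairs) / len(pairs)


# Toy latent trajectory: a planar rotation, which evolves exactly linearly.
theta = 0.1
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
Z = [[1.0, 0.0]]
for _ in range(20):
    Z.append(matvec(R, Z[-1]))

loss_identity = mean_koopman_loss(identity(2), Z)  # A at initialization
loss_rotation = mean_koopman_loss(R, Z)            # A matching the dynamics
```

In this toy case the loss is zero once \(\boldsymbol{A}\) matches the rotation, whereas the identity initialization leaves a residual of \(2 - 2\cos\theta\). The off-diagonal entries of the rotation matrix also illustrate why restricting \(\boldsymbol{A}\) to be diagonal would be too rigid.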

For proper implementation of the Koopman loss, training mini-batches containing temporally ordered data must be used. However, this can lead to lower reconstruction accuracy as the network overfits to temporal patterns in the data. Temporally consecutive data snapshots also exhibit some degree of correlation, which violates the independent and identically distributed (i.i.d.) assumption that VAEs make about the training data. To mitigate this, the model is pre-trained using only the standard \(\beta\)-VAE loss (\(\alpha = 0\)) with randomly shuffled mini-batches for a prescribed number of epochs \(e_{\text{pre}}\) so that the network can better fit the overall trends of the training data and learn a more structured latent space. After pre-training, temporally ordered mini-batches are used to train the network for \(e_{\text{Koop}}\) epochs, with \(\alpha\) slowly and linearly raised from 0 to a maximum value \(\alpha_{\text{max}}\) over a number of epochs. In practice, we find that allocating approximately 2/3 of the total number of epochs to \(e_{\text{pre}}\) and 1/3 to \(e_{\text{Koop}}\) leads to the best performance.

Figure 1: Schematic of the Koopman \(\beta\)-VAE, where small-scale structures are filtered out from turbulent input data through the use of the Koopman loss function.

2.2 LSTM Ensembles

Neural network architectures such as transformers have been used in non-intrusive ROMs for time series prediction of low-dimensional coefficients [29], [30] as they offer a more powerful and expressive architecture compared to LSTMs. However, like with all time series models, error propagation is a major issue when using transformers. Small errors made in early predictions can compound over time, leading to large inaccuracies as the prediction horizon grows. The model’s performance becomes very sensitive to the initialization of its parameters, an undesirable property for practical use. A common approach used to mitigate this issue is ensemble learning [31], where multiple individual models, or weak learners, are combined and have their predictions aggregated to create a more stable and robust model. While the individual models exhibit high variance, their combined results provide more accurate and stable predictions. Additionally, this significantly diminishes the effect of the model’s initial parameters.

Bootstrap aggregating, often referred to as bagging [32], is a widely used ensemble learning approach that simply averages the results of multiple weak learners that are trained on different subsets of the training data chosen through random selection with replacement, as shown in Figure 2, where each subset has the same number of data points as the original dataset. By training multiple models on different subsets of the data, the weak learners become diverse and learn different patterns of the training data. As a result, aggregating their outputs leads to a large reduction in variance and therefore to more stable and accurate results over time. Bagging also allows models to be trained in parallel as they are independent of each other, making it more computationally efficient than other ensemble methods like boosting [33], which requires sequential training of weak learners. As an increasing number of weak learners are added to the ensemble, the reduction in variance plateaus, leading to diminishing returns in model performance. Beyond a certain point, the added computational expense of including more weak learners in the ensemble exceeds the marginal gains in performance. The optimal number of weak learners to use is highly problem dependent and difficult to know a priori. LSTMs are chosen over transformers because they have a significantly lower training cost and fewer hyperparameters to tune, while still being powerful time series prediction models. A many-to-one LSTM architecture is used to predict \(\boldsymbol{z}\) a single temporal snapshot ahead given a window of the previous \(w\) latent variables. During inference, predictions are re-incorporated into the current window, leading to autoregressive forecasting.

Figure 2: An example of bootstrapping, where random subsets of the original dataset are chosen through sampling with replacement.
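The bootstrapping step can be sketched as follows in plain Python. The helper names are hypothetical, and the pairing of each window of \(w\) latent vectors with the next latent vector as its target is an assumption that follows the many-to-one setup described above.

```python
import random


def make_windows(Z, w):
    """Pair each window of w consecutive latent vectors with the next one."""
    return [(Z[i:i + w], Z[i + w]) for i in range(len(Z) - w)]


def bootstrap_subsets(samples, n_learners, rng):
    """One resampled training set per weak learner.

    Each subset is drawn with replacement and has the same size as the
    original set, so some samples repeat and others are left out.
    """
    return [rng.choices(samples, k=len(samples)) for _ in range(n_learners)]


# Toy latent trajectory with k = 1 and ten snapshots.
Z = [[float(t)] for t in range(10)]
pairs = make_windows(Z, w=4)
subsets = bootstrap_subsets(pairs, n_learners=3, rng=random.Random(0))
```

Each weak learner would then be trained on its own entry of `subsets`, which is what makes the learners diverse before their predictions are averaged.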

2.3 Reduced-Order Model

The ROM implemented in this work involves a computationally expensive offline stage where high-fidelity snapshots are computed by solving the FOM and both the Koopman \(\beta\)-VAE and LSTM ensemble are trained. A number of \(T_i\) initial snapshots are first assembled into a snapshot matrix \(\boldsymbol{X} \in \mathbb{R}^{(n_d \times T_i) \times n_x \times n_y \times n_c}\), where \(n_d\) is the number of parameters present in the training data, \(n_x\) and \(n_y\) are the number of grid points in each direction, with \(n_x \times n_y = N\), and \(n_c\) is the number of channels, one for each velocity component. The Koopman \(\beta\)-VAE is first pre-trained on randomly shuffled mini-batches from \(\boldsymbol{X}\) for \(e_{\text{pre}}\) epochs without the Koopman loss function (\(\alpha = 0\)) to allow the network to learn a set of well-structured latent variables and to avoid overfitting to temporal patterns in the data. Next, the model is trained for \(e_{\text{Koop}}\) epochs on temporally ordered mini-batches to smooth and denoise the latent variables so that small-scale turbulent structures are filtered out in reconstructions. The encoder means \(\boldsymbol{\mu}\) from each sample are used to construct a matrix of latent variables \(\boldsymbol{Z} \in \mathbb{R}^{T_i \times k}\). The means are used as they represent the most likely latent representations of each sample, providing a deterministic way of describing the flow dynamics. Using the latent variables, \(l\) individual LSTMs with the same initial weights and biases are trained on different subsets of sequences of length \(w\) generated from \(\boldsymbol{Z}\) through selection with replacement.

During the online stage, the last \(w\) temporal snapshots of \(\boldsymbol{Z}\) are used as the initial LSTM window \(\boldsymbol{Z}_t\) for autoregressive forecasting of latent variables until the end of the desired time horizon at time \(T\). At each step, the predictions of the individual LSTMs are averaged to compute \(\boldsymbol{z}_{t+1}\), which is incorporated into the current window. Finally, the full-order states are predicted using the decoder \(\boldsymbol{\tilde{X}} = p_{\theta}\left(\boldsymbol{Z}\right)\). The ROM is outlined in Figure 3.

Figure 3: Offline and online stages of Koopman \(\beta\)-VAE ROM
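The autoregressive part of the online stage can be sketched in plain Python as below. The weak learners are hypothetical stand-ins (simple linear extrapolators rather than trained LSTMs), and the final decoding of the forecast latent variables is omitted.

```python
def ensemble_forecast(learners, window, n_steps):
    """Autoregressively forecast n_steps latent vectors.

    learners: callables mapping a window (list of w latent vectors) to the
              next latent vector; stand-ins for the trained LSTMs.
    window:   the last w latent vectors of Z.
    """
    window = [list(z) for z in window]
    forecast = []
    for _ in range(n_steps):
        predictions = [learner(window) for learner in learners]
        # Bagging: average the weak learners' predictions component-wise.
        z_next = [sum(p) / len(p) for p in zip(*predictions)]
        forecast.append(z_next)
        # Slide the window: drop the oldest entry, append the new prediction.
        window = window[1:] + [z_next]
    return forecast


def make_toy_learner(slope):
    """Hypothetical weak learner: linear extrapolation with a given slope."""
    def predict(window):
        return [b + slope * (b - a) for a, b in zip(window[-2], window[-1])]
    return predict


learners = [make_toy_learner(s) for s in (0.5, 1.0, 1.5)]
forecast = ensemble_forecast(learners, window=[[0.0], [1.0]], n_steps=3)
```

Because every predicted \(\boldsymbol{z}_{t+1}\) is fed back into the window, any error in the averaged prediction is carried into later steps, which is why the variance reduction from averaging matters over long horizons.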

3 Results

The test case used in this work involves large eddy simulations of flow past a Windsor body, a simplified square-back vehicle shown in Figure 4, at 5 different yaw angles \(\delta\) = [2.5, 5, 7.5, 10, 12.5] degrees. Data are interpolated onto the plane in the turbulent wake depicted in black. The flow is simulated at a Reynolds number \(Re_L = U_\infty L/ \nu = 2.9 \times 10^6\), where \(U_\infty\) is the freestream velocity, \(L\) is the body length, and \(\nu\) is the kinematic viscosity. SOD2D (Spectral high-Order coDe 2 solve partial Differential equations), a low-dissipation GPU-based spectral element method (SEM) code [34], is used to solve the spatially filtered Navier-Stokes equations,

\[\frac{\partial \bar{u}_i}{\partial x_i} = 0, \label{eq:continuity}\tag{1}\]

\[\frac{\partial \bar{u}_i}{\partial t} + \frac{\partial \bar{u}_i \bar{u}_j }{\partial x_j} - \nu \frac{\partial^2 \bar{u}_i }{\partial x_j \partial x_j} + \frac{1}{\rho} \frac{\partial \bar{p}}{\partial x_i} = \frac{\partial \tau_{ij}}{\partial x_j}, \label{eq:momentum}\tag{2}\]

where \(x_i\) are the spatial coordinates (\(x, y\), and \(z\)), \(u_i\) are the velocity components (\(u, v\), and \(w\)), \(p\) is the pressure, and \(\rho\) is the density. The filtered variables are represented using a bar. The right-hand side of Equation 2 represents the subgrid stresses, with the anisotropic part represented as

\[\tau_{ij} - \frac{1}{3}\tau_{kk}\delta_{ij} = -2\nu_{\text{sgs}}\bar{\mathcal{S}}_{ij},\]

where the large-scale strain rate tensor \(\bar{\mathcal{S}}_{ij}\) is evaluated as \(\bar{\mathcal{S}}_{ij} = \frac{1}{2}\left(g_{ij} + g_{ji} \right),\) \(g_{ij} = \partial \bar{u}_i / \partial x_j\), and \(\delta_{ij}\) is the Kronecker delta function. The unresolved flow scales are modeled using a local formulation of the integral length scale approximation presented in a work by Lehmkuhl et al. [35]. The near-wall region is modeled using the Reichardt wall-law [36] using an exchange location in the 5th node [37].

The velocity components \(u\) and \(v\) are interpolated onto a uniform grid measuring 384 \(\times\) 256 points on the x-y plane at \(z/L = 0.186\) with bounds \(x/L \in [1, 1.6]\) and \(y/L \in [-0.2, 0.2]\).

Figure 4: Geometry of the Windsor body (gray) and the plane in the wake onto which flow data are interpolated (black).

At each angle, 640 snapshots spanning 58 convective time units, \(t=L/U_{\infty}\), are generated and split into 80% training and 20% test data, corresponding to \(T_i = 512\) training snapshots and 128 test snapshots per angle, for a total of 2560 training and 640 test samples. To provide an example of the flow topology, Figure 5 presents an instantaneous representation of the flow for the case at \(\delta=12.5\). More details on the case description, flow evaluation, and validation of the numerical methodology can be found in the AC-1.12 entry of the ERCOFTAC Knowledge Wiki [38].
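The sample counts above follow from simple arithmetic, sketched here in plain Python. The chronological split (first 80% for training, last 20% for testing) is an assumption consistent with forecasting from the end of the training window, and the helper name is hypothetical.

```python
n_snapshots = 640
yaw_angles = [2.5, 5.0, 7.5, 10.0, 12.5]

# Integer arithmetic avoids floating-point rounding in the 80/20 split.
n_train = n_snapshots * 80 // 100      # T_i
n_test = n_snapshots - n_train

total_train = n_train * len(yaw_angles)
total_test = n_test * len(yaw_angles)


def chronological_split(snapshots, n_train):
    """Keep temporal order: earliest snapshots train, latest ones test."""
    return snapshots[:n_train], snapshots[n_train:]


train_ids, test_ids = chronological_split(list(range(n_snapshots)), n_train)
```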

Figure 5: Instantaneous \(Q\) criterion isocontours at \(Q=100\) and \(\delta=12.5\) portraying the streamwise velocity magnitude.

Before training the Koopman \(\beta\)-VAE, the snapshot data are pre-processed using feature-wise min-max scaling to the range [0, 1]. Data normalization improves model performance and allows for learning the optimal network parameters at a faster rate [39]. A sigmoid activation function is used in the output layer of the model to constrain the range of the outputs to match the range of the inputs, after which the data are unscaled to the original range. The model is trained for 1500 epochs, with \(e_{\text{pre}}\) = 1000 and \(e_{\text{Koop}}\) = 500. A batch size of 64 is used for both training stages. The full architecture of the Koopman \(\beta\)-VAE can be found in Appendix 5. A comparison to a model without pre-training is given in Appendix 6. When trained on an NVIDIA H100 GPU, the model takes approximately 1200 seconds to train. Values of \(\beta_{\text{max}} = 1 \times 10^{-4}\) and \(\alpha_{\text{max}} = 1\) are used. The variation of both parameters is shown in Figure 6. \(\beta\) is increased from 0 to \(\beta_{\text{max}}\) linearly over 100 epochs, in the first 10% of the pre-training procedure, after which it remains constant. \(\alpha\) remains at 0 for the pre-training procedure to retain the vanilla \(\beta\)-VAE loss, after which it is increased linearly from 0 to \(\alpha_{\text{max}}\) over 100 epochs in the first 20% of \(e_{\text{Koop}}\). Both values are increased linearly from 0 to add stability to the training process and avoid sudden over-regularization of latent variables. \(\beta_{\text{max}}\) was chosen to maximize the reconstruction accuracy whilst retaining a latent space that remains well-structured (similar latent variable magnitudes) after the implementation of the Koopman loss. A comparison to a Koopman CAE (\(\beta_{\text{max}}\) = 0) is given in Appendix 7. The effect of different values of \(\alpha_{\text{max}}\) is given in Appendix 8.

A bidirectional LSTM architecture [40] with three hidden layers each consisting of 96 neurons is used for the LSTM ensemble. A dropout rate [41] of 0.2 is used in each hidden layer. Again, the data are pre-processed using min-max scaling and a sigmoid activation is used in the output layer. Bidirectional LSTMs process data in both forward and backward directions, allowing the model to leverage both past and future context. A window size of \(w\) = 64 is used. \(l = 128\) weak learners are used and trained consecutively in parallel across 4 NVIDIA H100 GPUs, with a single model taking approximately 60 seconds to train. Training sequences are generated at each yaw angle and the ensemble model is trained on all of them simultaneously using randomly shuffled mini-batches of size 64. All hyperparameters were chosen through a trial-and-error process. At each yaw angle, inference over the 128 test snapshots takes approximately 24 seconds, which is negligible when compared to the time required for simulating the FOM.
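The feature-wise min-max scaling applied to both the snapshots and the latent variables can be sketched in plain Python as follows. The function names are hypothetical, and fitting the minima and maxima on the training data only is an assumption. A feature that is constant over the training data would need separate handling, as it leads to division by zero here.

```python
def fit_min_max(rows):
    """Per-feature minimum and maximum over the samples (rows)."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(rows, lo, hi):
    """Map each feature to [0, 1] using its own minimum and maximum."""
    return [[(v - a) / (b - a) for v, a, b in zip(row, lo, hi)]
            for row in rows]


def unscale(rows, lo, hi):
    """Inverse transform back to the original range."""
    return [[v * (b - a) + a for v, a, b in zip(row, lo, hi)]
            for row in rows]


data = [[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]
lo, hi = fit_min_max(data)
scaled = scale(data, lo, hi)
restored = unscale(scaled, lo, hi)
```

Since a sigmoid output lies in (0, 1), it matches the range of the scaled targets, and `unscale` returns the network outputs to physical units.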

Figure 6: Koopman \(\beta\)-VAE hyperparameters.
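The schedules for \(\beta\) and \(\alpha\) can be sketched in plain Python using the values reported above. The function names are hypothetical, and the exact epoch at which each ramp reaches its maximum (here, 100 epochs after it starts) is an assumption.

```python
def linear_ramp(epoch, start, length, max_value):
    """Zero before `start`, linear rise over `length` epochs, then constant."""
    fraction = (epoch - start) / length
    return max_value * min(1.0, max(0.0, fraction))


def hyperparameters(epoch, e_pre=1000, ramp=100, beta_max=1e-4, alpha_max=1.0):
    """Return (beta, alpha, use_ordered_batches) for a given epoch."""
    beta = linear_ramp(epoch, 0, ramp, beta_max)        # starts at epoch 0
    alpha = linear_ramp(epoch, e_pre, ramp, alpha_max)  # starts after pre-training
    use_ordered_batches = epoch >= e_pre                # shuffled batches before
    return beta, alpha, use_ordered_batches


schedule = [hyperparameters(e) for e in range(1500)]
```

For example, `hyperparameters(1050)` gives the full \(\beta_{\text{max}}\), half of \(\alpha_{\text{max}}\), and temporally ordered mini-batches.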

3.1 Reconstruction Accuracy

The reconstruction accuracy of the Koopman \(\beta\)-VAE is compared against a vanilla CAE using the same architecture and latent dimension that is trained for 1000 epochs using only the MSE loss function (\(\beta = 0\)) on randomly shuffled mini-batches. We expect the reconstruction accuracy of the Koopman \(\beta\)-VAE to be lower, as small-scale turbulent structures are filtered out of its reconstructions. The metric used to assess performance is the relative \(L^2\) error \(\epsilon\) between the raw data and the reconstruction,

\[\epsilon = \dfrac{\left\lVert\boldsymbol{x} - \hat{\boldsymbol{x}}\right\rVert^{2}} {\left\lVert\boldsymbol{x}\right\rVert^2}.\]

The turbulent kinetic energy \(TKE\) is also measured to quantify the reduction in velocity fluctuations from the mean flow, which are dominated by small-scale turbulent structures,

\[TKE = \frac{1}{2} \sum_{t=1}^{T_i} \left( (u^t - \bar{u})^2 + (v^t - \bar{v})^2 \right),\]

where \(\bar{u}\) and \(\bar{v}\) are the mean x and y velocity respectively. Table 1 shows both the overall and component-wise relative errors averaged over all of the training data. As expected, the vanilla CAE reconstructs the snapshots much more accurately due to the inclusion of small-scale turbulent structures. Both models exhibit significantly higher errors in reconstructing \(v\) compared to \(u\); structures corresponding to the spanwise velocity generally correspond to smaller and more chaotic scales, while the streamwise velocity is more spatially and temporally coherent. Using the Koopman \(\beta\)-VAE results in lower reconstruction accuracy due to the exclusion of small-scale structures. As a result, there is a significant decrease in the average turbulent kinetic energy.

Table 1: Reconstruction error and turbulent kinetic energy comparison for training data.
Model	\(\epsilon\)	\(\epsilon_u\)	\(\epsilon_v\)	\(TKE\)
Vanilla CAE	0.147	0.114	0.254	2.84e3
Koopman \(\beta\)-VAE	0.252	0.188	0.443	2.04e3
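The two metrics reported in Table 1 can be sketched in plain Python as follows. The function names are hypothetical. The turbulent kinetic energy is evaluated here at a single grid point from its time series, since the reduction over the grid is not specified above.

```python
def relative_l2_error(x, x_hat):
    """Squared L2 norm of the error divided by the squared L2 norm of x."""
    num = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    den = sum(a * a for a in x)
    return num / den


def turbulent_kinetic_energy(u_series, v_series):
    """Half the summed squared fluctuations about the temporal mean.

    u_series[t] and v_series[t] are the velocity components at one grid
    point for snapshot t.
    """
    n = len(u_series)
    u_mean = sum(u_series) / n
    v_mean = sum(v_series) / n
    return 0.5 * sum((u - u_mean) ** 2 + (v - v_mean) ** 2
                     for u, v in zip(u_series, v_series))


error = relative_l2_error([3.0, 4.0], [3.0, 2.0])
tke = turbulent_kinetic_energy([1.0, 3.0], [2.0, 2.0])
```

A reconstruction that removes fluctuations lowers `tke` while raising `error`, which is the trade-off seen between the two models in Table 1.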

Figures 7-11 show contour plots of the velocity magnitude from the raw LES input data and reconstructions from both autoencoders at all yaw angles at \(t = 256\). The vanilla CAE does well at reconstructing most scales of the flow and fine-scale details are well-preserved, although not entirely. Using the Koopman \(\beta\)-VAE leads to the bulk properties of the flow being retained while finer small-scale structures are largely filtered out. While this leads to lower reconstruction accuracy, the flow is easier to model, as shown by latent variable samples in Figures 12-16. The latent variables produced by the Koopman \(\beta\)-VAE exhibit a much smoother variation over time when compared to the ones produced by the vanilla CAE, which show high levels of noise and are thus difficult to model over time. The numerical range of the latent variables produced by the Koopman \(\beta\)-VAE remains consistent, following that of an isotropic Gaussian. As the vanilla CAE imposes no constraint on the latent variables, their magnitudes can vary significantly. The large decrease in turbulent kinetic energy, the loss of fine detail in reconstructions, and the smoother latent variables together show that the Koopman \(\beta\)-VAE effectively filters out small-scale turbulent structures from input data. Additionally, the model can act on multiple datasets simultaneously, a notable advantage over methods like DMD.

Figure 7: Velocity magnitude comparison at \(t = 256\) and \(\delta = 2.5\).

Figure 8: Velocity magnitude comparison at \(t = 256\) and \(\delta = 5\).

Figure 9: Velocity magnitude comparison at \(t = 256\) and \(\delta = 7.5\).

Figure 10: Velocity magnitude comparison at \(t = 256\) and \(\delta = 10\).

Figure 11: Velocity magnitude comparison at \(t = 256\) and \(\delta = 12.5\).

Figure 12: Latent variable comparison at \(\delta = 2.5\).

Figure 13: Latent variable comparison at \(\delta = 5\).

Figure 14: Latent variable comparison at \(\delta = 7.5\).

Figure 15: Latent variable comparison at \(\delta = 10\).

Figure 16: Latent variable comparison at \(\delta = 12.5\).

The individual components of the training loss are shown in Figure 17. \(\mathcal{L}_{\text{MSE}}\) decreases monotonically until \(\mathcal{L}_{\text{Koop}}\) is activated at \(e = 1000\), at which point it increases sharply and then gradually decreases, although not to its previous level because small-scale fluctuations are excluded. Although \(\mathcal{L}_{\text{MSE}}\) continues to drop, the number of epochs is not increased, as this leads to lower predictive performance on the test data due to overfitting. \(\mathcal{L}_{\text{KL}}\) initially decreases rapidly, then gradually increases and reaches a near-constant value before \(\mathcal{L}_{\text{Koop}}\) is activated. Initially, as \(\beta\) increases, the model focuses on reducing \(\mathcal{L}_{\text{KL}}\) to regularize the latent space. Once \(\beta\) reaches its maximum value and remains constant, the decoder starts to utilize latent information to improve reconstructions, causing \(\mathcal{L}_{\text{KL}}\) to slowly increase. After \(\mathcal{L}_{\text{Koop}}\) is activated, \(\mathcal{L}_{\text{KL}}\) increases sharply and then fluctuates slightly. This is similar to the behavior of \(\mathcal{L}_{\text{Koop}}\), which first decreases rapidly and then exhibits small fluctuations. Due to the implementation of the Koopman operator \(\boldsymbol{A}\), the latent variables are no longer independent of each other after the pre-training stage. However, the marginal distributions are still approximately Gaussian, which allows the latent variables to be similar in magnitude. When using the Koopman operator with a vanilla CAE, where magnitudes vary, denoising is not achieved, as shown in Appendix 7.
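
The behavior of \(\mathcal{L}_{\text{KL}}\) described above is easier to follow with the standard closed form for a diagonal Gaussian posterior in mind. The following is a minimal sketch of that textbook expression and of reparameterized sampling from \(\boldsymbol{\mu}\) and \(\log(\boldsymbol{\sigma}^2)\); the function names are hypothetical, and the normalization used for \(\mathcal{L}_{\text{KL}}\) in this work may differ.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

rng = random.Random(0)
latent_dim = 10  # latent size listed in Table 3

# A posterior identical to the prior carries no KL penalty.
kl_prior = kl_to_standard_normal([0.0] * latent_dim, [0.0] * latent_dim)

# Shifting the means and shrinking the variances (more information in z) raises it.
mu = [0.5] * latent_dim
log_var = [-1.0] * latent_dim
kl_informative = kl_to_standard_normal(mu, log_var)
z = reparameterize(mu, log_var, rng)

print(kl_prior, kl_informative, len(z))
```

This is consistent with the trend in Figure 17: when the decoder begins to rely more heavily on the latent variables, the posterior moves away from the prior and \(\mathcal{L}_{\text{KL}}\) rises.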

Figure 17: Koopman \(\beta\)-VAE training losses.

3.2 ROM Results

Table 2 shows the overall and component-wise predictive accuracy of both the reconstructions and ROM averaged over all of the test data. The errors are significantly higher than for the training data, which is expected for highly non-linear data. Again, the spanwise errors are significantly greater than the streamwise ones, because they correspond to small-scale fluctuations. However, the errors in the streamwise velocity, which is representative of the bulk flow properties, are reasonable. The ROM produces errors that are close to those of the reconstructions (0.442 versus 0.392 overall), showing that the LSTM ensemble can reasonably predict the evolution of the latent variables over time with a good degree of stability.

Figures 18-22 show velocity magnitude contours of the raw LES data, ROM predictions, and absolute errors at each yaw angle at \(t = T_i + [40, 80, 120]\). The errors are highest at \(\delta = [2.5, 12.5]\), which is expected as they are at the extremes of the training data range. At all angles and prediction points, errors are concentrated in the shear layers and near-wake region. Both areas are highly chaotic, especially the near-wake region where both shear layers mix and small-scale structures dominate. The errors do not exhibit a growing trend over time, highlighting the stability of the ROM's predictions, which can be attributed to the use of an ensemble model. While the ROM predictions visually differ greatly from the ground truth LES data, the intended goal of the ROM framework is to accurately capture the bulk properties of the flow corresponding to large-scale structures over unseen time horizons in real-time.

Figures 23-27 contain latent variables from both reconstructions and ROM predictions of the test data at each yaw angle. While the latent variables of the test data are denoised, they are not denoised to the same extent as those of the training data, which had the Koopman operator directly applied to them. The ROM predicts the overall pattern of the latent variables well, with peaks and troughs being effectively tracked. The predictions at \(\delta = [5, 7.5, 10]\) are most accurate, where the patterns of the test data are very well matched. The latent variables are predicted better initially, after which there is some divergence from the test data, although this does not grow drastically. Again, the predictions are less accurate at \(\delta = [2.5, 12.5]\), where the latent variable frequencies vary greatly from the other angles. As an ensemble model was used, the latent variable predictions are largely insensitive to the initial weights and biases of the LSTM model, and the choice of seed has minimal impact on the predictions.
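
The reduced sensitivity to initialization can be illustrated with a toy example. The sketch below assumes that the ensemble prediction is the arithmetic mean of the member predictions (the aggregation rule actually used is defined in Section 2.2 and may differ), and it replaces each trained LSTM with a hypothetical stand-in: a smooth signal plus seed-dependent error.

```python
import math
import random

def member_prediction(seed, n_steps):
    """Toy stand-in for one trained LSTM: a smooth signal plus seed-dependent error."""
    rng = random.Random(seed)
    return [math.sin(0.1 * t) + rng.gauss(0.0, 0.2) for t in range(n_steps)]

def mse(pred, truth):
    """Mean squared error between two equally long sequences."""
    return sum((p - q) ** 2 for p, q in zip(pred, truth)) / len(truth)

n_steps = 200
truth = [math.sin(0.1 * t) for t in range(n_steps)]

# Each seed plays the role of a different weight initialization.
members = [member_prediction(seed, n_steps) for seed in range(8)]

# Assumed aggregation rule: average the members at every time step.
ensemble = [sum(vals) / len(vals) for vals in zip(*members)]

member_mse = [mse(m, truth) for m in members]
mean_member_mse = sum(member_mse) / len(member_mse)
ensemble_mse = mse(ensemble, truth)

print(f"mean member MSE: {mean_member_mse:.4f}")
print(f"ensemble MSE:    {ensemble_mse:.4f}")
```

Because the squared error is convex, the error of the averaged prediction can never exceed the average error of the individual members, so any single poorly initialized member has a limited effect on the result.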

Table 2: Prediction error comparison for test data.
Prediction	\(\epsilon\)	\(\epsilon_u\)	\(\epsilon_v\)
Reconstruction	0.392	0.286	0.704
ROM	0.442	0.323	0.787

Figure 18: ROM predictions and absolute errors at \(\delta = 2.5\).

Figure 19: ROM predictions and absolute errors at \(\delta = 5\).

Figure 20: ROM predictions and absolute errors at \(\delta = 7.5\).

Figure 21: ROM predictions and absolute errors at \(\delta = 10\).

Figure 22: ROM predictions and absolute errors at \(\delta = 12.5\).

Figure 23: ROM latent variable prediction at \(\delta = 2.5\).

Figure 24: ROM latent variable prediction at \(\delta = 5\).

Figure 25: ROM latent variable prediction at \(\delta = 7.5\).

Figure 26: ROM latent variable prediction at \(\delta = 10\).

Figure 27: ROM latent variable prediction at \(\delta = 12.5\).

4 Conclusion

In this work, a ROM framework utilizing Koopman \(\beta\)-variational autoencoders is introduced for reduced-order modeling of large-scale turbulent flow structures. Surrogate modeling of scale-resolving turbulent flow simulations such as large eddy simulation poses a challenge due to the presence of chaotic small-scale turbulent structures which are difficult to accurately model over time using both physics-based and data-driven methods. Often, the goal of surrogate modeling is to only approximate the macroscopic, bulk flow properties which contain most of the kinetic energy to reasonably inform the design optimization process in real-time. While linear subspace methods like dynamic mode decomposition exist to extract spatio-temporal modes from turbulent flow data, they often require a very large number of modes for satisfactory accuracy and the mode selection process is non-trivial. Additionally, DMD cannot be applied to multiple datasets simultaneously, limiting its use. To this end, Koopman \(\beta\)-VAEs are implemented as an alternative to vanilla CAEs to filter small-scale turbulent structures from turbulent flow data in an unsupervised manner. Through the use of a Koopman loss function which encourages a linear evolution of the latent space in time, latent variables are efficiently denoised and are able to be sufficiently modeled over time. \(\beta\)-VAEs are used as they produce a well-structured latent space that allows for a more efficient implementation of the Koopman loss.

When applied to a dataset involving turbulent flow past a Windsor body at multiple yaw angles, the Koopman \(\beta\)-VAE is shown to efficiently filter out small-scale turbulent structures from input data in reconstructions and denoise latent variables. An ensemble LSTM is used as the time series prediction model and can predict latent variables over unseen time horizons with acceptable accuracy and high stability, with ROM errors matching reconstruction errors closely and not growing drastically over time. To maximize predictive performance and avoid overfitting to temporal patterns, the model is pre-trained on randomly shuffled mini-batches before being trained on temporally ordered mini-batches, which are required for correct implementation of the Koopman loss function. A limitation of the method is that it is not readily applicable to unstructured meshes and flow data must be interpolated onto a uniform grid. Additionally, the filtering of small-scale structures is unsupervised and a threshold for which flow scales are retained cannot be set explicitly. Future work will focus on extending the method to unstructured meshes using graph neural networks and applying it to three-dimensional flows.

Acknowledgements

The research leading to this work has been partially funded by the project TIFON with reference PLEC2023-010251/AEI/10.13039/501100011033. Benet Eiximeno’s work was funded by a contract from the Subprograma de Ayudas Predoctorales given by the Ministerio de Ciencia e Innovacion (PRE2021-096927). Oriol Lehmkuhl has been partially supported by a Ramon y Cajal postdoctoral contract (Ref: RYC2018-025949-I). The authors acknowledge the support given by the Departament de Recerca i Universitats de la Generalitat de Catalunya to the Large-scale Computational Fluid Dynamics Research Group (Code: 2021 SGR 00902). We also acknowledge the Barcelona Supercomputing Center for awarding us access to the MareNostrum V machine based in Barcelona, Spain.

5 Koopman \(\beta\)-VAE Architecture

The architecture of the Koopman \(\beta\)-VAE used in this work is shown in Table 3. The encoder and decoder consist of a series of convolutional and convolutional transpose layers with a kernel size of 3 and a stride of 2 followed by batch normalization layers [42], which help reduce internal covariate shift and lead to more efficient gradient flow. In the encoder, the number of filters is doubled in each subsequent convolutional layer while the spatial resolution is reduced by a factor of 2, which allows the network to learn progressively more abstract features, while the opposite is done in the decoder. Two fully connected layers are present in both the encoder and decoder; although this increases the total number of parameters, it also leads to greatly improved reconstruction accuracy. Before the latent space given in the table, the encoder feeds separately into \(\boldsymbol{\mu}\) and \(\log(\boldsymbol{\sigma}^2)\), from which \(\boldsymbol{z}\) is sampled. The log of the variance is used as this leads to improved numerical stability when training the network, a common approach when training VAEs. Tanh activation functions are used for all convolutional and fully connected layers except the latent space, which uses no activation function, and the output layer, which uses a sigmoid activation function to constrain the data to the pre-processing range. In total, the model contains 25,993,050 parameters. PyTorch [43] is used to train the model in addition to the LSTM ensemble.

Table 3: Koopman \(\beta\)-VAE architecture.
Layer	Number of Filters	Kernel Size	Activation Function	Size of Output
Input				384 \(\times\) 256 \(\times\) 2
Convolutional	16	3 \(\times\) 3	Tanh	192 \(\times\) 128 \(\times\) 16
Batch Norm				192 \(\times\) 128 \(\times\) 16
Convolutional	32	3 \(\times\) 3	Tanh	96 \(\times\) 64 \(\times\) 32
Batch Norm				96 \(\times\) 64 \(\times\) 32
Convolutional	64	3 \(\times\) 3	Tanh	48 \(\times\) 32 \(\times\) 64
Batch Norm				48 \(\times\) 32 \(\times\) 64
Convolutional	128	3 \(\times\) 3	Tanh	24 \(\times\) 16 \(\times\) 128
Batch Norm				24 \(\times\) 16 \(\times\) 128
Convolutional	256	3 \(\times\) 3	Tanh	12 \(\times\) 8 \(\times\) 256
Batch Norm				12 \(\times\) 8 \(\times\) 256
Reshape				24576
Fully Connected			Tanh	512
Fully Connected (Latent Space)				10
Fully Connected			Tanh	512
Fully Connected			Tanh	24576
Reshape	256			12 \(\times\) 8 \(\times\) 256
Convolutional Transpose	128	3 \(\times\) 3	Tanh	24 \(\times\) 16 \(\times\) 128
Batch Norm				24 \(\times\) 16 \(\times\) 128
Convolutional Transpose	64	3 \(\times\) 3	Tanh	48 \(\times\) 32 \(\times\) 64
Batch Norm				48 \(\times\) 32 \(\times\) 64
Convolutional Transpose	32	3 \(\times\) 3	Tanh	96 \(\times\) 64 \(\times\) 32
Batch Norm				96 \(\times\) 64 \(\times\) 32
Convolutional Transpose	16	3 \(\times\) 3	Tanh	192 \(\times\) 128 \(\times\) 16
Batch Norm				192 \(\times\) 128 \(\times\) 16
Convolutional Transpose	2	3 \(\times\) 3	Sigmoid	384 \(\times\) 256 \(\times\) 2
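
As a consistency check on Table 3, the parameter count can be tallied layer by layer. The sketch below is plain arithmetic rather than the authors' implementation, and it rests on several assumptions that are not stated explicitly in the text: every convolutional and fully connected layer has a bias, each batch normalization layer has one learnable scale and one shift per channel, \(\boldsymbol{\mu}\) and \(\log(\boldsymbol{\sigma}^2)\) are produced by two separate 512-to-10 layers, and the Koopman operator \(\boldsymbol{A}\) is a learnable \(10 \times 10\) matrix without bias. The helper names are hypothetical.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a 2D (transpose) convolution with a k x k kernel."""
    return c_in * c_out * k * k + c_out

def bn_params(channels):
    """Learnable scale and shift of a batch normalization layer."""
    return 2 * channels

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

latent_dim = 10
enc_channels = [2, 16, 32, 64, 128, 256]  # input channels, then filters per layer
dec_channels = [256, 128, 64, 32, 16, 2]  # mirror of the encoder
flat = 12 * 8 * 256                       # 24576, size after the reshape

total = 0
# Encoder: every convolution is followed by batch normalization.
for c_in, c_out in zip(enc_channels[:-1], enc_channels[1:]):
    total += conv_params(c_in, c_out) + bn_params(c_out)
total += fc_params(flat, 512)
# Assumption: separate heads for mu and log(sigma^2).
total += 2 * fc_params(512, latent_dim)
# Decoder fully connected layers.
total += fc_params(latent_dim, 512) + fc_params(512, flat)
# Decoder: the final transpose convolution has no batch normalization after it.
for i, (c_in, c_out) in enumerate(zip(dec_channels[:-1], dec_channels[1:])):
    total += conv_params(c_in, c_out)
    if i < len(dec_channels) - 2:
        total += bn_params(c_out)
# Assumption: the Koopman operator A is a learnable latent_dim x latent_dim matrix.
total += latent_dim * latent_dim

print(total)  # 25993050
```

Under these assumptions the tally reproduces the reported total of 25,993,050 parameters, with the two 24576-by-512 fully connected layers accounting for the large majority of them.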

6 Effect of Pre-training

To avoid overfitting to temporal patterns present within the training data and provide a well-structured latent space before using the Koopman loss function, the model is pre-trained on randomly shuffled mini-batches using only the \(\beta\)-VAE loss. The hyperparameter scheduling for a model trained only on temporally ordered mini-batches without a pre-training stage is shown in Figure 28. Again, \(\beta\) is linearly increased to its maximum value over the first 10% of the total number of epochs, while \(\alpha\) is increased over the first 20%. Figure 29 compares each component of the training loss between the pre-trained and non pre-trained models. For the non pre-trained model, both \(\mathcal{L}_{\text{MSE}}\) and \(\mathcal{L}_{\text{KL}}\) are significantly higher and plateau quickly, while \(\mathcal{L}_{\text{Koop}}\) decreases rapidly and also plateaus early in the training procedure. This suggests that the model quickly overfits to temporal patterns in the training data due to the use of temporally ordered mini-batches. This also poses a problem for \(\mathcal{L}_{\text{KL}}\), as a key assumption of VAEs is that samples are i.i.d. Additionally, implementing the Koopman loss function immediately further hinders the model's ability to learn a well-structured latent space, as there are two competing regularizing loss functions. Table 4 shows the overall and component-wise reconstruction errors for the training data from both models, where the pre-trained model offers markedly better performance in all metrics. Figure 30 shows latent variable samples from the non pre-trained model at \(\delta = 7.5\). Large amounts of noise are retained in the latent variables and the scales do not match those of a standard Gaussian, highlighting the importance of allowing the latent variables to become well-structured before implementing the Koopman loss function.
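
The linear warm-up described for the non pre-trained model can be sketched as follows. Only the warm-up fractions (10% for \(\beta\), 20% for \(\alpha\)) come from the text; the function name, the total number of epochs, and the maximum values are placeholders chosen for illustration.

```python
def linear_warmup(epoch, total_epochs, max_value, warmup_fraction):
    """Ramp linearly from 0 to max_value over the first warmup_fraction of training."""
    warmup_epochs = warmup_fraction * total_epochs
    if warmup_epochs <= 0:
        return max_value
    return max_value * min(1.0, epoch / warmup_epochs)

total_epochs = 2000  # placeholder, not the value used in this work
beta_max = 1.0       # placeholder
alpha_max = 1.0      # placeholder

schedule = {}
for epoch in (0, 100, 200, 400, 1999):
    beta = linear_warmup(epoch, total_epochs, beta_max, 0.10)
    alpha = linear_warmup(epoch, total_epochs, alpha_max, 0.20)
    schedule[epoch] = (beta, alpha)
    print(epoch, round(beta, 3), round(alpha, 3))
```

With both weights ramping up from the first epoch, the KL and Koopman regularizers are active at the same time while the reconstruction loss is still large, which is the situation the pre-training stage is designed to avoid.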

Figure 28: Koopman \(\beta\)-VAE hyperparameters for a non pre-trained model.

Figure 29: Training loss comparisons with a non pre-trained model.

Table 4: Effect of pre-training on reconstruction errors for the training data.
Model	\(\epsilon\)	\(\epsilon_u\)	\(\epsilon_v\)
Pre-trained	0.252	0.188	0.443
Non pre-trained	0.332	0.248	0.582

Figure 30: Non pre-trained Koopman \(\beta\)-VAE latent variable samples at \(\delta = 7.5\).

7 Comparison to a Koopman CAE

The rationale behind using a \(\beta\)-VAE is that it results in a well-structured latent space with latent variables being similar in magnitude and distribution, leading to a more stable implementation of the Koopman loss function and better denoising. Results are compared to a Koopman CAE with the same architecture, trained using the same hyperparameter setup and number of epochs as in Figure 6 except with \(\beta = 0\). A comparison of the training losses is shown in Figure 31. As expected, \(\mathcal{L}_{\text{MSE}}\) is lower for the Koopman CAE in both training stages. \(\mathcal{L}_{\text{Koop}}\) is also lower for the Koopman CAE, which can be attributed to the lack of any constraint being placed on the latent space. Figure 32 shows two latent variable samples at \(\delta = 7.5\) from the Koopman CAE; in spite of attaining a significantly lower final value of \(\mathcal{L}_{\text{Koop}}\), the latent variables fail to be adequately denoised. When applied to a CAE, where the latent variables are unconstrained and exhibit high variance in scale, the Koopman operator \(\boldsymbol{A}\) is forced to reduce the loss while fitting to spurious patterns within the latent space. While the latent variables do follow an approximately linear evolution in time and small-scale structures are filtered out, their mismatched scales result in a non-smooth variation. The Koopman operators from both models are shown in Figure 33. There are no significant differences between them; as expected, the diagonal elements are close to 1, while the off-diagonal elements are much smaller and similar in scale. Figure 34 shows velocity magnitude comparisons between the raw LES data, Koopman CAE, and Koopman \(\beta\)-VAE at \(t = 256\) and \(\delta = 7.5\). While the reconstructions from both models are similar, the latent variables produced by the Koopman CAE are much more difficult to predict over time due to the amount of noise present.
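
To make the role of \(\boldsymbol{A}\) concrete, the sketch below evaluates a one-step linear residual on a time-ordered sequence of latent vectors. It assumes that the Koopman loss takes the common form of a mean squared difference between \(\boldsymbol{z}_{t+1}\) and \(\boldsymbol{A}\boldsymbol{z}_t\); the exact definition in Section 2.1.2 may differ. The function names, the \(2 \times 2\) operator, and the toy trajectories are hypothetical.

```python
def matvec(a, z):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a_ij * z_j for a_ij, z_j in zip(row, z)) for row in a]

def koopman_loss(latents, a):
    """Mean squared one-step residual z_{t+1} - A z_t over a time-ordered batch."""
    total, count = 0.0, 0
    for z_t, z_next in zip(latents[:-1], latents[1:]):
        pred = matvec(a, z_t)
        total += sum((p - q) ** 2 for p, q in zip(pred, z_next))
        count += len(z_next)
    return total / count

# Toy operator with diagonal entries near 1 and small off-diagonal entries.
A = [[0.99, 0.05], [-0.05, 0.99]]

# Trajectory that evolves exactly linearly under A.
linear_traj = [[1.0, 0.0]]
for _ in range(50):
    linear_traj.append(matvec(A, linear_traj[-1]))

# Same trajectory with an alternating perturbation standing in for small-scale noise.
noisy_traj = [[z0 + 0.1 * (-1) ** t, z1] for t, (z0, z1) in enumerate(linear_traj)]

loss_linear = koopman_loss(linear_traj, A)
loss_noisy = koopman_loss(noisy_traj, A)
print(loss_linear, loss_noisy)
```

The residual vanishes for the linear trajectory and is strictly positive for the perturbed one. The loss is only meaningful when consecutive entries of the batch are consecutive in time, which is why temporally ordered mini-batches are needed once the Koopman term is active.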

Figure 31: Training loss comparison between Koopman \(\beta\)-VAE and CAE.

Figure 32: Koopman CAE latent variable samples at \(\delta = 7.5\).

Figure 33: Koopman operator \(\boldsymbol{A}\) from both models.

Figure 34: Velocity magnitude comparison at \(t = 256\) and \(\delta = 7.5\) with a Koopman CAE.

8 Impact of \(\alpha\)

While the Koopman \(\beta\)-VAE efficiently removes small-scale structures from reconstructions of turbulent input data, a limitation is that a threshold for which flow scales are retained cannot be set explicitly. However, \(\alpha_{\text{max}}\) can be altered to control \(\mathcal{L}_{\text{Koop}}\) and subsequently the degree of linearity. Figure 35 shows a comparison between the three different components of the loss function with values of \(\alpha_{\text{max}} = [0.1, 1, 10]\) using the same hyperparameter scheduling as in Figure 6. Increasing \(\alpha_{\text{max}}\) imposes a stricter constraint on the evolution of the latent space in time, and higher values result in higher values of \(\mathcal{L}_{\text{MSE}}\) and \(\mathcal{L}_{\text{KL}}\). As expected, \(\mathcal{L}_{\text{Koop}}\) significantly decreases as \(\alpha_{\text{max}}\) is increased. Figure 36 shows latent variable samples at \(\delta = 7.5\) for \(\alpha_{\text{max}} = 0.1\) and 10. A value of 0.1 results in under-regularization of the latent space and a failure to denoise. Raising \(\alpha_{\text{max}}\) to 10 results in over-regularization; the range of the latent variables no longer follows that of a standard Gaussian, implying that \(\mathcal{L}_{\text{KL}}\) rises too much. While the latent variables do decrease in frequency, they still exhibit high levels of noise and non-smoothness, suggesting that coherent structures are also filtered out. This is shown in velocity magnitude comparisons at \(t = 256\) and \(\delta = 7.5\) in Figure 37; using \(\alpha_{\text{max}} = 0.1\) does filter out small-scale structures but retains noise in the latent space, while using \(\alpha_{\text{max}} = 10\) filters out both large and small-scale structures, so that the reconstructed flow resembles the mean. These results show that the ability to both filter out small-scale structures and denoise the latent variables is highly sensitive to \(\alpha_{\text{max}}\) and requires a careful choice, which is likely problem dependent.

Figure 35: Training losses with different values of \(\alpha_{\text{max}}\).

Figure 36: Latent variable samples at \(\delta = 7.5\) and \(\alpha_{\text{max}} = 0.1\) and 10.

Figure 37: Velocity magnitude comparison at \(t = 256\) and \(\delta = 7.5\) for different values of \(\alpha_{\text{max}}\).

References

[1]

David J Lucia, Philip S Beran, and Walter A Silva. Reduced-order modeling: new approaches for computational physics. Progress in Aerospace Sciences, 40(1-2):51–117, 2004.

[2]

G Berkooz, PJ Holmes, and John Lumley. The proper orthogonal decomposition in the analysis of turbulent flows. Annual Review of Fluid Mechanics, 25:539–575, 1993.

[3]

Jonathan H Tu. Dynamic mode decomposition: Theory and applications. PhD thesis, Princeton University, 2013.

[4]

Kookjin Lee and K. Carlberg. Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. J. Comput. Phys., 404, 2020.

[5]

Rakesh Halder, Krzysztof J Fidkowski, and Kevin J Maki. Non-intrusive reduced-order modeling using convolutional autoencoders. International Journal for Numerical Methods in Engineering, 123(21):5369–5390, 2022.

[6]

Romit Maulik, Bethany Lusch, and Prasanna Balaprakash. Reduced-order modeling of advection-dominated systems with recurrent neural networks and convolutional autoencoders. Physics of Fluids, 33(3), 2021.

[7]

J.S. Hesthaven and S. Ubbiali. Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics, 363:55–78, 2018.

[8]

Rakesh Halder, Mohammadmehdi Ataei, Hesam Salehipour, Krzysztof Fidkowski, and Kevin Maki. Reduced-order modeling of unsteady fluid flow using neural network ensembles. Physics of Fluids, 36(7), 2024.

[9]

Kevin Carlberg, Charbel Bou-Mosleh, and Charbel Farhat. Efficient non-linear model reduction via a least-squares Petrov–Galerkin projection and compressive tensor approximations. International Journal for Numerical Methods in Engineering, 86(2):155–181, 2011.

[10]

Ping He, Rakesh Halder, Krzysztof Fidkowski, Kevin Maki, and Joaquim RRA Martins. An efficient nonlinear reduced-order modeling approach for rapid aerodynamic analysis with OpenFOAM. In AIAA Scitech 2021 Forum, page 1476, 2021.

[11]

Nicholas Arnold-Medabalimi, Cheng Huang, and Karthik Duraisamy. Large-eddy simulation and challenges for projection-based reduced-order modeling of a gas turbine model combustor. International Journal of Spray and Combustion Dynamics, 14(1-2):153–175, 2022.

[12]

Annalisa Quaini, Omer San, Alessandro Veneziani, and Traian Iliescu. Bridging large eddy simulation and reduced order modeling of convection-dominated flows through spatial filtering: Review and perspectives. arXiv preprint arXiv:2407.00231, 2024.

[13]

Anthony Leonard. Energy cascade in large-eddy simulations of turbulent fluid flows. In Advances in Geophysics, volume 18, pages 237–248. Elsevier, 1975.

[14]

Ricardo Vinuesa and Steven L Brunton. Enhancing computational fluid dynamics with machine learning. Nature Computational Science, 2(6):358–366, 2022.

[15]

Bernard O Koopman. Hamiltonian systems and transformation in Hilbert space. Proceedings of the National Academy of Sciences, 17(5):315–318, 1931.

[16]

Samuel E Otto and Clarence W Rowley. Koopman operators for estimation and control of dynamical systems. Annual Review of Control, Robotics, and Autonomous Systems, 4(1):59–87, 2021.

[17]

Matthew O Williams, Ioannis G Kevrekidis, and Clarence W Rowley. A data–driven approximation of the Koopman operator: Extending dynamic mode decomposition. Journal of Nonlinear Science, 25:1307–1346, 2015.

[18]

Omri Azencot, N Benjamin Erichson, Vanessa Lin, and Michael Mahoney. Forecasting sequential data using consistent Koopman autoencoders. In International Conference on Machine Learning, pages 475–485. PMLR, 2020.

[19]

Ilan Naiman, N Benjamin Erichson, Pu Ren, Michael W Mahoney, and Omri Azencot. Generative modeling of regular and irregular time series data via Koopman VAEs. arXiv preprint arXiv:2310.02619, 2023.

[20]

Bethany Lusch, J Nathan Kutz, and Steven L Brunton. Deep learning for universal linear embeddings of nonlinear dynamics. Nature Communications, 9(1):4950, 2018.

[21]

Samuel E Otto and Clarence W Rowley. Linearly recurrent autoencoder networks for learning dynamics. SIAM Journal on Applied Dynamical Systems, 18(1):558–593, 2019.

[22]

Indranil Nayak, Mrinal Kumar, and Fernando L Teixeira. Koopman autoencoders for reduced-order modeling of kinetic plasmas. Advances in Electromagnetics Empowered by Artificial Intelligence and Deep Learning, pages 515–542, 2023.

[23]

Gary J Page and Astrid Walle. Towards a standardized assessment of automotive aerodynamic CFD prediction capability-AutoCFD 2: Windsor body test case summary. Technical report, SAE Technical Paper, 2022.

[24]

Diederik P Kingma and Max Welling. Auto-encoding variational Bayes, 2013.

[25]

Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations, 2017.

[26]

Samuel R Bowman, Luke Vilnis, Oriol Vinyals, Andrew M Dai, Rafal Jozefowicz, and Samy Bengio. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349, 2015.

[27]

Furkan Ulger, Seniha Esen Yuksel, and Atila Yilmaz. Anomaly detection for solder joints using \(\beta\)-VAE. IEEE Transactions on Components, Packaging and Manufacturing Technology, 11(12):2214–2221, 2021.

[28]

Hamidreza Eivazi, Soledad Le Clainche, Sergio Hoyas, and Ricardo Vinuesa. Towards extraction of orthogonal and parsimonious non-linear modes from turbulent flows. Expert Systems with Applications, 202:117038, 2022.

[29]

Alberto Solera-Rico, Carlos Sanmiguel Vila, Miguel Gómez-López, Yuning Wang, Abdulrahman Almashjary, Scott TM Dawson, and Ricardo Vinuesa. \(\beta\)-variational autoencoders and transformers for reduced-order modelling of fluid flows. Nature Communications, 15(1):1361, 2024.

[30]

Pin Wu, Feng Qiu, Weibing Feng, Fangxing Fang, and Christopher Pain. A non-intrusive reduced order model with transformer neural network and its application. Physics of Fluids, 34(11), 2022.

[31]

Jorg D Wichard and Maciej Ogorzalek. Time series prediction with ensemble models. In 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), volume 2, pages 1625–1630. IEEE, 2004.

[32]

Leo Breiman. Bagging predictors. Machine Learning, 24:123–140, 1996.

[33]

Yoav Freund, Robert Schapire, and Naoki Abe. A short introduction to boosting. Journal-Japanese Society For Artificial Intelligence, 14(771-780):1612, 1999.

[34]

Lucas Gasparino, Filippo Spiga, and Oriol Lehmkuhl. : A GPU-enabled spectral finite elements method for compressible scale-resolving simulations. Computer Physics Communications, 297:109067, 2024.

[35]

Oriol Lehmkuhl, Ugo Piomelli, and Guillaume Houzeaux. On the extension of the integral length-scale approximation model to complex geometries. International Journal of Heat and Fluid Flow, 78:108422, 2019.

[36]

Hans Reichardt. Vollständige Darstellung der turbulenten Geschwindigkeitsverteilung in glatten Leitungen. ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik, 31(7):208–219, 1951.

[37]

O Lehmkuhl, GI Park, ST Bose, and P Moin. Large-eddy simulation of practical aeronautical flows at stall conditions. Proceedings of the 2018 Summer Program, Center for Turbulence Research, Stanford University, 87, 2018.

[38]

Benet Eiximeno, Oriol Lehmkuhl, and Ivette Rodriguez. , 2025. https://kbwiki.ercoftac.org/w/index.php/AC1-12.

[39]

J. Sola and J. Sevilla. Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Transactions on Nuclear Science, 44:1464–1468, 1997.

[40]

Mike Schuster and Kuldip K Paliwal. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11):2673–2681, 1997.

[41]

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.

[42]

Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456. PMLR, 2015.

[43]

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.

Corresponding author: rakesh.halder@bsc.es

Reduced-order modeling of large-scale turbulence
using Koopman \(\beta\)-variational autoencoders

