October 08, 2025
Molecular clouds (MCs) are active sites of star formation in galaxies, and their formation and evolution are largely affected by stellar feedback. This includes outflows and winds from newly formed stars, radiation from young clusters, and supernova explosions. High-resolution molecular line observations allow for the identification of individual star-forming regions and the study of their integrated properties. Moreover, state-of-the-art simulations are now capable of accurately replicating the evolution of MCs including all key stellar feedback processes. We present \(^{13}\)CO(2-1) synthetic observations of the STARFORGE simulations produced using the radiative transfer code RADMC-3D, matching the observational setup of the SEDIGISM survey. From these, we identified the population of MCs using hierarchical clustering and analysed them to provide insights into the interpretation of observed MCs as they evolve. The flux distributions of the post-processed synthetic observations and the properties of the MCs, namely radius, mass, velocity dispersion, virial parameter and surface density, are consistent with those of SEDIGISM . Both samples of MCs occupy the same regions in the scaling relation plots; however, the average distributions of MCs at different evolutionary stages do not overlap on the plots. This highlights the reliability of our approach in modelling SEDIGISM and suggests that MCs at different evolutionary stages contribute to the scatter in observed scaling relations. We study the trends in MC properties, morphologies, and fragmentation over time to analyse their physical structure as they form, evolve, and are destroyed. MCs appear as small, diffuse cloudlets in early stages, followed by their evolution to filamentary structures, before being shaped by stellar feedback into 3D bubbles and getting dispersed. These trends in the observable properties of MCs are consistent with other realisations of simulations and provide strong evidence that clouds exhibit distinct morphologies over the course of their evolution.
The molecular gas in the interstellar medium (ISM) is hierarchically clustered in the form of molecular clouds [1], [2]. Molecular clouds are turbulent, magnetically supercritical [3] structures with sizes \(\sim 1\)–\(200\) pc [4], [5], mass M \(\sim 10^2\)–\(10^7 ~ \mathrm{M_\odot}\) [6], surface densities \(\Sigma \sim 1\)–\(1000 ~ \mathrm{M_\odot \; pc^{-2}}\) [7] and virial parameter (\(\alpha_{vir}\)) around unity [8], [9]. They show various morphologies [10], with filaments being the most ubiquitous [11]–[16].
The evolution of molecular clouds is often presented in the form of two scenarios, i.e. regulated by either SN-driven turbulence [17], [18] or hierarchical gravitational collapse [19], [20]. These mechanisms lead to the fragmentation and formation of dense cores, which further lead to the formation of protostars. Protostars accrete gas from the surrounding ISM, leading to feedback in the form of bipolar outflows [21]. These outflows (jets) are relatively collimated structures that heat and compress the gas as they interact with the surrounding ISM up to pc-scales [22]–[24]. Outflows can vary significantly in energetics, with momentum rates between \(10^{-5}\) and \(10^{-2} ~\mathrm{M}_\odot ~ \mathrm{km} \mathrm{s}^{-1} ~ \mathrm{yr}^{-1}\), for low-mass and O-type stars, respectively [25], [26].
As the forming protostars move onto the main sequence stage, they begin to drive isotropic stellar winds [27], which help clear out the remaining gas in their envelope. Stellar winds form bubble-shaped cavities around stars [28]–[30] by releasing a significant amount of kinetic energy into the ISM (\(\sim 10^{48}\) erg over the lifetime of the bubble, [31]). In addition to stellar winds, high-energy radiation from massive stars (\(M\ge 8\) \(M_{\odot}\)) ionises the surrounding gas, releasing thermal energy \(\sim 10^{46}\) erg, and forming ionised (H ii) regions [32]–[34]. Stellar winds and photoionising radiations are active during the main-sequence phase of stars (few Myrs) and play a major role in dispersing molecular clouds [35], [36]. The last form of feedback, and perhaps one of the most important in setting the global conditions of the ISM in galaxies, are supernova explosions from massive stars [37], [38]. Supernovae inject a large amount of energy (\(\sim 10^{51}\) ergs) and drive the supersonic turbulence in the ISM on tens to hundreds of parsec scales. [18], [39].
Although various stellar feedback mechanisms have a significant impact on cloud formation, evolution, and dissolution [2], observational studies of molecular clouds rarely attempt to distinguish between clouds affected by different forms of feedback. Clouds are often generalised into a single population of quasi-static entities in near equilibrium [5], [40], and are collectively analysed using scaling relation, such as those of [41] and [42]. However, they are not necessarily hydrostatic structures; as both turbulence support (static clouds) and global hierarchical collapse (GHC; dynamical clouds), cloud formation models produce the same observational signatures [43]. Studying the effects of local feedback events on cloud properties might provide some evidence in support or opposition to these models.
The last decade has led to several high-resolution surveys of the Milky Way, revealing the intricate structure of molecular clouds and allowing their properties to be resolved in detail (COHRS [44], CHIMPS [45], SEDIGISM [46], OGHReS [47], [48]). Complementing such observations, state-of-the-art Giant Molecular Cloud (GMC) simulations are now able to model the ISM at high resolution (e.g. SICLL [49], [50], STARFORGE [51], FIRE [52]), while including most of the physical phenomena related to star formation. Comparison of observations and simulations requires mapping the simulations into observational space using radiative transfer [53] to model the emission that the simulated gas and stars would produce.
One goal of the paper is to use state-of-the-art simulations that incorporate all the relevant physics of star formation to produce synthetic observations that closely resemble observational data. We generate synthetic observations from the STARFORGE (Star Formation in Gaseous Environments) simulations following the observational setup of the SEDIGISM survey [5]. Using RADMC-3D, we produce the \(^{13}\)CO(2-1) line-emission cubes from the simulations and use a dendrogram-based cloud identification technique to extract molecular clouds (MCs). We thus evaluate the extent to which current simulations can reproduce the observed MCs. This work also aims to analyse the observable properties of MCs as they evolve and determine whether their distributions vary across different evolutionary stages, in order to provide some insight into the interpretation of observational molecular cloud surveys. As the cloud properties are comparable to SEDIGISM, we are able to provide a direct comparison of the cloud properties, as they evolve and are affected by stellar feedback.
The structure of the paper is as follows. Section 2 describes the STARFORGE simulations used for our analysis and the SEDIGISM survey, which we use as a observational comparative benchmark. In Section 3, we describe the methodology to create the synthetic observations from STARFORGE (Sect. 3.1), along with the post-processing of the data (Sect. 3.2). We then describe the dendrogram algorithm used to extract stuctures from these synthetic observation cubes in Sect. 3.3, followed by the definition of MC properties. We present our results on the comparison between clouds from the simulations and those from SEDIGISM in Sect. 4.1. We investigate the evolution of the properties and morphologies of clouds over time and connect them to formation of stars in Sect. 4.2. We analyse the scaling relations to compare the correlation between STARFORGE and SEDIGISM cloud properties and understand the time evolution of clouds on these plots in Sect. 5. In Sect. 6, we explain the trends in the properties, morphology, and substructures of MCs over time and connect them to their observed counterparts. We conclude by summarising our work in Sect. 7.
The STARFORGE 2 simulations are three-dimensional radiation magnetohydrodynamic (MHD) simulations that follow the evolution of GMCs, achieving spatial resolutions down to a few tens of AU. These simulations model the formation, evolution, and dynamics of individual stars within a GMC, incorporating all forms of stellar feedback: jets, radiation, stellar winds, and supernovae. The simulation framework is built on the GIZMO code [54], which uses a Lagrangian meshless finite mass (MFM) method to solve MHD equations [55]. A comprehensive explanation of the numerical methods and validation tests is provided in [51].
We use the M2e4a2 suite of STARFORGE simulations (Table 1 of [56]). It follows the evolution of a \(20~000\) M\(_\odot\) GMC3 to \(\sim\) 11 Myr while saving all the properties every 24.7 kyr, thus producing a total of 445 snapshots. Of these, we analyse the 410 snapshots that have significant \(^{13}\)CO(2-1) emission to identify dendrogram structures. The GMC is initiated as a uniform surface density sphere with \(R=10\) pc, \(\alpha_{vir}=2\), and T = 10 K, surrounded by diffuse medium (density contrast of 1000) in a (100 pc)\(^3\) box4. The gas is initiated as fully atomic, but the ionisation state rapidly converges to local equilibrium, such that the interior of the cloud rapidly becomes fully molecular. The calculation we analyse includes the heating and cooling contributions from all key molecular, atomic, nebular, and continuum processes [58]. Although the simulation does not explicitly follow the formation and destruction of H\(_2\), it computes the molecular gas fraction in each cell using a fitting function that depends on the gas metalicity and surface density [59]. We then derive the number density of molecular hydrogen by assuming that the molecular gas mass in each cell is composed of molecular hydrogen.
We restricted our analysis to the gas within the inner 60 pc of the simulation box, as that is sufficient to capture the \(^{13}\)CO(2-1) emission througout the time evolution studied here. As the simulation progresses, the GMC collapses under self-gravity to form protostars (at 0.8 Myr) with outflows. It then forms stars with photoionising radiation (at 2.7 Myr) and stellar winds (at 3.6 Myr), which disperse most of the GMC before the first supernova (at 9.8 Myr) occurs. This follows a global hierarchichal collapse (GHC) scenario with various stellar feedback mechanisms that can contribute to the injection of local turbulence and provide support against local collapse [56], [60].
The Structure, Excitation, and Dynamics of the Inner Galactic InterStellar Medium (SEDIGISM ; see [46], [61] for an overview) survey spans an area of 84 deg\(^2\) within the Galactic longitude range of \(-60^\circ \leq l \leq +18^\circ\) and latitude \(|b| \leq 0.5^\circ\) (with some regional variations). It includes multiple molecular tracers, specifically targeting the \(J = 2\)-\(1\) transitions of \(^{13}\)CO and C\(^{18}\)O. Observations were conducted from 2013 to 2017 using the 12-meter Atacama Pathfinder Experiment (APEX) telescope [62]. The survey provides a contiguous dataset divided into 77 datacubes, each covering approximately \(2^\circ \times 1^\circ\), with a velocity coverage of \(-200\) to 200 \(\mathrm{km \; s}^{-1}\) and a pixel size of \(\ang{;;9.5}\). The first data release (DR1) features \(^{13}\)CO observations with a full width at half maximum (FWHM) beam size of \(\ang{;;28}\) and a typical 1-\(\sigma\) sensitivity of 0.8–1.0 K per 0.25 \(\mathrm{km \; s}^{-1}\).
Using this dataset, [5] constructed a catalogue of 10,663 MCs with their physical properties, further updated by [63], [64] and [10]. MCs were identified using the Spectral Clustering for Interstellar Molecular Emission Segmentation (scimes) algorithm (v.0.3.2, [40], [65]). [5] also defined a science sample that consists of well-resolved clouds with reliable distance estimates. We selected MCs from the science sample that are at distances between 2.5 and 3.5 kpc away for our comparative analysis. This distance rages provides a sufficiently large sample size (of 835 clouds), with a consistent physical resolution (\(\sim 0.3 - 0.5\) pc) and minimises the effects of different sensitivities at different distances. In the following, we will refer to these clouds as SEDIGISM clouds. We use the deconvolved equivalent radius (5), cloud mass (), velocity dispersion (), virial parameter () and surface density () to compare SEDIGISM clouds to synthetic MCs identified in this paper (Sect. 4.1).
We used the RADMC-3D [53] radiative transfer code on STARFORGE simulation to obtain the position-position-velocity (PPV) \(^{13}\)CO(2-1) emission cubes that mimic the SEDIGISM survey. RADMC-3D uses the simulation data as input, including the distribution of the density, temperature, composition of the gas, and sources of radiation, and generates a three-dimensional grid of points that are used to sample the environment and calculate the radiative transfer. We briefly outline the procedure in the following.
We interpolated the STARFORGE data to a uniform grid with resolution \(480 \times 480\) as input for RADMC-3D6, matching the resolution in
SEDIGISM at a distance \(\sim\) 3 kpc. The RADMC-3D inputs are gas temperature, velocity, and the number densities of \(^{13}\)CO and H\(_2\), which acts as
a collision partner. The collional rate coefficients for \(^{13}\)CO are provided by the Leiden Atomic and Molecular Database (LAMDA7). We
obtained the \(^{13}\)CO number density from the H\(_2\) number density estimated from the molecular gas fraction using an abundance ratio \(\mathrm{H_2/CO} =
10^{4}\) [66], [67] and a constant isotopic ratio \(\mathrm{^{12}CO/^{13}CO} = 42.6\) [68]. We also applied a freeze-out criterion by setting the \(^{13}\)CO abundance ratio to 50% in regions with gas temperature below 17 K and gas density above \(10^5 \mathrm{cm}^{-3}\) [69]–[71]. However, various temperature and density thresholds tested for the freeze-out criterion did not strongly affect the distribution of the \(^{13}\)CO(2-1) emission studied in this work. RADMC-3D calculates line profiles based on thermal broadening by default, but allows the inclusion of turbulent widths (microturbulence8). We calculated the microturbulence as the product of the velocity gradient in a cell and the size of the cell, where the velocity gradient for the cells is obtained using the numerical differentiation
functionality of the MESHOID
9 script.
RADMC-3D calculates line emission by obtaining the level population in each grid cell based on a line transfer model. We used the non-LTE approximation together with the Large Velocity Gradient (LVG) mode and the Escape Probability method10. The \(^{13}\)CO(2-1) transition emits at 220.398 GHz or 1360.227 \(\mu\)m. We used this as the central wavelength to
obtain PPV cubes with 65 channels separated by 0.25 kms\(^{-1}\), leading to a velocity coverage of \(\pm\) 8 kms\(^{-1}\). We used the RADMC-3D
image
option with loadlambda
to create PPV cubes of size \(65 \times 448 \times 448\) pixels at a distance of 3 kpc and a pixel size of 9.5(corresponding to 60 pc), in order to be directly
comparable to the SEDIGISM clouds (see Sect. 2.2) We also made use of second-order integrations while creating the image to avoid pixelation. For each snapshot, the \(^{13}\)CO(2-1) cubes were constructed for three different lines of sight, along the \(x\), \(y\) and \(z\) directions (App. 8), using the phi={90, 0, 0} and incl={90, 90, 0} parameters respectively.
Figure 1: Left: A synthetic integrated \(^{13}\)CO(2-1) emission map created with RADMC-3D. Right: The same map after convolution and with noise of sigma = \(0.2 \pm 0.02\) K..
We post-processed the output PPV cube from RADMC-3D (Fig. 1, left) to replicate the characteristics of SEDIGISM observations using the following procedure. First, the cube is convolved with a 2D Gaussian kernel with a full width at half maximum (FWHM) of 28, corresponding to the APEX telescope beam at 220.398 GHz. The data from the noisy (first and last 50) channels of the SEDIGISM G305 cube is to added to the cube as noise. To improve the signal-to-noise ratio, the data is spectrally smoothed at 0.5 kms\(^{-1}\) resolution and resampled to 0.25 kms\(^{-1}\) [5]. An example of an original and post-processed cube is shown in Fig. 1.
To suppress the noise and prepare the data for the analysis of dendrograms, we applied the dilated masking method described in [72], resulting in clean masked cubes (App. 9). We perform two masking iterations as [72] for robust data cleaning, with s2n_low = 2, s2n_high = 4, and s2n_vel = 3 for the first iteration and s2n_low = 2, s2n_high = 5, and s2n_vel = 3 for the second. The s2n_low follows the min_val parameter in [5], setting the value of pixels whose emission is lower than 2\(\sigma\) of the local noise to zero. The higher value of s2n_high in the second iteration provides a stricter constraint to remove any spurious sources. These data processing steps allow us to closely replicate the observed data set in SEDIGISM (Sect. 4.1). Of the 445 snapshots, the first 35 snapshots (up to \(\sim\) 1 Myr) have no detectable \(^{13}\)CO(2-1) above the s2n_low limit. Our analysis is therefore limited to 410 snapshots, resulting in a total of 1230 synthetic datacubes for the three projections. Excluding the early snapshots minimises potential biases in the reproduction of the synthetic line emission that could arise from the initial settling of the cloud. By \(\sim\) 1 Myr, the gas density distribution reaches a quasi-equilibrium state, satisfying a log-normal distribution, which is expected for supersonically turbulent gas [73], [74].
Dendrograms describe the distribution and nesting of isosurfaces in data cubes and have long been used to discretise molecular gas emission at different scales in observations [5], [40] and simulations [75]. We used dendrograms [76] to segment the molecular gas in each snapshot into leaves, branches, and trunks. Leaves are structures formed by single local maxima in the gas distribution and are nested in branches, which in turn are nested in trunks. The trunks can be isolated structures without substructures or hierarchical structures with multiple substructures (App. 11). Dilated masking sets the value of all noisy voxels to zero, setting the lower threshold of the dendrograms. We built the dendrograms using min_val = 0 and n_delta = 2 \(\sigma_{rms}\), where \(\sigma_{rms} = 0.87\) represents the mean rms noise in our data (Sect. 3.2). We set the min_pix using n_area = 3 and n_vchan =2, such that all structures are both spatially (i.e. at least 3 beam) and spectrally (span at least 2 velocity channels) resolved. We find a total of 3710 hierarchichal trunks and refer to them as MCs throughout this work. The entirety of detectable \(^{13}\)CO(2-1) emission in a snapshot is referred to as the “molecular gas complex” or the “entire \(^{13}\)CO(2-1) emission”. Additionally, the dendrograms store the information about the nested structures (descendants) for every structure. We perform a quantitative analysis of these substructures (descendants) in each MC in Sect. 4.2.3.
The dendrogram analysis returns a catalogue of structures11 with their properties [77], which we use to derive other MCs properties assuming a heliocentric distance of 3 kpc. The most useful directly-measured properties are the projected footprint area (\(A\)), the velocity dispersion (\(\sigma_r\)), and the total brightness temperature (T\(_B\)). The effective radius is defined as \(R_{eff}\) = \(\sqrt{A/\pi}\) and the deconvolved radius as \(R\) = \(\sqrt{R_{eff}^2 - R_{beam}^2}\), where \(R_{beam} = 0.2\) pc is the physical size of the beam12. The luminosity mass is estimated from the luminosity (\(L\)) as \(M_{lum} ~[\mathrm{M}_\odot]= \alpha_\mathrm{CO} ~ L ~ [\mathrm{L}_\odot]\), where \(\alpha_\mathrm{CO}\) = 22.43 M\(_\odot\) (K km s\(^{-1}\))\(^{-1}\) pc \(^{-2}\), estimated using \(X_{^{13}\mathrm{CO}(2-1)} = 1^{+1}_{-0.5} \times 10^{21} \mathrm{cm}^{-2}\) (K km s\(^{-1})^{-1}\) to be consistent with SEDIGISM . From this we derive the surface mass density \(\Sigma = M_{lum}/A\) and the virial parameter \(\alpha_{vir} = 5\sigma_v^2R/GM_{lum}\), assuming a spherical and uniform cloud. The analysis presented in this paper uses deconvolved properties, although global trends are virtually unchanged by this choice.
We obtained the true molecular gas mass (\(M\)) directly from the simulation by projecting the H\(_2\) masses of gas cells onto the RADMC-3D and summing over the pixels associated with a given dendrogram structure. The true molecular gas mass is not subject to observational biases or limitations (e.g. \(\alpha_{\mathrm{CO}}\) factor) and includes the CO-dark and freeze-out regions. We used this mass to study the time evolution of clouds in Sect. 4.2.1, but their derived properties are calculated using \(M_{lum}\) to be consistent with the observations. STARFORGE also tracks the positions, mass, ages, and evolutionary stages of individual stars and protostars [51]. We record the number of newborn stars (protostars and stars younger than 250 kyr) and the main-sequence stars (with \(M > 2 \, \mathrm{M}_\odot\)) in each snapshot and use them to investigate the MC fragmentation in Sect. 4.2.3.
We analyse cloud morphologies using the \(RJ\) plots [78] algorithm, which is an extension of \(J\) plots [79]. These algorithms classify pixelated structures into different morphologies based on the relationship between their mass distribution and moment of inertia. The \(J\) moments \(J_1\) and \(J_2\) are obtained by compairing the principal moments (\(I_{1} \, \& \, I_2\)) with those of a uniform surface density disk \(\left(I_0 = \mathrm{\frac{AM}{4 \pi}} \right)\) of the same area and mass; \(J_i = \frac{I_0 - I_i}{I_0 + I_i}\; \{i = 1, 2\}.\) The \(RJ\) moments \(R_1\) and \(R_2\) are obtained by rotating the \(J\) plot 45 degrees in the anticlockwise direction. \(R_2\) is further normalised to remove the parameter space constraints given by \(|J_i| \leq 1\), resulting in \(R_1 = \frac{J_1 - J_2}{2}\) and \(R_2 = \frac{J_1 + J_2}{\sqrt2 (\sqrt2 - R_1)} .\) \(R_1 = 0\) corresponds to a perfectly circular structure, with increasing values indicating progressively higher degrees of elongation. \(R_2\) measures the weight distribution of a structure compared to its centre of weight. The positive and negative values of \(R_2\) represent centrally overdense and underdense structures, respectively.
Before we can study the evolution of MCs using our sample of synthetic clouds, we first need to ensure that our simulated sample is representative of the observed clouds, in terms of properties and parameter space probed. For that purpose, here we compare our synthetic MCs with those observed as part of the SEDIGISM survey [5]. Figure 2 shows the flux distribution in each pixel for the SEDIGISM \(^{13}\)CO cubes and the emission of all STARFORGE snapshots along the three projections. The strong agreement between the two flux distributions serves as a validation that our synthetic emission maps replicate the SEDIGISM data at first order.
Figure 3 shows the distribution of the integrated properties for our synthetic MCs and the SEDIGISM clouds. The good agreement between the two datasets for all properties indicates similar structural characteristics of the clouds in both samples. This further highlights the relability of our approach in not only creating emission cubes similar to SEDIGISM but also identifying SEDIGISM-like clouds. However, this should not be confused as a single STARFORGE simulation replicating the entire diverse sample of clouds in SEDIGISM . Rather, due to robust data processing, our synthetic MCs occupy the same parameter space as SEDIGISM clouds.
The slight shift in the STARFORGE distributions towards higher values compared to SEDIGISM can be explained as follows. STARFORGE simulates an isolated GMC and therefore the lowest level dendrogram structures, i.e., hierarchichal trunks, are called MCs. SEDIGISM , on the other hand, traces the larger gas structures in the Galaxy, and every dendrogram structure (e.g. nonoverlapping trunks, branches, or leaves) is considered a cloud if it complies with the clustering criteria set by the segmentation algorithm. Using the same cloud identification criteria as SEDIGISM , i.e., the scimes algorithm [40] is not suitable for this study. This is mainly due to the relatively small spatial coverage of each snapshot, which causes SCIMES to identify branches within the trunks as MCs, making the extracted clouds significantly smaller than those of SEDIGISM.The exclusion of leaves constraints the lower limit on the properties of our synthetic MCs; however, despite these differences, our MCs are consistent with those in SEDIGISM (Fig. 8 & 9).
The aim of this section is to understand the changes in the structure and properties of MCs as they evolve and are affected by different stellar feedback mechanisms (Fig. 4). To highlight the broader evolutionary trend, the MCs are binned based on evolutionary time. Each bin corresponds to \(\sim 250\) kyr and contains a total of 30 data cubes (three projections per snapshot). The evolution of properties of individual MCs is illustrated in the App. 8. The following subsections focus on the general trends in integrated properties of the clouds (Sect. 4.2.1), their morphologies (Sect. 4.2.2), and substructures (Sect. 4.2.3). The corresponding figures (Fig. 5, 6 & 7) also show the onset of different stellar feedback mechanisms indicated on the time axis.
Figure 4: \(^{13}\)CO(2-1) moment 0 maps at different evolutionary time. The background greyscale represents H\(_2\) gas density with \(^{13}\)CO(2-1) emission overlaid as viridis maps (molecular gas complex), and coloured contours represent different MCs (dendrogram trunks), with red contours representing the largest MCs (\(R\)) in the cube. The \(^{13}\)CO(2-1) maps for multiple snapshots along different projections are presented in App. 9..
In this section, we analyse the integrated properties of MCs as a function of time, over the \(\sim 11\) Myrs covered by the simulations. Figure 5 shows that radius, mass, luminosity, and surface density follow each other closely, which is expected because larger MCs are more massive on average [41], [80]. The similar trends of molecular gas mass and \(^{13}\)CO luminosity distributions over time confirm that our results are not significantly affected by the choice of canonical \(^{13}\)CO abundances (see Sect. 3.1.1). The four properties show a steady increase until \(\sim 5\), as progressively more \(^{13}\)CO(2-1) is above the detection limit. The increase represents the active growth of the MCs as they transition from newly formed small diffuse cloudlets to large, massive, and dense MCs (illustrated in Fig. 4). The increase in mass is also influenced by the transition of the gas from the atomic to the molecular phase and is a key characteristic of the GHC model [20], [81]. The 6 Myr transition marks the beginning of gas expulsion and cloud dispersal by stellar feedback, as noted by the decrease in the average properties. This is supported by the significant increase in the number of clouds \(\sim 7\) Myrs, which suggests that the gas is being removed and eroded from within and around clouds, resulting in a higher number of small cloud fragments (Sect. 7). These trends in MC properties are also seen in other simulation sets; however, their onset and duration vary depending on initial conditions (App. 12).
The velocity dispersion distribution remains relatively flat until \(\sim 6\) Myr and then decreases. The initial high values for some MCs are largely a result of the initial supersonic turbulence, with a possible contribution from the momentum injected by the protostellar outflows. The average virial parameter decreases from \(\sim 10\) to \(\sim 2\) during the first 5 Myr and remains relatively constant throughout most of the evolution (Fig. 5). These distributions are not significantly affected by the growth and dispersal of MCs. However, they show peaks at \(\sim 2\) Myr and \(\sim 10\) Myr, which are also seen in the surface density distribution. The peaks correspond to the formation of turbulent gas structures and the onset of the first supernova, respectively. Supernovae inject turbulence into the ISM and produce relatively dense MCs in an environment that has been mostly dispersed by stellar winds and radiation. However, they do not significantly alter the general trends in the MC properties [60].
In this section, we study how the morphology of the MCs vary over time under the effect of various stellar feedback mechanisms. We use the \(RJ\) plots algorithm (Sect. 3.3) to analyse the morphology of the entire \(^{13}\)CO(2-1) emission in the simulation box, as well as the individual MCs as they evolve over time (Fig. 6). MCs show a large scatter in \(R_1\) and \(R_2\) at all times, as cloud structures present a continuous spectrum rather than discrete classes. We therefore bin the MCs following the same criteria as Sect. 4.2.1 to analyse the overall morphological changes and present the scatter plots in App. 8. We also analyse the \(R_1\) and \(R_2\) moments for molecular gas complexes showing the structure of the entire emission in a snapshot.
Figure 6 shows that on average the clouds (both the individual MCs and the molecular gas complexes) have different elongations (\(R_1\)) along the different projections. This is expected as they evolve in a unisotropic environment, which causes them to evolve asymmetrically. However, this difference is not very significant for individual MCs, as seen by the large scatter in Fig. 8 (bottom left). The trends in \(R_2\) are similar along the three projections, which highlights that the internal structure of the clouds appears the same regardless of the viewing angle.
The large average values of \(R_1\) for the MCs throughout most of the simulation highlight the ubiquitousness of filamentary structures (solid lines in Fig. 6, top). A notable trend in the distribution of \(R_1\) is a peak during 3-6 Myr, suggesting an elongated shape for CO emission in snapshots (Fig. 4, centre). The increase in elongation is more dominant in one projection direction, suggesting that the molecular gas complex is collapsing faster along one axis. Although the average value of \(R_1\) for MCs increases after 6 Myr, this is not the case for molecular gas complexes. It points towards the emergence of filamentary MCs within feedback-affected spherical bubble-like regions, potentially indicating the formation of intra-cloud filaments. However, this increase should be interpreted with caution due to the large scatter in \(R_1\) among individual MCs. The distribution of \(R_2\) for the molecular gas complexes shows a constant increase until \(\sim 6\) Myr, followed by a constant decrease. The rise in central concentration (\(R_2\)) is in agreement with the gravitational collapse of the simulated GMC. The decrease signifies the dispersion of the molecular gas due to stellar feedback, leading to centrally underdense bubble-like structures (Fig. 4, right). The increase in \(R_2\) along all three projections suggests that these MCs represent 3D bubbles rather than 2D rings (illustrated in App. 9). The different morphologies of the clouds for different projections reveal the anisotropy of the MCs. The anisotropy is driven by a combination of the initial turbulence, magnetic fields, and various stellar feedback events. A caveat of STARFORGE is that it assumes an isolated cloud and does not account for galactic-scale processes, such as gas accretion and the galactic potential. However, we expect the galactic potential to have a minor effect on the cloud scale over a timescale of 10 Myr. Although we see the trends in the morphology for the entire \(^{13}\)CO emission in snapshots, the individual MCs on average remain elongated centrally concentrated structures throughout the simulation. Visual analysis further confirmed their filamentary and clumpy nature (App. 9) during most of their lifetime (\(\sim\) 3-7 ) Myr.
The fragmentation of MCs is often attributed to various stellar feedback mechanisms [72], [82], however, fundamental questions about the physics responsible for fragmenting MCs are still open. Analysing the number of MCs, their substructures, and stars informs us about how the MCs fragment to form dense substructures that lead to star formation, which in turn produce stellar feedback and disperse these gas structures. Here, we present a quantitative analysis of the substructures within our MCs. The substructures are stored by dendrogram algorithm as descendants and represent subparsec-scale compact structures, typically referred to as clumps. Figure 7 shows an increase in the number of MCs and their substructures around (\(\sim\) 3 Myr), representing progressively more emission that is above the noise level. The second peaks (\(\sim 7-8\) Myr) in both distributions are the result of gas dispersion by stellar feedback.
We also analyse the number of stars and protostars with ages \(< 250\) kyr and collectively refer to these as newborn stars. These reflect the instantaneous star formation rate of the clouds, with the 3-7 Myr period representing the peak of star formation activity. The newborn stars evolve to the main sequence, producing stellar winds and photoionising radiation. The stellar evolution is a strong function of their mass [83] and a large number of low-mass stars remain in the main-sequence phase throughout the lifetime of the simulated GMC, seen as a constant increase in the number of stars over time (Fig. 7).
The significant increase in the number of stars \(\sim\) 5 Myr is followed by peaks in the number of substructures (7 Myr) and MCs (8 Myr). Stellar winds and radiation from individual stars disperse and expels gas in their neighbourhood, fragmenting dense gas structures (6-7 Myr, Fig. 4). Over time, the feedback becomes stronger and erodes these fragmented clumps, decreasing in their number (> 7 Myr). This strong gas dispersal affects the entire MC leading to the removal of gas between dense MCs, i.e. the entire \({^13}\)CO emission is identified as multiple small MCs instead of a single continuous structure/trunk (7-8 Myr). These smaller MCs are dispersed over time as a result of continuous feedback events (>8 Myr). The lack of a dense molecular gas further decreases the number of embedded stars.
The \(-\) Larson’s and Heyer’s \(-\) scaling relations show the correlations between the physical properties of clouds. Larson’s first relation, originally derived from the analysis of numerous MCs by [41], was later refined by [84], resulting in the relation \(\sigma_v = 0.74 \, L^{0.5}\) [40]. The spatial and velocity structures of MCs following power laws are often considered a proof of universal cloud turbulence [17]. It is a simplification of Kolmogorov’s law for turbulence, indicating that larger clouds exhibit broader line-widths. Larson’s laws individually do not provide information about the virial state of a cloud. To take this into account, [42] combined Larson’s second and third laws, creating Heyer’s relation, which compares the surface density of a cloud with its scaling parameter (\(\sigma_v^2 / R \propto \Sigma\)).
We present the two scaling relations for our MCs and compare them with the SEDIGISM clouds in Fig. 8 & 9. The synthetic MCs distribution shows almost a complete overlap with the 3-\(\sigma\) KDE (kernel density estimator) for SEDIGISM clouds. This provides strong evidence that the MCs from our synthetic observations have global properties similar to those of real clouds. Moreover, this shows that the correlation of properties is consistent across both samples, e.g. similar sized MCs have similar velocity dispersions, leading to their overlap on the scaling relation plots.
Figure 8 shows an increase in the average size and linewidth of the clouds up to \(\sim 6\) Myr. This is largely a result of the formation of dense gas structures that merge and result in progressively more \(^{13}\)CO(2-1) emission being detectable. In addition, stellar feedback mechanisms drive the velocities in the MCs and expand them, resulting in larger sizes and linewidths over time. At evolutionary times beyond \(\sim 6\) Myr, stellar feedback mechanisms begin to disperse the gas significantly, resulting in the identification of smaller MCs. This appears as a sharp drop in the average velocity dispersion followed by a gradual decrease in their size. The higher values in the average velocity dispersion at early times (< 6 Myr) for similar sized structures are most likely due to the initial supersonic turbulence injected in the simulation.
Figure 9 highlights an initial trend of MCs as they transform from underdense and highly supervirial structures to denser virialised structures. The decrease in the average scaling ratio (\(\sigma^2/R\)) is due to the significant increase in the MC radius compared to the velocity dispersion. The erosion of MCs due to feedback after 6 Myr causes a horizontal shift in Fig. 9 toward lower surface densities and higher virial parameters.
Molecular clouds are often analysed collectively in scaling relation plots, which typically show a large scatter [5], [40]. [85] show that the cloud morphology and internal substructures influence their distribution in these relations and hypothesised that the different morphologies might correspond to different evolutionary stages. Figures 8 & 9 show that MC populations at different evolutionary times occupy different positions in the scaling relation plots. This suggests that the large scatter in the scaling relations could be due to the emsemble of observed MCs being at different stages of their evolution. SEDIGISM clouds could have undergone through a diversity of physical conditions, as they are influenced by the larger Galactic environment and feedback events and follow various evolutionary paths. The gas flows and the effects of external factors are not simulated in STARFORGE . However, our MCs lie in the same parameter space as SEDIGISM and the simulation traces all relevant physics of star formation at parsec scales. Therefore, at least some of the SEDIGISM clouds follow an evolutionary path similar to that of our MCs. When combined with the fact that MC morpholgies evolve over time (Sect. 4.2.2), our results support the hypothesis proposed by [85].
The distribution of integrated properties, morphology, and fragmentation show that MCs are evolving from small, diffuse structures to dense filamentary structures before being dispersed by stellar feedback. They appear as filamentary and clumpy structures throughout most of their lifetimes, being consistent with other simulations [86]. The smaller structures they host collapse and form stars, even though the parent MC appears unbound (\(\alpha_{vir}\gg 1\)). The stars produce stellar feedback that disperses the molecular gas resulting in smaller, less massive, and diffuse structures. The presence of MCs as small clumps, filamentary and bubble-like structures is also supported by observations [10].
The initial turbulence in the simulations produces overdensities that become denser over time because of gravitational collapse. These MCs (< 3 Myr) appear as small, diffuse, low mass, gravitationally unbound (\(\alpha_{vir}\sim 10\)) and approximately spherical structures. We refer to them as MCs following their definition as hierarchichal trunks to be consistent throughout the paper; however, they are closer to starless molecular gas clumps in observations (e.g., starless clumps in [87] and quiescent clumps in [88]).
Molecular clouds in 3-7 Myr appear as large filamentary structures with dense clumps. The entire \(^{13}\)CO(2-1) emission in the simulation box appears as a single (or a few) large MC(s), since most of the gas in the simulation domain is molecular13. These represent a majority of the MCs detected in observational surveys [10], [15], [89]–[91]. The long lives of filamentary MCs are often attributed to continuous gas flows from the larger environment onto small-scale clumps through the filaments [92]. [93] show that the substructures within the MCs produce a deep gravitational potential and accrete the gas from the filament, thus dynamically decoupling from the MCs to grow faster. The formation of these dense clumps14 leads to a central infall of gas along the filament, which feeds the clumps, forms new small clumps, and causes turbulent movements [94]–[97]. The higher number of dense clumps results in an accelerated formation of protostars and stars (Fig. 7; 4-6 Myr).
Stellar winds and radiation become more effective throughout the simulation domain after \(\sim\) 6 Myr, resulting in gas expulsion and dispersion. These phenomena result in the \(^{13}\)CO(2-1) emission appearing as centrally underdense structures with a shell-like morphology. These 3D bubble-like clouds are widely studied as wind- and radiation-driven bubbles [98]–[100], associated with H regions [101] and classified as the last evolutionary stage of clouds [102]. Feedback disperses most of the \(^{13}\)CO emission by \(\sim\) 8 Myr, causing the broken shells to be identified as individual MC. The formation of massive stars at \(\sim\) 3 Myr and most of the \(^{13}\)CO(2-1) emission being dispersed by \(\sim\) 8 Myr agrees with the fast dispersal of molecular clouds by feedback [36], [103]–[105].
STARFORGE simulates an isolated GMC within a closed box with a fixed total gas mass \((2 \times 10^4 M_\odot)\), restricting the upper mass limit of the MCs. Moreover, the simulation does not track the real-time abundance of CO, so the use of canonical abundance values and ad hoc freeze-out prescriptions is a simplification. This results in under- or overestimation in the real abundances, thus introducing an uncertainty on the measured M\(_{lum}\) from the synthetic observations, potentially skewing these distributions. To minimise this error, we set the CO abundance to zero in regimes where it can freeze out. We also set strict constraints while performing dilated masking (Sect. 3.2) and choose only the hierarchichal trunks (Sect. 3.3) as MCs to avoid spurious sources. Inclusion of a chemical network in the simulations or radiative transfer could improve the accuracy of property estimates, but this goes beyond our current scope and does not significantly affect our overall analysis (App. 10).
Predicting the evolutionary stages of SEDIGISM MCs based on their properties and morphology might be possible, since our MCs share the same parameter space as SEDIGISM (Sect. 5). However, the degeneracy in these distributions on either side of the 6 Myr peak, visible as the large scatter in the scaling relation plots, makes this task extremely challenging using solely \(^{13}\)CO(2-1) observations. The early MCs show a H\(_2\) envelope (Fig. 12) which could be traced with diffuse gas tracers such as \(^{12}\)CO, thus separating them from the feedback-affected MCs (Fig. 13). Moreover, an analysis of the dense gas structures within MCs using tracers such as N\(_2\)H\(^+\) and NH\(_3\) could reveal the fragmentation trends and aid in the evolutionary classification of MC. However, such a multiwavelength study is beyond the scope of this work.
Observational works often perform multiwavelength studies using tracers of dense gas, young stellar objects, and HII regions to classify molecular clouds and clumps into various evolutionary stages [87], [88], [102], [106]. In a follow-up paper, we will study how clumps (dendrogram branches) and cores (dendrogram leaves) within our MCs are affected by various stellar feedback mechanisms [107]. This will improve our understanding of the evolution of the molecular gas structures from clouds to core scales. We will also compare these gas structures with their observed counterparts in observations at various evolutionary stages [88], to understand the degree to which such multiwavelenth analysis are able to predict the evolutionary stages of gas structures.
In this paper, we have created synthetic observations from a \(20~000\) M\(_{\odot}\) STARFORGE simulation modelled after the SEDIGISM survey. We used the RADMC-3D radiative transfer code to convert the gas density cubes into \(^{13}\)CO(2-1) emission maps and performed a dendrogram analysis to identify MCs. We analysed this sample of synthetic MCs and investigated the trends in properties, morphology, and substructures to understand how MCs evolve under the effects of different stellar feedback mechanisms.
The flux distributions of the SEDIGISM ppv cubes and our synthetic data cubes are in strong agreement, validating the replication of the SEDIGISM data to first order. The properties of synthetic MCs show good agreement with the SEDIGISM clouds and the two sample fill the same parameter space in the scaling relation plots, which further confirms the robustness of our approach. Although the two cloud show overall good agreement, the sythetic MCs reproduce only a subset of the diversity seen in the observations. Moreover, synthetic MCs at different evolutionary stages occupy distinct regions of the scaling relation plots, suggesting that evolutionary time plays a significant role in driving the observed scatter.
We study the formation, evolution, and destruction of MCs through variation in their observable properties. The initial turbulence in the simulations creates gas overdensities that collapse under self-gravity and are detected as \(^{13}\)CO(2-1) emission. These reflect the early cloudlets in observations that are accreting gas from the larger environment to appear as moderately dense gas structures. Gas flows from large to small scales shape MCs into elongated filamentary structures with multiple substructures. The fractal substructures in MCs form stars, which eject matter and radiation into the surrounding environment, driving the formation of gas bubbles. These 3D bubble-like MCs are often associated with stellar winds, radiation, and H regions. Our analysis presents MCs as evolving from small, diffuse structures to dense filamentary MCs followed by 3D gas bubbles, and these evolutionary trends are consistent with simulations initialised differently. This confirms the key hypothesis from our previous observational work that MCs evolve from concentrated to elongated to ring-like structures.
In conclusion, we have produced \(^{13}\)CO(2-1) synthetic observations modelling the SEDIGISM survey using the STARFORGE simulations that include all the relevant physics for star formation. Analysing the properties, morpholgies, and fragementation trends of MCs, we show that they evolve from small, diffuse structures to dense filamentary structures to bubble-like structures. The distributions of MCs occupy different parameter spaces in the scaling relation plots, suggesting that they drive the scatter in the observed scaling relations. In an upcoming paper, we will study the effect of individual feedback mechanisms – outflows, stellar winds, radiation, supernovae – on these MCs and their substructures. We will also explore the possibility of compairing the structures at different evolutionary stages in simulations and observations.
The authors thank the anonymous referee for a constructive report, which has significantly improved the quality of the manuscript. KN thanks Prof. Stefanie Walch-Gassner, Dr. Daniel Seifried, and Dr. Piyush Sharda for helpful discussions. AK acknowledges support from the Polish National Science Center SONATA BIS grant No. 2024/54/E/ST9/00314. MF acknowledges support from the Polish National Agency for Academic Exchange grant No. BPN/BEK/2023/1/00036/DEC/01 and from the Polish National Science Centre SONATA grant No. 2022/47/D/ST9/00419. S.N. gratefully acknowledges the Collaborative Research Center 1601 (SFB 1601 sub-project B1) funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 500700252.
The ppv cubes for this study were produced using RADMC-3D by projecting along three orthagonal axes. This is achieved using three combinations of incl-phi: 0-0, 90-0 and 90-90 in the RADMC-3D script for spectral line imaging
(radmc3d image
)15. The simulation box is thus projected along the \(z\), \(y\), and \(x\) axes, respectively. MCs identified in different projections have similar properties, which is consistent with previous similar works [108]. This is largely due to the fact that \(^{13}\)CO(2-1) emission is optically thin and thus the entire MC is traced along all projections. We conclude that using a specific projection
does not alter the MC properties and provides a sanity check that the simulations and RADMC-3D produce model clouds reasonably well.
In this section, we show a sequence of the \(^{13}\)CO(2-1) moment 0 maps for the GMC (i.e. the molecular gas complexes) as it evolves along with the dendrogram trunks (MCs). The MC obtained by projecting along three orthagonal axes (Sect. 3.1.2) are shown in figures 11 - 13. We also provide videos that represent all snapshots along the three projections as ancillary materials. This helps us to visualise the clouds and understand it’s structure at different evolutionary stages. Fig. 11 shows the formation of MCs as small diffuse structures. Fig. 12 shows a single (or few) contour(s) that cover the entire emission in the viridis, representing most of the observed MCs. The filamentary, fractal, and complex nature of these structures is also visible in the emission maps. Fig. 13 shows the MCs that are significantly impacted by stellar feedback processes. These lead to gas expulsion and dispersion, thus presenting the \(^{13}\)CO(2-1) as a 3D bubble. As more and more gas are dispersed, the number of MCs decreases. Some of these late MCs represent the early structures (Fig. 11), with the difference that the early MCs have accompanying H\(_2\) gas. The absence of a molecular gas prevents the formation of MCs and ends the simulation.
Figure 11: \(^{13}\)CO(2-1) moment 0 maps for different snapshots. The three columns represent the clouds projected along the x, \(y\) and \(z\) axes, respectively. The background greyscale represents H\(_2\) gas density with \(^{13}\)CO(2-1) emission overlaid as viridis maps, and coloured contours represent different MCs (dendrogram trunks), with red contours representing the largest MCs (\(R\)) in the cube..
Figure 12: \(^{13}\)CO(2-1) moment 0 maps for different snapshots. The colors and symbols follow fig. 11..
Figure 13: \(^{13}\)CO(2-1) moment 0 maps for different snapshots. The colors and symbols follow fig. 11..
We post-processed the STARFORGE simulations with UCLCHEM [109] chemical code16 to estimate the abundance of CO (Sharda et al. in prep). However, due to computational cost, this has only been possible for three snapshots. Figures 14 – 16 show that although our fiducial approach slightly overestimates the \(^{13}\)CO emission, the data processing steps produce MCs of comparable size and morphology in both cases. This is further highlighted in Fig. 17 & 18, which show that both sets of MCs have similar properties in the same snapshots. The inclusion of chemistry might cause a small difference in the properties of our MCs and result in smaller MCs not being detectable. However, since we study the trends in the distribution of properties over time, we expect that these are not significantly influenced by the absence of CO chemistry.
This section explains the reason for selecting only trunks that are branches as molecular clouds (MCs), rather than including all trunks. In figure 19, we show the distribution of properties for all dendrogram trunks. The results clearly demonstrate that only hierarchical trunks (branches) exhibit consistent trends in their properties as they evolve. In contrast, isolated trunks (leaves) have scattered distributions with no clear trends. This is likely because most of these isolated structures represent transient gas features that do not correspond to the fractal molecular clouds seen in observations (Fig. 11). Furthermore, the large sample of isolated trunks results in the average properties of the trunks (Fig. 19, central line) that show no significant trends over time.
We perform our analysis on two other simulation sets. These have the same initial conditions as our fiducial runs, with the exception of initial turbulence. These are M2e4a1 and M2e4a4 with \(\alpha_{vir} = 1\) and \(\alpha_{vir} = 4\), respectively. Figs. 20 & 21 show the evolution of the MC properties for these two simulations, respectively. Although the GMC lifetime and the onset of different feedback mechanisms are different in the two simulations, they show a trend of increase in the MC properties representing actively growing MCs followed by a decrease in the properties due to MC dispersal by feedback. The morphology and fragementation trends of these MCs are consistent with the fiducial simulations.
Member of the International Max Planck Research School (IMPRS) for Astronomy and Astrophysics at the Universities of Bonn and Cologne.↩︎
The GMC hereafter refers to this simulated GMC, as defined by the STARFORGE collaboration.↩︎
The equations to calculate these properties are described in [57]↩︎
https://home.strw.leidenuniv.nl/~moldata/datafiles/13co.dat↩︎
https://www.ita.uni-heidelberg.de/~dullemond/software/radmc-3d/manual_radmc3d/lineradtrans.html#input-the-local-microturbulent-broadening-optional↩︎
RADMC-3D lines_mode = 3↩︎
ppv_catalog
: https://dendrograms.readthedocs.io/en/latest/api/astrodendro.analysis.ppv_catalog.html↩︎
SEDIGISM beam (28) FWHM at 3 kpc is 0.4 pc↩︎
as shown by the molecular gas fraction https://starforge-tools.readthedocs.io/en/latest/data.html#gas-data-fields values stored in STARFORGE for each snapshot.↩︎
These central overdensities in MCs are visible in Fig. 12 and is evident from the large values of \(R_1\) (Sect. 4.2.2).↩︎
https://www.ita.uni-heidelberg.de/~dullemond/software/radmc-3d/manual_radmc3d/imagesspectra.html↩︎
The pipeline is provided here: https://github.com/psharda/gizmo_carver/tree/pschanges↩︎