October 08, 2025
JWST has identified a large population of faint, broad-line active galactic nuclei (AGN) in the early universe that are powered by black holes (BHs) that often appear overmassive relative to their host galaxies. In this study, we examine the relationship between BH mass and galaxy stellar mass at \(3<z<7\) using a sample of 70 broad-line AGN identified using NIRSpec/G395M spectroscopy from the CEERS, JADES, and RUBIES surveys. Roughly half (43%) of our sample appear heavily reddened and are classified as little red dots (LRDs). We estimate BH masses (\(M_{\rm BH}\)) using single-epoch virial techniques, while host stellar masses (\(M_{\star}\)) are inferred using a combination of two-dimensional surface brightness profile fitting and spectral energy distribution modeling. We find that a majority of our sources (50/70) have \(M_{\rm BH}/M_{\star}\) ratios that are 1-2 dex higher than that observed in AGN locally. Using a forward-modeling Bayesian framework that accounts for uncertainties, intrinsic scatter, and selection effects, we infer a \(M_{\rm BH}\)–\(M_{\star}\) relationship that is \(>3\sigma\) above the relationship measured for local broad-line AGN. We derive an intrinsic scatter in this relationship of \(0.9\) dex, which does not vary over the redshift range of our sample. We also find that the \(M_{\rm BH}/M_{\star}\) ratio increases by \(2.3\) dex from \(z = 3.5\) and \(z = 6.5\) with a confidence level of \(> 3\sigma\). We attribute this trend with the increasing fraction of LRDs in our sample at \(z>4\) as their host masses are \(\sim1\) dex lower than the non-LRD AGN in our sample. These results support a picture in which the BHs powering JWST’s broad-line AGN are genuinely overmassive and become increasingly so with redshift. We discuss the implications of our findings on early BH growth relative to that of their host galaxies and the constraints it places on BH seeding models.
Studies over the past three decades have found that supermassive black holes (SMBHs) are universally found at the centers of massive galaxies and that their growth is connected to that of their hosts. The latter is evidenced by scaling relations between the mass of the central black hole (BH) and a variety of galaxy properties, such as their velocity dispersion and bulge mass [1]–[7], as well as similarities between the cosmic star formation rate density and the BH accretion rate density with redshift [8]–[12]. How and when the scaling relationships observed in the local universe are put in place and whether they evolve with redshift are still very much open questions.
In recent years, JWST has provided an extraordinary view of BHs in the early universe by pushing the BH horizon to higher redshifts (\(z=9-11\); [13]–[16]) and lower masses (\(10^6 < M_\odot < 10^8\); [17]–[21]) than previously possible. These lower-mass BHs, in particular, are more representative of the underlying BH population than bright quasars and are potentially the key to constraining models of BH seeding [22]–[25] and the early coevolution of galaxies and BHs [26]–[29].
Roughly a third of the broad-line AGN identified with JWST appear heavily reddened in the rest-frame optical, yet have relatively blue colors in the rest-frame ultraviolet (UV) [18]–[20], [30]–[33]. Sources with this “v-shaped”, red plus blue spectral energy distribution (SED) have come to be known as “little red dots” (LRDs) in the literature [20]. LRDs are now thought to be a previously-unknown population of gas-enshrouded AGN [16], [34], [35] that are ubiquitous at high redshifts, but whose number density declines rapidly below \(z\sim4\) [36]–[39].
A key property of the broad-line AGN identified with JWST at high redshifts, including the LRDs, is an elevated BH to galaxy stellar mass ratio. Several studies have reported BH masses that are 1-10% of their host masses [18], [19], [21], [30], [36], [40]–[43], which is higher than the 0.1% commonly found in local AGN (see, e.g., [44]). In some exceptional cases, the BH mass is estimated to be even higher than the galaxy stellar mass; for instance, [45] reports the detection of a LRD at \(z \approx 7\) with \(M_{\rm BH}/M_\star > 2\).
There is significant debate in the literature as to whether these findings signal an evolution of the \(M_{\rm BH}-M_{\star}\) relationship towards overmassive BHs at higher redshifts [29], [46], [47] or if these sources are simply the massive, biased tail of a larger population of AGN that lie below the detection threshold of JWST [48]–[50].
Several models have pointed out that the \(M_{\rm BH}-M_{\star}\) relationship should evolve with redshift based on gas availability or self-regulation arguments [51]–[55]; however, observational studies out to \(z\sim3\) have produced mixed results [56]–[60]. Furthermore, there is no consensus from cosmological simulations as to whether the \(M_{\rm BH}/M_{\star}\) ratio is higher or lower at higher redshifts compared to the local Universe, however most fail to reproduce the overmassive BH population observed with JWST [27]. There is also the possibility that the scatter of the high-redshift relationship is higher than that observed locally due to the effects of merger-averaging, which reduces the intrinsic scatter at lower redshifts [61]–[63].
In this paper, we make use of the largest sample of JWST-identified broad-line AGN yet compiled in the redshift range \(3<z<7\) to statistically investigate the nature of the \(M_{\rm BH}-M_{\star}\) relationship in the early universe and its potential evolution with redshift. We make use of a forward-modeling Bayesian framework that accounts for selection biases to infer the intrinsic \(M_{\rm BH}-M_{\star}\) relationship at these redshifts [29].
We present our analysis as follows. In Section 2, we describe the JWST imaging and spectroscopic data used for this study, while Section 3 describes our sample selection and the properties of the broad-line AGN used in this work. Section 4 describes the methodology used to measure our BH and galaxy stellar masses, as well as the forward modeling approach used in determining the intrinsic \(M_{\rm BH}-M_{\star}\) scaling relationship. We present our results in Section 5 and discuss the implications of our findings in Section 6. Throughout this paper we use the following cosmological parameters: \(H_{0} = 70~{\rm km~s^{-1}~Mpc^{-1}; \Omega_{tot}, \Omega_{\Lambda}, \Omega_{m} = 1, 0.7, 0.3}\).
In this study, we make use of JWST NIRCam imaging from the Cosmic Evolution Early Release Science Survey (CEERS; [64]), the JWST Advanced Deep Extragalactic Survey (JADES; [65]), and the Public Release IMaging for Extragalactic Research survey [66]. For the JADES survey, we make use of the public NIRCam mosaics made available as part of the JADES second data release1. The CEERS and PRIMER NIRCam data were processed using the JWST Calibration Pipeline2 (versions 1.8.5 and 1.10.2, respectively) with custom modifications described in [64] and [67]. The resulting images were registered to the same World Coordinate System reference frame (based on Gaia DR1.2; [68]) and combined into a single mosaic for each field using the drizzle algorithm with an inverse variance map weighting [69], [70] via the Resample step in the pipeline. The final mosaics in all fields have pixel scales of 0/pixel.
Source detection and photometry on the NIRCam mosaics were computed on PSF-matched images using SExtractor
[71] version 2.25.0 in two-image
mode, with an inverse-variance weighted combination of the PSF-matched F277W and F356W images as the detection image. Photometry was measured in all of the available NIRCam bands in each field, as well as the F606W and F814W HST bands using public data
from the CANDELS and 3D-HST surveys [72]–[75]. Two runs of
SExtractor
were conducted in a hot+cold setup similar to the process described in [76]. Sources are first selected with conservative
(cold) detection parameters, designed not to split up large, bright galaxies, followed by a more aggressive (hot) run, which is performed to identify fainter objects. Objects from the hot catalog that fall outside the segmentation map from the cold run are
added to the final photometry catalog. Our analysis of the NIRCam imaging data is similar to that described in [77]; hence, we refer the reader
there for additional details.
The NIRSpec observations used in this study consist of data from the Cycle 1 CEERS survey (JWST ERS #1345, PI Finkelstein [78]; the Cycle 2 RUBIES survey (JWST GO #4233, PIs de Graaff and Brammer, [79]); and the Cycle 1 JADES survey [65]. The CEERS data consists of six MSA pointings in the EGS field, while the RUBIES data consists of six MSA pointings in EGS and 12 pointings in the UKIDSS Ultra Deep Survey (UDS) field. For this study, we only make use of NIRSpec observations taken with the G395M/F290LP \(R\simeq 1000\) grating/filter pair.
The CEERS observations used a three-nod dither pattern and the NRSIRS2 readout mode with 14 groups per nod, resulting in a total exposure time of 3107 s. The RUBIES observations used a three-nod dither pattern with the NRSIRS2RAPID readout mode with 65 groups per nod, resulting in a total exposure time of 2889 s. The details of the JADES observations, which used multiple configurations, can be found in [80] and [21].
For the JADES spectroscopic data, we made use of the reduced 1D spectra made available as part of the JADES third data release. The CEERS and RUBIES data were independently reduced as described in [81] using version 1.13.4 of the JWST Science Calibration Pipeline with the Calibration Reference Data System (CRDS) mapping 1215, starting from the Level 1 uncalibrated data products (“_uncal.fits”
files) available on MAST. Custom parameters were used for the jump
step at the detector-level calibration for a better treatment of the “snowballs”3 produced by
high-energy cosmic ray events, and a nodded background subtraction was adopted.
The reduced two-dimensional (2D) spectra (“s2d”) have a rectified trace with a flat slope. To best optimize the extraction of one-dimensional (1D) spectra from the 2D spectra, we perform a weighted extraction based on the methodology of [82]. Briefly, for a given spectrum, we take the median of the 2D spectrum along the spectral direction to produce a spatial profile for the source. We then identify the central peak of this profile, which corresponds to the source’s spectral trace. We then set all pixels in the spatial profile that are not a part of this central feature to zero and normalize the area under this masked spatial profile to one. We then use the normalized profile as the variable P in Table 1 of [82] and follow the prescription given therein to extract an optimized 1D spectrum.
ID | RA | Dec | \(z_{\rm spec}\) | F\(_{{\rm H}\alpha, {\rm broad}}\) | FWHM\(_{{\rm H}\alpha, {\rm broad}}\) | \(\log\,(M_{\rm BH}/{\rm M_\odot})\) | \(\log\,(M_{\rm *}/{\rm M_\odot})\) | LRD |
(J2000) | (J2000) | (10\(^{-18}\) erg s\(^{-1}\) cm\(^{-2}\)) | (km s\(^{-1}\)) | |||||
Kocevski et al. (2023) | ||||||||
CEERS 746 | 214.809142 | 52.868484 | 5.624 | 4.52\({\pm 0.45}\) | 1800\({\pm 200}\) | 7.19\(^{+0.11}_{-0.13}\) | 7.99\({\pm 0.37}\) | 1 |
CEERS 2782 | 214.823453 | 52.830281 | 5.241 | 5.62\({\pm 0.72}\) | 2060\({\pm 290}\) | 7.32\(^{+0.14}_{-0.16}\) | \(<\) 9.53 | 0 |
Harikane et al. (2023) | ||||||||
CEERS 397 | 214.836197 | 52.882693 | 6.000 | \(4.27\pm 0.39\) | 2091\({\pm 158}\) | 7.34\(^{+0.08}_{-0.09}\) | \(9.21\pm 0.33\) | 0 |
CEERS 672 | 214.889677 | 52.832977 | 5.666 | 2.76\({\pm 0.24}\) | 1504\({\pm 193}\) | 6.93\(^{+0.13}_{-0.14}\) | 8.33\({\pm 0.24}\) | 1 |
CEERS 1236 | 215.145298 | 52.967279 | 4.484 | 2.38\({\pm 0.24}\) | 2959\({\pm 0.24}\) | 7.39\(^{+0.13}_{-0.14}\) | 9.04\({\pm 0.24}\) | 0 |
Maiolino et al. (2023) | ||||||||
JADES-GN 954 | 189.15197 | 62.25964 | 6.760 | 1.73\({\pm 0.07}\) | 1911\({\pm 126}\) | 7.60\(^{+0.07}_{-0.07}\) | 8.92\({\pm 0.76}\) | 1 |
JADES-GN 1093 | 189.17974 | 62.22463 | 5.595 | 1.27\({\pm 0.30}\) | 1591\({\pm 252}\) | 6.81\(^{+0.18}_{-0.21}\) | 7.82\({\pm 0.32}\) | 1 |
JADES-GS 8083 | 53.13284 | -27.80186 | 4.648 | 3.35\({\pm 0.21}\) | 1750\({\pm 130}\) | 7.01\(^{+0.08}_{-0.08}\) | 8.50\({\pm 0.24}\) | 0 |
JADES-GN 11836 | 189.22059 | 62.26368 | 4.409 | 4.09\({\pm 0.35}\) | 1518\({\pm 171}\) | 6.90\(^{+0.11}_{-0.13}\) | 8.54\({\pm 0.72}\) | 0 |
JADES-GN 20621 | 189.12252 | 62.29285 | 4.681 | 3.36\({\pm 0.40}\) | 1670\({\pm 475}\) | 6.97\(^{+0.25}_{-0.33}\) | \(<\) 9.08 | 0 |
JADES-GN 61888 | 189.16802 | 62.21701 | 5.875 | 3.31\({\pm 0.25}\) | 1320\({\pm 133}\) | 6.86\(^{+0.10}_{-0.11}\) | 8.24\({\pm 0.32}\) | 1 |
JADES-GN 62309 | 189.24898 | 62.21835 | 5.172 | 1.22\({\pm 0.24}\) | 986.8\({\pm 144}\) | 6.34\(^{+0.16}_{-0.19}\) | 8.34\({\pm 0.31}\) | 0 |
JADES-GN 77652 | 189.29323 | 62.199 | 5.229 | 3.59\({\pm 0.47}\) | 922.4\({\pm 204}\) | 6.51\(^{+0.20}_{-0.25}\) | 8.04\({\pm 0.89}\) | 0 |
Kocevski et al. (2025) | ||||||||
RUBIES-EGS 37124 | 214.99098 | 52.916523 | 5.684 | 9.00\({\pm 0.37}\) | 1520\({\pm 115}\) | 7.18\(^{+0.07}_{-0.08}\) | 8.42\({\pm 0.37}\) | 1 |
RUBIES-EGS 42046 | 214.79537 | 52.788848 | 5.280 | 1.97\({\pm 0.01}\) | 3300\({\pm 60.0}\) | 8.47\(^{+0.02}_{-0.02}\) | 8.96\({\pm 0.46}\) | 1 |
RUBIES-EGS 42232 | 214.88680 | 52.855376 | 4.955 | 3.76\({\pm 0.06}\) | 1850\({\pm 50.0}\) | 7.58\(^{+0.03}_{-0.03}\) | 8.71\({\pm 0.46}\) | 1 |
RUBIES-EGS 49140 | 214.89225 | 52.877406 | 6.687 | 4.54\({\pm 0.47}\) | 5420\({\pm 370}\) | 8.72\(^{+0.08}_{-0.09}\) | 9.01\({\pm 0.35}\) | 1 |
RUBIES-EGS 55604 | 214.98304 | 52.956006 | 6.986 | 1.88\({\pm 0.11}\) | 4180\({\pm 480}\) | 8.55\(^{+0.11}_{-0.12}\) | 9.05\({\pm 0.52}\) | 1 |
RUBIES-EGS 60935 | 214.92338 | 52.925588 | 5.288 | 1.93\({\pm 0.03}\) | 1670\({\pm 60.0}\) | 7.39\(^{+0.04}_{-0.04}\) | 8.49\({\pm 0.29}\) | 1 |
RUBIES-EGS 61496 | 214.97245 | 52.962196 | 5.079 | 2.31\({\pm 0.17}\) | 1800\({\pm 300}\) | 7.00\(^{+0.15}_{-0.18}\) | 8.68\({\pm 0.46}\) | 1 |
RUBIES-EGS 926125 | 215.13706 | 52.988557 | 5.286 | 4.48\({\pm 0.13}\) | 1690\({\pm 70.0}\) | 7.10\(^{\pm 0.04}\) | 8.32\({\pm 0.36}\) | 1 |
RUBIES-EGS 927271 | 215.07826 | 52.948504 | 6.786 | 9.60\({\pm 1.9}\) | 1410\({\pm 200}\) | 6.74\(^{+0.16}_{-0.18}\) | 8.70\({\pm 0.36}\) | 1 |
RUBIES-UDS 40579 | 34.244190 | -5.245834 | 3.103 | 1.49\({\pm 0.08}\) | 2700\({\pm 130}\) | 8.29\(^{+0.05}_{-0.06}\) | 8.09\({\pm 0.51}\) | 1 |
RUBIES-UDS 50716 | 34.313132 | -5.226765 | 6.17 | 1.83\({\pm 0.27}\) | 2280\({\pm 220}\) | 7.26\(^{+0.10}_{-0.12}\) | 8.16\({\pm 0.43}\) | 1 |
RUBIES-UDS 59971 | 34.260537 | -5.209120 | 5.365 | 1.12\({\pm 0.18}\) | 1540\({\pm 280}\) | 6.74\(^{+0.18}_{-0.21}\) | 8.61\({\pm 0.48}\) | 1 |
RUBIES-UDS 63166 | 34.312143 | -5.202546 | 6.518 | 3.08\({\pm 0.50}\) | 2200\({\pm 400}\) | 7.36\(^{+0.18}_{-0.22}\) | \(<\) 9.92 | 0 |
Taylor et al. (2025) | ||||||||
RUBIES-EGS 920 | 214.052344 | 52.884268 | 3.616 | 1.13\({\pm 0.076}\) | 831.7\({\pm 196}\) | 6.21\(^{+0.23}_{-0.29}\) | 8.86\({\pm 0.19}\) | 0 |
RUBIES-EGS 6411 | 215.109185 | 52.939770 | 4.880 | 1.15\({\pm 0.14}\) | 1118\({\pm 132}\) | 6.41\(^{+0.12}_{-0.14}\) | 8.45\({\pm 0.26}\) | 0 |
RUBIES-EGS 13872 | 215.132933 | 52.970705 | 5.261 | 1.18\({\pm 0.32}\) | 1510\({\pm 165}\) | 6.72\(^{+0.14}_{-0.17}\) | 8.14\({\pm 0.49}\) | 0 |
RUBIES-EGS 15825 | 215.079264 | 52.934252 | 3.666 | 64.8\({\pm 2.4}\) | 3343\({\pm 179}\) | 8.08\(^{+0.05}_{-0.06}\) | \(<\) 10.4 | 0 |
RUBIES-EGS 19174 | 214.860840 | 52.784773 | 3.774 | 8.15\({\pm 2.0}\) | 672.588\({\pm 79.4}\) | 6.23\(^{+0.14}_{-0.17}\) | 10.8\({\pm 0.73}\) | 0 |
RUBIES-EGS 28812 | 214.924149 | 52.849050 | 4.223 | 9.36\({\pm 0.20}\) | 2191\({\pm 63.0}\) | 7.37\(^{+0.03}_{-0.03}\) | 9.01\({\pm 0.43}\) | 1 |
RUBIES-EGS 29489 | 215.022071 | 52.920786 | 4.543 | 24.4\({\pm 1.0}\) | 2091\({\pm 161}\) | 7.56\(^{+0.08}_{-0.08}\) | 8.65\({\pm 0.39}\) | 1 |
RUBIES-EGS 34978 | 214.861690 | 52.818438 | 3.772 | 19.8\({\pm 1.2}\) | 1357\({\pm 77.6}\) | 7.04\(^{+0.06}_{-0.06}\) | 9.81\({\pm 0.19}\) | 0 |
RUBIES-EGS 37032 | 214.849388 | 52.811824 | 3.850 | 11.9\({\pm 0.58}\) | 1969\({\pm 167}\) | 7.28\(^{+0.08}_{-0.09}\) | 8.67\({\pm 0.33}\) | 1 |
RUBIES-EGS 46985 | 214.805654 | 52.809497 | 4.963 | 6.81\({\pm 0.33}\) | 1389\({\pm 151}\) | 6.98\(^{+0.10}_{-0.11}\) | 9.04\({\pm 0.70}\) | 0 |
RUBIES-EGS 50052 | 214.823454 | 52.830277 | 5.240 | 14.4\({\pm 0.27}\) | 2092\({\pm 60.3}\) | 7.52\(^{+0.03}_{-0.03}\) | 8.28\({\pm 0.36}\) | 0 |
RUBIES-EGS 50522 | 214.855980 | 52.854661 | 3.614 | 1.09\({\pm 0.03}\) | 1515\({\pm 424}\) | 6.53\(^{+0.27}_{-0.35}\) | 10.3\({\pm 0.12}\) | 0 |
RUBIES-EGS 50812 | 214.845487 | 52.848281 | 3.519 | 1.13\({\pm 0.076}\) | 1850\({\pm 266}\) | 7.17\(^{+0.13}_{-0.15}\) | 8.22\({\pm 0.38}\) | 1 |
RUBIES-EGS 58237 | 214.850571 | 52.866030 | 3.651 | 7.66\({\pm 0.27}\) | 2002\({\pm 46.9}\) | 7.65\(^{+0.03}_{-0.03}\) | \(<\) 10.9 | 0 |
ID | RA | Dec | \(z_{\rm spec}\) | F\(_{{\rm H}\alpha, {\rm broad}}\) | FWHM\(_{{\rm H}\alpha, {\rm broad}}\) | \(\log\,(M_{\rm BH}/{\rm M_\odot})\) | \(\log\,(M_{\rm *}/{\rm M_\odot})\) | LRD |
(J2000) | (J2000) | (10\(^{-18}\) erg s\(^{-1}\) cm\(^{-2}\)) | (km s\(^{-1}\)) | |||||
Taylor et al. 2025 | ||||||||
RUBIES-UDS 5496 | 34.405872 | -5.312951 | 3.655 | 7.14\({\pm 0.98}\) | 1810\({\pm 407}\) | 7.08\(^{+0.21}_{-0.26}\) | 9.94\({\pm 0.31}\) | 0 |
RUBIES-UDS 8895 | 34.363041 | -5.306108 | 3.982 | 19.2\({\pm 1.3}\) | 1517\({\pm 57.9}\) | 7.16\(^{+0.05}_{-0.05}\) | 11.0\({\pm 0.08}\) | 0 |
RUBIES-UDS 10036 | 34.381671 | -5.303742 | 3.806 | 10.9\({\pm 0.54}\) | 1205\({\pm 64.5}\) | 6.82\(^{+0.06}_{-0.06}\) | \(<\) 9.65 | 0 |
RUBIES-UDS 11721 | 34.411039 | -5.300780 | 3.978 | 1.65\(^{\pm 0.23}\) | 1689\({\pm 110}\) | 6.76\(^{+0.08}_{-0.09}\) | 9.82\({\pm 0.31}\) | 0 |
RUBIES-UDS 18302 | 34.233628 | -5.283850 | 3.698 | 2.73\({\pm 0.31}\) | 1482\({\pm 260}\) | 6.71\(^{+0.17}_{-0.20}\) | 10.4\({\pm 0.76}\) | 0 |
RUBIES-UDS 19484 | 34.232426 | -5.280654 | 4.656 | 1.21\({\pm 0.16}\) | 1288\({\pm 229}\) | 6.53\(^{+0.17}_{-0.20}\) | \(<\) 9.80 | 0 |
RUBIES-UDS 19521 | 34.383672 | -5.287732 | 5.669 | 2.82\({\pm 0.30}\) | 1682\({\pm 211}\) | 7.03\(^{+0.13}_{-0.14}\) | 8.14\({\pm 0.92}\) | 1 |
RUBIES-UDS 21944 | 34.469218 | -5.283563 | 3.526 | 7.49\({\pm 0.79}\) | 1739\({\pm 161}\) | 7.03\(^{+0.01}_{-0.11}\) | 10.3\({\pm 0.23}\) | 0 |
RUBIES-UDS 22304 | 34.399170 | -5.283007 | 3.906 | 2.73\({\pm 7.4}\) | 747.4\({\pm 89.2}\) | 6.122\(^{+0.15}_{-0.18}\) | 8.48\({\pm 0.20}\) | 0 |
RUBIES-UDS 29813 | 34.453355 | -5.270717 | 5.440 | 8.15\({\pm 0.36}\) | 2329\({\pm 168}\) | 7.52\(^{+0.07}_{-0.08}\) | 8.29\({\pm 0.35}\) | 1 |
RUBIES-UDS 30969 | 34.296356 | -5.268672 | 4.000 | 11.8\({\pm 1.2}\) | 1018\({\pm 44.5}\) | 6.71\(^{+0.06}_{-0.06}\) | 9.73\({\pm 0.23}\) | 0 |
RUBIES-UDS 35974 | 34.331644 | -5.260593 | 4.367 | 5.76\({\pm 0.68}\) | 1101\({\pm 141}\) | 6.67\(^{+0.13}_{-0.15}\) | 10.1\({\pm 0.16}\) | 0 |
RUBIES-UDS 46885 | 34.291666 | -5.233788 | 3.730 | 7.83\({\pm 0.51}\) | 781.0\({\pm 26.9}\) | 6.35\(^{+0.21}_{-0.26}\) | 7.97\({\pm 0.25}\) | 0 |
RUBIES-UDS 48507 | 34.284578 | -5.230702 | 4.468 | 4.69\({\pm 0.38}\) | 765.5\({\pm 37.3}\) | 6.32\(^{+0.06}_{-0.06}\) | 7.87\({\pm 0.52}\) | 0 |
RUBIES-UDS 61627 | 34.238394 | -5.205775 | 3.654 | 3.06\({\pm 0.57}\) | 1586\({\pm 114}\) | 6.79\(^{+0.10}_{-0.11}\) | 10.5\({\pm 0.41}\) | 0 |
RUBIES-UDS 63139 | 34.230848 | -5.202607 | 4.435 | 2.06\({\pm 0.30}\) | 1240\({\pm 105}\) | 6.58\(^{+0.10}_{-0.11}\) | 8.00\({\pm 0.62}\) | 0 |
RUBIES-UDS 119957 | 34.268908 | -5.176722 | 4.149 | 8.61\({\pm 0.28}\) | 1908\({\pm 94.0}\) | 7.22\(^{+0.05}_{-0.05}\) | 8.07\({\pm 0.45}\) | 1 |
RUBIES-UDS 139709 | 34.296002 | -5.149895 | 5.685 | 36.7\({\pm 1.5}\) | 2366\({\pm 138}\) | 7.86\(^{+0.06}_{-0.06}\) | 8.65\({\pm 0.41}\) | 1 |
RUBIES-UDS 143683 | 34.316389 | -5.144678 | 4.226 | 3.90\({\pm 0.28}\) | 1351\({\pm 203}\) | 6.76\(^{+0.14}_{-0.16}\) | 8.74\({\pm 0.85}\) | 0 |
RUBIES-UDS 146995 | 34.331043 | -5.139963 | 3.732 | 25.2\({\pm 1.5}\) | 1403\({\pm 87.0}\) | 7.12\(^{+0.07}_{-0.07}\) | \(<\) 10.8 | 0 |
RUBIES-UDS 147411 | 34.360718 | -5.139081 | 3.966 | 4.16\({\pm 0.31}\) | 1667\({\pm 298}\) | 6.93\(^{+0.16}_{-0.19}\) | \(<\) 8.78 | 0 |
RUBIES-UDS 150323 | 34.417822 | -5.5287732 | 3.618 | 9.20\({\pm 0.32}\) | 1804\({\pm 109}\) | 7.12\(^{+0.06}_{-0.06}\) | 8.94\({\pm 0.54}\) | 1 |
RUBIES-UDS 153207 | 34.493112 | -5.130999 | 3.597 | 177\({\pm 3.6}\) | 1287\({\pm 27.8}\) | 7.42\(^{+0.02}_{-0.02}\) | 10.2\({\pm 0.25}\) | 0 |
RUBIES-UDS 155916 | 34.317031 | -5.127611 | 4.098 | 2.46\({\pm 0.20}\) | 1595\({\pm 154}\) | 6.80\(^{+0.10}_{-0.11}\) | 10.7\({\pm 0.58}\) | 0 |
RUBIES-UDS 172350 | 34.368951 | -5.103941 | 5.580 | 32.0\({\pm 0.71}\) | 1884\({\pm 71.7}\) | 7.63\(^{+0.04}_{-0.04}\) | 8.23\({\pm 0.47}\) | 1 |
RUBIES-UDS 174752 | 34.205808 | -5.100500 | 6.039 | 4.39\({\pm 0.25}\) | 1496\({\pm 110}\) | 7.18\(^{+0.06}_{-0.06}\) | 8.94\({\pm 0.64}\) | 0 |
RUBIES-UDS 182791 | 34.213813 | -5.087050 | 4.718 | 31.4\({\pm 0.41}\) | 2936\({\pm 203}\) | 7.93\(^{+0.02}_{-0.02}\) | 9.00\({\pm 0.58}\) | 1 |
RUBIES-UDS 807469 | 34.376139 | -5.310366 | 6.778 | 4.14\({\pm 0.49}\) | 1667\({\pm 197}\) | 7.19\(^{+0.12}_{-0.14}\) | 8.34\({\pm 0.53}\) | 1 |
RUBIES-UDS 970351 | 34.261900 | -5.105205 | 5.282 | 5.57\({\pm 0.36}\) | 1731\({\pm 251}\) | 7.16\(^{+0.13}_{-0.15}\) | \(<\) 9.95 | 0 |
For this study, we make use of 70 broad-line AGN at \(3<z<7\) that were previously identified with NIRSpec by five studies: [18], [19], [21], [36], and [81]. The coordinates and redshifts of these AGN are listed in Table 1. Five sources reported in [18] and [19] were identified using the CEERS dataset, eight sources from [21] were found in the JADES dataset, and the remaining 57 sources were found using the RUBIES dataset.
The redshift distribution of our broad-line AGN sample is shown in Figure 1. Roughly 43% (30 sources) of our sample appear heavily reddened and are classified as LRDs, based on the criteria developed in [36]; i.e., they have a compact morphology and a “v-shaped" SED such that they are red in the rest-optical and blue in the rest-UV.
While both LRD and non-LRD sources are found over the entire redshift range probed by our sample, their redshift distributions skew in different directions. The LRDs are preferentially found at higher redshifts: 4 are located at \(z<4\) compared to 27 at \(z>4\). On the other hand, the redshift distribution of the non-LRDs peaks at lower redshifts: 19 are located at \(z<4\) and 20 at \(z>4\). We discuss how this redshift distribution might impact our findings in §6.
Finally, the bolometric luminosities, \(L_{\rm Bol}\), of the sample are shown in Figure 2. Here we derive \(L_{\rm Bol}\) from the broad H\(\alpha\) line luminosity (\(L_{\rm H\alpha}\)) as measured in Section 4.1 using the \(L_{\rm H\alpha}\)-\(L_{\rm Bol}\) relation from [83]. The bolometric luminosities range from 10\(^{43}\) to 10\(^{46}\) erg s\(^{-1}\), with a median luminosity of \(1.7\times10^{44}\) erg s\(^{-1}\). The median luminosity of the LRDs in our sample is \(2.1\times10^{44}\) erg s\(^{-1}\), while that of the non-LRDs is \(1.2\times10^{44}\) erg s\(^{-1}\).
In this Section, we detail the methodologies used for the measurement of the BH mass (Sec. 4.1), the morphology of the hosts (Sec. 4.2), and the measurement of the stellar mass (Sec. 4.3). We conclude with a summary description of the methodology used to infer the relation between \(M_{\rm BH}\) and \(M_\star\) as a function of redshift (Sec. 4.4).
We estimate the virial BH masses of our AGN sample assuming that their broadened emission lines trace the kinematics of gas in the broad-line region (see, e.g., [84]). We measure line fluxes and widths in the available NIRSpec G395M/F290LP spectra using a Levenberg-Marquardt least-squares method implemented by the mpfit
IDL code4 as described in [18]. We simultaneously fit multiple Gaussians for features in the Balmer and Paschen
line regions, depending on the redshift of the source. To account for broad components, we fit lines with two Gaussians: one narrow with width \(<350\) km s\(^{-1}\) and one broad with
width \(>350\) km s\(^{-1}\). The line centers, widths, and fluxes are all free parameters and the broad and narrow components can be kinematically offset from each other. When fitting
H\(\alpha\) we also include additional Gaussian components for the [N ii] \(\lambda\lambda\)6550, 6585 doublet. While the H\(\alpha\)and [N ii] \(\lambda\)6585 lines are separated by roughly three times the resolution limit, the lines will blend together in the presence of a sufficiently broad H\(\alpha\)component. We account for this by
constraining the line widths and relative line centers of the [N ii]doublet to that of narrow H\(\alpha\). Four sources in our sample also show blue-shifted absorption in their Balmer and Paschen lines similar to the ones
reported in [20] and [36]. To account for
this, we add an additional absorption component to our fits of these sources. We find the absorption features have a full-width at half maximum (FWHM) in the range 250-500 km s\(^{-1}\).
Examples of our line fits for sources over the full range of redshifts probed by our sample are shown in Figure 3. The broad-line widths and line fluxes that we measure are listed in Table 1. Uncertainties on our line measurements are derived using a Monte Carlo approach. For each spectral element, we perform 1000 random draws from a Gaussian distribution whose mean is set to the measured flux at that wavelength and whose standard deviation is set to the error in that measurement. Our line fitting is then repeated on all 1000 mock spectra, and standard deviations are calculated from the resulting distributions.
Parameter | Prior |
---|---|
Galaxy components: | |
stellar_mass | log-uniform between \(10^5\) and \(10^{15} M_\odot\) |
[sfhdelayed] | |
tau_main | 50, 100, 200, 500, 1000, 3000, |
5000, 7000, 10000 Myr | |
age | 50 to 10000 Myr in 20 log-uniform steps |
sfr_A | 1 |
[bc03] | |
imf | 1 (Chabrier) |
metallicity | 0.02 (solar) |
separation_age | 10 Myr |
[nebular] | |
logU | -2.0 |
f_esc | 0.0 |
f_dust | 0.0 |
lines_width | 300 km/s |
[galdale] | |
alpha | 0.75 to 2.75 in 32 uniform steps |
AGN components | |
[activate] | |
L_AGN | log-uniform between \(10^{38}\) and \(10^{50}\) erg/s |
[activatelines] | |
AFeII | 0.6 to 32 in 10 log-uniform steps |
Alines | 0.3, 0.5, 0.7, 1, 1.5, 2, 4, 10, 20 |
linewidth | 10000 km/s |
[activategtorus] | |
fcov | 0.05 to 0.95 in 18 uniform steps |
Si | -4 to +4 in 40 uniform steps |
COOLlam | 10 to 30 in 20 uniform steps |
COOLwidth | 0.2 to 0.65 in 10 uniform steps |
HOTfcov | 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0 |
1.2, 1.4, 1.6, 2.0, 2.5, 3, 5, 10 | |
HOTlam | 1 to 5.5 in 450 uniform steps of 0.01 |
HOTwidth | 0.2 to 0.65 in 10 uniform steps |
[activatepl] | |
uvslope | 0 |
plslope | -2.7 to -1 in 170 uniform steps |
plbendloc | 50. 80. 90. 100. 120. 150. nm |
plbendwidth | 0.1 to 10 in 10 log-uniform steps |
[biattenuation] | |
E(B-V) | 0.01 to 10 in 80 log-uniform steps |
E(B-V)_AGN | 0.01 to 10 in 80 log-uniform steps |
To calculate BH masses, we follow the methodology of [81] and make use of the single-epoch scaling relationship presented in [85]: \[\label{eq:32GH05} M_{\rm BH} = 3.7 \times 10^6 \left( \frac{L_{\rm H\alpha}}{10^{42}\;{\rm erg\;s^{-1}}}\right)^{0.47} \left(\frac{{\rm FWHM_{\rm H\alpha}}}{10^3\;{\rm km\;s^{-1}}} \right)^{2.06} M_\odot.\tag{1}\] Here \(L_{\rm H\alpha}\) and \({\rm FWHM_{\rm H\alpha}}\) are the luminosity and FWHM of the broad H\(\alpha\) line. For RUBIES-EGS 55604 and RUBIES-UDS 40579, where \({\rm H\alpha}\) is not visible, we make use of the H\(\beta\) and P\(\delta\) line widths and luminosities, respectively, assuming a H\(\beta\)-H\(\alpha\) line ratio of 2.86 and a P\(\delta\)-H\(\alpha\) line ratio of 46.7, consistent with Case B recombination. The [85] relationship has an uncertainty of roughly 0.5 dex [44] which will dominate over our measurement uncertainties. However, since the [85] relationship was derived for AGN at \(z<0.06\), these uncertainties may increase further by applying it to higher redshift sources.
We note that, in calculating our BH masses, we do not correct our line luminosities for dust attenuation. There has been considerable debate in the literature as to whether LRDs are as dust-obscured as their red colors suggest. Instead, it has been proposed that they are enshrouded in dense gas clouds [16], [34], [35], which would help explain their general lack of hot dust emission in the near and mid infrared [86]–[89]. Given this uncertainty and the fact that LRDs comprise a significant portion of our sample, we have opted not to correct their line luminosities for dust extinction. If our broad-line AGN have significant dust attenuation (\(A_{\rm V} > 2\)), our reported BH masses may be underestimated, such that the BHs would be even more overmassive. However, [81] note that the effect of reddening on BH masses computed this way is likely small (\(\sim\)0.19 dex) compared to the \(\sim\)0.5 dex uncertainty from using the locally-calibrated prescription from [85].
We assess the underlying morphology of the AGN host galaxies using two-dimensional surface brightness profile fitting with the GALFIT
software [90]. We carry out these fits in the NIRCam F115W, F150W, F200W, F277W, F356W, and F444W imaging (at their native resolution) to determine if an extended host component is visible and use this knowledge to help inform our
SED fitting. For these fits, we provide GALFIT
with empirical PSFs constructed from the NIRCam imaging in each field and noise images that account for both the intrinsic image noise (e.g., background noise and readout noise) as well as added
Poisson noise due to the objects themselves. We fit each source with a single point source component in each band and examine the residual image for underlying light from the host galaxy. When extended emission is visible, additional Sérsic components are
added to help minimize the flux in our residual images. All neighboring objects down to three magnitudes fainter than the primary source in the fitting band are fit simultaneously using single Sérsic profiles.
Examples of our GALFIT
analysis in the F200W filter for two point-like and two extended sources are shown in Figure 4. The second column from the left shows the residual image after subtracting off a
point-source only model, which can help reveal if any extended component is present, while the last column on the right shows the residual image after subtracting our full, best-fit model.
Roughly half (52.9%) of our sample appears point-like and is best-fit with a point-source only model. The LRDs in our sample are more likely to have a point-like morphology compared to the non-LRD sources. Roughly 87% of the LRDs show no extended emission when fit with a point-source only model. On the other hand, extended host emission is visible for 74% of the non-LRDs.
Although the compact size of LRDs is one of their key characteristics, we find four LRDs in our sample that show signs of a spatially extended host galaxy. These are JADES-GN-954, RUBIES-UDS-40579, RUBIES-EGS-49140, and RUBIES-EGS-50812. Our Galfit model fit for RUBIES-UDS-40579 at z=3.103 is shown in Figure 4. In all four cases, we find that the unresolved point source component begins to dominate our two-component fits at wavelengths longer than F277W, indicating that nuclear light from the AGN is responsible for the bulk of their emission at rest-optical wavelengths.
To determine the host stellar masses of our broad-line AGN sample, we model their SEDs using GRAHSP
[91]. GRAHSP
combines
galaxy stellar population and nebular emission models with a flexible, empirically validated AGN model consisting of power-law continuum emission from an accretion disk, broad and narrow emission lines, and torus emission in the infrared.
GRAHSP
employs a Bayesian approach to constrain model parameters using a nested sampling inference procedure.
For the host stellar population, GRAHSP
makes use of several modules originally developed for CIGALE
[92], [93]. For this study, we adopt a delayed-\(\tau\) star formation history (the sfhdelayed
module), a [94] initial mass function with a solar metallicity (\(Z = 0.02\)), and [95] for the simple stellar population (SSP) module. We add nebular emission using the nebular
module [96] and
account for dust extinction using the biattenuation
module. The latter allows for independent attenuation of the galaxy and AGN components using an SMC-like dust attenuation model.
For the AGN components, we use the activatepl
module to model the power-law continuum emission from the accretion disk, the activatelines
module to account for both narrow and broad AGN emission lines, and the
activategtorus
module to add infrared emission associated with reprocessing of UV light by a dust torus. Attenuation of the resulting AGN emission is handled by the aforementioned biattenuation
module.
Table 3 summarizes the modules used and their parameters. For the galaxy components, these follow previous CIGALE-based fits by [97] and [98], while the AGN components generally follow the ranges
found in [91]. GRAHSP
uses the Ultranest
package to sample from the posterior distribution. Our reported host masses and
uncertainties are computed using the median and standard deviation of 1000 equally weighted samples of the posterior probability distribution returned by Ultranest
.
All sources are nominally fit with both galaxy and AGN components activated. However, if our Galfit
analysis finds the morphology of a source is best fit with a pure point source model and no extended host emission is detected, then we run
GRAHSP
with only the galaxy modules activated and report the resulting mass as an upper limit. This is similar to the approach used by [18] and [19].
An exception to this approach is made for the LRD population. There has been substantial debate in the literature as to the origin of the rest-frame blue UV and red optical emission from these sources, but the prevailing theory is that the UV excess
originates from the host galaxy, while the optical emission originates from the AGN. As discussed in 4.2, we find evidence to support this scenario in the form of extended host emission in the bluest NIRCam bands for four
LRDs in our sample, two of which are at \(z<4\) (see also [99], which present a spatially extended LRD at \(z \approx 2\), and [100] a sample of LRD descendants at \(z < 4\)). Our SED fit
for one of these sources, RUBIES-UDS-40579 at \(z=3.103\), is shown in Figure 5. At higher redshift, LRDs are routinely point-like in NIRCam imaging, yet their SEDs are virtually
identical to their more extended, lower redshift counterparts. As a result, we fit all LRDS with both galaxy and AGN modules in GRAHSP
and report their host masses and uncertainties even when their morphologies are point-like. Examples of our
SED fits for both LRDs and non-LRDs are shown in Figure 5.
The black hole and stellar mass measurements, along with their associated uncertainties, can be used to execute a statistical inference of the evolution of the \(M_{BH}-M_\star\) relation in the redshift range of our sample.
Local scaling relations (see, e.g., [6], [44]) indicate that, typically, \(M_{BH} \sim 10^{-3} \times M_{\star}\), although with an intrinsic scatter of \(\sim 0.5\) dex [44]. Using early JWST data from [21], [101], [19] and [102], [29] found that galaxies hosting broad-line AGN in the redshift range \(4<z<7\) violate the local \(M_{BH}-M_\star\) at \(>3\sigma\), with black holes being overmassive by factors of \(10-100\).
In this paper, we make use of the same Markov-Chain Monte Carlo (MCMC) method developed in [29] to execute a statistical analysis of the relation between black hole mass and host’s stellar mass. This method allows a careful treatment of the mass bias, i.e., the fact that observing overmassive black holes is intrinsically easier because they tend to be more luminous.
In the remainder of this Section, we describe the MCMC model in broad strokes. The interested reader is strongly encouraged to refer to the original study (i.e., [29]), where an in-depth description is provided. The interested reader is also referred to [103], where a detailed description of the likelihood function used in [29] is provided, together with additional methodologies to fit models to data.
The MCMC algorithm uses three main ingredients to find the best model that describes the given \(M_{\rm BH}-M_\star\) data. First, it uses the empirical local \(M_{\rm BH}-M_\star\) relation from [44] as a prior, selected due to its relevance to the sample at hand, which includes galaxies with similar BH mass. Of course, the redshifts of the galaxies in our sample are much higher, while [44] used galaxies with \(z < 0.055\). The adopted relation is: \[\log \left(\frac{M_{BH}}{M_\odot} \right) = \alpha + \beta \log \left(\frac{M_\star}{10^{11} M_\odot} \right) \, ,\] where \(\alpha = 7.45 \pm 0.08\) and \(\beta = 1.05 \pm 0.11\). Second, it uses the JWST sensitivity in H\(\alpha\) for a resolution element of the spectrum, which is then translated into a minimum black hole mass that is thus detectable. Third, it uses the high-\(z\) stellar mass function [105] to account for the fact that low-mass galaxies are intrinsically more numerous than high-mass galaxies.
These ingredients are then fed into the likelihood function, which is central to modeling the high-\(z\) \(M_{\rm BH}-M_\star\) relation while taking into account (severe) observational biases. A detailed description of the likelihood function is provided in [29], to which the interested reader is referred.
The algorithm, when applied to a data set of values (\(M_{\rm BH},M_\star\)), together with their associated uncertainties, provides the posterior distribution functions for three quantities, characterizing the high-\(z\) \(M_{\rm BH}-M_\star\) relation: (i) the slope \(m\); (ii) the intercept \(b\); (iii) the orthogonal variance \(\nu\), which is related to the standard scatter via \(\sqrt{\nu}\sec\theta\), where \(\theta\) is the angle between the \(M_{\rm BH}-M_\star\) relation and the x-axis. Hence, the intrinsic scatter is derived from the data, and not fixed a priori. However, [106] showed that there is no significant redshift evolution in the intrinsic scatter of the original set of overmassive black holes used in [29].
In Figure 6, we show our derived host galaxy stellar masses plotted against the BH masses of our broad-line AGN sample. Also shown are the local scaling relationships of [6] and [44], which are derived for low-redshift, bulge-dominated quiescent galaxies and broad-line AGN, respectively. We use the RV15 relationship as our local benchmark because it was empirically estimated from a sample of broad-line AGN using broad H\(\alpha\) lines to measure BH masses and because the BH mass distribution of the sample is comparable to our own. Consistent with several previous studies, we find that the majority (50/70) of our sources scatter above the local scaling relationship of RV15 and have BH to stellar mass ratios that are 1-2 dex higher than observed their local AGN sample.
We find a redshift trend such that host masses decrease with increasing redshift among sources with the same \(M_{\rm BH}\) (and hence \(L_{\rm bol}\), assuming a similar accretion regime). As a result, sources at \(z<4\) preferentially sit near the KH13 relationship, while those at higher redshifts scatter above it. This is reflected in the BH to stellar mass ratio, which can be seen as a function of redshift in Figure 7. We find that sources at \(z<4\) have a median \(M_{\rm BH}/M_{\rm \star}\) ratio of 0.15%, while this increases to 4% for sources at \(z\sim 5.5\). The ratio in our highest redshift bin (\(6<z<7\)) rises to 7%.
Our findings suggest that the higher-redshift, broad-line AGN in our sample are powered by BHs that are increasingly overmassive relative to their local counterparts. While it is expected that more luminous BHs will preferentially be selected at higher redshifts in a flux-limited survey due to observational biases, an evolving \(M_{\rm BH}/M_{\rm \star}\) ratio is supported by our forward modeling that is designed to take such biases into account (see §5.1).
Instead we propose the change in the \(M_{\rm BH}/M_{\rm \star}\) ratio with redshift is likely related to the increasing fraction of LRDs in our sample at \(z>4\). Of the sources at \(z<4\), 83% (19 sources) are galaxy-dominated, non-LRDs with extended host morphologies. Only 17% (4 sources) are LRDs. On the other hand, at \(z>4\), 55% are LRDs (26 sources) with point-like morphologies. LRDs are generally faint in the UV, which is the emission that we attribute to their host galaxies. As a result, their average host mass is \(\sim1\) dex lower than that of the non-LRD AGN in our sample. That said, even the non-LRDs do show signs of an increasing BH to stellar mass ratio with redshift, as also found by [47]. When we exclude the LRDs, we find that the mass ratio of the non-LRDs at \(z>4\) increases by \(\sim1\) dex compared to their counterparts at \(3<z<4\). We discuss further how the varying composition of our broad-line AGN sample as a function of redshift impacts our findings in §6.
In order to account for selection biases and quantify the redshift evolution of the relationship between \(M_{\rm BH}\) and \(M_{\star}\), we applied the same MCMC model described in [29] to our broad-line AGN sample. Because we observe a sharp increase in the \(M_{\rm BH}/M_{\rm \star}\) ratio at \(z>4\) (see Fig. 7), we have chosen to limit our analysis to sources in the redshift range \(4<z<7\). When run on the AGN in this redshift range, we recover a best-fit relation of \[\log\left(\frac{M_{\rm BH}}{M_\odot}\right) \;=\; b + m\,\log\left(\frac{M_\star}{10^{11}M_\odot}\right)\] with intercept \(b = -2.50 \pm 0.75\) and slope \(m = 1.10 \pm 0.09\). Our best-fit relationship is shown in Figure 8. This inferred relation lies more than \(3\sigma\) above the local RV15 relationship, and is in excellent agreement with the original [29] high-\(z\) result (also shown in Figure 8).
To investigate any redshift trends, we subdivide the sample into four bins (\(3<z<4\), \(4<z<5\), \(5<z<6\), \(6<z<7\)) and rerun the MCMC inference on each bin independently. In this case, we include the data points in the redshift range \(3<z<4\) because they offer an average value of the ratio \(M_{\rm BH}/M_{\star}\) that is closer to the local one. We extract the median and \(1\sigma\) uncertainties of \(b\) and \(m\) in each bin, then plot these eight values as a function of bin center, as displayed in Figure 9.
To quantify the significance of any redshift evolution, we compare the posterior distributions of \(m\) and \(b\) in the lowest (\(3<z<4\)) and highest (\(6<z<7\)) bins. Specifically, for each of 15,000 random draws from the two chains, we record whether the high-\(z\) value exceeds the low-\(z\) value, yielding: \[P(m_{6-7} > m_{3-4}) \;=\; 0.896\,, \qquad P(b_{6-7} > b_{3-4}) \;=\; 0.780\,.\] These probabilities correspond to roughly \(1-2\sigma\) in a Gaussian sense, therefore we interpret this as only marginal evidence for evolution in both slope and intercept across \(3<z<7\).
We can also use this same methodology to examine the evolution of the intrinsic \(M_{\rm BH}/M_{\star}\) ratio over the redshift range of our AGN sample. Calculated for the median \(M_{\star}\) in each redshift bin in Figure 9, we find that \(M_{\rm BH}/M_{\star}\) increases on average by a factor of \(\sim197\) (i.e., \(2.294\) dex) between \(z = 3.5\) to \(z = 6.5\). The uncertainty on this change is \(0.751\) dex (calculated assuming redshift independence), which means this increase is measured with a confidence level of \(> 3\sigma\).
Finally, we can estimate the intrinsic scatter of our inferred \(M_{\rm BH}\)–\(M_{\star}\) relationship based on the best-fit orthoginal variance. We find a typical scatter of \(0.9\) dex in all four redshift bins. This scatter is larger than what was originally inferred in [29] (i.e., \(0.69\)) and does not appear to significantly evolve with redshift, as already suggested in [106].
To summarize, the key findings of this study are the following:
We find that the \(4<z<7\) \(M_{\rm BH}\)–\(M_{\star}\) relationship as probed by our sample of broad-line AGN is \(>3\sigma\) above the local relationship of RV15.
We derive an intrinsic scatter in this relationship of \(0.9\) dex, which does not vary over the redshift range of our sample.
We find that the \(M_{\rm BH}/M_{\star}\) ratio increases on average by \(2.294\) dex from \(z = 3.5\) and \(z = 6.5\) with a confidence level of \(> 3\sigma\).
We find only marginal evidence (\(\sim\)1–2\(\sigma\)) that both the slope and intercept of the \(M_{\rm BH}\)–\(M_{\star}\) increases over the redshift range of our sample.
These data suggest that the \(M_{\rm BH}/M_{\star}\) ratio increases significantly with increasing redshift. At the same time, the intrinsic scatter remains constant, despite being larger than its typical value in the local Universe.
Using a forward modeling approach that accounts for selection biases, we find that the faint, broad-line AGN identified by JWST at \(z>4\) are overmassive by 1-2 dex relative to low-redshift AGN and the resulting \(M_{\rm BH}\)–\(M_{\star}\) relationship is incompatible with the local relationship of RV15 with a confidence level of \(> 3\sigma\).
A contributing factor to the increasing \(M_{\rm BH}/M_{\star}\) ratio that we observe at higher redshifts is the increasing fraction of LRDs in our sample with increasing redshift. While LRDs make up 17% of our sample at \(z<4\), they are the majority of the sample at \(z>4\), consistent with the redshift distribution of photometrically-selected LRDs [36], and with theoretically-derived redshift distributions [55]. Due to their faint rest-frame UV emission, the average host mass of the LRDs in our sample is \(\sim1\) dex lower than that of the non-LRD AGN.
Without the LRDs included, the bulk of the remaining BLAGN in our sample have \(M_{\rm BH}/M_{\star}\) ratios that are largely consistent with the RV15 relationship. The bimodal distribution that we observe in the mass ratios of LRD and non-LRD BLAGN may indicate that LRDs are a specific evolutionary phase of SMBH evolution, where early BH growth has outpaced that of their host galaxies.
That said, there is still significant debate as to the nature of LRDs and which part, if any, of their “v-shaped" SEDs is due to stellar emission. As a result, their host masses may be underestimated. However, if we attribute their rest-optical light to stellar emission, which would make their host masses more consistent with the RV15 relationship, it would result in extreme stellar densities given their measured sizes [88], [107]. Their stellar densities would be higher than the densities observed in any system at any redshift and, in the most extreme cases, would greatly exceed the density necessary for runaway stellar collisions [108]–[110].
Another important caveat is that the BH masses of LRDs may be overestimated for a number of reasons, including if the locally-calibrated virial mass relationship of [85] is no longer applicable at high redshifts, if electron scattering contributes to the broadening of the Balmer lines [111], or if the LRDs have lower bolometric luminosities than commonly assumed [112]. That said, [45] recently reported that their direct measurement of the BH mass in an LRD at \(z=7.01\) is in excellent agreement with the mass inferred from single-epoch virial estimates. This suggests that these relationships may in fact be reliable for LRDs out to the Epoch of Reionization.
Our finding of a statistically significant deviation from the \(M_{\rm BH}\)–\(M_{\star}\) relation of RV15 differs from the recent results from [113], who examined a subset of the sources used in this study. In addition to the larger number of LRDs in our sample, a key difference between the approach of [113] and that of [29], which we use in this study, is the adoption of an Eddington ratio (\(\lambda_{\rm Edd}\)) distribution function derived for bright quasars at \(z\sim6\) that has a mean of \(\lambda_{\rm Edd}\sim0.1\). Instead, our forward modeling approach translates \(M_{\rm BH}\) directly into an observed luminosity, assuming an observationally-motivated value of 1000 km s\(^{-1}\) for \({\rm FWHM_{H\alpha}}\).
Lower values of \(\lambda_{\rm Edd}\) translate into lower luminosities for a given \(M_{\rm BH}\), making lower-mass BHs harder to detect. However, there is growing evidence that LRDs may be accreting at higher Eddington ratios than the \(10\%\) assumed in [113]. First, large collections of LRDs at \(z > 4\) [19], [21], [114] suggest a median \(\lambda_{\rm Edd}\) ranging from \(0.4\) to \(1.1\), i.e., slightly super Eddington, and more than one order of magnitude higher than the assumption in [113]. Additionally, from a theoretical perspective, there is growing evidence that LRDs may accrete close to or above the Eddington limit, as this would help explain many of their observed properties, including their X-ray weakness and weak variability [115]–[119]. Early super-Eddington growth that transitions to a typical AGN phase at lower redshifts would naturally explain the high \(M_{\rm BH}\)–\(M_{\star}\) ratios observed in the LRDs compared to lower redshift AGN [120].
Ultimately, a better understanding of the intrinsic Eddington ratio distribution for these broad-line AGN, and LRDs in particular, is necessary to improve our handling of selection biases. If these sources are accreting with the same Eddington ratios as bright quasars, the offset from the local relationship may indeed be minimal. However, if they are accreting at higher rates, then the completeness correction would be less severe and allow for a higher normalization in the scaling relationship at \(z\sim5\).
That said, a potential issue with the bias-only explanation presented in [113] and [50] is that it would require the intrinsic high-\(z\) \(M_{\rm BH}\)-\(M_{\star}\) relation to have a nearly local normalization but a very large scatter, so that we preferentially detect only the bright, overmassive tail (see also [121]). For instance, [50] explicitly infers \(\sigma\simeq 0.8^{+0.23}_{-0.28}\) dex and a normalization consistent with the local relation, attributing the observed offsets to selection and dispersion rather than true evolution. In that picture, the systems we observe at \(z \gtrsim 4\) that sit \(\sim 1.8-1.9\) dex above the local mean (i.e., a factor of \(\sim 70\) in BH mass, as initially reported by [29]) represent the extreme Gaussian tail, not the bulk population.
For a Gaussian with \(\sigma \approx 0.8\) dex, the fraction above \(\sim 1.85\) dex is only \(\sim 1\%\); i.e., the underlying parent population would have to be \(\sim 100\) times more numerous than the already observed counts to “hide” the remainder. However, the number density of LRDs at \(z\sim5-7\) already accounts for a relatively high fraction (\(\sim1-10\%\)) of the total galaxy population at the bright end (\(M_{UV}\sim-22\); [31], [36], [114]). The observed BH mass function at \(z \sim 5-6\) [81] also already agrees in normalization and shape with physically motivated models and independent AGN luminosity functions , not requiring a large, undetected reservoir. Inflating the true number density by a significant factor to accommodate a bias-only scenario would overproduce the BH mass function relative to those same models and observed luminosity functions.
In addition, [122] recently stacked the spectra of \(\sim 600\) JADES galaxies in the redshift range \(3<z<7\), finding that the faint broad-line AGN population that they recover is still significantly above the local \(M_{\rm BH}\)-\(M_{\star}\) relation. Hence, these observations suggest that we are not missing a significant population of AGN that sit on the local relation.
To understand the implications of a large population of overmassive BHs at high redshifts, in this section we compare our results to what current large-volume cosmological simulations and semi-analytical models (SAMs) predict regarding the existence of overmassive BHs at high redshift, and the corresponding evolution of the \(M_{\rm BH}-M_\star\) relation through cosmic time.
Many cosmological simulations have difficulty reproducing the overmassive BH population observed with JWST without explicitly invoking heavier BH seeds. For example, [123], using the cosmological zoom-in simulation code GIZMO, find that stellar-remnant seeds with standard Bondi-Hoyle accretion fail to grow rapidly enough to reproduce the overmassive BHs identified by JWST at \(z \gtrsim 7\), but heavy seeds (\(\sim10^{5-6} \, M_\odot\)) in dense, biased environments can reach \(M_{\rm BH}\sim10^{7-8} \, M_\odot\) if they accrete efficiently at or near the Eddington limit. Under these conditions, the simulated systems naturally appear overmassive with \(M_{\rm BH}/M_\star\) ratios well above the local relation and comparable to those inferred from our sample. The simulations predict that such overmassive states can persist until at least \(z \sim 5\), after which additional stellar growth or mergers are required to converge onto the local relations.
In addition, [124], using the BRAHMA simulations with heavy seed models that emulate direct collapse black hole (DCBH) formation, find that their most physically motivated models result in BHs at \(z \sim 4\) that fall below our inferred high-\(z\) relation by roughly an order of magnitude. Only when some of their most stringent criteria for DCBH formation are dropped and merger delays between central BHs are assumed to be short (i.e., \(\lesssim 750\) Myr) does the simulation reproduce the overmassive relation observed. In these simulations, BH growth at \(z \gtrsim 4\) is primarily merger-driven; hence, producing many overmassive systems requires permissive heavy seeding, higher initial seed masses, and/or more efficient accretion than in the baseline model. These results imply that persistent, widespread overmassive BHs at high redshift require heavy seeding and that BH mergers proceed efficiently.
Other cosmological simulations, such as Astrid and TNG (see, e.g., [125], [126]), offer a complementary perspective. The redshift evolution of the mean \(M_{\rm BH}-M_\star\) relation is governed by how gas is partitioned between star formation and BH fueling. In these simulations, sources that are overmassive at \(z=4\) typically grow more slowly thereafter and drift toward the mean by \(z\simeq2\), whereas undermassive systems catch up. Thus, large positive offsets are transient rather than a long-lived, ubiquitous population in these models.
Finally, comparisons of Astrid and TNG50 [127] demonstrate that simulations can host strongly overmassive BHs, but at low stellar masses these systems are rare and, at low redshift, their origin is primarily environmental (e.g., tidal stripping in overdense regions) rather than a direct imprint of seeding. At \(z > 6\), overmassive BHs are a signature of heavy seeds (see, e.g., [128]). However, at later times, they are more often linked to suppressed star formation and environmental interactions.
The Dark Sage SAM [127] displays substantially larger scatter in \(M_{\rm BH}\) at fixed \(M_\star\). This broader dispersion arises from SAM choices for BH fueling and quenching that the black hole accretion rate from the star formation rate more readily than in Astrid or TNG, allowing a more numerous population of outliers at early times. SAMs of this kind can therefore more easily accommodate the existence of high-\(z\) overmassive BHs. However, the prevalence and persistence of such offsets depend sensitively on the assumed prescriptions for gas supply, triggering, and feedback.
A complementary theoretical perspective is provided by the feedback-free starbursts (FFB) scenario [129]. In that framework, as evaluated in [130], massive BH seeds form at cosmic dawn by rapid core collapse within the young, rotating clusters that constitute the starbursting FFB galaxies. The high \(M_{\rm BH}/M_{\star}\) ratio is maintained as the clusters and BHs inspiral by dynamical friction to the galactic-disk centers and merge there while overcoming gravitational-wave recoil. This pathway yields typical ratios of \(\sim 0.01\), broadly consistent with the average values we infer at \(z \gtrsim 4\). However, reaching the extreme \(\sim 0.1\) ratios seen in some of our sources may be challenging in this framework.
In this study, we examine the relationship between BH mass and host galaxy stellar mass as probed by faint, broad-line AGN at \(3 < z < 7\). Our sample consists of 70 AGN previously identified in the literature based on their broad emission lines in NIRSpec G395M/F290LP observations from the CEERS, JADES, and RUBIES surveys. Our AGN sample includes both reddened sources classified as LRDs, as well as bluer, non-LRDs in roughly equal number, although the LRDs are preferentially found at higher redshifts and become the majority of the sample at \(z>4\).
We perform emission line fitting using a consistent methodology for the entire sample and infer BH masses using the single-epoch scaling relationship presented in Reines et al. (2013). We assess the underlying morphology of the AGN host galaxies using two-dimensional surface brightness profile fitting and measure host stellar masses with SED fitting using a hybrid model that includes stellar continuum and nebular emission from the galaxy and power-law continuum emission and broad and narrow line emission from the AGN.
We find that the BLAGN at \(3<z<4\) have a median \(M_{\rm BH}/M_{\rm \star}\) ratio consistent with the local scaling relationship of RV15, however at \(z>4\) the mass ratios are 1-2 dex higher than that observed in their low-\(z\) broad-line AGN. We attribute this redshift trend with the increasing fraction of LRDs in our sample at \(z>4\) as their host masses are \(\sim1\) dex lower than the non-LRD AGN in our sample.
Using a forward-modeling Bayesian framework that accounts for uncertainties, intrinsic scatter, and selection effects, we infer an \(M_{\rm BH}-M_{\star}\) relationship from our sample at \(4 < z < 7\) that is inconsistent with the local RV15 relationship at a confidence level of \(>3\sigma\). We derive an intrinsic scatter in this relationship of 0.9 dex, which is higher than the 0.5 dex found locally, but does not vary over the redshift range of our sample. Using this same methodology, we find that the intrinsic \(M_{\rm BH}/M_{\star}\) ratio increases by over 2 dex with a confidence level of \(>3\sigma\) from \(z=3.5\) to \(z=6.5\).
Our findings support a picture in which the BHs powering JWST’s broad-line AGN are genuinely overmassive and become increasingly so with redshift. The presence of a large population of overmassive BHs at high redshifts could be due to initial seeding conditions or efficient BH accretion that outpaces star formation in early galaxies, however most cosmological simulations and SAMs have difficulty recreating such a population without employing the use of a heavy BH seeds.
Given that our results are heavily influenced by the increasing fraction of LRDs in our sample with redshift, we caution that much remains to be determined about these sources. We have worked under the assumption that the blue UV emission of LRDs is due to stellar emission and that the locally-calibrated, single-epoch scaling relationship used to determine our BH masses is valid for LRDs at high redshifts. Ultimately, additional study of LRDs will be needed to determine if these assumptions are valid. Several ongoing spectroscopic follow-up programs and dedicated monitoring campaigns of LRDs should provide a better understanding of their BH masses, accretion rates, and intrinsic luminosities in the near future.
We thank the JADES and RUBIES team for their work in designing and preparing the NIRSpec observations used in this study. This work is supported by NASA grants JWST-AR-02446 and JWST-GO-5718 based on observations made with the NASA/ESA/CSA James Webb Space Telescope. The data were obtained from the Mikulski Archive for Space Telescopes at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-03127 for JWST.
https://archive.stsci.edu/hlsp/jades↩︎
https://jwst-docs.stsci.edu/data-artifacts-and-features/snowballs-and-shower-artifacts↩︎
https://pages.physics.wisc.edu/\(\sim\)craigm/idl/fitting.html↩︎