RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines
October 23, 2025
Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern for operationalizing Large Language Model (LLM) usage in Cyber Threat Intelligence (CTI) systems. However, this design is susceptible to poisoning attacks, and previously proposed defenses can fail in CTI contexts because cyber threat information is often completely new for emerging attacks, and sophisticated threat actors can mimic legitimate formats, terminology, and stylistic conventions. To address this issue, we propose that the robustness of modern RAG defenses can be strengthened by applying source credibility algorithms to corpora, using PageRank as an example. In our experiments on the standardized MS MARCO dataset, we demonstrate quantitatively that our algorithm assigns lower authority scores to malicious documents while promoting trusted content. We also demonstrate proof-of-concept performance of our algorithm on CTI documents and feeds.
LLM, RAG, poisoning, defense, security
Large Language Models (LLMs) have seen immense growth in use due to their extraordinary generative capabilities and diverse applications. The integration of LLMs into cybersecurity operations [1]–[6] marks a paradigm shift, where unstructured Cyber Threat Intelligence (CTI) reports can be ingested and analyzed, and actionable Indicators of Compromise (IoCs) can be extracted and fed into downstream systems. However, the ever-evolving nature of cybersecurity and the huge volumes of data available from multiple threat feeds [7]–[10] mean that a large amount of contextual information relevant to this processing is constantly being generated. This highlights an inherent limitation of LLMs: their lack of direct access to current domain-specific knowledge, and their propensity to hallucinate in contexts where actual data is not available as a point of reference.
To address this issue, Retrieval-Augmented Generation (RAG) [11] has emerged as the dominant architectural pattern to operationalize LLM usage in CTI systems [6], [12], [13]. RAG pipelines dynamically retrieve relevant, up-to-date information from curated knowledge bases or external sources, grounding the LLM’s reasoning and response generation in verifiable evidence. This allows LLMs to access an up-to-date knowledge base in real-time, providing the necessary context to formulate an answer. A RAG-enabled LLM is then able to effectively complete tasks such as summarizing threat actor profiles, linking IoCs, or explaining complex attack techniques.
However, this dependence introduces a new attack vector for infusing misinformation into CTI pipelines. RAG has been known to be susceptible to poisoning attacks, and previous work [14]–[16] has shown that attackers can craft malicious injections that can be optimized to be included in the retrieval with high probability, especially when access to the data source is available. With complete access to open source CTI feeds, adversaries are easily able to inject deceptive elements such as falsified threat reports, manipulated IoCs, misleading vulnerability data, or poisoned mitigation advice into the retrieval system. Since CTI is a shared resource between many organizations, this corrupts the entire security pipeline by amplifying this misinformation. The consequences range from wasted resources by security analysts who may chase phantom threats and deploy ineffective mitigations, to catastrophic misconfigurations and the obscuring of genuine, ongoing attacks [17]. Indeed, if the attacker’s goal is to disrupt operations and cause havoc by misconfiguration of key services, carefully crafted fake CTI could well accomplish that goal by itself. Therefore, ensuring the authenticity of CTI within the RAG pipeline is not merely an academic concern, but a fundamental operational security requirement.
Prior defenses to RAG poisoning have put forward various techniques, such as using modified majority voting schemes, having multiple LLMs verify relevance, adding redundant safe data to the retrieval, clustering retrieved texts in embedding space, and involving humans in the loop [18]. These techniques mainly focus on analyzing the retrieved subject matter. However, these defenses can fail in CTI contexts because cyber threat information is often completely new for emerging attacks, and sophisticated threat actors can mimic legitimate CTI formats, terminology, and stylistic conventions. Thus, we propose that instead of interrogating what the information says, we focus on where it originates and how it propagates.
This approach largely mirrors tactics used in search engine ranking. Given that RAG is largely a search problem, we believe that leveraging the decades of literature on search engine ranking to guard against irrelevant or misleading information is a natural fit. We turn to one of the most well-known ranking algorithms, PageRank [19], which Google used to rank webpages based on inbound hyperlinks. In essence, PageRank measures the prestige of each page and the amount of contextual support it receives from other sources. Documents in a corpus contain content-based links that are analogous to links between webpages. Therefore, we propose that the robustness of modern RAG defenses can be strengthened by applying source credibility algorithms to corpora, using PageRank as an example.
In our work, we introduce a PageRank-derived authority score that we call RAGRank as a novel defense against corpus poisoning attacks. We first build a citation network from a document corpus using explicit citations, LLM-inferred citations, and claim-level entailment. We then incorporate refinements such as time decay to counter bias against recent documents, and author credibility to leverage author reputation. In our experiments, we demonstrate that our algorithm improves accuracy of answers on a standardized dataset, and also show its performance on two CTI misinformation case examples.
Prior Defenses. Prior works have used various approaches to defend against corpus poisoning attacks and to ensure that retrieved content is relevant to the query. In [20], the authors used Natural Language Inference (NLI) techniques to make retrieval more robust, and showed that training the model for the task can further improve performance. RobustRAG [21] employs an isolate-then-aggregate strategy that separates each retrieved document and generates an LLM response from each before aggregating them into a final robust response. Similarly, other works have explored Knowledge Expansion [15], [22], which expands the retrieval context in order to drown out the malicious injection. TrustRAG [23] takes this a step further by identifying malicious documents before retrieval, clustering them in embedding space in order to defeat attacks that generate a large number of poisoned injections that might overwhelm the earlier strategies. The closest related work to our effort is ClaimTrust [24], which uses a modified PageRank-inspired algorithm to propagate trust scores across documents. However, ClaimTrust allows multiple edges between nodes as well as negative-weight edges. We found that this would give false claims high authority if they are mixed in with many true claims, and that negative-weight edges could be abused to downrank legitimate documents.
RAG and Misinformation in CTI Pipelines. Fayyazi et al. [12] showed that RAG performs better for CTI use cases than other methods, and Tellache et al. [13] presented the performance of a CTI RAG pipeline that retrieves relevant documents via CTI APIs from VirusTotal and CrowdStrike to correlate incidents with historical cases. Their results showed strong answer relevance and groundedness, and were validated by cybersecurity professionals. Conversely, Shafee et al. [17] demonstrated how adversarial text generation techniques can create fake cybersecurity information that misleads classifiers, degrades performance, and disrupts system functionality, leading to real-world consequences of wasted time and resources. Huang et al. [25] evaluated how well generated CTI can be detected by humans as well as by traditional Natural Language Processing (NLP) and machine learning techniques, concluding that humans cannot distinguish the generated content regardless of their background, and that a multi-step approach is needed.
A RAG system embeds the knowledge corpus into high-dimensional vectors to capture semantics and stores this information in a vector database. The user’s query is also embedded, and the resultant vector is compared with the corpus embeddings by cosine similarity. The top-\(k\) closest matches are provided as context to the LLM alongside the original query. To formalize this, let \(\mathcal{D} = \{d_1, ..., d_n\}\) be a document corpus, and \(\phi: T \rightarrow \mathbb{R}^m\) be an embedding function that maps text to \(m\)-dimensional vectors. For a user query \(q\), RAG computes: \[\label{eq:relevance} \mathcal{R}_k = \underset{d \in \mathcal{D}}{\text{top-}k}[\texttt{sim}(q,d)] = \underset{d \in \mathcal{D}}{\text{top-}k}[ \cos(\phi(q), \phi(d)) \geq \theta ]\tag{1}\] where \(\mathcal{R}_k\) is the top-\(k\) retrieved context, and \(\theta\) is a minimum similarity threshold. The final output is generated by: \[y = \texttt{LLM}(q \oplus \mathcal{R}_k)\] where \(\oplus\) denotes context concatenation.
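As a concrete illustration, the following Python sketch implements the retrieval step of Equation (1) under the definitions above. It assumes the query and documents have already been embedded (e.g., by any sentence embedder); all names and the default parameters are illustrative, not a fixed implementation.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_top_k(query_vec, doc_vecs, docs, k=5, theta=0.5):
    """Return the top-k documents whose cosine similarity to the
    query meets the minimum threshold theta, as in Eq. (1)."""
    scored = [(cosine(query_vec, v), d) for v, d in zip(doc_vecs, docs)]
    scored = [(s, d) for s, d in scored if s >= theta]   # apply theta
    scored.sort(key=lambda x: x[0], reverse=True)        # highest similarity first
    return scored[:k]
```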
The adversary attempts to corrupt \(\mathcal{D}\) such that irrelevant and malicious information may be retrieved during the retrieval stage. Thus, the adversary constructs malicious documents \(\mathcal{D}_m = \{d_1^m, \dots, d_p^m\}\) targeting a specific query \(q\) by solving: \[\max_{d^m \in \mathcal{D}_m} \;\mathbb{P}[ d^m \in \mathcal{R}_k ] \quad \text{s.t.} \quad \texttt{sim}(q, d^m) \geq \theta\] The attack succeeds when the retrieval set \(\mathcal{R}_k\) for query \(q\) contains a threshold number \(M\) of malicious documents, such that the LLM is forced to give the poisoned context significant consideration when producing its output \(y\): \[\begin{align} y &= \texttt{LLM}(q \oplus \mathcal{R}_k) \quad \text{where} \nonumber \\ \mathcal{R}_k &= \underset{d \in \mathcal{D} \cup \mathcal{D}_m}{\text{top-}k} [\texttt{sim}(q,d)] \quad \text{and} \quad |\mathcal{R}_k \cap \mathcal{D}_m| \geq M \end{align}\] We note that in our threat model for CTI, we might well have situations where \(|\mathcal{R}_k| \ll k\) for new information on emerging threats. This provides the attacker a favorable opportunity to inject malicious information into threat feeds and ensure it gets included in the retrieved context.
Given a document corpus, we explore three different ways to build a graph structure that represents the data set: explicit citations, inferred citations, and claims extracted from the documents. An overview of our approach using claim extraction is shown in Figure 1. Once we have a graph, we calculate the PageRank score over the graph, which is then enhanced with additional measures: time decay, to counter bias against recent documents that could provide important information about emerging threats, and author credibility, to leverage author reputation. Once the user inputs a query, the pertinent documents are extracted based on an enhanced version of Equation (1), in which we augment the existing similarity function with an authority score we call RAGRank to further filter the retrieval set. The final set is then combined with the query and given to the LLM to generate a response. Next, we delve into the details of each step.
We use three different techniques to build the retrieval dataset graph. We describe all three for completeness. However, due to space restrictions, we only present results using the “inferred citations” method in the experiments.
For explicit citations, we create a directed citation graph based on each document’s explicit references. Each document is a vertex in the graph, and a directed edge is added if a document cites another, with the source being the citing document. This is the most straightforward technique for constructing such a graph, and while effective, it raises two issues. First, databases in practical enterprise systems often do not contain explicit citation links. Second, a new document could disprove the findings of an outdated one yet still cite it, which would attribute high credibility to the document with incorrect information.
For inferred citations, we generate a similar graph, but infer the links rather than relying on explicit references. We determine the commonality of topics between two documents by using an LLM (we use Gemma 2-27B) to compare their contents. We prompt the LLM to consider claims and keywords, providing a specific metric of comparison (see Appendix 4 for the full prompt). From this, the LLM determines the degree of commonality between the contents as a value between 0 and 1. This value is used as the weight of the connection between the two documents in the citation network.
We make our inferred citations directional by positing that if the contents of Document \(B\) can be considered a continuation of the discussion presented in Document \(A\), then Document \(B\) cites Document \(A\). The directionality is also constrained by temporal factors: Document \(B\) can only cite Document \(A\) if Document \(B\) was published after Document \(A\). If these conditions are met, we add an edge from vertex \(B\) to vertex \(A\). Also, since LLM inference is expensive, we only query the LLM for document pairs with a cosine similarity higher than \(0.5\), reasoning that documents with lower similarity will likely have a negligible agreement score.
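A minimal sketch of this graph construction is shown below. Here `score_agreement()` stands in for the Gemma 2 prompt that returns a commonality score in [0, 1]; it and the document metadata structure are assumed helpers for illustration, not a fixed API.

```python
import itertools
import numpy as np
import networkx as nx

def _cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

SIM_CUTOFF = 0.5  # skip LLM calls for pairs unlikely to agree

def build_inferred_graph(docs, embeddings, score_agreement):
    """docs: list of {'id', 'date'} dicts; embeddings: id -> vector;
    score_agreement: callable returning the LLM's commonality score in [0, 1]."""
    g = nx.DiGraph()
    g.add_nodes_from(d["id"] for d in docs)
    for a, b in itertools.combinations(docs, 2):
        # Temporal rule: only the later document can cite the earlier one.
        newer, older = (a, b) if a["date"] > b["date"] else (b, a)
        if _cos(embeddings[newer["id"]], embeddings[older["id"]]) < SIM_CUTOFF:
            continue  # low similarity -> negligible expected agreement
        w = score_agreement(newer, older)  # LLM-judged commonality
        if w > 0:
            g.add_edge(newer["id"], older["id"], weight=w)
    return g
```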
This method addresses the issues mentioned above with the explicit citations approach. If newer documents provide conflicting information compared to older documents, the newer information will likely not result in an inferred citation link to the older information, regardless of an explicit citation link between the documents. This also makes our framework extensible to practical RAG systems that lack obvious graphical relationships. However, we do note some drawbacks. We noticed that incorrect connections can still be made between clean and malicious documents, largely because malicious documents still contained factually correct information. We also expect a larger model to do better than the Gemma 2 LLM at inferring citations between related documents.
We also explored another technique for graph creation using claim extraction as an extension of the inferred citations approach. In the NLP context, a claim is generally a unit of text asserting something that could be supported, argued, verified, or classified. We employ a pretrained NLI model (RoBERTa-large-MNLI), which is specifically trained to determine the logical entailment between pairs of claims. For example, claim \(A\) entails claim \(B\) if \(B\) is a necessary consequence of \(A\), akin to a citation, enhancing \(B\)’s credibility. Entailment is thus a stricter form of inferred citation agreement.
We first extract all claims from the documents by prompting our LLM (Gemma 2) to output the claims in a given document. Each claim is then treated as a vertex in the graph, and an edge is added between vertex \(A\) and vertex \(B\) if claim \(A\) entails claim \(B\). This results in a much denser graph, with more granular scoring possible between claims. However, we find that due to the sheer number of claims and their content, they are in some cases far more difficult to use for actual inference, despite strong separation between clean and malicious PageRank scores. There are a number of ways this can be addressed, and we attempted approaches that combine and group claims [26]; however, both techniques require further investigation, and we leave this for future work.
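For illustration, a minimal entailment check with an off-the-shelf MNLI checkpoint might look like the following; the checkpoint name, confidence threshold, and pipeline usage are assumptions for the sketch rather than our exact configuration.

```python
from transformers import pipeline

# Off-the-shelf NLI model; checkpoint name and threshold are illustrative.
nli = pipeline("text-classification", model="roberta-large-mnli")

def entails(premise: str, hypothesis: str, threshold: float = 0.9) -> bool:
    """True when the model labels the claim pair as ENTAILMENT with high confidence."""
    result = nli({"text": premise, "text_pair": hypothesis})
    return result["label"] == "ENTAILMENT" and result["score"] >= threshold

# A claim-graph edge A -> B is added when entails(claim_A, claim_B) holds.
```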
Table 1: APT misinformation test cases, listing the query focus, the correct and poisoned answers, the top-ranked legitimate document and its RAGRank score, the top poisoned RAGRank score, and the outcome.

| Test APT | Query Focus | Correct Answer | Poisoned Info | Top Document (RAGRank Score) | Poisoned RAGRank Score | Outcome |
|---|---|---|---|---|---|---|
| CozyBear | Group affiliation | Russia | China | Wikipedia (0.92) | 0.31 | Correct |
| FancyBear | Organization nature | State-sponsored | Independent firm | Wikipedia (0.89) | 0.29 | Correct |
| LazarusGroup | Attack targets | Financial/crypto | Education/OSS | Wikipedia (0.85) | 0.33 | Correct |
| ScatteredSpider | Group purpose | Cybercriminal | Corporate red team | Conflicting sources* (0.22) | 0.22 | No Answer |
| ScatteredSpider | Term meaning | Cybercriminal group | Desert spider | Cyware Blog (0.94) | 0.25 | Correct |

*This query only retrieved poisoned results.
Since the attacker targets the similarity function \(\texttt{sim}(q, d)\), we seek to augment this formula by adding an authority score \(\alpha(d)\) that is considered along with the cosine similarity result.
Given a graph built using our techniques above, for each node \(d_i\) we start by calculating its PageRank score [19]: \[\alpha(d_i) = \frac{1 - \beta}{n} + \beta \sum_{d_j \in \mathcal{L}_{\text{in}}(d_i)} \frac{w_{ji}}{\sum_{d_k \in \mathcal{L}_{\text{out}}(d_j)}w_{jk}} \alpha(d_j)\] where \(\beta\) is set to \(0.85\), \(\mathcal{L}_{\text{in}}(d_i)\) denotes inbound citations, \(\mathcal{L}_{\text{out}}(d_j)\) denotes outbound citations, \(n\) is the number of nodes and \(w_{ji}\) is the weight of the edge from \(d_j\) to \(d_i\). We then enhance this score with additional post processing.
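Since networkx’s `pagerank()` implements the same damped, weighted recurrence, the base score can be computed directly; note that networkx names its damping factor `alpha`, which plays the role of our \(\beta\). This is a minimal sketch, not a prescribed implementation.

```python
import networkx as nx

def base_authority(g: nx.DiGraph, beta: float = 0.85) -> dict:
    """Return {doc_id: alpha(d)} from the weighted PageRank recurrence.
    networkx's 'alpha' damping parameter corresponds to beta above."""
    return nx.pagerank(g, alpha=beta, weight="weight")
```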
Because newer threat information is especially important, we implement a time decay that scales down the authority score of older documents in proportion to their age, as a fraction of the document’s current authority score.
Let \(t(d_i)\) denote the age of document \(d_i\) and \(r\) be a relevance period after which a document’s authority begins to decline. Define a linear time-decay factor: \[\tau(d_i)=\left\{ \begin{array}{ll} 1 & t(d_i) \leq r \\ \max(0, 1 - \lambda \cdot (t(d_i) - r)) & t(d_i) > r \\ \end{array} \right.\] where \(\lambda\) is a decay rate hyperparameter. The time-decayed authority score is then computed as: \[\alpha'(d_i) = \tau(d_i) \cdot \alpha(d_i)\]
In our testing, we set \(t(d_i)\) to be the age in months and use \(\lambda = 0.01\), but these parameters can be set to age out documents based on the required focus of the system and time spanned by the specific document corpus.
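A direct translation of \(\tau(d)\) follows, with age in months and \(\lambda = 0.01\) as in our testing; the six-month relevance window \(r\) is purely an illustrative default, since the paper leaves it configurable per corpus.

```python
def time_decay(age_months: float, r: float = 6.0, lam: float = 0.01) -> float:
    """Linear decay factor tau(d): full weight within the relevance
    window r, then a linear decline at rate lam, floored at zero."""
    if age_months <= r:
        return 1.0
    return max(0.0, 1.0 - lam * (age_months - r))

def decayed_authority(alpha: dict, age_months: dict) -> dict:
    """alpha'(d) = tau(d) * alpha(d) for every document."""
    return {d: time_decay(age_months[d]) * a for d, a in alpha.items()}
```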
We also consider author credibility in the scenario where a new document may deserve an increased initial authority due to its credible authors. An author’s credibility is the average authority value of that author’s prior documents. Where a document has multiple authors, their credibility values are accumulated for the document, as formalized below. This promotes new documents from historically reputable authors and restricts malicious content from disreputable ones.
Let \(\mathcal{D}_x\) be the set of documents authored by author \(x\). The credibility score for author \(x\) is then defined as: \[C(x) = \frac{1}{|\mathcal{D}_x|} \sum_{d \in \mathcal{D}_x} \alpha'(d)\] The cumulative author credibility for document \(d_i\) is: \[\gamma(d_i) = \sum_{x \in A(d_i)} C(x)\] where \(A(d_i)\) is the set of all authors of document \(d_i\).
We calculate our final score \(R(d_i)\), which we call RAGRank, as the min-max normalized sum of time-decayed rank and author credibility: \[\begin{align} \label{eq:ragrank} R(d_i) &= \frac{s_i - \min_j s_j}{\max_j s_j - \min_j s_j} \quad \text{where} \nonumber \\ s_i &= \alpha'(d_i) + \gamma(d_i) \end{align}\tag{2}\] and \(\min_j\) and \(\max_j\) are computed over all documents \(d_j \in \mathcal{D}\). This yields a score in the range [0,1] for each document.
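Putting the last two pieces together, a sketch of the final score computation (author credibility followed by min-max normalization) could read as follows. The `authors` mapping is an assumed metadata structure for illustration.

```python
from collections import defaultdict

def ragrank(alpha_prime: dict, authors: dict) -> dict:
    """alpha_prime: doc -> time-decayed authority; authors: doc -> list of author names."""
    # C(x): mean time-decayed authority over each author's documents.
    per_author = defaultdict(list)
    for doc, names in authors.items():
        for name in names:
            per_author[name].append(alpha_prime[doc])
    cred = {name: sum(v) / len(v) for name, v in per_author.items()}

    # s(d) = alpha'(d) + gamma(d), where gamma sums credibility over authors.
    s = {doc: alpha_prime[doc] + sum(cred[n] for n in authors.get(doc, []))
         for doc in alpha_prime}

    # Min-max normalize s into [0, 1] to obtain R(d).
    lo, hi = min(s.values()), max(s.values())
    return {doc: (v - lo) / (hi - lo) if hi > lo else 0.0 for doc, v in s.items()}
```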
In our retrieval pipeline, we employ a two-pass ranking strategy to balance relevance and authority. In the first pass, we retrieve the top-\(2k\) (instead of top-\(k\)) chunks most similar to the input prompt based on their cosine similarity as in Equation (1). In the second pass, we re-rank these \(2k\) candidates using their RAGRank score. The top \(k\) chunks from this re-ranked list are then selected as context for the language model. Thus, combining Equations (1) and (2) we get: \[\begin{align} \mathcal{R}_k &= \underset{d \in \mathcal{D}}{\text{top-}k} [ R(d) \cdot \mathbf{1}_{ \{ d \in \mathcal{R}_{2k} \}} ] \quad \text{where} \nonumber \\ \quad \mathcal{R}_{2k} &= \underset{d \in \mathcal{D}}{\text{top-}2k} [ \texttt{sim}(q, d) ] \end{align}\]
This two-stage approach addresses a key limitation of using a single weighted sum of cosine similarity and authority (e.g. \(\omega \cdot R(d_i) + (1 - \omega) \cdot \texttt{sim}(q, d_i)\)). In such formulations, high-authority documents can overshadow more relevant but lower-authority documents. By first filtering for relevance and then re-ranking by authority, we ensure that only semantically related documents are considered for inclusion, while still promoting trustworthy and influential sources.
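The two-pass strategy itself reduces to a few lines, sketched below under the same assumptions as the earlier retrieval example, with RAGRank scores supplied by the scoring pipeline above.

```python
import numpy as np

def two_pass_retrieve(query_vec, doc_vecs, docs, ragrank_scores, k=5):
    """Pass 1: keep the 2k most similar chunks; pass 2: re-rank them by RAGRank."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = sorted(((cos(query_vec, v), d) for v, d in zip(doc_vecs, docs)),
                  key=lambda x: x[0], reverse=True)
    shortlist = [d for _, d in sims[:2 * k]]                       # relevance filter
    shortlist.sort(key=lambda d: ragrank_scores[d], reverse=True)  # authority re-rank
    return shortlist[:k]
```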
We performed experiments using a standard dataset as well as two conceptual cases with CTI misinformation. Some of these results are preliminary, but we believe there is sufficient evidence that the current approach shows promise.
We began our evaluation on poisoned versions of the MS MARCO [27] question answering dataset, as constructed by [15]. We present results ranging from one poisoned document to five poisoned documents for each question in the dataset, with five being the highest level of poisoning in [15]. After constructing the graph using the inferred citation method, we computed RAGRank scores and then asked our LLM to answer all the questions in the dataset with instructions to prioritize high authority documents.
Our accuracy in this experiment is calculated based on whether the LLM found the correct answer in the RAG database or whether it provided the poisoned answer. The results with increasing poisoning levels are shown in Figure 2. Blind accuracy represents the performance without any poisoning protection, whereas the control represents the performance with all poisoning removed from each retrieval. Observe that RAGRank improves the accuracy of the LLM by about 10–15% in most cases, with difficulty increasing along the x-axis. We note that this dataset does not include author and time metadata, so we are unable to use the benefits those enhancements provide.
In this section, we show two proof-of-concept CTI examples. Our corpus is structured to mimic a collection of articles and threat reports from cybersecurity organizations (e.g., CrowdStrike, Darktrace) and other public news and data about threat actors (e.g., Wikipedia, Reuters) that may be ingested into a CTI database. After processing, we end up with about 300 excerpts used to build our graph.
In the first scenario, an attacker tries to drown out previously known information in a threat database using poisoned blog posts and articles. To test such a scenario, we create test cases that each contain a query about a known APT and a list of ten poisoned chunks that attempt to trick the RAG into retrieving them so their contents may manifest in the final LLM inference output. In total, we accumulate the five test cases shown in Table 1, which lists the APT, the focus of the query, the correct answer, the incorrect answer encouraged by the attacker, the scores of the highest-ranked retrieved documents (both legitimate and poisoned), and the outcome. For example, in the first row the query asks about the affiliation of the CozyBear APT, with the poisoned samples attempting to get the LLM to answer with China; however, the LLM correctly deferred to Wikipedia as the main source. Most cases were similar, except for test four, where, even though the LLM only retrieved poisoned documents in the first pass, it refused to provide a conclusive answer because none of the documents were authoritative.
Figure 3 compares the similarity and RAGRank scores for the retrieved chunks across the test cases. Notice that each test case retrieves a good number of poisoned samples; in fact, the CozyBear and ScatteredSpider cases return only one real article each. The poisoned chunks also have high similarity scores, as they are directly relevant to the query, but have low RAGRank scores, which ensures the LLM bases its answer on the high-authority document(s).
In the second scenario, an attacker tries to front-run posts about their infrastructure before an attack. In our experiment, the attacker creates a fictional blog (see Appendix B) in a public feed that states that a particular domain (updates-winsecure[.]com) is benign. However, this is a domain that the attacker has set up for a future malware campaign, and the goal is to poison any future RAG system that queries for information about this domain. This is an interesting case, as we do not expect any real documents to have information about a new domain. Table 3 shows the poisoned chunks extracted from the attacker’s blog, as well as the similarity and RAGRank scores for the top-ranked retrieved document from a query about the domain. Notice that without RAGRank, the poisoned document, despite its low RAGRank score, is the top contender, whereas with our addition a different, legitimate document is ranked first. Thus, a hypothetical SOC analyst querying our RAG system receives very different responses with and without our protections. Without them, the attacker successfully poisons the RAG system with just a single poisoned example. With RAGRank enabled, the RAG system correctly informs the user that there is little authoritative information about the given domain.
Table 3: Front-running scenario: poisoned chunks from the attacker’s blog, and the top-document scores and LLM results with and without RAGRank.

| Poisoned chunks extracted from attacker’s blog |
|---|
| P1. “After investigation, we conclude updates-winsecure[.]com is a benign CDN endpoint used by a legitimate browser extension for update checks.” |
| P2. “VirusTotal shows 0/98 hits for the domain with no reputation flags.” |
| P3. “No adversary infrastructure, anomalous behavior, or overlapping IOCs observed.” |
| P4. “Suppress alerts for this domain unless new anomalies appear.” |
| | Without RAGRank | With RAGRank |
|---|---|---|
| Top Document Score | Similarity = 0.84 (RAGRank = 0.05) | Similarity = 0.68 (RAGRank = 0.21) |
| LLM Result | Benign: “The most relevant information comes from statement P1, statement P2, and statement P3, which collectively indicate that updates-winsecure[.]com is considered benign based on investigations, VirusTotal scans, and the absence of indicators of malicious activity.” | Unknown: “The most relevant documents have low authority values, which reduces their reliability according to the rules provided. Therefore, based on the available documents and following the instruction to prioritize high authority values and relevance, I must conclude that I don’t know if updates-winsecure[.]com has been associated with known threat actors or malware campaigns due to the lack of reliable information from high-authority sources.” |
In this paper, we introduced RAGRank, a source credibility scoring technique for countering the poisoning of information in a CTI RAG system. Our method works by computing an authority score for each document in the RAG database, which can be used in combination with the retrieval similarity score as a second filtration step to ensure that only high-credibility documents are used as sources. We showed its performance on the standardized MS MARCO dataset, as well as on two proof-of-concept cases using CTI articles and threat reports.
We have identified several avenues to extend this work:
1) Perform more rigorous experimentation on larger CTI RAG databases, with validation by domain experts.
2) Combine our three graph-building approaches into one technique, which could provide more robust authority scoring.
3) Improve our claim extraction process by investigating the use of hierarchical summaries [26], where many agreeing claims are grouped together and summarized before retrieval, offering broader context while preserving only well-supported details.
4) Explore additional metadata options for building accurate source credibility. Two simple yet promising examples are the follower/repost counts of a social media source and the domain type of a blog source (e.g., .edu and .gov may be more credible than .com).
5) Study possible attacks against RAGRank, which could be similar to attacks against PageRank: for example, a long-term strategy where a malicious actor releases credible information to build authority over time, then releases something malicious. We plan to assess the level of developed authority necessary to counterbalance unsupported malicious information and to explore strategies to mitigate such attacks.