Mapping Privacy Risk Across the AI Lifecycle

Organizations deploying generative AI often treat it as entirely new privacy risk territory. That framing is incomplete in two ways. First, generative AI shares its lifecycle with traditional machine learning. Second, the established privacy risks of traditional ML have not disappeared just because attention has moved to large language models.

In practical terms, privacy obligations under GDPR and the EU AI Act attach to data flows and processing activities, not to model architectures. The EDPB's Opinion 28/2024 on AI models reinforces this view by assessing privacy risk through the development and deployment phases of AI systems rather than through the model class.¹ Designing safeguards for generative AI solutions starts with understanding which risks emerge at which stage of the lifecycle, and which of those risks are genuinely new.

This piece breaks down the AI lifecycle into four distinct phases: exploration, training, deployment, and inference. For each phase, it identifies the privacy risks at stake, and clarifies what changes with generative AI and what does not.

Whether the system is a fraud detection classifier or a large language model, the lifecycle structure is the same. What changes between them is the shape of the risk that lives in each phase. This view is reflected in NIST's AI Risk Management Framework and its Generative AI Profile, both of which structure risk management around lifecycle stages common to all AI systems.²³ Some frameworks add a fifth phase for monitoring. We treat it separately here to keep the focus on the data and model lifecycle itself.

Each phase carries a distinct risk profile. Treating "AI risk" as a single category obscures more than it reveals. The starting point of any serious privacy assessment is to ask which risks live at which phase.

Figure 1. Privacy risk categories distributed across the four phases of the AI lifecycle.

Phase 1: Exploration

The exploration phase is where teams define a problem, identify candidate datasets, and assess whether an idea is worth pursuing. It is internal, exploratory, and typically does not involve external user interaction.

It is also the phase where governance is weakest. Innovation often runs ahead of formal review. Data gets copied to a sandbox, moved into a notebook, shared with a collaborator, and reused for purposes it was not originally collected for. This happens because validating an idea quickly is the point of the phase. The cost of friction at this stage is felt by data scientists. The cost of weak governance is felt later, usually by privacy and legal teams.

The risks here are foundational rather than dramatic. Dataset lineage gets blurred. Lawful basis for processing becomes harder to establish. By the time a project enters formal development, decisions made informally during exploration are difficult to undo. Many compliance failures further down the pipeline can be traced back to this phase.
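One way to keep lineage from blurring is to record every exploratory copy at the moment it is made. The sketch below is a minimal illustration, not a reference schema: the `DatasetCopy` class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DatasetCopy:
    """One copy of a dataset made during exploration (hypothetical schema)."""
    source: str                          # system the data was taken from
    destination: str                     # sandbox, notebook, shared folder, ...
    original_purpose: str                # purpose the data was collected for
    declared_purpose: str                # purpose of this exploratory use
    lawful_basis: Optional[str] = None   # left empty until someone records it
    copied_on: date = field(default_factory=date.today)

    def open_questions(self) -> List[str]:
        """List the governance questions this copy leaves unanswered."""
        issues = []
        if self.declared_purpose != self.original_purpose:
            issues.append("purpose differs from collection purpose")
        if not self.lawful_basis:
            issues.append("no lawful basis recorded")
        return issues


copy = DatasetCopy(
    source="crm.customers",
    destination="sandbox/notebook-17",
    original_purpose="billing",
    declared_purpose="churn model prototype",
)
print(copy.open_questions())
```

A record like this does not settle whether the reuse is lawful. It only makes the question visible while it is still cheap to answer.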

Phase 2: Training

The training phase uses real-world data to fit and optimize a model. Across model types, the central risk is that the model learns more than it was supposed to.

This takes several concrete forms. Overfitting is the simplest case. The model becomes too specific to its training data and effectively memorizes parts of it. Bias absorption is another. Real-world data carries real-world biases. A hiring model trained on historical resumes will learn the patterns of past hiring decisions, including the biased ones, and will reproduce them unless explicit steps are taken to prevent it. The phenomenon has been studied extensively across the fairness and machine learning literature, with recent work mapping the kinds of bias most relevant to EU AI Act compliance.⁴⁵ Memorization in the strict sense is a third concern. Some models can reproduce parts of their training data verbatim, including personally identifiable information, in response to specific inputs.⁶

The risk that is most often underestimated is hidden correlations. A common assumption is that removing a sensitive attribute from a dataset removes the information it carries. It does not if that attribute is correlated with others that remain. Removing "race" from a dataset, for example, does not prevent a model from learning racial patterns when zip code, surname, or income carry the same signal. The model learns the correlated structure and recovers the original signal anyway. Komiyama and Shimao frame the point directly: simply removing sensitive attributes does not fully solve the problem, because disparate impact arises whenever non-sensitive and sensitive attributes are correlated.⁷
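The proxy effect can be reproduced on synthetic data. The sketch below is illustrative only: the two groups, the four zip codes, and the assumption that 90% of each group lives in its "own" zip codes are all invented for the example. It drops the sensitive column and then tries to recover it from zip code alone.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

ZIPS = {"A": ["10001", "10002"], "B": ["20001", "20002"]}  # invented zip codes


def make_record():
    """Synthetic person: group is the sensitive attribute, zip code is the proxy."""
    group = random.choice(["A", "B"])
    other = "B" if group == "A" else "A"
    # Assumption: 90% of each group lives in that group's zip codes.
    pool = ZIPS[group] if random.random() < 0.9 else ZIPS[other]
    return {"group": group, "zip": random.choice(pool)}


records = [make_record() for _ in range(5000)]
train, test = records[:4000], records[4000:]

# Learn the majority group per zip code from data where the group is known.
votes = defaultdict(Counter)
for r in train:
    votes[r["zip"]][r["group"]] += 1
zip_to_group = {z: c.most_common(1)[0][0] for z, c in votes.items()}

# The "de-identified" release drops the sensitive column but keeps the proxy.
released = [{"zip": r["zip"]} for r in test]
recovered = [zip_to_group[r["zip"]] for r in released]

accuracy = sum(p == r["group"] for p, r in zip(recovered, test)) / len(test)
print(f"sensitive attribute recovered from zip code alone: {accuracy:.0%}")
```

The recovery rate tracks the strength of the correlation that was built into the synthetic data. In real datasets the signal is usually spread across several attributes, which makes it harder to spot and no easier to remove.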

The large number of learnable parameters of generative models amplifies these effects. The underlying risks, however, are not new. They have been extensively studied in traditional ML for years.

Phase 3: Deployment

The deployment phase is where AI risk converges with classical IT and cyber risk. The questions that matter here will be familiar to any infrastructure or security team.

Where is user input stored, and for how long? What retention policy applies, and is it enforced in practice? Are model updates triggered by user inputs (for example, through online fine-tuning or memory features that learn from prior conversations), and if so, are those inputs handled lawfully? Which third-party APIs are called during inference, and what data crosses those boundaries? When these questions go unanswered, the consequences are undeclared retention of personal data, unlawful reuse of data for purposes it was never collected for, and uncontrolled data transfers across organizational boundaries.
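These questions can be turned into a simple inventory check over the data flows of a deployed system. The sketch below is a hypothetical illustration: the `DataFlow` fields and the example flow are assumptions, and each finding is a prompt for review rather than a legal determination.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataFlow:
    """One data flow in a deployed AI system (hypothetical inventory entry)."""
    name: str
    stores_user_input: bool
    retention_days: Optional[int]   # None means no retention policy is declared
    feeds_model_updates: bool       # e.g. online fine-tuning or memory features
    reuse_basis_documented: bool
    leaves_organization: bool       # e.g. a call to a third-party API
    transfer_agreement: bool


def findings(flow: DataFlow) -> List[str]:
    """Map unanswered deployment questions to the risks they leave open."""
    out = []
    if flow.stores_user_input and flow.retention_days is None:
        out.append("undeclared retention")
    if flow.feeds_model_updates and not flow.reuse_basis_documented:
        out.append("reuse for model updates without documented basis")
    if flow.leaves_organization and not flow.transfer_agreement:
        out.append("uncontrolled external transfer")
    return out


chat_log = DataFlow(
    name="chat prompt log",
    stores_user_input=True,
    retention_days=None,
    feeds_model_updates=True,
    reuse_basis_documented=False,
    leaves_organization=False,
    transfer_agreement=False,
)
print(findings(chat_log))
```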

These are not new categories of risk. They are the same categories that apply to any system handling sensitive data. What is new is the specific attack surface that AI-driven patterns create. Adaptive learning from production traffic blurs the line between operational data and training data, since user inputs can end up shaping the next version of the model. Integration with external model providers pushes parts of the inference pipeline outside the organization's direct control, with sensitive data crossing boundaries that an in-house team cannot fully audit. The volume and sensitivity of the inputs and outputs flowing through these pipelines mean that small weaknesses in any link of the chain have outsized consequences.

Phase 4: Inference

The inference phase is where users interact with the deployed model. Risk is two-sided. Data flows in from users (or user systems, such as knowledge bases) in the form of prompts or queries. Data flows out in the form of model responses. The picture here differs meaningfully between non-generative models and large language models.

For non-generative models, the output is bounded. A fraud classifier returns zero or one. A risk-scoring model returns a number in a fixed range. The information returned by the model is therefore limited, but a well-developed body of research has shown that bounded outputs still carry information about training data. Membership inference attacks determine whether a specific record was part of the training set.⁸ For a product recommender system, knowing whether a record was in the training data may not be of concern. For a model trained on cancer diagnoses, it is far from trivial, because membership alone reveals sensitive health information. Attribute inference attacks reconstruct sensitive attributes from model behavior, allowing an adversary with partial knowledge of an individual to recover the rest.⁹
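The intuition behind membership inference can be shown with a toy example. The sketch below uses a simple confidence-threshold attack, which is much simpler than the shadow-model approach in the cited literature. The model is a deliberately overfit stand-in, and the data, the scoring function, and the threshold are all assumptions made for illustration.

```python
import math
import random

random.seed(1)


def sample(n):
    """Draw n synthetic two-feature records from the same distribution."""
    return [(random.random(), random.random()) for _ in range(n)]


members = sample(200)      # records used to "train" the toy model
non_members = sample(200)  # unseen records from the same distribution


def model_score(x):
    """Toy overfit model (assumption): a bounded score in (0, 1] that decays
    with the distance from x to the nearest training record."""
    nearest = min(math.dist(x, t) for t in members)
    return math.exp(-20 * nearest)


THRESHOLD = 0.9  # assumed; a real attacker would have to calibrate this


def guess_member(x):
    """The attacker sees only the score returned for x, never the training set."""
    return model_score(x) >= THRESHOLD


correct = sum(guess_member(x) for x in members)
correct += sum(not guess_member(x) for x in non_members)
accuracy = correct / (len(members) + len(non_members))
print(f"membership guessed correctly for {accuracy:.0%} of records")
```

Real models leak far less cleanly than this toy, but the mechanism is the same: a model that behaves differently on records it has seen gives an observer a signal, even when all it returns is a single bounded number.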

For large language models, the output space is effectively unbounded. The model produces sequences of tokens drawn from a probability distribution over a large vocabulary. As a consequence, the same prompt does not always produce the same answer. This explains several phenomena that are sometimes treated as anomalies. Hallucinations are not bugs in this view: they are probable but factually incorrect sequences.¹⁰ The EDPB's ChatGPT Taskforce report takes the same view from a regulatory angle, noting that the probabilistic nature of LLM output creates compliance obligations under the GDPR data accuracy principle that go beyond ordinary disclaimers.¹¹ Harmful or biased outputs follow the same logic. The output space cannot be tightly constrained.
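A minimal sketch of token sampling makes the point concrete. The prompt, the three candidate tokens, and their scores below are invented for illustration; a real model scores every token in its vocabulary at every step.

```python
import math
import random

# Hypothetical next-token scores for the prompt "The capital of Australia is".
logits = {"Canberra": 2.0, "Sydney": 1.5, "Melbourne": 0.5}


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


probs = softmax(logits)
for token, p in probs.items():
    print(f"{token}: {p:.2f}")

# Sampling the same prompt ten times can yield different continuations.
rng = random.Random(42)
samples = rng.choices(list(probs), weights=list(probs.values()), k=10)
print(samples)
```

Under these assumed scores the incorrect answer "Sydney" carries roughly a one-in-three probability. Nothing malfunctions when it is sampled: the model is returning a probable sequence that happens to be wrong.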

Agentic systems extend this picture further. When models invoke tools, retrieve data autonomously, and chain actions across services, the data flows at inference time become harder to bound. The lifecycle phases still apply to the underlying model, but inference becomes a moving target. We will go deeper on agentic systems in a separate piece.

Two classes of risk coexist in any organization operating large language models, whether standalone or alongside traditional ML systems: the traditional inference attacks discussed above, and the new risks that follow from unbounded probabilistic output. As the next section makes clear, integrating large language models exacerbates existing risks such as memorization, and at the same time introduces new ones.

What Actually Changes with Generative AI

What changes when an organization moves from traditional ML to generative AI follows from properties of the models themselves, not from the lifecycle structure.

The first is scale. Modern large language models have orders of magnitude more parameters than traditional ML systems: enough to encode information we did not intend them to encode, in internal representations that are not directly observable. While this characteristic is common to all deep learning models, the capacity of language models to absorb and retain detail from training data is qualitatively different from previous generations.¹²

The second is the nature of inference itself. A classifier has a small, discrete output space (a label or a score) that can be enumerated and tested. A language model returns a sequence of tokens, each drawn probabilistically from a large but finite vocabulary, so the space of possible sequences is effectively unbounded. This is the source of many of the behaviors that make generative AI distinctive in production: the unpredictability of outputs, the difficulty of guaranteeing safety properties, and the persistence of hallucination as a structural rather than incidental phenomenon.
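A rough count shows how different the two output spaces are. The figures below are assumptions chosen for illustration: a binary classifier, a tokenizer vocabulary of 50,000 tokens, and a response of exactly 20 tokens.

```python
# Assumed figures, for illustration only.
CLASSIFIER_LABELS = 2      # e.g. fraud / not fraud
VOCABULARY_SIZE = 50_000   # hypothetical tokenizer vocabulary
RESPONSE_LENGTH = 20       # tokens in one short response

classifier_outputs = CLASSIFIER_LABELS
# Every position can hold any token, so the count is vocabulary ** length.
llm_outputs = VOCABULARY_SIZE ** RESPONSE_LENGTH

exponent = len(str(llm_outputs)) - 1
print(f"classifier: {classifier_outputs} possible outputs")
print(f"language model: more than 10^{exponent} possible 20-token sequences")
```

Most of those sequences are gibberish that the model would almost never produce, but the count explains why exhaustive testing of outputs, routine for a classifier, is not an option here.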

Figure 2. Privacy risks compared across generative and non-generative AI, by lifecycle phase.

These two properties together change what is possible at inference time. Large language models can reproduce training or fine-tuning data in their outputs, sometimes verbatim, often in contexts where the data was never meant to surface. A model fine-tuned on internal documents can return excerpts of those documents in response to an unrelated user query. A model trained on data containing personal information can emit that information when prompted in ways its developers did not anticipate.¹³ Memorized content can be elicited through natural language and returned as part of a normal response. The boundary between training data and output is more porous than it is in earlier machine learning systems.
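One basic way to look for this kind of leakage is to compare model outputs against the documents that should never surface. The sketch below flags word sequences that appear verbatim in both. The function names, the six-word window, and the example texts (including the fictional employee) are assumptions, and a check like this catches only exact reuse, not paraphrase.

```python
def ngrams(text, n):
    """Return the set of n-word sequences in text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def verbatim_overlap(output, protected_docs, n=6):
    """Return the n-grams of output that appear verbatim in any protected document."""
    out = ngrams(output, n)
    hits = set()
    for doc in protected_docs:
        hits |= out & ngrams(doc, n)
    return hits


# Fictional fine-tuning document and model response.
protected = ["Employee Jane Roe was placed on a performance plan in March 2023"]
response = "Sure. Note that employee Jane Roe was placed on a performance plan last year."

hits = verbatim_overlap(response, protected)
print(f"{len(hits)} overlapping 6-word sequences found")
```

The window size is a trade-off: shorter windows flag common phrases, longer ones miss partial reproduction.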

These are real new risks that coexist with previous ones. Membership inference, attribute inference, and disclosure through model outputs remain live concerns for any deployed model, generative or not.

Implications for Privacy and Governance

Dividing the AI lifecycle into four phases and characterizing the privacy risks specific to each brings three implications into focus.

Privacy risk attaches to phases of the lifecycle, not to "AI" in the abstract. A meaningful safeguard at the inference phase looks different from a meaningful safeguard at the training phase. Treating AI as a single risk category leads to controls that are either too broad to be useful or too narrow to be effective.

The exploration phase is often the least governed and the most consequential for downstream compliance. Data lineage, lawful basis, and the boundaries of permitted use are established or undermined at this stage. Governance frameworks that focus only on production deployment miss the point at which most of the consequential decisions are made.

Generative AI introduces real new risks at training and inference. It does not eliminate the established risks of traditional ML. Organizations deploying both need governance that covers both. A privacy program that has been redesigned around generative AI, and has in the process stopped paying attention to the older categories of risk, is exposed in ways that are easy to miss.

The lifecycle framing is not a complete answer. It is the starting point of any rigorous AI privacy assessment. From there, the work is in the specifics. Which data. Which model. Which deployment context. Which safeguards are appropriate at each stage.

1. European Data Protection Board, Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models, 17 December 2024, https://www.edpb.europa.eu/system/files/2024-12/edpb_opinion_202428_ai-models_en.pdf
2. National Institute of Standards and Technology, Artificial Intelligence Risk Management Framework (AI RMF 1.0), January 2023, https://www.nist.gov/itl/ai-risk-management-framework
3. National Institute of Standards and Technology, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, NIST AI 600-1, July 2024, https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf
4. Salimi, B., Howe, B., and Suciu, D., "Data Management for Causal Algorithmic Fairness," 2019, https://arxiv.org/abs/1908.07924
5. Ceccon, M., Cornacchia, G., Dalle Pezze, D., Fabris, A., and Susto, G. A., "Underrepresentation, Label Bias, and Proxies: Towards Data Bias Profiles for the EU AI Act and Beyond," Expert Systems with Applications, 2025, https://arxiv.org/abs/2507.08866
6. Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T., Song, D., Erlingsson, Ú., Oprea, A., and Raffel, C., "Extracting Training Data from Large Language Models," 30th USENIX Security Symposium, 2021, https://www.usenix.org/conference/usenixsecurity21/presentation/carlini-extracting
7. Komiyama, J. and Shimao, H., "Two-stage Algorithm for Fairness-aware Machine Learning," 2017, https://arxiv.org/abs/1710.04924
8. Shokri, R., Stronati, M., Song, C., and Shmatikov, V., "Membership Inference Attacks Against Machine Learning Models," 2017 IEEE Symposium on Security and Privacy, https://arxiv.org/abs/1610.05820
9. Fredrikson, M., Jha, S., and Ristenpart, T., "Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures," Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015, https://dl.acm.org/doi/10.1145/2810103.2813677
10. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., and Liu, T., "A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions," 2023, https://arxiv.org/abs/2311.05232
11. European Data Protection Board, Report of the Work Undertaken by the ChatGPT Taskforce, 23 May 2024, https://www.edpb.europa.eu/our-work-tools/our-documents/other/report-work-undertaken-chatgpt-taskforce_en
12. Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramèr, F., and Zhang, C., "Quantifying Memorization Across Neural Language Models," International Conference on Learning Representations (ICLR), 2023, https://arxiv.org/abs/2202.07646
13. Chen, X., Tang, J., Li, Y., Pan, X., Yan, M., and Yang, M., "The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks," 2023, https://arxiv.org/abs/2310.15469