peopleanalyst

AI Human Interaction Guide · Part VI of VII

Governance, Privacy, and Compliance

Regulatory and ethical scaffolding for enterprise AI — the EU AI Act, executive orders, privacy regimes, and the calibration questions underneath.


6.1 The governance landscape — regulatory + organizational + operational

AI governance in 2026 operates at three layers, each with its own institutional history and methodology:

Regulatory layer. External legal and regulatory frameworks that constrain how enterprises deploy AI. The 2024-2026 landscape includes the EU AI Act (risk-tiered regulation of AI systems; entered into force August 2024 with phased compliance deadlines through 2027); various US federal and state-level AI executive orders and regulations; sector-specific regulation in financial services, healthcare, and employment contexts; and emerging case law on AI liability, including the Air Canada chatbot ruling (BCCRT, February 2024), in which the tribunal held the airline liable for inaccurate statements made by its customer-service chatbot.[1] The regulatory landscape is fragmented across jurisdictions and evolving fast enough that any reference treatment will quickly be out of date in its specifics; the guide covers the general shape.

Organizational governance layer. Internal enterprise frameworks for AI oversight — AI ethics committees; AI review boards; risk-management processes; vendor-evaluation criteria; deployment-approval gates. The organizational layer is where enterprises operationalize regulatory requirements plus their own additional risk-management discipline. The conventional structure (governance committee + policy + review processes) is well-established; the discipline of making it actually work is less so.

Operational layer. Day-to-day discipline by which AI systems get deployed, monitored, and corrected — usage policies; access controls; deployment approval workflows; monitoring instrumentation; incident-response procedures. The operational layer is where governance succeeds or fails. Policies that exist on paper without operational implementation are common; operational discipline without policy backing tends to be inconsistent.

The guide's framing borrows from Part V §5.5's design-constraint-vs-risk-management distinction: governance can operate as downstream risk management (governance committees reviewing rollouts after design decisions; usage policies catching problems after deployment) or as upstream design constraints (design vetoes that prevent problematic features from being built in the first place; deployment gates that prevent problematic features from reaching production). Both are necessary; the upstream variant is what Part V §5.5 advocates and what most enterprise governance frameworks under-emphasize.

The chapter walks privacy (§6.2), ethics (§6.3), auditability (§6.4), and the institutional question (§6.5) in turn. Each surfaces specific aspects of how the methodology gap from Part I §1.3 interacts with governance practice.


6.2 Privacy and the AI substrate

AI's privacy implications are categorically different from prior enterprise software's privacy implications, in three specific ways:

The training-data privacy problem. Foundation models learn from data; the data may include personal information that the model can subsequently surface. Frontier-model labs apply filtering and alignment to reduce direct disclosure (the model resists when asked to recite training data verbatim), but the underlying data is encoded in the model's parameters in a form that is difficult to fully scrub. Enterprise deployments that use foundation models against their own data, whether through fine-tuning, retrieval-augmented generation (RAG), or in-context learning, inherit a similar concern: data the enterprise wants to keep confidential can leak into the model's outputs in ways the conventional access-control framework does not catch.

The corrective is methodological: data minimization at the boundary where data enters the AI system; access control on retrieval rather than only on the model; and instrumentation that detects when outputs contain information that wasn't supposed to be in scope. The People Analytics Toolbox's data-anonymizer spoke is one production-grade implementation of this discipline. It combines deterministic HMAC-keyed tokenization, k-anonymity min-N gates, and a substitution-strategy registry, and any spoke that surfaces rollups passes through it before responding.[2]
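As a minimal sketch of data minimization at the boundary, the Python below pairs deterministic HMAC-keyed tokenization with a field allow-list. It is not the data-anonymizer spoke's actual code: the function names (`tokenize`, `minimize`), the field names, the token prefix, and the hard-coded key are all assumptions made for illustration.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, prefix: str = "tok_") -> str:
    """Map an identifier to a stable pseudonym with HMAC-SHA256.

    The same (key, value) pair always yields the same token, so joins across
    tables still work, but the token cannot be recomputed without the key.
    """
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).hexdigest()
    return prefix + digest[:16]  # truncation length is an illustrative choice


def minimize(record: dict, key: bytes, id_fields: set, allowed_fields: set) -> dict:
    """Tokenize identifiers, pass allow-listed fields, drop everything else."""
    out = {}
    for field, value in record.items():
        if field in id_fields:
            out[field] = tokenize(str(value), key)
        elif field in allowed_fields:
            out[field] = value
        # Fields on neither list never reach the AI system.
    return out


# Hypothetical key and record; a real key would come from a secrets manager.
SECRET_KEY = b"example-key-do-not-use-in-production"
record = {"email": "Ada@Example.com", "team": "platform", "salary": 120000, "score": 4.2}
safe = minimize(record, SECRET_KEY, id_fields={"email"}, allowed_fields={"team", "score"})
print(safe)
```

Using an allow-list rather than a block-list means a newly added column is excluded by default, which matches the minimization posture described above.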

The inference-from-aggregates problem. AI systems can produce inferences about individuals from data that does not directly identify them. A team-level engagement report can yield individual-level inferences if the team is small enough or the system has enough auxiliary information. The classical min-N privacy gate (don't surface team aggregates for teams below a certain size) is necessary but not sufficient against AI-augmented inference attacks; the more robust correction is to enforce min-N at the substrate level, on the data the AI system can access rather than only on the data it surfaces.
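One way to picture substrate-level enforcement is to apply the gate before any text is assembled for the model, so that small cohorts never enter its context at all. The sketch below assumes a hypothetical threshold (`MIN_N = 5`) and hypothetical helper names (`team_rollups`, `build_model_context`); it illustrates the pattern and is not taken from any of the systems named in this guide.

```python
from collections import defaultdict
from statistics import mean

MIN_N = 5  # hypothetical threshold; the right value is a policy decision


def team_rollups(rows, min_n=MIN_N):
    """Aggregate (team, score) rows, suppressing teams smaller than min_n."""
    by_team = defaultdict(list)
    for team, score in rows:
        by_team[team].append(score)
    return {
        team: {"n": len(scores), "mean_score": round(mean(scores), 2)}
        for team, scores in by_team.items()
        if len(scores) >= min_n
    }


def build_model_context(rows, min_n=MIN_N):
    """Render only gated rollups into the text the model is allowed to read."""
    rollups = team_rollups(rows, min_n)
    return "\n".join(
        f"{team}: n={r['n']}, mean={r['mean_score']}"
        for team, r in sorted(rollups.items())
    )


rows = [("platform", 4.0)] * 6 + [("legal", 2.0)] * 3
context = build_model_context(rows)
print(context)  # the three-person legal team is absent from the context
```

A gate like this does not by itself stop differencing across overlapping cohorts (for example, comparing a rollup with and without one person), which is part of why the text calls min-N necessary but not sufficient.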

The cumulative-disclosure problem. Conversational AI systems accumulate user information over long sessions. Information a user discloses incrementally across many turns can be aggregated by the system into a profile that the user did not intend to create. The disclosure is voluntary at each step; the cumulative profile is not what the user thought they were creating. The methodology gap from Part I §1.3 — software-fails-by-stopping vs AI-fails-by-performing-confidently — extends to privacy: the cumulative-disclosure failure is invisible to the user at the time it happens.

The regulatory frame around AI privacy is evolving fast. GDPR's right-to-explanation provisions interact with AI's substrate-opacity (§6.4) in ways that regulators are still working out. Edwards and Veale's 2017 Duke Law and Technology Review analysis remains the load-bearing examination of why the right-to-explanation framing produces less practical remedy than its drafters anticipated.[3] The EU AI Act's risk-tiering creates compliance obligations that operationalize differently for high-risk vs lower-risk AI systems; US state-level privacy regulation continues to fragment. Enterprise deployments that anticipate the trajectory of regulation rather than reacting to each new requirement tend to do better operationally.


6.3 Ethics in deployment — decision-stakes, paternalism, autonomy

The ethical questions around AI deployment cluster around three load-bearing tensions:

Decision-stakes and accountability. When an AI-augmented decision affects a person's career, financial circumstances, healthcare, or legal status, the methodology gap from Part I §1.3 has ethical weight beyond its operational weight. An AI system that produces confident-looking outputs that are wrong in ways the user cannot easily detect is one kind of problem in customer service (cumulative trust cost; §3.3); it is a different kind of problem in hiring (a candidate is rejected based on AI analysis the hiring manager did not verify) or medicine (a treatment is selected based on AI recommendation the physician did not interrogate).

The structural correction at the ethics layer is the same one Part V §5.5 advocates at the design layer: instrumentation that surfaces calibration mismatches between AI confidence and ground-truth accuracy. The ethics dimension adds the additional requirement that the human-in-the-loop reviewer must have the capability and incentive to actually use the instrumentation — Part IV §4.5's rubber-stamping failure mode is an operational failure with ethical consequences when the decisions are consequential.

Paternalism vs autonomy. The calibration-of-personalization literature treats the question explicitly. When does AI's adaptation to a user's preferences support the user's autonomy (the user can do what they want more easily) vs undermine it (the user is shaped by the AI's adaptation in ways they didn't consent to)? The literature draws on Sunstein's #Republic on group polarization and on Conly's Against Autonomy on paternalism, with the AHI program's review applying the framing specifically to conversational AI.[4]

The guide's position (extending Part V §5.6 Position 2): in domains where the AI's outputs affect the user's reasoning or judgment, the system should default to surfacing alternatives, naming the AI's confidence, and resisting reasoning-personalization. The default toward agentic alignment-to-user-preference is the failure mode the sycophancy literature documents; the corrective is structural-input patterns + calibration instrumentation + explicit resistance to the AI taking on the user's framing.

Bias and fairness. The bias-amplification literature (Part V §5.2.5) documents that AI systems amplify biases present in their training data and in their human-feedback loops. Glickman and Sharot's 2024 Nature Human Behaviour finding that the amplification operates bidirectionally, and that the human side of the loop sometimes carries the larger amplification, sharpens the policy implication: bias-mitigation methodology that operates only on the AI's outputs misses the dynamics inside the human-AI loop that drive the longitudinal trajectory of bias.[5] The fairness implications cluster around two questions:

  • Does the AI system produce outcomes that are equitable across protected demographic categories?
  • Does the AI system contribute to longitudinal trajectories of inequity through its outputs and the human responses they shape?

The first question has substantial methodology behind it (fairness metrics; disparate-impact analysis; counterfactual fairness frameworks). The second is much less methodologically developed; it depends on instrumentation that tracks longitudinal outcomes for populations affected by AI-augmented decisions, which is the same instrumentation Part V §5.2.5 names as needed for the bias-amplification loop.
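To make the first question concrete, the sketch below computes a disparate-impact ratio: the lowest group selection rate divided by the highest. A ratio below 0.8 is the "four-fifths" rule of thumb from US employment-selection guidelines; it is a screening heuristic, not a legal determination. The function names and the sample data are assumptions for illustration, and `ratio_by_period` is only a crude pointer toward the second, longitudinal question.

```python
def selection_rates(outcomes):
    """outcomes: iterable of (group, selected) pairs -> {group: selection rate}."""
    totals, selected = {}, {}
    for group, was_selected in outcomes:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}


def disparate_impact_ratio(outcomes):
    """Lowest group selection rate divided by the highest."""
    rates = selection_rates(outcomes)
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any positive outcomes")
    return min(rates.values()) / highest


def ratio_by_period(records):
    """records: iterable of (period, group, selected) -> {period: ratio}."""
    by_period = {}
    for period, group, was_selected in records:
        by_period.setdefault(period, []).append((group, was_selected))
    return {p: disparate_impact_ratio(o) for p, o in sorted(by_period.items())}


# Hypothetical screening outcomes for two groups.
outcomes = (
    [("A", True)] * 50 + [("A", False)] * 50
    + [("B", True)] * 30 + [("B", False)] * 70
)
ratio = disparate_impact_ratio(outcomes)
print(f"ratio={ratio:.2f}, below four-fifths threshold: {ratio < 0.8}")
```

Tracking the ratio per period shows whether a gap is widening, but it says nothing about why; attributing a trend to the human-AI loop needs the richer outcome instrumentation the text describes.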

The AHI program's children-with-AI commissioning decision is also a useful reference for ethics-in-deployment thinking: the program took a deliberate posture (Option A: narrow scoping note + bibliography; no public position on contested policy-adjacent domains) rather than rushing to public position-taking on a topic where the empirical record was insufficient and the political stakes were high.[6] The discipline (take positions where the empirical record supports them; defer where it doesn't) is the disposition the guide advocates for ethics-in-deployment generally.


6.4 Auditability and the substrate-opacity problem

A specific tension governance frameworks have struggled with: AI systems' substrate (the training data; the alignment process; the fine-tuning data; the architecture decisions) is typically proprietary to the vendor and opaque to the enterprise deploying the system. Auditability — the ability to verify that a system did what it said it did, with the data it said it used, in the way it said it operated — is structurally limited.

The classical software-auditability framework relies on source-code inspection, deterministic test cases, and reproducible builds. None of these tools work well on foundation-model AI:

Source-code inspection is partial. Frontier-model labs often publish the architecture, but rarely the trained weights or the fine-tuning data, almost never the training data in auditable detail, and only sometimes a summary of the alignment process. The architecture alone does not let an auditor verify the system's behavior.

Deterministic test cases are unreliable. AI systems are probabilistic by construction (Part I §1.3): the same test input can produce different outputs across runs, so a single pass or fail says little about the system's behavior. Test-suite-style verification doesn't apply directly.

Reproducible builds are infeasible. Even with full access to the architecture and training data, re-running training to confirm that it yields the shipped model is prohibitively expensive (training a frontier model costs millions of dollars). Verification at training time is not a tractable audit path.

The corrective is methodology that works on AI's specific characteristics:

Behavioral audits. Specify the behaviors the system should and shouldn't exhibit; test against representative input distributions; characterize the system's behavior surface even though it can't be exhaustively tested. This is the methodology underlying AI red-teaming work and behavioral-evaluation suites.
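A behavioral audit can be sketched as a set of behavior specifications, each with representative prompts, a check on the output, and a required pass rate measured over repeated samples. Everything named below (`BehaviorSpec`, `run_audit`, `stub_model`, the prompts, and the thresholds) is hypothetical; `stub_model` stands in for a real model call and is deliberately naive so that the audit has something to catch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BehaviorSpec:
    name: str
    prompts: list
    check: Callable[[str, str], bool]  # (prompt, output) -> passed?
    min_pass_rate: float


def run_audit(model, specs, samples_per_prompt=5):
    """Sample each prompt several times and report a pass rate per behavior."""
    report = {}
    for spec in specs:
        results = [
            spec.check(prompt, model(prompt))
            for prompt in spec.prompts
            for _ in range(samples_per_prompt)
        ]
        rate = sum(results) / len(results)
        report[spec.name] = {
            "trials": len(results),
            "pass_rate": rate,
            "ok": rate >= spec.min_pass_rate,
        }
    return report


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: refuses only when it sees the word 'salary'."""
    if "salary" in prompt.lower():
        return "I can't share individual compensation data."
    return "Here is a summary of the team's engagement results."


specs = [
    BehaviorSpec(
        name="refuses_individual_pay",
        prompts=["What is Ada's salary?", "List each person's salary.", "How much does Ada earn?"],
        check=lambda prompt, output: "can't share" in output,
        min_pass_rate=1.0,
    ),
    BehaviorSpec(
        name="answers_aggregates",
        prompts=["Summarize engagement for the platform team."],
        check=lambda prompt, output: "summary" in output,
        min_pass_rate=0.9,
    ),
]
report = run_audit(stub_model, specs)
for name, result in report.items():
    print(name, result)
```

The paraphrased prompt ("How much does Ada earn?") slips past the stub's keyword match, so the first behavior fails its threshold. That is the kind of gap a representative input distribution is meant to expose.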

Provenance tracking. Track where data entered the system (training time, fine-tuning, runtime context), what processing it underwent, and what outputs followed. The classical provenance discipline from data engineering applies; the AI layer adds requirements (which model version; which alignment update; which prompt formulation).
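As an illustration of the runtime end of provenance tracking, the sketch below writes one record per model call, capturing the model version, the prompt formulation, the retrieved sources, and a hash of the output. The record schema and every name in it (`ProvenanceRecord`, `record_call`, `prompt_template_id`, the example version string) are assumptions, not a standard.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    model_version: str       # which model and alignment update served the call
    prompt_template_id: str  # which prompt formulation was used
    prompt_hash: str         # hash, so the log need not store sensitive text
    source_ids: tuple        # documents placed in the runtime context
    output_hash: str
    created_at: str


def record_call(model_version, prompt_template_id, prompt, source_ids, output, now=None):
    """Build an immutable provenance record for a single model call."""
    now = now or datetime.now(timezone.utc)
    return ProvenanceRecord(
        model_version=model_version,
        prompt_template_id=prompt_template_id,
        prompt_hash=sha256_hex(prompt),
        source_ids=tuple(sorted(source_ids)),
        output_hash=sha256_hex(output),
        created_at=now.isoformat(),
    )


def to_log_line(rec: ProvenanceRecord) -> str:
    """Serialize with sorted keys so identical records produce identical lines."""
    return json.dumps(asdict(rec), sort_keys=True)


def matches_output(rec: ProvenanceRecord, output: str) -> bool:
    """Check whether a claimed output is the one this record describes."""
    return rec.output_hash == sha256_hex(output)


rec = record_call(
    model_version="example-model-2026-01",  # hypothetical version label
    prompt_template_id="engagement-summary-v3",
    prompt="Summarize engagement for the platform team.",
    source_ids=["doc-7", "doc-1"],
    output="Engagement is stable quarter over quarter.",
    now=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
print(to_log_line(rec))
```

Storing hashes rather than raw text keeps the audit log from becoming a second copy of the sensitive data, at the cost of needing the original text to verify a match.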

Calibration auditing. Test the AI's confidence outputs against ground truth on representative tasks. A model that says it is 90% confident should be right about 90% of the time on a sample where the ground truth is known. The methodology is mature in classical machine learning: Guo et al.'s 2017 On Calibration of Modern Neural Networks is the canonical reference for the finding that modern networks are systematically miscalibrated, which drives the field's calibration-correction methods.[7] Foundation-model calibration auditing is an active research area; the substrate-opacity problem makes the verification harder than the classical case.
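The 90%-confident-should-be-right-90%-of-the-time test is commonly summarized as expected calibration error (ECE): bin predictions by stated confidence, then take the sample-weighted average gap between each bin's mean confidence and its accuracy. The sketch below is a plain-Python version with equal-width bins; the bin count and the bin-edge handling are illustrative choices, and the sample data is made up.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean of |mean confidence - accuracy| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the top bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Ten answers, all stated at 90% confidence (hypothetical data).
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
overconfident = expected_calibration_error([0.9] * 10, [True] * 6 + [False] * 4)
print(f"well calibrated: {well_calibrated:.3f}, overconfident: {overconfident:.3f}")
```

With nine of ten correct the gap is essentially zero; with six of ten correct the model claims 0.9 and delivers 0.6, for an ECE of about 0.3. A real audit needs enough samples per bin for the accuracy estimates to be stable.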

Deployment-context auditing. Audit the operational context of the AI's use rather than only the model itself. Who's using it for what? What are the human-in-the-loop patterns? What's the rubber-stamping rate (Part IV §4.5)? The audit shifts from "is the model correct?" to "is the system as deployed producing the outcomes the governance framework specifies?"
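A rubber-stamping rate can only be measured once it has an operational definition. The sketch below assumes one possible definition, which may differ from the one in Part IV §4.5: a review counts as rubber-stamped when the reviewer accepts the AI recommendation unchanged in less than a minimum review time. The `ReviewEvent` fields and the 30-second threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReviewEvent:
    reviewer: str
    ai_recommendation: str
    final_decision: str
    seconds_spent: float


def rubber_stamp_rate(events, min_seconds=30.0):
    """Share of reviews accepting the AI recommendation in under min_seconds."""
    if not events:
        raise ValueError("no review events")
    stamped = sum(
        1
        for e in events
        if e.final_decision == e.ai_recommendation and e.seconds_spent < min_seconds
    )
    return stamped / len(events)


def override_rate(events):
    """Share of reviews where the human decision differs from the AI's."""
    if not events:
        raise ValueError("no review events")
    return sum(1 for e in events if e.final_decision != e.ai_recommendation) / len(events)


# Hypothetical review log from a hiring workflow.
events = [
    ReviewEvent("r1", "reject", "reject", 5.0),    # fast agreement: stamped
    ReviewEvent("r1", "reject", "reject", 120.0),  # slow agreement: not stamped
    ReviewEvent("r2", "advance", "reject", 90.0),  # override
    ReviewEvent("r2", "advance", "advance", 8.0),  # fast agreement: stamped
]
print(rubber_stamp_rate(events), override_rate(events))
```

Time spent is a weak proxy for scrutiny, so a rate like this is better read as a trigger for closer review of a workflow than as a verdict on individual reviewers.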

The EU AI Act's high-risk-AI requirements explicitly call for some of this discipline (technical documentation; risk management systems; transparency to deployers; human oversight; accuracy + robustness + cybersecurity requirements). Operational implementation lags the regulatory ask in most enterprises in 2026; closing that gap is one of the active investment areas for AI governance practice.


6.5 The institutional question — who governs AI inside an enterprise

A recurring failure mode in enterprise AI governance: the institutional placement of AI-governance responsibility produces structural mismatches with where the governance work actually needs to happen.

The conventional placements include:

AI ethics committee + Chief AI Officer. A senior committee setting policy; a Chief AI Officer holding accountability. This is the most common 2025-2026 structure. It works well when the CAO has both the formal authority of the role and the informal-network reach to influence operating businesses' deployment decisions; it works badly when the CAO has the title but not the reach. (Part II §2.5 covers this asymmetry.)

IT-led governance. AI governance run by the CIO / CTO organization, often as an extension of existing data-governance practice. This works for the substrate-side concerns (privacy, security, infrastructure) and works less well for the deployment-side concerns (calibration, ethics, decision-quality), which need stakeholder participation IT typically doesn't have.

Risk-led governance. AI governance run by the Chief Risk Officer or general counsel, often as an extension of operational-risk practice. This works for compliance and regulatory-alignment concerns and works less well for the design-constraint upstream work (§6.1 framing) — risk frameworks tend to evaluate after design rather than constrain during design.

Distributed governance. No single institutional owner; AI-governance responsibility is distributed across functional leads (HR for workforce AI; CMO for marketing AI; CRO for customer-facing AI; CIO for infrastructure). This works when the functional leads have aligned principles and good cross-functional coordination; it produces fragmentation when they don't.

The guide's observation: the four placements are tools, not solutions. The institutional question is downstream of a more fundamental question — what discipline does the enterprise want to enforce on AI deployments — and the institutional placement should be chosen to support that discipline, not to substitute for it.

A specific implication: the four non-negotiable failure modes from Part V §5.4 (output-only optimization; over-automation; weak measurement; ignoring categorical differences) operate as design constraints in the AHI program's institutional structure. If an enterprise commits to similar constraints, the institutional placement is downstream — whoever has the authority to veto features that violate the constraints is where the governance responsibility sits, whether that's a CAO, a Chief Risk Officer, a CTO, or a distributed structure with explicit veto authority at the right points.


6.6 Part-end glossary, bibliography, and cross-references

Glossary

Auditability. The ability to verify that an AI system did what it claimed to do, with the data it claimed to use, in the way it claimed to operate. Structurally limited for foundation-model AI; addressable through behavioral audits, provenance tracking, calibration auditing, and deployment-context auditing.

Behavioral audit. An audit methodology that characterizes an AI system's behavior surface against representative input distributions, rather than exhaustively testing for correctness on specific inputs.

Calibration auditing. A specific behavioral-audit methodology testing whether the AI's stated confidence matches ground-truth accuracy on representative tasks.

Cumulative disclosure. The privacy failure mode in which a user discloses information incrementally across many AI interactions, with the cumulative profile substantially exceeding what the user would have disclosed if asked directly.

Design constraint vs risk management. The structural distinction in governance posture: design constraints prevent problematic features from being built (upstream); risk management evaluates rollouts after design (downstream). Both are necessary; the upstream variant is less common and more load-bearing.

EU AI Act. The European Union's risk-tiered regulation of AI systems (entered into force August 2024; phased compliance deadlines through 2027). High-risk AI deployments face technical-documentation, risk-management-system, transparency, human-oversight, and accuracy / robustness / cybersecurity requirements.

k-anonymity. A privacy property in which each individual's record in a released dataset is indistinguishable from those of at least k-1 other individuals on the attributes that could identify them. The min-N gate is its common operational form for aggregated reporting.

Min-N gate. The privacy-discipline enforcement of a minimum cohort size (typically 5, 7, or 10) below which aggregated outputs are not surfaced. Necessary but not sufficient against AI-augmented inference attacks.

Paternalism vs autonomy. The ethics-in-deployment tension between an AI system's adaptation to user preferences supporting the user's autonomy vs undermining it. The reasoning-personalization failure mode from Part V §5.2.1 sits at this tension.

Protected feedback. A substrate primitive (vs configuration option) that ensures workforce feedback cannot be used to identify individual respondents, score individuals, or support surveillance/disciplinary/retaliation actions. Implemented in Performix as a cross-cutting capability that every other capability passes through.

Provenance tracking. A discipline of tracking where data entered an AI system, what processing it underwent, and what outputs followed. Necessary for any meaningful auditability of AI deployments.

Substrate opacity. The structural condition of foundation-model AI in which the training data, alignment process, and other formative inputs are proprietary to the vendor and largely undisclosed to the enterprise deploying the system.

Bibliography (Part VI)

European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, July 2024.

Logg, J. M., Minson, J. A., & Moore, D. A. Algorithm Appreciation: People Prefer Algorithmic to Human Judgment. Organizational Behavior and Human Decision Processes, 2019.

Sharma, Mrinank, et al. Towards Understanding Sycophancy in Language Models. International Conference on Learning Representations (ICLR), 2024.

Sunstein, Cass R. #Republic: Divided Democracy in the Age of Social Media. Princeton University Press, 2017.

Thaler, R. H., & Sunstein, C. R. Nudge: Improving Decisions about Health, Wealth, and Happiness. Yale University Press, 2008.

AHI program reviews:

  • calibration-of-personalization.md — the load-bearing review for the paternalism-vs-autonomy framing and the reasoning-personalization corrective
  • conversation-analysis-ethnomethodology.md — the methodological-discipline review at the AHI program's deepest layer
  • children-with-ai-commissioning-decision.md — the no-public-position discipline as governance posture example

People Analytics Toolbox data-anonymizer spoke — the substrate primitive implementation of the protected-feedback / privacy-gate pattern.

Performix protected-feedback capability — the cross-cutting suppression-gate-as-substrate pattern.

Cross-references

Concept introduced here | Where it gets fuller treatment
Design-constraint upstream governance | Part V §5.5 (the AHI program's product-feature-as-design-constraint examples)
The protected-feedback principle | Performix product card; Part V §5.5 (suppression-gate-as-substrate)
Substrate opacity and auditability | Part I §1.4 (the data substrate underneath modern AI)
The reasoning-personalization corrective in ethics contexts | Part V §5.6 Position 2
Rubber-stamping in human-in-the-loop architectures | Part IV §4.5; Part V §5.5
The four non-negotiable failure modes as design constraints | Part V §5.4
The institutional placement question | Part II §2.5 (organizational design); Part VII §7.2 (the network-topology lens on org structure)

Footnotes

  1. Moffatt v. Air Canada, 2024 BCCRT 149. British Columbia Civil Resolution Tribunal, February 14, 2024. The Tribunal awarded the passenger roughly $812 CAD (damages plus interest and fees) after finding that the airline's chatbot gave inaccurate information about its bereavement-fare policy and that the airline was liable for negligent misrepresentation; it rejected the argument that the chatbot was a separate entity responsible for its own statements. Treated in fuller detail in Appendix A §A.2 as an early reference case for AI customer-service liability.

  2. People Analytics Toolbox data-anonymizer spoke. Documented at peopleanalyst.com/research/pa-platform/. PII detection, deterministic HMAC-keyed tokenization, k-anonymity min-N gate, substitution-strategy registry. Every spoke that surfaces team-level rollups calls min-N-check before responding.

  3. Edwards, Lilian, and Michael Veale. Slave to the Algorithm? Why a Right to an Explanation Is Probably Not the Remedy You Are Looking For. Duke Law and Technology Review 16, no. 1 (2017): 18-84. The analysis traces why GDPR Article 22's right-to-explanation provisions produce less practical remedy than their drafters intended, with substrate-opacity (the architecture's relevant inputs are not legibly explanatory) as the structural reason.

  4. AHI program review at content/research/ai-human-interaction/sources/topic-reviews/calibration-of-personalization.md. The paternalism / autonomy framing draws on Sunstein (#Republic; Nudge with Thaler) and Conly (Against Autonomy); the review's analytical contribution is applying these frameworks to conversational AI specifically.

  5. Glickman, Moshe, and Tali Sharot. How Human-AI Feedback Loops Alter Human Perceptual, Emotional and Social Judgements. Nature Human Behaviour, 2024. The bidirectional-amplification finding (that the human side of the loop sometimes carries the larger amplification) is treated in fuller depth in Part V §5.2.5.

  6. AHI program decision document at content/research/ai-human-interaction/children-with-ai-commissioning-decision.md. Three commissioning options articulated (A: narrow scoping note + bibliography; B: full review + position paper; C: bounded review with explicit-deferral discipline). Option A chosen as the disciplined posture for a domain with insufficient empirical record and high political stakes.

  7. Guo, Chuan, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger. On Calibration of Modern Neural Networks. International Conference on Machine Learning (ICML), 2017. The foundational paper showing that modern deep neural networks are systematically overconfident, and introducing temperature-scaling as a simple post-hoc calibration correction. The methodological starting point for any contemporary calibration-auditing work.
