Model Change Management Protocol
- Project: Pickles GmbH AI Governance Framework
- Stage: Stage 4, Monitoring & Operational Controls
- Status: Draft
- Version: v1
- Date: 2026-02-26
- Assumptions: Built on outline assumptions; not verified against real Pickles GmbH data
Purpose
This protocol governs all changes to Pickles GmbH's AI systems — whether initiated internally (prompt redesign, retraining, architectural changes) or externally (third-party model provider updates). It ensures that:
- Changes are tested before production deployment
- Substantial modifications triggering new EU AI Act conformity assessments are identified before deployment
- Clients are notified of changes affecting system behaviour
- A rollback plan exists for every production change
- The Technical Documentation Pack (L2-4.2) stays current
Regulatory basis:
- EU AI Act Article 3(23): definition of substantial modification
- EU AI Act Article 43(4): substantial modification triggers new conformity assessment
- EU AI Act Article 17(1)(a): QMS must include procedures for management of modifications
- EU AI Act Article 9(2): risk management system is an iterative process (changes require risk re-evaluation)
- EU AI Act Article 6(4): self-assessment must be updated if classification changes
- ISO/IEC 42001 Clause 10: continual improvement and corrective action
[ASSUMPTION] The change governance roles and approval thresholds are based on the assumed organisational structure. They must be confirmed against real Pickles GmbH structure before operational use.
1. Change Classification Framework
Every proposed change must first be classified. Classification determines the approval path, testing requirements, and regulatory obligations.
1.1 Change Types
| Type | Description | Examples |
|---|---|---|
| Type A — Prompt redesign | Changes to system prompts, instruction sets, or retrieval query logic — no model weight changes | Revised instruction wording; updated system prompt; modified retrieval parameters |
| Type B — Configuration change | Changes to system configuration, output formatting, safety filters, or operational parameters | Adjusting output length limit; enabling/disabling a feature; changing safety filter threshold |
| Type C — Third-party model update | New version of a third-party model API deployed by the provider | GPT-4o → GPT-4o-mini; Claude 3 → Claude 3.5; model version deprecation |
| Type D — Fine-tuning / retraining | Pickles GmbH fine-tunes, retrains, or trains a model on new data | Domain fine-tuning on new legal corpus; RLHF update |
| Type E — Architecture change | Changes to system architecture, infrastructure, or integration patterns | Switching from RAG to fine-tuning; replacing vector database; new API integration |
| Type F — Provider switch | Replacing the third-party AI model provider entirely | Switching from Provider A to Provider B |
1.2 Substantial Modification Assessment
EU AI Act Article 3(23) defines a substantial modification as a change, made after the system is placed on the market or put into service, that is:
1. Not foreseen or planned in the initial conformity assessment (see L2-4.2 Section 2.6, pre-determined changes); AND
2. Either affects compliance with the Chapter III Section 2 requirements (Articles 8–15) or modifies the intended purpose of the AI system.
Article 43(4): pre-determined changes documented in Annex IV technical documentation (L2-4.2 Section 2.6) do NOT constitute substantial modification.
Substantial modification classification matrix:
| Change Type | Pre-determined? | Affects Ch. III S.2 or intended purpose? | Substantial Modification? |
|---|---|---|---|
| Minor prompt wording adjustment within existing scope | Yes (if documented) | No | No |
| Prompt change that materially alters output type or scope | No | Yes (intended purpose affected) | Yes |
| Configuration change within pre-defined bounds | Yes (if documented) | No | No |
| Configuration change outside pre-defined bounds | No | Depends — assess | Assess — likely Yes |
| Third-party model version update (same capability class) | Possible — if update policy documented | May affect accuracy/robustness (Article 15) | Assess |
| Major model version change (new capability class) | No | Yes — accuracy, robustness, intended purpose | Yes |
| Fine-tuning on new domain data | No | May affect Article 10 data governance compliance | Assess — likely Yes |
| Architecture change affecting core processing | No | Yes | Yes |
| Provider switch | No | Yes — new system with different characteristics | Yes |
[LEGAL REVIEW REQUIRED] Substantial modification assessments for SYS-04 have conformity assessment implications. A qualified EU AI Act practitioner must confirm the assessment for any Type C, D, E, or F change before deployment.
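The two-part test in Section 1.2 can be written as a small predicate. This is a minimal sketch, assuming each input has already been answered with a definite yes or no; the function name and parameters are illustrative and not part of any Pickles GmbH system. Rows marked "Assess" in the matrix still need human and legal judgement, which this sketch does not replace.

```python
def is_substantial_modification(
    pre_determined: bool,
    affects_ch3_s2: bool,
    modifies_intended_purpose: bool,
) -> bool:
    """Apply the two-part test from Section 1.2.

    A change is substantial when it was NOT pre-determined AND it either
    affects Chapter III Section 2 compliance or modifies the intended purpose.
    """
    if pre_determined:
        return False
    return affects_ch3_s2 or modifies_intended_purpose


# Two rows from the classification matrix (hypothetical inputs).
# Minor prompt wording adjustment, documented as pre-determined:
print(is_substantial_modification(True, False, False))   # False
# Prompt change that materially alters output type or scope:
print(is_substantial_modification(False, False, True))   # True
```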
2. Change Request Process
2.1 Initiation
Any change to a Pickles GmbH AI system must begin with a Change Request (CR). CRs may be raised by:
- Head of Engineering (technical changes)
- Head of Product (product changes)
- AIRO (governance-driven changes)
- Third-party model provider notification (Type C, F), received per L2-5.3
Change Request must document:
CHANGE REQUEST — [CR-ID: CR-YYYY-MM-DD-NN]
System affected:
Change type (A/B/C/D/E/F):
Description of proposed change:
Reason for change:
Pre-determined change? (Y/N — reference L2-4.2 Section 2.6 if yes):
Raised by:
Date raised:
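If CRs are tracked in software rather than on paper, the template above maps naturally onto a record type. The sketch below is one possible shape, not an existing Pickles GmbH schema: the class and field names are hypothetical, and treating `NN` as a per-day sequence number from 01 to 99 is an assumption about the CR-ID format.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ChangeType(Enum):
    A = "Prompt redesign"
    B = "Configuration change"
    C = "Third-party model update"
    D = "Fine-tuning / retraining"
    E = "Architecture change"
    F = "Provider switch"


def make_cr_id(raised_on: date, sequence: int) -> str:
    """Build an ID in the CR-YYYY-MM-DD-NN format (NN assumed to be a daily counter)."""
    if not 1 <= sequence <= 99:
        raise ValueError("sequence must fit the two-digit NN field")
    return f"CR-{raised_on.isoformat()}-{sequence:02d}"


@dataclass
class ChangeRequest:
    cr_id: str
    system_affected: str
    change_type: ChangeType
    description: str
    reason: str
    pre_determined: bool
    raised_by: str
    date_raised: date
    pre_determined_reference: Optional[str] = None  # L2-4.2 Section 2.6 entry

    def __post_init__(self) -> None:
        # The template asks for a Section 2.6 reference whenever the answer is "Y".
        if self.pre_determined and not self.pre_determined_reference:
            raise ValueError("pre-determined changes must cite L2-4.2 Section 2.6")


cr = ChangeRequest(
    cr_id=make_cr_id(date(2026, 2, 26), 1),
    system_affected="SYS-04",
    change_type=ChangeType.C,
    description="Provider model version update",
    reason="Provider deprecation notice",
    pre_determined=False,
    raised_by="Head of Engineering",
    date_raised=date(2026, 2, 26),
)
print(cr.cr_id)  # CR-2026-02-26-01
```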
2.2 Initial Triage
The AIRO and Head of Engineering perform initial triage within 2 business days of CR receipt:
| Triage Question | If Yes |
|---|---|
| Is this a pre-determined change (L2-4.2 Section 2.6)? | Expedited path: confirm in writing, then proceed to Testing (Section 4) |
| Is this a Type C third-party update with vendor regression data? | Use vendor data to supplement internal testing; proceed to the Substantial Modification Assessment (Section 3) |
| Does the change affect the intended purpose of the system? | Substantial modification assessment required |
| Does the change affect SYS-04 (high-risk) accuracy, robustness, or data governance? | Substantial modification assessment required; notify Legal |
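The triage table can be read as a short routing function. The sketch below is illustrative only: the names are hypothetical, and it assumes that every change not confirmed as pre-determined goes on to the Section 3 assessment, with the triage answers recorded as the reasons.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TriageAnswers:
    pre_determined: bool
    type_c_with_vendor_data: bool
    affects_intended_purpose: bool
    affects_sys04_compliance: bool  # SYS-04 accuracy, robustness, or data governance


def triage(answers: TriageAnswers) -> List[str]:
    """Return the follow-up actions implied by the triage table."""
    if answers.pre_determined:
        return ["Expedited path: confirm in writing, proceed to testing"]

    actions: List[str] = []
    if answers.type_c_with_vendor_data:
        actions.append("Use vendor regression data to supplement internal testing")

    reasons: List[str] = []
    if answers.affects_intended_purpose:
        reasons.append("intended purpose affected")
    if answers.affects_sys04_compliance:
        reasons.append("SYS-04 compliance affected")
    if not reasons:
        reasons.append("not pre-determined")
    actions.append(
        "Substantial modification assessment required (" + "; ".join(reasons) + ")"
    )

    if answers.affects_sys04_compliance:
        actions.append("Notify Legal")
    return actions


for action in triage(TriageAnswers(False, True, False, True)):
    print(action)
```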
3. Substantial Modification Assessment
For any change not confirmed as pre-determined, complete the following assessment before testing:
SUBSTANTIAL MODIFICATION ASSESSMENT — [CR-ID]
1. Does this change modify the intended purpose of the system?
(Intended purpose is defined in L2-4.2 Section 1.1 for each system)
Yes / No / Uncertain [LEGAL REVIEW REQUIRED if Uncertain]
2. Does this change affect compliance with any of the following?
Article 9 (risk management system): Yes / No
Article 10 (data governance — if retraining): Yes / No
Article 12 (logging capabilities): Yes / No
Article 13 (transparency to deployers): Yes / No
Article 14 (human oversight mechanisms): Yes / No
Article 15 (accuracy, robustness, cybersecurity): Yes / No
3. Was this change foreseen and documented in L2-4.2 Section 2.6?
Yes (reference: ___) / No
4. CONCLUSION
Substantial modification: Yes / No / Requires legal review
If Yes: New conformity assessment required before deployment (Article 43(4))
Approved by (AIRO + Legal): ___ Date: ___
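As a cross-check on a completed form, the conclusion in item 4 can be derived mechanically from items 1 to 3. This is a sketch under stated assumptions: an "Uncertain" answer on intended purpose always routes to legal review, and the article keys are plain strings chosen for illustration. It supports, but does not replace, the AIRO and Legal approval line.

```python
from enum import Enum
from typing import Dict


class Answer(Enum):
    YES = "Yes"
    NO = "No"
    UNCERTAIN = "Uncertain"


def conclude(
    modifies_purpose: Answer,
    article_affected: Dict[str, bool],
    foreseen: bool,
) -> str:
    """Derive the form's conclusion from the answers to questions 1 to 3."""
    if modifies_purpose is Answer.UNCERTAIN:
        return "Requires legal review"
    if foreseen:
        # Documented in L2-4.2 Section 2.6, so the first limb of the test fails.
        return "No"
    if modifies_purpose is Answer.YES or any(article_affected.values()):
        return "Yes"
    return "No"


# Hypothetical completed form: only Article 15 is affected.
answers = {"9": False, "10": False, "12": False, "13": False, "14": False, "15": True}
print(conclude(Answer.NO, answers, foreseen=False))  # Yes
```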
4. Testing Protocol
All changes — regardless of classification — undergo testing before production deployment. The depth of testing scales with change risk.
4.1 Testing Tiers
| Testing Tier | Applies To | Minimum Requirements |
|---|---|---|
| Tier 1 — Smoke test | Type A minor prompt changes; Type B configuration within pre-defined bounds | Run benchmark query suite (n=20); confirm output format and quality unchanged; no regressions on known failure modes |
| Tier 2 — Standard regression | Type A major prompt changes; Type B outside pre-defined bounds; Type C minor model updates | Full benchmark query suite (n=100); citation accuracy check; latency check; human review of 10 sampled outputs by a qualified legal reviewer [ASSUMPTION] |
| Tier 3 — Full regression | Type C major model updates; Type D (fine-tuning); Type E (architecture); Type F (provider switch) | Full benchmark query suite (n=200+); citation accuracy; bias check across document types; latency; safety filter validation; human review of 25+ outputs by qualified legal reviewer; AIRO sign-off before staging deployment |
| Tier 4 — Full regression + third-party validation | Substantial modifications requiring new conformity assessment | All Tier 3 requirements + independent third-party accuracy assessment + new Annex IV technical documentation + conformity assessment before production deployment [LEGAL REVIEW REQUIRED] |
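The tier table can be expressed as a lookup. In this sketch the single `major` flag stands in for "major prompt change", "outside pre-defined bounds", and "major model update" depending on the change type; that simplification, and the function name, are assumptions made for illustration.

```python
def assign_testing_tier(change_type: str, major: bool, substantial: bool) -> int:
    """Map a classified change to a testing tier (1 to 4) per Section 4.1."""
    if substantial:
        return 4  # new conformity assessment required
    if change_type in {"D", "E", "F"}:
        return 3
    if change_type == "C":
        return 3 if major else 2
    if change_type in {"A", "B"}:
        return 2 if major else 1
    raise ValueError(f"unknown change type: {change_type!r}")


print(assign_testing_tier("A", major=False, substantial=False))  # 1
print(assign_testing_tier("C", major=True, substantial=False))   # 3
```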
4.2 Benchmark Query Suite
[ASSUMPTION] The benchmark query suite is a curated set of test queries covering:
- Standard legal research queries across Pickles GmbH's primary practice area coverage
- Known edge cases and failure modes identified in previous incidents and monitoring
- Citation-heavy queries (for accuracy testing)
- Multilingual queries if the system supports multiple languages
- Queries designed to probe bias and differential performance
The benchmark suite is maintained by the Head of Product and updated after every P1/P2 incident and every major model change. [ASSUMPTION]
4.3 Regression Testing Pass Criteria
A change passes regression testing and may proceed to staging only if all of the following criteria are met:
| Criterion | Pass Standard |
|---|---|
| Citation accuracy | ≥95% on benchmark suite and no regression from previous baseline |
| Error rate | ≤2% expert-identified errors in human review sample |
| Latency | P95 response time no more than 10% above previous baseline |
| No new failure modes | No failure mode observed that was absent from the previous baseline |
| Safety filter | No outputs violating content safety requirements |
| Bias check (Tier 3/4) | No statistically significant performance degradation in any tested stratum |
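The pass criteria lend themselves to an automated gate that reports which criteria failed. The sketch below is an interpretation, not a specification: it treats citation accuracy as needing both the 95% floor and no regression, treats latency as a ceiling of 10% above baseline, and assumes the bias check arrives as a pre-computed boolean from a separate statistical test. All field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RegressionResult:
    citation_accuracy: float            # fraction between 0 and 1
    baseline_citation_accuracy: float
    error_rate: float                   # expert-identified errors in review sample
    p95_latency_ms: float
    baseline_p95_latency_ms: float
    new_failure_modes: int
    safety_violations: int
    bias_degradation: bool = False      # result of a separate significance test


def failed_criteria(result: RegressionResult, tier: int) -> List[str]:
    """Return the names of the Section 4.3 criteria that were not met."""
    failures: List[str] = []
    if (
        result.citation_accuracy < 0.95
        or result.citation_accuracy < result.baseline_citation_accuracy
    ):
        failures.append("Citation accuracy")
    if result.error_rate > 0.02:
        failures.append("Error rate")
    if result.p95_latency_ms > 1.10 * result.baseline_p95_latency_ms:
        failures.append("Latency")
    if result.new_failure_modes > 0:
        failures.append("No new failure modes")
    if result.safety_violations > 0:
        failures.append("Safety filter")
    if tier >= 3 and result.bias_degradation:
        failures.append("Bias check")
    return failures


run = RegressionResult(0.97, 0.96, 0.01, 1050.0, 1000.0, 0, 0)
print(failed_criteria(run, tier=2))  # []
```

An empty list means the change may proceed to staging; anything else names the criteria to investigate.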
4.4 Staging Deployment
Before production deployment, all Tier 2+ changes must be deployed in a staging environment for a minimum period:
- Tier 2: 2 business days
- Tier 3: 5 business days
- Tier 4: As required by conformity assessment timelines
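The minimum staging periods are counted in business days, which is easy to miscount across a weekend. The helper below is a sketch that assumes a Monday to Friday working week and ignores public holidays; a real deployment calendar would need the applicable holiday schedule. Function names are illustrative.

```python
from datetime import date, timedelta
from typing import Optional

# Minimum staging periods in business days, per Section 4.4.
MIN_STAGING_BUSINESS_DAYS = {2: 2, 3: 5}


def add_business_days(start: date, days: int) -> date:
    """Return the date that is `days` business days after `start` (Mon-Fri only)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def minimum_staging_end(staging_start: date, tier: int) -> Optional[date]:
    """Earliest date staging may end, or None when no fixed minimum applies.

    Tier 1 has no staging requirement and Tier 4 follows the conformity
    assessment timeline, so both return None.
    """
    days = MIN_STAGING_BUSINESS_DAYS.get(tier)
    if days is None:
        return None
    return add_business_days(staging_start, days)


print(minimum_staging_end(date(2026, 2, 26), tier=2))
print(minimum_staging_end(date(2026, 2, 26), tier=3))
```

The same `add_business_days` helper could also be used for the 2-business-day triage deadline in Section 2.2.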
5. Sign-Off Authority
| Change Tier | Sign-Off Required |
|---|---|
| Tier 1 | Head of Engineering |
| Tier 2 | Head of Engineering + Head of Product |
| Tier 3 | Head of Engineering + Head of Product + AIRO |
| Tier 4 (Substantial modification) | AIRO + CEO + Legal (plus conformity assessment body if applicable) |
No change to SYS-04 (high-risk) may be deployed to production without AIRO sign-off, regardless of tier.
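A deployment gate can enforce the sign-off table together with the SYS-04 rule. This is a minimal sketch: role names are taken from the table, the conformity assessment body is left out because it applies only "if applicable", and the function name is hypothetical.

```python
from typing import Dict, Set

REQUIRED_SIGN_OFF: Dict[int, Set[str]] = {
    1: {"Head of Engineering"},
    2: {"Head of Engineering", "Head of Product"},
    3: {"Head of Engineering", "Head of Product", "AIRO"},
    4: {"AIRO", "CEO", "Legal"},
}


def missing_sign_offs(tier: int, system_id: str, obtained: Set[str]) -> Set[str]:
    """Return the roles that still need to sign off before production."""
    required = set(REQUIRED_SIGN_OFF[tier])
    if system_id == "SYS-04":
        # High-risk system: AIRO sign-off is mandatory regardless of tier.
        required.add("AIRO")
    return required - obtained


# A Tier 1 change to SYS-04 signed only by the Head of Engineering is blocked.
print(missing_sign_offs(1, "SYS-04", {"Head of Engineering"}))  # {'AIRO'}
```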
6. Rollback Plan
Every change deployed to production must have a documented rollback plan approved before deployment.
6.1 Rollback Requirement
ROLLBACK PLAN — [CR-ID]
Previous version/configuration:
Rollback method (how to revert to previous state):
Rollback owner:
Maximum rollback time (from decision to complete):
Rollback trigger conditions:
- Automatic (monitoring threshold breach): [specify]
- Manual (AIRO or Head of Engineering decision): [specify]
Data implications of rollback (any data created during new version that is affected):
Client notification required on rollback? (Y/N):
6.2 Rollback Decision Authority
| Situation | Rollback Decision Authority |
|---|---|
| Automated monitoring alert (L3-6.1) triggering rollback condition | Head of Engineering — may initiate immediately |
| P1 incident linked to recent change | AIRO — may order immediate rollback |
| P2 incident with probable link to recent change | AIRO + Head of Engineering — joint decision |
| Regulatory authority instruction (Article 79) | CEO + Legal — mandatory compliance |
6.3 Post-Rollback Actions
Following any rollback:
1. Incident log opened (minimum P2)
2. Root cause analysis initiated (L3-6.2 RCA template)
3. Technical documentation updated to record rolled-back change (L2-4.2 Section 6)
4. CR closed as failed; new CR required for revised approach
7. Specific Change Scenarios
7.1 Third-Party Model Provider Update (Type C)
When a provider notifies Pickles GmbH of a model update per L2-5.3 Section 7:
| Step | Action | Owner | Timing |
|---|---|---|---|
| 1 | Log provider notification; open CR | Head of Engineering | Day 0 |
| 2 | Review provider release notes; classify as minor or major update | Head of Engineering + Head of Product | Day 1–2 |
| 3 | Assign testing tier (Tier 2 for minor; Tier 3 for major) | AIRO | Day 2 |
| 4 | Run regression testing in isolated environment | Head of Engineering | Per tier requirements |
| 5 | Substantial modification assessment | AIRO + Legal | During testing |
| 6 | Staging deployment and monitoring | Head of Engineering | Post-testing |
| 7 | Sign-off and production deployment | Per Section 5 | After staging |
| 8 | Update L2-4.2 Section 6 (lifecycle change log) | Head of Engineering | At deployment |
| 9 | Client notification if behaviour materially changes | Head of Product | Before or at deployment |
7.2 Prompt Redesign (Type A)
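Minor prompt changes may follow an expedited path if they are documented as pre-determined in L2-4.2 Section 2.6. All other prompt changes follow the standard CR process with Tier 1 or Tier 2 testing, unless the Section 3 assessment finds a substantial modification (for example, a prompt change that materially alters output type or scope), in which case Tier 4 applies.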
7.3 Provider Switch (Type F)
A complete provider switch is always a substantial modification requiring Tier 4 testing and new conformity assessment for SYS-04. Additionally:
- New DPA must be executed with the new provider (L2-5.3)
- §43e BRAO service agreement required
- SCCs assessed if new provider is non-EEA
- Client notification required before switch
7.4 Retraining / Fine-Tuning (Type D)
Any retraining or fine-tuning on new data requires:
- Data governance review under EU AI Act Article 10 (training data quality, bias examination)
- L2-4.2 Section 2.4 (training data) updated
- Bias assessment (M-05) run on new model before production
- Tier 3 or Tier 4 testing depending on scope
8. Documentation Updates Required After Any Change
| Document | Section to Update | Trigger |
|---|---|---|
| L2-4.2 Technical Documentation Pack | Section 6 — lifecycle changes | Every production change |
| L2-4.2 Technical Documentation Pack | Section 2.1 — development methods; Section 2.4 — training data | Type D, F changes |
| L2-4.2 Technical Documentation Pack | Section 2.6 — pre-determined changes | If new change type should be pre-determined in future |
| L3-6.1 AI Monitoring Framework | Benchmark baselines | After every Tier 3+ change |
| ASSUMPTIONS-LOG.md | Update relevant assumptions if architecture confirmed | If change reveals or confirms assumptions |
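The rule-based rows of the table above can be turned into a checklist generator so that no mandatory update is missed at deployment. This sketch covers only the triggers that depend on change type and tier; the Section 2.6 and ASSUMPTIONS-LOG.md rows depend on judgement and are deliberately left to the reviewer. The function name is hypothetical.

```python
from typing import List


def documents_to_update(change_type: str, tier: int) -> List[str]:
    """List the mandatory documentation updates for a production change."""
    updates = ["L2-4.2 Section 6 (lifecycle changes)"]  # every production change
    if change_type in {"D", "F"}:
        updates.append("L2-4.2 Section 2.1 (development methods)")
        updates.append("L2-4.2 Section 2.4 (training data)")
    if tier >= 3:
        updates.append("L3-6.1 benchmark baselines")
    return updates


for item in documents_to_update("D", tier=3):
    print(item)
```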
9. Change Log
| CR-ID | Date | Type | System | Description | Tier | Substantial? | Outcome |
|---|---|---|---|---|---|---|---|
| — | — | — | — | No changes recorded yet | — | — | — |
Document Control
| Field | Detail |
|---|---|
| Document ID | L3-6.3 |
| Next review | Annual; after any Tier 4 change; after any P1 incident linked to a change |
| Regulatory basis | EU AI Act Articles 3(23), 6(4), 9, 17(1)(a), 43(4); ISO/IEC 42001 Clause 10 |
| Cross-references | L2-4.2 (technical documentation), L2-5.3 (vendor update notifications), L3-6.1 (monitoring drift), L3-6.2 (incident response post-change) |
| Assumptions relied upon | A-001, A-004, A-009 |