Double Anonymization
Two-stage anonymization with provenance tracking for maximum privacy protection and verifiable data transformation.
Overview
MediPact implements double anonymization with provenance tracking to provide maximum privacy protection and verifiable data transformation on the Hedera blockchain. This two-stage approach ensures defense in depth while maintaining data utility for research.
Stage 1: Storage
Optimized for research queries while protecting privacy. Preserves 5-year age ranges and exact dates.
Stage 2: Chain
Maximum privacy for immutable blockchain storage. Further generalizes to 10-year ranges and month/year dates.
Why Double Anonymization?
Defense in Depth
Two layers of protection ensure that if one layer fails, the other still protects patient privacy. Different anonymization strategies at each stage provide comprehensive coverage.
Different Purposes
Storage anonymization is optimized for research queries (preserves detail), while chain anonymization is optimized for privacy on immutable blockchain storage (maximum generalization).
Provenance Tracking
Storing both hashes together on Hedera with a provenance proof allows anyone to verify that the chain hash was derived from the storage hash, providing a complete audit trail and transformation verification.
Compliance Ready
Meets strict regulatory requirements including GDPR and HIPAA. Demonstrates layered privacy protection and exceeds Safe Harbor de-identification standards.
Two-Stage Process
Stage 1: Storage Anonymization
Purpose: Research-Optimized Privacy
Stage 1 anonymization is designed to protect privacy while preserving data utility for research queries.
What Gets Removed
- Patient names
- Patient IDs (original)
- Specific addresses (street, city)
- Phone numbers
- Exact dates of birth
- Exact age (replaced with age range)
What Gets Preserved
- Age Range: 5-year ranges (e.g., "35-39")
- Location: Country, region, district
- Dates: Exact clinical event dates (YYYY-MM-DD), e.g., "effectiveDate"
- Gender: Male, Female, Other, Unknown
- Occupation: Specific categories (e.g., "Healthcare Worker")
- Medical Data: All clinical information intact
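The removal and preservation rules above can be sketched as a small function. This is a minimal illustration, not MediPact's actual implementation: the input field names (`name`, `patientId`, `street`, `city`, `phone`, `dateOfBirth`, `age`) and the helpers `toAgeRange` and `anonymizeForStorage` are assumptions introduced for this example.

```javascript
// Hypothetical Stage 1 (storage) anonymizer. Input field names are assumed.
const DIRECT_IDENTIFIERS = [
  "name", "patientId", "street", "city", "phone", "dateOfBirth", "age",
];

// Map an exact age to a fixed-width bucket, e.g. 37 -> "35-39" for width 5.
function toAgeRange(age, width = 5) {
  const lower = Math.floor(age / width) * width;
  return `${lower}-${lower + width - 1}`;
}

// Drop direct identifiers, replace exact age with a range, keep everything else.
function anonymizeForStorage(record, anonymousPatientId) {
  const result = { anonymousPatientId, ageRange: toAgeRange(record.age) };
  for (const [key, value] of Object.entries(record)) {
    if (!DIRECT_IDENTIFIERS.includes(key)) {
      result[key] = value; // country, region, gender, clinical fields, ...
    }
  }
  return result;
}
```

A deny-list keeps the sketch short; an allow-list of permitted fields would be the more conservative choice, since unknown fields would then be dropped by default.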
Example
    {
      "anonymousPatientId": "PID-001",
      "ageRange": "35-39",
      "country": "Uganda",
      "region": "Central",
      "gender": "Male",
      "occupationCategory": "Healthcare Worker",
      "effectiveDate": "2024-03-15",
      "observationCodeLoinc": "4548-4",
      "valueQuantity": "8.1"
    }
Stage 2: Chain Anonymization
Purpose: Maximum Blockchain Privacy
Stage 2 anonymization applies further generalization specifically for immutable blockchain storage where data cannot be deleted or modified.
Additional Generalizations
- Age Ranges: 5-year → 10-year (e.g., "35-39" → "30-39")
- Dates: Exact → Month/Year (e.g., "2024-03-15" → "2024-03")
- Location: Remove region/district (keep only country)
- Occupation: Further generalize (e.g., "Healthcare Worker" → "Healthcare")
- Rare Values: Suppress values that could identify individuals
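The generalizations listed above can be sketched as a transformation of a Stage 1 record. This is an illustration under stated assumptions: the function names, the occupation mapping table, the suppression threshold `k`, and the "Suppressed" marker are all hypothetical and not taken from MediPact.

```javascript
// Hypothetical Stage 2 (chain) generalizer applied to a Stage 1 record.
// The occupation mapping is an illustrative assumption.
const BROAD_OCCUPATION = { "Healthcare Worker": "Healthcare" };

// Widen a 5-year range to its enclosing 10-year range, e.g. "35-39" -> "30-39".
function widenAgeRange(ageRange) {
  const lower = Math.floor(parseInt(ageRange, 10) / 10) * 10;
  return `${lower}-${lower + 9}`;
}

function anonymizeForChain(storageRecord) {
  // Drop sub-country location fields; other fields are copied or generalized.
  const { region, district, ...rest } = storageRecord;
  return {
    ...rest,
    ageRange: widenAgeRange(rest.ageRange),
    effectiveDate: rest.effectiveDate.slice(0, 7), // "2024-03-15" -> "2024-03"
    // Unmapped categories pass through unchanged in this sketch.
    occupationCategory:
      BROAD_OCCUPATION[rest.occupationCategory] ?? rest.occupationCategory,
  };
}

// Replace values of `field` that occur fewer than k times with "Suppressed".
function suppressRareValues(records, field, k) {
  const counts = new Map();
  for (const r of records) {
    counts.set(r[field], (counts.get(r[field]) ?? 0) + 1);
  }
  return records.map((r) =>
    counts.get(r[field]) < k ? { ...r, [field]: "Suppressed" } : r
  );
}
```

Rare-value suppression needs the whole batch of records, because rarity is a property of the dataset rather than of a single record.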
Example
    {
      "anonymousPatientId": "PID-001",
      "ageRange": "30-39",
      "country": "Uganda",
      "gender": "Male",
      "occupationCategory": "Healthcare",
      "effectiveDate": "2024-03",
      "observationCodeLoinc": "4548-4",
      "valueQuantity": "8.1"
    }
Note: Region/district removed, age range expanded, date truncated to month, occupation generalized.
Provenance Records
What is a Provenance Record?
A provenance record contains both hashes (storage + chain) with a cryptographic proof linking them together, stored immutably on Hedera HCS.
Structure
    {
      "storage": {
        "hash": "abc123def456...",
        "anonymizationLevel": "storage",
        "timestamp": "2024-03-15T10:30:00Z"
      },
      "chain": {
        "hash": "def456ghi789...",
        "anonymizationLevel": "chain",
        "derivedFrom": "abc123def456...",
        "timestamp": "2024-03-15T10:30:00Z"
      },
      "anonymousPatientId": "PID-001",
      "resourceType": "Patient",
      "hospitalId": "HOSP-XXX",
      "timestamp": "2024-03-15T10:30:00Z",
      "provenanceProof": "xyz789abc123..."
    }
Storage Hash (H1)
SHA-256 hash of Stage 1 anonymized data. Used for backend storage verification.
Chain Hash (H2)
SHA-256 hash of Stage 2 anonymized data. Used for immutable blockchain storage.
Provenance Proof
Cryptographic proof linking both hashes together. Proves transformation chain.
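To make the relationship between H1, H2, and the proof concrete, here is a minimal sketch of how such a record might be assembled with Node's built-in crypto module. Two parts are assumptions, since the text above does not specify them: the canonical serialization (JSON with sorted keys) and the proof construction (a SHA-256 hash over both hashes plus the patient ID and resource type, mirroring the arguments of `generateProvenanceProof` in the verification steps). The helper names `canonicalize`, `sha256Hex`, and `buildProvenanceRecord` are hypothetical.

```javascript
const { createHash } = require("crypto");

// Serialize with sorted keys so equal records always hash identically.
// Assumes flat records, as in the examples (no nested objects).
function canonicalize(record) {
  return JSON.stringify(record, Object.keys(record).sort());
}

function sha256Hex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// ASSUMED construction: the real proof format is not documented here.
function generateProvenanceProof(storageHash, chainHash, anonymousPatientId, resourceType) {
  return sha256Hex(
    [storageHash, chainHash, anonymousPatientId, resourceType].join(":")
  );
}

function buildProvenanceRecord(storageRecord, chainRecord, resourceType, hospitalId, timestamp) {
  const storageHash = sha256Hex(canonicalize(storageRecord)); // H1
  const chainHash = sha256Hex(canonicalize(chainRecord)); // H2
  const { anonymousPatientId } = storageRecord;
  return {
    storage: { hash: storageHash, anonymizationLevel: "storage", timestamp },
    chain: {
      hash: chainHash,
      anonymizationLevel: "chain",
      derivedFrom: storageHash,
      timestamp,
    },
    anonymousPatientId,
    resourceType,
    hospitalId,
    timestamp,
    provenanceProof: generateProvenanceProof(
      storageHash, chainHash, anonymousPatientId, resourceType
    ),
  };
}
```

Whatever serialization is used, the producer and every verifier must agree on it exactly, or identical data will yield different hashes.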
Verification Process
Anyone can verify the provenance chain on Hedera HashScan:
1. Origin Verification
Verify both hashes exist and match expected values:
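    const assert = require("assert");
    // Both recorded hashes must match the independently computed values.
    assert(provenanceRecord.storage.hash === expectedStorageHash);
    assert(provenanceRecord.chain.hash === expectedChainHash);
2. Transformation Verification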
Verify chain hash was derived from storage hash:
    assert(provenanceRecord.chain.derivedFrom === provenanceRecord.storage.hash);
3. Provenance Proof Verification
Verify the provenance proof links both hashes:
    const expectedProof = generateProvenanceProof(
      provenanceRecord.storage.hash,
      provenanceRecord.chain.hash,
      provenanceRecord.anonymousPatientId,
      provenanceRecord.resourceType
    );
    assert(provenanceRecord.provenanceProof === expectedProof);
Comparison Table
| Feature | Stage 1 (Storage) | Stage 2 (Chain) |
|---|---|---|
| Age Range | 5-year (e.g., "35-39") | 10-year (e.g., "30-39") |
| Dates | Exact (YYYY-MM-DD) | Month/Year (YYYY-MM) |
| Location | Country + Region + District | Country only |
| Occupation | Specific category | Broad category |
| Purpose | Research queries | Blockchain storage |
| Privacy Level | High | Maximum |
| Data Utility | High (preserves detail) | Medium (generalized) |
Benefits
Double Protection
Two layers of anonymization ensure maximum privacy protection with defense in depth.
Provenance Chain
Verifiable transformation chain on Hedera allows anyone to verify origin and transformation.
Origin Proof
Both hashes prove the same source, providing a complete audit trail for compliance.
Transformation Proof
The derivation of the chain hash from the storage hash is verifiable, proving the transformation chain.
Public Verification
Anyone can verify provenance records on HashScan, ensuring transparency and trust.
Compliance Ready
Meets strict regulatory requirements including GDPR and HIPAA Safe Harbor standards.
HashScan Verification
Each provenance record is stored on Hedera and can be verified on HashScan:
- Visit the HashScan link from the adapter output
- View provenance record JSON
- Verify both hashes (storage + chain)
- Verify derivedFrom link
- Verify provenance proof
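Once the provenance record JSON has been copied from HashScan, the checks in the list above can be run together. The sketch below is one possible way to do that: `verifyProvenanceRecord` and the shape of its `expected` argument are hypothetical, and the proof generator is passed in as a parameter because the actual proof construction is not specified in this document.

```javascript
// Run all three checks and report each result, rather than stopping
// at the first failure. `generateProof` is injected by the caller.
function verifyProvenanceRecord(record, expected, generateProof) {
  const checks = {
    // 1. Origin: both hashes match independently computed values.
    origin:
      record.storage.hash === expected.storageHash &&
      record.chain.hash === expected.chainHash,
    // 2. Transformation: the chain hash points back to the storage hash.
    transformation: record.chain.derivedFrom === record.storage.hash,
    // 3. Proof: the recorded proof matches a freshly generated one.
    proof:
      record.provenanceProof ===
      generateProof(
        record.storage.hash,
        record.chain.hash,
        record.anonymousPatientId,
        record.resourceType
      ),
  };
  return { ...checks, valid: Object.values(checks).every(Boolean) };
}
```

Returning per-check results makes it easier to see whether a failure comes from a hash mismatch, a broken derivedFrom link, or an invalid proof.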