API Integration Guide

Learn how to integrate piisafe.eu's PII detection system with cloak.business and anonym.legal APIs. Complete reference with examples, rate limits, and error handling.

Overview

piisafe.eu integrates with two enterprise-grade PII detection APIs to scan websites for exposed personal information:

  • cloak.business: Enterprise solution with 320+ entity types and advanced language support
  • anonym.legal: Starter-friendly API with 285+ entity types and accessible pricing

Both APIs use hybrid detection combining machine learning models with deterministic regex patterns for highly accurate PII identification across 48+ languages.

Note: All API communication happens client-side in the browser. API keys are stored locally and never transmitted to piisafe.eu servers. This is zero-knowledge architecture.

API Providers Comparison

cloak.business

Enterprise Detection

Entity Types: 320+
Presets: 27 (GDPR, HIPAA, PCI-DSS, etc.)
Languages: 48+
Detection Method: ML + Regex hybrid
Character Limit: 50,000 per call
Rate Limits: Token-based; higher limits on premium tiers
Coverage: 70+ countries
Token Pricing: Pay-as-you-go

anonym.legal

Starter-Friendly Solution

Entity Types: 285+
Presets: 24 (GDPR, HIPAA, regional)
Languages: 48+
Detection Method: ML + Regex hybrid
Character Limit: 50,000 per call
Rate Limits: Vary by tier (Starter tier from €3/month)
MCP Server: Available for integration
Entry Cost: Free tier (10 API calls/month), then €3/month Starter

Detection Capabilities

Entity Category   | cloak.business       | anonym.legal         | Examples
Identifiers       | ✓                    | ✓                    | SSN, Tax ID, Passport, Driver License
Financial         | ✓                    | ✓                    | Credit Card, IBAN, SWIFT, Bank Account
Contact           | ✓                    | ✓                    | Email, Phone, Address, IP Address
Biometric         | ✓                    | ✓                    | Fingerprint, Facial Recognition, DNA
Healthcare        | ✓                    | ✓                    | Medical Records, Insurance ID, Prescription
Regional Specific | ✓ (All 48 languages) | ✓ (All 48 languages) | German Tax, French CNI, Italian CODICE

Getting Started

Step 1: Get an API Key

For cloak.business:

  1. Visit https://cloak.business
  2. Sign up for an account
  3. Navigate to "Account" → "API Keys"
  4. Generate a new API key
  5. Copy the key to your clipboard

For anonym.legal:

  1. Visit https://anonym.legal
  2. Start with the free tier or choose a pricing plan
  3. Access "Settings" → "API Credentials"
  4. Your API key is auto-generated
  5. Copy the key to use in piisafe.eu

Security: Never share your API key. Store it securely. In piisafe.eu, it's encrypted and stored only in your browser's localStorage.

Step 2: Enter API Key in piisafe.eu

  1. Go to https://piisafe.eu/scanner.html
  2. Click "Scanner" in the navigation
  3. Select your API provider (Step 1)
  4. Enter your API key (Step 2)
  5. Click "Validate Key" to confirm

Step 3: Configure Detection Settings

After validation, choose:

  • Compliance Preset: GDPR, HIPAA, PCI-DSS, CCPA, or custom
  • Language: 48+ languages for region-specific patterns
  • Entity Threshold: Confidence score (60-95%)
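
To illustrate what the Entity Threshold setting does, here is a minimal sketch that drops findings below the chosen confidence. The finding shape (`type`, plus `score` as a 0-1 fraction) and the function name are assumptions made for illustration; neither is a documented response format of cloak.business or anonym.legal.

```javascript
// Hypothetical finding shape: { type: string, score: number from 0 to 1 }.
// The threshold is clamped to the 60-95% range offered in the settings.
function filterByThreshold(findings, thresholdPercent) {
  const clamped = Math.min(95, Math.max(60, thresholdPercent));
  const minScore = clamped / 100;
  return findings.filter((finding) => finding.score >= minScore);
}

const sample = [
  { type: 'EMAIL', score: 0.99 },
  { type: 'PHONE', score: 0.72 },
  { type: 'PERSON', score: 0.55 },
];

// Keeps EMAIL and PHONE; PERSON falls below the 70% threshold.
console.log(filterByThreshold(sample, 70).map((f) => f.type));
```

A lower threshold surfaces more candidates (and more false positives); a higher one keeps only the findings the model is most sure about.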

API Flow Diagram

Here's how piisafe.eu orchestrates the scanning process:

┌─────────────────────────────────┐
│     User Enters Website URL     │
│      + API Key + Settings       │
└──────────────┬──────────────────┘
               │
               ▼
      ┌──────────────────┐
      │  URL Discovery   │
      │ (Sitemap/Crawl)  │
      └────────┬─────────┘
               │
      ┌────────▼──────────┐
      │   User Selects    │
      │   Pages to Scan   │
      └────────┬──────────┘
               │
      ┌────────▼──────────────────┐
      │    Fetch Page Content     │
      │   (HTML → Extract Text)   │
      └────────┬──────────────────┘
               │
      ┌────────▼───────────────┐
      │   Check Text Length    │
      │  50,000 chars limit?   │
      └────────┬───────────────┘
               │
     ┌─────────┴────────┐
     │                  │
    YES                 NO
     │                  │
     ▼                  ▼
  CHUNKING         SEND TO API
(Split Text)        (Direct)
     │                  │
     └─────────┬────────┘
               │
               ▼
      ┌──────────────────────┐
      │  Call Detection API  │
      │   (cloak / anonym)   │
      └────────┬─────────────┘
               │
        ┌──────▼───────┐
        │ More Chunks? │
        └──────┬───────┘
               │
        ┌──────┴──────┐
       YES            NO
        │             │
        ▼             ▼
    CONTINUE      AGGREGATE
     (Loop)        RESULTS
                      │
                      ▼
          ┌──────────────────────┐
          │  Real-Time Results   │
          │  + Risk Grade (A-F)  │
          │  + Findings List     │
          │  + Statistics        │
          └───────────┬──────────┘
                      │
                      ▼
            ┌──────────────────┐
            │ User Exports:    │
            │ • HTML Report    │
            │ • JSON Data      │
            │ • CSV Spreadsheet│
            └──────────────────┘

Supported Entity Types

Both APIs detect and classify the following PII entity categories:

Core Entity Categories (40+)

  • Government IDs: SSN, Tax ID, Passport, Visa, Driver License, National ID
  • Financial: Credit Card, Debit Card, IBAN, SWIFT, Bank Account, Cryptocurrency Wallet
  • Contact: Email Address, Phone Number, Physical Address, Postal Code, IP Address
  • Medical: Medical Record Number, Health Insurance ID, Prescription, Healthcare Provider ID
  • Biometric: Fingerprint, Face Recognition, DNA Profile, Iris Scan
  • Online: Username, Password, API Key, URL, Domain, Social Media Handle
  • Corporate: Employee ID, Business Email, Company Phone, Corporate Account
  • Legal: Case Number, Court Document, Patent Number, Trademark

Regional & Language-Specific Entities

Each of the 48 supported languages includes region-specific identifiers:

  • Germany: Steuernummer, Versicherungsnummer, KfZ-Versicherung
  • France: Numéro de Sécurité Sociale, CNI, Numéro de SIRET
  • Spain: NIE, NIF, DNI, Número de Seguridad Social
  • Italy: CODICE FISCALE, Numero di Patente, Numero di Carta d'Identità
  • USA: SSN, EIN, State ID, Driver License (state-specific variations)
  • Canada: SIN, Provincial Health Card, Province-specific IDs
  • And 42+ more language variants...

Pro Tip: Select your target region's language in the "Configure" step for maximum accuracy on region-specific IDs.

Chunking Strategy (Smart Splitting)

Since both APIs have a 50,000 character limit per request, piisafe.eu uses intelligent chunking to analyze pages of any size:

How Chunking Works

  1. Measure: Check extracted text length
  2. Split: If > 49,500 chars, split at word boundaries
  3. Process: Send each chunk to API sequentially
  4. Offset: Adjust entity positions to original text location
  5. Aggregate: Combine results across all chunks
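
The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not piisafe.eu's actual implementation: `chunkWithOffsets`, `scanText`, and the `detect` callback are hypothetical names, and `detect` is assumed to resolve to entities whose `start`/`end` positions are relative to the chunk it received. The sketch slices the original string rather than re-joining words, so each chunk knows its exact offset in the source text.

```javascript
// Steps 1-2: split `text` into slices of at most `maxChars`, cutting after
// the last whitespace inside the window. Each chunk records its start offset.
function chunkWithOffsets(text, maxChars = 49500) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    // Only back up if the cut would land in the middle of a word.
    if (end < text.length && !/\s/.test(text[end])) {
      let cut = end;
      while (cut > start && !/\s/.test(text[cut - 1])) cut--;
      if (cut > start) end = cut; // otherwise hard-cut an oversized word
    }
    chunks.push({ offset: start, text: text.slice(start, end) });
    start = end;
  }
  return chunks;
}

// Steps 3-5: call the detector one chunk at a time, shift each entity back
// to its position in the original text, and collect everything in one list.
// `detect` is a stand-in for the provider call (an assumption of this sketch).
async function scanText(text, detect, maxChars = 49500) {
  const findings = [];
  for (const chunk of chunkWithOffsets(text, maxChars)) {
    const entities = await detect(chunk.text);
    for (const entity of entities) {
      findings.push({
        ...entity,
        start: entity.start + chunk.offset,
        end: entity.end + chunk.offset,
      });
    }
  }
  return findings;
}
```

Because the chunks are plain slices, concatenating them reproduces the original text exactly, which is what makes the offset adjustment in step 4 a simple addition.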

Chunking Configuration

Parameter      | Value                | Rationale
Max Characters | 50,000               | API hard limit
Safety Margin  | 49,500               | Prevents boundary issues
Split Method   | Word Boundary        | Preserves sentence/word integrity
Processing     | Sequential           | Respects rate limits
Retry Logic    | 3 attempts per chunk | Handles transient failures

Example: 65KB Page Scanning

Input: 65,000 character page (extracted text)
  ↓
Exceeds 50,000 limit
  ↓
Chunk 1: Characters 0-49,499 (or slightly fewer, ending at a word boundary)
Chunk 2: Characters 49,500-64,999
  ↓
Send Chunk 1 → API → Detect entities
Send Chunk 2 → API → Detect entities
  ↓
Aggregate Results
  - Adjust entity offsets to original document
  - Merge duplicate findings (if an entity spans chunks)
  - Calculate overall statistics
  ↓
Return: Complete entity list with accurate positions

Benefit: Pages of any size can be scanned. Before chunking, pages over 50KB would fail. Now, 100% of the text is analyzed.

Cost Trade-off: A 65KB page requires 2 API calls instead of 1, so it counts twice against call-based quotas, while token usage grows with the amount of text sent. This is still better than truncating at 50,000 characters, which would skip the last 15,000 characters (23% data loss).
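
The arithmetic behind this example can be checked with a short sketch. The 49,500 chunk size and 50,000 hard limit come from the configuration table above; the function names are illustrative only.

```javascript
const CHUNK_SIZE = 49500; // safety-margin chunk size from the table above
const HARD_LIMIT = 50000; // per-request character limit of both APIs

// Lower bound on API calls for a page. Word-boundary splitting can only
// make chunks shorter than CHUNK_SIZE, so the real count may be higher.
function minApiCalls(charCount, chunkSize = CHUNK_SIZE) {
  return Math.ceil(charCount / chunkSize);
}

// Share of the text that would be skipped if the page were simply
// truncated at the hard limit instead of being chunked.
function truncationLossPercent(charCount, hardLimit = HARD_LIMIT) {
  if (charCount <= hardLimit) return 0;
  return ((charCount - hardLimit) / charCount) * 100;
}

console.log(minApiCalls(65000));                      // 2
console.log(truncationLossPercent(65000).toFixed(1)); // "23.1"
console.log(minApiCalls(100000));                     // 3
```

The last line shows why page sizes near a multiple of the chunk size are worth checking: 100,000 characters do not fit in two 49,500-character chunks.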

Rate Limits & Quotas

piisafe.eu Rate Limiting (Server-Side)

Limit            | Value              | Applies To
Requests per IP  | 100 per 15 minutes | All endpoints
Concurrent Scans | 10 per IP          | Running scans only
Max Pages/Scan   | 200                | URL discovery
Response Size    | 5MB max            | Large result exports
Session Timeout  | 30 minutes         | Abandoned scans
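
A script that drives many scans can stay under the per-IP request limit by counting its own requests. The sketch below is one possible client-side approach, assuming a sliding 15-minute window; how the server actually measures its window is not stated here, and `createRequestBudget` is a hypothetical helper name.

```javascript
// Sliding-window counter: allows at most `limit` requests per `windowMs`.
// `now` is injectable so the logic can be exercised without real waiting.
function createRequestBudget(limit = 100, windowMs = 15 * 60 * 1000, now = Date.now) {
  const stamps = []; // timestamps of requests still inside the window
  return {
    tryAcquire() {
      const t = now();
      // Forget requests that have aged out of the window.
      while (stamps.length > 0 && t - stamps[0] >= windowMs) stamps.shift();
      if (stamps.length >= limit) return false; // budget exhausted for now
      stamps.push(t);
      return true;
    },
  };
}

const budget = createRequestBudget();
if (budget.tryAcquire()) {
  console.log('OK to send the next request');
}
```

When `tryAcquire()` returns `false`, the caller can queue the request and try again later instead of triggering a 429 response.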

API Provider Limits

cloak.business:

  • Token-based pricing model
  • Higher rate limits for premium tiers
  • No hard API call limits (pay per token)
  • Typical token cost: 100-500 tokens per page

anonym.legal:

  • €3/month entry tier (limited requests)
  • Higher plans available (contact sales)
  • Rate limits vary by tier
  • Free tier available for testing

Handling Rate Limit Errors

If you receive a 429 (Too Many Requests) error:

  1. Wait 15-30 seconds before retrying
  2. Reduce concurrent scan count
  3. For cloak.business: Check token balance
  4. For anonym.legal: Verify subscription is active
  5. Contact provider support if the issue persists

Note: Chunking increases API calls. A 100KB page needs 2-3 API calls, because two 49,500-character chunks cover only 99,000 characters. Plan your token budget accordingly.
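
The wait-and-retry step can be automated with a small wrapper. This is a sketch, assuming the request function returns a fetch-style `Response`; the name `withRateLimitRetry`, the 15-second default wait, and the three-attempt cap (borrowed from the chunking configuration) are illustrative choices. It honors the standard HTTP `Retry-After` header when one is present, although whether either provider sends that header is not stated in this guide.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `send` is any function that returns a Promise of a fetch-style Response.
// `wait` is injectable so the retry logic can be tested without delays.
async function withRateLimitRetry(send, { attempts = 3, waitMs = 15000, wait = sleep } = {}) {
  for (let attempt = 1; ; attempt++) {
    const response = await send();
    // Return on success, on any non-429 error, or when attempts run out.
    if (response.status !== 429 || attempt >= attempts) return response;
    // Prefer the server's Retry-After hint (in seconds) when it is numeric.
    const hinted = Number(response.headers?.get?.('Retry-After'));
    await wait(hinted > 0 ? hinted * 1000 : waitMs);
  }
}
```

A call site might look like `await withRateLimitRetry(() => fetch('/api/cloak/analyze', options))`, where `options` is the same request object used in the code examples later in this guide.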

Pricing & Costs

cloak.business Pricing

Plan         | Cost          | Tokens/Month | Features
Starter      | Free          | Limited      | Testing & evaluation
Professional | Pay-as-you-go | Flexible     | All features, volume discounts
Enterprise   | Custom        | Unlimited    | Custom limits, support, SLA

Typical Token Costs:

  • Small page (5KB): ~50 tokens
  • Medium page (25KB): ~250 tokens
  • Large page (50KB): ~500 tokens
  • Chunked page (100KB, split into 2-3 chunks): ~1,000 tokens

anonym.legal Pricing

Plan       | Cost      | API Calls   | Features
Free Tier  | €0        | 10/month    | Basic testing
Starter    | €3/month  | 100/month   | Individual use
Pro        | €19/month | 1,000/month | Teams, higher limits
Enterprise | Custom    | Custom      | Unlimited, support, SLA

Cost Estimation

Example: Scanning 10 websites (average 20 pages per site = 200 pages total)

With cloak.business (pay-as-you-go):

  • Average 200 tokens per page
  • 200 pages × 200 tokens = 40,000 tokens
  • Estimated cost: €20-50 depending on token pricing tier

With anonym.legal (Starter):

  • €3/month base cost
  • 100 API calls/month included
  • Overflow calls billed separately (typically €0.01-0.05 per call)
  • 200 pages may require additional tier or overflow costs

Recommendation: Both providers offer free/trial tiers. Test with a small scan first to estimate token usage for your use case.
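
The estimates above reduce to simple arithmetic, sketched below. The figures (200 tokens per page, €3 base, 100 included calls, €0.01-0.05 overflow) are the ones quoted in this section; the function names are illustrative, and the call count assumes one API call per page, which chunked pages exceed. Amounts are computed in cents to avoid floating-point rounding.

```javascript
// cloak.business: total tokens for a scan, given an average per page.
function estimateCloakTokens(pages, avgTokensPerPage) {
  return pages * avgTokensPerPage;
}

// anonym.legal Starter: [low, high] monthly cost in euros, using the
// overflow range quoted above (1 to 5 cents per extra call).
function estimateAnonymCostEuros(
  calls,
  plan = { baseCents: 300, includedCalls: 100, overflowCents: [1, 5] }
) {
  const extraCalls = Math.max(0, calls - plan.includedCalls);
  return plan.overflowCents.map((rate) => (plan.baseCents + extraCalls * rate) / 100);
}

console.log(estimateCloakTokens(200, 200)); // 40000
console.log(estimateAnonymCostEuros(200));  // [ 4, 8 ]
```

For the 200-page example, the Starter plan with overflow lands between €4 and €8, which is the number to compare against moving up to the Pro tier.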

Error Handling & Troubleshooting

Common Error Codes

400 Bad Request
Meaning: Malformed API request or invalid parameters
Solution: Check API key format, ensure text encoding is UTF-8, verify entity list syntax

401 Unauthorized
Meaning: API key is invalid, expired, or missing
Solution: Regenerate API key in provider dashboard, verify key is copied correctly, check for extra spaces

403 Forbidden
Meaning: API key is valid but the account doesn't have permission for this operation
Solution: Verify account subscription is active, check if specific entities require a higher tier

429 Too Many Requests
Meaning: Rate limit exceeded or token quota exhausted
Solution: Wait 15-30 seconds, reduce concurrent scans, upgrade account tier, or add tokens

500 Internal Server Error
Meaning: Provider API encountered an internal error
Solution: Wait 1-2 minutes, retry scan, contact provider support if it persists

503 Service Unavailable
Meaning: API is temporarily down or undergoing maintenance
Solution: Wait and retry later, check provider status page, contact support

ContentLengthExceeded
Meaning: Text exceeds 50,000 character limit (shouldn't happen with chunking)
Solution: Verify chunking is enabled, check the page wasn't corrupted during fetch, retry

InvalidEntityType
Meaning: Selected entity doesn't exist in API catalog
Solution: Use the provided catalog dropdown, verify entity name spelling, refresh entity list
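
In code, the HTTP entries above can be condensed into a lookup that tells a scan loop whether a retry makes sense. This is a sketch: the `retry` flags reflect the "wait and retry" guidance in the list, and `ERROR_GUIDE` and `describeApiError` are hypothetical names rather than part of either provider's API.

```javascript
// Guidance from the error list above, keyed by HTTP status code.
const ERROR_GUIDE = {
  400: { retry: false, action: 'Check key format, UTF-8 encoding, and entity list syntax' },
  401: { retry: false, action: 'Regenerate or re-enter the API key' },
  403: { retry: false, action: 'Verify the subscription is active and the tier is sufficient' },
  429: { retry: true, action: 'Wait 15-30 seconds and reduce concurrent scans' },
  500: { retry: true, action: 'Wait 1-2 minutes, then retry the scan' },
  503: { retry: true, action: 'Retry later and check the provider status page' },
};

function describeApiError(status) {
  return ERROR_GUIDE[status] ?? {
    retry: false,
    action: `Unrecognized status ${status}; collect details and contact support`,
  };
}

console.log(describeApiError(429).retry); // true
console.log(describeApiError(401).action);
```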

Network & Connectivity Issues

Timeout errors: If a scan stalls for more than 30 seconds, the API may be slow or unreachable:

  • Check your internet connection
  • Check the browser console for network errors (F12)
  • Try a smaller scan (fewer pages)
  • Switch to the alternate API provider

CORS (Cross-Origin) errors: If you see a "CORS policy" error in the console:

  • This is expected for cross-domain API calls
  • piisafe.eu uses CORS proxying on the backend
  • No action needed; it should resolve automatically
  • If it persists, contact support

Debug Mode

Open browser DevTools (F12) to see detailed error logs:

// In the browser console, view API responses:
localStorage.getItem('pii_scan_logs')   // Recent scan errors
localStorage.getItem('pii_api_errors')  // API error details

For support, collect these details:

  • Error message (exact text)
  • URL being scanned
  • API provider (cloak vs anonym)
  • Browser console screenshot
  • Timestamp of error

Code Examples

Example 1: Validating API Key (JavaScript/Frontend)

// Test an API key for cloak.business
async function validateCloakKey(apiKey) {
  const response = await fetch('/api/cloak/test', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ apiKey })
  });

  // Surface HTTP-level failures (429, 5xx) before parsing the body
  if (!response.ok) {
    console.error(`✗ Validation request failed: HTTP ${response.status}`);
    return false;
  }

  const data = await response.json();
  if (data.valid) {
    console.log('✓ Key valid');
    localStorage.setItem('pii_api_key_cloak', apiKey);
  } else {
    console.error('✗ Invalid key:', data.error);
  }
  return Boolean(data.valid);
}

Example 2: Sending Text for Analysis

// Analyze text with the cloak.business API
async function analyzeText(text, entities, apiKey) {
  const response = await fetch('/api/cloak/analyze', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      apiKey,
      text,
      entityTypes: entities,
      language: 'en'
    })
  });

  // Fail loudly on 4xx/5xx instead of treating an error body as findings
  if (!response.ok) {
    throw new Error(`Analyze request failed: HTTP ${response.status}`);
  }

  const findings = await response.json();
  return findings; // Array of detected entities
}

Example 3: Chunking Text (50K Limit)

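// Split text at word boundaries. Whitespace runs are collapsed to single
// spaces, so positions refer to the normalized text. A chunk exceeds
// maxChars only if a single word is longer than maxChars on its own.
function chunkText(text, maxChars = 49500) {
  const chunks = [];
  let currentChunk = '';
  text.split(/\s+/).forEach(word => {
    // Count the joining space when the chunk already has content
    const separator = currentChunk ? 1 : 0;
    if (currentChunk.length + separator + word.length > maxChars) {
      if (currentChunk) chunks.push(currentChunk);
      currentChunk = word;
    } else {
      currentChunk += (currentChunk ? ' ' : '') + word;
    }
  });
  if (currentChunk) chunks.push(currentChunk);
  return chunks;
}

// Usage: await is only valid inside an async function.
// fetchPageContent, entityList, and apiKey are defined elsewhere.
async function scanPage(url, entityList, apiKey) {
  const text = await fetchPageContent(url);
  const chunks = chunkText(text);
  for (let i = 0; i < chunks.length; i++) {
    const results = await analyzeText(chunks[i], entityList, apiKey);
    console.log(`Chunk ${i + 1}/${chunks.length}: ${results.length} entities found`);
  }
}

scanPage('https://example.com', entityList, apiKey);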

Example 4: Handling Real-Time Progress (Server-Sent Events)

// Listen to scan progress
const eventSource = new EventSource(`/api/scanner/status/${scanSessionId}`);

eventSource.onmessage = (event) => {
  const progress = JSON.parse(event.data);
  console.log(`Pages: ${progress.pagesScanned}/${progress.pagesTotal}`);
  console.log(`PII Found: ${progress.piiFound}`);
  console.log(`Progress: ${progress.percent}%`);
  if (progress.status === 'complete') {
    eventSource.close();
  }
};

eventSource.onerror = (error) => {
  console.error('Connection lost:', error);
  eventSource.close();
};

Example 5: Exporting Results

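// Export scan results as JSON
async function exportResults(scanId) {
  const response = await fetch(`/api/scanner/report/${scanId}?format=json`);
  if (!response.ok) {
    throw new Error(`Report request failed: HTTP ${response.status}`);
  }
  const data = await response.json();

  // Download JSON file
  const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' });
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = `pii-scan-${scanId}.json`;
  a.click();

  // Release the temporary object URL once the download has started
  setTimeout(() => URL.revokeObjectURL(url), 0);
}

// Available formats: 'json', 'csv', 'html'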

Frequently Asked Questions

Getting API Keys

Q: Can I use the same API key across multiple devices?

A: Yes. API keys are account-based, not device-specific. Store securely and avoid sharing publicly.

Q: What if I lose my API key?

A: Regenerate it in your provider dashboard. The old key becomes invalid immediately, so update piisafe.eu with the new key.

Q: Is there a free tier?

A: cloak.business offers a limited free tier for testing. anonym.legal includes a €0 tier (10 API calls/month) plus a €3/month Starter plan.

Scanning & Detection

Q: What happens if a page has no PII?

A: Scan completes successfully with an "A" grade. Findings list is empty. API call still counts against quota.

Q: Can I scan password-protected websites?

A: No. piisafe.eu scans public HTML only. For protected content, export HTML manually, then upload as raw text.

Q: How accurate is PII detection?

A: Both APIs use ML + regex hybrid models. Accuracy: 85-95% depending on entity type and regional variations. Some false positives/negatives possible. Manual review recommended.

Q: Does the chunking affect detection accuracy?

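A: Rarely. Chunking splits at word boundaries, so single-token entities such as email addresses are never cut in half. A multi-word entity that straddles a chunk boundary (a street address, for example) can still be split, so findings near a boundary deserve a manual check.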

Pricing & Costs

Q: What's the cheapest way to scan many pages?

A: anonym.legal Pro (€19/month) includes 1,000 API calls/month, about €0.019 per call and the lowest listed cost per scan. cloak.business is a better fit for occasional high-volume scans that need flexible scaling.

Q: Do chunked pages cost more?

A: Yes. A 100KB page needs at least 2 API calls, and token cost grows with the amount of text sent. However, 100% of the text is analyzed, versus the partial scanning that happened before.

Q: Can I buy tokens in advance?

A: cloak.business: Yes, token packs available. anonym.legal: Subscription-based, can upgrade tier anytime.

Data Privacy

Q: Does piisafe.eu store my API key?

A: No. Keys stored only in your browser's localStorage. Never transmitted to piisafe.eu servers. Zero-knowledge architecture.

Q: Does piisafe.eu store scan results?

A: No. Results stay in browser memory. Session data cleared when page closes. No server-side persistence.

Q: Can I use piisafe.eu for client/customer websites?

A: Yes! Perfect for consultants, security teams, compliance officers. Audit trail kept locally. No data leaves your device.

Technical

Q: What browsers are supported?

A: Modern browsers (Chrome, Firefox, Safari, Edge 2020+). Requires JavaScript and fetch API support. Mobile browsers supported.

Q: Can I integrate piisafe.eu into my own app?

A: Yes! Clone the repo from GitHub, customize backend routes, integrate with your own infrastructure. Full source code available.

Q: What's the difference between piisafe.eu and the APIs directly?

A: piisafe.eu adds: automatic chunking, visual UI, real-time progress, multiple export formats, easy entity selection, preset compliance profiles.

Q: Can I scan multiple websites simultaneously?

A: Yes, up to 10 concurrent scans per IP. Beyond that, run scans sequentially or wait for earlier ones to finish.
