Data Flow Overview
What this document is:
A high-level description of how information moves through the Clawscan system.
Why this matters:
Organizations evaluating Clawscan need to understand where data is processed, what data remains inside their environment, and what information is transmitted to GOlegal systems.
Who should read this:
Security teams, data protection officers (DPOs), IT architects, and procurement reviewers.
When to use this:
Security reviews, DPIA preparation, vendor risk assessments.
Overview
Clawscan is designed around a tenant-resident processing model.
Communication content is processed inside the client’s Microsoft 365 and Azure environment, and only derived analysis results and operational telemetry are transmitted outside that environment.
This approach allows organizations to benefit from automated compliance monitoring while maintaining strict control over their communication data.
Data processing stages
Clawscan processes information through four stages:
- Content retrieval
- Local compliance analysis
- Derived result transmission
- Aggregated results visualization
Each stage is designed to limit data exposure.
1. Content retrieval
The Clawscan engine retrieves relevant communication content directly from the organization's Microsoft 365 environment.
This retrieval occurs within the client tenant.
Typical inputs include:
- email message content
- attachments (when applicable)
- contextual metadata required for analysis
This information remains inside the client environment during processing.
2. Local compliance analysis
The Clawscan engine performs compliance analysis locally within the tenant.
The analysis process produces derived results, which may include:
- classification of potential compliance risk
- numerical risk scoring
- summarized reasoning describing why the communication triggered analysis signals
The summarized reasoning is designed to remain high-level and contextual.
While it may include limited references to, or extracts from, the analyzed communication, it is not intended to reproduce the full communication content.
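To make the shape of a derived result concrete, the following is a minimal sketch of such a record. It is not Clawscan's actual schema: the names `DerivedResult`, `RiskClass`, `to_payload`, the 0 to 1 score range, and the 500-character cap on the reasoning summary are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, asdict
from enum import Enum


class RiskClass(Enum):
    # Hypothetical labels; the real classification scheme is not described here.
    NONE = "none"
    POTENTIAL = "potential"
    HIGH = "high"


MAX_REASONING_CHARS = 500  # assumed cap that keeps the summary high-level


@dataclass(frozen=True)
class DerivedResult:
    scan_id: str
    risk_class: RiskClass
    risk_score: float          # assumed to be normalized to [0, 1]
    reasoning_summary: str     # short explanation, not the message itself

    def __post_init__(self):
        if not 0.0 <= self.risk_score <= 1.0:
            raise ValueError("risk_score must be within [0, 1]")
        if len(self.reasoning_summary) > MAX_REASONING_CHARS:
            raise ValueError("reasoning_summary exceeds the assumed length cap")


def to_payload(result: DerivedResult) -> dict:
    """Serialize a derived result into a plain dict suitable for transmission."""
    payload = asdict(result)
    payload["risk_class"] = result.risk_class.value
    return payload
```

The record deliberately has no field for the message body or attachments, so the full communication cannot be serialized by accident.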
Local processing may also generate technical diagnostics used for system reliability and support.
In limited cases, technical error or diagnostic metadata may be transmitted to GOlegal systems to support maintenance, troubleshooting, and service continuity.
These diagnostics are designed to:
- be primarily technical in nature
- avoid inclusion of full communication content
- minimize exposure of personal data
3. Derived result transmission
After local analysis is completed, Clawscan transmits derived scan results and operational telemetry to the GOlegal control plane.
This information is used for:
- operational monitoring
- service reliability
- licensing and billing management
- aggregated compliance analytics
No raw communication content is transmitted.
4. Aggregated results visualization
Aggregated scan results and operational telemetry may be accessed by authorized users through the Clawscan Admin Control Center.
This interface allows organizations to:
- view aggregated compliance signals
- activate or deactivate domains
- access billing information
- review other platform configuration settings, where available
The Admin Control Center does not provide access to email content analyzed by the Clawscan Engine.
Only derived results and operational data are accessible through this interface.
Data categories
The following table summarizes how different categories of information are handled.
| Data type | Location | Transmitted to GOlegal |
|---|---|---|
| Email content | Client tenant | No |
| Attachments | Client tenant | No |
| Local analysis artifacts | Client tenant | Only when errors occur, for debugging and support purposes |
| Derived scan results | GOlegal control plane | Yes |
| Operational telemetry | GOlegal control plane | Yes |
Derived scan results contain abstracted information produced by the analysis process, rather than the original communication content.
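The table above can be read as a transmission policy. The sketch below encodes it as a lookup with a deny-by-default rule for unknown categories. The category keys, the `Transmission` enum, and the `may_transmit` function are hypothetical names introduced for illustration, not part of the Clawscan product.

```python
from enum import Enum


class Transmission(Enum):
    NEVER = "never"
    ON_ERROR_ONLY = "on_error_only"
    ALWAYS = "always"


# Mirrors the "Transmitted to GOlegal" column of the table (keys are assumed names).
POLICY = {
    "email_content": Transmission.NEVER,
    "attachments": Transmission.NEVER,
    "local_analysis_artifacts": Transmission.ON_ERROR_ONLY,
    "derived_scan_results": Transmission.ALWAYS,
    "operational_telemetry": Transmission.ALWAYS,
}


def may_transmit(category: str, error_context: bool = False) -> bool:
    """Return True if data of this category may leave the client tenant."""
    # Categories missing from the policy are treated as non-transmittable.
    rule = POLICY.get(category, Transmission.NEVER)
    if rule is Transmission.ALWAYS:
        return True
    if rule is Transmission.ON_ERROR_ONLY:
        return error_context
    return False
```

For example, `may_transmit("local_analysis_artifacts")` is false during normal operation and true only when `error_context=True` is passed.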
Telemetry and operational monitoring
Clawscan uses telemetry to ensure the service operates reliably and securely.
Typical telemetry elements include:
- scan timestamps
- analysis classifications
- operational identifiers
- diagnostic metadata
Telemetry does not include raw communication content.
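One common way to enforce a rule like this is a field allowlist applied before any event leaves the tenant. The following is a minimal sketch under that assumption. The field names and the `build_telemetry_event` function are illustrative only and do not describe Clawscan's internal implementation.

```python
# Assumed field names corresponding to the telemetry elements listed above.
ALLOWED_FIELDS = frozenset(
    {"scan_timestamp", "classification", "operation_id", "diagnostics"}
)


def build_telemetry_event(fields: dict) -> dict:
    """Return a copy of the event, rejecting any field outside the allowlist."""
    unexpected = set(fields) - ALLOWED_FIELDS
    if unexpected:
        # Fail loudly instead of silently dropping data, so that an attempt
        # to attach message content is caught during development.
        raise ValueError(f"fields not allowed in telemetry: {sorted(unexpected)}")
    return dict(fields)
```

An allowlist is stricter than a blocklist: a newly introduced field is excluded until someone explicitly approves it.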
Operational telemetry is used for:
- service monitoring
- system diagnostics
- usage reporting
- product improvement
Privacy-by-design safeguards
Clawscan incorporates several safeguards to limit data exposure:
- tenant-resident processing
- minimal external data transmission
- derived-result reporting rather than raw data sharing
- outbound telemetry model
Organizations also retain control over which communications are subject to analysis, allowing deployment policies to reflect internal governance frameworks.
This includes the ability to define scope limitations and exclusions, such as:
- restricting analysis to specific users and/or groups
- excluding personal mailboxes or clearly identified private communications
- limiting monitoring to specific compliance domains (e.g. competition law, anti-corruption)
These controls support a proportionate and targeted deployment, aligned with privacy-by-design principles.
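The scope limitations and exclusions above can be sketched as a simple policy check. This is an illustration under stated assumptions, not the product's configuration format: `ScopePolicy`, `in_scope`, the group and domain labels, and the `example.com` addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopePolicy:
    # An empty included_groups set is taken to mean "all users".
    included_groups: frozenset = frozenset()
    # Mailbox addresses are expected in lowercase.
    excluded_mailboxes: frozenset = frozenset()
    # Compliance domains that monitoring is limited to.
    compliance_domains: frozenset = frozenset()


def in_scope(policy: ScopePolicy, mailbox: str, groups, domain: str) -> bool:
    """Decide whether a communication is subject to analysis."""
    # Exclusions take precedence over every other rule.
    if mailbox.lower() in policy.excluded_mailboxes:
        return False
    # If specific groups are configured, the user must belong to one of them.
    if policy.included_groups and not (set(groups) & policy.included_groups):
        return False
    # Only explicitly enabled compliance domains are analyzed.
    return domain in policy.compliance_domains
```

Checking exclusions first means a private mailbox stays out of scope even if its owner belongs to a monitored group.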
Data retention
Derived scan results are retained according to a period-based retention policy.
Individual (non-aggregated) results generated during a given operational period are retained for the duration of that period and are deleted a fixed number of months after it ends.
Aggregated statistics may be retained longer, as they do not contain communication content.
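As a worked illustration of a period-based policy, the sketch below computes the deletion date from the end of an operational period. The retention length is a parameter because this document does not state it, and the helper names `add_months` and `deletion_date` are hypothetical.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def deletion_date(period_end: date, retention_months: int) -> date:
    """Date on which individual results from a period become due for deletion."""
    return add_months(period_end, retention_months)
```

With an assumed retention of 6 months, a period ending on 2024-12-31 yields a deletion date of 2025-06-30, because June has only 30 days.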
Responsibility boundaries
Clawscan operates within a shared responsibility model.
Client organizations control:
- their Microsoft 365 environment
- the scope of communications subject to analysis
- internal governance policies
GOlegal is responsible for:
- the Clawscan software platform
- telemetry processing
- operational service monitoring