v1.2.2 | Last Updated: Apr 22, 2026

Data Flow Overview

In a nutshell

What this document is:
A high-level description of how information moves through the Clawscan system.

Why this matters:
Organizations evaluating Clawscan need to understand where data is processed, what data remains inside their environment, and what information is transmitted to GOlegal systems.

Who should read this:
Security teams, data protection officers (DPOs), IT architects, and procurement reviewers.

When to use this:
Security reviews, DPIA preparation, vendor risk assessments.


Overview

Clawscan is designed around a tenant-resident processing model.

Communication content is processed inside the client’s Microsoft 365 and Azure environment, and only derived analysis results and operational telemetry are transmitted outside that environment.

This approach allows organizations to benefit from automated compliance monitoring while maintaining strict control over their communication data.


Data processing stages

Clawscan processes information through four stages:

  1. Content retrieval
  2. Local compliance analysis
  3. Derived result transmission
  4. Aggregated results visualization

Each stage is designed to limit data exposure.


1. Content retrieval

The Clawscan engine retrieves relevant communication content directly from the organization's Microsoft 365 environment.

This retrieval occurs within the client tenant.

Typical inputs include:

  • email message content
  • attachments (when applicable)
  • contextual metadata required for analysis

This information remains inside the client environment during processing.


2. Local compliance analysis

The Clawscan engine performs compliance analysis locally within the tenant.

The analysis process produces derived results, which may include:

  • classification of potential compliance risk
  • numerical risk scoring
  • summarized reasoning describing why the communication triggered analysis signals

The summarized reasoning is designed to remain high-level and contextual.
While it may include limited references or extracts derived from the analyzed communication, it does not aim to reproduce full communication content.
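
To make the shape of a derived result more concrete, the following is a minimal Python sketch. The class name `DerivedResult`, its field names, the 0.0 to 1.0 score range, and the 280-character summary cap are illustrative assumptions, not the actual Clawscan schema.

```python
from dataclasses import dataclass, asdict

# Hypothetical cap on summary length; the real limit, if any, is not documented here.
MAX_SUMMARY_CHARS = 280


@dataclass(frozen=True)
class DerivedResult:
    """Illustrative shape of a derived result (not the real Clawscan schema)."""
    classification: str  # e.g. "competition-law"
    risk_score: float    # assumed to lie in [0.0, 1.0]
    summary: str         # high-level reasoning, never the full message

    def __post_init__(self):
        if not 0.0 <= self.risk_score <= 1.0:
            raise ValueError("risk_score must be between 0.0 and 1.0")
        if len(self.summary) > MAX_SUMMARY_CHARS:
            raise ValueError("summary is too long to count as high-level")


def build_result(classification: str, risk_score: float, reasoning: str) -> DerivedResult:
    """Truncate the reasoning so the result cannot carry a full message body."""
    return DerivedResult(classification, risk_score, reasoning[:MAX_SUMMARY_CHARS])
```

Capping the summary at construction time is one possible way to enforce the "high-level" property structurally rather than by convention.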

Local processing may also generate technical diagnostics used for system reliability and support.

In limited cases, technical error or diagnostic metadata may be transmitted to GOlegal systems to support maintenance, troubleshooting, and service continuity.

These diagnostics are designed to:

  • be primarily technical in nature
  • avoid inclusion of full communication content
  • minimize exposure of personal data


3. Derived result transmission

After local analysis is completed, Clawscan transmits derived scan results and operational telemetry to the GOlegal control plane.

This information is used for:

  • operational monitoring
  • service reliability
  • licensing and billing management
  • aggregated compliance analytics

No raw communication content is transmitted.


4. Aggregated results visualization

Aggregated scan results and operational telemetry may be accessed by authorized users through the Clawscan Admin Control Center.

This interface allows organizations to:

  • view aggregated compliance signals
  • manage domain (de)activation
  • access billing information
  • review other platform configuration settings, where available

The Admin Control Center does not provide access to email content analyzed by the Clawscan Engine.

Only derived results and operational data are accessible through this interface.


Data categories

The following table summarizes how different categories of information are handled.

Data type                   Location                 Transmitted to GOlegal
Email content               Client tenant            No
Attachments                 Client tenant            No
Local analysis artifacts    Client tenant            Only in case of errors and for debugging/assistance purposes
Derived scan results        GOlegal control plane    Yes
Operational telemetry       GOlegal control plane    Yes

Derived scan results contain abstracted information produced by the analysis process, rather than the original communication content.
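
The handling rules for each data category can also be expressed as a small lookup. This is a sketch only: the category keys, the `Transmission` enum, and the `may_transmit` helper are hypothetical names chosen for illustration and do not describe Clawscan internals.

```python
from enum import Enum


class Transmission(Enum):
    NEVER = "no"
    ON_ERROR_ONLY = "only for errors and debugging"
    ALWAYS = "yes"


# Hypothetical encoding of the data categories: (location, transmission rule).
DATA_CATEGORIES = {
    "email_content": ("client tenant", Transmission.NEVER),
    "attachments": ("client tenant", Transmission.NEVER),
    "local_analysis_artifacts": ("client tenant", Transmission.ON_ERROR_ONLY),
    "derived_scan_results": ("GOlegal control plane", Transmission.ALWAYS),
    "operational_telemetry": ("GOlegal control plane", Transmission.ALWAYS),
}


def may_transmit(category: str, error_context: bool = False) -> bool:
    """Return True if this category may leave the client tenant."""
    _, rule = DATA_CATEGORIES[category]
    if rule is Transmission.ALWAYS:
        return True
    if rule is Transmission.ON_ERROR_ONLY:
        return error_context
    return False
```

For example, `may_transmit("local_analysis_artifacts")` is false in normal operation and becomes true only when `error_context=True` is passed.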


Telemetry and operational monitoring

Clawscan uses telemetry to ensure the service operates reliably and securely.

Typical telemetry elements include:

  • scan timestamps
  • analysis classifications
  • operational identifiers
  • diagnostic metadata

Telemetry does not include raw communication content.
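
As an illustration of what such a telemetry event might look like, here is a hedged Python sketch. The function `build_telemetry_event`, the JSON field names, and the `FORBIDDEN_KEYS` set are assumptions made for this example and are not the actual Clawscan telemetry format.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional

# Hypothetical field names that would indicate raw content leaking into telemetry.
FORBIDDEN_KEYS = {"body", "subject", "attachment", "recipients"}


def build_telemetry_event(classification: str,
                          diagnostics: dict,
                          scanned_at: Optional[datetime] = None) -> str:
    """Serialize one telemetry event as JSON, refusing content-like fields."""
    leaked = FORBIDDEN_KEYS.intersection(diagnostics)
    if leaked:
        raise ValueError(f"content-like fields not allowed: {sorted(leaked)}")
    scanned_at = scanned_at or datetime.now(timezone.utc)
    event = {
        "scan_timestamp": scanned_at.isoformat(),  # scan timestamp
        "classification": classification,          # analysis classification
        "operation_id": str(uuid.uuid4()),         # operational identifier
        "diagnostics": diagnostics,                # diagnostic metadata
    }
    return json.dumps(event)
```

Rejecting content-like keys before serialization is one simple guard; a real deployment would likely rely on a stricter, schema-based check.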

Operational telemetry is used for:

  • service monitoring
  • system diagnostics
  • usage reporting
  • product improvement


Privacy-by-design safeguards

Clawscan incorporates several safeguards to limit data exposure:

  • tenant-resident processing
  • minimal external data transmission
  • derived-result reporting rather than raw data sharing
  • outbound telemetry model

Organizations also retain control over which communications are subject to analysis, allowing deployment policies to reflect internal governance frameworks.

This includes the ability to define scope limitations and exclusions, such as:

  • restricting analysis to specific users and/or groups
  • excluding personal mailboxes or clearly identified private communications
  • limiting monitoring to specific compliance domains (e.g. competition law, anti-corruption)

These controls support a proportionate and targeted deployment, aligned with privacy-by-design principles.
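
The scope controls listed above could be modeled roughly as follows. This is a sketch under stated assumptions: `ScanScope`, its fields, and `mailbox_in_scope` are hypothetical names, and the actual configuration mechanism is not described in this document.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class ScanScope:
    """Hypothetical scope policy; an empty set means 'no restriction'."""
    included_groups: Set[str] = field(default_factory=set)
    excluded_mailboxes: Set[str] = field(default_factory=set)
    compliance_domains: Set[str] = field(default_factory=set)


def mailbox_in_scope(scope: ScanScope, mailbox: str, groups: Iterable[str]) -> bool:
    """Decide whether a mailbox is subject to analysis under the given scope."""
    excluded = {m.lower() for m in scope.excluded_mailboxes}
    if mailbox.lower() in excluded:
        return False  # explicit exclusions always win
    if scope.included_groups and not (scope.included_groups & set(groups)):
        return False  # restricted to groups the user does not belong to
    return True


# Example policy with made-up values.
scope = ScanScope(
    included_groups={"sales", "procurement"},
    excluded_mailboxes={"works.council@example.com"},
    compliance_domains={"competition-law"},
)
```

Letting exclusions take precedence over inclusions keeps the policy conservative: a mailbox marked as private is never analyzed, even if its owner belongs to an included group.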


Data retention

Derived scan results are retained according to a period-based retention policy.

Individual (non-aggregated) scan results generated during a given operational period are retained for that period and are deleted a fixed number of months after the period ends.

Aggregated statistics may be retained longer, as they do not contain communication content.
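
The period-based rule can be illustrated with a short date calculation. This is a sketch only: the function names are hypothetical, and the number of retention months is left as a parameter because this document does not state the actual value.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def deletion_date(period_end: date, retention_months: int) -> date:
    """Date on which results from a period become due for deletion."""
    return add_months(period_end, retention_months)
```

With an arbitrary example value of 6 months, a period ending on 31 March 2026 gives `deletion_date(date(2026, 3, 31), 6) == date(2026, 9, 30)`, since September has only 30 days.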


Responsibility boundaries

Clawscan operates within a shared responsibility model.

Client organizations control:

  • their Microsoft 365 environment
  • the scope of communications subject to analysis
  • internal governance policies

GOlegal is responsible for:

  • the Clawscan software platform
  • telemetry processing
  • operational service monitoring
