What Tracer collects
Tracer collects execution metadata derived from operating system–level signals.
CPU & scheduling
CPU usage and scheduling behavior
Memory
Memory usage and peak memory
I/O activity
Disk and network I/O activity
Process lifecycle
Process start and stop times
Process relationships
Parent–child process relationships
Container context
Container, namespace, and cgroup context
Cloud cost data
Cloud cost and usage identifiers (from supported providers)
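To make the categories above concrete, here is a minimal sketch of what a single process-level metadata record could look like. The class name, field names, and sample values are all hypothetical; they are not taken from the Tracer/collect schema and only illustrate that such a record describes resource behavior rather than content.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ExecutionRecord:
    """Hypothetical shape of one process-level metadata record."""
    pid: int
    parent_pid: int                        # parent-child process relationship
    binary: str                            # which binary was executed
    start_time: float                      # process start, seconds since the epoch
    end_time: Optional[float]              # None while the process is still running
    cpu_seconds: float                     # CPU time consumed
    peak_memory_bytes: int                 # peak memory usage
    disk_read_bytes: int                   # disk I/O activity
    disk_write_bytes: int
    network_bytes: int                     # network I/O activity
    container_id: Optional[str] = None     # container context, if any
    cgroup: Optional[str] = None           # cgroup context, if any
    cost_identifier: Optional[str] = None  # cloud cost/usage identifier, if any

    def duration(self) -> Optional[float]:
        """Wall-clock runtime, or None if the process has not stopped."""
        if self.end_time is None:
            return None
        return self.end_time - self.start_time


# Made-up values for illustration only.
record = ExecutionRecord(
    pid=4321, parent_pid=4300, binary="/usr/bin/samtools",
    start_time=100.0, end_time=160.5, cpu_seconds=55.2,
    peak_memory_bytes=2_147_483_648, disk_read_bytes=10_000,
    disk_write_bytes=5_000, network_bytes=0, cgroup="/pipeline/task-1",
)
```

Note that the sketch has no field for file contents, environment variables, or in-memory data, which mirrors the boundaries this page describes.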
What Tracer does not collect
Tracer explicitly does not collect or inspect:
Application and scientific data
- Input data files
- Output data or results
- Sample, patient, or experimental data
- File contents or payloads
Tracer may observe that a file was accessed, but never reads or captures file contents. This behavior can be verified in the open-source Tracer/collect implementation.
Source code and runtime internals
- Source code or scripts
- Function calls or call stacks
- Variables, objects, or in-memory data
- Language-level execution traces
Tracer operates at the process and kernel level, not inside language runtimes.
Secrets and sensitive configuration
- Environment variables
- Credentials or API keys
- Tokens, passwords, or certificates
Tracer does not inspect process memory or application configuration.
Application- or domain-level semantics
- Biological meaning or correctness
- Algorithmic intent
- Business or scientific interpretation of results
While Tracer can observe which binaries or commands were executed, it does not infer what those commands mean within an application or domain.
Command visibility (clarification)
Tracer may observe:
- Which binaries were executed
- Command-line arguments passed to those binaries
Tracer does not:
- Inspect data passed through those commands
- Parse command arguments for domain meaning
- Access application payloads
Data minimization
Tracer follows a data-minimization approach:
Minimal collection
Only metadata required for execution analysis is collected
Early filtering
Filtering occurs as early as possible to reduce volume
No payload inspection
No payload inspection or deep packet capture is performed
Resource-focused
Collection focuses on resource behavior, not content
Maintained allowlists and denylists
Tracer maintains a small set of internal allowlists and denylists to focus collection on meaningful execution activity and reduce unnecessary data. These lists are used to:
- Include known scientific tools, workflow binaries, and execution patterns relevant for pipeline observability
- Exclude generic system activity that does not contribute to understanding workload execution (for example, background OS services)
What these lists contain
Depending on configuration and environment, the lists may include:
- Common scientific and ML tools and runtimes
- Workflow-related binaries and schedulers
- Known helper processes that are part of pipeline execution
What these lists do not contain
The lists do not include:
- File contents or data values
- User-defined secrets or identifiers
- Sample, patient, or experiment metadata
- Application payloads or outputs
How the lists are used
- Lists are applied early in the collection process to reduce event volume
- Classification happens at the level of process metadata, not data content
- The lists do not change application behavior or execution outcomes
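The list behavior described above can be sketched as a small filter over process metadata. This is an illustrative example only: the list entries, the `ProcessEvent` type, and the decision to keep processes that appear on neither list are assumptions, not details of the Tracer/collect implementation.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical list entries, chosen only for illustration.
ALLOWLIST = {"samtools", "bwa", "nextflow", "python"}
DENYLIST = {"systemd", "cron", "sshd"}


@dataclass(frozen=True)
class ProcessEvent:
    pid: int
    binary_path: str


def classify(event: ProcessEvent) -> str:
    """Return 'keep', 'drop', or 'unknown' using process metadata only."""
    name = PurePosixPath(event.binary_path).name
    if name in DENYLIST:
        return "drop"
    if name in ALLOWLIST:
        return "keep"
    return "unknown"


def filter_events(events: Iterable[ProcessEvent]) -> List[ProcessEvent]:
    """Apply the lists early so denied events never reach later stages.

    Assumption: events on neither list are kept for later handling.
    """
    return [e for e in events if classify(e) != "drop"]
```

Classification here looks only at the binary name. No file is opened and nothing about the process itself is changed, consistent with the points in the list above.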
Why this matters
Maintaining explicit allowlists and denylists helps Tracer:
- Minimize data collection to what is operationally relevant
- Reduce overhead in high-throughput environments
- Avoid collecting noisy or unrelated system activity
- Preserve clear privacy and security boundaries
Data handling and storage
- Execution signals are captured locally and aggregated into structured telemetry
- Only derived metadata is transmitted to the Tracer backend
- Payload data is never exported
- Data retention and access are governed by account-level configuration
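As a rough illustration of local aggregation into derived metadata, the sketch below reduces raw per-process samples to a compact summary. The `Sample` type, its fields, and the summary keys are hypothetical and do not describe Tracer's actual telemetry format; the point is only that what leaves the host in such a design is a set of derived numbers, not payload data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Sample:
    pid: int
    cpu_seconds: float   # CPU time used during this sampling interval
    memory_bytes: int    # memory in use at sample time


def summarize(samples: Iterable[Sample]) -> Dict[int, dict]:
    """Aggregate raw samples into one derived summary per process."""
    summary: Dict[int, dict] = {}
    for s in samples:
        entry = summary.setdefault(
            s.pid, {"cpu_seconds": 0.0, "peak_memory_bytes": 0, "samples": 0}
        )
        entry["cpu_seconds"] += s.cpu_seconds
        entry["peak_memory_bytes"] = max(entry["peak_memory_bytes"], s.memory_bytes)
        entry["samples"] += 1
    return summary
```

Only the returned summary would be a candidate for transmission in this sketch; the individual samples stay local.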
Product boundaries
Tracer is intentionally scoped: it observes execution within the boundaries enforced by the operating system, container runtime, and cloud provider.
Transparency and open source
The core Tracer agent (Tracer/collect) is open source. The repository documents how execution signals are collected, filtered, and structured, and makes it possible to independently review what data is gathered and what is explicitly excluded. This transparency supports security reviews and helps teams verify Tracer’s data-collection boundaries.
Tracer/collect on GitHub
Review the open-source implementation
When this matters
This page is especially relevant if you:
- Operate in regulated or security-sensitive environments
- Need to complete security or privacy reviews
- Evaluate Tracer’s suitability for production workloads
- Want clarity on data collection boundaries

