Title: Data Model Locale: en URL: https://sensorswave.com/en/docs/data-integration/data-model/ Description: Understanding the Sensors Wave data model and data structure Sensors Wave uses an event-centric data model to help you comprehensively record and analyze user behavior. Understanding the data model is the foundation for effectively using Sensors Wave. This article provides a detailed overview of the core concepts of the data model and the relationships between data tables. ## Why Should You Understand the Data Model? Before using Sensors Wave for data analysis, understanding the data model can help you: - **Design a sound tracking plan**: Clarify which events to track, which properties to record, and the most appropriate method for recording them - **Use analysis features correctly**: Understand the basic principles and best practices of analysis models - **Optimize query performance**: Write efficient SQL queries based on the characteristics of the data model - **Avoid common mistakes**: Prevent issues like data duplication and incorrect associations ## Core Concepts ### Events Events represent specific actions performed by users in your application or activities occurring in the system. Events are the core of the Sensors Wave data model — all analysis is based on event data. **Composition of an Event**: Each event contains the following core information: - **Event Name** (`event`): A name describing the user behavior, such as `PageView`, `Purchase`, `UserLogin` - **Event Time** (`time`): The time when the event occurred (client-side time) - **User Identifiers**: - `distinct_id`: The currently active unique user identifier (Login ID takes priority; otherwise, Anonymous ID) - `anon_id`: Anonymous ID, identifying the device or browser - `login_id`: Login ID, identifying the logged-in user - `ssid`: Server-generated unique user ID used for data analysis - **Event Properties**: Detailed information describing the event, such as product ID, amount, traffic source, etc. - **User Property Snapshot**: A snapshot of the user's properties at the moment the event occurred, which is associated and recorded during event ingestion ### Users The user table stores basic user information and property data. Each user has a unique identifier in the system, and all behavioral events are associated with this user. **Composition of a User**: - **Unique User ID** (`ssid`): Server-generated unique user identifier, used to associate with the events table - **Login ID** (`login_id`): The user's business account ID (if the user is logged in) - **Anonymous ID List** (`anon_id`): The user's Anonymous IDs (if the user has logged in from multiple devices, only the most recently associated device ID is recorded) - **Key Timestamps**: - `$first_seen_time`: The time of the user's first visit - `$last_seen_time`: The time of the user's last activity - **User Properties**: Information describing user characteristics, such as age, gender, membership level, etc. ### Event Properties Event Properties are used to describe the specific context and details when an event occurs. Event Properties are reported together with the event, and different events can have different properties. **Characteristics of Event Properties**: - Associated with a specific event, describing the details of "this particular action" - Recorded only when the event occurs; not persisted to the user dimension - Can be used for event filtering, grouping, aggregation, and other analyses **Common Event Properties**: | Event Type | Common Properties | Example | |---------|---------|------| | **Page View** (`PageView`) | Page title, page URL, referrer page | `page_title: "Product Detail"` | | **Purchase** (`Purchase`) | Order ID, amount, payment method, product SKU | `total_amount: 299.00` | | **User Signup** (`UserSignup`) | Signup method, traffic source, referrer | `signup_method: "email"` | | **Button Click** (`ButtonClick`) | Button name, page location, click position | `button_name: "Add to Cart"` | ### User Properties User Properties are used to describe user characteristic information. User Properties are stored in the user table, persisted, and can be updated at any time. **Characteristics of User Properties**: - Associated with a user, describing the characteristics of "this user" - Persistently stored until updated or deleted - Can be used for Cohort, Funnel, Retention, and other analyses **Common User Properties**: | Property Type | Common Properties | Example | |---------|---------|------| | **Basic Information** | Name, age, gender, city | `age: 28` | | **Business Information** | Membership level, total spend, registration time | `vip_level: "gold"` | | **Behavioral Traits** | Last purchase time, purchase count, active days | `purchase_count: 15` | | **Marketing Information** | Registration channel, referrer, coupon usage | `signup_channel: "weixin"` | ### Event Properties vs User Properties Understanding the difference between Event Properties and User Properties is very important: | Dimension | Event Properties | User Properties | |---------|---------|---------| | **Data Scope** | Describes the details of a single event | Describes the overall characteristics of a user | | **Storage Method** | Stored in the events table, recorded with the event | Stored as a snapshot in the events table; the latest value is stored in the users table and persisted | | **Update Method** | Recorded when the event occurs, cannot be modified | Can be updated, added, or deleted at any time (users table only) | | **Analysis Use** | Event filtering, group statistics, trend analysis | Cohort, Retention, conversion analysis | | **Example** | Order amount, product category, page URL | User age, membership level, registration time | **Recommendations**: - **Use Event Properties**: When information is related to a specific action and may have different values across different events - (Recommended) Product SKU (different purchase events may involve different products) - (Recommended) Page URL (different browsing events visit different pages) - (Recommended) Order amount (each purchase has a different amount) - **Use User Properties**: When information describes an overall user characteristic and is relatively stable - (Recommended) User age (relatively fixed) - (Recommended) Membership level (remains unchanged for a period of time) - (Recommended) Registration time (never changes) > **Note**: The same piece of information can serve as both an Event Property and a User Property. For example, a user's membership level at the time of a purchase can be recorded as an Event Property (describing the status at the time of purchase), while the current membership level can also be stored as a User Property (describing the user's current status). ## Data Table Structure The Sensors Wave data model consists of multiple data tables, all centered around the events table, forming a star schema (or snowflake schema). > **Default Tables**: The system creates the **events table** and **users table** by default. These two tables are the core of the data model and must exist. > **Optional Tables**: The **entities table** and **dim_* dimension tables** are optional features. They are only generated after users enable the relevant features and report the required data accordingly. ### Events Table (events) > **Default Table**: Automatically created by the system; must exist. The events table is the core data table of Sensors Wave. All event data is stored in this table. **Core Fields**: | Field Name | Type | Description | |-------|------|------| | `time` | datetime | Time the event occurred (client-side time) | | `event` | string | Event name | | `event_id` | int | Event ID (server-generated) | | `distinct_id` | string | Currently active unique user identifier | | `anon_id` | string | Anonymous ID (device identifier) | | `login_id` | string | Login ID (user account) | | `ssid` | int64 | Server-generated unique user ID | | `trace_id` | string | Event deduplication ID (ensures data uniqueness) | | `received_at` | datetime | Data reception time (server-side time) | | `e_*` | various types | Event Property fields (dynamic columns) | | `u_*` | various types | User Property snapshot fields (dynamic columns) | **Characteristics**: - The events table is a fact table that records detailed information about user behavior - Different event types are stored in the same table, distinguished by the `event` field - Event data cannot be modified after being written (only the `ssid` field is updated during user identity association) - Associated with the users table through the `ssid` field ### Users Table (users) > **Default Table**: Automatically created by the system; must exist. The users table stores basic user information and property data. **Core Fields**: | Field Name | Type | Description | |-------|------|------| | `ssid` | int64 | Unique user ID (primary key) | | `login_id` | string | Login ID | | `anon_id` | array | Device IDs | | `$first_seen_time` | datetime | First visit time | | `$last_seen_time` | datetime | Last active time | | `created_at` | datetime | Record creation time | | `updated_at` | datetime | Record update time | | `u_*` | various types | User Property fields (dynamic columns) | **Characteristics**: - The users table is a dimension table, associated with the events table through `ssid` - Supports CRUD operations on User Properties - When a Login ID is associated with multiple device IDs, only the most recently associated device ID is recorded in the users table - User Properties can be updated at any time ### Multi-Entity Table (entities) > **Optional Table**: Created only when you need to analyze entities other than users, such as devices, orders, products, etc. The multi-entity table stores property information for non-user entities, supporting multi-dimensional analysis. **Core Fields**: | Field Name | Type | Description | |-------|------|------| | `type` | string | Entity type (e.g., `device`, `order`) | | `id` | string | Entity unique identifier (business ID) | | `$first_seen_time` | datetime | First appearance time | | `$last_seen_time` | datetime | Last appearance time | | `created_at` | datetime | Record creation time | | `updated_at` | datetime | Record update time | | `en_*` | various types | Entity property fields (dynamic columns) | **Use Cases**: - **Device Analysis**: Analyze device-dimension metrics such as device model, OS version, etc. - **Order Analysis**: Analyze order-dimension metrics such as order status, delivery method, etc. - **Product Analysis**: Analyze product-dimension metrics such as product category, inventory status, etc. ### Custom Dimension Tables (dim_*) > **Optional Tables**: Users can create custom dimension tables to associate external data. Custom dimension tables store user-defined dimension data and can be joined with the events table for queries. **Characteristics**: - Must include a primary key for associating with the events table - Supports CRUD operations - Suitable for dimension data at the sub-million level - Can be associated with dimension table fields through Event Properties **Use Cases**: - **Store Dimension Table** (`dim_stores`): Stores store information, associated with the events table via `store_id` for store-dimension analysis - **Membership Level Dimension Table** (`dim_vip_levels`): Stores membership level information, associated with the users table via `level_id` for membership tier analysis - **Device Brand Dimension Table** (`dim_brands`): Stores device brand information, associated with the multi-entity table via `brand_id` for device brand analysis - **Campaign Dimension Table** (`dim_campaigns`): Stores campaign information, associated with the events table via `campaign_id` for campaign effectiveness analysis ## Data Table Relationships The Sensors Wave data model is a star schema (or snowflake schema) centered around the events table. ### Table Classification In the data model, tables are classified into two categories by function: - **Fact Table**: Only the **events table** is a fact table, recording detailed information about user behavior - **Dimension Tables**: - **users table**: User dimension table, a dimension table relative to the events table (exists by default) - **entities table**: Multi-entity dimension table, a dimension table relative to the events table (optional) - **dim_* tables**: Custom dimension tables that can serve as dimension tables for the events table, users table, or multi-entity table (optional) ### Data Model Diagram The following is a complete data model diagram showing the relationships between the events table, users table, multi-entity table, and multiple custom dimension tables: ``` ┌──────────────────┐ ┌──────────────────┐ │ Custom Dimension │ │ Custom Dimension │ │ Table (dim_*) │ │ Table (dim_*) │ │ [Optional] │ │ [Optional] │ └────────▲─────────┘ └────────▲─────────┘ │ │ │ Many-to-One │ Many-to-One │ │ ┌────────┴─────────┐ ┌────────┴─────────┐ │ Users Table │ │ Multi-Entity │ │ (users) │ │ Table (entities)│ │ [Required] │ │ [Optional] │ └────────▲─────────┘ └────────▲─────────┘ │ │ │ Many-to-One │ Many-to-One │ │ ┌───────────────────────────────────────────────────┐ │ Events Table (events) │ │ [Required] │ │ │ │ • FK: ssid → users │ │ • FK: e_* → dim_* (Event Property association) │ │ • FK: e_* → entities (Event Property association)│ └──────────────────┬────────────────────────────────┘ │ │ Many-to-One │ ┌───────▼──────────┐ │ Custom Dimension │ │ Table (dim_*) │ │ [Optional] │ └──────────────────┘ ``` **Diagram Description**: 1. **The Events Table (events) is the core of the data model** - The events table is the only fact table and must exist - Located at the center, all dimension tables are associated through foreign keys in the events table - The events table records user behavior and retrieves additional information by associating with dimension tables through foreign key fields 2. **The Users Table (users) and Multi-Entity Table (entities) are side by side** - **Users Table (users)**: Must exist, created by default, and is the most essential dimension table - **Multi-Entity Table (entities)**: An optional table for analyzing entities other than users - Both have similar functional roles as primary dimension tables for the events table - The events table directly associates with both tables through foreign keys (many-to-one relationship) 3. **Three association methods for Custom Dimension Tables (dim_*)** - **Method 1: Direct association with the events table** (shown at the bottom): Associate through Event Property fields (e.g., `e_store_id`), forming a star schema - **Method 2: Association with the users table** (shown at the upper left): Associate through User Property fields (e.g., `u_vip_level_id`) to the users table, then indirectly to the events table, forming a snowflake schema - **Method 3: Association with the multi-entity table** (shown at the upper right): Associate through entity property fields (e.g., `en_brand_id`) to the multi-entity table, then indirectly to the events table, forming a snowflake schema 4. **Data flow and many-to-one relationships** - Arrow direction indicates the foreign key association direction: from the events table (many) pointing to the dimension table (one) - **Events table → Users table**: Multiple events are associated with the same user through `ssid` - **Events table → Multi-entity table**: Multiple events are associated with the same entity through `e_*` Event Properties - **Events table → Custom dimension table**: Multiple events are directly associated with the same dimension record through `e_*` Event Properties - **Users table/Multi-entity table → Custom dimension table**: Multiple users/entities are associated with the same dimension record through property fields 5. **Star schema vs Snowflake schema** - **Star schema**: Custom dimension tables directly associate with the events table; shorter query paths, better performance (suitable for high-frequency query scenarios) - **Snowflake schema**: Custom dimension tables first associate with the users table/multi-entity table, then indirectly with the events table; higher degree of data normalization, reduced redundancy (suitable for scenarios with complex dimension management) ### Relationship Details #### 1. Events Table → Users Table **Association Method**: - Association field: `events.ssid = users.ssid` - Relationship type: Many-to-one (multiple events belong to the same user) - Data flow: The events table associates with the users table through the `ssid` foreign key **Use Cases**: - Analyze user-dimension metrics such as user activity, user conversion rate, etc. - Filter and group events by User Properties - Example: When analyzing "Purchase" events, associate with the users table to obtain the user's membership level, registration time, and other information #### 2. Events Table → Multi-Entity Table **Association Method**: - Association field: Associate through Event Property fields, e.g., `events.e_device_id = entities.id AND entities.type = 'device'` - Relationship type: Many-to-one (multiple events are associated with the same entity) - Data flow: The events table associates with the multi-entity table through `e_*` Event Properties **Use Cases**: - Analyze non-user entity dimension metrics such as device type, order status, etc. - Support flexible analysis across multiple entity types - Example: When analyzing "page view" events, associate with the device entity to obtain device model, OS, and other information #### 3. Events Table → Custom Dimension Table (Star Schema) **Association Method**: - Association field: Associate through Event Property fields, e.g., `events.e_store_id = dim_stores.store_id` - Relationship type: Many-to-one (multiple events are associated with the same dimension record) - Data flow: The events table directly associates with the custom dimension table through `e_*` Event Properties **Use Cases**: - Directly associate external data to enrich analysis dimensions - Good query performance, suitable for high-frequency query scenarios - Example: When analyzing "Purchase" events, directly associate with the store dimension table through `e_store_id` to obtain store name, region, and other information #### 4. Users Table → Custom Dimension Table (Snowflake Schema) **Association Method**: - Association field: Associate through User Property fields, e.g., `users.u_vip_level_id = dim_vip_levels.level_id` - Relationship type: Many-to-one (multiple users are associated with the same dimension record) - Data flow: Events table → Users table → Custom dimension table (two-level association) **Use Cases**: - Extend user dimension hierarchy, reduce data redundancy in the users table - Facilitate unified management and updating of dimension data - Example: The users table stores `u_vip_level_id`, and the membership level dimension table provides detailed information such as level name, discount rate, etc. #### 5. Multi-Entity Table → Custom Dimension Table (Snowflake Schema) **Association Method**: - Association field: Associate through entity property fields, e.g., `entities.en_brand_id = dim_brands.brand_id` - Relationship type: Many-to-one (multiple entities are associated with the same dimension record) - Data flow: Events table → Multi-entity table → Custom dimension table (two-level association) **Use Cases**: - Extend entity dimension hierarchy, support more complex analysis requirements - Reduce data redundancy in the entity table, improve data management efficiency - Example: The device entity table stores `en_brand_id`, and the brand dimension table provides detailed information such as brand name, country, etc. ## Data Model Best Practices ### 1. Design Events and Properties Properly **Event Design Principles**: - Use PascalCase for event names, e.g., `PageView`, `UserLogin` - Event names should clearly express the user action; avoid being too abstract - Use appropriate granularity — neither too fine-grained nor too coarse - Maintain semantic stability of events; avoid frequent modifications **Property Design Principles**: - Use snake_case for property names, e.g., `order_id`, `total_amount` - Distinguish between Event Properties and User Properties; avoid confusion - Property value types should be explicit; avoid using different types for the same property - Avoid storing sensitive information in properties (e.g., passwords, ID numbers) ### 2. Use User Identifiers Correctly - Call the `identify` method to associate the Anonymous ID with the Login ID when the user logs in - Use a stable business ID as the Login ID; avoid using values that may change - Ensure the uniqueness and persistence of the Anonymous ID - Understand the differences and purposes of `distinct_id`, `anon_id`, `login_id`, and `ssid` See [How to Properly Identify Users](user-identification.mdx) for details. ### 3. Optimize Data Query Performance **Query Optimization Tips**: - Prioritize indexed fields in filter conditions (e.g., `time`, `event_id`, `ssid`) - Limit the time range of queries; avoid querying data over excessively long time spans - When grouping, choose fields with low cardinality (e.g., event name, Cohort) - Use Cohorts instead of complex user filter conditions **Practices to Avoid**: - Do not group by high-cardinality fields (e.g., `trace_id`, `distinct_id`) - Do not use too many Event Property filter conditions in a query - Do not frequently query the full dataset; prefer sampling or aggregated data ### 4. Maintain Data Quality **Data Quality Assurance**: - Thoroughly test tracking code in the development environment - Regularly check data reporting quality and monitor for anomalies - Establish metadata management standards and unify Event and Property Definitions - Use data validation tools to ensure correct data formats **Practices to Avoid**: - Do not test tracking directly in the production environment - Do not frequently change the semantics of existing events - Do not store null or abnormal values in Event Properties ## Important Notes ### Data Storage Limitations - **Events Table**: Event data cannot be modified once written (only the `ssid` field is updated during user identity association) - **Users Table**: Supports updating User Properties, but data will be physically deleted after a certain period following user deletion - **Custom Dimension Tables**: It is recommended to keep the data volume under the million level; excessively large data volumes will affect query performance ### Field Naming Conventions - **Event Property fields**: Use the `e_` prefix uniformly, e.g., `e_product_id` - **User Property fields**: Use the `u_` prefix uniformly, e.g., `u_age` - **Entity property fields**: Use the `en_` prefix uniformly, e.g., `en_device_model` - **Virtual property fields**: Event virtual properties use the `ve_` prefix; user virtual properties use the `vu_` prefix > **Tip**: Virtual properties are computed dynamically at query time and are not physically stored in data tables. ### Data Update Timing - **Event data**: Written immediately when the event occurs; cannot be modified - **User Properties**: Can be updated at any time; it is recommended to update promptly when property values change - **Entity properties**: Can be updated at any time; it is recommended to periodically sync external data - **Dimension table data**: Can be updated at any time; it is recommended to sync when business data changes ## Related Documentation - [How to Properly Identify Users](user-identification.mdx): Learn about core concepts and best practices for user identification - [Events and Properties](events-and-properties.mdx): Learn more about event and property design guidelines - [Tracking Strategy](tracking-strategy.mdx): Choose the right tracking approach --- **Last updated**: January 19, 2026