
Supply Chain & Retail Solutions user guide
Data ingestion
Peak's Data Sources feature lets you ingest data from external sources into your data warehouse using connectors and feeds. Once configured, feeds run on a schedule, on demand, or via webhook.
To access Data Sources, open Manage and select Data Sources.
Peak supports four data ingestion paths:
- File storage ingestion — ingest data from files via drag and drop, FTP/SFTP upload, or signed URL.
- Application connectors — pull data from online platforms such as Google Ads, Amazon S3, and Braze.
- Database connectors — connect directly to a database such as Redshift, Snowflake, PostgreSQL, MSSQL, MySQL, or Oracle.
- Ingestion API — push data into Peak programmatically.
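As a rough illustration of the Ingestion API path, the sketch below builds a push request without sending it. The endpoint URL, payload shape, and auth header are placeholders, not Peak's actual API — your tenant settings provide the real values.

```python
import json
import urllib.request

# Hypothetical values: the real endpoint URL and API key come from your
# Peak tenant; this sketch only shows the general shape of a push request.
INGESTION_URL = "https://example.invalid/ingestion/feeds/orders"
API_KEY = "YOUR_TENANT_API_KEY"

records = [{"order_id": 1, "amount": 42.5}]
request = urllib.request.Request(
    INGESTION_URL,
    data=json.dumps(records).encode("utf-8"),
    headers={"Content-Type": "application/json", "Authorization": API_KEY},
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch
# stays runnable without a live endpoint.
```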
Data feeds
A data feed connects to a source, copies its data, and ingests the copy into your Peak data warehouse. Feeds can run automatically using a trigger or manually on demand.
Each feed is configured through four stages:
| Stage | Description |
|---|---|
| Connection | Source location and credentials (for example, hostname, username, password). |
| Import configuration | Specific tables or data to ingest, load type, and key configuration. See Load types. |
| Destination | Target storage in the data warehouse or S3. See Destination options. |
| Trigger | When and how the feed runs. See Triggers and watchers. |
Managing feeds
The Feeds screen shows the status, run history, and next scheduled run for each feed. Hover over a feed to access the following actions:
| Action | Description |
|---|---|
| Run | Runs the feed immediately. |
| Pause | Pauses the feed schedule. |
| Tags | Manages tags for the feed. Only alphanumeric characters are allowed. Use Tab or Enter to separate values. |
| Edit | Opens the feed configuration for editing. |
| Resume | Resumes the feed schedule after a pause. |
Filtering feeds
Use the filter function to find feeds when you have a large number. The following filters are available:
- Feed status: Active, Paused
- Trigger type: Schedule, Webhook, Run Once (Manual), Run Once (Schedule)
- Last run status: Running, Failed, Success, No new data
- Tags: Custom tags applied to your feeds
Monitoring feed activity
Select a feed to open its detail view. Two tabs are available:
Logs tab
Shows how many rows were successfully loaded into the data warehouse, and detailed logs for each feed run. Select the Browse file icon on a log entry to open Files and view the files associated with that feed run.
If a feed run fails, error details are shown in the detailed log. To get details of individual failed records, download the STL load error files using the download icon next to the error details.
Info tab
Shows basic configuration information about the feed.
Load types
Load types control how data is written to your destination table when a feed runs. You select a load type during import configuration.
Load type summary
| Load type | Primary key required | Last run key required | Behavior |
|---|---|---|---|
| Truncate and insert | Optional | Not available | Replaces the destination table with each run. |
| Incremental | Optional | Required | Appends new records based on the last run key. |
| Upsert | Required | Optional | Updates existing records and inserts new records. |
Key configuration
Two key columns can be configured when selecting a load type:
- Primary key: The column or set of columns that uniquely identifies each record. Used to determine which records to update.
- Last run key: A column used to detect new or changed data since the previous run. This can be a timestamp (which detects both new and modified rows) or a strictly increasing value such as an auto-incrementing ID (which detects new rows only). Last run key configuration is only available for database connectors.
Truncate and insert
Best suited for small data tables.
When this load type runs, the entire destination table is replaced with the data retrieved from the source.
- Primary key: Optional. Because the full table is replaced each run, there is no need to identify specific records.
- Last run key: Not available. The full dataset is always fetched.
Incremental
Best suited for event-type datasets where records are added over time but not modified.
With incremental feeds, records are only inserted — existing records are never updated. At each run, only records added since the last run are fetched, based on the column value specified in the last run key.
- Primary key: Optional. Only new records are inserted into the existing table, so record-level identification is not required.
- Last run key: Required. Determines the cutoff point for fetching new records. For example, if the last run key is a date column, only records with a value greater than the date of the last run are fetched.
Upsert
Best suited for transactional data where records can be created or modified over time.
With upsert feeds, new records are inserted and existing records are updated. In some cases, the full dataset is fetched to ensure all updates are captured.
- Primary key: Required. Used to identify which existing records need to be updated when new data arrives.
- Last run key: Optional. If you need to capture updates to existing records across the full dataset, omit the last run key so that all records are fetched on each run.
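To make the contrast between the three load types concrete, here is a minimal in-memory sketch (illustrative Python, not Peak's implementation; table rows are modeled as dicts):

```python
def truncate_and_insert(dest, source):
    """Replace the destination table with the fetched source data."""
    return [dict(row) for row in source]

def incremental(dest, source, last_run_key, last_value):
    """Append only source rows newer than the stored last-run-key value."""
    new_rows = [dict(r) for r in source if r[last_run_key] > last_value]
    return dest + new_rows

def upsert(dest, source, primary_key):
    """Update rows that share a primary key; insert the rest."""
    merged = {row[primary_key]: dict(row) for row in dest}
    for row in source:
        merged[row[primary_key]] = dict(row)
    return list(merged.values())

dest = [{"id": 1, "status": "new"}]
source = [{"id": 1, "status": "shipped"}, {"id": 2, "status": "new"}]
print(upsert(dest, source, "id"))
# id 1 is updated to "shipped"; id 2 is inserted
```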
Destination options
When creating a data feed, you select a destination where Peak ingests your data. The available destinations depend on the data warehouse configured for your Peak organization.
Checking your data warehouse type
To see which warehouse type your organization uses, open Manage and select Data Bridge. In the Data warehouse section, check the storage type shown.
For more information about data warehouse configuration, see Data Bridge overview.
Available destinations by warehouse
| Data warehouse | Available destinations |
|---|---|
| Snowflake | S3 (Spark processing) or Snowflake — not both |
| Redshift | S3 (Spark processing), Redshift, or both |
S3 (Spark processing)
Stores data in Amazon S3. Peak uses Apache Spark to process large, unstructured CSV datasets.
- For Snowflake organizations: data is processed as an external table during ingestion and is available to query in SQL Explorer.
- For Redshift organizations: data is processed as Redshift Spectrum during ingestion and is available to query in SQL Explorer. This destination requires an active Glue Catalog. If the option is unavailable, configure the Glue Catalog first.
Snowflake
Data is ingested into the Snowflake data warehouse. Snowflake is SQL-based and supports querying and analyzing data using standard SQL.
After ingestion, Peak adds these audit columns to the destination table:
| Audit column | Description |
|---|---|
| PEAKAUDITCREATEDAT | Time when the record was created in the table. |
| PEAKAUDITFILENAME | Path of the raw file in the data lake that contains the record. |
| PEAKAUDITREQUESTID | Feed run identifier. |
| PEAKAUDITUPDATECOUNTER | Number of times the record has been updated (upsert load type only). |
| PEAKAUDITUPDATEDAT | Date when the record was last updated (upsert load type only). |
Redshift
Data is ingested into Amazon Redshift. Redshift is a relational database that supports SQL querying and is optimized for aggregations on large datasets. Incoming data must map exactly to the destination table schema, column by column. Any failed rows are flagged and written to a separate table.
After ingestion, Peak adds these audit columns to the destination table:
| Audit column | Description |
|---|---|
| peakauditcreatedat | Time when the record was created in the table. |
| peakauditrequestid | Feed run identifier. |
| peakauditupdatecounter | Number of times the record has been updated (upsert load type only). |
| peakauditupdatedat | Date when the record was last updated (upsert load type only). |
Failed row threshold
The failed row threshold defines how many rows can fail before Peak marks the feed as failed. Set a threshold that reflects your total row volume and acceptable data quality tolerance.
Behavior varies by load type:
- Incremental and truncate and insert: the threshold is the aggregate error count across all files in a feed run.
- Upsert: the threshold is applied per file. If some files exceed the threshold and others do not, the feed is marked as Partially ingested.
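These threshold rules can be sketched as a small decision function (illustrative only; the case where every file exceeds the threshold is assumed to mark the feed as failed):

```python
def run_status(errors_per_file, threshold, load_type):
    """Map per-file failed-row counts to a feed run status."""
    if load_type in ("incremental", "truncate_and_insert"):
        # The threshold is the aggregate error count across all files.
        return "failed" if sum(errors_per_file) > threshold else "success"
    # Upsert: the threshold applies to each file individually.
    over = [count > threshold for count in errors_per_file]
    if all(over):
        return "failed"            # assumption: all files over = failed
    if any(over):
        return "partially_ingested"
    return "success"

print(run_status([5, 1500], threshold=1000, load_type="upsert"))
# one file over the threshold -> "partially_ingested"
```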
Additional rules:
- If all rows in a file have corrupted data, the feed is marked as failed regardless of the threshold.
- Error messages for failed rows can be downloaded from feed logs.
- Only positive integers are accepted as threshold values.
Threshold defaults and limits by destination:
| Property | Snowflake | Redshift |
|---|---|---|
| Default threshold | 0 | 1,000 |
| Maximum threshold | 100,000 | 100,000 |
| Editable in Database Connector | No (fixed at 0) | Yes |
| Available in REST API, Webhook, Braze Currents connectors | No | No |
Schema evolution
Peak handles schema changes automatically when a feed run detects added or removed columns.
For CSV files, all files within a single feed run must use the same schema. Inconsistent schemas across files in the same feed run cause ingestion failure. For NDJSON files, different files can have different schemas, provided each individual file has a consistent unified schema.
| Change | Behavior |
|---|---|
| New column added | Column is added to the destination table with data type string. Previous records are set to NULL for that column. |
| Existing column removed | Column is retained in the destination table. New records are set to NULL for that column. |
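A minimal sketch of these two rules (illustrative, not Peak's code): a new column is added and backfilled with NULL for earlier records, while a column missing from new data is retained and filled with NULL for new records.

```python
def evolve_and_load(table, incoming_rows):
    """Apply the schema-evolution rules to an in-memory 'table'.

    table is {"columns": [...], "rows": [dict, ...]}.
    """
    incoming_cols = list(incoming_rows[0]) if incoming_rows else []
    # New column added: append it and set previous records to NULL (None).
    for col in incoming_cols:
        if col not in table["columns"]:
            table["columns"].append(col)
            for row in table["rows"]:
                row[col] = None
    # Existing column removed: it stays; new records get NULL for it.
    for row in incoming_rows:
        table["rows"].append({c: row.get(c) for c in table["columns"]})
    return table

t = {"columns": ["id", "name"], "rows": [{"id": 1, "name": "a"}]}
evolve_and_load(t, [{"id": 2, "email": "b@x.com"}])
print(t["columns"])   # ['id', 'name', 'email']
```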
Schema data types
When configuring a destination, you can override the inferred data type for each column. This is available for all connectors except Webhook and Braze Currents.
Supported data types: STRING, INTEGER, NUMERIC, TIMESTAMP, DATE, BOOLEAN, JSON.
TIMESTAMPTZ is not supported. Data in this format is ingested as STRING.
Triggers and watchers
Triggers define when a data feed runs. Watchers send notifications when feed events occur.
Trigger types
| Trigger type | Purpose | Notes |
|---|---|---|
| Schedule | Runs feeds on a basic or cron schedule. | Cron expressions use 6 or 7 fields. |
| Webhook | Runs feeds when external systems send events. | Requires a webhook URL and API key. |
| Run once | Runs a feed once on demand or at a set time. | Manual runs are available from the Feeds list. |
Schedule trigger
Basic schedules run a feed at a specified time and day, or at a recurring frequency (for example, every 2 hours or every Monday and Tuesday at 12:00).
Advanced schedules use cron expressions for more precise timing.
Cron format
A cron expression is a string of 6 or 7 space-separated fields:
| Field | Mandatory | Allowed values | Allowed special characters |
|---|---|---|---|
| Seconds | Yes | 0–59 | , - * / |
| Minutes | Yes | 0–59 | , - * / |
| Hours | Yes | 0–23 | , - * / |
| Day of month | Yes | 1–31 | , - * ? / L W |
| Month | Yes | 1–12 or JAN–DEC | , - * / |
| Day of week | Yes | 1–7 or SUN–SAT | , - * ? / L # |
| Year | No | empty, 1970–2099 | , - * / |
Examples:
- 0 0 12 * * ? runs every day at 12:00.
- 0 15 10 * * ? 2021 runs at 10:15 every day during 2021.
- 0 15 10 ? * 6L runs at 10:15 on the last Friday of each month.
To validate your cron expressions before use, consider using a syntax checker such as crontab.guru.
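As a quick sanity check before reaching for an external tool, the field count alone catches many mistakes. This is a minimal sketch; it does not validate value ranges or special characters:

```python
def plausible_cron(expr: str) -> bool:
    """A 6- or 7-field expression has the right shape; anything else doesn't."""
    return len(expr.split()) in (6, 7)

print(plausible_cron("0 0 12 * * ?"))        # True: the daily-at-noon example
print(plausible_cron("0 15 10 * * ? 2021"))  # True: 7 fields including a year
print(plausible_cron("0 12 * * ?"))          # False: only 5 fields
```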
Webhook trigger
Unlike regular APIs that require constant polling for updates, webhooks only send data when a specific event occurs — in this case, when new data is available for the feed.
For webhook triggers, Peak generates a unique webhook URL for the feed. Copy the URL into the external system that sends events. You can regenerate the URL if needed. If the external system is outside Peak, provide your tenant API key so the webhook can be authenticated. See API keys.
Run once trigger
Run once triggers can be manual or scheduled:
- Manual: run the feed from the Feeds list when needed.
- Date and time: run the feed once at a specified time (at least 30 minutes from now).
Watchers
Configure watchers to send notifications for feed events:
| Watcher type | Purpose | Example use case |
|---|---|---|
| User watcher | Notifies Peak users in the platform. | Alert a data team when a feed fails. |
| Webhook watcher | Sends events to external systems. | Trigger a Slack notification or workflow. |
To add a watcher to a feed:
- In the Trigger stage, select Add watcher.
- Choose User watcher or Webhook watcher.
- Select the feed events to monitor and save the watcher.
User watcher
User watchers notify selected users inside Peak when a feed event occurs. Users can view notifications from the bell icon.
When configuring a user watcher, you can choose to watch all events or select a custom set.
User watchers can be configured for these events:
- Create
- Run fail
- Run success
- No new data
- Feed edit or delete
Webhook watcher
Webhook watchers send a notification or trigger an action in an external system or Peak feature when a feed event occurs. Examples include Slack notifications or Peak Workflows.
The webhook URL is provided by the target application. If the target is a Peak Workflow, copy the URL from the Workflow's Trigger step.
The JSON payload is optional and can include these parameters to pass feed context:
- {tenantname}
- {jobtype}
- {jobname}
- {trigger}
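For illustration, a notification message built from these parameters might look like the following. The context values here are hypothetical; in practice, Peak substitutes the real feed context when the event fires.

```python
import json

# Hypothetical context; Peak fills these in from the actual feed event.
context = {
    "tenantname": "acme",
    "jobtype": "feed",
    "jobname": "daily_orders",
    "trigger": "schedule",
}
message = "Feed {jobname} ({jobtype}) in {tenantname} ran via {trigger}".format(**context)
payload = json.dumps({"text": message})
print(payload)
```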
Webhook watchers can be configured for these events:
- Run fail
- Run success
- Running longer than a specified time
- No new data
File naming and timestamps
Peak uses file names to determine how files are grouped into feeds. If a file is updated and fetched regularly as part of the same feed, the file must retain the same base name with a new timestamp appended on each update.
File names must include at least one alphanumeric character.
File naming patterns
Use one of these patterns when naming files for feeds:
- <file_name>_<s|n|part><number>.<extension> (for example, Abcs_s12345.csv or ABC_part123323.csv)
- <file_name>_<valid_date>.<extension> (for example, Abc_20131101.csv or Abc_20131101123432.csv)
How Peak parses file names
Peak applies these parsing rules to extract the feed name from each file:
| File name pattern | Example | Parsed name |
|---|---|---|
| name_timestamp.csv | customer_20181112.csv | customer |
| name_part_timestamp.csv | customer_part123_20181211.csv | customer_part123 |
| name_timestampanytext.csv | customer_20181112anytext.csv | customer_20181112anytext |
| No timestamp | customer_profile.csv | customer_profile |
| No underscore before timestamp | customer20120312.csv | customer |
| Special symbols | customer-company:20130817.csv | customercompany |
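The parsing rules above can be approximated with a short sketch (illustrative only; Peak's actual parser may differ in edge cases):

```python
import re

# Digit runs matching the valid timestamp lengths (8, 10, 12, or 14-17 digits).
TIMESTAMP = r"(?:\d{14}\d{0,3}|\d{12}|\d{10}|\d{8})"

def feed_name(filename: str) -> str:
    stem = filename.rsplit(".", 1)[0]
    stem = re.sub(r"[^A-Za-z0-9_]", "", stem)      # drop special symbols
    return re.sub(rf"_?{TIMESTAMP}$", "", stem)    # strip a trailing timestamp

print(feed_name("customer_20181112.csv"))          # customer
print(feed_name("customer_part123_20181211.csv"))  # customer_part123
print(feed_name("customer_20181112anytext.csv"))   # customer_20181112anytext
print(feed_name("customer-company:20130817.csv"))  # customercompany
```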
Timestamp formats
Peak recognizes these valid timestamp formats in file names:
- YYYYMMDD, YYYYMMDDHH, YYYYMMDDHHmm, YYYYMMDDHHmmss, YYYYMMDDHHmmssS, YYYYMMDDHHmmssSS, YYYYMMDDHHmmssSSS
- DDMMYYYY, DDMMYYYYHH, DDMMYYYYHHmm, DDMMYYYYHHmmss, DDMMYYYYHHmmssS, DDMMYYYYHHmmssSS, DDMMYYYYHHmmssSSS
Required schemas
If you upload data to a managed Snowflake data warehouse, Peak expects schemas for organizing your data. Schemas are sub-areas of a data warehouse used to group tables by their purpose and processing stage.
Tables do not strictly need to be placed in a particular schema, but using the recommended schemas keeps your data well organized.
Recommended schemas
The following schemas organize your data by processing stage:
| Schema | Purpose |
|---|---|
| STAGE | Raw data. Data ingested via Manage > Data Sources lands here by default. |
| TRANSFORM | Aggregated data ready for modeling. |
| PUBLISH | Processed and cleaned data for use in dashboards and web apps. |
| SANDPIT | Experimental or ad hoc data not ready for modeling or use in apps. |