Frequently Asked Questions

What is Saddle Data?

Saddle Data is an intelligent, cloud-native data integration platform. It allows you to build, manage, and automate data pipelines that connect your various applications, databases, and cloud services. Our visual-first approach makes it easy to see how your data moves, while our developer-centric tools provide the power and flexibility that technical teams need.

Who is Saddle Data designed for?

Saddle Data is built primarily for data engineers, analytics engineers, and backend developers who need a powerful and efficient way to manage their organization's data infrastructure. It's also a great fit for technical users in operations, marketing, or finance who need to move beyond the limitations of simpler automation tools and require more control over larger datasets.

How does the Remote Agent architecture work?

Unlike traditional ETL tools that require you to open inbound firewall ports, our Remote Agent uses an outbound-only architecture. You run a lightweight Docker container inside your own VPC. The agent reaches out to the Saddle Data Control Plane to poll for jobs, executes the data movement locally, and streams the data directly to your destination. Your data never passes through our infrastructure, and your credentials stay behind your firewall.
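To make the outbound-only pattern concrete, here is a minimal sketch of a polling loop. It is an illustration only, not the actual agent: the Job, FakeControlPlane, and poll_once names are hypothetical, and the control plane is an in-memory stand-in so the example runs without a network. The point it shows is that only job metadata and status cross the boundary, while the rows stay local.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    # Job metadata only: names of a source and a destination, never the rows.
    job_id: str
    source: str
    destination: str


class FakeControlPlane:
    """In-memory stand-in for a control plane (hypothetical interface)."""

    def __init__(self, jobs):
        self._pending = deque(jobs)
        self.reports = {}

    def next_job(self):
        # Called BY the agent (outbound); the control plane never dials in.
        return self._pending.popleft() if self._pending else None

    def report(self, job_id, status, row_count):
        # Only a status and a row count flow back, not the data itself.
        self.reports[job_id] = (status, row_count)


def poll_once(control_plane, local_sources, local_destinations):
    """Poll for one job, move the rows locally, and report the outcome."""
    job = control_plane.next_job()
    if job is None:
        return False
    rows = local_sources[job.source]
    local_destinations.setdefault(job.destination, []).extend(rows)
    control_plane.report(job.job_id, "succeeded", len(rows))
    return True


control_plane = FakeControlPlane([Job("job-1", "orders_db", "warehouse")])
sources = {"orders_db": [{"id": 1}, {"id": 2}]}
destinations = {}
while poll_once(control_plane, sources, destinations):
    pass  # keep polling until no jobs are pending
```

In a real deployment the poll would be an outbound HTTPS request from inside your VPC, which is why no inbound port needs to be opened.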

Can I chain flows together (DAGs)?

Yes! Saddle Data features a visual DAG (Directed Acyclic Graph) orchestrator. You can configure any flow to trigger downstream flows upon success. This allows you to build complex dependencies, such as syncing raw data from multiple sources followed by a consolidated transformation step.
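The trigger-on-success behavior can be sketched in a few lines. This is an assumption-laden illustration rather than how our orchestrator is implemented: run_dag and the flow names are hypothetical, and the ordering comes from Python's standard graphlib module (Python 3.9+). A downstream flow runs only if every upstream flow succeeded; otherwise it is skipped.

```python
from graphlib import TopologicalSorter


def run_dag(upstream_of, run_flow):
    """Run flows in dependency order; skip a flow unless all upstreams succeeded.

    upstream_of maps each flow name to the flows it depends on.
    run_flow is a callable returning True on success, False on failure.
    """
    results = {}
    # static_order() yields every flow after all of its upstream flows.
    for flow in TopologicalSorter(upstream_of).static_order():
        upstreams = upstream_of.get(flow, ())
        if all(results[name] == "succeeded" for name in upstreams):
            results[flow] = "succeeded" if run_flow(flow) else "failed"
        else:
            results[flow] = "skipped"
    return results


# Two raw syncs feed one consolidated transformation step.
upstream_of = {"consolidate": ["sync_postgres", "sync_stripe"]}
all_ok = run_dag(upstream_of, lambda flow: True)
one_failed = run_dag(upstream_of, lambda flow: flow != "sync_stripe")
```

In the second run, sync_stripe fails, so consolidate is skipped while sync_postgres still completes.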

What is Schema Drift Handling?

Schema drift occurs when an upstream source (like a Postgres table) adds or removes a column. Saddle Data gives you two ways to handle this. You can choose Auto-Migrate, where we automatically issue an ALTER TABLE to your destination, or Pause for Review, where we halt the pipeline and notify you so you can approve the change before syncing resumes.
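The two policies can be illustrated with a small sketch. Everything here is hypothetical (diff_schema, handle_drift, the policy strings, and the shape of the result), and it makes one simplifying assumption: under the auto-migrate policy it only generates ADD COLUMN statements for new columns and merely reports removed ones. Identifier quoting is omitted for brevity.

```python
def diff_schema(source_columns, destination_columns):
    """Return (added, removed) column names, comparing source to destination."""
    added = sorted(set(source_columns) - set(destination_columns))
    removed = sorted(set(destination_columns) - set(source_columns))
    return added, removed


def handle_drift(table, source_columns, destination_columns, policy):
    """Decide what to do about drift; columns are {name: sql_type} dicts."""
    added, removed = diff_schema(source_columns, destination_columns)
    if not added and not removed:
        return {"action": "sync", "statements": []}
    if policy == "pause_for_review":
        # Halt and surface the change so a person can approve it.
        return {"action": "pause", "added": added, "removed": removed}
    if policy == "auto_migrate":
        # Assumption: only additions are migrated; removals are reported.
        statements = [
            f"ALTER TABLE {table} ADD COLUMN {name} {source_columns[name]}"
            for name in added
        ]
        return {"action": "migrate", "statements": statements, "removed": removed}
    raise ValueError(f"unknown policy: {policy}")


source = {"id": "INTEGER", "email": "TEXT", "signup_date": "DATE"}
destination = {"id": "INTEGER", "email": "TEXT"}
migrated = handle_drift("users", source, destination, "auto_migrate")
paused = handle_drift("users", source, destination, "pause_for_review")
```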

How is Saddle Data different from other tools like Fivetran?

Like Fivetran, we offer robust data replication, but we differentiate ourselves with our Hybrid Data Plane: you can run pipelines in our Cloud plane or on a Remote Agent inside your own network, so you don't have to choose between SaaS convenience and on-prem security. Plus, our visual-first approach and developer-centric features like native dbt integration and first-class API access make us a strong fit for engineering teams.

What kind of data sources and destinations do you support?

We are constantly expanding our library of connectors. Currently, we support PostgreSQL, MySQL, Cassandra, Snowflake, BigQuery, Databricks, Google Sheets, AWS Cost Explorer, Stripe, Salesforce, HubSpot, and more. Our platform is designed to connect to a wide range of databases, SaaS applications, and data warehouses.

Can I transform my data with Saddle Data?

Yes! Saddle Data has a powerful in-flight transformation engine. You can build a series of steps to filter rows, select or rename columns, and reshape your data before it ever reaches its destination. We also feature a first-class dbt Core integration for warehouse-native transformations.
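As a rough picture of what a series of in-flight steps does, the sketch below chains a filter, a column selection, and a rename over rows represented as dictionaries. The function names and row format are assumptions made for illustration, not the engine's actual interface.

```python
def filter_rows(predicate):
    """Step that keeps only rows for which predicate(row) is true."""
    return lambda rows: (row for row in rows if predicate(row))


def select_columns(names):
    """Step that keeps only the listed columns, in the listed order."""
    return lambda rows: ({name: row[name] for name in names} for row in rows)


def rename_columns(mapping):
    """Step that renames columns; unmapped columns keep their names."""
    return lambda rows: (
        {mapping.get(key, key): value for key, value in row.items()}
        for row in rows
    )


def run_steps(rows, steps):
    """Apply each step in order; rows stream lazily until collected here."""
    for step in steps:
        rows = step(rows)
    return list(rows)


rows = [
    {"id": 1, "email": "a@example.com", "status": "active", "tmp": "x"},
    {"id": 2, "email": "b@example.com", "status": "deleted", "tmp": "y"},
]
steps = [
    filter_rows(lambda row: row["status"] == "active"),
    select_columns(["id", "email"]),
    rename_columns({"email": "email_address"}),
]
result = run_steps(rows, steps)
```

Because each step is a generator, rows are reshaped one at a time on their way to the destination rather than being staged in full first.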

How does Saddle Data handle sensitive credentials?

Security is our top priority. We never store your sensitive credentials directly in our application database. When you use our Cloud plane, we store them in Google Secret Manager. If you use our Remote Agent, you can inject credentials locally via environment variables or your own secrets manager, ensuring they never touch our servers.
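For the environment-variable route, a process running inside your network might read its credentials along these lines. This is a generic sketch, not the agent's actual configuration: the variable names SOURCE_DB_USER and SOURCE_DB_PASSWORD and both helper functions are hypothetical.

```python
import os

# Hypothetical variable names, chosen for this example only.
REQUIRED_VARS = ("SOURCE_DB_USER", "SOURCE_DB_PASSWORD")


def load_credentials(env=None):
    """Read required credentials from the environment, failing fast if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def redacted(credentials):
    """Return a copy that is safe to log: names only, values masked."""
    return {name: "***" for name in credentials}


# A plain dict stands in for os.environ so the example runs anywhere.
example_env = {"SOURCE_DB_USER": "etl_reader", "SOURCE_DB_PASSWORD": "s3cret"}
credentials = load_credentials(example_env)
safe_to_log = redacted(credentials)
```

The values are read at startup inside your environment and only the masked form is ever logged or reported.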

Do you have a free plan?

Every plan starts with a 30-day free trial, so you can start building and running data pipelines right away. You can sign up and get started in minutes from our sign-up page.

Is there documentation for your API?

Absolutely. We are an API-first platform. Everything you can do in our UI is also available via our REST API. You can find our comprehensive documentation at docs.saddledata.io.