- Full Time
- Global - Remote Anywhere

Alpaca
Your Role:
We are seeking a Senior Data Engineer to design and develop the data management layer for our platform. At Alpaca, data engineering encompasses financial transactions, customer data, API logs, system metrics, augmented data, and third-party systems that impact decision-making for both internal and external users. We process hundreds of millions of events daily, with this number growing as we onboard new customers.
We prioritize open-source solutions in our data management approach, leveraging a Google Cloud Platform (GCP) foundation for our data infrastructure. This includes batch/stream ingestion, transformation, and consumption layers for BI, internal use, and external third-party sinks. Additionally, we oversee data experimentation, cataloging, and monitoring and alerting systems.
Our team is 100% distributed and remote.
Responsibilities:
- Design and oversee key forward and reverse ETL patterns to deliver data to relevant stakeholders.
- Develop scalable patterns in the transformation layer to ensure repeatable integrations with BI tools across various business verticals (a sketch of this kind of pipeline follows this list).
- Expand and maintain the constantly evolving elements of the Alpaca Data Lakehouse architecture.
- Collaborate closely with sales, marketing, product, and operations teams to address key data flow needs.
- Operate the system and resolve production issues promptly.
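
To give a concrete flavor of these responsibilities, here is a minimal sketch (ours, not Alpaca's actual pipeline) of a daily batch flow: an Airflow DAG that lands raw events, runs dbt models, and publishes results to a BI sink. Every identifier in it (the DAG id, task names, callables, and the dbt selector) is a hypothetical placeholder.

```python
# Illustrative sketch only (Airflow 2.x assumed): a daily batch
# pipeline that extracts raw events, runs dbt models, and publishes
# the results to a BI sink. All ids, task names, callables, and the
# dbt selector are hypothetical placeholders, not Alpaca's pipeline.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def extract_events(**context):
    """Stub: land a day's worth of raw events in the lakehouse."""


def publish_to_bi(**context):
    """Stub: push transformed marts to a BI tool or third-party sink."""


with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_events", python_callable=extract_events)
    transform = BashOperator(task_id="run_dbt_models", bash_command="dbt run --select marts")
    publish = PythonOperator(task_id="publish_to_bi", python_callable=publish_to_bi)

    extract >> transform >> publish
```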
Must-Haves:
- Proven experience building data engineering solutions using open-source infrastructure.
- Proficiency in at least one general-purpose programming language, with strong working knowledge of Python and SQL.
- Experience with cloud-native technologies like Docker, Kubernetes, and Helm.
- Strong hands-on experience with relational database systems.
- Experience in building scalable transformation layers, preferably through formalized SQL models (e.g., dbt).
- Ability to work in a fast-paced environment and adapt solutions to changing business needs.
- Experience with ETL technologies like Airflow and Airbyte.
- Production experience with streaming systems like Kafka (see the consumer sketch after this list).
- Exposure to infrastructure, DevOps, and Infrastructure as Code (IaC).
- Deep knowledge of distributed systems, storage, transactions, and query processing.
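
As a companion to the streaming requirement above, here is a minimal sketch of a Kafka consumer loop using the confluent-kafka Python client. The broker address, consumer group id, and topic name are hypothetical placeholders, not part of any real deployment.

```python
# Illustrative only: a minimal confluent-kafka consumer loop of the
# kind this role would run in production. The broker address, group
# id, and topic name are hypothetical placeholders.
import json

from confluent_kafka import Consumer

consumer = Consumer(
    {
        "bootstrap.servers": "localhost:9092",  # placeholder broker
        "group.id": "events-ingest",            # placeholder group
        "auto.offset.reset": "earliest",
    }
)
consumer.subscribe(["raw-events"])  # placeholder topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            # In production, this would route to monitoring/alerting.
            print(f"consumer error: {msg.error()}")
            continue
        event = json.loads(msg.value())
        # Downstream: validate, enrich, and land the event in the lakehouse.
        print(event)
finally:
    consumer.close()
```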
If you’re passionate about data engineering and thrive in a dynamic startup environment, we’d love to hear from you!