Job Detail

Senior Data Engineer

  • Level: Medium
  • Type: Fixed
  • Duration: More than 6 months

Project Detail

We have 150+ globally distributed (remote) team members who love to work from their favorite places in the world. We have team members based in the USA, Canada, Hungary, Japan, Brazil, Spain, the Philippines, Nigeria, the UK, and more! We love candidates who have a passion for making a global difference in financial services and technology by impacting local communities and becoming part of our hyper-growth company.

Your Role:

We are seeking a Senior Data Engineer to design and develop the data management layer for our platform. At Alpaca, data engineering encompasses financial transactions, customer data, API logs, system metrics, augmented data, and third-party systems that impact decision-making for both internal and external users. We process hundreds of millions of events daily, with this number growing as we onboard new customers.

We prioritize open-source solutions in our data management approach, leveraging a Google Cloud Platform (GCP) foundation for our data infrastructure. This includes batch/stream ingestion, transformation, and consumption layers for BI, internal use, and external third-party sinks. Additionally, we oversee data experimentation, cataloging, and monitoring and alerting systems.
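
For illustration only, a minimal sketch of what a batch ingestion-to-transformation flow of this kind might look like as an Airflow DAG. The DAG name, task names, and schedule below are hypothetical placeholders, not Alpaca's actual pipeline:

```python
# Illustrative sketch: a daily batch flow that ingests raw events and then
# triggers the transformation layer. All names here are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_raw_events(**context):
    # Placeholder: land a day's worth of events into a raw staging table.
    print(f"ingesting events for {context['ds']}")


def run_transformations(**context):
    # Placeholder: run the SQL transformation layer (e.g., dbt models)
    # that builds BI-facing tables from the raw staging data.
    print(f"transforming events for {context['ds']}")


with DAG(
    dag_id="daily_events_pipeline",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(
        task_id="ingest_raw_events", python_callable=ingest_raw_events
    )
    transform = PythonOperator(
        task_id="run_transformations", python_callable=run_transformations
    )

    ingest >> transform  # transformation runs only after ingestion succeeds
```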

Our team is 100% distributed and remote.

Responsibilities:

  • Design and oversee key forward and reverse ETL patterns to deliver data to relevant stakeholders (a reverse ETL sketch follows this list).
  • Develop scalable patterns in the transformation layer to ensure repeatable integrations with BI tools across various business verticals.
  • Expand and maintain the constantly evolving elements of the Alpaca Data Lakehouse architecture.
  • Collaborate closely with sales, marketing, product, and operations teams to address key data flow needs.
  • Operate the system and manage production issues in a timely manner.
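
As a rough, hedged sketch of the reverse ETL pattern mentioned above: read a modeled table from the warehouse and push rows to a downstream stakeholder tool. The project, table, and endpoint URL are hypothetical and are not part of this posting's actual stack:

```python
# Illustrative reverse ETL sketch: query a BI-facing warehouse model and
# sync each row to a third-party tool over HTTP. Names are placeholders.
import requests
from google.cloud import bigquery


def sync_active_accounts() -> None:
    client = bigquery.Client()  # uses default GCP credentials

    # Hypothetical table built by the transformation layer.
    rows = client.query(
        "SELECT account_id, email, lifetime_volume FROM analytics.active_accounts"
    ).result()

    # Push each record to a hypothetical downstream endpoint.
    for row in rows:
        response = requests.post(
            "https://example-crm.invalid/api/accounts",
            json={
                "account_id": row["account_id"],
                "email": row["email"],
                "lifetime_volume": row["lifetime_volume"],
            },
            timeout=10,
        )
        response.raise_for_status()


if __name__ == "__main__":
    sync_active_accounts()
```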

Must-Haves:

  • Proven experience building data engineering solutions using open-source infrastructure.
  • Proficiency in at least one programming language, with strong working knowledge of Python and SQL.
  • Experience with cloud-native technologies like Docker, Kubernetes, and Helm.
  • Strong hands-on experience with relational database systems.
  • Experience in building scalable transformation layers, preferably through formalized SQL models (e.g., dbt).
  • Ability to work in a fast-paced environment and adapt solutions to changing business needs.
  • Experience with ETL technologies like Airflow and Airbyte.
  • Production experience with streaming systems like Kafka.
  • Exposure to infrastructure, DevOps, and Infrastructure as Code (IaC).
  • Deep knowledge of distributed systems, storage, transactions, and query processing.

Interested? Click here to apply.