Google

Data Engineer

Warsaw, Poland

80 000 - 110 000 EUR

Gross

Annual

Employment

Full time

Experience

Senior

Contract

B2B

Job type

Office

Python, AWS, Data, SQL

Job description

Google's Warsaw data platform team builds the pipelines, tooling, and infrastructure that power analytics and ML workloads across multiple product lines. We process hundreds of billions of events per month, and engineering reliability and cost-efficiency into the data layer is as important as feature velocity.

As a Senior Data Engineer you will design, build, and own end-to-end data pipelines - from Kafka ingestion through Spark transformations to Redshift and Delta Lake. You will work closely with data scientists who consume your outputs and product engineers who instrument the raw events. You are expected to introduce and maintain data quality checks with Great Expectations, write dbt models that non-engineers can trust, and proactively identify schema drift before it breaks downstream dashboards.
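To make "schema drift" concrete: the idea is to compare the fields and types a consumer expects against what actually arrives, and to flag the difference before a dashboard breaks. The snippet below is a minimal sketch in plain Python, not the Great Expectations API and not the team's actual tooling. The names `DriftReport`, `detect_schema_drift`, and `infer_schema` are hypothetical, and types are compared only by Python type name.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    """Differences between an expected schema and an observed one (hypothetical type)."""

    missing: list[str] = field(default_factory=list)     # expected but absent
    unexpected: list[str] = field(default_factory=list)  # present but not expected
    type_changes: dict[str, tuple[str, str]] = field(default_factory=dict)

    @property
    def has_drift(self) -> bool:
        return bool(self.missing or self.unexpected or self.type_changes)


def infer_schema(record: dict) -> dict[str, str]:
    """Map each field of a single record to the name of its Python type."""
    return {key: type(value).__name__ for key, value in record.items()}


def detect_schema_drift(expected: dict[str, str], observed: dict[str, str]) -> DriftReport:
    """Compare two {field: type_name} mappings and report every difference."""
    report = DriftReport()
    report.missing = sorted(expected.keys() - observed.keys())
    report.unexpected = sorted(observed.keys() - expected.keys())
    for name in sorted(expected.keys() & observed.keys()):
        if expected[name] != observed[name]:
            # Record (expected_type, observed_type) for fields whose type changed.
            report.type_changes[name] = (expected[name], observed[name])
    return report


if __name__ == "__main__":
    expected = {"user_id": "int", "event": "str", "ts": "str"}
    incoming = {"user_id": "42", "event": "click", "country": "PL"}
    print(detect_schema_drift(expected, infer_schema(incoming)))
```

In a real pipeline a check like this would more likely run against a schema registry or a table's declared schema than against a single sampled record.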

You will also mentor mid-level engineers on performance profiling of Spark jobs, and contribute to shared Terraform modules that provision EMR clusters and MSK topics. A strong SQL foundation is essential - you will write non-trivial window function queries regularly.
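As an illustration of the kind of window function query mentioned above, here is a small sketch that keeps only the latest version of each event with `ROW_NUMBER()`. It runs against an in-memory SQLite database so that it is self-contained (window functions need SQLite 3.25 or newer). The `events` table, its columns, and the `latest_versions` helper are assumptions made for the example, not part of the actual warehouse.

```python
import sqlite3

# Rank the versions of each event from newest to oldest, then keep rank 1.
LATEST_VERSION_SQL = """
SELECT event_id, user_id, amount, updated_at
FROM (
    SELECT
        e.*,
        ROW_NUMBER() OVER (
            PARTITION BY event_id
            ORDER BY updated_at DESC
        ) AS rn
    FROM events AS e
)
WHERE rn = 1
ORDER BY event_id
"""


def latest_versions(rows: list[tuple[str, int, float, str]]) -> list[tuple]:
    """Load rows into a throwaway table and return the newest row per event_id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE events ("
            "event_id TEXT, user_id INTEGER, amount REAL, updated_at TEXT)"
        )
        conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
        return conn.execute(LATEST_VERSION_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("e1", 1, 10.0, "2024-01-01T00:00:00"),
        ("e1", 1, 12.5, "2024-01-02T00:00:00"),  # later correction of e1
        ("e2", 2, 7.0, "2024-01-01T12:00:00"),
    ]
    for row in latest_versions(sample):
        print(row)
```

The same pattern is a common way to deduplicate corrected or late-arriving records; ISO 8601 timestamps stored as text sort correctly here only because they share one format and time zone.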

Technical stack

  • Python 3.12
  • Apache Spark 3.5 (PySpark)
  • AWS EMR
  • Amazon Redshift
  • Amazon S3
  • Apache Airflow 2.9
  • dbt
  • Apache Kafka (MSK)
  • Delta Lake
  • Great Expectations
  • pytest
  • Terraform
  • Docker

Interview process

Step 1 - Phone screen (45 min, recruiter + hiring manager): background, why data engineering, what you have built at scale.

Step 2 - Technical deep-dive I (60 min, live coding): Python problem with a data-transformation focus - likely involving PySpark, file formats, and edge cases. We care about correctness and clarity more than raw speed.

Step 3 - Technical deep-dive II (60 min): SQL + data modelling. You will design a data warehouse schema for a given domain, write non-trivial window queries, and explain trade-offs (star vs snowflake, partitioning strategy, late-arriving data).

Step 4 - System design (60 min): design a streaming pipeline from event ingestion to analytics dashboard - topics include exactly-once semantics, backpressure, schema registry, SLA monitoring.

Step 5 - Hiring committee and offer: typical timeline 10–14 business days from final round.
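For candidates preparing for the system design round, one common way to reason about exactly-once semantics is to accept at-least-once delivery and make the write side idempotent. The sketch below shows only that idea. It is not Kafka's transactional API, and `Event` and `IdempotentSink` are hypothetical names. State is kept in memory for brevity, whereas a real sink would have to persist the seen IDs and the aggregate in one atomic commit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str   # unique per logical event; assumed stable across redeliveries
    user_id: int
    amount: float


@dataclass
class IdempotentSink:
    """Aggregates amounts per user while ignoring redelivered events."""

    totals: dict[int, float] = field(default_factory=dict)
    seen_ids: set[str] = field(default_factory=set)

    def apply(self, event: Event) -> bool:
        """Apply the event once; return False if it was already processed."""
        if event.event_id in self.seen_ids:
            return False  # duplicate delivery: no effect on totals
        self.totals[event.user_id] = self.totals.get(event.user_id, 0.0) + event.amount
        self.seen_ids.add(event.event_id)
        return True


if __name__ == "__main__":
    sink = IdempotentSink()
    deliveries = [
        Event("a", 1, 10.0),
        Event("b", 1, 5.0),
        Event("a", 1, 10.0),  # redelivered after a simulated consumer restart
    ]
    for event in deliveries:
        sink.apply(event)
    print(sink.totals)
```

The design trade-off worth discussing is the cost of the deduplication state: an unbounded set of IDs does not scale, so practical systems usually bound it by time window or by per-partition offsets.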

Job views

3 538

Posted

a day ago

Publisher

Brian Kelly
