Service
Data & Reporting
Stream and batch pipelines moving billions of impressions, sessions, and signals. Real-time analytics and attribution pipelines that survive production.
What we do
Adtech data systems fail at scale in predictable ways: event deduplication breaks under backpressure, attribution windows drift when clocks are skewed, and Redshift queries slow down once tables grow past the point where automatic vacuuming keeps up. We have worked through each of these failure modes in production.
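To make the deduplication failure mode concrete, here is a minimal sketch of windowed deduplication keyed on event ID. It is an illustration rather than our production code: the class name `WindowedDeduplicator`, the `ttl_seconds` parameter, and the assumption that events arrive in roughly event-time order are all hypothetical. The point is that the set of seen IDs has to be bounded, otherwise it grows without limit when a consumer falls behind.

```python
from collections import OrderedDict


class WindowedDeduplicator:
    """Drops repeated event IDs seen within ttl_seconds (hypothetical sketch)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # event_id -> event time at which the ID was first seen.
        # Insertion order approximates time order (an assumption of this sketch).
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def is_new(self, event_id: str, event_time: float) -> bool:
        """Return True the first time an ID appears inside the window."""
        self._evict(event_time)
        if event_id in self._seen:
            return False
        self._seen[event_id] = event_time
        return True

    def _evict(self, now: float) -> None:
        # Remove entries older than the window so memory stays bounded.
        cutoff = now - self.ttl_seconds
        while self._seen:
            oldest_id = next(iter(self._seen))
            if self._seen[oldest_id] >= cutoff:
                break
            self._seen.popitem(last=False)
```

In a real stream processor the same idea would usually live in keyed state with a time-to-live rather than in process memory, so that it survives restarts.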
We build event pipelines on Kafka and Kinesis, with Flink or Spark for stream processing depending on how much state the job has to hold. For batch we use Spark on EMR or Databricks. For analytical storage we have worked extensively with Redshift, Snowflake, and ClickHouse, and we choose between them based on query patterns rather than vendor preference. ClickHouse handles our highest-throughput impression analytics work; Snowflake fits better for the business intelligence layers where time-to-answer matters more than raw throughput.
Attribution pipelines are where data engineering and adtech domain knowledge intersect. We understand last-touch, multi-touch, and data-driven attribution models and have implemented them at a scale where you cannot afford a full table scan for each attribution event. The output is a system that gives attribution answers in seconds, not hours, and that does not drift under the load of a live campaign.
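As a rough illustration of last-touch attribution without a full scan, the sketch below keeps each user's touches sorted by time and uses binary search to find the most recent touch before a conversion. The names `LastTouchIndex`, `add_touch`, `attribute`, and `window_seconds` are hypothetical, and the in-memory index stands in for whatever keyed store a real pipeline would use.

```python
import bisect
from collections import defaultdict
from typing import Optional


class LastTouchIndex:
    """Per-user sorted touch index for last-touch lookups (hypothetical sketch)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._times = defaultdict(list)      # user_id -> sorted touch timestamps
        self._campaigns = defaultdict(list)  # user_id -> campaign IDs, same order

    def add_touch(self, user_id: str, timestamp: float, campaign_id: str) -> None:
        # Insert in timestamp order so late-arriving touches land correctly.
        i = bisect.bisect_right(self._times[user_id], timestamp)
        self._times[user_id].insert(i, timestamp)
        self._campaigns[user_id].insert(i, campaign_id)

    def attribute(self, user_id: str, conversion_time: float) -> Optional[str]:
        """Return the campaign of the latest touch inside the lookback window."""
        times = self._times.get(user_id)
        if not times:
            return None
        # Index of the last touch at or before the conversion.
        i = bisect.bisect_right(times, conversion_time) - 1
        if i < 0 or conversion_time - times[i] > self.window_seconds:
            return None
        return self._campaigns[user_id][i]
```

Each lookup touches only one user's touches and costs a binary search over them, which is what keeps the per-conversion work independent of total table size.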
Talk to the adtech team.
Bidder, exchange, agentic, attribution. Tell us what you are working on.