Composable Data Stack Python Engineer
dltHub / ScaleVector GmbH
Berlin, BE, Germany

Job Title

Composable Data Stack Python Engineer

Job Description

Your Tasks and Responsibilities:

dlt is the missing link between the traditional Modern Data Stack and the emerging Pythonic Composable Data Stack: a gateway that creates datasets that the other components can then process. Our mission is to integrate dlt fully with this new, emerging ecosystem in a way that our users love. This means we respect their time, effort, and previous investments in the modern data stack when designing features. To support this mission, your tasks and responsibilities include:

  • You design and implement OSS features that make dlt a gateway to the composable data stack: integrating query engines, transformation frameworks, and table formats with our library
  • You listen to our users, always paying attention to what they need to go to production with dlt
  • You work with our customers in commercial projects where dlt is combined with existing “modern data stack” infrastructure
  • You maintain the open source project with the team (e.g., review PRs, resolve issues, talk with community contributors, etc.)

Restrictions

  • No telecommuting
  • No Agencies Please

Requirements

Who You Are

If you are fascinated by the emerging ecosystem of data libraries in Python, which allows you to do on a single machine what until recently was possible only in the cloud, then you’ll enjoy working with us.

  • You know what duckdb, arrow, datafusion, lancedb, delta-rs, ibis, pyiceberg, sqlglot, kedro, hamilton, and similar Python libraries / pip-installable components do, and you know when to apply them.
  • You have experience in building data apps or products based on the composable data stack
  • … or you have contributed code to any of the projects above (or to similar ones)
  • You know what the so-called “Modern Data Stack” is and appreciate certain aspects of it (e.g., its maturity, its fit with enterprise workflows, etc.)
  • … and in fact you are interested in combining both worlds.
  • You really like Python and are fluent in writing Python code (e.g., Python typing, unit testing, writing docstrings, etc.)
  • You have a degree in computer science or data science, or equivalent experience
  • You are familiar with GitHub workflows (e.g., pull requests, code reviews, CI/CD services, etc.)

Nice to Have:

  • You are based in Berlin and willing to work in our office regularly
  • You have a hacker nature and you love optimizing things
  • Experience with DevOps (e.g., CI systems like GitHub Actions, Docker, Kubernetes, AWS/GCP/Digital Ocean, etc.)
  • Experience with machine learning (e.g., the toolset, the workflows, practical applications, etc.)

About the Company

Who We Are

We are looking for a Software or Data Engineer who is experienced in high-performance Python data processing libraries (often referred to as the Composable Data Stack). You will collaborate directly with our CTO and be part of the core product team.

dlt is an open-source library that automatically creates datasets from messy, unstructured data sources. You can use the library to move data from just about anywhere into the most well-known SQL and vector stores, data lakes, storage buckets, or local engines such as DuckDB, Arrow, or delta-rs. The library automates many cumbersome data engineering tasks and can be used by anyone who knows Python. You can see more details in this Hacker News article.

dltHub is based in Berlin and New York City. It was founded by data and machine learning veterans. We are backed by Foundation Capital, Dig Ventures, and many technical founders from companies such as Datadog, Instana, Hugging Face, MotherDuck, Mesosphere, Matillion, Miro, and Rasa.

Contact Info
