Azure Senior Data Engineer
Our customer provides innovative solutions and insights that enable their clients to manage risk and hire the best talent. Their advanced global technology platform supports fully scalable, configurable screening programs that meet the unique needs of over 33,000 clients worldwide. Headquartered in Atlanta, GA, they have an internationally distributed workforce of about 5,500 employees spanning 19 countries. Our customer performs over 93 million screens annually in over 200 countries and territories.
We are seeking a Senior Data Engineer with solid Python/PySpark programming skills to join the Data Engineering Team and help us build the Data Analytics Platform in the Azure cloud.
- Develop reusable, metadata-driven data pipelines
- Automate and optimize data platform-related processes
- Build integrations with data sources and data consumers
- Add data transformation methods to shared ETL libraries
- Write unit tests
- Develop monitoring solutions for the Databricks data platform
- Proactively resolve any performance or quality issues in ETL processes
- Cooperate with the infrastructure engineering team to set up cloud resources
- Contribute to data platform wiki / documentation
- Perform code reviews and ensure code quality
- Initiate and implement improvements to the data platform architecture
- Programming: Python/PySpark, SQL
- Proficient in building robust data pipelines using Databricks Spark
- Experienced in dealing with large and complex datasets
- Knowledgeable about building data transformation modules organized as libraries (Python packages)
- Familiar with Databricks Delta optimization techniques (partitioning, z-ordering, compaction, etc.)
- Experienced in developing CI/CD pipelines
- Experienced in leveraging event brokers (Kafka / Event Hubs / Kinesis) to integrate with data sources and data consumers
- Understanding of basic networking concepts
- Familiar with Agile Software Development methodologies (Scrum)
- Understanding of stream processing challenges and familiarity with Spark Structured Streaming
- Experience with IaC (Terraform, Bicep or other)
- Experience running containerized applications (Azure Container Apps, Kubernetes)
- Experience building event sourcing solutions
- Familiarity with platforms for change data capture (e.g. Debezium)
- Knowledge of Azure cloud-native solutions (e.g. Azure Data Factory, Azure Function App, Azure Container Instances)