Data Engineer @ Michael Page Portugal

Client Details

Our client is building an IT competency centre in Lisbon and requires technically qualified, adaptable and ambitious IT professionals.

Description

We are looking for a Senior Data Engineer to join our client's growing team. The successful candidate will be responsible for expanding and optimizing the data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced PySpark data pipeline developer who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support software developers, database architects, data analysts and data scientists on data initiatives and will ensure the DataOps architecture is consistent across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of helping build a best-of-breed approach to DataOps and data pipelining to support the client's next generation of products and data initiatives (a brief illustrative sketch follows the list below):

  • Create and maintain optimal data pipeline architecture
  • Assemble large, complex data sets that meet functional / non-functional business requirements
  • Build the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources using SQL and AWS 'big data' technologies
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs
  • Create data tools for the analytics and data science teams that help them build and optimize the product into an innovative industry leader
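
For context, the sketch below shows the kind of work these responsibilities imply: a minimal PySpark batch ETL job that reads raw events from object storage, applies a simple transformation and writes partitioned output for downstream analytics. It is illustrative only; the job name, paths and column names are hypothetical and not taken from the client's environment.

    # Illustrative only: a minimal PySpark batch ETL job.
    # The source path, columns and output location are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("daily-orders-etl")  # hypothetical job name
        .getOrCreate()
    )

    # Extract: read raw JSON events from an (assumed) S3 landing zone
    orders = spark.read.json("s3a://example-landing-zone/orders/2024-01-01/")

    # Transform: basic cleansing and a daily aggregate per customer
    daily_totals = (
        orders
        .filter(F.col("status") == "COMPLETED")
        .withColumn("order_date", F.to_date("created_at"))
        .groupBy("customer_id", "order_date")
        .agg(
            F.sum("amount").alias("total_amount"),
            F.count("*").alias("order_count"),
        )
    )

    # Load: write partitioned Parquet for downstream analytics
    # (e.g. queried via Athena or Redshift Spectrum)
    (
        daily_totals.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3a://example-curated-zone/daily_order_totals/")
    )

    spark.stop()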

Profile

Essential Skills

  • 3+ years' experience building big data pipelines with Apache Spark (preferably PySpark), covering both real-time and batch processing
  • 3+ years' experience running and optimizing Spark data pipelines on Amazon Web Services (AWS), Microsoft Azure or Google Cloud Platform (GCP)
  • 3+ years' experience working with big data and cloud data stores, both SQL and NoSQL, such as Hive, Presto, BigQuery, AWS Athena, AWS Redshift or related systems
  • 3+ years' experience with Python
  • Experience building large scale distributed systems or applications
  • Strong data modeling skills (NoSQL & RDBMS)
  • Strong experience with Unix environments

Desirable Skills

  • Experience with streaming event systems like Kafka and Kinesis
  • Good knowledge of Design Patterns
  • Experience with Unix Shell scripting
  • Experience with Docker and Kubernetes
  • Experience with Continuous Integration, QA and Stress Testing
  • Experience with configuration management tools such as Puppet, Chef, Ansible and SaltStack
  • Good knowledge of Agile development methodology

Job Offer

Join a leading company.

Apply through the website.