Data Engineer Role Profile
The Data Engineer builds and supports data pipelines that must be scalable, repeatable, and secure. Working as a core member of an agile team, the Data Engineer is responsible for the infrastructure that turns raw data into insights, handling and integrating diverse data sources seamlessly. They enable solutions that handle large volumes of data in batch and in real time by leveraging emerging technologies from both the big data and cloud spaces. Additional responsibilities include developing proofs of concept and implementing complex big data solutions, with a focus on collecting, parsing, managing, analysing, and visualising large datasets. They know how to apply technology to the problems of working with large volumes of data in diverse formats in order to deliver innovative solutions.

Data Engineering is a technical job that requires substantial expertise across a broad range of software development and programming fields. These professionals combine knowledge of data analysis with end-user and business requirements analysis, developing a clear understanding of the business need and incorporating it into the technical solution. They also have a solid understanding of physical database design and the systems development lifecycle.
Responsibilities
- Architects the data analytics framework
- Translates complex functional and technical requirements into detailed architecture, design, and high-performing software
- Leads data and batch/real-time analytical solutions that leverage transformational technologies
- Works on multiple projects as a technical lead, driving user-story analysis and elaboration, design and development of software applications, testing, and build automation
- Development and Operations
- Database Development and Operations
- Policies, Standards and Procedures
- Business Continuity & Disaster Recovery
- Research and Evaluation
- Creating data feeds from on-premises systems to the AWS Cloud
- Supporting production data feeds on a break-fix basis
- Creating data marts using Talend or a similar ETL development tool
- Manipulating data using Python
- Processing data with the Hadoop paradigm, particularly on Amazon EMR, AWS’s distribution of Hadoop (see the sketch after this list)
- Developing for Big Data and Business Intelligence, including automated testing and deployment
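As an illustration of the EMR work described above, here is a minimal PySpark sketch of a batch job that reads raw feed data landed in S3, cleanses it, and writes a simple daily data mart back as partitioned Parquet. The bucket names, paths, and column names (`order_id`, `event_ts`, `region`, `amount`) are hypothetical placeholders, not part of any real system.

```python
# Minimal illustrative PySpark batch job (hypothetical names throughout).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-daily-sales-mart").getOrCreate()

# Read raw JSON events landed in S3, e.g. by an on-premises data feed.
raw = spark.read.json("s3://example-raw-bucket/sales/2024-01-01/")

# Basic cleansing: parse timestamps and drop records missing key fields.
cleaned = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts"))
       .dropna(subset=["order_id", "event_ts"])
)

# Aggregate into a simple daily data mart by date and region.
daily_mart = (
    cleaned.groupBy(F.to_date("event_ts").alias("event_date"), "region")
           .agg(
               F.count("order_id").alias("order_count"),
               F.sum("amount").alias("total_amount"),
           )
)

# Write partitioned Parquet back to S3 for downstream BI tools.
daily_mart.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-curated-bucket/marts/daily_sales/"
)

spark.stop()
```

On EMR, a script like this would typically be submitted as a cluster step via spark-submit; the same structure runs unchanged on a local Spark installation for testing.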
Requisite Experience, Education, Knowledge and/or Skills
- Bachelor’s Degree in Computer Science, Computer Engineering, or equivalent
- AWS Certification
- Extensive knowledge of different programming and scripting languages
- Expert knowledge of data modelling and an understanding of different data structures and their benefits and limitations under particular use cases
- Capability to architect highly scalable distributed systems using different open-source tools
- 5+ years of data engineering or software engineering experience
- 2+ years of Big Data experience
- 2+ years of experience with Extract, Transform, and Load (ETL) processes
- 2+ years of AWS experience
- 5 years of demonstrated experience with object-oriented design, coding, and testing patterns, as well as experience engineering (commercial or open-source) software platforms and large-scale data infrastructures
- Big Data batch and streaming tools:
  - Talend
  - AWS: EMR, EC2, S3
  - Python
  - PySpark or Spark