Develop and implement data pipelines that extract, transform and load data into information products that help the organization reach its strategic goals
Work on ingesting, storing, processing and analyzing large data sets
Create scalable, high-performance web services for tracking data (see the endpoint sketch after this list)
Translate complex technical and functional requirements into detailed designs
Investigate and analyze alternative approaches to data storage and processing to ensure the most streamlined solutions are implemented
Serve as a mentor to junior staff by conducting technical training sessions and reviewing project outputs
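To make the tracking-service duty above concrete, here is a minimal sketch of a tracking endpoint, assuming Flask; the /track route, the payload fields, and the in-memory buffer are hypothetical stand-ins, not details from this posting. A production service would write to a durable store or message broker instead of a list.

```python
# Minimal tracking-endpoint sketch, assuming Flask.
# Route, payload fields, and buffer are hypothetical placeholders.
from flask import Flask, request, jsonify

app = Flask(__name__)
event_buffer = []  # stand-in for a durable sink (e.g. Kafka or a database)

@app.route("/track", methods=["POST"])
def track():
    event = request.get_json(silent=True)
    if not event or "event_id" not in event:
        return jsonify({"error": "event_id is required"}), 400
    event_buffer.append(event)
    return jsonify({"status": "accepted"}), 202

if __name__ == "__main__":
    app.run(port=8080)
```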
Daily and Monthly Responsibilities:
Develop and maintain data pipelines that implement ETL processes (see the pipeline sketch after this list)
Take responsibility for Hadoop development and implementation
Work closely with the data science team to implement data analytics pipelines
Help define data governance policies and support data versioning processes
Maintain security and data privacy, working closely with the internal Data Protection Officer
Analyze data across a large number of data stores to uncover insights
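As a rough illustration of the ETL responsibility above, here is a minimal PySpark sketch; the landing path, column names, and target table are hypothetical assumptions, not details from this role.

```python
# Minimal ETL sketch in PySpark. All paths, column names, and the target
# table are hypothetical placeholders; adjust for the actual data sources.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_etl").getOrCreate()

# Extract: read raw event data from a hypothetical landing path.
raw = spark.read.json("/data/landing/events/")

# Transform: drop malformed rows, normalize the timestamp, derive a date key.
clean = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Load: append to a partitioned warehouse table (hypothetical name).
(clean.write
      .mode("append")
      .partitionBy("event_date")
      .saveAsTable("analytics.events_clean"))
```

In practice a pipeline like this would run under an orchestrator such as Airflow, with data-quality checks between the extract, transform and load stages.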
Skills and Qualifications:
Degree in computer science, mathematics or engineering
Expertise in ETL methodology (data extraction, transformation and load processing) and in designing corporate-wide ETL solutions using DataStage
Experience in Python, Spark and Hive
Understanding of data warehousing and data modeling techniques
Knowledge of industry-standard analytics and visualization tools such as Tableau and R
Strong data engineering skills on the Azure cloud platform are essential
Experience with streaming frameworks such as Kafka (see the consumer sketch after this list)
Knowledge of core Java, Linux, SQL, and any scripting language
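For the streaming-framework skill above, the following is a minimal consumer sketch using the kafka-python client; the topic name, broker address, and consumer group are illustrative assumptions, not details from this posting.

```python
# Minimal Kafka consumer sketch using the kafka-python client.
# Topic, broker address, and group id are illustrative assumptions.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "tracking-events",                      # hypothetical topic
    bootstrap_servers=["localhost:9092"],   # hypothetical broker
    group_id="etl-consumers",               # hypothetical consumer group
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # In a real pipeline, events would be validated and written to storage;
    # here we simply print two fields.
    print(event.get("event_id"), event.get("event_ts"))
```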