The Data Engineer (Software Developer) is responsible for the technical design, development, and integration testing of data engineering components using the AWS data engineering stack and Python.
Experience, Duties and Responsibilities
Overall 6+ years of software development experience as a Data Engineer/ETL developer, with 4+ years specifically on the AWS cloud data engineering stack
Experience in Big Data technologies: PySpark (with or without AWS Glue)
Experience with ETL, data warehousing, data integration, and data lakes
Experience with MPP (Massively Parallel Processing) databases: Redshift (preferred), Snowflake, Teradata
Understanding of the challenges involved in Big Data, e.g. large table sizes (depth/width) and even distribution of data
Strong SQL and data manipulation/transformation skills
Strong hands-on programming experience with PySpark/Python/Boto3, including Python frameworks and libraries, following Python best practices
Good knowledge of building and deploying Python applications, especially on AWS
End-to-end data engineering experience, e.g. data ingestion, cleansing, harmonization/integration, quality, and security
Experience of working in a cloud environment (AWS preferred)
Experience working in a geographically distributed team
Development experience with AWS services, mainly Redshift, Glue, data lakes on S3, CloudWatch, Lambda, and Step Functions; understanding of code versioning, Git repositories, and CI/CD
Ability to communicate effectively, orally and in writing, with technical and non-technical audiences
Should be an avid learner, initiative-taker, and team player
Strong interpersonal communication and customer interaction skills
Excellent oral and written communication, presentation, and analytical skills
Skills: AWS, Glue, data lakes on S3, CloudWatch, Python, Lambda, Data Engineering, Redshift, Step Functions, CI/CD, Git, Agile project experience