Posted On 06 September
As a Software Engineer on the Cord Data & Ingestion (CDI) team, you will be responsible for the design, development, and maintenance of the data lake, data ingestion, data infrastructure, and data pipelines that enable data-driven decision-making at Coupang. In this role, you will build back-end services to ingest data from a wide variety of source systems, optimize storage, and support data pipelines that process terabytes of data each day, making that data available to customers for fast business decisions. At this scale, you will need to ensure fault tolerance and robustness to deliver accurate results on time.
Job Description
What You Will Do
Design and build Coupang's next-generation Data Warehouse platform.
Design and build a self-serve data ingestion framework for company-wide usage.
Design and build source and sink connectors for data ingestion.
Explore and implement modern data ingestion and data lake technologies for the team.
Design and build reliable data pipelines using modern distributed processing technologies such as Spark and Hive, with the ability to scale to very large data volumes.
Review and engage in system architecture decisions.
Engage in design and code reviews with the team to ensure the highest quality and to enforce industry best practices.
Develop best practices and frameworks for unit, functional, and integration testing to improve the team's test coverage and automation.
Mentor teammates, and foster and enforce industry best practices in data architecture design, design patterns, and coding standards.
Basic Qualifications
BS or advanced degree in Computer Science, or related technical field.
8+ years of experience in building software and solutions.
Experience in one or more programming languages such as Java, Scala, Python, Go, or Kotlin.
Experience in distributed processing using EMR, Spark, Hive, or other big data frameworks.
Solid fundamentals in OO design, data structures, and algorithms.
Preferred Qualifications
Experience with Cloud Computing platforms and understanding of scaling and reliability issues.
Experience in designing, building, and maintaining highly scalable backend applications and platforms.
Experience with stream-processing systems (Kafka, Storm, Spark Streaming, or equivalent) is a plus.
Familiarity with workflow orchestration frameworks, whether open-source tools like Airflow and Luigi or commercial enterprise tools.
Knowledge of container services (Docker/Kubernetes) is a plus.