Lead Big Data Engineer

Employment Type: Full-Time


: Miscellaneous

Job Description:
  • The client is looking for a Data Technical Lead with experience designing, building, and maintaining Big Data workflows/pipelines for collecting, storing, processing, and analyzing huge sets of data moving into and out of an AWS data lake.
  • Engage in application design and data modeling discussions.
  • Guide team with design specifications and development.
  • Build and test data workflows/pipelines.
  • Troubleshoot and resolve application and data processing issues.
  • Optimize code and fine-tune application performance.
  • BS/BA degree in Computer Science, Information Systems, or related field.
  • At least 8 years' experience in the Big Data Hadoop technology area, specifically in Cloudera and AWS.
  • Must have led a team of data engineers to deliver data integration projects in these technology areas.
  • Highly skilled in Spark and Scala.
  • Experienced in the Databricks platform.
  • Experience in the AWS environment storing and processing data on S3 and transforming data through complex computations into other data models.
  • Strong knowledge of SQL and Unix/Linux scripting.
  • Exposure to other Hadoop ecosystem technologies such as YARN, ZooKeeper, HDFS, Avro, Parquet, etc.
  • Experience connecting to various data sources such as RDBMS, NoSQL, FTP/SFTP, etc., and troubleshooting connection issues.
  • Experience designing and developing pipelines that cleanse and prepare large, complex data sets for reporting and analytics.
  • Must have used Data Integration Tools such as Talend to develop data pipelines and workflows.
  • Knowledge of storage design concepts including partitioning.
  • Maintain, modify, and improve large sets of structured and unstructured data.
  • Has created or worked with frameworks that apply generic behavior to large data pipelines and implement audit, balance, and control mechanisms effectively.
  • Monitoring and troubleshooting data integration jobs.
  • Exposure to using APIs in retrieving and distributing data.
  • Experience with CI/CD tools for end-to-end automation of development and deployment.
Added Valuable Skills:
  • Exposure to Databricks is highly preferred.
  • NoSQL such as HBase.
  • Distributed Messaging such as Kafka.
  • Data architecture.
