Role: Data Engineer

Job Description

We need someone with 5-8 years of experience in data warehousing, ETL, and big data technologies (Hadoop, Hive, Sqoop, etc.), including 3+ years of mandatory experience in Spark with Python/Scala and more than one end-to-end implementation.

Roles and Responsibilities

  • Develop Scala or Python scripts and UDFs using DataFrames/SQL/Datasets and RDDs in Spark 2.3+ for data aggregation and queries, and write data back into the OLTP system through Sqoop (a DataFrame aggregation sketch follows this list).
  • Apply a strong understanding of partitioning and bucketing concepts, and design both managed and external Hive tables with ORC files to optimize performance.
  • Write and implement Spark and Scala scripts to load data from and store data into Cassandra, HBase, or other NoSQL stores.
  • Implement SCD Type 1 and Type 2 models using Spark (a Type 2 sketch follows this list).
  • Develop Oozie workflows for scheduling and orchestrating the ETL process.
  • Tune the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings (a configuration sketch follows this list).
  • Stream data into Elasticsearch for visualization in Kibana.
  • Implement mapping parameters/variables at the mapping and session level to increase code reusability and parameterize hardcoded values.
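For illustration, a minimal sketch (in Scala, against Spark 2.3+) of the DataFrame-based aggregation and write-back work described above. The database, table, and column names (raw.orders, customer_id, order_date, amount) and the export path are hypothetical; the result is landed on HDFS for a downstream `sqoop export` job rather than calling Sqoop directly.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyOrderAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DailyOrderAggregation")
      .enableHiveSupport()
      .getOrCreate()

    // Read a (hypothetical) Hive table of raw orders.
    val orders = spark.table("raw.orders")

    // Aggregate with the DataFrame API: total and average amount per customer per day.
    val dailyTotals = orders
      .groupBy(col("customer_id"), col("order_date"))
      .agg(sum("amount").as("total_amount"),
           avg("amount").as("avg_amount"))

    // Land the result on HDFS; a downstream `sqoop export` job would push it to the OLTP system.
    dailyTotals.write
      .mode("overwrite")
      .csv("hdfs:///warehouse/exports/daily_order_totals")

    spark.stop()
  }
}
```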
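A minimal sketch of an SCD Type 2 load with Spark DataFrames: changed dimension rows are expired and their new versions appended, while untouched rows pass through. The key and tracked attribute (customer_id, address) and the bookkeeping columns (is_current, start_date, end_date) are illustrative assumptions, as is the premise that the incoming extract shares the dimension's business columns.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

object ScdType2Sketch {

  // Expire changed dimension rows and append their new versions (SCD Type 2).
  // `current` is the existing dimension table (with is_current/start_date/end_date),
  // `incoming` is the latest extract; all names are illustrative.
  def applyScd2(current: DataFrame, incoming: DataFrame, loadDate: String): DataFrame = {
    // Rows whose tracked attribute changed compared to the current version.
    val changed = current.alias("c")
      .join(incoming.alias("i"), col("c.customer_id") === col("i.customer_id"))
      .filter(col("c.is_current") === true && col("c.address") =!= col("i.address"))

    // Close out the old versions of the changed rows.
    val expired = changed.select(col("c.*"))
      .withColumn("is_current", lit(false))
      .withColumn("end_date", lit(loadDate))

    // Build new current versions from the incoming records.
    val newVersions = changed.select(col("i.*"))
      .withColumn("is_current", lit(true))
      .withColumn("start_date", lit(loadDate))
      .withColumn("end_date", lit(null).cast("string"))

    // Rows with no change are carried over untouched.
    val untouched = current.join(
      changed.select(col("c.customer_id").as("customer_id")).distinct(),
      Seq("customer_id"), "left_anti")

    untouched.unionByName(expired).unionByName(newVersions)
  }
}
```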
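A sketch of how the tuning knobs mentioned above might be set on a Spark Streaming application: batch interval, shuffle parallelism, and executor memory. All values and the socket source are illustrative assumptions, not recommendations.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TunedStreamingApp {
  def main(args: Array[String]): Unit = {
    // Illustrative values only; the right numbers depend on cluster size and data volume.
    val conf = new SparkConf()
      .setAppName("TunedStreamingApp")
      .set("spark.executor.memory", "8g")                   // executor heap size
      .set("spark.executor.cores", "4")                     // cores per executor
      .set("spark.default.parallelism", "200")              // RDD shuffle parallelism
      .set("spark.sql.shuffle.partitions", "200")           // DataFrame/SQL shuffle parallelism
      .set("spark.streaming.backpressure.enabled", "true")  // adapt ingestion rate to throughput

    // Batch interval of 30 seconds; per-batch processing time should stay
    // below this interval to avoid a growing backlog.
    val ssc = new StreamingContext(conf, Seconds(30))

    // Minimal (hypothetical) socket source so the job has an output operation to run.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```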

Additional skills:

  • Knowledge of the AWS stack: AWS Glue, S3, SQS
  • Exposure to Elasticsearch or Solr is a plus
  • Exposure to NoSQL databases such as Cassandra and MongoDB
  • Exposure to serverless computing
