Designation: Senior Software Engineer

Level: L3/L4

Years of Experience: 4–10 years

Job Role:

  • The developer must have sound knowledge of Apache Spark and Python programming.
  • Deep experience developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
  • Experience in deploying and operationalizing code is an added advantage.
  • Design and build high-performing, scalable data processing systems to support multiple internal and third-party data pipelines.
  • Write Python/Spark jobs for data transformation, aggregation, ETL, and machine learning.
  • Tune PySpark jobs and optimize their performance.
  • Responsible for Design, Coding, Unit Testing, and other SDLC activities in a big data environment
  • Gather and understand requirements, analyze and convert functional requirements into concrete technical tasks, and provide reasonable effort estimates.
  • Work proactively, independently, and with global teams to address project requirements, and raise issues/challenges with enough lead time to mitigate project delivery risks.
  • Exposure to Elasticsearch or Solr is a plus.
  • Exposure to NoSQL databases such as Cassandra and MongoDB.
  • Exposure to serverless computing.

Must have

  • Minimum of 3 years of hands-on experience in Spark/Python, with 4-8 years of overall development experience in RDBMS systems.
  • In-depth knowledge of Python and the Spark component ecosystem is a must.
  • Strong knowledge in distributed systems and solid understanding of Big Data Systems in the Hadoop Ecosystem.
  • Experience integrating data from multiple sources (RDBMS, APIs).
  • Experience with cloud vendors such as AWS or Azure.
  • Experience in developing and deploying large-scale distributed systems.
