Developer

  • On-site
    • Hyderabad, Andhra Pradesh, India
  • IT Services

Job description

Must-Have

  • Strong proficiency in Python programming.

  • Hands-on experience with PySpark and Apache Spark.

  • Knowledge of Big Data technologies (Hadoop, Hive, Kafka, etc.).

  • Experience with SQL and relational/non-relational databases.

  • Familiarity with distributed computing and parallel processing.

  • Understanding of data engineering best practices.

  • Experience with REST APIs, JSON/XML, and data serialization.

  • Exposure to cloud computing environments.

  • 5+ years of experience in Python and PySpark development.

  • Experience with data warehousing and data lakes.

  • Knowledge of machine learning libraries (e.g., MLlib) is a plus.

  • Strong problem-solving and debugging skills.

  • Excellent communication and collaboration abilities.

Job requirements

  • Develop and maintain scalable data pipelines using Python and PySpark.

  • Design and implement ETL (Extract, Transform, Load) processes.

  • Optimize and troubleshoot existing PySpark applications for performance.

  • Collaborate with cross-functional teams to understand data requirements.

  • Write clean, efficient, and well-documented code.

  • Conduct code reviews and participate in design discussions.

  • Ensure data integrity and quality across the data lifecycle.

  • Integrate with cloud platforms like AWS, Azure, or GCP.

  • Implement data storage solutions and manage large-scale datasets.
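The pipeline responsibilities listed above follow the standard extract/transform/load shape. As a minimal, framework-agnostic sketch of that shape in plain Python (function names and sample records are illustrative assumptions; a production PySpark pipeline would express the transform as DataFrame operations instead of list comprehensions):

```python
# Minimal ETL sketch. All names and sample data are illustrative;
# in PySpark the transform step would use DataFrame operations.
import json


def extract(raw_json: str) -> list[dict]:
    """Parse raw JSON records (e.g., from a REST API response)."""
    return json.loads(raw_json)


def transform(records: list[dict]) -> list[dict]:
    """Drop incomplete rows and normalize field names and types."""
    return [
        {"user_id": r["id"], "amount": float(r["amount"])}
        for r in records
        if r.get("id") is not None and r.get("amount") is not None
    ]


def load(records: list[dict]) -> str:
    """Serialize cleaned records for handoff to the storage layer."""
    return json.dumps(records)


raw = '[{"id": 1, "amount": "9.50"}, {"id": null, "amount": "3.00"}]'
cleaned = transform(extract(raw))
# cleaned == [{"user_id": 1, "amount": 9.5}]
```

The same extract/transform/load split carries over to PySpark: `extract` becomes a `spark.read` call, `transform` a chain of `select`/`filter` operations, and `load` a `DataFrame.write` to the target store.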
