Sigma Aldrich is Hiring: Data Engineer – ACE | BigDataKB.com | 2022-04-05


Job Location: Bangalore/Bengaluru

The Life Science Data Engineering Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on Life Science’s data management and analytics platform (Palantir Foundry, Hadoop, and other components). The Foundry platform comprises multiple technology stacks, hosted on Amazon Web Services (AWS) infrastructure or on premises in Merck’s own data centers.

Developing pipelines and applications on Foundry requires:

- Proficiency in SQL / Java / Python (Python required; all three not necessary)
- Proficiency in PySpark for distributed computation (see the sketch below)
- Familiarity with Postgres and ElasticSearch
- Familiarity with HTML, CSS, and JavaScript, and basic design/visual competency
- Familiarity with common databases (e.g., JDBC, MySQL, Microsoft SQL) – not all types required
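For orientation, here is a minimal sketch of the kind of PySpark work this implies. The bucket, dataset, and column names are hypothetical, not from the posting:

# Minimal PySpark sketch: read raw records, apply a distributed
# transformation, and write the result. Paths and column names
# (assay_results, status, run_date, sample_id) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("assay-results-pipeline").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/assay_results/")

clean = (
    raw.filter(F.col("status") == "COMPLETE")          # drop incomplete runs
       .withColumn("run_date", F.to_date("run_date"))  # normalize date strings
       .dropDuplicates(["sample_id", "run_date"])      # one row per sample/run
)

clean.write.mode("overwrite").parquet("s3://example-bucket/curated/assay_results/")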

This position is project based and may span multiple smaller projects or a single large project, using an agile project methodology.

Roles & Responsibilities:


- Develop data pipelines by ingesting various data sources – structured and unstructured – into Palantir Foundry (a sketch of a typical Foundry transform follows this list)
- Participate in the end-to-end project lifecycle, from requirements analysis to go-live and operations of an application
- Act as business analyst for developing requirements for Foundry pipelines
- Review code developed by other data engineers and check it against platform-specific standards, cross-cutting concerns, coding and configuration standards, and the functional specification of the pipeline
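As a rough illustration of the first point, a Foundry Python transform typically follows the pattern below. The decorator style comes from Foundry's documented transforms.api interface; the dataset paths and column names here are hypothetical:

# Sketch of a Foundry Python transform: ingests a raw dataset and
# produces a cleaned one. Dataset paths are hypothetical.
from transforms.api import transform_df, Input, Output
from pyspark.sql import functions as F

@transform_df(
    Output("/LifeScience/curated/instrument_readings"),
    source=Input("/LifeScience/raw/instrument_readings"),
)
def clean_readings(source):
    # Keep well-formed rows and standardize the timestamp column
    return (
        source.filter(F.col("reading_id").isNotNull())
              .withColumn("measured_at", F.to_timestamp("measured_at"))
    )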

- Document technical work in a professional and transparent way; create high-quality technical documentation
- Work out the best possible balance between technical feasibility and business requirements (the latter can be quite strict)
- Deploy applications on Foundry platform infrastructure with clearly defined checks
- Implement changes and bug fixes via Merck's change management framework and according to system engineering practices (additional training will be provided)

- Set up DevOps projects following Agile principles (e.g., Scrum)
- Besides working on projects, act as third-level support for critical applications; analyze and resolve complex incidents/problems
- Debug problems across the full Foundry stack and code based on Python, PySpark, and Java

- Work closely with business users and data scientists/analysts to design physical data models (a schema sketch follows the requirements below)

Education:

Bachelor's (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences, or related fields

Professional Experience:

- 5 years of experience in system engineering or software development
- 3 years of engineering experience, including ETL-type work with databases and Hadoop platforms
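As a loose illustration of what a physical data model can look like in this stack, here is a sketch using an explicit PySpark schema. The table and field names are hypothetical:

# Sketch: expressing a physical data model as an explicit Spark schema.
# Table and column names are hypothetical.
from pyspark.sql.types import (
    StructType, StructField, StringType, DoubleType, DateType
)

sample_schema = StructType([
    StructField("sample_id",  StringType(), nullable=False),  # business key
    StructField("batch_id",   StringType(), nullable=False),
    StructField("purity_pct", DoubleType(), nullable=True),   # measured value
    StructField("tested_on",  DateType(),   nullable=True),
])

# Applying the schema on read enforces the model instead of relying on inference:
# df = spark.read.schema(sample_schema).csv("s3://example-bucket/raw/samples/")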

Skills

Hadoop General: Deep knowledge of distributed file system concepts, MapReduce principles, and distributed computing. Knowledge of Spark and the differences between Spark and MapReduce. Familiarity with encryption and security in a Hadoop cluster.

Data Management / Data Structures: Must be proficient in technical data management tasks, i.e., writing code to read, transform, and store data. XML/JSON knowledge. Experience working with REST APIs.

Spark: Experience launching Spark jobs in client mode and cluster mode. Familiarity with the property settings of Spark jobs and their implications for performance (a configuration sketch follows).
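For context, a sketch of the kind of property tuning and deploy-mode choice involved. The resource values are illustrative only, not recommendations:

# Sketch: setting Spark properties that commonly affect performance.
# The values are illustrative; real settings depend on the cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuned-pipeline")
    .config("spark.executor.memory", "8g")          # per-executor heap
    .config("spark.executor.cores", "4")            # tasks per executor
    .config("spark.sql.shuffle.partitions", "400")  # shuffle parallelism
    .getOrCreate()
)

# Deploy mode is chosen at submission time, e.g.:
#   spark-submit --deploy-mode client  job.py   (driver runs locally)
#   spark-submit --deploy-mode cluster job.py   (driver runs on the cluster)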

Application Development: Familiarity with HTML, CSS, and JavaScript, and basic design/visual competency.

SCC/Git: Must be experienced in the use of source code control systems such as Git.

ETL: Experience developing ELT/ETL processes, including loading data from enterprise-sized RDBMS systems such as Oracle, DB2, MySQL, etc.

Authorization: Basic understanding of user authorization (Apache Ranger preferred).

Programming: Must be able to code in Python, or be an expert in at least one high-level language such as Java, C, or Scala. Must have experience using REST APIs.

SQL: Must be an expert in manipulating database data using SQL. Familiarity with views, functions, stored procedures, and exception handling (a Spark SQL sketch follows this list).

AWS: General knowledge of the AWS stack (EC2, S3, EBS, …).

IT Process Compliance: SDLC experience and formalized change controls.
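As a small example of SQL-style data manipulation in this environment, using Spark SQL from Python. The table, path, and column names are hypothetical:

# Sketch: manipulating data with SQL via a Spark temporary view.
# Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/curated/orders/")
orders.createOrReplaceTempView("orders")

# A view-style aggregation, expressed in plain SQL:
monthly = spark.sql("""
    SELECT customer_id,
           date_trunc('month', order_date) AS order_month,
           SUM(amount)                     AS total_amount
    FROM orders
    GROUP BY customer_id, date_trunc('month', order_date)
""")

monthly.show(5)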

Apply Here

