Job Location: Bangalore/Bengaluru
In this role, you will be part of a growing, global team of data engineers who collaborate in a DevOps mode to enable the Life Science business to leverage data as an asset and make better-informed decisions with state-of-the-art technology.
The Life Science Data Engineering Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on Life Science's data management and analytics platform (Palantir Foundry, Hadoop, and other components).
The Foundry platform comprises multiple different technology stacks, which are hosted on Amazon Web Services (AWS) infrastructure or on-premises in the organisation's own data centers. Developing pipelines and applications on Foundry requires:
- Proficiency in SQL, Java, and/or Python (Python is required; all three are not necessary)
- Proficiency in PySpark for distributed computation (a short sketch follows this list)
- Familiarity with Postgres and ElasticSearch
- Familiarity with HTML, CSS, and JavaScript and basic design/visual competency
- Familiarity with common databases and access methods (e.g., JDBC, MySQL, Microsoft SQL Server); not all types are required
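For flavor, here is a minimal PySpark sketch of the kind of pipeline code this role involves. The paths, dataset, and column names are invented for illustration, and on Foundry the Spark session and dataset I/O are typically supplied by the platform rather than created by hand:

    from pyspark.sql import SparkSession, functions as F

    # Illustrative local session; on Foundry the platform provides the session.
    spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

    # Read a raw dataset, filter it, derive a column, and store the result.
    raw = spark.read.csv("/data/raw/orders.csv", header=True, inferSchema=True)
    cleaned = (
        raw.filter(F.col("status") == "COMPLETED")
           .withColumn("order_year", F.year(F.col("order_date")))
    )
    cleaned.write.mode("overwrite").parquet("/data/clean/orders")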
This position is project-based; you may work across multiple smaller projects or on a single large project, using an agile project methodology.
Roles & Responsibilities:
- Develop data pipelines by ingesting various data sources (structured and unstructured) into Palantir Foundry
- Participate in the end-to-end project lifecycle, from requirements analysis to go-live and operations of an application
- Act as a business analyst for developing requirements for Foundry pipelines
- Review code developed by other data engineers and check it against platform-specific standards, cross-cutting concerns, coding and configuration standards, and the functional specification of the pipeline
- Document technical work in a professional and transparent way, creating high-quality technical documentation
- Work out the best possible balance between technical feasibility and business requirements (the latter can be quite strict)
- Deploy applications on Foundry platform infrastructure with clearly defined checks
- Implement changes and bug fixes via the organisation's change management framework and according to system engineering practices (additional training will be provided)
- Set up DevOps projects following Agile principles (e.g., Scrum)
- Besides working on projects, act as third-level support for critical applications; analyze and resolve complex incidents/problems, and debug problems across the full Foundry stack and code based on Python, PySpark, and Java
- Work closely with business users and data scientists/analysts to design physical data models
Who you are:
Education
- Bachelor (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences, or related fields
Professional Experience
- 5+ years of experience in system engineering or software development
- 3+ years of engineering experience, including ETL-type work with databases and Hadoop platforms.
Skills
Hadoop General
- Deep knowledge of distributed file system concepts, MapReduce principles, and distributed computing.
- Knowledge of Spark and the differences between Spark and MapReduce. Familiarity with encryption and security in a Hadoop cluster.
Data management / data structures
- Must be proficient in technical data management tasks, i.e., writing code to read, transform, and store data (see the sketch after this list)
- XML/JSON knowledge
- Experience working with REST APIs
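As a sketch of a typical read-transform-store task, the following Python fetches JSON from a REST API and persists a trimmed copy; the endpoint and field names are hypothetical:

    import json
    import requests  # assumes the requests library is available

    # Hypothetical endpoint, purely for illustration.
    response = requests.get("https://api.example.com/v1/studies", timeout=30)
    response.raise_for_status()
    records = response.json()

    # Keep only the fields of interest and store the result as JSON Lines.
    with open("studies.jsonl", "w") as out:
        for record in records:
            out.write(json.dumps({"id": record["id"], "title": record["title"]}) + "\n")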
Spark
- Experience launching Spark jobs in client mode and cluster mode. Familiarity with Spark job property settings and their implications for performance.
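To make this concrete, a sketch of property settings that commonly affect job performance; the values are placeholders, not recommendations. The deploy mode itself is chosen at submission time, e.g. spark-submit --deploy-mode client (driver runs on the submitting machine) or --deploy-mode cluster (driver runs inside the cluster):

    from pyspark.sql import SparkSession

    # Placeholder values; real settings depend on cluster size and workload.
    spark = (
        SparkSession.builder
        .appName("tuned-job")
        .config("spark.executor.memory", "4g")          # memory per executor
        .config("spark.executor.cores", "4")            # cores per executor
        .config("spark.sql.shuffle.partitions", "200")  # shuffle parallelism
        .getOrCreate()
    )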
Application Development
- Familiarity with HTML, CSS, and JavaScript and basic design/visual competency
SCC/Git
- Must be experienced in the use of source code control systems such as Git
ETL
- Experience developing ELT/ETL processes, including loading data from enterprise-scale RDBMSs such as Oracle, DB2, MySQL, etc.
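As an illustration of a typical load step, here is a PySpark read of an RDBMS table over JDBC; the connection details are placeholders, and the MySQL JDBC driver would need to be on the classpath:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdbms-load").getOrCreate()

    # Placeholder connection details; a real job would read these from configuration.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://db.example.com:3306/sales")
        .option("dbtable", "orders")
        .option("user", "etl_user")
        .option("password", "***")
        .option("fetchsize", "10000")  # rows fetched per round trip, to limit memory use
        .load()
    )
    orders.write.mode("overwrite").parquet("/data/landing/orders")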
Authorization
- Basic understanding of user authorization (Apache Ranger preferred)
Programming
- Must be able to code in Python or be an expert in at least one high-level language such as Java, C, or Scala.
- Must have experience using REST APIs
SQL
- Must be an expert in manipulating database data using SQL. Familiarity with views, functions, stored procedures, and exception handling is expected.
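To keep the illustration in Python, here is SQL manipulation over a Spark temporary view; the table and column names are invented, and in an RDBMS the same query could target a regular view:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-example").getOrCreate()
    orders = spark.read.parquet("/data/clean/orders")

    # Expose the DataFrame as a view so it can be queried with plain SQL.
    orders.createOrReplaceTempView("orders")
    summary = spark.sql("""
        SELECT order_year, COUNT(*) AS n_orders
        FROM orders
        GROUP BY order_year
        ORDER BY order_year
    """)
    summary.show()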
AWS
- General knowledge of the AWS stack (EC2, S3, EBS, …)
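For orientation, a minimal S3 interaction using boto3; the bucket and key names are placeholders:

    import boto3

    s3 = boto3.client("s3")  # credentials resolved from the environment or instance role

    # Upload a local file, then list the bucket's contents (placeholder names).
    s3.upload_file("orders.parquet", "example-bucket", "landing/orders.parquet")
    for obj in s3.list_objects_v2(Bucket="example-bucket").get("Contents", []):
        print(obj["Key"])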
IT Process Compliance
- Experience with the SDLC and formalized change controls
- Working in DevOps teams, based on Agile principles (e.g., Scrum)
- ITIL knowledge (especially incident, problem and change management)
Languages
- Fluent English skills
Specific information related to the position:
- Physical presence in the primary work location (Bangalore)
- Flexibility to work CEST and US EST time zones (according to the team rotation plan)
- Willingness to travel to Germany, US, and potentially other locations (as per project demand)

