Dana-Farber Cancer Institute | Data Engineer (To Support Cancer Researchers) | Boston, MA | United States | BigDataKB.com | 17 Oct 2022



Job Location: Boston, MA

Job ID:

450 Brookline Ave, Boston, MA 02215


Employment Type:
Full time

Work Location:
Full Remote: 4-5 days remote/wk


REMOTE is an available option for interested US-based candidates.

Do you think artificial intelligence should be applied to solving cancer rather than only to advertising and phone apps? We are looking for a Data Engineer who will prepare the data for the next generation of AI that will run on cancer data.

We are seeking an intelligent, hard-working, and dynamic individual to serve as a Data Engineer within the AI Operations and Data Science Services group, which serves some of the most prominent research and clinical programs at the Institute, from basic and translational research to clinical deployment and operationalization. The group encompasses expertise in AI, data science, machine learning, NLP, computer vision, production deployment, cloud infrastructure, data engineering, project management standards, and data labeling. The group seeks to develop a highly interdisciplinary environment that helps our research, clinical, and operational staff advance the overall mission of DFCI, which is to provide expert, compassionate care to children and adults with cancer while advancing the understanding, diagnosis, treatment, cure, and prevention of cancer and related diseases.

As we widen our support of several crucial centers and programs at DFCI, we seek an energetic and motivated Data Engineer to help us scale up our data infrastructure to support the research objectives of our Investigators. This will involve being responsible for data management, building data pipelines, contributing to software tools developed internally and used by collaborators, and establishing data engineering and data management best practices. The primary focus will be on our Breast Oncology Division, but the candidate will be expected to contribute as needed to other cancer areas we serve. The successful candidate will have proven experience working on large, complex projects and meeting deadlines, along with excellent communication and interpersonal skills.


The key responsibilities will be:

  • Responsible for data management and building data pipelines for breast cancer research data
  • Managing one research client that has numerous ongoing projects in AI applied to radiology, molecular, pathology, clinical, clinical trial, and text data. Data engineering will be the key enabler for taking these projects to the next level
  • Strong organizational skills with demonstrated capacity to track and manage data flows across projects and computational platforms
  • Meeting and consulting with scientists, and designing plans and solutions to support their data tooling needs
  • Delivery of results for projects on time and on budget
  • Working as part of the broader team to identify long-term solutions that will improve the quality, speed and efficacy of our current projects and programs
  • Evaluating and benchmarking new software libraries
  • Prototype and deploy data engineering pipelines
  • Design and implement data pipelines that focus on data life cycle
  • Excellent communication and effective problem-solving skills, ideally with a track record of serving a variety of customers and projects
  • Ability to quickly learn new software tools and provide feedback/recommendations
  • Ability to work independently, prioritize, and manage people if needed, within an environment with ever-changing priorities
  • Demonstrate excellent soft skills, such as strong oral and written communication



  • Excellent data engineering and data management skills
  • Strong proficiency in Python and SQL (required)
  • A degree in a quantitative field such as informatics, computer science, applied mathematics, or software engineering, or equivalent experience with evidence of impact in data engineering applied to real-life problems (e.g., some quantitative master's courses AND experience in a relevant internship) (preferred)
  • Experience in a research setting, ideally within a clinical or basic research environment (nice to have)
  • Familiarity with Jupyter Lab, Linux, and Git
  • Cloud computing experience (e.g., GCP, AWS, Azure)
  • Experience with the REDCap API or the OMOP Common Data Model (preferred)


  • 1 to 5 years of experience post-MS or post-PhD (preferred)
  • Experience with multiple large, heterogeneous, and sparse datasets is strongly preferred
  • Prior experience in client management (preferred)
  • Domain knowledge of oncology and cancer biology is preferred but not required

Dana-Farber Cancer Institute is an equal opportunity employer and affirms the right of every qualified applicant to receive consideration for employment without regard to race, color, religion, sex, gender identity or expression, national origin, sexual orientation, genetic information, disability, age, ancestry, military service, protected veteran status, or other groups as protected by law.

