Biomedical Data Science

The call for "data scientists" in government and industry has only grown louder since the publication of the McKinsey report, which states that "by 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions." Data science has many applications; our focus is building an undergraduate biomedical data science program that prepares students for careers in genetic medicine, pharmaceuticals, bioinformatics, data engineering, biostatistics, and related fields.

Learn more about the Summer Research Project BD2K

What is Biomedical Data Science?

Biomedical data science can be defined in many ways. We start from the definition of Biomedical Big Data provided by the NIH Big Data to Knowledge (BD2K) initiative: "the term 'Biomedical Big Data' is inclusive of the diverse digital objects which may have impact in basic, translational, clinical, social, behavioral, environmental, or informatics research questions. Such data types may include imaging, phenotypic, genotypic, molecular, clinical, behavioral, environmental, and many other types of biological and biomedical data. They may also include biologically-relevant data generated for other purposes (e.g., social media, search histories, economic, geographical, or smart phone data). Finally, they also encompass the metadata, data standards, and software tools involved in data processing and analysis." The intersection of math/statistics, computer science, and biomedicine is truly far-reaching, and we are here to help you find your place within it.

Enhancing Diversity in Biomedical Data Science

The goals of the Biomedical Data Science Program at CSUMB, funded by the NIH, are:

  • To establish a summer research program for CSUMB students at the Center for Big Data in Translational Genomics at UC Santa Cruz. The center works to help the biomedical community use genomic information to better understand human health and disease. Visiting CSUMB students will spend the summer working side-by-side with UCSC scientists and data specialists, learning research skills to manage and interpret genomic data.
  • To develop new programs, such as an interdisciplinary statistics major, that will include math, statistics, biology, and the behavioral and computer sciences to prepare students for graduate school and careers in research or industry.
  • To create opportunities to extend CSUMB faculty training and research in biomedical data science in collaboration with UCSC faculty members and researchers.

Figure: The relationship between statistics, computer science, and domain knowledge. Copyright © 2016 Emmert-Streib, Moutari and Dehmer. doi: 10.3389/fgene.2016.00012

The ability to harvest the wealth of information contained in biomedical Big Data will advance our understanding of human health and disease; however, lack of appropriate tools, poor data accessibility, and insufficient training are major impediments to rapid translational impact. To meet this challenge, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative in 2012.

Figure: Venn diagram for biomedical data science. Biomedical Data Science is the intersection of Math/Statistics, Computer Science, and Biology.

Funding provided by

The BD2K Biomedical Data Science program at CSUMB is supported by the Office of the Director (OD) of the National Institutes of Health under grant number R25MD010391. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.