Are your undergrads getting enough data science?


Science moves faster than curriculum. This means we are leaving some of our students behind.

Most kinds of science nowadays involve collecting a lot of information, and then manipulating this information to make sense of it. I’ve spent some time looking at undergraduate biology curricula from various kinds of institutions over the past year, and it doesn’t look like most bio majors are getting equipped with what I think they need to be able to do science. Some students pick up these skills during individualized research mentorship, but it’s not often built into the curriculum.

Data science isn’t just a handy research tool; it’s an integral part of being a scientist. If we’re training people as scientists, then we’ve got to teach fundamental data science skills. With the abundance and complexity of data that we use on a day-to-day basis, from sequencing, remote sensing, datalogging, long-term datasets, publicly archived data, surveys, government databases, automated instrumentation, and so on, managing large and complex datasets is now the norm. We may or may not be teaching our classes in this way, but this is how science is being done, and we need to teach the requisite skills to students as a foundation for being a contemporary scientist.

Do you think your department is doing what it needs to? If so, how did you make this happen, and what was required? If not, what’s the impediment? Bureaucracy? Faculty training? Consistency in teaching assignments? Class size? Too many high priority needs? Faculty resistance? No space in the curriculum and don’t know what to drop? Conflicting priorities of different subfields?

Do our students really need calculus more than they need data science? Do they need a year of organic chemistry more than they need data science?

If a student has an undergraduate degree in biology (and perhaps most other STEM fields?), shouldn’t they emerge with at least a rudimentary ability to use either R or Python?

(This is a shortcoming in the curriculum of my own department, and we have identified this as a priority to address, pronto.)

If you’ve been professoring for a little while, then you must be really familiar with the bureaucracy and slow pace of anything involving the curriculum. You want a new course? To change a course? To change a prerequisite? To change what courses belong in a major? How many committees does it have to go through? And if it impacts general education requirements, then that’s a whole lot of politics on top of it all. It appears that it’s easier to grease the wheels of curriculum in smaller private institutions, but, still, it’s one of the bigger administrative stumbling blocks, right?

I don’t subscribe to the view that our majors should serve primarily as job training programs, and I think a university education should be about building fundamental knowledge and practices that are useful as the world changes. So I’m not saying we should be putting data science in our majors primarily because graduate programs and employers want and need it (though that’s a good secondary reason). It’s because data science is needed to be able to do science nowadays.

I think the campuses that have done best in this regard are the high-endowment small liberal arts colleges (SLACs). These campuses have smaller class sizes, lower teaching loads, and funds for faculty professional development and student support, and all this adds up to the capacity to create a curriculum that can more promptly respond to contemporary student needs. Also, the proportion of courses taught by contingent faculty is lower, so with greater consistency in teaching assignments, it’s easier to articulate skills from one course to another. These campuses also have few transfer students, so they can control what happens in the lower division so that all students in the upper division will have had common experiences. With transfer students, you can’t count on your upper division students having had your own lower division courses, so you can’t have a carefully scaffolded sequence from the 100-level to the 400-level. So I can see why SLACs are taking leadership in this realm, if not in others.

If you’re a student or a recent graduate, do you think you got what you need? If you’re a faculty member, are you providing what students need?


5 thoughts on “Are your undergrads getting enough data science?”

  1. During my BSc we were taught stats using the software Minitab, which at the time seemed like a blessing, in that we did not have to learn how to code in order to do our statistics. This meant that teaching could focus more on the stats side of things, rather than the instructors spending all their effort going round the class making sure everyone’s code was correct.

    During my MSc, we had a lecture from one of the epidemiologists at the University, who just demonstrated 4-5 different software packages that were available for a range of data and visual analysis scenarios. His opinion was that there clearly wasn’t time in the one-hour lecture (of about 20 students) for us to understand all the intricacies of how to work these programs (even if we focussed on just one for the hour), but that it was important we knew these types of tools existed and could go and try them out ourselves if we thought they would be useful.

  2. Thank you for posting. I am involved in curriculum development for a pre-med program at a SLAC. I wonder if there is an argument to be made for data science in medicine?
