Neuroimaging in Data Science

Chris Grannan
Apr 10, 2021


I have always been fascinated by the brain, so much so that I chose cognitive psychology as my major in college. Understanding how we process stimuli from the physical world captivated me. Now that I have shifted my focus to data science and analytics, I have been looking for a way to combine my passions for analytics and neuroscience. That search has led me to the field of neuroimage processing.

Overview of Neuroimaging:

Put simply, neuroscience is the study of the brain. It is commonly confused with psychology, and especially with cognitive psychology. Cognitive psychology is the study of how people process information and how this processing affects behavior. Neuroscience differs in that it focuses more on biological structures and processes than on behavior. Neuroimaging, one of the more common tools in both neuroscience and cognitive psychology research, is the process of creating images of the structure and function of the human brain and spinal cord. There are many different techniques for creating neuroimages, including computed tomography (CT) scans, cranial ultrasounds, and functional magnetic resonance imaging (fMRI) scans. Of these, fMRI scans are the most common type of imaging used for neuroscience and cognitive research.

An fMRI scan uses blood flow to map brain function. Faint magnetic signals from the blood, which change with its oxygen level, are picked up by radio-frequency (RF) coils in the MRI machine and mapped onto the brain. As activity in a brain region increases, blood flow to that region increases. Because of this, we can infer which areas of the brain are involved in specific tasks, as those areas show increased activity while subjects perform those tasks.

Sample fMRI

Pictured here is an example image from an fMRI. The full fMRI consists of many of these images, as each image shows only one cross-section of the scanned volume. The red sections in this image show increased activation during the scan.

Using Neuroimaging for Data Science:

Neuroimaging data is frequently used in computer vision and classification tasks. Typically, these tasks center on identifying noticeable biological differences between subjects who differ in behavior, health, or disorder. Some typical tasks include classifying brain tumors, identifying patients with Alzheimer’s disease, and finding signs of pathology.

Just like any other dataset, neuroimages often need preprocessing before any machine learning is applied. Images must first be inspected for quality, which often includes removing the data points recorded before the machine has reached a steady state and removing outliers caused by extreme head movement. Further processing is then used to standardize the images. Some of these processing methods include correcting the dataset for normal amounts of head movement, removing the skull from images, segmenting activity centers, normalizing image size, and controlling for metabolic differences between subjects. For a more detailed list of preprocessing techniques, check out this article on neuroimaging preprocessing from Wikibooks. The article goes into much more technical detail about the types of problems that arise in neuroimaging and how to account for them.
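The two quality checks described above (dropping the volumes recorded before the scanner reaches steady state, and removing volumes with extreme head movement) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular package: the function names, the 4 dummy volumes, the 0.5 mm threshold, and the 50 mm head radius used to turn rotations into millimetres are all illustrative choices, and the motion parameters are made-up numbers.

```python
def framewise_displacement(motion, head_radius_mm=50.0):
    """Volume-to-volume head movement from six rigid-body parameters.

    Each row is (tx, ty, tz, rx, ry, rz): translations in mm, rotations in
    radians. Rotations are converted to mm of arc on a sphere of the given
    radius (an assumed value, not a measured one).
    """
    fd = [0.0]  # the first volume has no predecessor to compare against
    for prev, curr in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3]))
        rotation = sum(abs(c - p) * head_radius_mm
                       for p, c in zip(prev[3:], curr[3:]))
        fd.append(translation + rotation)
    return fd


def volumes_to_keep(motion, n_dummy=4, fd_threshold_mm=0.5):
    """Indices of volumes that pass both quality checks.

    n_dummy and fd_threshold_mm are hypothetical defaults for illustration.
    """
    fd = framewise_displacement(motion)
    return [t for t, d in enumerate(fd)
            if t >= n_dummy and d <= fd_threshold_mm]


# Made-up example: slow drift along x, with a sudden jump at volume 6.
motion_params = [(0.01 * t, 0.0, 0.0, 0.0, 0.0, 0.0) for t in range(8)]
motion_params[6] = (1.5, 0.0, 0.0, 0.0, 0.0, 0.0)
print(volumes_to_keep(motion_params))  # [4, 5]
```

Volumes 0 to 3 are dropped as pre-steady-state, and volumes 6 and 7 are dropped because the jump into and out of the displaced position both exceed the threshold.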

Now that we have a general idea of what neuroimaging entails and how we can use it, we will go over two interesting tools that assist in neuroimaging analysis: FSL and FreeSurfer.


The FMRIB Software Library, abbreviated FSL and pronounced “fossil”, is a library created at the University of Oxford for processing and 3-D viewing of fMRI scans. Its viewer, FSLeyes (pronounced “fossilize”), displays the fMRI in a 3-D format. Below is an example of what an FSLeyes view looks like.

Example FSLeyes Window

The program reconstructs the brain scans into a 3-D model where you can move a green reticle to focus on one section of the brain in one image. This will adjust the other two images to show that section on the other axes. The same effect can be achieved by changing the coordinates under “Location”. We can also advance the time series of the scan by changing the last box under “Voxel location.” By increasing this value, we load the next images of the time series. You should see slight variations in the scan as time changes, even when the subject is still. These variations indicate the changes in blood flow over the course of the scan.
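To make the slight variations over time easier to compare, the intensity of a single voxel across the time series is often expressed as percent change from its own mean. The snippet below is a minimal sketch of that idea, not something FSLeyes does for you; the function name and the intensity values are made up for illustration.

```python
from statistics import mean


def percent_signal_change(series):
    """Express each time point as percent change from the series mean."""
    baseline = mean(series)
    if baseline == 0:
        raise ValueError("mean intensity is zero; percent change is undefined")
    return [100.0 * (value - baseline) / baseline for value in series]


# Hypothetical intensities of one voxel at four consecutive time points.
voxel_series = [100, 102, 98, 100]
print(percent_signal_change(voxel_series))
```

Each value you would read off by stepping through the time series at a fixed voxel location corresponds to one entry of the input list.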

We can also use FSL to perform preprocessing tasks such as removing the skull from the images and segmenting brain structures.

FSL Main Menu

Here you can see the main menu when FSL is opened. After you click one of the processing buttons and load an image from an fMRI scan, FSL will create a new image with that process automatically applied. For a more thorough explanation of how to work with FSL, there is an excellent walkthrough on YouTube by Andrew Jahn, found here. If you are interested in getting started with neuroimaging, I highly recommend you download a practice dataset and work through this tutorial.
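The same skull removal can also be run from the command line with FSL's bet tool, which is handy once you have more than a few images. Below is a rough sketch of building and running that command from Python. The file names are hypothetical, and the fractional intensity threshold of 0.5 is bet's default rather than a recommendation for your data; the helper functions are my own illustration, not part of FSL.

```python
import shutil
import subprocess


def bet_command(in_file, out_file, frac=0.5, save_mask=True):
    """Build a command line for FSL's brain extraction tool (bet).

    frac is bet's fractional intensity threshold (-f); smaller values give
    larger brain outlines. -m also writes a binary brain mask.
    """
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must be between 0 and 1")
    cmd = ["bet", in_file, out_file, "-f", str(frac)]
    if save_mask:
        cmd.append("-m")
    return cmd


def run_if_available(cmd):
    """Run the command only if the program is installed and on the PATH."""
    if shutil.which(cmd[0]) is None:
        print(f"{cmd[0]} not found; is FSL installed?")
        return False
    subprocess.run(cmd, check=True)
    return True


# Hypothetical file names, used only for illustration.
cmd = bet_command("sub-01_T1w.nii.gz", "sub-01_T1w_brain.nii.gz")
print(" ".join(cmd))
```

Passing the result to run_if_available would execute it on a machine where FSL is installed; the sketch only prints the command so it is safe to run anywhere.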


FreeSurfer is another software library that is useful when working with neuroimaging data. Like FSL, it can automate fMRI preprocessing, but it can also reconstruct the cortical structure of the brain, producing a 3-D image of the brain’s surface.

Example FreeSurfer Cortex Reconstruction

This reconstruction allows the user to map the major gyri (ridges) and sulci (trenches) of the brain’s surface and further segment the areas of brain activation obtained in the fMRI. While this cortex reconstruction is impressive and undeniably useful, the runtime to process a scan in FreeSurfer can be extreme. Even on a fairly new and fast laptop, it took me nearly 6 hours to process one subject’s scan, and the problem compounds when you process an entire dataset. For a gentle introduction to FreeSurfer, once again check out Andrew Jahn’s YouTube channel for an excellent tutorial, found here.
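FreeSurfer's cortical reconstruction is normally started from the command line with recon-all, which takes a structural (T1-weighted) image for each subject. The sketch below builds one command per subject and gives a back-of-the-envelope runtime estimate for a whole dataset. The subject IDs and file names are hypothetical, and the 6 hours per subject is simply the figure from my laptop above, not a benchmark.

```python
import math


def recon_all_command(subject_id, t1_image):
    """Build a FreeSurfer recon-all command for one subject.

    -s names the subject, -i gives the input T1-weighted image,
    and -all runs the full reconstruction pipeline.
    """
    return ["recon-all", "-s", subject_id, "-i", t1_image, "-all"]


def estimated_hours(n_subjects, hours_per_subject=6.0, parallel_jobs=1):
    """Rough wall-clock estimate, assuming equal runtime for every subject."""
    batches = math.ceil(n_subjects / parallel_jobs)
    return batches * hours_per_subject


# Hypothetical subject IDs and file names, for illustration only.
subjects = ["sub-01", "sub-02", "sub-03"]
for subject in subjects:
    print(" ".join(recon_all_command(subject, f"{subject}_T1w.nii.gz")))

print(estimated_hours(20))                   # 120.0
print(estimated_hours(20, parallel_jobs=4))  # 30.0
```

Running subjects in parallel is the usual way to keep the total time manageable, provided the machine has enough cores and memory.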


For a gentle introduction to neuroimaging, read this blog post by Mark Humphries, author of The Spike.

For a list of trending data science projects using neuroimaging, read this blog post from Towards Data Science.

For a closer look at the FSL library, make sure you check out the documentation, found here.

Check out the FreeSurfer documentation here for a more thorough understanding of what the software entails.

For a tutorial on working with these libraries, check out Andrew Jahn’s YouTube playlists for FSL and FreeSurfer.