"IMAGENET" of The Brain

David Vivancos Imagenet of The Brain MindBigData 2018

Version 0.04 of the MindBigData "IMAGENET" of The Brain open database contains 19,710 brain signals of 3 seconds each, captured over the course of 2018 from a single test subject, David Vivancos, while seeing a random image from the Imagenet ILSVRC2013 train dataset (3,942 images so far) and thinking about it. (Note that we released earlier another open database for digits 0-9 instead of images: The "MNIST" of Brain Digits.)

All the signals have been captured using a commercial EEG device (not medical grade), the Emotiv Insight headset, covering a total of 5 brain (10/20) locations.

Files available for download (only the current version is shown, since the database is incremental):

DataBase        File                          Zip size                     Uncompressed size   Date         Images
Insight v0.04   MindBigData-Imagenet-IN.zip   16.0 MB (16,788,628 bytes)   120 MB              03/12/2018   3,942

We built our own tools to capture the signals, but there is no post-processing on our side, so they come raw, exactly as read from the EEG device: 7,632,100 data points in total.
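As a quick sanity check, the figures quoted on this page can be compared with each other. The short sketch below only uses numbers stated in the text (3,942 images, 5 channels, 19,710 signals, 7,632,100 data points, and the 128 Hz rate over 3 seconds given in the file format description); the variable names are ours. It shows that the signal count is exactly images times channels, and that the average signal is about 387.2 samples long, close to the nominal 384.

```python
# Figures quoted on this page.
images = 3_942
channels = 5              # Emotiv Insight 10/20 locations
signals = 19_710
data_points = 7_632_100

# Nominal samples per signal: 128 Hz during 3 seconds.
nominal_samples = 128 * 3

# One signal per channel per recorded image.
signals_from_images = images * channels

# Average number of raw values actually stored per signal.
mean_samples = data_points / signals

print("signals match:", signals_from_images == signals)
print("nominal samples per signal:", nominal_samples)
print("mean samples per signal:", round(mean_samples, 1))
```

The small excess over 384 is consistent with the recordings being "around" 3 seconds long rather than exactly 3 seconds.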

Feel free to test any machine learning, deep learning or other algorithm you think could fit, or add the signals to your ImageNet pipeline to try to improve your performance. We only ask that you acknowledge the source, and please let us know your results so we can post them here!

We chose not to split the signals into training/test sets at this point, so pick the distribution you prefer.

The database will be extended periodically with more EEG signals (last update 03/12/2018). Please feel free to forward any thoughts you may have for improving the dataset.


The data is stored in a very simple text format: 1 CSV file for each EEG recording related to a single image (3,942 so far). The goal is to reach 100,000 images through 2018.

The naming convention is as follows, using the file "MindBigData_Imagenet_Insight_n09835506_15262_1_20.csv" as an example:

MindBigData_Imagenet_Insight_ : relates to the EEG headset used, currently only the Insight.

n09835506 : relates to the category of the image, taken from the ILSVRC2013 synsets; in this example n09835506 is "ballplayer, baseball player". The zip file also includes a "WordReport-v0.04.txt" file with 3 TAB-separated fields per row: the category names, the count of images with EEG recorded, and the synset ID.

15262 : relates to the exact image from the above category. All the images are from the ILSVRC2013_train dataset, which you can download from the Kaggle website (imagenet_object_detection_train.tar.gz, 56.68 GB). The image in this example is n09835506_15262.JPEG from the ILSVRC2013_train\n09835506\ folder.

_1_ : relates to the number of the EEG recording for this image. Usually there will be only 1, but it is possible to have several brain recordings for the same image; the second will be 2, and so on.

_20 : relates to the global session number in which the EEG signal for this image was recorded. To avoid long recording times, only 5 images are shown in each session, with 3 seconds of visualization each and 3 seconds of black screen between them.
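The naming convention above can be decoded with a small regular expression. This is only a sketch based on the single example file name given here: the names RecordingName, parse_recording_name and image_filename are ours, and the pattern assumes every field is laid out exactly as in the example (a headset name made of letters, a synset ID starting with "n", then three numeric fields).

```python
import re
from typing import NamedTuple


class RecordingName(NamedTuple):
    headset: str          # e.g. "Insight"
    synset_id: str        # e.g. "n09835506"
    image_id: str         # e.g. "15262" (kept as text, as it appears in the name)
    recording_index: int  # 1 for the first recording of this image, 2 for the second...
    global_session: int   # global session in which the signal was recorded


# Assumed layout: MindBigData_Imagenet_<headset>_<synset>_<image>_<recording>_<session>.csv
_NAME_PATTERN = re.compile(
    r"^MindBigData_Imagenet_(?P<headset>[A-Za-z]+)_"
    r"(?P<synset>n\d+)_(?P<image>\d+)_(?P<recording>\d+)_(?P<session>\d+)\.csv$"
)


def parse_recording_name(filename: str) -> RecordingName:
    """Split a recording file name into its fields, or raise ValueError."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return RecordingName(
        headset=match["headset"],
        synset_id=match["synset"],
        image_id=match["image"],
        recording_index=int(match["recording"]),
        global_session=int(match["session"]),
    )


def image_filename(name: RecordingName) -> str:
    """Name of the matching ILSVRC2013_train image, e.g. n09835506_15262.JPEG."""
    return f"{name.synset_id}_{name.image_id}.JPEG"


example = parse_recording_name("MindBigData_Imagenet_Insight_n09835506_15262_1_20.csv")
print(example)
print(image_filename(example))
```

The synset ID is also the name of the folder that holds the image inside ILSVRC2013_train, so these two fields are enough to pair each EEG file with its picture.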

Inside the CSV file there are 5 lines of plain text, one for each EEG channel recorded, each ending with a newline character.


The first field of each line is a text string identifying the 10/20 brain location of the signal, with possible values "AF3", "AF4", "T7", "T8" and "Pz" for the Insight headset (see below for the brain locations).

After that come all the raw EEG values captured, separated by commas. For this headset the capture is done at 128 Hz, so there should be around 384 (128 x 3 secs) decimal values like "4304.61538461538" for each channel. Note that a dot is used for the decimal point.
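A file in this format can be read with a few lines of standard-library Python. The sketch below is a minimal reader, not an official tool: the function names parse_recording and load_recording are ours, and stripping double quotes from the channel label is a defensive assumption, since this page does not say whether the label is quoted inside the file.

```python
from typing import Dict, List


def parse_recording(text: str) -> Dict[str, List[float]]:
    """Turn the text of one recording file into {channel label: raw values}."""
    channels: Dict[str, List[float]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines, e.g. after the final newline
        # First field is the 10/20 location, the rest are the raw EEG values.
        label, *values = line.split(",")
        # Assumption: the label may or may not be wrapped in double quotes.
        label = label.strip().strip('"')
        # Values use a dot as decimal separator, so float() reads them directly.
        channels[label] = [float(value) for value in values if value.strip()]
    return channels


def load_recording(path: str) -> Dict[str, List[float]]:
    """Read one recording file from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return parse_recording(handle.read())


# Tiny made-up sample in the same layout (real files have 5 lines of ~384 values).
sample = "AF3,4304.61538461538,4300.5\nAF4,4310.25,4311.0\n"
recording = parse_recording(sample)
print(list(recording))
print(len(recording["AF3"]))
```

Since the number of values per channel is only "around" 384, it is safer to check the length of each list than to assume a fixed size.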

If you plot all the raw values for the AF3 channel (first line of the file) you have this signal:

Sample Raw EEG signal

Note that this is the time series of the raw EEG electrical signal captured from my brain while staring at the image mentioned above for 3 seconds, without blinking and keeping as still as possible to avoid EMG noise.

The other EEG channels follow the same pattern, and the time axis of the series is shared between the 5 channels, so the first "column" of numbers is the first time step, and so on.
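Because the time axis is shared, the five channel lines can be regrouped column by column into one row per time step. The sketch below assumes the recording has already been read into a dictionary mapping each channel label to its list of values (a hypothetical in-memory layout, not something shipped with the dataset). It also assumes that the first value corresponds to t = 0 and that samples are evenly spaced at 128 Hz; the files carry no timestamps, so this is our reading of the description. Channels of slightly different length are cut to the shortest one.

```python
from typing import Dict, List, Sequence, Tuple

SAMPLE_RATE_HZ = 128  # Insight capture rate stated in the format description
INSIGHT_CHANNELS = ("AF3", "AF4", "T7", "T8", "Pz")


def to_time_steps(
    channels: Dict[str, List[float]],
    order: Sequence[str] = INSIGHT_CHANNELS,
) -> List[Tuple[float, Tuple[float, ...]]]:
    """Return [(seconds, (value per channel in `order`)), ...], one per time step."""
    series = [channels[name] for name in order]
    rows = []
    # zip() stops at the shortest channel, so every row has one value per channel.
    for index, values in enumerate(zip(*series)):
        # Assumption: sample 0 is at t = 0 and samples are evenly spaced.
        rows.append((index / SAMPLE_RATE_HZ, values))
    return rows


# Made-up values, only to show the shape of the result.
toy = {
    "AF3": [1.0, 2.0, 3.0],
    "AF4": [4.0, 5.0, 6.0],
    "T7": [7.0, 8.0, 9.0],
    "T8": [10.0, 11.0],        # one sample shorter than the others
    "Pz": [13.0, 14.0, 15.0],
}
for seconds, values in to_time_steps(toy):
    print(seconds, values)
```

Each row is then one multichannel sample, which is the layout most machine learning libraries expect for multivariate time series.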


Each EEG device captures the signals via different sensors, located in these areas of my brain; the color represents the device. Note that for the "IMAGENET" dataset only the Insight is used at the moment.

David Vivancos Brain 10/20 Locations

Feel free to contact us if you need any more info; we will be glad to hear your feedback.


This is a list of related past work using other, high-density EEG devices:

- Deep Learning Human Mind for Automated Visual Classification | September 2, 2016 | Concetto Spampinato, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Mubarak Shah, Nasim Souly

- Personalized Image Classification from EEG Signals using Deep Learning | June 6, 2016 | Alberto Bozal

Let's decode My Brain!
March 12th 2018
David Vivancos

This MindBigData The "IMAGENET" of The Brain is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/