The "MNIST" of Brain Digits
Version 1.03 of the open database contains 1,207,293 brain signals of 2 seconds each, captured with the stimulus of seeing a digit (from 0 to 9) and thinking about it, over the course of almost 2 years between 2014 and 2015, from a single test subject, David Vivancos. In 2018 we also started sharing a new open dataset, the "IMAGENET" of The Brain, and in 2021 we started The Visual "MNIST" of Brain Digits, with real individual MNIST digits shown.
All the signals were captured using commercial (not medical-grade) EEG devices: NeuroSky MindWave, Emotiv EPOC, Interaxon Muse and Emotiv Insight, covering a total of 19 brain (10/20) locations.
Four files are available for download:
|Database||File||Zip size||File size||Date||Mirror|
|MindWave||MindBigData-MW-v1.0.zip||62.6 MB (65,663,303 bytes)||297 MB (311,994,495 bytes)||09/11/2015|
|EPOC**||MindBigData-EP-v1.0.zip||408 MB (427,958,689 bytes)||2.66 GB (2,859,712,035 bytes)||06/16/2018||US DataHub Mirror|
|Muse||MindBigData-MU-v1.0.zip||62.6 MB (65,663,303 bytes)||297 MB (311,994,495 bytes)||09/11/2015|
|Insight*||MindBigData-IN-v1.06.zip||25.3 MB (26,610,979 bytes)||184 MB (193,010,330 bytes)||12/10/2019|
We built our own tools to capture the signals, but there is no post-processing on our side, so they come raw, as read from each EEG device: 395,072,896 data points in total.
Feel free to test any machine learning, deep learning or other algorithm you think could fit. We only ask that you acknowledge the source, and please let us know about your performance!
We chose not to split the signals into training/test/validation sets at this point, so pick the distribution you prefer.
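Since the split is left open, here is one possible way to do it, offered as a sketch rather than a recommendation from the dataset authors. It assumes you have already loaded the signals as `(event_id, signal)` pairs, where `event_id` is the [event] field of the file format, and it keeps every signal of the same event in the same subset so that channels of one capture do not end up on both sides of the split. The function name `split_by_event`, the 80/10/10 fractions and the seed are all hypothetical choices.

```python
import random
from collections import defaultdict


def split_by_event(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split (event_id, signal) pairs into train/validation/test lists.

    All signals sharing an event id land in the same subset.
    """
    groups = defaultdict(list)
    for event_id, signal in records:
        groups[event_id].append(signal)

    event_ids = sorted(groups)               # deterministic starting order
    random.Random(seed).shuffle(event_ids)   # reproducible shuffle

    n_train = int(len(event_ids) * fractions[0])
    n_val = int(len(event_ids) * fractions[1])
    parts = (
        event_ids[:n_train],
        event_ids[n_train:n_train + n_val],
        event_ids[n_train + n_val:],          # remainder goes to test
    )
    return tuple([s for e in part for s in groups[e]] for part in parts)


# Toy input: 10 events, each captured on 3 channels (placeholder strings, not real data).
toy = [(event, f"event{event}-channel{ch}") for event in range(10) for ch in range(3)]
train, val, test = split_by_event(toy)
print(len(train), len(val), len(test))  # 24 3 3
```

Splitting by event rather than by line is a design choice that matters mainly for the multichannel devices; for a single-channel device it reduces to an ordinary shuffled split.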
A small portion of the signals were captured without the stimulus of seeing the digits, for contrast. They are all random actions not related to thinking about or seeing digits, and they use the code -1. You can decide whether or not to use them in your tests.
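If you want to treat the contrast signals separately, a minimal sketch could look like the following. It assumes the signals are already loaded as `(code, signal)` pairs; the helper name `partition_by_stimulus` and the toy records are hypothetical and not part of the dataset.

```python
from collections import Counter

RANDOM_CODE = -1  # code the dataset uses for signals not tied to any digit


def partition_by_stimulus(records):
    """Split (code, signal) pairs into digit-stimulus and random-action lists."""
    digits, randoms = [], []
    for code, signal in records:
        (randoms if code == RANDOM_CODE else digits).append((code, signal))
    return digits, randoms


# Placeholder records, not real data.
toy = [(3, "a"), (-1, "b"), (7, "c"), (3, "d"), (-1, "e")]
digits, randoms = partition_by_stimulus(toy)
per_digit = Counter(code for code, _ in digits)
print(len(digits), len(randoms), per_digit[3])  # 3 2 2
```

Keeping the -1 signals in their own list leaves both options open: drop them for a 10-class digit task, or keep them as an extra "no digit" class.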
This is the distribution of the signals per device and digit:
* Insight captures started in September 2015. The dataset was updated to fix the channel separation by comma and to use a dot for decimals, instead of commas only; last update 10/12/2019, v1.06.
** The EPOC dataset was updated to fix the channel separation by comma and to use a dot for decimals, instead of commas only; last update 06/16/2018, v1.01.
The data is stored in a very simple text format including:
[id]: a number, used only for reference purposes.
[event]: an integer id, used to distinguish the same event captured at different brain locations; used only by multichannel devices (all except MW).
[device]: a 2-character string identifying the device used to capture the signals: "MW" for MindWave, "EP" for Emotiv EPOC, "MU" for Interaxon Muse and "IN" for Emotiv Insight.
[channel]: a string identifying the 10/20 brain location of the signal, with possible values including "AF3", "F7", "F3", "FC5", "T7", "P7", "O1", "O2", "P8", "T8", "FC6", "F4", "F8", "AF4".
[code]: an integer identifying the digit being thought/seen, with possible values 0,1,2,3,4,5,6,7,8,9, or -1 for random captured signals not related to any of the digits.
[size]: an integer giving the number of values captured in the 2 seconds of this signal. Since the sampling rate of each device varies, in "theory" the value is close to 512 Hz for MW, 128 Hz for EP, 220 Hz for MU and 128 Hz for IN, for each of the 2 seconds.
[data]: a comma-separated set of numbers with the time-series amplitude of the signal. Each device uses a different precision for the electrical potential captured from the brain: integers in the case of MW and MU, real numbers in the case of EP and IN.
There are no headers in the files; every line is a signal, and the fields are separated by a tab.
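To make the format concrete, the sketch below parses one line into a small record. It is an illustration under assumptions, not an official loader: it assumes the seven fields appear on each line in the order they are described above, the `Signal` class and its attribute names are hypothetical, and the example line is synthetic (the amplitudes are made up and far shorter than a real 2-second capture). The nominal sizes simply multiply the quoted sampling rates by 2 seconds; real lines may differ, as the [size] description warns.

```python
from dataclasses import dataclass
from typing import List

# Nominal sampling rates (Hz) quoted in the field description; real captures vary.
NOMINAL_HZ = {"MW": 512, "EP": 128, "MU": 220, "IN": 128}
SIGNAL_SECONDS = 2


@dataclass
class Signal:
    signal_id: int     # reference id
    event: int         # shared by the channels of one multichannel capture
    device: str        # "MW", "EP", "MU" or "IN"
    channel: str       # 10/20 location, e.g. "AF3"
    code: int          # digit 0-9, or -1 for random signals
    size: int          # number of values the line claims to carry
    data: List[float]  # amplitudes (integers are stored as floats here)


def parse_line(line: str) -> Signal:
    """Parse one tab-separated line, assuming the field order listed above."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    signal_id, event, device, channel, code, size, data = fields
    values = [float(v) for v in data.split(",")]
    return Signal(int(signal_id), int(event), device, channel,
                  int(code), int(size), values)


def nominal_size(device: str) -> int:
    """Theoretical number of samples in one 2-second signal for a device."""
    return NOMINAL_HZ[device] * SIGNAL_SECONDS


# Synthetic line for illustration only; values are not taken from the dataset.
example = "\t".join(["1", "1", "EP", "AF3", "5", "4", "4395.38,4382.56,4377.43,4387.17"])
sig = parse_line(example)
print(sig.device, sig.channel, sig.code, len(sig.data))  # EP AF3 5 4
print(nominal_size("MW"))  # 1024
```

Comparing `sig.size`, `len(sig.data)` and `nominal_size(sig.device)` is a cheap sanity check before feeding signals to a model that expects fixed-length input.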
For example one line of each device could be (without the headers)
Each EEG device captures the signals via different sensors, located in these areas of my brain; the color represents the device: MindWave, EPOC, Muse, Insight.
RELATED RESEARCH, CITATIONS & RESULTS by 3rd parties:
- Giving sense to EEG records (course IFT6390 "machine learning" by Pascal Vincent from MILA) by Amin Shahab, Marc Sayn-Urpar, René Doumbouya, Thomas George & Vincent Antaki.
- Contribution aux décompositions rapides des matrices et tenseurs, Viet-Dung NGUYEN THÈSE UNIVERSITÉ D’ORLÉANS Nov-16th-2016
- Fast learning of scale-free networks based on Cholesky factorization, Vladislav Jelisavčić, Ivan Stojkovic, Veljko Milutinovic, Zoran Obradovic May-2018
- STRUCTURED LEARNING FROM BIG DATA BASED ON PROBABILISTIC GRAPHICAL MODELS, Vladislav Jelisavčić UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING May-2018
- Combination of Wavelet and MLP Neural Network for Emotion Recognition System, Phuong Huy Nguyen, Thai Nguyen University of Technology (TNUT) & Thu May Duong, Thi Mai Thuong Duong, Thu Huong Nguyen University of Information and Communication Technology Vietnam Nov-2018
- A Deep Evolutionary Approach to Bioinspired Classifier Optimisation for Brain-Machine Interaction, Jordan J. Bird, Diego R. Faria, Luis J. Manso, Anikó Ekárt, and Christopher D. Buckingham, School of Engineering and Applied Science, Aston University, Birmingham, UK Mar-2019
- Novel joint algorithm based on EEG in complex scenarios, Dongwei Chen, Weiqi Yang, Rui Miao, Lan Huang, Liu Zhang, Chunjian Deng & Na Han School of Business, Beijing Institute of Technology, Zhuhai, China Aug-2019
- HHHFL: Hierarchical Heterogeneous Horizontal Federated Learning for Electroencephalography, Dashan Gao, Ce Ju, Xiguang Wei, Yang Liu, Tianjian Chen and Qiang Yan, Hong Kong University of Science and Technology, 2AI Lab, WeBank Co. Ltd. Sep-2019
- Universal EEG Encoder for Learning Diverse Intelligent Tasks, Baani Leen Kaur Jolly, Palash Aggrawal, Surabhi S Nath, Viresh Gupta, Manraj Singh Grover, Rajiv Ratn Shah, MIDAS Lab, IIIT-Delhi Nov-2019
- Stanford CS230 - Group Project Final Report, Roman Pinchuk and Will Ross 2020
- Mental State Recognition and Recommendation of Aids to Stabilize the Mind Using Wearable EEG, M.W.A. Aruni Wijesuriya, University of Colombo School of Computing 2020
- Generating the image viewed from EEG signals, Gaffari ÇELİK, Muhammed Fatih 2020
- EEG-Based Emotion Classification for Alzheimer’s Disease Patients Using Conventional Machine Learning and Recurrent Neural Network Models, Mahima Chaudhary, Sumona Mukhopadhyay, Marin Litoiu, Lauren E Sergio, Meaghan S Adams Aug-2020
- Understanding Brain Dynamics for Color Perception Using Wearable EEG Headband, Jungryul Seo, Teemu H. Laine, Gyuhwan Oh, Kyung-Ah Sohn Dec-2020
- Frequency Band and PCA Feature Comparison for EEG Signal Classification, Wayan Pio Pratama, Made Windu Antara Kesiman, Gede Aris Gunadi, Apr-2021
- Toward lightweight fusion of AI logic and EEG sensors to enable ultra edge-based EEG analytics on IoT devices, Tazrin Tahrat, May-2021
- Deep Learning in EEG: Advance of the Last Ten-Year Critical Period, Shu Gong, Kaibo Xing, Andrzej Cichocki, Junhua Li May-2021
- Convolutional Neural Network-Based Visually Evoked EEG Classification Model on MindBigData, Nandini Kumari, Shamama Anwar, Vandana Bhattacharjee Jun-2021
- Visual Brain Decoding for Short Duration EEG Signals, Rahul Mishra, Krishan Sharma, Arnav Bhavsar Aug-2021
- Quality analysis for reliable complex multiclass neuroscience signal classification via electroencephalography, Ashutosh Shankhdhar, Pawan Kumar Verma, Prateek Agrawal, Vishu Madaan, Charu Gupta Jan-2022
Contact us if you need any more info.
decode My Brain!
February 2nd 2022
This MindBigData The "MNIST" of Brain Digits is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/