MindBigData The Visual "MNIST" of Brain Digits (2021)
Files available for download:
DataBase | File | Zip size | File size | Date | MNIST Digits | Version |
Muse2-v0.17 | MindBigDataVisualMnist2021-Muse2v0.17.zip | 709 MB (743,459,783 bytes) | 2.37 GB (2,551,653,293 bytes) | 01/12/2022 | 18,000 | Beta 0.17 |
Muse2-v0.14 | MindBigDataVisualMnist2021-Muse2v0.14.zip | 592 MB (621,214,944 bytes) | 1.97 GB (2,126,055,435 bytes) | 12/16/2021 | 15,000 | Beta 0.14 |
Muse2-v0.09 | MindBigDataVisualMnist2021-Muse2v0.09.zip | 394 MB (413,159,565 bytes) | 1.31 GB (1,417,356,709 bytes) | 11/05/2021 | 10,000 | Beta 0.09 |
Muse2-v0.04 | MindBigDataVisualMnist2021-Muse2v0.04.zip | 195 MB (205,309,158 bytes) | 677 MB (710,554,967 bytes) | 09/17/2021 | 5,000 | Beta 0.04 |
Muse2-v0.01 | MindBigDataVisualMnist2021-Muse2v0.01.zip | 77.4 MB (81,200,465 bytes) | 271 MB (284,676,053 bytes) | 08/27/2021 | 2,000 | Beta 0.01 |
Feel free to test any machine learning, deep learning, or other algorithm you think could fit. We only ask that you acknowledge the source, and please let us know how it performs!
This new dataset includes other bio-signals beyond EEG to foster the use of multimodal data in training algorithms, which could help different lines of research.
January 2022 Update: Because EEG signal noise was detected in some channels of the Muse2 recordings, two subsets have been created that keep only the best signals, still in raw format: "Cut2" with 2 channels (TP9 & TP10) and "Cut3" with 3 channels (TP9, AF7 & TP10).
DataBase | File | Zip size | File size | Date | MNIST Digits | Version |
Muse2-v0.16Cut2 | MindBigDataVisualMnist2021-Muse2v0.16Cut2.zip | 189 MB (198,358,166 bytes) | 659 MB (691,242,736 bytes) | 01/09/2022 | 11,387 | Beta 0.16Cut2 |
Muse2-v0.16Cut3 | MindBigDataVisualMnist2021-Muse2v0.16Cut3.zip | 20.9 MB (21,917,547 bytes) | 74.8 MB (78,476,950 bytes) | 01/09/2022 | 1,184 | Beta 0.16Cut3 |
The original Muse2 datasets with the 4 EEG channels above can still be used in many cases with further preprocessing.
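As an illustration of how the Cut subsets relate to the full files, here is a minimal Python sketch that drops EEG channels from one full Muse2 row, assuming the row layout given under FILE FORMAT (788 leading values, then 4 EEG blocks of 512 values, then 9 other signal blocks). The function name `keep_eeg_channels` is hypothetical, and this only shows the column selection: the published Cut files also contain fewer digits, so rows were presumably filtered too, by criteria not described here.

```python
# Sketch only: layout constants come from the FILE FORMAT description.
N_META = 3 + 28 * 28 + 1          # dataset, origin, digit_event, 784 pixels, timestamp
N_SAMPLES = 2 * 256               # 2 seconds at 256 Hz per channel
EEG_ORDER = ("TP9", "AF7", "AF8", "TP10")
N_OTHER = 9                       # 3 PPG + 3 accelerometer + 3 gyroscope channels


def keep_eeg_channels(fields, keep):
    """Return a copy of one full Muse2 row with only the EEG channels in `keep`."""
    expected = N_META + (len(EEG_ORDER) + N_OTHER) * N_SAMPLES
    if len(fields) != expected:
        raise ValueError(f"expected {expected} values, got {len(fields)}")
    out = list(fields[:N_META])
    for i, name in enumerate(EEG_ORDER):
        if name in keep:
            start = N_META + i * N_SAMPLES
            out.extend(fields[start:start + N_SAMPLES])
    # PPG, accelerometer and gyroscope blocks are kept unchanged.
    out.extend(fields[N_META + len(EEG_ORDER) * N_SAMPLES:])
    return out
```

For example, `keep_eeg_channels(row, {"TP9", "TP10"})` gives a row with the same column layout as a Cut2 row, and `{"TP9", "AF7", "TP10"}` gives the Cut3 layout.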
New EEG headsets will be added through 2022.
FILE FORMAT:
The data is stored in a very simple text format (CSV-like, all comma separated) including:
[dataset]: a text field pointing to the original Yann LeCun MNIST source type, either "TRAIN" or "TEST", referring to the 60,000 train digits and 10,000 test digits.
[origin]: 1 integer, used to reference the location of the original digit in the Yann LeCun MNIST source data files, from 0 to 59,999 for train and from 0 to 9,999 for test, or -1 to indicate black signal (meaning not from the original MNIST datasets)
[digit_event]: 1 integer with the original MNIST label of the image from 0 to 9 or -1 to indicate black signal (no digit shown)
[original_png]: 784 integers (comma separated), with the original pixel intensities of the Yann LeCun MNIST source png file shown; each pixel can have a value from 0 to 255 (for black signal all will be 0s). 784 comes from 28x28, since it is a single-channel square image, flattened
[timestamp]: 1 Unix-like timestamp for the initial time of capture of the signals for this digit
[EEGdata]:
For Muse2:
512 floating point (comma separated) EEG - TP9 channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) EEG - AF7 channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) EEG - AF8 channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) EEG - TP10 channel raw signal (2secs at 256hz)
For Muse2 Cut2 (only TP9 & TP10)
For Muse2 Cut3 (only TP9, AF7 & TP10)
For Muse2 (only):
512 floating point (comma separated) PPG1 ambient channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) PPG2 infrared channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) PPG3 red channel raw signal (2secs at 256hz)
For Muse2:
512 floating point (comma separated) Accelerometer X channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) Accelerometer Y channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) Accelerometer Z channel raw signal (2secs at 256hz)
For Muse2:
512 floating point (comma separated) Gyroscope X channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) Gyroscope Y channel raw signal (2secs at 256hz), followed by
512 floating point (comma separated) Gyroscope Z channel raw signal (2secs at 256hz)
For Muse2 data, there are 7,444 comma-separated values per row in total (6,932 for Muse2 Cut3 & 6,420 for Muse2 Cut2).
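These totals follow directly from the field sizes listed above, and the short Python sketch below reproduces the arithmetic. The names `values_per_row` and `detect_variant` are hypothetical helpers for illustration; the second one simply guesses which file variant a row belongs to from its value count.

```python
# Field sizes taken from the FILE FORMAT description.
META_VALUES = 1 + 1 + 1 + 28 * 28 + 1   # dataset, origin, digit_event, pixels, timestamp
SAMPLES = 2 * 256                        # 2 seconds at 256 Hz
AUX_CHANNELS = 3 + 3 + 3                 # PPG, accelerometer, gyroscope
EEG_CHANNELS = {"Muse2": 4, "Muse2 Cut3": 3, "Muse2 Cut2": 2}


def values_per_row(variant):
    """Number of comma-separated values in one row of the given variant."""
    return META_VALUES + (EEG_CHANNELS[variant] + AUX_CHANNELS) * SAMPLES


def detect_variant(n_values):
    """Return the variant whose row length matches, or None if none does."""
    for variant in EEG_CHANNELS:
        if values_per_row(variant) == n_values:
            return variant
    return None


for variant in EEG_CHANNELS:
    print(variant, values_per_row(variant))
# Muse2 7444
# Muse2 Cut3 6932
# Muse2 Cut2 6420
```

Checking the value count of each row before parsing is a cheap way to catch a truncated download or a file of the wrong variant.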
There are no headers in the files.
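Since the files have no header row, the column names have to be supplied by the reader. The following Python sketch shows one possible way to turn each row into named fields using only the standard library. The names `parse_row`, `iter_records` and the channel labels are assumptions made for this example, the image is reshaped assuming the usual row-by-row MNIST pixel order, and the timestamp is kept as text because its exact format is only described as "Unix-like".

```python
import csv

N_PIXELS = 28 * 28
N_SAMPLES = 2 * 256                      # 2 seconds at 256 Hz
EEG_NAMES = ("TP9", "AF7", "AF8", "TP10")
# Hypothetical labels for the PPG, accelerometer and gyroscope blocks.
AUX_NAMES = ("PPG1", "PPG2", "PPG3",
             "AccX", "AccY", "AccZ",
             "GyroX", "GyroY", "GyroZ")


def parse_row(fields, eeg_names=EEG_NAMES):
    """Split one row (a list of strings) into named fields."""
    channels = tuple(eeg_names) + AUX_NAMES
    expected = 4 + N_PIXELS + len(channels) * N_SAMPLES
    if len(fields) != expected:
        raise ValueError(f"expected {expected} values, got {len(fields)}")
    record = {
        "dataset": fields[0].strip(),    # "TRAIN" or "TEST"
        "origin": int(fields[1]),        # MNIST index, or -1 for black signal
        "digit_event": int(fields[2]),   # label 0-9, or -1 for black signal
    }
    pixels = [int(v) for v in fields[3:3 + N_PIXELS]]
    # Assumption: pixels are flattened row by row, as in MNIST.
    record["image"] = [pixels[r * 28:(r + 1) * 28] for r in range(28)]
    pos = 3 + N_PIXELS
    record["timestamp"] = fields[pos].strip()   # kept as text, format not specified
    pos += 1
    signals = {}
    for name in channels:
        signals[name] = [float(v) for v in fields[pos:pos + N_SAMPLES]]
        pos += N_SAMPLES
    record["signals"] = signals
    return record


def iter_records(lines, eeg_names=EEG_NAMES):
    """Yield one parsed record per non-empty row from an iterable of text lines."""
    for fields in csv.reader(lines):
        if fields:
            yield parse_row(fields, eeg_names)
```

An open file object can be passed as `lines`, so rows are parsed one at a time without loading the whole file. For the Cut files, pass `eeg_names=("TP9", "TP10")` or `eeg_names=("TP9", "AF7", "TP10")`.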
RELATED RESEARCH, CITATIONS & RESULTS by 3rd parties (Using the previous "MNIST" of brain digits dataset):
- Giving sense to EEG records ( course IFT6390 "machine learning" by Pascal Vincent from MILA) by Amin Shahab, Marc Sayn-Urpar, René Doumbouya, Thomas George & Vincent Antaki.
- Contribution aux décompositions rapides des matrices et tenseurs, Viet-Dung NGUYEN THÈSE UNIVERSITÉ D’ORLÉANS Nov-16th-2016
- Fast learning of scale-free networks based on Cholesky factorization, Vladislav Jelisavčić, Ivan Stojkovic, Veljko Milutinovic, Zoran Obradovic May-2018
- STRUCTURED LEARNING FROM BIG DATA BASED ON PROBABILISTIC GRAPHICAL MODELS, Vladislav Jelisavčić UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING May-2018
- Combination of Wavelet and MLP Neural Network for Emotion Recognition System, Phuong Huy Nguyen, Thai Nguyen University of Technology (TNUT) & Thu May Duong, Thi Mai Thuong Duong, Thu Huong Nguyen University of Information and Communication Technology Vietnam Nov-2018
- A Deep Evolutionary Approach to Bioinspired Classifier Optimisation for Brain-Machine Interaction, Jordan J. Bird, Diego R. Faria, Luis J. Manso, Anikó Ekárt, and Christopher D. Buckingham, School of Engineering and Applied Science, Aston University, Birmingham, UK Mar-2019
- Novel joint algorithm based on EEG in complex scenarios, Dongwei Chen, Weiqi Yang, Rui Miao, Lan Huang, Liu Zhang, Chunjian Deng & Na Han School of Business, Beijing Institute of Technology, Zhuhai, China Aug-2019
- HHHFL: Hierarchical Heterogeneous Horizontal Federated Learning for Electroencephalography, Dashan Gao, Ce Ju, Xiguang Wei, Yang Liu, Tianjian Chen and Qiang Yan, Hong Kong University of Science and Technology, AI Lab, WeBank Co. Ltd. Sep-2019
- Universal EEG Encoder for Learning Diverse Intelligent Tasks, Baani Leen Kaur Jolly, Palash Aggrawal, Surabhi S Nath, Viresh Gupta, Manraj Singh Grover, Rajiv Ratn Shah, MIDAS Lab, IIIT-Delhi Nov-2019
- Stanford CS230 - Group Project Final Report, Roman Pinchuk and Will Ross 2020
- Mental State Recognition and Recommendation of Aids to Stabilize the Mind Using Wearable EEG, M.W.A. Aruni Wijesuriya, University of Colombo School of Computing 2020
- Generating the image viewed from EEG signals, Gaffari ÇELİK, Muhammed Fatih 2020
- EEG-Based Emotion Classification for Alzheimer’s Disease Patients Using Conventional Machine Learning and Recurrent Neural Network Models, Mahima Chaudhary, Sumona Mukhopadhyay, Marin Litoiu, Lauren E Sergio, Meaghan S Adams Aug-2020
- Understanding Brain Dynamics for Color Perception Using Wearable EEG Headband,Jungryul Seo, Teemu H. Laine, Gyuhwan Oh, Kyung-Ah Sohn Dec-2020
- Frequency Band and PCA Feature Comparison for EEG Signal Classification, Wayan Pio Pratama, Made Windu Antara Kesiman, Gede Aris Gunadi, Apr-2021
- Toward lightweight fusion of AI logic and EEG sensors to enable ultra edge-based EEG analytics on IoT devices, Tazrin Tahrat, May-2021
- Deep Learning in EEG: Advance of the Last Ten-Year Critical Period, Shu Gong, Kaibo Xing, Andrzej Cichocki, Junhua Li May-2021
- Convolutional Neural Network-Based Visually Evoked EEG Classification Model on MindBigData, Nandini Kumari, Shamama Anwar, Vandana Bhattacharjee Jun-2021
- Visual Brain Decoding for Short Duration EEG Signals, Rahul Mishra, Krishan Sharma, Arnav Bhavsar Aug-2021
- Quality analysis for reliable complex multiclass neuroscience signal classification via electroencephalography, Ashutosh Shankhdhar, Pawan Kumar Verma, Prateek Agrawal, Vishu Madaan, Charu Gupta Jan-2022
RELATED RESEARCH, CITATIONS & RESULTS by 3rd parties (Using the previous "IMAGENET" of the brain dataset):
- Inferencia de la Topologia de Grafs, Tura Gimeno Sabater, UPC 2020
- Understanding Brain Dynamics for Color Perception using Wearable EEG headband, Mahima Chaudhary, Sumona Mukhopadhyay, Marin Litoiu, Lauren E Sergio, Meaghan S Adams York University, Toronto, Canada 2020
- Developing a Data Visualization Tool for the Evaluation Process of a Graphical User Authentication System, Loizos Siakallis, UNIVERSITY OF CYPRUS USA 2020
- Object classification from randomized EEG trials, Hamad Ahmed, Ronnie B Wilbur, Hari M Bharadwaj and Jeffrey Mark, Purdue University USA 2020
Contact us if you need any more info.
February 2nd 2022
David Vivancos
vivancos@vivancos.com
This MindBigData The Visual "MNIST" of Brain Digits is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/