SAND

Speech Analysis for Neurodegenerative Diseases challenge

An IEEE ICASSP 2026 SP grand challenge


4-8 May, 2026
Barcelona, Spain

About
...

Why do we need SAND?

This challenge stems from the need for noninvasive, objective, and scalable biomarkers, such as speech signals, for the early diagnosis and longitudinal monitoring of patients suffering from neurodegenerative diseases. Diseases such as Amyotrophic Lateral Sclerosis (ALS) present complex diagnostic challenges due to heterogeneous symptom profiles and overlapping clinical features.

...

Current limitations

Current diagnostic tools are largely based on subjective clinical scales and often fail to detect early changes, resulting in delayed intervention and suboptimal care for patients. This underscores the urgent need to use noninvasive biomarkers.

...

Be part of the change!

With this challenge, we would like to redefine neurodegenerative disease assessment by positioning speech as a central, AI-powered biomarker for diagnosis and monitoring. We invite you to participate in the SAND challenge with your contribution. The five highest-ranked teams will be invited to showcase their work at IEEE ICASSP 2026. Accepted contributions will be published in the official IEEE ICASSP proceedings (IEEE-indexed). The challenge-dedicated session will highlight presentations from the top-performing participants and conclude with a panel discussion. Don’t miss this engaging event!

Important Dates

All specified dates and times are based on the Italian time zone, which is UTC+1 (Central European Time) during Standard Time and UTC+2 (Central European Summer Time) during Daylight Saving Time.

  • September 01, 2025: Challenge Registration opens and Release of the datasets (training)
  • October 01, 2025: Release of the datasets (testing) - first day for submitting results and code
  • November 20, 2025: Last day to submit; challenge closes at 11:59 p.m. UTC+1 on Thursday, November 20, 2025
  • December 05, 2025: Results announcement
  • January 07, 2026: 2-page Papers Due (by invitation only)
  • January 21, 2026: 2-page Paper Acceptance Notification
  • January 28, 2026: Camera-ready 2-page Papers Due

Ranking 1

Task 1 - Rankings
Rank Team Name F1 Score
1° TUKE 🥇 0.6079
2° UTL 🥈 0.6005
3° PRIME 🥉 0.5945
4° RGTRGT 0.5849
5° CCNYNEURO 0.5813
6° OHTSUKI 0.5796
7° UTAUSTIN 0.5768
8° SARWANALI 0.5613
9° PASSIONAI 0.5558
10° AURA 0.5437
11° TKB 0.5430
12° SLEEPERS 0.5320
13° GTIUNISS 0.5301
14° ISDS 0.5116
15° PATHOLOGICALSPEECH 0.4820
16° MOCHA 0.4815
17° QLN 0.4804
18° CAB 0.4794
19° SSS 0.4790
20° MBS 0.4751
21° UOS 0.4413
22° SMARTVOICE 0.4231
23° SAGI 0.4175
24° SPAGHETTIINHALERS 0.4159
25° CASALAB 0.4137
26° THR 0.4073
27° CLT 0.3937
28° CAU 0.3937
29° SMTIH 0.3917
30° BPGC 0.3813
31° PHOFI 0.3793
32° AICV 0.3757
33° UHL 0.3725
34° DSPLABMARIBOR 0.3645
35° IITPATNA 0.3634
36° TEAMTAG 0.3629
37° CSCU 0.3623
38° GTMN 0.3596
39° JLEE 0.3398
40° GTMUVIGO 0.3353
41° TSY 0.3078
42° STAR 0.2969
43° ECHOPATH 0.2870
44° IMATI 0.2629
45° GISPHEU 0.2404
46° FBK 0.2025
47° WTB 0.1996
48° VAPMR 0.1888
49° TCSSPEECH 0.1362
50° MACEWANVOICES 0.0992
51° ALWAYSMAKEIMPACT 0.0656
52° NETSENSE 0.0564

Ranking 2

Task 2 - Rankings
Rank Team Name F1 Score
1° ISDS 🥇 0.5794
2° OHTSUKI 🥈 0.5637
3° JLEE 0.5612
4° CAU 0.5437
5° SPAGHETTIINHALERS 0.5401
6° PATHOLOGICALSPEECH 0.5278
7° ECHOPATH 0.4994
8° CAB 0.4870
9° CLT 0.4791
10° AICV 0.4552
11° CCNYNEURO 0.4408
12° SMARTVOICE 0.4294
13° AURA 0.4185
14° KIE 0.4097
15° HNDX 0.3879
16° PASSIONAI 0.3815
17° SARWANALI 0.3795
18° ARCOLAB 0.3728
19° TKB 0.3673
20° MOCHA 0.3667
21° MBS 0.3207
22° TEAMTAG 0.3069

Guidelines for participants

  • Participants may submit up to three predictions per task, with only the final submission being evaluated.
  • Submissions may be made to either one task or both. Each submission must include a short (2-page) description of the methodology used. Although not mandatory at this stage, it is strongly recommended that you draft the 2-page paper using the Word or LaTeX template provided in the IEEE ICASSP PaperKit 2026.
  • Each participant may only be part of one team.
  • Each team must register to participate in this challenge.
  • During registration, please check the entire list of team members carefully, because no modifications will be allowed later. You will not be able to add members to the 2-page description paper, which will be published in the IEEE ICASSP proceedings if it is accepted.
  • Each team can download the training dataset from the dashboard.
  • At the appropriate time (see the Important Dates), each team can download the testing dataset from the dashboard and use it to obtain the results to be submitted.
  • Test data must not be used during training. All parameter tuning should be performed using the training set, from which validation sets may be created.
  • Each team can submit its results from the dashboard.
  • Submissions that are incomplete or contain missing entries will be excluded from evaluation.
  • The final rankings will be published on this platform, and the winners will be notified.
  • Rankings will be based on the best achieved performance for each individual task.
  • To determine the winning method(s), the teams behind the top 3 highest-performing submissions for each task must provide an executable file (or a notebook) so that the classification/prediction results can be reproduced. We strongly encourage all teams to release their code publicly on their profiles.
  • The five top teams will be selected, taking into account the distribution of participants across tasks: the top 2 teams from Task 1 and the top 2 from Task 2 will be chosen, with the fifth slot going to the third-ranked team in the task with the highest number of submissions.
  • The five highest-ranked teams will be invited to submit a 2-page paper to be presented at IEEE ICASSP 2026.
    The 2-page paper must be formatted using the Word or LaTeX template provided in the IEEE ICASSP PaperKit 2026 and submitted by the camera-ready deadline.
  • Our baseline implementation includes predefined validation sets (20% of the training set), which may be used during training. Details of the composition of our baseline training and validation sets can be found in the Dataset section.
  • The use of external resources (training data and/or pretrained models) is allowed if the data or pre-trained models are publicly and freely accessible, all resources are available before the challenge begins, and any dataset or model used is properly cited in the 2-page description of the methodology.
  • Models developed using private data are not allowed.

Evaluation Criteria

The performance of submitted models will be assessed based on the Averaged F1-score, calculated using the held-out-trials test set. For each task, this metric is calculated as:

$$ \text{Avg. F1-Score} = \frac{1}{|C|} \sum_{c \in C} \frac{TP_c}{TP_c + \frac{1}{2}(FP_c + FN_c) } $$

where \(TP_c\), \(FP_c\), and \(FN_c\) are the numbers of true positives, false positives, and false negatives for class \(c\), while \(|C|\) is \(5\) for Task 1 and \(4\) for Task 2.
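For reference, this averaged (macro) F1-score can be computed with standard tooling. The snippet below is a minimal sketch, assuming the ground-truth and predicted labels are available as integer class lists; the example label values are hypothetical.

```python
# Minimal sketch of the evaluation metric. The label lists are hypothetical;
# for Task 2 use labels=[1, 2, 3, 4] instead.
from sklearn.metrics import f1_score

y_true = [5, 3, 4, 1, 2, 5, 4]   # hypothetical ground-truth classes (Task 1)
y_pred = [5, 3, 3, 1, 2, 4, 4]   # hypothetical model predictions

# Macro averaging computes F1 per class and averages over the |C| classes,
# matching the Avg. F1-Score formula above.
avg_f1 = f1_score(y_true, y_pred, labels=[1, 2, 3, 4, 5], average="macro")
print(f"Averaged F1-score: {avg_f1:.4f}")
```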

Final rankings will be based on the Avg. F1-score, with higher scores indicating better classification/prediction performance. We chose this metric because it is robust to class imbalance. In case of a tie, the originality of the proposed approach will serve as an additional evaluation criterion. The final decision will rest with the organizers. Participants may submit results for either one or both tasks. The top five teams will be selected based on performance and the overall distribution of submissions across the tasks (see below for clarification).
Note: The held-out-trials test sets for Task 1 and Task 2 differ slightly. Please ensure you use the correct xlsx file associated with the task you are participating in.

Dataset

The dataset is now available to download from the dashboard. Log in or sign up to access the dashboard and download the dataset.
The dataset contains voice signals acquired from adult subjects in the age range [18 - 90], collected from January 1, 2022, until June 15, 2025. The study was approved by the Ethics Committee of the University Hospital ‘Federico II’ of Naples, Italy (Protocol ID: 100/17/ES01 and 93/2023).
Details of the methodology used to conduct the study are reported in https://doi.org/10.1038/s41597-024-03597-2, while information regarding Vox4Health, the mobile app employed to acquire the voice signals, is reported in https://doi.org/10.1007/978-3-319-40114-0_15.
The medical team from the ALS Center of the University Hospital ‘Federico II’ of Naples carried out the clinical evaluation of the subjects and assigned each an ALSFRS-R score, an integer value from 0 to 4 for ALS patients, or a value of 5 for the healthy subjects. We specify that no subjects with ALSFRS-R equal to 0 (zero) are present in the dataset.
The collection consists of 2712 voice signals (related to different speech tasks) recorded from 339 Italian speakers:
  • 205 ALS patients (121 males and 84 females) with different severity of dysarthria.
  • 134 healthy subjects (72 males and 62 females).
The gender distribution of patients in the two tasks is the following. For task 1, there are 119 women and 154 men in the training set, while there are 27 women and 40 men in the test set. For task 2, there are 51 women and 81 men in the training set, while there are 14 women and 19 men in the test set.
All signals were recorded with Vox4Health, an m-health app on a smartphone kept around 20 centimeters away from the patients' mouths; the angle between the mobile and the mouth was around 45 degrees.
For the SAND challenge, the dataset was divided maintaining a balance between age, gender, and severity of dysarthria (ALSFRS-R scale value).
The complete dataset was partitioned into:
  • 80% training set.
  • 20% testing set.
For our experiments with baseline models, we used the training set, which in turn was divided into:
  • 80% for baseline training.
  • 20% for baseline validation.
The baseline scores were calculated with respect to the validation subsets. The subjects used in the training and validation sets during our experiments for the baselines are reported in the data files, i.e., the sand_task_1.xlsx and sand_task_2.xlsx.

Audio files

For each subject, this database contains:
  • Recordings of sustained phonations of each vowel /a/, /e/, /i/, /o/, and /u/ for a minimum of 5 seconds each, ensuring continuous loudness; this results in five wav audio files.
  • Recordings of the subject repeating each of the three syllables /pa/, /ta/, and /ka/ as fast as possible in a single breath; this results in three wav audio files.
All the recordings were gathered at a sampling frequency of 8000 Hz and a 16-bit resolution. They took place under conditions of quietness (< 30 dB of background noise), dryness (humidity rate of about 35-40%), and absence of emotional and physiological stress.
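As an illustration, a recording can be inspected with standard Python audio tooling. The sketch below loads one wav file (the file path is hypothetical) and checks it against the stated 8000 Hz format.

```python
# Minimal sketch: load one recording and verify the stated format.
# The file path is hypothetical; adapt it to your local copy of the dataset.
import soundfile as sf

audio, sr = sf.read("task1/training/phonationA/ID000_phonationA.wav")

assert sr == 8000, f"unexpected sampling rate: {sr}"
duration_s = len(audio) / sr      # vowel phonations should last at least 5 s
print(f"{duration_s:.2f} s, {len(audio)} samples")
```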

Training data files

The training data files are sand_task_1.xlsx and sand_task_2.xlsx, for Task 1 and Task 2 respectively.
In detail, these two xlsx files contain the following sheets:
  • SAND - TRAINING set - Task X contains the list of subjects included in the challenge training set.
  • Training Baseline - Task X contains the list of subjects included in the training set used by us to perform the baseline.
  • Validation Baseline - Task X contains the list of subjects included in the validation set used by us to evaluate our baseline
Where X is 1 or 2 depending on the task of interest.

The sheets of the xlsx files contain several columns: features that can be used as input values and target labels to be used as Ground Truth during the training phase.
The columns contained in each sheet vary depending on the task. For task 1, each sheet of sand_task_1.xlsx contains the following columns:
  • ID: The identifier of each subject, which corresponds to the prefix of each audio file related to that specific subject.
  • Age: The age of the subject.
  • Sex: The gender of the subject.
  • Class: Ground Truth label, which represents the class of the subject. For an ALS subject, it corresponds to the severity of the patient's voice disorder (dysarthria), i.e., the ALSFRS-R scale score, which can have a value from 1 to 4; for a healthy subject, it is equal to 5.
While for task2, each sheet of sand_task_2.xlsx contains the following columns:
  • ID: The identifier of each subject, which corresponds to the prefix of each audio file related to that specific subject.
  • Age: The age of the subject.
  • Sex: The gender of the subject.
  • Months: The number of months between the first and the last assessment.
  • ALSFRS-R_start: Severity of the patient's voice disorder (dysarthria) at the first assessment, i.e., the ALSFRS-R scale score, which can have a value from 1 to 4.
  • ALSFRS-R_end: Ground Truth label, the severity of the patient's voice disorder (dysarthria) at the last assessment, i.e., the ALSFRS-R scale score at that time, which can have a value from 1 to 4.
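As a minimal sketch (not part of the official challenge kit), the sheets can be read with pandas and paired with the corresponding audio paths. Sheet and column names below are taken from the description above and should be adjusted if they differ in your copy of the files.

```python
# Minimal sketch: read the Task 1 training sheets and build audio paths.
# Sheet/column names follow the description above; reading xlsx requires openpyxl.
import pandas as pd

xlsx = "sand_task_1.xlsx"
train_df = pd.read_excel(xlsx, sheet_name="SAND - TRAINING set - Task 1")
base_train = pd.read_excel(xlsx, sheet_name="Training Baseline - Task 1")
base_val = pd.read_excel(xlsx, sheet_name="Validation Baseline - Task 1")

def audio_path(subject_id, speech_task="phonationA", split="training"):
    """Illustrative helper: path of one speech-task recording for a subject."""
    return f"task1/{split}/{speech_task}/{subject_id}_{speech_task}.wav"

train_df["phonationA_path"] = train_df["ID"].apply(audio_path)
print(train_df[["ID", "Age", "Sex", "Class", "phonationA_path"]].head())
```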

Tree of files and directories (Train Dataset)

Our Train Dataset is structured as follows:
  • taskX
    • training
      • phonationA
        • ID000_phonationA.wav
        • ID001_phonationA.wav
        • ...
        • ID338_phonationA.wav
      • phonationE
        • ID000_phonationE.wav
        • ID001_phonationE.wav
        • ...
        • ID338_phonationE.wav
      • phonationI
        • ID000_phonationI.wav
        • ID001_phonationI.wav
        • ...
        • ID338_phonationI.wav
      • phonationO
        • ID000_phonationO.wav
        • ID001_phonationO.wav
        • ...
        • ID338_phonationO.wav
      • phonationU
        • ID000_phonationU.wav
        • ID001_phonationU.wav
        • ...
        • ID338_phonationU.wav
      • rhythmKA
        • ID000_rhythmKA.wav
        • ID001_rhythmKA.wav
        • ...
        • ID338_rhythmKA.wav
      • rhythmPA
        • ID000_rhythmPA.wav
        • ID001_rhythmPA.wav
        • ...
        • ID338_rhythmPA.wav
      • rhythmTA
        • ID000_rhythmTA.wav
        • ID001_rhythmTA.wav
        • ...
        • ID338_rhythmTA.wav
    • sand_task_X.xlsx
In this case too, X is 1 or 2 depending on the task of interest.

Testing data files

The testing data files are sand_task_1_test.xlsx and sand_task_2_test.xlsx, for Task 1 and Task 2 respectively.
The file structure is similar to that of the training dataset, except that in these xlsx test files the class column containing the ground-truth labels is empty, because each team must estimate the class using their own method.
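As an example only, one way to prepare results is to fill the empty class column of the test file with your model's predictions and save the result. The prediction function below is a placeholder, and the exact submission format required on the dashboard takes precedence over this sketch.

```python
# Illustrative sketch: write predictions into the empty Class column of the
# Task 1 test file. predict_class is a placeholder for your trained model;
# follow the dashboard instructions for the exact required submission format.
import pandas as pd

test_df = pd.read_excel("sand_task_1_test.xlsx")

def predict_class(subject_id):
    return 5    # placeholder: replace with your model's prediction (1-5)

test_df["Class"] = test_df["ID"].apply(predict_class)
test_df.to_excel("sand_task_1_test_predictions.xlsx", index=False)
```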

Tree of files and directories (Test Dataset)

Our Test Dataset is structured as follows:
  • taskX
    • test
      • phonationA
        • ID011_phonationA.wav
        • ID014_phonationA.wav
        • ...
        • ID334_phonationA.wav
      • phonationE
        • ID011_phonationE.wav
        • ID014_phonationE.wav
        • ...
        • ID334_phonationE.wav
      • phonationI
        • ID011_phonationI.wav
        • ID014_phonationI.wav
        • ...
        • ID334_phonationI.wav
      • phonationO
        • ID011_phonationO.wav
        • ID014_phonationO.wav
        • ...
        • ID334_phonationO.wav
      • phonationU
        • ID011_phonationU.wav
        • ID014_phonationU.wav
        • ...
        • ID334_phonationU.wav
      • rhythmKA
        • ID011_rhythmKA.wav
        • ID014_rhythmKA.wav
        • ...
        • ID334_rhythmKA.wav
      • rhythmPA
        • ID011_rhythmPA.wav
        • ID014_rhythmPA.wav
        • ...
        • ID334_rhythmPA.wav
      • rhythmTA
        • ID011_rhythmTA.wav
        • ID014_rhythmTA.wav
        • ...
        • ID334_rhythmTA.wav
    • sand_task_X_test.xlsx
In this case too, X is 1 or 2 depending on the task of interest.

Task 1 - Multi-Class Classification

Multi-class classification at time 0 is proposed to identify the most reliable approach for correctly detecting and classifying the severity of voice disorders (dysarthria) from the audio signals, among the following five classes:
  • ALS with Severe dysarthria (Class 1)
  • ALS with Moderate dysarthria (Class 2)
  • ALS with Mild dysarthria (Class 3)
  • ALS with no dysarthria (Class 4)
  • Healthy (Class 5)
The distribution of classes for the training set is the following:
  • Class 1: 2.2%
  • Class 2: 9.55%
  • Class 3: 20.95%
  • Class 4: 27.94%
  • Class 5: 39.33%

Baseline

We use a Vision Transformer (ViT) model as a reference model and obtain an Averaged F1-score of 0.606 on the validation dataset derived from the training dataset.
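The official baseline code is not reproduced here. As an illustration only, a common way to feed the recordings to a ViT is to turn each one into a log-mel spectrogram image and fine-tune a pretrained model on it. The sketch below shows this feature-extraction step under assumed hyperparameters; the library choices and parameter values are not the organisers' settings.

```python
# Illustrative sketch (not the official baseline): log-mel spectrogram
# features for a ViT-style classifier. Hyperparameters are assumptions.
import librosa
import numpy as np

def logmel_image(wav_path, sr=8000, n_mels=128):
    """Load a recording and return a log-mel spectrogram as a 2-D array."""
    audio, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=n_mels)
    logmel = librosa.power_to_db(mel, ref=np.max)
    # Normalise to [0, 1] so the array can be treated as an image channel.
    return (logmel - logmel.min()) / (logmel.max() - logmel.min() + 1e-8)

# Such images (optionally stacked across the eight speech tasks per subject)
# can then be resized to the ViT input resolution, e.g. 224x224, and used to
# fine-tune a pretrained Vision Transformer for the 5-class problem.
```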

Task 2 - Prediction

Task 2 is proposed to predict the ALSFRS-R scale of the patient assessed during the last follow-up visit. For this task, four classes are considered:
  • ALS with Severe dysarthria (Class 1)
  • ALS with Moderate dysarthria (Class 2)
  • ALS with Mild dysarthria (Class 3)
  • ALS with no dysarthria (Class 4)
The distribution of classes for the training set is the following:
  • Class 1: 13.64%
  • Class 2: 21.97%
  • Class 3: 28.79%
  • Class 4: 35.60%
The aim is to accurately predict how the condition will progress, i.e., predict the ALSFRS-R scale value of the patient assessed during the last follow-up visit. This will enable early intervention and better patient care.

Baseline

We use a Partial Decision Tree (PART) algorithm as our baseline model and obtain an Averaged F1-Score of 0.583 on the validation dataset obtained from the training dataset.
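PART is a rule-learning algorithm (available, for example, in Weka). As a rough Python stand-in, not the organisers' implementation, a simple decision tree on the tabular columns of sand_task_2.xlsx illustrates this kind of baseline; sheet and column names are assumed from the Training data files section.

```python
# Rough stand-in for a rule/tree-based Task 2 baseline (not the official
# PART implementation). Sheet/column names are assumed from this page.
import pandas as pd
from sklearn.metrics import f1_score
from sklearn.tree import DecisionTreeClassifier

df = pd.read_excel("sand_task_2.xlsx", sheet_name="SAND - TRAINING set - Task 2")

X = df[["Age", "Months", "ALSFRS-R_start"]].copy()
X["Sex"] = pd.factorize(df["Sex"])[0]      # simple categorical encoding
y = df["ALSFRS-R_end"]                     # target: score at last assessment

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
print("Training Avg. F1:", f1_score(y, clf.predict(X), average="macro"))
```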

Organisers

Computer Science Team

Giovanna Sannino

ICAR-CNR

Ivanoe De Falco

ICAR-CNR

Nadia Brancati

ICAR-CNR

Laura Verde

University of Campania “Luigi Vanvitelli”

Maria Frucci

ICAR-CNR

Daniel Riccio

University of Naples “Federico II”

Vincenzo Bevilacqua

University of Naples “Federico II” and ICAR-CNR

Antonio Di Marino

University of Naples “Federico II” and ICAR-CNR

Clinical Team

Raffaele Dubbioso

University of Naples “Federico II”

Lucia Aruta

University of Naples “Federico II”

Valentina Virginia Iuzzolino

University of Naples “Federico II”

Gianmaria Senerchia

University of Naples “Federico II”

Myriam Spisto

University of Campania “Luigi Vanvitelli”



Contact

If you have any questions or concerns, please contact us at sand@icar.cnr.it.