
Post-session retrieval and BIDS

Within 48h after the FIRST session

Anatomical images must be screened for incidental findings within 48h after the first session

  • Send the T1-weighted and T2-weighted scans to ███ for screening of incidental findings.
  • Indicate on our recruits spreadsheet that the participant's first session has been submitted for screening.
  • Wait for the response from ███ and note the result of the screening in our recruits spreadsheet.

To do so, you'll need to first download the data from PACS and then convert the data into BIDS.

What to do when there are incidental findings

  • Discuss with ███ how to proceed with the participant.
  • Exclude the participant from the study if ███ determines that they do not meet the participation (inclusion and exclusion) criteria.

Within one week after the completed session

Download the data from the PACS with PACSMAN (only authorized users)

  • Log in to the PACSMAN computer (███)
  • Mount a remote filesystem through sshfs:

    sshfs <hostname>:/data/datasets/hcph-pilot-sourcedata \
          $HOME/data/hcph-pilot \
          <args>
    
  • Edit the query file (e.g., vim $HOME/queries/last-session.csv); most likely, you only need to update the session's date:

    mydata-onesession.csv
    PatientID,StudyDate
    2022_11_07*,20230503
    
  • Prepare and run PACSMAN, pointing the output to the mounted directory.

    pacsman --save -q $HOME/queries/last-session.csv \
            --out_directory $HOME/data/hcph-pilot/ \
            --config /opt/PACSMAN/files/config.json
    
  • Unmount the remote filesystem:

    sudo umount $HOME/data/hcph-pilot
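
The query file edited in the steps above usually changes only in its StudyDate from one session to the next, so its contents can also be generated programmatically. The following is a minimal sketch and not part of PACSMAN: the build_query helper and its arguments are hypothetical, and the two-column layout (PatientID, StudyDate) is copied from the example query file.

```python
import csv
import io
from datetime import date


def build_query(patient_id: str, study_date: date) -> str:
    """Return the contents of a one-session query file as CSV text.

    Hypothetical helper: the column names mirror the example query file.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PatientID", "StudyDate"])
    # StudyDate uses the DICOM date format YYYYMMDD, as in the example file
    writer.writerow([patient_id, study_date.strftime("%Y%m%d")])
    return buffer.getvalue()


if __name__ == "__main__":
    # Values copied from the example query file; adapt the date to the session.
    print(build_query("2022_11_07*", date(2023, 5, 3)), end="")
```

Redirecting the printed text to $HOME/queries/last-session.csv would reproduce the example file; review the result before running PACSMAN.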
    

Retrieve physiological recordings

  • Check that the AcqKnowledge file(s) corresponding to the session were added to the Dropbox shared folder and completely uploaded from ████.
  • Check that PsychoPy's logs and the eye tracker's (ET) .EDF files corresponding to the session were added to the Dropbox shared folder and completely uploaded from ████.

Within two weeks after the completed session

Convert imaging data to BIDS with HeuDiConv

We use HeuDiConv to convert the DICOM files generated by the scanner into BIDS. In addition, starting with pilot session five, we follow ReproIn conventions. To support backward compatibility (and some extra features that the original heuristic file does not currently support), we maintain our own heuristic file.

Our custom heuristic file

Our heuristic file largely derives from ReproIn's at the time of writing. The heuristic has a protocols2fix: dict[str | re.Pattern[str], list[tuple[str, str]]] variable (lines 113-148), which holds the replacement patterns that permit backward compatibility.

# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
#
# Copyright 2023 The Axon Lab <theaxonlab@gmail.com>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# We support and encourage derived works from this project, please read
# about our expectations at
#
#     https://www.nipreps.org/community/licensing/
#
# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,
# and this file has been changed.
# The original file this work derives from is found at:
# https://github.com/nipy/heudiconv/blob/55524168b02519bbf0a3a1c94cafb29a419728a0/heudiconv/heuristics/reproin.py
#
# ORIGINAL WORK'S ATTRIBUTION NOTICE:
#
#     Copyright [2014-2019] [Heudiconv developers]
#
#     Licensed under the Apache License, Version 2.0 (the "License");
#     you may not use this file except in compliance with the License.
#     You may obtain a copy of the License at
#
#         http://www.apache.org/licenses/LICENSE-2.0
#
#     Unless required by applicable law or agreed to in writing, software
#     distributed under the License is distributed on an "AS IS" BASIS,
#     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#     See the License for the specific language governing permissions and
#     limitations under the License.
"""Reproin heuristic."""

from __future__ import annotations

from warnings import warn
from collections import Counter
import logging
import re

import pydicom as dcm

from heudiconv.utils import SeqInfo
from heudiconv.heuristics.reproin import (
    _apply_substitutions,
    get_study_hash,
    get_study_description,
)

lgr = logging.getLogger("heudiconv")


DWI_RES = {
    "1.6mm-iso": "highres",
    "2mm-iso": "lowres",
}


IGNORE_PROTOCOLS = (
    "DEV",
    "LABEL",
    "REPORT",
    "ADC",
    "TRACEW",
    "FA",
    "ColFA",
    "B0",
    "TENSOR",
    "10meas",  # dismiss a trial of fmap acquisition
    "testJB",  # dismiss a test trial of the cmrr sequence
)

bids_regex = re.compile(r"_(?=(dir|acq|task|run)-([A-Za-z0-9]+))")


# Terminology to harmonise and use to name variables etc
# experiment
#  subject
#   [session]
#    exam (AKA scanning session) - currently seqinfo, unless brought together from multiple
#     series  (AKA protocol?)
#      - series_spec - deduced from fields the spec (literal value)
#      - series_info - the dictionary with fields parsed from series_spec

# Which fields in seqinfo (in this order) to check for the ReproIn spec
series_spec_fields = ("protocol_name", "series_description")

# dictionary from accession-number to runs that need to be marked as bad
# NOTE: even if filename has number that is 0-padded, internally no padding
# is done
fix_accession2run: dict[str, list[str]] = {
    # e.g.:
    # 'A000035': ['^8-', '^9-'],
}

# A dictionary containing fixes/remapping for sequence names per study.
# Keys are md5sum of study_description from DICOMs, in the form of PI-Experimenter^protocolname
# You can use `heudiconv -f reproin --command ls --files PATH`
# to list the "study hash".
# Values are lists of tuples in the form (regex_pattern, substitution).
# If the key is an empty string (`""`), it applies to any study.
protocols2fix: dict[str | re.Pattern[str], list[tuple[str, str]]] = {
    "": [
        ("t1_mprage_pre_Morpho", "anat-T1w_acq-morphobox__mprage"),
        (
            "micro_struct_137dir_BIPOLAR_b3000_1.6mm-iso",
            "dwi-dwi_acq-highres_dir-unknown__137dir_bipolar",
        ),
        (
            "micro_struct_137dir_BIPOLAR_b3000_1.6mm-iso",
            "dwi-dwi_acq-highres_dir-unknown__137dir_bipolar",
        ),
        (
            "micro_struct_137dir_BIPOLAR_b3000_2mm-iso",
            "dwi-dwi_acq-lowres_dir-unknown__137dir_bipolar",
        ),
        ("gre_field_mapping_1.6mmiso", "fmap-phasediff__gre"),
        (
            "cmrr_mbep2d_bold_me4_sms4_fa75_750meas",
            "func-bold_task-rest__750meas",
        ),
        (
            "cmrr_mbep2d_bold_me4_sms4_fa80",
            "func-bold_task-rest_acq-fa80__cmrr",
        ),
        (
            "cmrr_mbep2d_bold_me4_testJB",
            "func-bold_task-rest_acq-testJB__cmrr",
        ),
        (
            "cmrr_mbep2d_bold_fmap_fa80",
            "fmap-epi_acq-bold_dir-unknown__cmrr_mbepd2d_fa80",
        ),
        ("cmrr_mbep2d_bold_me4_sms4", "func-bold_task-qct__cmrr"),
        ("_task-qc_", "_task-qct_"),
        ("anat-T2w__flair", "anat-FLAIR__spcir"),
        ("AAHead_Scout_.*", "anat-scout"),
        ("_dir_RL", "_dir-RL"),
        ("_dir_LR", "_dir-LR"),
        ("_dir_AP", "_dir-AP"),
        ("_dir_PA", "_dir-PA"),
    ]
    # e.g., QA:
    # '43b67d9139e8c7274578b7451ab21123':
    #     [
    #      ('BOLD_p2_s4_3\.5mm', 'func_task-rest_acq-p2-s4-3.5mm'),
    #      ('BOLD_', 'func_task-rest'),
    #      ('_p2_s4',        '_acq-p2-s4'),
    #      ('_p2', '_acq-p2'),
    #     ],
    # '':  # for any study example with regexes used
    #     [
    #         ('AAHead_Scout_.*', 'anat-scout'),
    #         ('^dti_.*', 'dwi'),
    #         ('^.*_distortion_corr.*_([ap]+)_([12])', r'fmap-epi_dir-\1_run-\2'),
    #         ('^(.+)_ap.*_r(0[0-9])', r'func_task-\1_run-\2'),
    #         ('^t1w_.*', 'anat-T1w'),
    #         # problematic case -- multiple identically named pepolar fieldmap runs
    #         # I guess we will just sacrifice ability to detect canceled runs here.
    #         # And we cannot just use _run+ since it would increment independently
    #         # for ap and then for pa.  We will rely on having ap preceding pa.
    #         # Added  _acq-mb8  so they match the one in funcs
    #         ('func_task-discorr_acq-ap', r'fmap-epi_dir-ap_acq-mb8_run+'),
    #         ('func_task-discorr_acq-pa', r'fmap-epi_dir-pa_acq-mb8_run='),
    # ]
}

# list containing StudyInstanceUID to skip -- hopefully doesn't happen too often
dicoms2skip: list[str] = [
    # e.g.
    # '1.3.12.2.1107.5.2.43.66112.30000016110117002435700000001',
]

DEFAULT_FIELDS = {
    # Let it just be in each json file extracted
    "Acknowledgements": "See README.md.",
}

POPULATE_INTENDED_FOR_OPTS = {
    "matching_parameters": ["ImagingVolume", "Shims"],
    "criterion": "Closest",
}


def filter_dicom(dcmdata: dcm.dataset.Dataset) -> bool:
    """Return True if a DICOM dataset should be filtered out, else False"""
    return dcmdata.StudyInstanceUID in dicoms2skip


def filter_files(_fn: str) -> bool:
    """Return True if a file should be kept, else False.

    Files ending in ``.csv`` or ``.dvs`` are discarded; all others are kept.
    """
    return not _fn.endswith((".csv", ".dvs"))


def fix_canceled_runs(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
    """Function that adds cancelme_ to known bad runs which were forgotten"""
    if not fix_accession2run:
        return seqinfo  # nothing to do
    for i, s in enumerate(seqinfo):
        accession_number = s.accession_number
        if accession_number and accession_number in fix_accession2run:
            lgr.info(
                "Considering some runs possibly marked to be "
                "canceled for accession %s",
                accession_number,
            )
            # This code is reminiscent of prior logic when operating on
            # a single accession, but left as is for now
            badruns = fix_accession2run[accession_number]
            badruns_pattern = "|".join(badruns)
            if re.match(badruns_pattern, s.series_id):
                lgr.info("Fixing bad run %s", s.series_id)
                fixedkwargs = dict()
                for key in series_spec_fields:
                    fixedkwargs[key] = "cancelme_" + getattr(s, key)
                seqinfo[i] = s._replace(**fixedkwargs)
    return seqinfo


def fix_dbic_protocol(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
    """Ad-hoc fixup for existing protocols.

    It will operate in 3 stages on `protocols2fix` records.
    1. consider a record which has md5sum of study_description
    2. apply all substitutions, where key is a regular expression which
       successfully searches (not necessarily matches, so anchor appropriately)
       study_description
    3. apply "catch all" substitutions in the key containing an empty string

    3. is somewhat redundant since `re.compile('.*')` could match any, but is
    kept for simplicity of its specification.
    """

    study_hash = get_study_hash(seqinfo)
    study_description = get_study_description(seqinfo)

    # We will consider first study specific (based on hash)
    if study_hash in protocols2fix:
        _apply_substitutions(
            seqinfo, protocols2fix[study_hash], "study (%s) specific" % study_hash
        )
    # Then go through all regexps returning regex "search" result
    # on study_description
    for sub, substitutions in protocols2fix.items():
        if isinstance(sub, re.Pattern) and sub.search(study_description):
            _apply_substitutions(
                seqinfo, substitutions, "%r regex matching" % sub.pattern
            )
    # and at the end - global
    if "" in protocols2fix:
        _apply_substitutions(seqinfo, protocols2fix[""], "global")

    return seqinfo


def fix_seqinfo(seqinfo: list[SeqInfo]) -> list[SeqInfo]:
    """Just a helper on top of both fixers"""
    # add cancelme to known bad runs
    seqinfo = fix_canceled_runs(seqinfo)
    seqinfo = fix_dbic_protocol(seqinfo)
    return seqinfo


def create_key(template, outtype=("nii.gz",), annotation_classes=None):
    if template is None or not template:
        raise ValueError("Template must be a valid format string")
    return template, outtype, annotation_classes


def infotodict(seqinfo):
    """Heuristic evaluator for determining which runs belong where

    allowed template fields - follow python string module:

    item: index within category
    subject: participant id
    seqitem: run number during scanning
    subindex: sub index within group
    """
    seqinfo = fix_seqinfo(seqinfo)
    lgr.info("Processing %d seqinfo entries", len(seqinfo))

    t1w = create_key(
        "sub-{subject}/{session}/anat/sub-{subject}_{session}_acq-{acquisition}{run_entity}_T1w"
    )
    t2w = create_key(
        "sub-{subject}/{session}/anat/sub-{subject}_{session}_acq-{acquisition}{run_entity}_T2w"
    )
    t2_flair = create_key(
        "sub-{subject}/{session}/anat/sub-{subject}_{session}{run_entity}_FLAIR"
    )
    dwi = create_key(
        "sub-{subject}/{session}/dwi/sub-{subject}_{session}_acq-{acq}_dir-{dir}{run_entity}_dwi"
    )
    mag = create_key(
        "sub-{subject}/{session}/fmap/sub-{subject}_{session}{run_entity}_magnitude"
    )
    phdiff = create_key(
        "sub-{subject}/{session}/fmap/sub-{subject}_{session}{run_entity}_phasediff"
    )
    epi = create_key(
        "sub-{subject}/{session}/fmap/sub-{subject}_{session}"
        "_acq-{acquisition}_dir-{dir}{run_entity}{part_entity}_epi"
    )
    func = create_key(
        "sub-{subject}/{session}/func/sub-{subject}_{session}"
        "_task-{task}{acq_entity}{dir_entity}{run_entity}{part_entity}_bold"
    )
    sbref = create_key(
        "sub-{subject}/{session}/func/sub-{subject}_{session}_task-{task}{run_entity}_sbref"
    )

    info = {
        t1w: [],
        t2w: [],
        t2_flair: [],
        dwi: [],
        mag: [],
        phdiff: [],
        epi: [],
        func: [],
        sbref: [],
    }
    epi_mags = []
    bold_mags = []

    for s in seqinfo:
        """
        The namedtuple `s` contains the following fields:

        * total_files_till_now
        * example_dcm_file
        * series_id
        * dcm_dir_name
        * unspecified2
        * unspecified3
        * dim1
        * dim2
        * dim3
        * dim4
        * TR
        * TE
        * protocol_name
        * is_motion_corrected
        * is_derived
        * patient_id
        * study_description
        * referring_physician_name
        * series_description
        * image_type
        """

        # Ignore derived data and reports
        if (
            s.is_derived == "True"
            or s.is_derived is True
            or s.dcm_dir_name.split("_")[-1] in IGNORE_PROTOCOLS
        ):
            continue

        thisitem = {
            "item": s.series_id,
        }
        thiskey = None
        thisitem.update({k: v for k, v in bids_regex.findall(s.protocol_name)})
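        # Keep the full "_run-<index>" entity so that templates render valid BIDS names
        run_index = thisitem.pop("run", None)
        thisitem["run_entity"] = f"_run-{run_index}" if run_index else ""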

        if s.protocol_name.lower().startswith("anat-t1w"):
            thiskey = t1w
            acquisition_present = thisitem.pop("acq", None)
            thisitem["acquisition"] = (
                ("original" if s.dcm_dir_name.endswith("_ND") else "undistorted")
                if not acquisition_present
                else acquisition_present
            )
        elif s.protocol_name.lower().startswith("anat-t2w"):
            thiskey = t2w
            acquisition_present = thisitem.pop("acq", None)
            thisitem["acquisition"] = (
                ("original" if s.dcm_dir_name.endswith("_ND") else "undistorted")
                if not acquisition_present
                else "unspecified"
            )
        elif s.protocol_name.lower().startswith("anat-flair"):
            thiskey = t2_flair
        elif s.protocol_name.startswith("dwi-dwi"):
            thiskey = dwi
        elif s.protocol_name.startswith("fmap-phasediff"):
            thiskey = phdiff if "P" in s.image_type else mag
        elif s.protocol_name.startswith("fmap-epi"):
            thiskey = epi
            thisitem["part_entity"] = ""
            thisitem["acquisition"] = (
                "b0" if s.sequence_name.endswith("ep_b0") else "bold"
            )

            # Check whether phase was written out:
            # 1. A magnitude needs to exist immediately before in the dicom info
            # 2. Magnitude and phase must have the same number of volumes
            series_id_idx, series_id_name = s.series_id.split("-", 1)
            prev_series_id = f"{int(series_id_idx) - 1}-{series_id_name}-{s.series_files}"
            if prev_series_id in epi_mags:
                thisitem["part_entity"] = "_part-phase"
                info[thiskey][epi_mags.index(prev_series_id)]["part_entity"] = "_part-mag"

            epi_mags.append(f"{s.series_id}-{s.series_files}")

        elif s.protocol_name.startswith("func-bold"):
            # Likely an error
            if s.series_files < 100:
                warn(
                    f"Dropping exceedingly short BOLD file with {s.series_files} time points."
                )
                continue

            thiskey = func

            thisitem["part_entity"] = ""
            # Some functional runs may come with acq
            func_acq = thisitem.pop("acq", None)
            thisitem["acq_entity"] = "" if not func_acq else f"_acq-{func_acq}"

            # Some functional runs may come with dir
            func_dir = thisitem.pop("dir", None)
            thisitem["dir_entity"] = "" if not func_dir else f"_dir-{func_dir}"

            # Check whether phase was written out:
            # 1. A magnitude needs to exist immediately before in the dicom info
            # 2. Magnitude and phase must have the same number of volumes
            series_id_idx, series_id_name = s.series_id.split("-", 1)
            prev_series_id = f"{int(series_id_idx) - 1}-{series_id_name}-{s.series_files}"
            if prev_series_id in bold_mags:
                thisitem["part_entity"] = "_part-phase"
                info[thiskey][bold_mags.index(prev_series_id)]["part_entity"] = "_part-mag"

            bold_mags.append(f"{s.series_id}-{s.series_files}")

        if thiskey is not None:
            info[thiskey].append(thisitem)

    for mod, items in info.items():
        if len(items) < 2:
            continue

        info[mod] = _assign_run_on_repeat(items)

    return info


def _assign_run_on_repeat(modality_items):
    """
    Assign run IDs for repeated inputs for a given modality.

    Examples
    --------
    >>> _assign_run_on_repeat([
    ...     {"item": "discard1", "acq": "bold", "dir": "PA"},
    ...     {"item": "discard2", "acq": "bold", "dir": "AP"},
    ...     {"item": "discard3", "acq": "bold", "dir": "PA"},
    ... ])  # doctest: +NORMALIZE_WHITESPACE
    [{'item': 'discard1', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-1'},
     {'item': 'discard2', 'acq': 'bold', 'dir': 'AP'},
     {'item': 'discard3', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-2'}]

    >>> _assign_run_on_repeat([
    ...     {"item": "discard1", "acq": "bold", "dir": "PA"},
    ...     {"item": "discard2", "acq": "bold", "dir": "AP"},
    ...     {"item": "discard3", "acq": "bold", "dir": "PA"},
    ...     {"item": "discard4", "acq": "bold", "dir": "AP"},
    ... ])  # doctest: +NORMALIZE_WHITESPACE
    [{'item': 'discard1', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-1'},
     {'item': 'discard2', 'acq': 'bold', 'dir': 'AP', 'run_entity': '_run-1'},
     {'item': 'discard3', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-2'},
     {'item': 'discard4', 'acq': 'bold', 'dir': 'AP', 'run_entity': '_run-2'}]

    >>> _assign_run_on_repeat([
    ...     {"item": "discard1", "acq": "bold", "dir": "PA", "run": "1"},
    ...     {"item": "discard2", "acq": "bold", "dir": "AP"},
    ...     {"item": "discard3", "acq": "bold", "dir": "PA", "run": "2"},
    ... ])  # doctest: +NORMALIZE_WHITESPACE
    [{'item': 'discard1', 'acq': 'bold', 'dir': 'PA', 'run': '1'},
     {'item': 'discard2', 'acq': 'bold', 'dir': 'AP'},
     {'item': 'discard3', 'acq': 'bold', 'dir': 'PA', 'run': '2'}]

    >>> _assign_run_on_repeat([
    ...     {"item": "discard1", "acq": "bold", "dir": "PA", "run_entity": "_run-1"},
    ...     {"item": "discard2", "acq": "bold", "dir": "AP"},
    ...     {"item": "discard3", "acq": "bold", "dir": "PA", "run_entity": "_run-2"},
    ... ])  # doctest: +NORMALIZE_WHITESPACE
    [{'item': 'discard1', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-1'},
     {'item': 'discard2', 'acq': 'bold', 'dir': 'AP'},
     {'item': 'discard3', 'acq': 'bold', 'dir': 'PA', 'run_entity': '_run-2'}]

    >>> _assign_run_on_repeat([
    ...     {"item": "discard1", "acq": "bold", "dir": "PA", "part_entity": "_part-mag"},
    ...     {"item": "discard2", "acq": "bold", "dir": "PA", "part_entity": "_part-phase"},
    ...     {"item": "discard3", "acq": "bold", "dir": "AP", "part_entity": "_part-mag"},
    ...     {"item": "discard4", "acq": "bold", "dir": "AP", "part_entity": "_part-phase"},
    ... ])  # doctest: +NORMALIZE_WHITESPACE
    [{'item': 'discard1', 'acq': 'bold', 'dir': 'PA', 'part_entity': '_part-mag'},
     {'item': 'discard2', 'acq': 'bold', 'dir': 'PA', 'part_entity': '_part-phase'},
     {'item': 'discard3', 'acq': 'bold', 'dir': 'AP', 'part_entity': '_part-mag'},
     {'item': 'discard4', 'acq': 'bold', 'dir': 'AP', 'part_entity': '_part-phase'}]

    """
    modality_items = modality_items.copy()

    str_patterns = [
        "_".join([f"{s[0]}-{s[1]}" for s in item.items() if s[0] != "item"])
        for item in modality_items
    ]
    strcount = Counter(str_patterns)

    for string, count in strcount.items():
        if count < 2:
            continue

        runid = 1

        for index, item_string in enumerate(str_patterns):
            if string == item_string:
                modality_items[index].update(
                    {
                        "run_entity": f"_run-{runid}",
                    }
                )
                runid += 1

    return modality_items
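
To make the entity handling in infotodict easier to follow, the short sketch below isolates how bids_regex pulls BIDS entities out of a ReproIn-style protocol name. The parse_entities helper is hypothetical (it does not exist in the heuristic); the regular expression and the protocol names are copied from the heuristic file.

```python
from __future__ import annotations

import re

# Same pattern as in the heuristic: an underscore followed by one of the
# supported entity keys, a dash, and an alphanumeric value.
bids_regex = re.compile(r"_(?=(dir|acq|task|run)-([A-Za-z0-9]+))")


def parse_entities(protocol_name: str) -> dict[str, str]:
    """Map each entity key found in ``protocol_name`` to its value.

    Hypothetical helper wrapping the ``findall`` call used in ``infotodict``.
    """
    return dict(bids_regex.findall(protocol_name))


if __name__ == "__main__":
    # Protocol names taken from the replacement targets in protocols2fix
    print(parse_entities("func-bold_task-rest_acq-fa80__cmrr"))
    print(parse_entities("dwi-dwi_acq-highres_dir-unknown__137dir_bipolar"))
```

In both names, the free-text suffix after the double underscore contains no key-value pair that the pattern recognizes, so it contributes nothing to the parsed entities.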

During piloting, we changed a number of settings

For example, the first four sessions did not follow ReproIn conventions and filenames varied substantially. Please note the protocols2fix variable in our heuristic file, where the backward compatibility is implemented.
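
As an illustration of how these compatibility rules act on a legacy protocol name, here is a minimal sketch. It assumes that each (pattern, replacement) pair is applied in order with re.sub, which is our understanding of what ReproIn's _apply_substitutions does to the protocol name and series description. The remap_protocol_name helper is hypothetical, and SUBSTITUTIONS is only a subset of the entries in protocols2fix.

```python
import re

# Subset of the (regex_pattern, substitution) pairs listed under the
# catch-all "" key of protocols2fix.
SUBSTITUTIONS = [
    ("t1_mprage_pre_Morpho", "anat-T1w_acq-morphobox__mprage"),
    ("gre_field_mapping_1.6mmiso", "fmap-phasediff__gre"),
    ("cmrr_mbep2d_bold_me4_sms4", "func-bold_task-qct__cmrr"),
    ("_task-qc_", "_task-qct_"),
    ("_dir_RL", "_dir-RL"),
]


def remap_protocol_name(name, substitutions=SUBSTITUTIONS):
    """Apply every substitution in order (hypothetical helper).

    Order matters: the output of one rule is the input of the next.
    """
    for pattern, replacement in substitutions:
        name = re.sub(pattern, replacement, name)
    return name


if __name__ == "__main__":
    # Legacy name used in early pilot sessions
    print(remap_protocol_name("t1_mprage_pre_Morpho"))
    # Made-up name, only to show two rules chaining
    print(remap_protocol_name("func-bold_task-qc_dir_RL"))
```

Because rules chain, a broad pattern placed early in the list can change what later patterns see, which is worth keeping in mind when adding new entries.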

  • Run HeuDiConv with our heuristic file <path>/code/heudiconv/reproin.py:

    Executing HeuDiConv
    #!/bin/bash
    heudiconv -s "001" -ss "pilot001" -b -l . -o /data/datasets/hcph/ \
              -f <sops_clone_path>/code/heudiconv/reproin.py \
              --files /data/datasets/hcph-pilot-sourcedata/sub-01/ses-18950702/
    

    The session label (-ss) MUST be updated manually for each session
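
One way to reduce the risk of forgetting to update the session is to build the command from the session index. This is only a sketch: the heudiconv_command helper, its parameters, and its defaults are hypothetical, the flags are the ones used in the command above, and the zero-padded "pilotNNN" label follows the naming seen in this dataset.

```python
import shlex


def heudiconv_command(session_index, source_dir, heuristic,
                      subject="001", outdir="/data/datasets/hcph/"):
    """Return the HeuDiConv call for one pilot session as an argument list."""
    session_label = f"pilot{session_index:03d}"  # e.g., 15 -> "pilot015"
    return [
        "heudiconv",
        "-s", subject,
        "-ss", session_label,
        "-b",
        "-l", ".",
        "-o", outdir,
        "-f", heuristic,
        "--files", source_dir,
    ]


if __name__ == "__main__":
    cmd = heudiconv_command(
        1,
        "/data/datasets/hcph-pilot-sourcedata/sub-01/ses-18950702/",
        "code/heudiconv/reproin.py",  # relative to your clone of the SOPs
    )
    # Print a copy-pastable shell command instead of executing it
    print(shlex.join(cmd))
```

Printing the command rather than running it leaves room to check the session label and the source folder before converting.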

    Example of the dataset organization

    Piloting sessions 15 and 16 look like this:

    ├── ses-pilot015
    │   ├── anat
    │   │   ├── sub-001_ses-pilot015_acq-original_T1w.json
    │   │   ├── sub-001_ses-pilot015_acq-original_T1w.nii.gz
    │   │   ├── sub-001_ses-pilot015_acq-undistorted_T1w.json
    │   │   ├── sub-001_ses-pilot015_acq-undistorted_T1w.nii.gz
    │   │   ├── sub-001_ses-pilot015_T2w.json
    │   │   └── sub-001_ses-pilot015_T2w.nii.gz
    │   ├── dwi
    │   │   ├── sub-001_ses-pilot015_acq-highres_dir-LR_dwi.bval
    │   │   ├── sub-001_ses-pilot015_acq-highres_dir-LR_dwi.bvec
    │   │   ├── sub-001_ses-pilot015_acq-highres_dir-LR_dwi.json
    │   │   └── sub-001_ses-pilot015_acq-highres_dir-LR_dwi.nii.gz
    │   ├── fmap
    │   │   ├── sub-001_ses-pilot015_acq-b0_dir-RL_epi.json
    │   │   ├── sub-001_ses-pilot015_acq-b0_dir-RL_epi.nii.gz
    │   │   ├── sub-001_ses-pilot015_acq-bold_dir-RL_part-mag_epi.json
    │   │   ├── sub-001_ses-pilot015_acq-bold_dir-RL_part-mag_epi.nii.gz
    │   │   ├── sub-001_ses-pilot015_acq-bold_dir-RL_part-phase_epi.json
    │   │   ├── sub-001_ses-pilot015_acq-bold_dir-RL_part-phase_epi.nii.gz
    │   │   ├── sub-001_ses-pilot015_magnitude1.json
    │   │   ├── sub-001_ses-pilot015_magnitude1.nii.gz
    │   │   ├── sub-001_ses-pilot015_magnitude2.json
    │   │   ├── sub-001_ses-pilot015_magnitude2.nii.gz
    │   │   ├── sub-001_ses-pilot015_phasediff.json
    │   │   └── sub-001_ses-pilot015_phasediff.nii.gz
    │   ├── func
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-1_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-1_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-1_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-1_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-2_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-2_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-2_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-2_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-3_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-3_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-3_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-3_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-4_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-4_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-4_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-bht_echo-4_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-bht_part-mag_events.tsv
    │   │   ├── sub-001_ses-pilot015_task-bht_part-phase_events.tsv
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-1_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-1_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-1_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-1_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-2_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-2_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-2_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-2_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-3_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-3_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-3_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-3_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-4_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-4_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-4_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-qct_echo-4_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-qct_part-mag_events.tsv
    │   │   ├── sub-001_ses-pilot015_task-qct_part-phase_events.tsv
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-1_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-1_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-1_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-1_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-2_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-2_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-2_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-2_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-3_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-3_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-3_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-3_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-4_part-mag_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-4_part-mag_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-4_part-phase_bold.json
    │   │   ├── sub-001_ses-pilot015_task-rest_echo-4_part-phase_bold.nii.gz
    │   │   ├── sub-001_ses-pilot015_task-rest_part-mag_events.tsv
    │   │   └── sub-001_ses-pilot015_task-rest_part-phase_events.tsv
    │   └── sub-001_ses-pilot015_scans.tsv
    └── ses-pilot016
        ├── anat
        │   ├── sub-001_ses-pilot016_acq-original_T1w.json
        │   ├── sub-001_ses-pilot016_acq-original_T1w.nii.gz
        │   ├── sub-001_ses-pilot016_acq-undistorted_T1w.json
        │   ├── sub-001_ses-pilot016_acq-undistorted_T1w.nii.gz
        │   ├── sub-001_ses-pilot016_T2w.json
        │   └── sub-001_ses-pilot016_T2w.nii.gz
        ├── dwi
        │   ├── sub-001_ses-pilot016_acq-highres_dir-RL_dwi.bval
        │   ├── sub-001_ses-pilot016_acq-highres_dir-RL_dwi.bvec
        │   ├── sub-001_ses-pilot016_acq-highres_dir-RL_dwi.json
        │   └── sub-001_ses-pilot016_acq-highres_dir-RL_dwi.nii.gz
        ├── fmap
        │   ├── sub-001_ses-pilot016_acq-b0_dir-AP_epi.json
        │   ├── sub-001_ses-pilot016_acq-b0_dir-AP_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-b0_dir-LR_epi.json
        │   ├── sub-001_ses-pilot016_acq-b0_dir-LR_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-b0_dir-PA_epi.json
        │   ├── sub-001_ses-pilot016_acq-b0_dir-PA_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-b0_dir-RL_epi.json
        │   ├── sub-001_ses-pilot016_acq-b0_dir-RL_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-AP_part-mag_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-AP_part-mag_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-AP_part-phase_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-AP_part-phase_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-LR_part-mag_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-LR_part-mag_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-LR_part-phase_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-LR_part-phase_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-PA_part-mag_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-PA_part-mag_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-PA_part-phase_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-PA_part-phase_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-RL_part-mag_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-RL_part-mag_epi.nii.gz
        │   ├── sub-001_ses-pilot016_acq-bold_dir-RL_part-phase_epi.json
        │   ├── sub-001_ses-pilot016_acq-bold_dir-RL_part-phase_epi.nii.gz
        │   ├── sub-001_ses-pilot016_magnitude1.json
        │   ├── sub-001_ses-pilot016_magnitude1.nii.gz
        │   ├── sub-001_ses-pilot016_magnitude2.json
        │   ├── sub-001_ses-pilot016_magnitude2.nii.gz
        │   ├── sub-001_ses-pilot016_phasediff.json
        │   └── sub-001_ses-pilot016_phasediff.nii.gz
        ├── func
        │   ├── sub-001_ses-pilot016_task-bht_echo-1_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-1_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-1_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-1_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-2_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-2_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-2_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-2_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-3_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-3_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-3_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-3_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-4_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-4_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_echo-4_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-bht_echo-4_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-bht_part-mag_events.tsv
        │   ├── sub-001_ses-pilot016_task-bht_part-phase_events.tsv
        │   ├── sub-001_ses-pilot016_task-qct_echo-1_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-1_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-1_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-1_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-2_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-2_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-2_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-2_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-3_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-3_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-3_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-3_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-4_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-4_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_echo-4_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-qct_echo-4_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-qct_part-mag_events.tsv
        │   ├── sub-001_ses-pilot016_task-qct_part-phase_events.tsv
        │   ├── sub-001_ses-pilot016_task-rest_echo-1_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-1_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-1_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-1_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-2_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-2_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-2_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-2_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-3_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-3_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-3_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-3_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-4_part-mag_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-4_part-mag_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_echo-4_part-phase_bold.json
        │   ├── sub-001_ses-pilot016_task-rest_echo-4_part-phase_bold.nii.gz
        │   ├── sub-001_ses-pilot016_task-rest_part-mag_events.tsv
        │   └── sub-001_ses-pilot016_task-rest_part-phase_events.tsv
        └── sub-001_ses-pilot016_scans.tsv
    
    We started to generate phase and magnitude images only from session 15 onward

    As a result, the piloting data up to session 14 will look more like:

    ├── ses-pilot014
    │   ├── anat
    │   │   ├── sub-001_ses-pilot014_acq-original_T1w.json
    │   │   ├── sub-001_ses-pilot014_acq-original_T1w.nii.gz
    │   │   ├── sub-001_ses-pilot014_acq-undistorted_T1w.json
    │   │   ├── sub-001_ses-pilot014_acq-undistorted_T1w.nii.gz
    │   │   ├── sub-001_ses-pilot014_T2w.json
    │   │   └── sub-001_ses-pilot014_T2w.nii.gz
    │   ├── dwi
    │   │   ├── sub-001_ses-pilot014_acq-highres_dir-PA_dwi.bval
    │   │   ├── sub-001_ses-pilot014_acq-highres_dir-PA_dwi.bvec
    │   │   ├── sub-001_ses-pilot014_acq-highres_dir-PA_dwi.json
    │   │   └── sub-001_ses-pilot014_acq-highres_dir-PA_dwi.nii.gz
    │   ├── fmap
    │   │   ├── sub-001_ses-pilot014_acq-b0_dir-AP_epi.json
    │   │   ├── sub-001_ses-pilot014_acq-b0_dir-AP_epi.nii.gz
    │   │   ├── sub-001_ses-pilot014_acq-bold_dir-PA_run-1_epi.json
    │   │   ├── sub-001_ses-pilot014_acq-bold_dir-PA_run-1_epi.nii.gz
    │   │   ├── sub-001_ses-pilot014_acq-bold_dir-PA_run-2_epi.json
    │   │   ├── sub-001_ses-pilot014_acq-bold_dir-PA_run-2_epi.nii.gz
    │   │   ├── sub-001_ses-pilot014_magnitude1.json
    │   │   ├── sub-001_ses-pilot014_magnitude1.nii.gz
    │   │   ├── sub-001_ses-pilot014_magnitude2.json
    │   │   ├── sub-001_ses-pilot014_magnitude2.nii.gz
    │   │   ├── sub-001_ses-pilot014_phasediff.json
    │   │   └── sub-001_ses-pilot014_phasediff.nii.gz
    │   ├── func
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-1_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-1_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-2_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-2_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-3_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-3_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-4_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_echo-4_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_events.tsv
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-1_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-1_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-2_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-2_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-3_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-3_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-4_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_echo-4_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-1_events.tsv
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-1_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-1_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-2_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-2_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-3_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-3_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-4_bold.json
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_echo-4_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-bht_run-2_events.tsv
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-1_bold.json
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-1_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-2_bold.json
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-2_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-3_bold.json
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-3_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-4_bold.json
    │   │   ├── sub-001_ses-pilot014_task-qct_echo-4_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-qct_events.tsv
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-1_bold.json
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-1_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-2_bold.json
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-2_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-3_bold.json
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-3_bold.nii.gz
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-4_bold.json
    │   │   ├── sub-001_ses-pilot014_task-rest_echo-4_bold.nii.gz
    │   │   └── sub-001_ses-pilot014_task-rest_events.tsv
    │   └── sub-001_ses-pilot014_scans.tsv
    

Clean-up after BIDS conversion and preparation for archival

  • Delete incorrect files generated by HeuDiConv:

    find sub-001/ -name "*_part-mag_events.tsv" -or -name "*_part-phase_events.tsv" | xargs rm
    
  • Compact DICOM session folder and remove it if successful

    tar vczf ses-18950702.tar.gz \
        /data/datasets/hcph-sourcedata/sub-01/ses-18950702 \
    && \
    rm -rf /data/datasets/hcph-sourcedata/sub-01/ses-18950702
    
  • Remove write permissions on the newly downloaded data:

    chmod -R a-w /data/datasets/hcph-sourcedata/sub-01/
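
The two destructive steps above (deleting the spurious events files and removing the DICOM folder after compression) can be made somewhat safer. The following is only a minimal sketch under stated assumptions, not the project's official procedure: the function names `remove_spurious_events` and `archive_session` are hypothetical, and standard `find` and `tar` are assumed. The first function groups the `-name` tests so that the deletion applies to both patterns and prints what it removes; the second removes the session folder only after the archive has been written and can be listed back without errors.

```shell
# Hypothetical helpers (names are illustrative, not part of the SOPs).

# Print and delete the spurious events files that HeuDiConv creates
# for the magnitude and phase parts, under the given BIDS folder.
remove_spurious_events() {
    local bids_dir="$1"
    find "$bids_dir" -type f \
        \( -name '*_part-mag_events.tsv' -o -name '*_part-phase_events.tsv' \) \
        -print -delete
}

# Compress a DICOM session folder into a .tar.gz archive and remove the
# folder only if the archive was written and can be read back.
archive_session() {
    local session_dir="${1%/}"   # e.g. /data/datasets/hcph-sourcedata/sub-01/ses-18950702
    local archive="$2"           # e.g. ses-18950702.tar.gz
    # -C stores paths relative to the parent folder (ses-.../...),
    # unlike the command above, which stores the full path.
    tar czf "$archive" -C "$(dirname "$session_dir")" "$(basename "$session_dir")" \
        && tar tzf "$archive" > /dev/null \
        && rm -rf "$session_dir"
}
```

For instance, `remove_spurious_events sub-001/` followed by `archive_session /data/datasets/hcph-sourcedata/sub-01/ses-18950702 ses-18950702.tar.gz` would mirror the first two steps of this section.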
    

Generate BIDS' events files

  • Execute conversion with code/events/psychopy2events:

    python psychopy2events.py --path ./outputdir/sub-001/ses-pilot016/func/
    
Example of a session with events files

The corresponding events files are highlighted below:

ses-024
├── anat
├── dwi
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.bval
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.bvec
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.json
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.nii.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_stim.json
   └── sub-001_ses-024_acq-highres_dir-AP_stim.tsv.gz
├── fmap
├── func
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-bht_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-bht_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-bht_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-bht_dir-AP_stim.json
   ├── sub-001_ses-024_task-bht_dir-AP_stim.tsv.gz
   ├── sub-001_ses-024_task-bht_dir-AP_events.tsv
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-qct_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-qct_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_stim.json
   ├── sub-001_ses-024_task-qct_dir-AP_stim.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_events.tsv
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-rest_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-rest_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_stim.json
   ├── sub-001_ses-024_task-rest_dir-AP_stim.tsv.gz
   └── sub-001_ses-024_task-rest_dir-AP_events.tsv
└── sub-001_ses-024_scans.tsv
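
To confirm that the conversion produced one events file per task, the expected file names can be derived from the BOLD file names. The sketch below is an illustration only: `missing_events` is a hypothetical helper, and it assumes the naming shown in the tree above, where the events file shares the BOLD file name minus the `echo` and `part` entities.

```shell
# Hypothetical helper: print the events files that are expected in a func/
# folder (one per task, derived from the BOLD file names) but do not exist.
missing_events() {
    local func_dir="$1"
    find "$func_dir" -name '*_bold.nii.gz' \
        | sed -E 's/_echo-[0-9]+//; s/_part-(mag|phase)//; s/_bold\.nii\.gz$/_events.tsv/' \
        | sort -u \
        | while IFS= read -r events; do
            # Report only the expected files that are absent.
            [ -e "$events" ] || echo "$events"
        done
}
```

Running `missing_events sub-001/ses-024/func/` prints nothing when every task has its events file.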

Convert physiological recordings into BIDS (in-house)

  • Install the necessary packages.

    python -m pip install bioread pandas matplotlib numpy scipy
    

  • Update the appropriate session number within cell 3 in the conversion Jupyter notebook.

  • Execute the notebook.

Example of a session with physiological recordings

The execution of the notebook on session 24 yields the following new outputs (highlighted):

ses-024
├── anat
├── dwi
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.bval
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.bvec
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.json
   ├── sub-001_ses-024_acq-highres_dir-AP_dwi.nii.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_acq-highres_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_acq-highres_dir-AP_stim.json
   └── sub-001_ses-024_acq-highres_dir-AP_stim.tsv.gz
├── fmap
├── func
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-bht_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-bht_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-bht_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-bht_dir-AP_stim.json
   ├── sub-001_ses-024_task-bht_dir-AP_stim.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-qct_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-qct_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_stim.json
   ├── sub-001_ses-024_task-qct_dir-AP_stim.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_recording-cardiac_physio.json
   ├── sub-001_ses-024_task-rest_dir-AP_recording-cardiac_physio.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_recording-respiratory_physio.json
   ├── sub-001_ses-024_task-rest_dir-AP_recording-respiratory_physio.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_stim.json
   └── sub-001_ses-024_task-rest_dir-AP_stim.tsv.gz
└── sub-001_ses-024_scans.tsv

Convert eye-tracking into BIDS (in-house)

Because eye-tracking is not yet covered by the official BIDS specification, we follow the BIDS Extension Proposal (BEP) 020 instead.

Example of a session with eye-tracking recordings

The files corresponding to eye-tracking are highlighted below:

ses-024
├── anat
├── dwi
├── fmap
├── func
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-bht_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-bht_dir-AP_eyetrack.json
   ├── sub-001_ses-024_task-bht_dir-AP_eyetrack.tsv.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-qct_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-qct_dir-AP_eyetrack.json
   ├── sub-001_ses-024_task-qct_dir-AP_eyetrack.tsv.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-1_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-2_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-3_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-mag_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.json
   ├── sub-001_ses-024_task-rest_dir-AP_echo-4_part-phase_bold.nii.gz
   ├── sub-001_ses-024_task-rest_dir-AP_eyetrack.json
   └── sub-001_ses-024_task-rest_dir-AP_eyetrack.tsv.gz
└── sub-001_ses-024_scans.tsv

Add new data to the DataLad dataset

As new sessions are collected, the corresponding BIDS structures MUST be saved within the DataLad dataset and pushed to remote storage systems:

  • Save the files in the dataset history using the command below. Replace <session_id> below with the number of the session (e.g., pilot017):

    datalad save -r -m "add: session <session_id>" sub-001/ses-<session_id>
    

    Always double-check that data are annexed and the metadata committed to git

    Although the creation of a procedure should ensure data and metadata are added to the appropriate version control (Git or Git-Annex), it is possible that some metadata or data formats are not anticipated, or do not follow the general rules.

    A generally good strategy is to avoid recursion (i.e., do not use the -r flag), and leverage the find and xargs command-line tools. For example, the following command line selects metadata files that should be committed to Git within one session (pilot020) and saves them:

    find sub-001/ses-pilot020 -name "*.tsv" -or -name "*.json" -or -name "*.bvec" -or -name "*.bval" | xargs datalad save --to-git -m '"add(pilot020): new session metadata"'
    

    Correspondingly, we can store NIfTI data and physiological information:

    find sub-001/ses-pilot020 -name "*.nii.gz" -or -name "*_eyetrack.tsv.gz" -or -name "*_physio.tsv.gz" -or -name "*_stim.tsv.gz" | xargs datalad save -m '"add(pilot020): new session NIfTI data, eye tracking and physio"'
    

    Please read DataLad's save documentation

    If you overeagerly datalad-saved too many files

    You can revert the datalad save operation without deleting changes with:

    git reset --mixed COMMIT
    

    where COMMIT is the hash of the last commit you want to keep (all the later commits will be dropped).

    To check the commit hash where you can roll history back to, you may want to use:

    git log -50 --oneline
    

    Saving batches of sessions

    It is possible to save several sessions with the following Bash script by enumerating them in the array defined in the first line here:

    SESSIONS=( 001 003 pilot21 ); \
    for SESSION in ${SESSIONS[@]}; do \
        find sub-001/ses-$SESSION -name "*.tsv" -or -name "*.json" -or -name "*.bvec" -or -name "*.bval" | xargs datalad save --to-git -m '"add('"$SESSION"'): new session metadata"'; \
        find sub-001/ses-$SESSION -name "*.nii.gz" -or -name "*_eyetrack.tsv.gz" -or -name "*_physio.tsv.gz" -or -name "*_stim.tsv.gz" | xargs datalad save -m '"add('"$SESSION"'): new session NIfTI data, eye tracking and physio"'; \
    done
    
  • Push the new data to the remote storage (if the Git remote that hosts the DataLad dataset is not named origin, e.g., github, replace the name below):

    datalad push --to ria-storage
    datalad push --to origin
    

    Always double-check that data in the annex are uploaded to the RIA store
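
Before running the `find ... | xargs datalad save` commands above, it can help to preview which files each pattern selects and, more importantly, which files neither pattern selects. The sketch below is a hedged illustration: the three function names are hypothetical, and the patterns are copied from the commands in this section. It only lists files and never calls DataLad.

```shell
# Hypothetical helpers mirroring the file patterns of the save commands above.
# New files are regular files; files already annexed appear as symlinks,
# so both types are listed.

# Metadata files intended for Git (saved with --to-git).
list_git_files() {
    find "$1" \( -type f -o -type l \) \
        \( -name '*.tsv' -o -name '*.json' -o -name '*.bvec' -o -name '*.bval' \) | sort
}

# Data files intended for the annex.
list_annex_files() {
    find "$1" \( -type f -o -type l \) \
        \( -name '*.nii.gz' -o -name '*_eyetrack.tsv.gz' \
           -o -name '*_physio.tsv.gz' -o -name '*_stim.tsv.gz' \) | sort
}

# Files matched by neither pattern set; these need a manual decision.
list_unmatched_files() {
    find "$1" \( -type f -o -type l \) \
        ! \( -name '*.tsv' -o -name '*.json' -o -name '*.bvec' -o -name '*.bval' \
             -o -name '*.nii.gz' -o -name '*_eyetrack.tsv.gz' \
             -o -name '*_physio.tsv.gz' -o -name '*_stim.tsv.gz' \) | sort
}
```

An empty output from `list_unmatched_files sub-001/ses-pilot020` suggests that the two save commands together cover every file of the session.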

Formal QC

  • Consult the session logs to anticipate session peculiarities (e.g., the session was aborted prematurely) and potential quality issues (e.g., the participant fell asleep). Those are saved in the issues of our repository with the label scan. Keep note of the peculiar events, associated with their session index, and keep it close to you during quality control.

  • Run the BIDS Validator to check the formal quality of the dataset (filenames, homogeneity of modalities and parameters across sessions, etc.)

    docker run -ti --rm -v /path/to/data/hcph/:/data:ro bids/validator /data
    
BIDS non-compliance: WARNINGS and ERRORS

We do not fully comply with current BIDS specifications, so some ERRORS and WARNINGS will emerge. If you encounter errors or warnings that are not listed here, please reach out to decide on a solution.

  1. ERRORS: Because we follow BEP 020, which is not official yet, ET-related files will trigger ERRORS:

    [ERR] Files with such naming scheme are not part of BIDS specification.
    
  2. WARNINGS: some warnings are expected. Warnings that are not on this list should be addressed and BIDS conversion should be re-run on the affected files:

    • During the piloting phase of the study, we tried out different sequence parameters and sequence types. As such, the following warning is expected:

      [WARN] Not all subjects/sessions/runs have the same scanning parameters. (code: 39 - INCONSISTENT_PARAMETERS)
      
    • The _magnitude{1,2}.nii.gz files corresponding to some pilot sessions are missing:

      [WARN] Each _phasediff.nii[.gz] file should be associated with a _magnitude1.nii[.gz] file. (code: 92 MISSING_MAGNITUDE1_FILE)
      ./sub-001/ses-pilot001/fmap/sub-001_ses-pilot001_phasediff.nii.gz
      ./sub-001/ses-pilot004/fmap/sub-001_ses-pilot004_phasediff.nii.gz
      ./sub-001/ses-pilot006/fmap/sub-001_ses-pilot006_phasediff.nii.gz
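
To spot messages that are not on the list above, the validator output can be filtered against the expected messages. The following is a rough sketch under assumptions: `unexpected_validator_messages` is a hypothetical helper, and the validator output is assumed to have been saved as plain text to a file (named `validator.log` here for illustration). The filter works on message lines only, so the file lists printed under the expected messages still need a manual look (for example, to confirm that only eye-tracking files trigger the naming error).

```shell
# Hypothetical helper: read validator output on stdin and print the
# [ERR]/[WARN] lines that do not match the expected messages listed above.
unexpected_validator_messages() {
    grep -E '\[(ERR|WARN)\]' \
        | grep -v -F \
            -e 'Files with such naming scheme are not part of BIDS specification.' \
            -e 'INCONSISTENT_PARAMETERS' \
            -e 'MISSING_MAGNITUDE1_FILE' \
        || true   # grep exits non-zero when nothing is left; that is the good case
}

# Usage (validator.log is a hypothetical file name):
#   unexpected_validator_messages < validator.log
```

An empty output means that every reported message is one of the expected ones.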
      

Visual assessment of unprocessed data with MRIQC

Checking data quality shortly after acquisition increases the likelihood of catching systematic artifacts early enough to prevent them from spreading throughout the whole dataset. It also spreads the burden of visual inspection over time, so that raters are not overwhelmed by large bursts of images to assess. Better pacing of the rating workload also helps reduce raters' attrition and fatigue.

  • Screen all the unprocessed data and assess them as described in the next section.

Upon updates and bugfixes of the dataset

Update PyBIDS's database index

The PyBIDS index cache dramatically speeds up MRIQC, fMRIPrep and dMRIPrep

To shorten the start-up time of NiPreps tools (MRIQC, fMRIPrep, and dMRIPrep) and other relevant code using PyBIDS, we have added a database cache under the /data/datasets/hcph-dataset/.bids-index/ folder. This cache can be leveraged by adding the --bids-database-dir /data/datasets/hcph-dataset/.bids-index/ option to the corresponding command line.

  • Unlock the database file to enable update:

    cd /data/datasets/hcph-dataset
    datalad unlock .bids-index/layout_index.sqlite
    
  • Reset the database file:

    python -m pip install "pybids>=0.16"
    $( dirname $( which python ) )/pybids layout --reset-db --no-validate --index-metadata . .bids-index/
    

    Successful execution ends with the message: Successfully generated database index at /data/datasets/hcph-dataset/.bids-index.

  • Save the dataset again:

    datalad save -m "enh: updated PyBIDS' database file" /data/datasets/hcph-dataset/.bids-index
    
  • Push the changes back to the repos:

    datalad push --to=github
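
A quick way to judge whether the index needs the reset described above is to look for dataset files modified after the index file. The sketch below is a heuristic only, based on modification times: `files_newer_than_index` is a hypothetical helper, and it assumes the index lives at `.bids-index/layout_index.sqlite` inside the dataset, as in the commands above.

```shell
# Hypothetical helper: list dataset files modified after the PyBIDS index.
# Modification times are only a heuristic; an empty list does not prove
# that the index is up to date.
files_newer_than_index() {
    local dataset="${1%/}"
    local index="$dataset/.bids-index/layout_index.sqlite"
    [ -e "$index" ] || { echo "index not found: $index" >&2; return 1; }
    # Skip Git internals and the index folder itself; annexed files show
    # up as symlinks, so list both regular files and links.
    find "$dataset" \
        \( -name .git -o -name .bids-index \) -prune \
        -o \( -type f -o -type l \) -newer "$index" -print | sort
}
```

If `files_newer_than_index /data/datasets/hcph-dataset` prints any path, the index probably predates the latest changes and should be regenerated.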