Before you Start

This project was built with Minian output files in mind. The two sections below describe what to do if you’re coming from a Minian project, and provide a more comprehensive guide if you’re coming from a different CNMF-related project.

Non-Minian CNMF Projects

If your project is based on a non-Minian approach, you can use the description of the data structures below to reach parity. Since there is a multitude of different approaches to storing CNMF-related data, we will only cover what the end result should look like.

Following Minian’s approach, the data should be stored in zarr format, since Minian works with xarray. Xarray is a NumPy-like library for labeled multi-dimensional arrays that, together with Dask, supports memory-efficient and parallel computation. The following data is expected:

A.zarr: A 3D array of shape (unit_id, height, width). This array represents the spatial footprints of the neurons.
C.zarr: A 2D array of shape (unit_id, frame). This array represents the calcium traces of the neurons.
S.zarr: A 2D array of shape (unit_id, frame). This array represents the spike/firing rate.
YrA.zarr: A 2D array of shape (unit_id, frame). This array represents the raw signals/residual traces.
DFF.zarr: A 2D array of shape (unit_id, frame). This array represents the deltaF/F traces.

If you don’t have a DFF trace, our project will calculate it for you using the detrend_df_f function from CaImAn.

However, we strongly encourage you to calculate your own DFF trace, as it will be more accurate and tailored to your data. If DFF is omitted, calculating it requires the following additional data:

f.zarr: A 1D array of shape (frame). Estimate of the background fluorescence at each frame.
b.zarr: A 2D array of shape (height, width). Spatial background component.
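
Before saving anything, it can help to compare the dimension names of your arrays against the lists above. The following is a minimal sketch using only the Python standard library; the helper find_dim_mismatches is hypothetical (it is not part of this project), and it assumes that the dimension order shown above is the one to match.

```python
# Expected dimension names, taken from the lists above.
EXPECTED_DIMS = {
    "A": ("unit_id", "height", "width"),
    "C": ("unit_id", "frame"),
    "S": ("unit_id", "frame"),
    "YrA": ("unit_id", "frame"),
    "DFF": ("unit_id", "frame"),
    "f": ("frame",),
    "b": ("height", "width"),
}


def find_dim_mismatches(actual_dims):
    """Return {name: (expected, actual)} for every array whose dims differ.

    `actual_dims` maps an array name to its tuple of dimension names,
    e.g. {"C": C.dims} for an xarray DataArray called C.
    Names that are not in EXPECTED_DIMS are ignored.
    """
    mismatches = {}
    for name, dims in actual_dims.items():
        expected = EXPECTED_DIMS.get(name)
        if expected is not None and tuple(dims) != expected:
            mismatches[name] = (expected, tuple(dims))
    return mismatches


# Hypothetical example: C is fine, A has height and width swapped.
problems = find_dim_mismatches({
    "C": ("unit_id", "frame"),
    "A": ("unit_id", "width", "height"),
})
print(problems)
```

With real data, the input could be built from the arrays themselves, for example {"A": A.dims, "C": C.dims}.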

How to convert numpy to xarray

If you have your data in numpy format, you can convert it to an xarray using the following code snippet:

import xarray as xr
import numpy as np

# Unit IDs must be unique, so use consecutive integers rather than random draws
unit_ids = np.arange(100)

# Create a numpy array
C_numpy = np.random.rand(100, 100)

# Create an xarray
C = xr.DataArray(
    C_numpy,
    dims=["unit_id", "frame"],
    coords=dict(
        unit_id=unit_ids,
        frame=np.arange(C_numpy.shape[1]),
    ),
    name="C"
)

# Save the xarray to a zarr file
C.to_zarr("C.zarr")

# Now for A.zarr
A_numpy = np.random.rand(100, 100, 100)
A = xr.DataArray(
    A_numpy,
    dims=["unit_id", "height", "width"],
    coords=dict(
        unit_id=unit_ids,
        height=np.arange(A_numpy.shape[1]),
        width=np.arange(A_numpy.shape[2]),
    ),
    name="A"
)

A.to_zarr("A.zarr")

Repeat the process above for other variables.
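
Since S, YrA and DFF share the (unit_id, frame) layout of C, the conversion can be wrapped in a small helper. This is only a sketch: trace_to_dataarray is a hypothetical name, and the random arrays stand in for your real traces.

```python
import numpy as np
import xarray as xr


def trace_to_dataarray(values, unit_ids, name):
    """Wrap a (unit_id, frame) NumPy array as a named xarray DataArray."""
    values = np.asarray(values)
    if values.ndim != 2 or values.shape[0] != len(unit_ids):
        raise ValueError("expected shape (len(unit_ids), n_frames)")
    return xr.DataArray(
        values,
        dims=["unit_id", "frame"],
        coords=dict(unit_id=unit_ids, frame=np.arange(values.shape[1])),
        name=name,
    )


# Placeholder data standing in for real S and YrA traces.
unit_ids = np.arange(5)
traces = {
    "S": np.random.rand(5, 20),
    "YrA": np.random.rand(5, 20),
}
arrays = {
    name: trace_to_dataarray(values, unit_ids, name)
    for name, values in traces.items()
}

# Each array could then be saved in the same way as C, e.g.
# arrays["S"].to_zarr("S.zarr")
```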

Video files

The following videos are expected to likewise be in a zarr format:

varr.zarr: A 3D array of shape (frame, height, width). This array represents the raw video data. When chunked, ensure that it is chunked along the frame axis and not the height or width.

Y_fm_chk.zarr: A 3D array of shape (frame, height, width). This array represents the processed video data. When chunked, ensure that it is chunked along the frame axis and not the height or width.

Y_hw_chk.zarr: A 3D array of shape (frame, height, width). This array represents the processed video data. When chunked, ensure that the frame axis is kept intact and that the data is chunked along height and width. It will be created automatically if it doesn’t exist, but this takes a considerable amount of time.

(Optional) behavior_video.zarr: A 3D array of shape (frame, height, width). This array represents the behavior video data. When chunked, ensure that it is chunked along the frame axis and not the height or width.

We are aware that the recording frame rate of the behavior video will most likely differ from that of the calcium imaging video. Our project accounts for this, but you need to ensure that the first and last frames of the behavior video roughly align with the first and last frames of the calcium imaging video.
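
To illustrate why the first and last frames matter, here is a minimal sketch of how frame indices can be matched when only the endpoints are aligned. It is not the project's actual implementation; behavior_frame_for is a hypothetical helper that assumes a simple linear scaling between the two videos.

```python
def behavior_frame_for(calcium_frame, n_calcium_frames, n_behavior_frames):
    """Map a calcium-imaging frame index to the nearest behavior frame index.

    Assumes both videos start and end at roughly the same moments, so the
    indices can be scaled linearly between their first and last frames.
    """
    if n_calcium_frames < 2 or n_behavior_frames < 1:
        raise ValueError("need at least 2 calcium frames and 1 behavior frame")
    if not 0 <= calcium_frame < n_calcium_frames:
        raise IndexError("calcium_frame out of range")
    scale = (n_behavior_frames - 1) / (n_calcium_frames - 1)
    return round(calcium_frame * scale)


# Hypothetical recording: 100 calcium frames, 300 behavior frames.
first = behavior_frame_for(0, 100, 300)
middle = behavior_frame_for(50, 100, 300)
last = behavior_frame_for(99, 100, 300)
```

If the two videos do not start and end together, every mapped index is shifted, which is why the rough alignment of the endpoints is required.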

Chunking is an important aspect of zarr: it dictates how the data is stored on disk and how it is read into memory. For efficient GUI usage, you should chunk the data as stated above. To load your data into the proper format and chunk it correctly, you can follow the steps in the Minian documentation.
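
As a rough illustration of the two chunk layouts described above, the sketch below builds the chunk mappings as plain dictionaries. The helper chunk_spec and the chunk sizes 100 and 64 are arbitrary placeholders, not values prescribed by this project or by Minian.

```python
def chunk_spec(layout, frame_chunk=100, hw_chunk=64):
    """Return a {dimension: chunk size} mapping for a (frame, height, width) video.

    layout="frame": split along frame, keep each image whole.
    layout="hw":    keep all frames together, split along height and width.
    A size of -1 follows the Dask convention of one chunk spanning the
    whole dimension.
    """
    if layout == "frame":
        return {"frame": frame_chunk, "height": -1, "width": -1}
    if layout == "hw":
        return {"frame": -1, "height": hw_chunk, "width": hw_chunk}
    raise ValueError(f"unknown layout: {layout!r}")


fm_chunks = chunk_spec("frame")  # for varr, Y_fm_chk and behavior_video
hw_chunks = chunk_spec("hw")     # for Y_hw_chk
```

With a Dask-backed xarray DataArray named video, such a mapping could be applied with video.chunk(fm_chunks) before the array is saved.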

Once you have your data in the correct format, you can proceed to the Minian section below.

Minian Projects

Loading your data requires two folders and a CSV file:

data: This folder should contain the following files:
- A.zarr
- C.zarr
- S.zarr
- YrA.zarr
- DFF.zarr (In the case that you don’t have this, include f.zarr and b.zarr so it will be calculated for you)
videos: This folder should contain the following files:
- varr.zarr
- Y_fm_chk.zarr
- Y_hw_chk.zarr (Optional, will be created if it doesn’t exist)
- behavior_video.zarr (Optional, look at the video files section for more information)
behavior.csv: This file contains the time in milliseconds as well as the behavior data, where 0 means that no event occurred and 1 means that an event occurred. The columns are as follows:
- Frame Number: The frame number of the video
- Time Stamp (ms): The time in milliseconds
- (Optional) RNF: Reinforcement
- (Optional) ALP: Active lever press
- (Optional) ILP: Inactive lever press
- (Optional) ALP_Timeout: Active lever press timeout

The following is an example of what the csv file could look like:

Example CSV File
Frame Number	Time Stamp (ms)	RNF
0	0	0
1	33	0
2	66	0
3	100	0
4	133	0
5	166	0
6	200	0
7	233	0
8	266	0
9	300	0
10	333	0
11	366	1
12	400	0
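
To check a behavior file before loading it into the GUI, the event frames can be listed with Python's csv module. This sketch assumes a comma-delimited file with the column names shown above; event_frames is a hypothetical helper, and the embedded sample only repeats a few rows of the example.

```python
import csv
import io

# A few rows from the example above, written with commas as the delimiter.
sample = """Frame Number,Time Stamp (ms),RNF
9,300,0
10,333,0
11,366,1
12,400,0
"""


def event_frames(csv_file, column):
    """Return the frame numbers at which `column` is 1."""
    reader = csv.DictReader(csv_file)
    return [
        int(row["Frame Number"])
        for row in reader
        if int(row[column]) == 1
    ]


rnf_frames = event_frames(io.StringIO(sample), "RNF")
print(rnf_frames)  # [11]
```

For a file on disk, the same function could be called inside a block such as: with open("behavior.csv", newline="") as fh.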

Creating the Config File

The final step is to create a config.ini file that will tell the GUI where to find the necessary data. This file can be saved anywhere convenient for the user as the GUI contains a file picker to select the desired config file. For ease of use, it is recommended to save the config file near the CalTrig folder. Below is a template that you can adjust to your needs:

[Session_Info]
mouseid = AA058
day = D1
session = S4
group = None
data_path = C:\path\to\folder\that\contains\data
video_path = C:\path\to\folder\that\contains\videos
behavior_path = C:\path\to\folder\that\contains\behavior.csv
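
A quick way to catch typos in the config file is to read it back with Python's standard configparser module. How the GUI itself parses the file is not described here, so treat this purely as an independent sanity check; the list template_keys simply mirrors the template above.

```python
import configparser

# The template above, embedded as a raw string so the backslashes stay literal.
template = r"""
[Session_Info]
mouseid = AA058
day = D1
session = S4
group = None
data_path = C:\path\to\folder\that\contains\data
video_path = C:\path\to\folder\that\contains\videos
behavior_path = C:\path\to\folder\that\contains\behavior.csv
"""

config = configparser.ConfigParser()
config.read_string(template)
session = config["Session_Info"]

# Report any keys from the template that are absent.
template_keys = ["mouseid", "day", "session", "group",
                 "data_path", "video_path", "behavior_path"]
missing = [key for key in template_keys if key not in session]

print(session["mouseid"], session["data_path"])
print("missing keys:", missing)
```

Note that configparser returns every value as a string, so group = None is read as the text "None". For a file on disk, config.read("config.ini") can be used in place of read_string.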

What About Non-CNMF Signals?

If you have data that was not generated by a CNMF process, you can still try to integrate it into the GUI. We recommend looking at each of your signals and seeing which ones most closely match the CNMF data structures. If you don’t have a corresponding signal, you could try generating zero-filled arrays of the same shape as the CNMF data. However, this could cause issues with some of the GUI’s features.
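
A zero-filled stand-in for a missing signal can be derived from an array you do have. The sketch below uses a placeholder C array and xarray's zeros_like function; like the rest of this section, it has not been tested against the GUI.

```python
import numpy as np
import xarray as xr

# Placeholder C array standing in for a real signal.
C = xr.DataArray(
    np.random.rand(3, 10),
    dims=["unit_id", "frame"],
    coords=dict(unit_id=np.arange(3), frame=np.arange(10)),
    name="C",
)

# Zero-filled stand-in for a missing signal, with the same shape and coordinates.
S = xr.zeros_like(C).rename("S")
```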

At the very least, you should have equivalents of A, C and DFF, plus a video file. If you only have a single video file, you could copy it for the other expected video files. All of this is theoretical and hasn’t been tested. If you run into any problems, please open an issue on the GitHub page.