Data Formats

Reference for all file formats CASTLE reads and writes.


Input

Video Files

CASTLE uses PyAV (FFmpeg wrapper) for video I/O.

Supported formats: MP4 (.mp4), AVI (.avi), MOV (.mov), WMV (.wmv), FLV (.flv), MKV (.mkv)

Recommended: MP4 with H.264 codec for best compatibility and performance.

No resolution or frame rate limitations — CASTLE processes whatever the video contains.


Intermediate Files

Label Files (.npz)

Created by the Label ROI step. Stored in project/label/{video_name}/.

import numpy as np
data = np.load('0.npz')

data['frame']  # np.ndarray, shape (H, W, 3), dtype uint8 — RGB frame
data['mask']   # np.ndarray, shape (H, W, N_ROIS), dtype uint8 — per-ROI masks
  • Filename is the frame index (e.g., 0.npz, 247.npz)
  • Each ROI is a separate channel in the mask array
  • ROI colors are assigned sequentially
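
The label-file layout can be round-tripped with plain NumPy. The block below is only a sketch of the format; the frame size and ROI rectangles are made-up values for illustration, not anything CASTLE produces.

```python
import numpy as np

# Hypothetical dimensions: a 480x640 frame with 2 ROIs.
H, W, N_ROIS = 480, 640, 2
frame = np.zeros((H, W, 3), dtype=np.uint8)      # RGB frame
mask = np.zeros((H, W, N_ROIS), dtype=np.uint8)  # one channel per ROI
mask[100:200, 150:250, 0] = 1                    # ROI 1 covers one rectangle
mask[300:400, 350:450, 1] = 1                    # ROI 2 covers another

# Save under the frame-index filename convention, then reload.
np.savez_compressed('0.npz', frame=frame, mask=mask)
data = np.load('0.npz')
print(data['frame'].shape, data['mask'].shape)  # (480, 640, 3) (480, 640, 2)
```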

Tracked Masks (mask_list.h5)

Created by the Tracking step. Stored in project/track/{video_name}/.

HDF5 file using H5IO wrapper:

from castle.utils.h5_io import H5IO

tracker = H5IO('mask_list.h5')
mask = tracker[frame_index]       # np.ndarray, shape (H, W), dtype uint8
n_frames = len(tracker)           # Total number of frames
n_rois = tracker.get_n_rois()     # Number of tracked ROIs
  • Each frame is stored as a gzip-compressed dataset keyed by frame index (string)
  • Metadata keys: total_frames, n_rois
  • Each pixel value in a mask encodes the ID of the ROI covering it (the mask is a single-channel label image)
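
H5IO is CASTLE's own wrapper; the raw file can also be inspected with h5py directly. The sketch below writes a toy file matching the described layout. Storing total_frames and n_rois as HDF5 attributes is an assumption here; the wrapper may keep them as datasets instead.

```python
import h5py
import numpy as np

H, W = 64, 64  # toy dimensions
with h5py.File('mask_list.h5', 'w') as f:
    for i in range(3):
        # One gzip-compressed dataset per frame, keyed by the frame index as a string.
        mask = np.full((H, W), i % 2, dtype=np.uint8)
        f.create_dataset(str(i), data=mask, compression='gzip')
    # Assumed location of the metadata keys (file attributes).
    f.attrs['total_frames'] = 3
    f.attrs['n_rois'] = 1

with h5py.File('mask_list.h5', 'r') as f:
    print(int(f.attrs['total_frames']), f['0'].shape)  # 3 (64, 64)
```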

Cropped Video (.mp4)

Created by Extract Crop Video. Stored in project/crop/{video_name}/.

  • Filename: {video_basename}_ROI_{id}_crop.mp4
  • Standard MP4 video of the aligned/cropped ROI

Output Files

Latent Features (.npz)

Created by Extract Latent. Stored in project/latent/{model_name}/.

import numpy as np
data = np.load('video_ROI_1_dinov2_vitb14_reg4_pretrain.npz')

data['latent']  # np.ndarray, shape (n_frames, feature_dim), dtype float32
  • Feature dimensions: 768 (ViT-B models) or 1024 (ViT-L models)
  • Frames with empty masks → NaN vectors
  • Filename pattern: {video}_ROI_{roi_id}_{model}_{tags}.npz
  • Tags: ctr (centered), rmbg (background removed)
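
Because empty-mask frames become NaN vectors, downstream code typically drops them before analysis. A minimal sketch with toy dimensions (real models produce 768- or 1024-dimensional features):

```python
import numpy as np

# Toy latent array: 5 frames, 4-dim features.
rng = np.random.default_rng(0)
latent = rng.random((5, 4)).astype(np.float32)
latent[2] = np.nan  # frame 2 had an empty mask

# Valid frames are rows without any NaN.
valid = ~np.isnan(latent).any(axis=1)
clean = latent[valid]
print(valid.sum())  # 4
```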

Cluster ID Mapping (id.csv)

Created by Submit in Behavior Microscope. Stored in project/cluster/.

Id,Name
0,init
1,grooming
2,rearing
3,locomotion

| Column | Type   | Description                  |
|--------|--------|------------------------------|
| Id     | int    | Numeric cluster ID           |
| Name   | string | Human-assigned behavior name |
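
A minimal sketch of loading this mapping with the standard library; the inline sample stands in for project/cluster/id.csv:

```python
import csv
import io

# Sample matching the format above; in practice, open the id.csv file instead.
text = "Id,Name\n0,init\n1,grooming\n2,rearing\n3,locomotion\n"
with io.StringIO(text) as f:
    id_to_name = {int(row['Id']): row['Name'] for row in csv.DictReader(f)}
print(id_to_name[1])  # grooming
```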

Time Series (time_series.csv)

Frame-by-frame behavioral state assignments. Stored in project/cluster/.

,behavior
0,1
1,1
2,1
3,3
4,2

| Column   | Type | Description               |
|----------|------|---------------------------|
| (index)  | int  | Frame index               |
| behavior | int  | Cluster ID for this frame |
  • -1 indicates unclassified / noise frames
  • When time_window > 1, values are repeated for each frame in the window (expanded to per-frame resolution)
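
Consecutive frames with the same cluster ID form a behavior bout. A sketch of collapsing the per-frame series into (cluster_id, start_frame, length) tuples, using the sample values above; the bout representation is illustrative, not a CASTLE output format:

```python
import itertools

# Per-frame cluster IDs, as in the time_series.csv sample (frames 0-4).
behavior = [1, 1, 1, 3, 2]

bouts = []
start = 0
for cls, group in itertools.groupby(behavior):
    # groupby yields runs of identical consecutive values.
    length = len(list(group))
    bouts.append((cls, start, length))
    start += length
print(bouts)  # [(1, 0, 3), (3, 3, 1), (2, 4, 1)]
```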

SRT Subtitles (.srt)

Standard subtitle format for video overlay. Generated per-video.

1
00:00:00,000 --> 00:00:01,500
grooming

2
00:00:01,500 --> 00:00:03,200
locomotion
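
Converting a frame index into the HH:MM:SS,mmm timestamps SRT requires only needs the video's frame rate. The helper below is a hypothetical sketch, not part of CASTLE:

```python
def srt_timestamp(frame_index, fps):
    """Convert a frame index to an SRT 'HH:MM:SS,mmm' timestamp."""
    total_ms = round(frame_index / fps * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

print(srt_timestamp(45, 30))  # 00:00:01,500
```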

Embedding NPZ

Saved UMAP coordinates and cluster labels. Stored in project/cluster/.

import numpy as np
data = np.load('cluster_grooming_rearing_.npz')

data['emb']     # np.ndarray, shape (n_samples, 2) — UMAP 2D coordinates
data['cls']     # np.ndarray, shape (n_samples,), dtype int16 — cluster IDs
data['config']  # UMAP configuration used
  • NaN in emb → frame excluded from analysis
  • -1 in cls → unclassified frame
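
Plotting or analysis code usually drops excluded and unclassified points in one step. A sketch with toy values:

```python
import numpy as np

# Toy embedding: 5 samples in 2-D, one excluded frame (NaN) and two noise labels (-1).
emb = np.array([[0.1, 0.2], [1.0, 1.1], [np.nan, np.nan], [2.0, 2.1], [3.0, 3.1]])
cls = np.array([0, 1, -1, -1, 1], dtype=np.int16)

# Keep only embedded, classified points.
keep = ~np.isnan(emb).any(axis=1) & (cls != -1)
print(keep.sum())  # 3
```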

Project Directory Layout

Complete structure after a full analysis:

projects/my-project/
├── config.json
├── sources/
│   ├── video1.mp4
│   └── video2.mp4
├── label/
│   ├── video1.mp4/
│   │   ├── 0.npz
│   │   └── 247.npz
│   └── video2.mp4/
│       └── 0.npz
├── track/
│   ├── video1.mp4/
│   │   └── mask_list.h5
│   └── video2.mp4/
│       └── mask_list.h5
├── crop/
│   └── video1.mp4/
│       └── video1_ROI_1_crop.mp4
├── latent/
│   └── dinov2_vitb14_reg4_pretrain/
│       ├── video1_ROI_1_dinov2_vitb14_reg4_pretrain.npz
│       └── video2_ROI_1_dinov2_vitb14_reg4_pretrain.npz
└── cluster/
    ├── id.csv
    ├── time_series.csv
    └── cluster_grooming_rearing_.npz