Data Formats¶
Reference for all file formats CASTLE reads and writes.
Input¶
Video Files¶
CASTLE uses PyAV (FFmpeg wrapper) for video I/O.
Supported formats: MP4, AVI, MOV, WMV, FLV, MKV
Supported extensions: .mp4, .avi, .mov, .wmv, .flv, .mkv
Recommended: MP4 with H.264 codec for best compatibility and performance.
No resolution or frame rate limitations — CASTLE processes whatever the video contains.
Intermediate Files¶
Label Files (.npz)¶
Created by the Label ROI step. Stored in project/label/{video_name}/.
import numpy as np
data = np.load('0.npz')
data['frame'] # np.ndarray, shape (H, W, 3), dtype uint8 — RGB frame
data['mask'] # np.ndarray, shape (H, W, N_ROIS), dtype uint8 — per-ROI masks
- Filename is the frame index (e.g., 0.npz, 247.npz)
- Each ROI is a separate channel in the mask array
- ROI colors are assigned sequentially
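A label file matching this layout can be written and inspected with plain NumPy; a minimal sketch (the dimensions and ROI regions below are illustrative, not from CASTLE):

```python
import numpy as np

# Illustrative dimensions: a 480x640 frame with 2 ROIs
H, W, N_ROIS = 480, 640, 2

frame = np.zeros((H, W, 3), dtype=np.uint8)      # RGB frame
mask = np.zeros((H, W, N_ROIS), dtype=np.uint8)  # one channel per ROI
mask[100:200, 150:250, 0] = 1                    # ROI 1 region
mask[300:400, 350:450, 1] = 1                    # ROI 2 region

# Filename is the frame index, per the convention above
np.savez('0.npz', frame=frame, mask=mask)

data = np.load('0.npz')
print(data['frame'].shape, data['mask'].shape)   # (480, 640, 3) (480, 640, 2)
```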
Tracked Masks (mask_list.h5)¶
Created by the Tracking step. Stored in project/track/{video_name}/.
HDF5 file using H5IO wrapper:
from castle.utils.h5_io import H5IO
tracker = H5IO('mask_list.h5')
mask = tracker[frame_index] # np.ndarray, shape (H, W), dtype uint8
n_frames = len(tracker) # Total number of frames
n_rois = tracker.get_n_rois() # Number of tracked ROIs
- Each frame is stored as a gzip-compressed dataset keyed by frame index (string)
- Metadata keys: total_frames, n_rois
- Mask values encode ROI IDs as pixel colors
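The on-disk layout described above can be mirrored with h5py directly; H5IO is CASTLE's own wrapper, so this is only a sketch of the structure (storing the metadata as root attributes is an assumption, and the wrapper may organize it differently):

```python
import h5py
import numpy as np

# Write a file mirroring the described layout: one gzip-compressed
# dataset per frame, keyed by the frame index as a string
with h5py.File('mask_list.h5', 'w') as f:
    for i in range(3):
        # Pixel values encode ROI IDs (0 = background)
        mask = np.full((64, 64), i % 2, dtype=np.uint8)
        f.create_dataset(str(i), data=mask, compression='gzip')
    # Metadata keys as root attributes (assumed placement)
    f.attrs['total_frames'] = 3
    f.attrs['n_rois'] = 1

# Read one frame back by its string key
with h5py.File('mask_list.h5', 'r') as f:
    mask0 = f['0'][:]
    print(mask0.shape, int(f.attrs['total_frames']))  # (64, 64) 3
```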
Cropped Video (.mp4)¶
Created by Extract Crop Video. Stored in project/crop/{video_name}/.
- Filename: {video_basename}_ROI_{id}_crop.mp4
- Standard MP4 video of the aligned/cropped ROI
Output Files¶
Latent Features (.npz)¶
Created by Extract Latent. Stored in project/latent/{model_name}/.
import numpy as np
data = np.load('video_ROI_1_dinov2_vitb14_reg4_pretrain.npz')
data['latent'] # np.ndarray, shape (n_frames, feature_dim), dtype float32
- Feature dimensions: 768 (ViT-B models) or 1024 (ViT-L models)
- Frames with empty masks → NaN vectors
- Filename pattern: {video}_ROI_{roi_id}_{model}_{tags}.npz
- Tags: ctr (centered), rmbg (background removed)
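Because empty-mask frames yield NaN vectors, downstream code typically filters them before analysis. A minimal sketch with a synthetic latent array (the values and the NaN frame are invented for illustration):

```python
import numpy as np

# Synthetic latent features: 5 frames, 768-dim (ViT-B), frame 2 empty
latent = np.random.rand(5, 768).astype(np.float32)
latent[2] = np.nan  # frame with an empty mask

valid = ~np.isnan(latent).any(axis=1)  # boolean mask of usable frames
clean = latent[valid]
print(valid.sum(), clean.shape)        # 4 (4, 768)
```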
Cluster ID Mapping (id.csv)¶
Created by Submit in Behavior Microscope. Stored in project/cluster/.
| Column | Type | Description |
|---|---|---|
| Id | int | Cluster numeric ID |
| Name | string | Human-assigned behavior name |
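A file with these columns can be read into a lookup table with the standard library; a hedged sketch (the sample rows are invented):

```python
import csv

# Write a sample id.csv (rows are illustrative)
with open('id.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Id', 'Name'])
    writer.writerow([0, 'grooming'])
    writer.writerow([1, 'rearing'])

# Read it back into an int -> name mapping
with open('id.csv', newline='') as f:
    id_to_name = {int(row['Id']): row['Name'] for row in csv.DictReader(f)}
print(id_to_name)  # {0: 'grooming', 1: 'rearing'}
```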
Time Series (time_series.csv)¶
Frame-by-frame behavioral state assignments. Stored in project/cluster/.
| Column | Type | Description |
|---|---|---|
| (index) | int | Frame index |
| behavior | int | Cluster ID for this frame |
- -1 indicates unclassified / noise frames
- When time_window > 1, values are repeated for each frame in the window (expanded to per-frame resolution)
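Given this per-frame format, behavior bouts can be recovered by run-length encoding the behavior column; a sketch with an invented sequence:

```python
from itertools import groupby

# Per-frame cluster IDs as they would appear in time_series.csv
# (-1 = unclassified); the sequence is illustrative
behavior = [0, 0, 0, 1, 1, -1, -1, 0, 0]

# Collapse consecutive identical IDs into (id, start_frame, length) bouts
bouts, frame = [], 0
for cid, run in groupby(behavior):
    length = len(list(run))
    if cid != -1:  # skip unclassified stretches
        bouts.append((cid, frame, length))
    frame += length
print(bouts)  # [(0, 0, 3), (1, 3, 2), (0, 7, 2)]
```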
SRT Subtitles (.srt)¶
Standard subtitle format for video overlay. Generated per-video.
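One way to produce such an overlay from per-frame labels is to emit one SRT cue per behavior bout; a minimal sketch, not CASTLE's implementation (the frame rate and labels are assumptions):

```python
def srt_time(seconds):
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f'{h:02d}:{m:02d}:{s:02d},{ms:03d}'

def to_srt(bouts, fps=30.0):
    """bouts: list of (label, start_frame, end_frame_exclusive) tuples."""
    blocks = []
    for i, (label, start, end) in enumerate(bouts, 1):
        blocks.append(
            f'{i}\n{srt_time(start / fps)} --> {srt_time(end / fps)}\n{label}\n'
        )
    return '\n'.join(blocks)

print(to_srt([('grooming', 0, 90), ('rearing', 90, 150)]))
```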
Embedding NPZ¶
Saved UMAP coordinates and cluster labels. Stored in project/cluster/.
import numpy as np
data = np.load('cluster_grooming_rearing_.npz')
data['emb'] # np.ndarray, shape (n_samples, 2) — UMAP 2D coordinates
data['cls'] # np.ndarray, shape (n_samples,), dtype int16 — cluster IDs
data['config'] # UMAP configuration used
- NaN in emb → frame excluded from analysis
- -1 in cls → unclassified frame
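Combining these two conventions, per-cluster sample counts can be computed after dropping NaN embeddings and unclassified frames; a sketch with synthetic arrays (shapes mirror the format above, values are invented):

```python
import numpy as np

# Synthetic embedding and cluster labels for 6 frames
emb = np.random.rand(6, 2)
emb[4] = np.nan                                  # frame excluded from analysis
cls = np.array([0, 0, 1, -1, 0, 1], dtype=np.int16)

# Keep frames with finite embeddings and an assigned cluster
valid = ~np.isnan(emb).any(axis=1) & (cls != -1)
ids, counts = np.unique(cls[valid], return_counts=True)
print(dict(zip(ids.tolist(), counts.tolist())))  # {0: 2, 1: 2}
```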
Project Directory Layout¶
Complete structure after a full analysis:
projects/my-project/
├── config.json
├── sources/
│ ├── video1.mp4
│ └── video2.mp4
├── label/
│ ├── video1.mp4/
│ │ ├── 0.npz
│ │ └── 247.npz
│ └── video2.mp4/
│ └── 0.npz
├── track/
│ ├── video1.mp4/
│ │ └── mask_list.h5
│ └── video2.mp4/
│ └── mask_list.h5
├── crop/
│ └── video1.mp4/
│ └── video1_ROI_1_crop.mp4
├── latent/
│ └── dinov2_vitb14_reg4_pretrain/
│ ├── video1_ROI_1_dinov2_vitb14_reg4_pretrain.npz
│ └── video2_ROI_1_dinov2_vitb14_reg4_pretrain.npz
└── cluster/
├── id.csv
├── time_series.csv
└── cluster_grooming_rearing_.npz