Step 4: Behavior Analysis
The 4. Behavior Microscope tab is where CASTLE's core analysis happens — transforming latent features into interpretable behavioral categories through dimensionality reduction and clustering.
Overview
The analysis workflow has four stages: generate a UMAP embedding, cluster it, label the clusters, and submit the results.
This is an iterative, hierarchical process. You start with broad categories (low magnification) and progressively zoom in to discover finer behavioral syllables.
Getting Started
Initialize
- Switch to the 4. Behavior Microscope tab
- In the Input Setting accordion, configure:
| Parameter | Description | Default |
|---|---|---|
| Select Visual Model | Must match the model used in Step 3 | dinov3_vitb16 |
| Enter ROI ID | Comma-separated list (e.g., 1 or 1,2,3) | 1 |
| Time window (frame) | Number of frames to aggregate per data point | 1 |
- Click Initialize
Time Window
A time window of 1 means each data point represents a single frame. Higher values (e.g., 5 or 10) aggregate consecutive frames, which can smooth noise and capture temporal patterns but reduces temporal resolution.
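To make the trade-off concrete, here is a minimal sketch of what aggregating frames into time windows could look like. It assumes non-overlapping windows, mean aggregation, and dropping any trailing partial window; none of these are confirmed as CASTLE's actual behavior, and `aggregate_windows` is a hypothetical name.

```python
def aggregate_windows(frames, window):
    """Average consecutive, non-overlapping groups of `window` feature vectors.

    Assumption: mean aggregation; a trailing partial window is dropped.
    """
    points = []
    for start in range(0, len(frames) - window + 1, window):
        chunk = frames[start:start + window]
        # Column-wise mean across the frames in this window.
        points.append([sum(column) / window for column in zip(*chunk)])
    return points


# Ten frames with two features each (toy stand-in for latent features).
frames = [[float(i), float(2 * i)] for i in range(10)]

print(len(aggregate_windows(frames, 1)))  # one data point per frame
print(len(aggregate_windows(frames, 5)))  # one data point per five frames
```

With a window of 5, ten frames collapse into two data points, which is the loss of temporal resolution described above.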
UMAP Configuration
CASTLE provides magnification presets that control how the UMAP dimensionality reduction is performed. The key idea: different n_neighbors values reveal structure at different scales.
Presets
Low Magnification (Single-Stage UMAP)
Broad behavioral categories. Single UMAP step reducing directly to 2D.
Available presets with different n_neighbors values:
| Preset | n_neighbors | Use Case |
|---|---|---|
| Low-magnification objective 1000 | 1000 | Very broad categories, large datasets |
| Low-magnification objective 500 | 500 | Broad categories |
| Low-magnification objective 300 | 300 | Moderate categories |
| Low-magnification objective 100 | 100 | Default starting point |
| Low-magnification objective 50 | 50 | Finer categories |
| Low-magnification objective 25 | 25 | Fine categories, small datasets |
Configuration format:
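A single-stage config presumably uses the same stage format as the two-stage presets, with one entry that reduces directly to 2D. The sketch below builds such a config under that assumption; `min_dist` of 0.0 is carried over from the other presets and is not confirmed for the low-magnification ones. `single_stage_config` is a hypothetical helper.

```python
import json


def single_stage_config(n_neighbors, min_dist=0.0):
    """Build a one-stage UMAP config that reduces straight to 2D.

    Assumption: same keys as the two-stage presets, min_dist defaulting to 0.0.
    """
    return [{"n_neighbors": n_neighbors, "min_dist": min_dist, "n_components": 2}]


# JSON text for the "Low-magnification objective 100" preset, under the assumption above.
print(json.dumps(single_stage_config(100), indent=2))
```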
Intermediate Magnification (Two-Stage UMAP)
Two-step reduction: first to 5D, then to 2D. Captures more structure than single-stage.
| Preset | Stage 1 n_neighbors | Stage 2 n_neighbors |
|---|---|---|
| Intermediate (1000, 500) | 1000 | 500 |
| Intermediate (500, 300) | 500 | 300 |
| Intermediate (300, 100) | 300 | 100 |
| Intermediate (100, 50) | 100 | 50 |
| Intermediate (50, 25) | 50 | 25 |
Configuration format:
[
{"n_neighbors": 300, "min_dist": 0.0, "n_components": 5},
{"n_neighbors": 100, "min_dist": 0.0, "n_components": 2}
]
High Magnification (Two-Stage, Higher Initial Dimension)
Two-step reduction: first to 10D, then to 2D. Preserves the most structure for fine-grained analysis.
| Preset | Stage 1 n_neighbors | Stage 2 n_neighbors |
|---|---|---|
| High (1000, 500) | 1000 | 500 |
| High (500, 300) | 500 | 300 |
| High (300, 100) | 300 | 100 |
| High (100, 50) | 100 | 50 |
| High (50, 25) | 50 | 25 |
Configuration format:
[
{"n_neighbors": 300, "min_dist": 0.0, "n_components": 10},
{"n_neighbors": 100, "min_dist": 0.0, "n_components": 2}
]
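One way to read a multi-stage config is as a chain, where each stage's output becomes the next stage's input. The sketch below assumes that interpretation and uses `umap.UMAP` from the umap-learn package; `run_stages` and `umap_reducer` are hypothetical names for illustration, not CASTLE functions.

```python
def run_stages(features, config, make_reducer):
    """Apply each stage in order, feeding one stage's output into the next."""
    data = features
    for stage in config:
        data = make_reducer(**stage).fit_transform(data)
    return data


def umap_reducer(**params):
    """Create a UMAP reducer for one stage (requires the umap-learn package)."""
    import umap  # imported lazily so the rest of the sketch has no dependency

    return umap.UMAP(**params)


# The "High (300, 100)" preset: first to 10D, then to 2D.
HIGH_300_100 = [
    {"n_neighbors": 300, "min_dist": 0.0, "n_components": 10},
    {"n_neighbors": 100, "min_dist": 0.0, "n_components": 2},
]

# Hypothetical usage, where latent_features is an (n_points, n_features) array:
# embedding = run_stages(latent_features, HIGH_300_100, umap_reducer)
```

Passing the reducer factory as an argument keeps the chaining logic separate from UMAP itself, so the same loop works for one, two, or more stages.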
Custom Configuration
You can edit the UMAP config JSON directly for full control. The format is a list of UMAP stages, each with:
- n_neighbors: number of nearest neighbors (larger = broader structure)
- min_dist: minimum distance between points in the embedding (0.0 for clustering)
- n_components: output dimensions for that stage
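When editing the JSON by hand, it can help to check it before pasting it in. The following is a minimal sketch of such a check. It assumes all three keys are required in every stage, which the presets suggest but which is not confirmed; the range checks follow UMAP's own constraints (n_neighbors of at least 2, non-negative min_dist). `parse_umap_config` is a hypothetical helper.

```python
import json

REQUIRED_KEYS = {"n_neighbors", "min_dist", "n_components"}


def parse_umap_config(text):
    """Parse a UMAP config string and reject obviously malformed stages."""
    stages = json.loads(text)
    if not isinstance(stages, list) or not stages:
        raise ValueError("config must be a non-empty JSON list of stages")
    for number, stage in enumerate(stages, start=1):
        if not isinstance(stage, dict):
            raise ValueError(f"stage {number} must be a JSON object")
        missing = REQUIRED_KEYS - stage.keys()
        if missing:
            raise ValueError(f"stage {number} is missing {sorted(missing)}")
        if stage["n_neighbors"] < 2 or stage["n_components"] < 1 or stage["min_dist"] < 0:
            raise ValueError(f"stage {number} has an out-of-range value")
    return stages


config_text = """
[
  {"n_neighbors": 300, "min_dist": 0.0, "n_components": 5},
  {"n_neighbors": 100, "min_dist": 0.0, "n_components": 2}
]
"""
print(len(parse_umap_config(config_text)), "stages parsed")
```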
Running the Analysis
1. Generate Embedding
- Select a cluster from the Select Cluster dropdown (starts with init, the full dataset)
- Choose a UMAP preset or edit the config manually
- Click Generate Embedding
The UMAP scatter plot appears on the right. Each point represents a data point (frame or time window).

Interactive Exploration
Click on any point in the UMAP plot to see the corresponding video frame. This helps you understand what each region of the embedding represents.
2. Cluster the Embedding
- Set the epsilon-neighborhood radius (eps), which controls cluster granularity
  - Smaller eps → more clusters (finer categories)
  - Larger eps → fewer clusters (broader categories)
  - Range: 0.1 to 10.0 (default: 1.0)
- Click Generate Cluster
The plot updates with colors indicating cluster assignments.
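The effect of eps can be illustrated with a toy example. The clustering algorithm CASTLE uses is not named here; an epsilon-neighborhood radius suggests a density-based method such as DBSCAN. The sketch below is a simplified stand-in (it links any two points within eps of each other, with no minimum-samples or noise handling), so treat it as an illustration of the eps trade-off only. `eps_clusters` is a hypothetical name.

```python
from math import dist


def eps_clusters(points, eps):
    """Label points so that points connected by hops of at most eps share a cluster."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue  # already assigned to an earlier cluster
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


points = [(0.0, 0.0), (0.5, 0.0), (1.2, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(len(set(eps_clusters(points, 0.6))))  # smaller eps: more clusters
print(len(set(eps_clusters(points, 1.0))))  # larger eps: fewer clusters
```

On this toy data the smaller radius splits the left-hand group in two, while the larger radius merges it back into one cluster.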
3. Label Clusters
For each cluster you want to name:
- Enter the Cluster ID (number shown in the plot)
- Enter a Cluster Name (e.g., "grooming", "rearing", "locomotion")
- Click Enter
Tip
Click on points within each cluster to view representative frames. This helps you identify what behavior each cluster represents.
4. Submit
Click Submit to:
- Import the labeled clusters into the main analysis
- Generate a syllable plot (ethogram)
- Export CSV files (behavior IDs and time series)
- Generate SRT subtitle files for video overlay
- Save the embedding data
Hierarchical Analysis
The power of CASTLE's "Behavior Microscope" comes from iterative refinement:
- Start broad: use Low Magnification to identify major behavioral categories
- Zoom in: select a specific cluster, then re-run UMAP at Intermediate or High Magnification
- Refine: each cluster can be further subdivided into finer syllables
- Repeat: continue until you reach the desired granularity
This mirrors how a microscope works — you start with low magnification to find areas of interest, then zoom in for detail.
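The zoom-in step amounts to restricting the analysis to the points of one cluster and then writing the finer labels back. The sketch below shows that bookkeeping on toy data. The function names and the (parent, child) label pairs are hypothetical, and the sub-cluster labels are hard-coded where a real run would re-embed and re-cluster the subset.

```python
def zoom_in(features, labels, cluster_id):
    """Return the original indices and feature rows assigned to cluster_id."""
    indices = [i for i, label in enumerate(labels) if label == cluster_id]
    return indices, [features[i] for i in indices]


def merge_sublabels(labels, indices, sublabels, cluster_id):
    """Write sub-cluster labels back as (parent, child) pairs.

    Points outside the refined cluster keep (label, None).
    """
    merged = [(label, None) for label in labels]
    for i, sub in zip(indices, sublabels):
        merged[i] = (cluster_id, sub)
    return merged


features = [[0.1], [0.2], [5.0], [5.1], [9.0]]
labels = [0, 0, 1, 1, 1]  # low-magnification cluster assignments

indices, subset = zoom_in(features, labels, cluster_id=1)
sublabels = [0, 0, 1]  # hypothetical result of re-embedding and re-clustering `subset`
print(merge_sublabels(labels, indices, sublabels, cluster_id=1))
```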

Outputs
After submitting, the following are generated:
| Output | Format | Description |
|---|---|---|
| Syllable Plot | Interactive plot | Timeline of behavioral states |
| Behavior ID CSV | .csv | Mapping of cluster IDs to names |
| Time Series CSV | .csv | Frame-by-frame cluster assignments |
| SRT Subtitles | .srt | Behavioral labels as video subtitles |
| Embedding NPZ | .npz | UMAP coordinates and cluster labels |
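To show how frame-by-frame labels relate to subtitle files, here is a minimal sketch that turns a list of per-frame behavior names into SRT text. It is not CASTLE's exporter: the frame rate, the function names, and the choice to merge consecutive identical labels into one subtitle are all assumptions.

```python
from itertools import groupby


def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def labels_to_srt(frame_labels, fps):
    """Merge runs of identical labels into numbered SRT entries."""
    entries = []
    frame = 0
    for index, (name, run) in enumerate(groupby(frame_labels), start=1):
        length = len(list(run))
        start, end = frame / fps, (frame + length) / fps
        entries.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{name}\n")
        frame += length
    return "\n".join(entries)


# One second of grooming followed by half a second of rearing at an assumed 30 fps.
frame_labels = ["grooming"] * 30 + ["rearing"] * 15
print(labels_to_srt(frame_labels, fps=30))
```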
Tips
- Start with Low Magnification 100 as your first exploration
- eps = 1.0 is a good starting point for clustering
- If clusters are too noisy, try a larger n_neighbors value
- If behaviors are merged together, try higher magnification or smaller eps
Next Step
Once you've identified and labeled behavioral clusters, proceed to Step 5: Export Results.