Step 4: Behavior Analysis¶

The 4. Behavior Microscope tab is where CASTLE's core analysis happens: it turns latent features into interpretable behavioral categories through dimensionality reduction (UMAP) and clustering (DBSCAN).

Overview¶

The analysis workflow:

Latent Vectors → Initialize → Select Cluster → UMAP Embedding → DBSCAN Clustering → Label & Submit

This is an iterative, hierarchical process. You start with broad categories (low magnification) and progressively zoom in to discover finer behavioral syllables.

Getting Started¶

Initialize¶

Switch to the 4. Behavior Microscope tab

In the Input Setting accordion, configure:

Parameter	Description	Default
Select Visual Model	Must match the model used in Step 3	`dinov3_vitb16`
Enter ROI ID	Comma-separated list (e.g., `1` or `1,2,3`)	`1`
Time window (frame)	Number of frames to aggregate per data point	`1`

Click Initialize
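As an illustration of the Enter ROI ID format, the sketch below parses a comma-separated field into a list of integers. The function name `parse_roi_ids` is hypothetical, and treating ROI IDs as integers is an assumption based on the examples in the table; CASTLE's own parsing may differ.

```python
def parse_roi_ids(text):
    """Parse a comma-separated ROI field such as "1" or "1,2,3" into ints."""
    ids = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas and stray spaces
        roi = int(token)  # raises ValueError on non-numeric input
        if roi not in ids:
            ids.append(roi)  # drop duplicates, keep the original order
    if not ids:
        raise ValueError("at least one ROI ID is required")
    return ids


print(parse_roi_ids("1"))
print(parse_roi_ids("1, 2,3"))
```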

Time Window

A time window of 1 means each data point represents a single frame. Higher values (e.g., 5 or 10) aggregate consecutive frames, which can smooth noise and capture temporal patterns at the cost of temporal resolution.
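To make the trade-off concrete, here is a minimal sketch of one possible aggregation: averaging per-frame latent vectors over non-overlapping windows. This page does not say how CASTLE aggregates frames (mean, concatenation, overlapping windows, and so on), so the averaging rule and the name `aggregate_windows` are assumptions for illustration only.

```python
def aggregate_windows(latents, window):
    """Average consecutive frames in non-overlapping windows.

    latents: list of per-frame feature vectors (lists of floats).
    window:  frames per data point; 1 keeps one data point per frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    points = []
    # Step through the sequence one window at a time; a short final
    # window is averaged over the frames it actually contains.
    for start in range(0, len(latents), window):
        chunk = latents[start:start + window]
        dim = len(chunk[0])
        mean = [sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)]
        points.append(mean)
    return points


frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0]]
print(len(aggregate_windows(frames, 1)))  # one data point per frame
print(aggregate_windows(frames, 2))       # fewer, smoother data points
```

With a window of 2, five frames collapse into three data points, which is the loss of temporal resolution described above.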

UMAP Configuration¶

CASTLE provides magnification presets that control how the UMAP dimensionality reduction is performed. The key idea: different n_neighbors values reveal structure at different scales.

Presets¶

Low Magnification (Single-Stage UMAP)¶

Reveals broad behavioral categories using a single UMAP step that reduces directly to 2D.

Available presets with different n_neighbors values:

Preset	n_neighbors	Use Case
Low-magnification objective 1000	1000	Very broad categories, large datasets
Low-magnification objective 500	500	Broad categories
Low-magnification objective 300	300	Moderate categories
Low-magnification objective 100	100	Default starting point
Low-magnification objective 50	50	Finer categories
Low-magnification objective 25	25	Fine categories, small datasets

Configuration format:

[
    {
        "n_neighbors": 100,
        "min_dist": 0.0,
        "n_components": 2
    }
]

Intermediate Magnification (Two-Stage UMAP)¶

Two-step reduction: first to 5D, then to 2D. The intermediate 5D stage captures more structure than a single-stage reduction.

Preset	Stage 1 n_neighbors	Stage 2 n_neighbors
Intermediate (1000, 500)	1000	500
Intermediate (500, 300)	500	300
Intermediate (300, 100)	300	100
Intermediate (100, 50)	100	50
Intermediate (50, 25)	50	25

Configuration format:

[
    {"n_neighbors": 300, "min_dist": 0.0, "n_components": 5},
    {"n_neighbors": 100, "min_dist": 0.0, "n_components": 2}
]

High Magnification (Two-Stage, Higher Initial Dimension)¶

Two-step reduction: first to 10D, then to 2D. Of the three preset families, this one preserves the most structure, which suits fine-grained analysis.

Preset	Stage 1 n_neighbors	Stage 2 n_neighbors
High (1000, 500)	1000	500
High (500, 300)	500	300
High (300, 100)	300	100
High (100, 50)	100	50
High (50, 25)	50	25

Configuration format:

[
    {"n_neighbors": 300, "min_dist": 0.0, "n_components": 10},
    {"n_neighbors": 100, "min_dist": 0.0, "n_components": 2}
]
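The three preset families differ only in the number of stages and the first-stage output dimension, so their configs can be generated programmatically. The sketch below reproduces the config format shown above; the helper `build_umap_config` and the lowercase family names are hypothetical and not part of CASTLE.

```python
import json

# First-stage output dimension per preset family, as listed above.
# None means a single stage that reduces straight to 2D.
FIRST_STAGE_DIMS = {"low": None, "intermediate": 5, "high": 10}


def build_umap_config(magnification, n_neighbors):
    """Return a list of UMAP stage dicts in the documented config format.

    n_neighbors: one int for "low", or a (stage1, stage2) pair otherwise.
    """
    first_dim = FIRST_STAGE_DIMS[magnification]
    if first_dim is None:
        return [{"n_neighbors": n_neighbors, "min_dist": 0.0, "n_components": 2}]
    stage1, stage2 = n_neighbors
    return [
        {"n_neighbors": stage1, "min_dist": 0.0, "n_components": first_dim},
        {"n_neighbors": stage2, "min_dist": 0.0, "n_components": 2},
    ]


# Text that could be pasted into the UMAP config field.
print(json.dumps(build_umap_config("high", (300, 100)), indent=4))
```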

Custom Configuration¶

You can edit the UMAP config JSON directly for full control. The format is a list of UMAP stages, each with:

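n_neighbors: number of nearest neighbors considered for each point (larger values emphasize broader, more global structure)
min_dist: minimum distance between points in the embedding (0.0 lets points pack tightly, which suits density-based clustering)
n_components: number of output dimensions for that stage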
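Before pasting a hand-edited config, it can help to check it for structural mistakes. The following is a rough validation sketch, not CASTLE's own check: the function name is hypothetical, and the rule that the last stage must output 2 components is an assumption drawn from the presets, all of which end in a 2D stage.

```python
import json

REQUIRED_KEYS = ("n_neighbors", "min_dist", "n_components")


def validate_umap_config(text):
    """Parse a UMAP config JSON string and return its list of stages."""
    stages = json.loads(text)
    if not isinstance(stages, list) or not stages:
        raise ValueError("config must be a non-empty JSON list of stages")
    for number, stage in enumerate(stages, start=1):
        if not isinstance(stage, dict):
            raise ValueError(f"stage {number} must be a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in stage]
        if missing:
            raise ValueError(f"stage {number} is missing {missing}")
        if stage["n_neighbors"] < 2:
            raise ValueError(f"stage {number}: n_neighbors must be at least 2")
        if stage["min_dist"] < 0:
            raise ValueError(f"stage {number}: min_dist must not be negative")
        if stage["n_components"] < 1:
            raise ValueError(f"stage {number}: n_components must be at least 1")
    # Assumption: the scatter plot is 2D, so the final stage should output 2D.
    if stages[-1]["n_components"] != 2:
        raise ValueError("the last stage should reduce to 2 components")
    return stages


custom = """
[
    {"n_neighbors": 200, "min_dist": 0.0, "n_components": 8},
    {"n_neighbors": 50, "min_dist": 0.0, "n_components": 2}
]
"""
print(len(validate_umap_config(custom)), "stages look well-formed")
```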

Running the Analysis¶

1. Generate Embedding¶

Select a cluster from the Select Cluster dropdown (starts with init — the full dataset)
Choose a UMAP preset or edit the config manually
Click Generate Embedding

The UMAP scatter plot appears on the right. Each point represents a data point (frame or time window).

[Figure: UMAP embedding]

Interactive Exploration

Click on any point in the UMAP plot to see the corresponding video frame. This helps you understand what each region of the embedding represents.
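Conceptually, a multi-stage config chains reductions: the output of one stage becomes the input of the next. The sketch below shows that idea using `umap.UMAP` from the umap-learn package. It is an assumption about how the stages are applied, not CASTLE's actual code, and `run_umap_stages` and `make_reducer` are hypothetical names.

```python
def run_umap_stages(data, stages, make_reducer=None):
    """Apply each config stage in order, feeding its output to the next.

    data:         array-like of shape (n_points, n_features)
    stages:       list of dicts with n_neighbors, min_dist, n_components
    make_reducer: factory called with a stage's parameters; it must return
                  an object with fit_transform(). Defaults to umap.UMAP.
    """
    if make_reducer is None:
        # Imported lazily so the chaining logic can be tried with a
        # stand-in reducer when umap-learn is not installed.
        import umap  # provided by the umap-learn package
        make_reducer = umap.UMAP
    embedding = data
    for stage in stages:
        reducer = make_reducer(**stage)
        embedding = reducer.fit_transform(embedding)
    return embedding


# Usage with umap-learn installed (latents is a NumPy array of features):
#   config = [
#       {"n_neighbors": 300, "min_dist": 0.0, "n_components": 5},
#       {"n_neighbors": 100, "min_dist": 0.0, "n_components": 2},
#   ]
#   xy = run_umap_stages(latents, config)   # shape (n_points, 2)
```

Passing the reducer in as a factory keeps the chaining logic separate from the library, which makes it easy to test or to swap in a different reducer.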

2. Cluster the Embedding¶

Set the epsilon-neighborhood radius (eps), which controls cluster granularity:
- Smaller eps → more clusters (finer categories); points in sparse regions may be left unassigned as noise
- Larger eps → fewer clusters (broader categories)
- Range: 0.1 to 10.0 (default: 1.0)
Click Generate Cluster

The plot updates with colors indicating cluster assignments.
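To see why eps changes the number of clusters, the sketch below implements a small brute-force DBSCAN in pure Python and runs it on two tight groups of points. It is for illustration only: CASTLE presumably relies on a library implementation, and `min_samples` is an assumed parameter that this page does not mention.

```python
import math


def dbscan(points, eps, min_samples=3):
    """Label each point with a cluster ID (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Brute-force neighbor lists; each point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep growing
    return labels


group_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
group_b = [(1.0, 0.0), (1.1, 0.0), (1.0, 0.1), (1.1, 0.1)]
for eps in (0.2, 1.0):
    labels = dbscan(group_a + group_b, eps)
    print(eps, len(set(labels) - {-1}), "cluster(s)")
```

With the small radius the two groups stay separate; with the large radius the neighborhoods bridge the gap and the groups merge into one cluster.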

3. Label Clusters¶

For each cluster you want to name:

Enter the Cluster ID (number shown in the plot)
Enter a Cluster Name (e.g., "grooming", "rearing", "locomotion")
Click Enter

Tip

Click on points within each cluster to view representative frames. This helps you identify what behavior each cluster represents.

4. Submit¶

Click Submit to:

Import the labeled clusters into the main analysis
Generate a syllable plot (ethogram)
Export CSV files (behavior IDs and time series)
Generate SRT subtitle files for video overlay
Save the embedding data
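For readers curious how frame-by-frame labels can become subtitles, here is a minimal sketch that merges runs of identical labels into SRT cues. The frame rate, the function names, and the one-cue-per-bout layout are assumptions; the files CASTLE writes may be organized differently.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def labels_to_srt(frame_labels, fps):
    """Merge runs of identical labels into numbered SRT cues."""
    cues = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current run at the end of the list or on a label change.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            cues.append((start, i, frame_labels[start]))
            start = i
    blocks = []
    for number, (first, end, name) in enumerate(cues, start=1):
        span = f"{srt_timestamp(first / fps)} --> {srt_timestamp(end / fps)}"
        blocks.append(f"{number}\n{span}\n{name}\n")
    return "\n".join(blocks)


# One second of rearing followed by 1.5 seconds of grooming at 30 fps.
print(labels_to_srt(["rearing"] * 30 + ["grooming"] * 45, fps=30))
```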

Hierarchical Analysis¶

The power of CASTLE's "Behavior Microscope" comes from iterative refinement:

Start broad: use Low Magnification to identify major behavioral categories
Zoom in: select a specific cluster, then re-run UMAP at Intermediate or High Magnification
Refine: each cluster can be further subdivided into finer syllables
Repeat: continue until you reach the desired granularity

This mirrors how a microscope works — you start with low magnification to find areas of interest, then zoom in for detail.

[Figure: Hierarchical classification]
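The zoom-in step can be pictured as replacing one cluster's label with finer sub-labels while leaving every other point untouched. The sketch below shows that bookkeeping on plain lists; the function `refine_labels` and the `parent/child` naming scheme are hypothetical and only illustrate the idea.

```python
def refine_labels(labels, parent, sub_labels):
    """Replace one parent cluster with its sub-clusters.

    labels:     per-point names, e.g. ["locomotion", "grooming", ...]
    parent:     the cluster that was re-embedded and re-clustered
    sub_labels: names for that cluster's points, in their original order
    """
    members = [i for i, name in enumerate(labels) if name == parent]
    if len(members) != len(sub_labels):
        raise ValueError("sub_labels must cover every point in the parent")
    refined = list(labels)  # leave the input list unchanged
    for index, sub in zip(members, sub_labels):
        refined[index] = f"{parent}/{sub}"  # hypothetical path-style name
    return refined


broad = ["locomotion", "grooming", "locomotion", "locomotion"]
fine = refine_labels(broad, "locomotion", ["walk", "run", "walk"])
print(fine)
```

Repeating the call on a refined label (for example `"locomotion/walk"`) gives the next level of the hierarchy.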

Outputs¶

After submitting, the following are generated:

Output	Format	Description
Syllable Plot	Interactive plot	Timeline of behavioral states
Behavior ID CSV	`.csv`	Mapping of cluster IDs to names
Time Series CSV	`.csv`	Frame-by-frame cluster assignments
SRT Subtitles	`.srt`	Behavioral labels as video subtitles
Embedding NPZ	`.npz`	UMAP coordinates and cluster labels
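As an example of downstream use, the sketch below computes the fraction of frames spent in each behavior from a frame-by-frame CSV. The column names `frame` and `behavior` and the sample rows are made up for illustration; check the header of the exported Time Series CSV and adjust `label_column` accordingly.

```python
import csv
import io
from collections import Counter


def time_budget(csv_file, label_column="behavior"):
    """Return the fraction of rows (frames) assigned to each label."""
    counts = Counter(row[label_column] for row in csv.DictReader(csv_file))
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Made-up sample standing in for an exported file; with a real export,
# pass open(path, newline="") instead of the StringIO object.
sample = io.StringIO("frame,behavior\n0,rearing\n1,rearing\n2,grooming\n3,rearing\n")
print(time_budget(sample))
```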

Tips¶

Start with Low Magnification 100 as your first exploration
eps = 1.0 is a good starting point for clustering
If clusters are too noisy, try a larger n_neighbors value
If behaviors are merged together, try higher magnification or smaller eps

Next Step¶

Once you've identified and labeled behavioral clusters, proceed to Step 5: Export Results.