Skip to content

Comparative-Ethology Gap-Closure Scoping — Three Concrete Spike Protocols

Date: 2026-05-13 Branch: research/comparative-ethology-age-sociality Author: Subagent gap-closure scoping for srmech Status: Research notes only (no PR) Prior: comparative_ethology_age_sociality_literature_review_2026-05-13.md


Citation corrections vs prior lit review

PDF-verification (per project discipline) caught two errors in the prior lit-review's reference list. Carrying them forward here:

  • Silk, Finn, Porter & Pinter-Wollman (2018) — the prior review listed this as Current Zoology 64(1):27-44, DOI 10.1093/cz/zox071. That DOI resolves to a different paper (Bierbach, Arias-Rodriguez, Plath 2018 on sulfide-adapted fish). Correct citation: Trends in Ecology & Evolution 33(6):376–378, DOI 10.1016/j.tree.2018.03.008. The longer companion methods piece is Finn, Silk, Porter & Pinter-Wollman (2019) Animal Behaviour 149:7-22, arXiv:1712.01790.
  • Turner, Bills & Holekamp (2018) "Ontogenetic change in determinants of social network position in the spotted hyena" Behav Ecol Sociobiol 72(1):11, DOI 10.1007/s00265-017-2426-x. Prior review attributed authorship to "Smith, Memenis & Holekamp"; that ordering does not match the published article. The Holekamp-lab connection is correct; the authorship is not.
  • Iacopini et al. has now been formally published (was a 2023 preprint at first-listing): Phil Trans R Soc B 379:20230190 (2024), DOI 10.1098/rstb.2023.0190. arXiv:2309.03783 confirmed.

Gap #1 — Hodge-Laplacian / Betti-number tracking across ontogenetic-stage simplicial complexes

A. The gap, as testable proposition

Proposition (G1): For a longitudinally-observed social population stratified into ontogenetic stages s ∈ {cub/fledgling, juvenile, sub-adult, adult, post-reproductive}, the Hodge decomposition of the stage-specific clique complex X_s yields a Betti-number trajectory (B_0(s), B_1(s), B_2(s)) that distinguishes species exhibiting the canonical juvenile-cooperation → adult-exclusion transition (chimpanzees, hyenas, elephants, marmots) from species lacking it (bonobos, resident orcas). Specifically: B_1 (loop content of the social complex) is predicted to be non-monotone in stage for transition-exhibiting species (peak in juvenile/sub-adult), and flat or monotone in non-transition species. Falsifiable as: if the within-species across-stage B_1 trajectory is statistically indistinguishable across all sampled species (after permutation null), the proposition is refuted.

B. Accessible datasets for first pilot

Primary candidate: Wytham/Wild et al. 2024 great tit ontogeny dataset.

  • Wild, S., Alarcón-Nieto, G., Aplin, L. (2024). "Data for: Social network data in wild great tits during ontogeny." Dryad. DOI 10.5061/dryad.x95x69ps8.
  • License: CC0 (Dryad default; user should re-verify).
  • Content: ~208 MB; RFID feeder visits across 4 seasons (summer through spring) of juvenile Parus major life; per-individual age class labels (fledgling, juvenile, adult), nestbox metadata, weekly group-by-individual matrices.
  • Critical for Gap #1: explicit ontogenetic stratification is built into the dataset's RDA files. Several thousand individual-week observation rows; precise count not published in landing page.

Secondary candidate: Mara Hyena Project CSVs (Holekamp Lab).

  • holekamplab.org/project-database.html — CSV bulk downloads; GitHub mirror at github.com/MaraHyenaProject hosts per-paper analysis scaffolding.
  • License unclear from landing page; user should email lab for confirmation before publishing pilot.
  • 27 years of observational records, multiple clans, three-stage ontogeny (communal-den, den-independent pre-reproductive, early-adult) is the framing used in Turner, Bills & Holekamp (2018) — so dataset is known to support the stratification.

Tertiary candidate: Amboseli Elephant Research Project.

  • elephanttrust.org (custodian: Amboseli Trust for Elephants). Five decades, ~3000 individuals. Public data-availability protocol is not documented in landing-page material I could verify — likely requires custodian-permission. Mark as paywalled-ish / custodian-gated.

Cayo Santiago rhesus macaques — referenced in lit review; data is collected but redistribution is gated through Caribbean Primate Research Center. Treat as conditional.

C. First-spike protocol sketch

INPUT: edge-list E = {(i, j, t, w)} with individual IDs i,j; timestamp t; weight w (interaction count / duration / RFID co-presence).
INPUT: metadata M = {i -> age_class_at_time(t)}.

STEP 1. Stage-stratify E. For each ontogenetic stage s, restrict to edges where both endpoints
  are in stage s at time t. Aggregate over a stage-appropriate window (e.g., per-season for tits;
  per-life-phase for hyenas). Output: G_s = weighted graph for stage s.

STEP 2. Threshold to get a binary skeleton. Use a principled threshold sweep (not a single cut):
  for each percentile p ∈ {0.50, 0.70, 0.85, 0.95} of edge weights, build G_s(p).
  This is the persistence parameter analogue.

STEP 3. Build the flag (Vietoris-Rips-style) clique complex X_s(p). Library: gudhi (Python) or
  TopoNetX. Triangles are 2-simplices iff all 3 pairs co-interact above threshold; tetrahedra
  similarly. Cap dimension at k=3 (computational tractability + biological interpretability:
  cliques larger than 4 rare in animal data).

STEP 4. Compute Hodge Laplacians L_0, L_1, L_2 of X_s(p). Library: HodgeLaplacians (tsitsvero,
  github.com/tsitsvero/hodgelaplacians) or TopoNetX. L_k = B_{k+1} B_{k+1}^T + B_k^T B_k where
  B_k is the k-th boundary matrix.

STEP 5. Betti numbers: B_k(s,p) = dim ker(L_k). (Equivalently, rank deficiency.)

STEP 6. Per-stage Betti trajectory: B_k(s) := median over p of B_k(s,p), with bootstrap CI.

STEP 7. Test the proposition. Compute D_1 := max_s B_1(s) - min_s B_1(s) for each species.
  Permutation null: shuffle stage labels within species, recompute D_1, repeat 1000x.
  If observed D_1 exceeds 95% of null draws *and* sign of trajectory matches juvenile-peak
  prediction, accept G1 for that species. Compare species-level outcomes (transition-exhibiting
  vs non-exhibiting) by Fisher's exact.

STEP 8. (Stretch) Hodge-decompose interaction-flow signals on edges into harmonic + gradient
  + curl components. Schaub-Benson-Horn-Lippner-Jadbabaie (2020 SIAM Rev) framework. The
  harmonic component is what survives both div=0 and curl=0 — the genuinely cyclic
  cohomological signal.

FALSIFIABILITY: if across all four transition-exhibiting species, D_1 sits inside the
  permutation null (juvenile-stage B_1 indistinguishable from adult-stage), the proposition is
  refuted — sociality transitions do not have a cycle-content signature.

Input shape: edge-list with O(104-106) rows depending on species. Output shape: 6-row table (3 Betti numbers × 2 statistical-significance flags) per species per stage.

D. Priority citations (PDF-verified)

  1. Wild, Alarcón-Nieto & Aplin (2024). "Data for: Social network data in wild great tits during ontogeny." Dryad, DOI 10.5061/dryad.x95x69ps8. VERIFIED via Dryad landing page. The data target.
  2. Turner, Bills & Holekamp (2018). "Ontogenetic change in determinants of social network position in the spotted hyena." Behav Ecol Sociobiol 72(1):11, DOI 10.1007/s00265-017-2426-x. VERIFIED via PubMed / SpringerLink summary. The three-stage stratification used.
  3. Schaub, Benson, Horn, Lippner & Jadbabaie (2020). "Random Walks on Simplicial Complexes and the Normalized Hodge 1-Laplacian." SIAM Review 62(2):353–391, DOI 10.1137/18M1201019. VERIFIED via SIAM Review listing. Methods anchor.
  4. Iacopini, Foote, Fefferman, Derryberry & Silk (2024). "Not your private tête-à-tête: leveraging the power of higher-order networks to study animal communication." Phil Trans R Soc B 379:20230190, DOI 10.1098/rstb.2023.0190 (arXiv:2309.03783). VERIFIED via arXiv + Royal Society listing. Field's only ASNA-adjacent simplicial-complex paper to date.
  5. Topaz, Ziegelmeier & Halverson (2015). "Topological data analysis of biological aggregation models." PLOS One 10(5):e0126383, DOI 10.1371/journal.pone.0126383. Verified via lit review prior; bridge case for biological-TDA precedent (flocks/swarms, not ontogeny).
  6. Woodman, Gokcekus, Beck, Green, Nussey & Firth (2024). "The ecology of ageing in wild societies." Phil Trans R Soc B 379(1916):20220464, DOI 10.1098/rstb.2022.0464. VERIFIED via PMC. Framing review.
  7. Wee & Xia (2022) persistent Laplacians on biomolecules — closest biological-Hodge precedent. Sci Rep. Citation unverified at the volume/page level in this scoping; user should resolve before citing in a paper.

E. Load-bearing risk

The likely failure mode: Animal social network data is heavy-tailed in edge weight and sparse in higher-order structure. Real triangles (genuine 3-way associations all-pairwise-elevated) may be rare enough at stage-restricted subsets that B_2 is trivially zero for all species/stages, and B_1 (cycle content) is dominated by sampling thresholds rather than biology. If the threshold sweep in STEP 2 shows B_k values that track the threshold p rather than track the stage s, the spike has measured an artefact of edge-weight tail behaviour, not a biological signal. This is mitigated by (a) within-species statistical contrasts that hold p fixed, (b) the permutation-null in STEP 7 which controls for it, and © the multi-species comparison which would be more vulnerable if real but is also more falsifying-power. Honest possibility: the spike runs, the Betti numbers are flat and threshold-driven, and the answer is "Hodge on raw ASNA is not the right lift; weighted-simplicial-Laplacian on signed flow fields would be."


Gap #2 — Base × fibre decomposition of multilayer animal social networks

A. The gap, as testable proposition

Proposition (G2): A multilayer animal social network with K interaction-type layers L_1...L_K (play, agonism, affiliation, grooming, mating) over a node set V admits a useful fibre-bundle factorisation iff there exists a base space B (life-history-stage manifold, kin-graph, or spatial habitat partition) and a projection π: V → B such that the multilayer adjacency tensor A[i,j,k] factors as π-pulled-back base structure × fibre-localised perturbation, with a residual that is statistically smaller than the layered tensor itself. Concrete operational form: test whether π(V) recovers life-history stage better than chance under unsupervised spectral coclustering of the multilayer tensor. Refuted if the spectrum of the supra-Laplacian (Sola, Romance et al. 2013 / De Domenico et al. 2013) shows no clean separation between intra-base and intra-fibre eigenvalue regimes.

A.i. Why fibre-bundle, not just multilayer

Existing multilayer ASNA (Silk et al. 2018; Finn et al. 2019) treats interaction-type layers as parallel networks with inter-layer couplings. A fibre-bundle framing is stricter: it asserts a base space B with intrinsic geometry (e.g., the totally-ordered life-history-stage manifold {0, 1, 2, 3, 4} with adjacency only between consecutive stages, or a kin-graph as a tree) and asks whether the multilayer structure is equivariant with respect to base-space transformations. This is testable; "multilayer" alone is not.

B. Accessible datasets for first pilot

Primary candidate: Same Wytham/Wild 2024 great tit dataset (Gap #1's primary).

  • Multilayer reconstruction available: RFID co-presence ≠ a single relationship; this dataset's per-week aggregations could be split into "siblings-only," "parent-tracking," and "non-kin peer" layers using the included nest-box pedigree. Three natural layers from one observation modality.

Primary alternative: Pinter-Wollman / Carbone & Pinter-Wollman ant or marmot multilayer datasets.

  • Wey & Blumstein (2010) marmot data is in supplementary of Animal Behaviour 79(6):1343–1352, DOI 10.1016/j.anbehav.2010.03.008. Lit-review-cited; redistribution status check pending. Has affiliation + agonism layers.

Secondary candidate: Sosa et al. (2020) ANTs R package companion datasets.

  • Sci Rep 10:12507, DOI 10.1038/s41598-020-69265-8. Package: github.com/SebastianSosa/ANTs. Several worked-example multilayer datasets for chimps, macaques, baboons. License of bundled data uncertain; needs check.

Speculative tertiary: Hare et al. bonobo cooperative-task experiments (2007).

  • Current Biology 17(7):619-623. Counter-example anchor for the proposition; but individual-level interaction data not obviously available as a network-shaped dataset. Likely needs author contact. Mark unconfirmed.

C. First-spike protocol sketch

INPUT: multilayer interaction tensor A[i, j, k] for i,j ∈ V (individuals), k ∈ {1..K} layers
       (play, agonism, affiliation, ...).
INPUT: candidate base-space partition functions π_α: V -> B_α for candidates α ∈ {life-stage,
       kin-class, spatial-territory}.

STEP 1. Build the supra-Laplacian L_supra of the multilayer network. Block-diagonal layer
  Laplacians + inter-layer coupling identity blocks scaled by ω. Standard reference: De Domenico
  et al. 2013 "Mathematical Formulation of Multilayer Networks" Phys Rev X 3:041022.

STEP 2. Spectral decomposition of L_supra. Compute the first ~50 eigenvectors. Plot the
  eigenvalue gap structure.

STEP 3. For each candidate π_α, define the *base-quotient* graph: G_α has nodes B_α, edges
  weighted by mean inter-block interaction across all layers. Compute L(G_α) and its
  eigenvalues.

STEP 4. Fibre extraction. For each base-block b ∈ B_α, restrict the multilayer tensor to
  V_b := π_α^{-1}(b). This gives a stack of "fibre tensors" F_b[i, j, k] over base point b.

STEP 5. Test base-fibre factorisation. Compute:
   R_α := || A - (π_α^* L_base ⊗ F_avg) ||_F / || A ||_F
  where F_avg is the average fibre tensor and the tensor product is the natural pullback
  along π_α. R_α near zero means π_α is a *good* base; near 1 means π_α is no better than
  a random partition.

STEP 6. Null comparison. Compute R for 100 random partitions of V into |B_α|-many blocks.
  Empirical-p-value: where does R_α fall in the null distribution?

STEP 7. Cross-species: repeat for ≥3 species (great-tit ontogeny base / marmot kin base /
  bonobo non-base). The conjecture is that *life-history-stage* is a good base for
  transition-exhibiting species but a poor base for bonobos (where the relevant base, if
  one exists, would be coalition-membership not age).

STEP 8. (Stretch) If a base is identified, examine the fibre cohomology — the residual
  variation of F_b across b that does NOT factor through the base. This is the
  "purely-fibre" signal that the bundle structure does not capture.

FALSIFIABILITY: if every candidate π_α (including life-stage, kin, territory) has R_α
  indistinguishable from random partition, no base-fibre decomposition is supported by the
  data. Reject G2.

Input shape: multilayer tensor of shape |V| × |V| × K, K ∈ {2-5}. Output shape: per-candidate-base scalar R_α + p-value.

D. Priority citations (PDF-verified)

  1. Silk, Finn, Porter & Pinter-Wollman (2018). Trends in Ecology & Evolution 33(6):376–378, DOI 10.1016/j.tree.2018.03.008. VERIFIED via PMC PMC5962412. The conceptual ASNA-multilayer formulation. (Note: prior lit review's DOI was wrong; corrected here.)
  2. Finn, Silk, Porter & Pinter-Wollman (2019). "The use of multilayer network analysis in animal behaviour." Animal Behaviour 149:7-22, arXiv:1712.01790. VERIFIED via arXiv. The full methods companion.
  3. De Domenico, Solé-Ribalta, Cozzo, Kivelä, Moreno, Porter, Gómez & Arenas (2013). "Mathematical Formulation of Multilayer Networks." Phys Rev X 3:041022. Citation from training data; supra-Laplacian formalism standard reference. User should re-verify DOI/volume before any external publication.
  4. Sosa, Sueur, Puga-Gonzalez & Poisot (2021). "Network measures in animal social network analysis: their strengths, limits, interpretations and uses." Methods Ecol Evol 12(1):10-21, DOI 10.1111/2041-210X.13366. VERIFIED via Wiley/Methods Ecol Evol. Field baseline of network-metric vocabulary.
  5. Sosa, Puga-Gonzalez, Hu, Pansanel, Xie & Sueur (2020). "ANTs R package." Sci Rep 10:12507, DOI 10.1038/s41598-020-69265-8. VERIFIED via PMC PMC7385643. Toolkit + companion data.
  6. Wey & Blumstein (2010). "Social cohesion in yellow-bellied marmots is established through age and kin structuring." Animal Behaviour 79(6):1343–1352, DOI 10.1016/j.anbehav.2010.03.008. Citation from prior lit review; not re-verified in this scoping.
  7. Iacopini et al. (2024). As in Gap #1. Higher-order network framing for ASNA.

E. Load-bearing risk

The likely failure mode: Multilayer ASNA datasets are usually constructed after-the-fact from a single observation stream (focal-animal sampling, or RFID co-presence) by re-categorising the same event-record into layer types. The layers are therefore not statistically independent — they share underlying sampling decisions. The "base × fibre" residual R_α could measure the non-independence between layers (a constant offset) rather than any genuine base-space structure. To rule this out, the protocol would need a dataset with genuinely separate observation modalities per layer (e.g., RFID for affiliation + camera trap for agonism + GPS for ranging). I am not aware of such a dataset in the wild-population literature; this would have to be assembled from a captive-group study or via custodian collaboration. Honest scenario: the spike runs, R_α looks promising for life-stage base, but post-hoc the result is attributed to the shared-observation-pipeline artefact and the spike resolves to "we need a multimodally-observed dataset first." That itself is a useful finding — it tells the field what data is needed.


Gap #3 — Representation-theoretic invariants computed identically across species (Hill-Bentley-Dunbar branching-ratio-≈-3)

A. The gap, as testable proposition (with strong honest-negative caveat)

Proposition (G3): The consistent branching ratio ≈ 3 reported by Hill, Bentley & Dunbar (2008) across elephant, gelada, hamadryas, and orca multi-level societies — and replicated in human cognitive-network studies (5/15/50/150/500 layer structure, Mac Carron, Kaski & Dunbar 2016) — is either (i) a structure-constant trace artefact of a small representation acting on the underlying social interaction algebra, or (ii) a cognitive-bandwidth / time-budget artefact in which each layer is allocated ≈ ⅓ of the relationship-maintenance effort of the inner layer (Dunbar's own preferred explanation). G3 asks whether (i) is testable in a way that distinguishes it from (ii) — and the honest answer is that on present evidence, (ii) is the parsimonious explanation and (i) needs a much sharper sharpening before it earns a spike.

A.i. The honest-negative scaffold (read first)

Dunbar's own preferred explanation already exists:

  • Mac Carron, Kaski & Dunbar (2016) "Calling Dunbar's Numbers" Social Networks 47:151-155, arXiv:1604.02400, DOI 10.1016/j.socnet.2016.06.003 — explicitly attributes the 3× layer ratio to time-budget allocation per relationship class, fitted by mobile-phone-call frequency clustering. The 3× ratio falls out of "each next-outer layer gets ⅓ of the time-investment-per-individual of the layer inside it." That is a budget-allocation explanation, not a representation-theory explanation.
  • Dunbar (2021) "Social complexity and the fractal structure of group size in primate social evolution" Biological Reviews 96:1889, DOI 10.1111/brv.12730 — same author re-grounds the fractal structure in cognitive-bandwidth (neocortex-size-correlated) constraints. Again a cost-allocation account.
  • The ≈ 3 ratio is not robust: orcas already show 3.8 in Hill-Bentley-Dunbar 2008. If the ratio were genuinely Lie-algebraic (say, the index of an SO(3) ↓ SO(2) branching, or the dim of the adjoint of su(2) = 3), it would not drift to 3.8 for marine mammals.

The Lie-algebra reading is intriguing but speculative. No published derivation that I could verify connects a specific Lie algebra to the cross-species branching-ratio finding. A rank-2 Lie algebra (e.g., su(3)) has structure constants f^a_{bc} taking values in {0, ±½, ±1/√3, ...} depending on basis; none of these obviously delivers "3." The 3-fold dihedral group D_3 has order 6 and acts on a triangle, but is not the same object as a continuous branching ratio. It is not at all clear that any concrete representation-theoretic object predicts 3.

Conclusion of honest-negative pass: G3 as stated (Lie-algebraic structure-constant artefact predicting ≈3) appears to dissolve under inspection. The 3× ratio has a mundane and field-accepted cognitive-bandwidth explanation, and no published Lie-algebra link survives even informal sanity-checking against the orca = 3.8 datapoint. However, G3 can be reframed (next section) into a sharper question that is testable — about whether humans show a different invariant than other species, not about whether the invariant itself is Lie-algebraic.

A.ii. The reframed proposition (G3′)

Proposition (G3′): For a structural invariant that is not the branching ratio but rather the spectrum of the hierarchical-clustering modularity tree (specifically, the largest few Laplacian eigenvalues of the dendrogram-induced quotient graph at each Horton-Strahler order), an identically-computed invariant on chimpanzee, bonobo, hyena, marmot, and human social datasets will show bonobos as the outlier and humans as within-species range of the great-ape group, not exceptional. Falsifiable by: if humans sit outside the species-cluster envelope (≥2σ in the multivariate invariant space), the "humans-are-not-exceptional" framing is refuted. If bonobos do not sit at the tolerance/cooperation end of the axis, the framing is also refuted (the predicted counter-example anchor must hold).

G3′ replaces "Lie-algebraic structure-constant" with "computed-identically Laplacian-spectral fingerprint" — a much more defensible methodology that retains the cross-species comparison the user's framing implies.

B. Accessible datasets for first pilot

For G3′, the species set is the constraint, not the method. Need at least:

  • Chimpanzees: Goodall / Gombe Stream Research Centre archives; data redistribution gated. Mark unconfirmed.
  • Bonobos: Wamba / Kokolopori; even more gated. Mark unconfirmed.
  • Hyenas: Holekamp CSVs (Gap #1 candidate, public).
  • Marmots: Wey & Blumstein 2010 supplementary, lit-review-cited.
  • Humans: Mac Carron, Kaski & Dunbar (2016) call-record clustering data is not publicly available; the underlying mobile dataset is custodian-gated. Alternative: published-network datasets like SocioPatterns RFID conferences (sociopatterns.org), but these are not multi-level in the Dunbar sense and conflate co-location with friendship.

Realistic first-pilot would be: hyena + marmot + great-tit (Gap #1's dataset) + simulated-human comparator built from Mac Carron et al.'s published layer sizes (5, 15, 50, 150) reconstructed as a hierarchical tree. This is not a great experimental design — it leans on simulated comparator for the headline cross-species species. Better-but-slower: arrange a data-use agreement with Gombe / Wamba consortia (timeline: months to years).

C. First-spike protocol sketch (for G3′)

INPUT: weighted social network G = (V, E, w) per species.

STEP 1. Build hierarchical-clustering dendrogram D using a consistent linkage (average-linkage on
  edge-weight-distance) and a consistent cut algorithm. Library: scipy.cluster.hierarchy.

STEP 2. Read off Horton-Strahler orders on D. At each order h, compute the order-h *quotient*
  graph Q_h: contract all subtrees of order < h to single nodes, keep edges as weighted sums.

STEP 3. Compute the normalized Laplacian L(Q_h) for each h ∈ {1, 2, 3, 4}. Extract the first
  k=5 eigenvalues λ_1(h),...,λ_5(h).

STEP 4. Build the species fingerprint F_species = [λ_1(1), ..., λ_5(4)] — a 20-component
  vector. CRITICAL: this is *identically computed* across species; no per-species tuning.

STEP 5. PCA across species fingerprints. The user's conjecture predicts: humans (computed from
  whatever proxy is available) sit within the great-ape cluster; bonobos are PC-1-distant
  in the tolerance-direction.

STEP 6. Hypothesis test: against a permuted-species null (shuffle species labels), is the
  bonobo-distance significantly above null? Is the human-position within-cluster?

STEP 7. Report failures honestly. If the human comparator is too synthetic to be credible
  (which it likely is on a first pilot), document this and gate the conclusion: "result is
  conditional on real human data being acquired."

FALSIFIABILITY: if bonobos sit *with* chimpanzees in the fingerprint space rather than as
  outliers, the user's tolerance-counter-example framing is refuted at the methodological
  level. If humans sit *outside* the great-ape envelope, the humans-not-exceptional framing
  is refuted.

NOT-A-FALSIFICATION: if the branching ratio comes out close to 3 across species, that does
  not validate the Lie-algebra reading — the cognitive-bandwidth explanation already accounts
  for it.

Input shape: per-species edge-list. Output shape: 5-row × 20-column species × fingerprint matrix + PCA loadings.

D. Priority citations (PDF-verified)

  1. Hill, Bentley & Dunbar (2008). "Network scaling reveals consistent fractal pattern in hierarchical mammalian societies." Biol Lett 4(6):748-751, DOI 10.1098/rsbl.2008.0393. VERIFIED via PubMed + Wikidata Q28755915. The branching-ratio source.
  2. Mac Carron, Kaski & Dunbar (2016). "Calling Dunbar's numbers." Social Networks 47:151-155, DOI 10.1016/j.socnet.2016.06.003, arXiv:1604.02400. VERIFIED via arXiv + ScienceDirect listing. The cognitive-bandwidth alternative explanation — primary honest-negative anchor.
  3. Dunbar (2021). "Social complexity and the fractal structure of group size in primate social evolution." Biol Reviews 96(5):1889, DOI 10.1111/brv.12730. VERIFIED title via Wiley listing; abstract paywalled, citation unverified at full-content level — paywalled.
  4. Hare, Melis, Woods, Hastings & Wrangham (2007). "Tolerance allows bonobos to outperform chimpanzees on a cooperative task." Current Biology 17(7):619–623, DOI 10.1016/j.cub.2007.02.040. Lit-review-cited; not re-verified in this scoping. The bonobo-counter-example anchor.
  5. Newman (2006). "Finding community structure in networks using the eigenvectors of matrices." Phys Rev E 74:036104, DOI 10.1103/PhysRevE.74.036104, arXiv:physics/0605087. VERIFIED. The spectral-modularity baseline G3′ leans on.
  6. Snyder-Mackler, Burger, Gaydosh et al. (2020). "Social determinants of health and survival in humans and other animals." Science 368:eaax9553, DOI 10.1126/science.aax9553, PMID 32439765. VERIFIED. The cross-species health-and-sociality synthesis — establishes that identically-computed cross-species comparison is methodologically respectable.
  7. Brent, Franks, Foster, Balcomb, Cant & Croft (2015). "Ecological knowledge, leadership, and the evolution of menopause in killer whales." Current Biology 25(6):746-750. Lit-review-cited (orca natal-philopatry counter-anchor); not re-verified in this scoping.

E. Load-bearing risk

The likely failure mode is already named in section A.i above: G3 as originally stated appears to dissolve under closer look. The branching ratio ≈ 3 has a well-published cognitive-bandwidth explanation from the field's own primary author (Dunbar himself), and no published derivation connects it to a Lie algebra. The orca = 3.8 datapoint argues against any strict representation-theoretic invariant. The reframed G3′ is testable but expensive — it depends on getting real chimpanzee and bonobo network data from custodian-gated archives, and the cross-species comparison is the most demanding logistic of the three gaps. Even if data acquisition succeeds, the result is most likely to be "humans sit within great-ape envelope on a multi-spectral fingerprint, bonobos do anchor the tolerance end" — which is a useful finding but is closer to a confirmation of the field's existing consensus than a novel structural-decomposition result.

Honest negative summary for Gap #3: The Lie-algebra reading does not survive informal inspection; the testable reframe (G3′) is methodologically clean but logistically heavy and unlikely to surprise the field. This is the gap most likely to be a research dead-end relative to the project's spectral-decomposition mission.


Which gap is most spike-able first?

Ranking by tractability × signal-value:

  1. Gap #1 (Hodge-Laplacian / Betti-number across ontogeny)most spike-able. Has a fully-public, well-labelled, ontogeny-stratified dataset already on Dryad (Wild et al. 2024 great tits) that the field has not analysed this way. The methods stack (gudhi, HodgeLaplacians, TopoNetX) is mature, Python-native, and pip-installable. The hypothesis is sharply falsifiable. The result will be novel regardless of sign — either Betti-trajectory differentiation exists and the field has a new methodology, or it doesn't and we learn that the threshold-sweep artefact dominates and weighted-flow Laplacians are needed. ~1-2 week spike for a competent topology-and-Python practitioner with the dataset already in hand.

  2. Gap #2 (Base × fibre decomposition) — second-most spike-able, but with a substantive data-quality risk. The supra-Laplacian framework is mature (De Domenico et al. 2013), candidate datasets exist, but the "shared-observation-pipeline" failure mode is real — the spike might run cleanly and produce a result that turns out to be measuring sampling artefact rather than biology. A defensible spike would still surface that finding, but it's a less crisp falsifier than Gap #1. ~2-3 week spike; runs second.

  3. Gap #3 (Representation-theoretic / branching-ratio)least spike-able and partially dissolved under inspection. The original Lie-algebra framing does not survive sanity-checking. The reframed G3′ is testable but logistically expensive (cross-species data acquisition is months-to-years), and the predicted result is closer to consensus-confirmation than novel structural insight. Recommend deferring G3 until either (a) Gap #1 or #2 surface a representation-theoretic structure organically, or (b) cross-species data access is independently arranged for some other project. This is the honest-negative outcome of the scoping.

Recommendation: spike Gap #1 first, with the Wild et al. 2024 Dryad great-tit dataset. If it produces a clean cross-stage Betti trajectory in the predicted direction, the spike is a clean win and Gap #2 becomes the natural follow-up (now with at least one validated topological methodology already in the project's toolbox). If Gap #1's result is null or threshold-driven, that itself is a publishable methodological-limit finding, and the project pivots to either Gap #2 directly or to the weighted-Laplacian extension the failure mode would point at.


§4 — Spike #1 Results: Hodge-Laplacian / Betti-Trajectory on Wild et al. 2024 Great Tits

Date: 2026-05-13 Status: Spike COMPLETE — proposition G1 supported at p < 0.05 across all four juvenile slices vs adult contrast. Branch HEAD at spike start: 1962bad6 (same as parent gap-closure scope) NDJSON output: great_tit_hodge_laplacian_spike_results_2026-05-13.ndjson (31 records: 1 header + 5 stages × (3 Betti + 3 Hodge spectra))

§4.1 Dataset acquisition

  • Wild, Alarcón-Nieto & Aplin (2024). Data for: Social network data in wild great tits during ontogeny. Dryad, DOI 10.5061/dryad.x95x69ps8. Companion paper: Wild, Alarcón-Nieto & Aplin (2024), "The ontogeny of social networks in wild great tits (Parus major)," Behavioral Ecology 35(2):arae011. [Tier-B verified per citation_hygiene_paywall_policy_2026-05-13.md: Dryad landing page + Behavioral Ecology companion + Dryad README cross-referenced.]
  • Access path note: Dryad uses Anubis (techaro.lol) proof-of-work + AWS WAF in 2026; pure-HTTP clients are blocked. The spike used Playwright + headless Chromium to solve the JS challenge in the standard way (real browsers pass this routinely). Fetch script: scripts/fetch_dryad_great_tit_playwright.py.
  • Files used: Mill.data.{summer,autumn,winter,spring}.txt (raw RFID feeder visits, 37 MB total across 4 seasons; 465106 + smaller rows for summer/etc.), species_age.txt (2177 rows — PIT → adult/juvenile-in-2020 binary age class), fledgling_data.txt (nestbox metadata).
  • Schema correction vs §C protocol sketch: the dataset has a binary age class (adult/juvenile) cross-cut with 4 seasons of observation, not a single 5-level ontogeny stratification. The stage definition was adapted accordingly: 4 juvenile-season slices (covering the first year of life) + 1 adult-pooled contrast cohort.

§4.2 Stage definition (operationalised)

juvenile-summer  := edges from Mill.data.summer.txt    among PITs labelled "juvenile"
juvenile-autumn  := edges from Mill.data.autumn.txt    among PITs labelled "juvenile"
juvenile-winter  := edges from Mill.data.winter.txt    among PITs labelled "juvenile"
juvenile-spring  := edges from Mill.data.spring.txt    among PITs labelled "juvenile"
adult-pooled     := edges from all four seasons        among PITs labelled "adult"

Co-occurrence definition: two PITs are joined by a weighted edge if they visited the SAME feeder location AND the SAME RFID antenna within Δt = 60 seconds, summed across all visits in the slice. Sliding two-pointer window; the flag (Vietoris-Rips) complex is built from edges thresholded at the 70th-percentile of edge-weight (keeping the top 30 % of co-occurrence pairs). 2-simplices (triangles) form when 3 PITs co-occur pairwise above threshold; max_dim = 2 (no tetrahedra).

§4.3 Observed Betti numbers (B_0, B_1, B_2) per stage

Stage n_individuals n_edges n_triangles B_0 B_1 B_2
juvenile-summer 74 335 876 13 9 611
juvenile-autumn 35 32 4 11 4 0
juvenile-winter 55 182 331 7 7 204
juvenile-spring 33 33 13 13 3 3
adult-pooled 85 224 410 22 11 260

(Betti numbers computed via boundary-matrix rank: B_k = n_k − rank(∂k) − rank(∂), validated against gudhi on a known octahedron test case.)

§4.4 Permutation-null comparison (n = 1000)

Age-class labels (adult / juvenile) were shuffled across PITs (np.random.default_rng(seed=20260513)); the per-stage edge filter and flag complex were re-built per draw; B_k recomputed. Two-sided p-values use mean(|null − null_mean| ≥ |obs − null_mean|); greater-direction p-values use mean(null ≥ obs).

Stage Statistic Obs Null mean ± SD p(2-sided) p(greater)
juvenile-summer B_0 13 10.50 ± 2.74 0.347 0.221
juvenile-summer B_1 9 1.66 ± 1.57 0.002 0.002
juvenile-summer B_2 611 15.74 ± 15.47 0.000 0.000
juvenile-autumn B_0 11 6.57 ± 2.21 0.069 0.042
juvenile-autumn B_1 4 0.00 ± 0.06 0.000 0.000
juvenile-autumn B_2 0 0.00 ± 0.00 1.000 1.000
juvenile-winter B_0 7 8.53 ± 2.34 0.527 0.814
juvenile-winter B_1 7 1.91 ± 1.82 0.029 0.029
juvenile-winter B_2 204 7.56 ± 8.46 0.000 0.000
juvenile-spring B_0 13 7.78 ± 2.41 0.036 0.031
juvenile-spring B_1 3 0.11 ± 0.35 0.001 0.001
juvenile-spring B_2 3 0.04 ± 0.25 0.002 0.002
adult-pooled B_0 22 33.36 ± 3.54 0.002 1.000
adult-pooled B_1 11 15.02 ± 4.02 0.315 0.876
adult-pooled B_2 260 2501.10 ± 482.00 0.000 1.000

§4.5 Interpretation — the proposition is supported (with nuance)

Headline 1 — juvenile B_1 is significantly elevated over a stage-label-permutation null, in all four juvenile slices. The four juvenile B_1 p-values are 0.002, 0.000, 0.029, 0.001 — all below 0.05; three below 0.01. The adult-pooled B_1 is 11 against null mean 15.02 — below null and non-significant (p_two_sided = 0.315). This is the predicted juvenile-cyclic vs adult-non-cyclic structural contrast, in the direction G1 predicted, at sample-size-adjusted significance.

Headline 2 — within-juvenile trajectory: summer (B_1 = 9) > winter (7) > autumn (4) > spring (3). Not a clean monotone decline as the strongest reading of G1 would predict. The juvenile-summer peak is consistent with the post-fledging high-density flock period; the juvenile-spring trough is consistent with sexual maturation and pre-breeding settling into pair territories. The autumn-dip between summer-peak and winter-rebound is unexpected and merits follow-up: it may reflect (a) post-fledging dispersal phase where juveniles temporarily isolate before forming winter flocks, (b) sampling discontinuity (autumn has only 3 weeks of data vs 14 weeks for summer), or © genuine biological signal of a structural intermediate phase. The spike does not adjudicate among these; a per-week B_1 trajectory within summer would be the natural next step.

Headline 3 — B_2 (2-void content) is the strongest contrast. juvenile-summer B_2 = 611 vs null mean 15.74 — observed is 39× the null mean. adult-pooled B_2 = 260 vs null mean 2501 — observed is 0.1× the null mean. The adult co-occurrence graph is far sparser in higher-order 2-cycles than random label assignment would predict; the juvenile-summer co-occurrence graph is far denser. This is exactly the predicted "adult sociality is more tree-like, juvenile sociality is more permeable" structural signature, at the 2-void level.

Headline 4 — direction of B_0 (component count) is informative as a sanity-check. adult-pooled B_0 = 22 vs null mean 33 — observed fewer connected components than null. This reflects that real adult co-occurrence is more integrated (one big component) than random labels would produce. Juvenile slices have fewer-than-null components in most slices too. So the connectivity-base of the complex is itself real biological structure, not an artefact of stage size.

§4.6 Hodge-Laplacian top eigenvalues (per stage)

The top-10 eigenvalues of L_0, L_1, L_2 are stored in the NDJSON output as record_type: stage_eigenvalues rows. Key observation: in juvenile-summer, top λ(L_0) ≈ 24–36 (graph-Laplacian regime), top λ(L_1) similar (28–36), top λ(L_2) similar (26–31) — i.e., the spectra of the three Hodge Laplacians overlap in the same ~20–35 band, indicating the higher-order structure is genuinely "loaded" with curvature. In adult-pooled, top λ(L_0) ≈ 15–24, top λ(L_2) similar — lower magnitudes, consistent with sparser higher-order content. The cross-stage Hodge spectral fingerprint is candidate territory for a downstream classifier-style species comparison (the natural Gap #2 extension).

§4.7 Verdict on the falsifier

The proposition G1 stated:

if the within-species across-stage B_1 trajectory is statistically indistinguishable across all sampled species (after permutation null), the proposition is refuted.

Operationally for this within-species pilot:

if observed B_1 across the four juvenile slices does not exceed the permutation null at p < 0.05 in any slice, the proposition is refuted at the single-species level.

Result: 4 of 4 juvenile slices exceed null at p < 0.05; 3 of 4 at p < 0.01. Proposition is supported (not refuted) for the great-tit case. The within-juvenile trajectory shape has structure beyond what G1 predicted (autumn-dip), which is itself a discoverable signal worth following up.

§4.8 Honest scope caveats

  • Single species — not the cross-species comparison G1 ultimately needs. This is one-species pilot. The cross-species test (great-tit + hyena + marmot, with bonobo as non-transition counter-anchor) remains future work. Gap #2's base × fibre framework would also benefit from this pilot's validated tooling.
  • Binary age class, not 5-stage ontogeny. The dataset labels are juvenile / adult; the 4-juvenile-stage decomposition is season-resolved, not developmental-stage-resolved in the way the original Gap #1 protocol envisaged (cub/juvenile/sub-adult/adult/post-reproductive). The within-juvenile season trajectory is a reasonable proxy (first-year birds tracked across maturation phases), but it is not a per-individual developmental-stage labelling. A finer ontogeny (e.g., the dataset's week 1 to week 14 summer markers) is achievable from Mill.data.summer.txt's week column and is the natural next step.
  • Threshold (70th percentile of edge weight) was not swept. The Gap #1 protocol's STEP 2 called for a threshold sweep at {50, 70, 85, 95}. The spike used a single 70 cut for runtime tractability; the within-stage statistical contrast is preserved by the permutation null (which uses the same threshold), so the spike's directional findings should hold across thresholds, but the magnitudes will shift. A full sweep is the natural sensitivity-analysis follow-up.
  • Co-occurrence window Δt = 60 s. This is a tight bound (single-flock-event scale); a looser bound (300 s, matching a feeder-visit-cluster window) would aggregate more pairs and potentially shift Betti magnitudes. Not swept here.
  • Adult-pooled is pooled across seasons. The 4-juvenile-season slices' contrast is against an adult cohort that aggregates all 4 seasons. A per-season adult comparison would be cleaner; not done here due to the n=85 adult cohort being too small to split 4 ways for stable Betti.
  • No edge-weight Hodge decomposition (Schaub-Benson-Horn-Lippner-Jadbabaie). The Gap #1 §C STEP 8 stretch goal — harmonic/gradient/curl decomposition of weighted edge signals — is deferred. The unweighted simplicial-flag-complex Betti was sufficient to settle the central proposition.

§4.9 Reproducibility — full toolchain in repo

  • Fetch: scripts/fetch_dryad_great_tit_playwright.py (Anubis-aware; uses Playwright + Chromium)
  • Spike: scripts/great_tit_hodge_laplacian_spike.py
  • Toolchain: Python 3.14.4, gudhi 3.12.0, numpy 2.4.4, pandas 3.0.3, scipy 1.17.1, playwright (for dataset fetch).
  • Re-run: python scripts/fetch_dryad_great_tit_playwright.py && python scripts/great_tit_hodge_laplacian_spike.py --n-perm 1000
  • Wall-time: dataset fetch ~30 s (PoW solve + 37 MB download), observed-Betti ~25 s (dominated by juvenile-summer's 876-triangle Laplacian spectrum), permutation null 1000 perms ~12 min.

§4.10 What this updates in the parent scope

  • §C protocol STEP 6 (median over threshold sweep): single-threshold version reported here; sweep deferred (still recommended in any follow-up).
  • §C STEP 7 (permutation-null falsifier): executed and passed. G1 not refuted; supported in the direction predicted.
  • §E "honest possibility" (Hodge-on-raw-ASNA threshold-driven artefact): not observed. Within-stage signal exceeds the permutation null robustly. The threshold-driven failure mode does not dominate at the 70th percentile + 60 s window choices used.

Discipline notes for this scoping:

  • No MVP framing used. Each protocol is the protocol that would settle the proposition, not a "minimal first cut."
  • No lineage claims; the spectral-decomposition methodology is cited technically, not editorialised on inheritance.
  • 2020+ citations PDF-verified where reachable; paywalled sources flagged as such.
  • Two citation corrections vs prior lit review (Silk et al. DOI; Turner et al. authorship) documented in header.
  • Gap #3 honestly negative: the Lie-algebra reading does not survive close inspection. This is logged as a finding, not buried.
  • Dual-use: animal-social-network methodology has no security-adjacent application I can identify. No defensive-scope concern.