Recordings from 1975 were collected with a Nagra III reel-to-reel tape recorder and a Sennheiser 804 shotgun microphone, and were converted to digital files (32 bit, 96 kHz) by the Cornell Lab of Ornithology in 2013 (1). Recordings from 2012 (16 bit, 44.1 kHz) and 2019 (32 bit, 48 kHz) were collected with a Marantz PMD661 solid-state recorder and a Sennheiser ME66 shotgun microphone. The recordings from 1975 were downsampled to 48 kHz prior to analysis (Luscinia processes both 44.1 kHz and 48 kHz files). In all three years, special precautions were taken to avoid recording the same bird twice (1): each site was visited only once, and within a site only one individual was recorded within a 160 m radius until it stopped singing or flew away.
All songs were analyzed by (3) using Luscinia, a database and analysis program developed specifically for birdsong (https://rflachlan.github.io/Luscinia/). Songs were analyzed with a high-pass threshold of 2000 Hz, a maximum frequency of 9000 Hz, and 5 dB of noise removal. 965 songs (26.2%) were excluded from the analysis due to high levels of noise. Continuous traces separated by more than 20 ms were classified as separate syllables (2).
Figure S1 shows an example of the first ten syllables of a house finch song recorded by (2) and analyzed in Luscinia, where the blue line is the mean frequency over time.
Figure S1: The first 10 syllables from a song recorded by (2) and analyzed in Luscinia. Each red bar corresponds to a syllable, and each green number corresponds to an element within that syllable. The blue traces represent mean frequency. In this song, syllable 3 and syllable 7 were classified as the same syllable type during dynamic time warping, and all other syllables are unique types. Reprinted from (3).
The deep split parameter (\(DS\)) determines the granularity of clustering by controlling the value of two other parameters: the maximum core scatter (\(MCS\)), which controls the maximum within-group variation, and the minimum gap (\(MG\)), which controls the minimum between-group variation (4,5). \(DS = \{0, 1, 2, 3, 4\}\) correspond to \(MCS = \{0.64, 0.73, 0.82, 0.91, 0.95\}\), and \(MG = (1 - MCS)*0.75\). In their analyses of house finch song, (1) and (6) manually set \(MCS = 1\) and \(MG = 0.5\) while (3) used \(DS = 3\) (corresponding to \(MCS = 0.91\) and \(MG = 0.0675\)), all of which led to a similar number of syllable types. I will follow (3) in using the simpler deep split parameter to control granularity in clustering, as this approach is recommended by the creators of dynamic tree cut (4) and has been widely used for a variety of applications (7,8) including vocal analysis (9).
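For reference, the mapping between \(DS\) and the two underlying parameters can be written out directly. The following is a minimal sketch (the helper name is mine; the values are those given above):

```python
# Map each deep split (DS) value to its maximum core scatter (MCS).
MCS_BY_DS = {0: 0.64, 1: 0.73, 2: 0.82, 3: 0.91, 4: 0.95}

def tree_cut_params(ds):
    """Return (MCS, MG) for a deep split value, using MG = (1 - MCS) * 0.75."""
    mcs = MCS_BY_DS[ds]
    mg = (1 - mcs) * 0.75
    return mcs, mg

print(tree_cut_params(3))  # MCS = 0.91, MG = 0.0675 (up to floating-point rounding)
```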
Unseen species models, fit using the iNEXT package in R (10), were used to assess sampling coverage (see the caption of Figure S2 for details).
Figure S2: The results of unseen species models applied to the full frequency distribution of syllables classified at each level of deep split, run using the iNEXT package in R (10). The point in each plot marks the observed richness (y-axis) for the total number of syllables sampled (x-axis). The solid lines mark interpolated richness, the dashed lines mark extrapolated richness, and the shaded areas mark 95% confidence intervals. The interpolated richness values are the number of types you observe if you subsample that number of tokens from the real data. The extrapolated values are the projected number of types you would observe if you were able to continue sampling tokens from the same population. More details can be found in Hsieh et al. (10): https://doi.org/10.1111/2041-210X.12613. We appear to have complete coverage of syllable types regardless of the level of deep split that is used.
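The interpolation step described in the caption can be illustrated with a small Monte Carlo subsampler. This is only a sketch of the idea; iNEXT itself uses analytic rarefaction and extrapolation formulas (10), and extrapolation beyond the observed sample size is not sketched here:

```python
import numpy as np

rng = np.random.default_rng(1)

def interpolated_richness(counts, m, reps=200):
    """Expected number of types observed when subsampling m tokens
    without replacement (Monte Carlo version of interpolation)."""
    tokens = np.repeat(np.arange(len(counts)), counts)
    richness = [
        len(np.unique(rng.choice(tokens, size=m, replace=False)))
        for _ in range(reps)
    ]
    return float(np.mean(richness))

# Toy frequency distribution: 4 types with counts 10, 5, 2, 1.
counts = np.array([10, 5, 2, 1])
print(interpolated_richness(counts, m=5))
```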
Automated clustering methods may detect more variation in the features of longer syllables compared to shorter syllables, leading to a bias in classification where longer syllables are parsed into more categories. The risk of such bias should be minimized due to the use of dynamic time warping, which warps signals in the time domain (in this study up to 10% of the average signal length) before computing similarity (11), and dynamic tree cut, which varies the granularity of classification to maximize the connectedness of items at the tips of branches (4), but it is still worth highlighting.
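For intuition, the core of dynamic time warping with a constrained warping band can be sketched as follows. This is a generic Sakoe-Chiba-style implementation, not Luscinia's, and the toy signals stand in for mean-frequency traces:

```python
import numpy as np

def dtw_distance(x, y, warp_frac=0.10):
    """Dynamic time warping distance between two 1-D signals, with
    warping limited to a band of ~10% of the average signal length."""
    n, m = len(x), len(y)
    band = max(abs(n - m), int(warp_frac * (n + m) / 2)) + 1
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Two traces of the same shape sampled at slightly different lengths:
# DTW aligns them and yields a much smaller distance than a rigid
# sample-by-sample comparison would.
a = np.sin(np.linspace(0, np.pi, 50))
b = np.sin(np.linspace(0, np.pi, 55))
print(round(dtw_distance(a, b), 3))
```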
If a hierarchical clustering algorithm is less sensitive to variation in short syllables, then it should tend to over-lump them into types with longer syllables at all cut heights. However, it is likely to over-lump them less at lower cut heights, where syllables are already being parsed into types with greater granularity. Based on this logic, I would predict two things from a hierarchical clustering algorithm that is biased with respect to duration:
Figure S3: Panel A shows the median duration of types from each level of granularity in syllable clustering. The black lines in panel B are the “trajectories” of individual syllable tokens—the median duration of types that individual tokens are assigned to with increasing levels of granularity. The orange dotted line is the median value across all syllables.
There is a subtle decrease in the distribution of the median duration of types as the granularity of clustering increases (Figure S3A), but the effect is not significant in a simple linear model (duration ~ granularity) (p > 0.05). If we track the median duration of the types that individual tokens are assigned to at increasing levels of granularity (Figure S3B), there appears to be no clear trend (the orange line marks the average). A simple linear model with type as a random effect (duration ~ granularity + (1|type)) does detect a significant increase in the median duration of types that tokens are assigned to with increasing granularity (estimate: 0.0018; 95% CI: 0.00061, 0.0029), but this effect is extremely small and comes out to an increase of only about 1 ms per unit increase in deep split (for signals that are ~100 ms on average).
Based on this analysis, I think that duration-bias in clustering is unlikely to be an issue in the analysis, but just to be sure I have replicated the results for Zipf’s law of abbreviation with manually-classified syllables from an experimental cross-fostering study conducted in house finches (12). See Replication with Data from Mann et al. below for details.
The hierarchical clustering leads to syllable types that explain a large amount of the variation in the duration, bandwidth, and excursion of tokens, but that only capture concavity at the higher levels of deep split.
Parameter | DS: 2 | DS: 3 | DS: 4 |
---|---|---|---|
Duration | 0.76 | 0.71 | 0.70 |
Bandwidth | 0.72 | 0.84 | 0.90 |
Concavity | 0.08 | 0.24 | 0.35 |
Excursion | 0.36 | 0.49 | 0.59 |
Figure S4: The relationship between rank (x-axis) and count (y-axis) at each level of deep split (left, center, and right). The blue and orange lines denote the expected distributions according to Zipf’s rank-frequency law (blue) and Mandelbrot’s extension of it (orange).
Mandelbrot’s generalization of Zipf’s rank-frequency law takes the following form (13,14):
\[\begin{equation} f(r) = \frac{c}{(r + \beta)^\alpha} \tag{1} \end{equation}\]
where \(f(r)\) is the normalized frequency at each rank \(r\), \(c\) is a normalization term, and \(\alpha\) and \(\beta\) are parameters that control slope and convexity (respectively). According to (15), the bounds of (1) are \(\alpha > 1\), \(\beta > -1\), and \(c > 1\). When \(\beta = 0\), this function simplifies to the original form of Zipf’s rank-frequency law: \(f(r) \propto 1/r^\alpha\).
\(c\) is usually a normalization constant (15–17), defined so that the frequencies sum to one:
\[\begin{equation} c = \left[\sum_{r=1}^{\infty}\frac{1}{(r + \beta)^\alpha}\right]^{-1} \tag{2} \end{equation}\]
In practice, this form of Zipf’s rank-frequency law is notoriously difficult to fit to data due to strong correlations between \(\alpha\) and \(\beta\), which in turn determine \(c\) (15–17). Here, I use a simplified version of (1) that treats \(c\) as a third parameter that is estimated alongside \(\alpha\) and \(\beta\) (18), as has been done in studies of chickadee calls (19–21), which should be interpreted as an approximation of the Zipf-Mandelbrot distribution.
The model was fit as a non-linear model on the original scale, as opposed to the log-log scale.
Parameter | Class | Prior | Lower Bound |
---|---|---|---|
a | b | normal(0, 10) | 1 |
b | b | normal(0, 10) | -1 |
c | b | normal(0, 10) | 0 |
sigma | | student_t(3, 0, 2.5) | 0 |
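As an illustration of the functional form, the simplified three-parameter model can be recovered from synthetic rank-frequency data by non-linear least squares on the original scale. This is only a sketch, not the Bayesian fit used here (which uses the priors above), and the synthetic parameter values are chosen to loosely match the DS = 3 estimates:

```python
import numpy as np
from scipy.optimize import curve_fit

def zipf_mandelbrot(r, a, b, c):
    """Simplified Zipf-Mandelbrot: c is a free parameter rather than
    an analytic normalization constant."""
    return c / (r + b) ** a

# Noise-free synthetic rank-frequency data from known parameters.
ranks = np.arange(1, 301)
freqs = zipf_mandelbrot(ranks, 1.2, 5.7, 0.5)

# Fit on the original (not log-log) scale, with bounds a > 1, b > -1, c > 0.
popt, _ = curve_fit(zipf_mandelbrot, ranks, freqs,
                    p0=[1.5, 1.0, 1.0],
                    bounds=([1, -1, 0], [np.inf, np.inf, np.inf]))
print(popt)  # ≈ [1.2, 5.7, 0.5]
```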
DS | Zipf | Zipf-Mandelbrot |
---|---|---|
1 | -799 | -1,075 |
2 | -5,328 | -7,817 |
3 | -17,566 | -25,617 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 1.01 | 1.00 | 1.03 | 1 | 5,510 | 3,580 |
c | 0.23 | 0.22 | 0.24 | 1 | 7,223 | 6,440 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 2.20 | 2.03 | 2.39 | 1 | 2,129 | 2,702 |
b | 4.66 | 4.00 | 5.41 | 1 | 2,132 | 2,653 |
c | 9.43 | 5.22 | 16.68 | 1 | 2,102 | 2,515 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 1.00 | 1.00 | 1.00 | 1 | 6,264 | 4,253 |
c | 0.08 | 0.08 | 0.09 | 1 | 7,540 | 5,954 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 1.20 | 1.18 | 1.22 | 1 | 2,132 | 1,965 |
b | 5.69 | 5.42 | 5.98 | 1 | 2,187 | 2,477 |
c | 0.52 | 0.48 | 0.56 | 1 | 2,094 | 2,123 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 1.00 | 1.00 | 1.00 | 1 | 6,266 | 4,117 |
c | 0.03 | 0.03 | 0.03 | 1 | 8,434 | 7,188 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 1.08 | 1.07 | 1.09 | 1 | 2,079 | 2,879 |
b | 17.29 | 16.79 | 17.79 | 1 | 2,076 | 3,021 |
c | 0.34 | 0.32 | 0.36 | 1 | 2,032 | 2,774 |
Year | DS: 2 | DS: 3 | DS: 4 |
---|---|---|---|
1975 | -174 | -1,263 | -5,207 |
2012 | -156 | -1,958 | -5,518 |
2019 | -212 | -1,700 | -4,813 |
Year | DS: 2 | DS: 3 | DS: 4 |
---|---|---|---|
1975 | 0.993 | 0.985 | 0.994 |
2012 | 0.987 | 0.991 | 0.986 |
2019 | 0.991 | 0.993 | 0.996 |
It appears that the Zipf-Mandelbrot distribution has a poorer fit to the rarer syllable types. Systematic departure from the Zipf-Mandelbrot distribution at high ranks is a well-known phenomenon in human language (22–24). It is thought to be caused by suboptimal sample sizes (25) and other data quality issues (26), and can be reproduced by more sophisticated models of Zipf’s rank-frequency law (27,28).
Below are some simple simulations that demonstrate this principle in the house finch song data, conducted with the following steps at the intermediate level of granularity in syllable clustering (deep split of 3):
Figure S5: The relationship between rank (x-axis) and count (y-axis) at the intermediate granularity level (DS = 3, 596 syllable types). The orange lines denote the expected distributions according to Mandelbrot’s extension of Zipf’s rank-frequency law (a = 1.20, b = 5.69, c = 0.52). The points in the left and right panel are the expected counts and ranks when 42,099 and 1,000,000 tokens are sampled, respectively.
As you can see, there is a subtle but systematic departure from the Zipf-Mandelbrot distribution at high ranks when 42,099 tokens are sampled from the theoretical distribution (the same number observed in the real dataset). When 1,000,000 tokens are sampled, this departure disappears. The logic here is that when sample sizes are limited, rare types tend to be under-counted, leading to a “droop” (28) in the curve at higher ranks.
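A minimal version of this simulation can be sketched with the DS = 3 parameter estimates (this is an illustration, not the original simulation code):

```python
import numpy as np

rng = np.random.default_rng(42)

# Theoretical Zipf-Mandelbrot probabilities over the 596 syllable types,
# using the DS = 3 estimates (a = 1.20, b = 5.69; the constant c drops
# out once the distribution is normalized).
ranks = np.arange(1, 597)
p = 1 / (ranks + 5.69) ** 1.20
p /= p.sum()

def tail_ratio(n_tokens, k=50):
    """Ratio of observed to expected counts for the k rarest ranks,
    after re-ranking the sampled counts."""
    observed = np.sort(rng.multinomial(n_tokens, p))
    expected = n_tokens * np.sort(p)
    return observed[:k].sum() / expected[:k].sum()

# Under-counting of rare types at the empirical sample size (42,099)
# produces the droop; it largely disappears at 1,000,000 tokens.
print(tail_ratio(42_099), tail_ratio(1_000_000))
```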
Figure S6: The relationship between four measures of production cost (x-axis) and count (y-axis) for each syllable type at each level of deep split (left, center, right). Each point shows the median value for a syllable type, so the orange best fit lines are from a simple Poisson model (count ~ cost) rather than the full log-normal model.
Figure S7: The relationship between four measures of production cost (x-axis) and count (y-axis) for each token (observation of each syllable type) at each level of deep split (left, center, right). Note that the y-axis is the count of each syllable type, which is why values are repeated across the individual tokens.
Class | Prior | Lower Bound |
---|---|---|
b | normal(0, 0.5) | |
Intercept | normal(0, 4) | 0 |
sd | normal(0, 0.5) | 0 |
sigma | normal(0, 0.5) | 0 |
Outcome | DS | Predictor | Rhat | Bulk ESS | Tail ESS |
---|---|---|---|---|---|
Duration | 2 | Intercept | 1 | 712 | 913 |
Duration | 2 | Count | 1 | 794 | 1,064 |
Duration | 3 | Intercept | 1 | 451 | 915 |
Duration | 3 | Count | 1 | 566 | 1,069 |
Duration | 4 | Intercept | 1 | 268 | 580 |
Duration | 4 | Count | 1 | 302 | 597 |
Bandwidth | 2 | Intercept | 1 | 835 | 1,801 |
Bandwidth | 2 | Count | 1 | 1,047 | 2,088 |
Bandwidth | 3 | Intercept | 1 | 184 | 448 |
Bandwidth | 3 | Count | 1 | 230 | 586 |
Bandwidth | 4 | Intercept | 1 | 170 | 406 |
Bandwidth | 4 | Count | 1 | 230 | 525 |
Concavity | 2 | Intercept | 1 | 1,059 | 2,059 |
Concavity | 2 | Count | 1 | 1,354 | 2,695 |
Concavity | 3 | Intercept | 1 | 1,307 | 2,561 |
Concavity | 3 | Count | 1 | 1,546 | 3,209 |
Concavity | 4 | Intercept | 1 | 723 | 1,469 |
Concavity | 4 | Count | 1 | 775 | 1,620 |
Excursion | 2 | Intercept | 1 | 1,301 | 2,456 |
Excursion | 2 | Count | 1 | 1,508 | 2,661 |
Excursion | 3 | Intercept | 1 | 735 | 1,398 |
Excursion | 3 | Count | 1 | 855 | 1,592 |
Excursion | 4 | Intercept | 1 | 403 | 771 |
Excursion | 4 | Count | 1 | 479 | 956 |
Model | DS | Est. | 2.5% | 97.5% | |
---|---|---|---|---|---|
duration ~ count | 2 | -0.38 | -0.66 | -0.10 | * |
duration ~ count | 3 | -0.47 | -0.59 | -0.35 | * |
duration ~ count | 4 | -0.42 | -0.48 | -0.36 | * |
bandwidth ~ count | 2 | -0.56 | -0.77 | -0.35 | * |
bandwidth ~ count | 3 | -0.69 | -0.80 | -0.58 | * |
bandwidth ~ count | 4 | -0.64 | -0.70 | -0.58 | * |
concavity ~ count | 2 | -0.02 | -0.21 | 0.17 | |
concavity ~ count | 3 | -0.06 | -0.15 | 0.03 | |
concavity ~ count | 4 | -0.07 | -0.13 | -0.02 | * |
excursion ~ count | 2 | -0.27 | -0.42 | -0.12 | * |
excursion ~ count | 3 | -0.34 | -0.41 | -0.26 | * |
excursion ~ count | 4 | -0.32 | -0.36 | -0.28 | * |
I replicated the analysis using a method recently proposed by Lewis et al. (29) and released as the R package ZLAvian by Gilman et al. (30). This method computes Kendall’s tau, or the concordance between duration and frequency, where (tau + 1)/2 is the probability that if two random syllables are sampled (from either individuals or from the population) the longer note will be more common. The estimated tau is then compared against a null distribution that conservatively accounts for social learning (30).
At the population level, Kendall’s tau is significantly lower than the null for all four measures of production cost at every level of granularity in syllable clustering. There is also an individual level effect for bandwidth and excursion at all three levels of granularity, but the results are more mixed for duration and concavity. I think this is likely because we only have a median of 6 songs recorded per individual bird. I also replicated the analysis using data from a cross-fostering experiment (12), where individual birds were much more heavily sampled (100, 64, and 62 songs recorded for each one), and found individual level effects of duration, bandwidth, and excursion (see Replication with Data from Mann et al. below).
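For illustration, the tau statistic and its probability interpretation can be computed with toy numbers. This sketch uses scipy rather than ZLAvian, and does not reproduce the permutation-based null distribution that accounts for social learning:

```python
import numpy as np
from scipy.stats import kendalltau

# Toy repertoire: median durations (s) and counts for six syllable types,
# constructed so that longer types are strictly rarer.
duration = np.array([0.05, 0.08, 0.10, 0.14, 0.17, 0.22])
count = np.array([900, 700, 400, 220, 150, 60])

tau, _ = kendalltau(duration, count)

# (tau + 1) / 2 is the probability that, of two randomly sampled
# syllable types, the longer one is also the more common one.
p_longer_more_common = (tau + 1) / 2
print(tau, p_longer_more_common)  # tau = -1 here, so the probability is 0
```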
Parameter | DS | Level | tau | p-value | |
---|---|---|---|---|---|
duration | 2 | individual | -0.20 | 3.0e-03 | ** |
duration | 2 | population | -0.32 | 1.9e-07 | **** |
duration | 3 | individual | -0.06 | 1.2e-01 | |
duration | 3 | population | -0.35 | 7.1e-36 | **** |
duration | 4 | individual | 0.02 | 4.8e-01 | |
duration | 4 | population | -0.33 | 6.8e-88 | **** |
bandwidth | 2 | individual | -0.32 | 1.0e-03 | *** |
bandwidth | 2 | population | -0.48 | 1.8e-14 | **** |
bandwidth | 3 | individual | -0.18 | 1.0e-03 | *** |
bandwidth | 3 | population | -0.42 | 8.3e-53 | **** |
bandwidth | 4 | individual | -0.09 | 1.0e-03 | *** |
bandwidth | 4 | population | -0.38 | 1.3e-115 | **** |
concavity | 2 | individual | -0.09 | 7.2e-02 | |
concavity | 2 | population | -0.24 | 1.0e-04 | *** |
concavity | 3 | individual | -0.04 | 4.5e-02 | * |
concavity | 3 | population | -0.17 | 3.0e-10 | **** |
concavity | 4 | individual | -0.02 | 8.6e-02 | |
concavity | 4 | population | -0.12 | 3.1e-12 | **** |
excursion | 2 | individual | -0.24 | 1.0e-03 | *** |
excursion | 2 | population | -0.36 | 5.4e-09 | **** |
excursion | 3 | individual | -0.19 | 1.0e-03 | *** |
excursion | 3 | population | -0.31 | 9.9e-29 | **** |
excursion | 4 | individual | -0.14 | 1.0e-03 | *** |
excursion | 4 | population | -0.27 | 2.2e-57 | **** |
I contacted two labs with large corpora of field-recorded house finches, and neither of them had manually-classified syllables with analyzed acoustic features. However, there is one cross-fostering study, conducted by Paul Mundinger in 1972 and analyzed and published by Mann et al. in 2021 (12), that includes both manually-classified syllables and acoustic features. I replicated the analysis using these data to ensure that my results for Zipf’s law of abbreviation were not an artifact of the automated classification procedure.
Mann et al. (12) compared the acoustic features of songs from (1) house finches tutored by other house finches, (2) house finches tutored by canaries, (3) house finches reared without tutors, and (4) canaries. Mann et al. (12) classified syllable types separately for each individual bird, so I analyzed the repertoires of the most heavily sampled bird from the first three categories (B1-74 (house finch tutor): 62 songs; B1 (canary tutor): 100 songs; D4 (no tutor): 64 songs). The results of the model, run with the same specification as in the main analysis, can be seen below.
Model | Tutor | Est. | Err. | 2.5% | 97.5% | |
---|---|---|---|---|---|---|
duration ~ count | House Finch | -0.1679 | 0.0839 | -0.3329 | -0.0004 | * |
duration ~ count | Canary | -0.3484 | 0.1303 | -0.6059 | -0.0866 | * |
duration ~ count | None | -0.2329 | 0.2127 | -0.6402 | 0.2005 | |
bandwidth ~ count | House Finch | -0.1874 | 0.0832 | -0.3518 | -0.0275 | * |
bandwidth ~ count | Canary | -0.1933 | 0.1708 | -0.5372 | 0.1501 | |
bandwidth ~ count | None | -0.3981 | 0.1714 | -0.7191 | -0.0353 | * |
concavity ~ count | House Finch | -0.0980 | 0.2890 | -0.6572 | 0.4836 | |
concavity ~ count | Canary | -0.3833 | 0.3076 | -0.9716 | 0.2296 | |
concavity ~ count | None | -0.0978 | 0.3652 | -0.8137 | 0.6168 | |
excursion ~ count | House Finch | -0.2081 | 0.1051 | -0.4213 | -0.0025 | * |
excursion ~ count | Canary | -0.3189 | 0.1519 | -0.6124 | -0.0129 | * |
excursion ~ count | None | -0.1825 | 0.1639 | -0.5052 | 0.1580 | |
Under typical circumstances, where a house finch learns song from another house finch, the results are qualitatively the same as the main analysis—duration, bandwidth, and excursion all have strong negative effects on count. Interestingly, only duration and excursion negatively predict count in the canary-tutored house finch, and only bandwidth negatively predicts count in the house finch reared without a tutor.
I also replicated the analysis using a method recently proposed by Lewis et al. (29) and released as the R package ZLAvian by Gilman et al. (30). This method computes Kendall’s tau, or the concordance between duration and frequency, where (tau + 1)/2 is the probability that if two random syllables are sampled the longer note will be more common. The estimated tau is then compared against a null distribution that conservatively accounts for social learning (30). Kendall’s tau can be computed at the population level (i.e. random syllables are sampled from the population rather than a single bird, see above), but here I only report the results at the individual level since I am using data from individual repertoires.
Parameter | Tutor | tau | p | |
---|---|---|---|---|
duration | house finch | -0.301 | 0.006 | ** |
duration | canary | -0.515 | 0.009 | ** |
duration | none | -0.506 | 0.019 | * |
bandwidth | house finch | -0.264 | 0.012 | * |
bandwidth | canary | -0.116 | 0.287 | |
bandwidth | none | -0.460 | 0.030 | * |
concavity | house finch | -0.164 | 0.104 | |
concavity | canary | -0.077 | 0.321 | |
concavity | none | -0.214 | 0.186 | |
excursion | house finch | -0.315 | 0.003 | ** |
excursion | canary | -0.429 | 0.012 | * |
excursion | none | -0.230 | 0.159 | |
The results are mostly consistent with the Bayesian model above: Kendall’s tau is significantly negative for duration under all conditions, for bandwidth when birds learned from a house finch or were raised in isolation, and for excursion when birds learned from a house finch or a canary. Interestingly, we found that Kendall’s tau for duration is consistently significantly negative at the individual level for the Mann et al. (12) data, whereas it is only consistently significantly negative at the population level for the Youngblood and Lahti (3) data (see Replication using Kendall’s tau above). This is likely because there are many more songs recorded from the individual birds in the Mann et al. (12) data (N = 100, 64, and 62) compared to the Youngblood and Lahti (3) data (median of 6).
Model | DS | Est. | 2.5% | 97.5% | |
---|---|---|---|---|---|
duration ~ count | 2 | -0.94 | -1.65 | -0.23 | * |
duration ~ count | 3 | -0.97 | -1.27 | -0.66 | * |
duration ~ count | 4 | -0.79 | -0.94 | -0.65 | * |
bandwidth ~ count | 2 | -1.23 | -1.73 | -0.73 | * |
bandwidth ~ count | 3 | -1.38 | -1.66 | -1.11 | * |
bandwidth ~ count | 4 | -1.19 | -1.33 | -1.04 | * |
concavity ~ count | 2 | 0.00 | -0.15 | 0.14 | |
concavity ~ count | 3 | 0.01 | -0.13 | 0.14 | |
concavity ~ count | 4 | 0.02 | -0.07 | 0.10 | |
excursion ~ count | 2 | -0.62 | -0.93 | -0.31 | * |
excursion ~ count | 3 | -0.73 | -0.93 | -0.53 | * |
excursion ~ count | 4 | -0.67 | -0.79 | -0.55 | * |
Model | DS | Est. | 2.5% | 97.5% | |
---|---|---|---|---|---|
duration ~ count | 2 | -0.28 | -0.40 | -0.15 | * |
duration ~ count | 3 | -0.32 | -0.39 | -0.24 | * |
duration ~ count | 4 | -0.29 | -0.33 | -0.25 | * |
bandwidth ~ count | 2 | -0.40 | -0.54 | -0.26 | * |
bandwidth ~ count | 3 | -0.51 | -0.60 | -0.43 | * |
bandwidth ~ count | 4 | -0.49 | -0.53 | -0.44 | * |
concavity ~ count | 2 | -0.01 | -0.07 | 0.05 | |
concavity ~ count | 3 | 0.01 | -0.04 | 0.05 | |
concavity ~ count | 4 | 0.01 | -0.02 | 0.05 | |
excursion ~ count | 2 | -0.20 | -0.27 | -0.13 | * |
excursion ~ count | 3 | -0.24 | -0.28 | -0.19 | * |
excursion ~ count | 4 | -0.22 | -0.25 | -0.19 | * |
Class | Prior | Lower Bound |
---|---|---|
b | normal(0, 0.1) | |
Intercept | normal(0, 3) | 0 |
sd | normal(0, 0.5) | 0 |
sigma | normal(0, 0.5) | 0 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
Intercept | 4.48 | 4.47 | 4.49 | 1 | 6,740 | 7,103 |
Song Length | -0.05 | -0.07 | -0.04 | 1 | 4,794 | 6,044 |
Dataset | Simple Null: Intercept | Simple Null: Song Length | Production Null: Intercept | Production Null: Song Length |
---|---|---|---|---|
1 | 0.9998875 | 0.9996983 | 0.9997456 | 1.000656 |
2 | 0.9997542 | 0.9997727 | 0.9999726 | 1.000328 |
3 | 1.0000627 | 0.9997787 | 0.9998281 | 1.001598 |
4 | 0.9997391 | 0.9997015 | 1.0005752 | 1.000895 |
5 | 0.9999400 | 1.0004971 | 0.9997567 | 1.000817 |
6 | 0.9999405 | 0.9998756 | 0.9997009 | 1.000332 |
7 | 0.9999780 | 0.9998101 | 0.9997015 | 1.001533 |
8 | 1.0005872 | 0.9998251 | 0.9998900 | 1.000225 |
9 | 0.9999938 | 0.9997752 | 0.9998704 | 1.001099 |
10 | 0.9997444 | 0.9998224 | 1.0001865 | 1.002680 |
Param. | Estimate | 2.5% | 97.5% |
---|---|---|---|
Intercept | 4.500 | 4.40 | 4.600 |
Song Length | -0.047 | -0.06 | -0.033 |
Year | DS: 2 | DS: 3 | DS: 4 |
---|---|---|---|
1975 | 1.88 | 5.58 | 10.45 |
2012 | 1.76 | 5.71 | 13.34 |
2019 | 2.09 | 6.95 | 14.30 |
DS | Model | WAIC | R-Sq |
---|---|---|---|
2 | Exponential | -829 | 0.951 |
Power-Law | -766 | 0.910 | |
Composite | -840 | 0.952 | |
3 | Exponential | -617 | 0.982 |
Power-Law | -487 | 0.927 | |
Composite | -668 | 0.986 | |
4 | Exponential | -599 | 0.990 |
Power-Law | -432 | 0.926 | |
Composite | -688 | 0.993 |
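For illustration, the three decay curves can be compared outside of the Bayesian framework with non-linear least squares. The functional forms below (exponential \(a e^{-bx}\), power-law \(c x^{-d}\), and their sum for the composite) are assumptions on my part, and the synthetic data stand in for the real mutual information estimates:

```python
import numpy as np
from scipy.optimize import curve_fit

# Candidate decay curves for mutual information at inter-syllable distance x.
def exponential(x, a, b):
    return a * np.exp(-b * x)

def power_law(x, c, d):
    return c * x ** (-d)

def composite(x, a, b, c, d):
    return a * np.exp(-b * x) + c * x ** (-d)

# Synthetic decay data generated from the composite form plus noise.
rng = np.random.default_rng(0)
x = np.arange(1, 101, dtype=float)
y = composite(x, 0.6, 0.3, 0.1, 0.6) + rng.normal(0, 0.002, x.size)

def sse(model, p0):
    """Sum of squared errors for the best least-squares fit of a model."""
    popt, _ = curve_fit(model, x, y, p0=p0, maxfev=20000)
    return float(np.sum((model(x, *popt) - y) ** 2))

errors = {
    "exponential": sse(exponential, [0.5, 0.5]),
    "power-law": sse(power_law, [0.5, 0.5]),
    "composite": sse(composite, [0.5, 0.5, 0.1, 0.5]),
}
print(min(errors, key=errors.get))  # the composite model fits best
```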
Parameter | Prior | Lower Bound |
---|---|---|
a | normal(0, 1) | 0 |
b | normal(0, 1) | 0 |
c | normal(0, 1) | 0 |
d | normal(0, 1) | 0 |
 | DS 2: Exp | DS 2: PL | DS 2: Comp | DS 3: Exp | DS 3: PL | DS 3: Comp | DS 4: Exp | DS 4: PL | DS 4: Comp |
---|---|---|---|---|---|---|---|---|---|
100 | -829 | -765 | -841 | -617 | -486 | -668 | -598 | -432 | -688 |
200 | -1,557 | -1,513 | -1,560 | -1,232 | -1,068 | -1,281 | -1,201 | -967 | -1,267 |
300 | -2,101 | -2,086 | -2,114 | -1,635 | -1,555 | -1,695 | -1,626 | -1,468 | -1,686 |
400 | -2,683 | -2,676 | -2,712 | -2,008 | -1,977 | -2,131 | -2,082 | -1,950 | -2,142 |
500 | -2,984 | -2,988 | -3,073 | -2,078 | -2,093 | -2,276 | -2,260 | -2,217 | -2,387 |
600 | -3,114 | -3,176 | -3,248 | -2,264 | -2,299 | -2,527 | -2,454 | -2,449 | -2,639 |
700 | -3,291 | -3,430 | -3,471 | -2,493 | -2,539 | -2,808 | -2,650 | -2,669 | -2,901 |
800 | -3,620 | -3,734 | -3,766 | -2,479 | -2,598 | -2,798 | -2,209 | -2,240 | -2,485 |
900 | -3,615 | -3,785 | -3,801 | -2,108 | -2,379 | -2,467 | -1,568 | -1,804 | -1,886 |
1000 | -3,712 | -3,931 | -3,942 | -1,958 | -2,350 | -2,408 | -1,235 | -1,608 | -1,654 |
1200 | -3,175 | -3,244 | -3,246 | -1,483 | -1,711 | -1,731 | -823 | -1,084 | -1,104 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.17 | 0.16 | 0.19 | 1 | 4,336 | 4,595 |
b | 0.39 | 0.35 | 0.44 | 1 | 4,511 | 4,996 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
c | 0.13 | 0.12 | 0.14 | 1 | 6,738 | 7,059 |
d | 1.08 | 1.01 | 1.15 | 1 | 6,437 | 6,285 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.16 | 0.12 | 0.18 | 1 | 3,354 | 3,685 |
b | 0.43 | 0.38 | 0.48 | 1 | 4,592 | 4,387 |
c | 0.02 | 0.00 | 0.04 | 1 | 3,021 | 2,579 |
d | 0.66 | 0.29 | 0.96 | 1 | 3,237 | 2,798 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.60 | 0.57 | 0.64 | 1 | 4,478 | 5,210 |
b | 0.27 | 0.26 | 0.29 | 1 | 4,453 | 5,343 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
c | 0.52 | 0.49 | 0.56 | 1 | 6,895 | 5,998 |
d | 0.95 | 0.90 | 1.00 | 1 | 7,133 | 7,055 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.53 | 0.47 | 0.59 | 1 | 2,820 | 3,907 |
b | 0.30 | 0.28 | 0.32 | 1 | 4,896 | 4,646 |
c | 0.08 | 0.04 | 0.13 | 1 | 2,882 | 2,864 |
d | 0.61 | 0.42 | 0.77 | 1 | 2,978 | 3,096 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.74 | 0.71 | 0.77 | 1 | 3,787 | 4,572 |
b | 0.24 | 0.23 | 0.25 | 1 | 4,287 | 5,362 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
c | 0.69 | 0.64 | 0.73 | 1 | 6,835 | 6,785 |
d | 0.92 | 0.87 | 0.97 | 1 | 6,638 | 6,848 |
Param. | Estimate | l-95% CI | u-95% CI | Rhat | Bulk_ESS | Tail_ESS |
---|---|---|---|---|---|---|
a | 0.65 | 0.59 | 0.71 | 1 | 2,387 | 2,851 |
b | 0.26 | 0.25 | 0.27 | 1 | 4,566 | 4,959 |
c | 0.10 | 0.05 | 0.15 | 1 | 2,433 | 2,760 |
d | 0.61 | 0.45 | 0.74 | 1 | 2,360 | 2,726 |
DS | Year | Exponential | Power-Law | Composite |
---|---|---|---|---|
2 | 1975 | -653 | -607 | -656 |
2012 | -547 | -421 | -564 | |
2019 | -521 | -427 | -571 | |
3 | 1975 | -724 | -685 | -725 |
2012 | -598 | -446 | -607 | |
2019 | -588 | -427 | -640 | |
4 | 1975 | -687 | -671 | -723 |
2012 | -518 | -396 | -572 | |
2019 | -511 | -370 | -587 |
DS | Exponential | Power-Law | Composite |
---|---|---|---|
2 | -77 | -84 | -84 |
3 | -33 | -28 | -31 |
4 | -60 | -50 | -57 |
This analysis was conducted to test whether the ordering of songs within song bouts contributes mutual information to the decay curves.
Long-range dependencies in song sequences come from two places: the ordering of syllables within very long songs, and the ordering of songs within song bouts. To isolate the statistical signals of the latter, I created a “dummy” dataset where syllable sequences within songs were shuffled, but the ordering of bouts was the same. As a comparison, I created a second “dummy” dataset that has the same song sequences as the first, but with random song bouts (shuffled within individuals).
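The construction of the two dummy datasets can be sketched as follows (a toy example; the mutual information estimation itself is not shown):

```python
import random

random.seed(3)

# Toy corpus: one individual's bout, as a list of songs (lists of syllable types).
bout = [["A", "B", "C"], ["A", "B", "D"], ["E", "F", "A"]]

# Dummy dataset 1: shuffle syllables *within* each song but keep the song
# order, so any remaining long-range structure comes from song ordering.
within_shuffled = [random.sample(song, len(song)) for song in bout]

# Dummy dataset 2: the same shuffled songs, but with the song order
# randomized within the individual, destroying between-song structure too.
between_shuffled = random.sample(within_shuffled, len(within_shuffled))

print(within_shuffled, between_shuffled)
```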
Wilcoxon signed-rank tests and t-tests show that the first dataset contains significantly more information than the second at all three levels of granularity in syllable clustering.
| DS: 2 | DS: 3 | DS: 4 |
---|---|---|---|
Wilcoxon signed-rank test | 3.2e-13 | 5.6e-26 | 3.2e-08 |
t-test | 1.7e-02 | 6.2e-11 | 2.4e-02 |
A long-standing critique of Zipf’s laws is that they may be statistical artifacts of other processes (31), starting with Miller’s observation that randomly typing on a keyboard can produce similar patterns (32). That being said, random typing accounts are not realistic causal descriptions of how communication systems emerge, and there are good empirical reasons to doubt that they undermine efficiency accounts (24). Randomly-generated texts produce rank-frequency distributions that differ from those in real corpora (33), random typing models are not truly neutral as they can be mathematically reframed as minimizing costs (34,35), and there is direct experimental evidence that Zipfian abbreviation emerges from pressure for efficient communication (36). In my view, the most important contribution of the random typing account is to highlight that the problem of equifinality—different processes leading to similar outcomes (37)—means that patterns resembling Zipf’s laws are not sufficient to draw conclusions about efficiency (38,39). Multiple lines of evidence should be presented alongside other work demonstrating that efficiency is shaping the system (e.g. physical (12) and environmental (40) constraints), as I have done here. See (39), (24), and (38) for more complete summaries of this debate.
Outside of linguistics, efficiency and complexity are often discussed in relation to cumulative cultural evolution (CCE). Definitions of CCE vary and a full review is outside of the scope of this study, but for convenience we will use the definition of (41): “the accumulation of sequential changes within a single socially learned behavior that results in improved function”. Discussions of CCE often focus on increasing complexity over time (42), which was once thought to be a hallmark of human culture (43) but has now been observed in several non-human communication systems including humpback whale (44) and Savannah sparrow song (41). (45) make a convincing argument that efficiency deserves more attention in CCE, as increases in complexity in one domain require increases in efficiency in another (see Equation 1 in the Introduction). House finch song may be a good research model for how the interplay between efficiency and complexity drives CCE, as male house finches have a social learning bias for more complex syllables (3), possibly as an adaptation to female preferences for more complex songs (46–48), and there appears to be pressure for efficiency at the level of both syllables and songs. That being said, CCE may not be the best framework for understanding the interaction between efficiency and complexity in birdsong, as its logic is more difficult to apply to “aesthetic” behavior (49) especially when it is optimized for female preferences that evolve to maximize inclusive fitness rather than the specific properties of songs that males sing (50).
House finch song exhibits language-like efficiency and structure, but music-like structure has not been similarly studied in this species. In the last two decades researchers have identified aspects of birdsong, such as rhythm and pitch intervals in thrush nightingales (51–53), that closely resemble aspects of human music. Future studies should explore language- and music-like properties of birdsong in parallel across multiple levels of granularity to inform the ongoing debate about whether birdsong is more akin to music or language (54–56).