Abstract
Vocal communication systems in humans and other animals experience selection for efficiency, that is, for optimizing the benefits they convey relative to the costs of producing them. Two hallmarks of efficiency, Menzerath’s law and Zipf’s law of abbreviation, predict that longer sequences will consist of shorter elements and that more frequent elements will be shorter, respectively. Here, I assessed the evidence for both laws in cetaceans by analyzing vocal sequences from 16 baleen and toothed whale species and comparing them to 51 human languages. Eleven whale species exhibit Menzerath’s law, sometimes with greater effect sizes than human speech. Two of the five whale species with categorized element types exhibit Zipf’s law of abbreviation. On average, whales also tend to shorten elements and intervals towards the end of sequences, although this varies by species. Overall, the results of this study suggest that the vocalizations of many cetacean species have undergone compression for increased efficiency in time.
Vocal communication is essential to survival and reproduction in many species, as it enables individuals to convey critical information related to predation, resource access, courtship, and social relationships (1). More complex signals, which vary across multiple dimensions, can encode greater amounts of information (2), and redundancy increases the likelihood of successful transmission between signalers and receivers (3). However, elaborate and sustained vocalizations carry considerable costs, including heightened predation risk (4) and increased energetic demands, which can reach 2 to 8 times the resting metabolic rate in certain species (5). Consequently, vocal communication systems experience selection for efficiency (6), that is, for optimizing the benefits they convey relative to the costs of producing them (7, 8). This concept is closely related to the “principle of least effort” in linguistics (9).
One of the simplest ways to increase efficiency is by reducing vocalization time (4). Individuals who convey the same information in less time incur lower metabolic costs (10) and are less likely to be detected by predators and potential prey (4). Vocalization time can evolve in response to factors that alter the relative costs and benefits of communication, like group size (11), as well as physical features that affect vocal production (12). Within species, vocalization time may also change over generations through cultural evolution (i.e., via social learning) (13), and within individuals during ontogeny (14) or as a flexible response to anthropogenic noise (15–18), in a way that optimizes efficiency.
In human language, efficiency is often quantified through two linguistic laws that directly relate to vocalization time: Menzerath’s law and Zipf’s law of abbreviation. Imagine a set of sequences (e.g., sentences, words, songs), each composed of multiple elements (e.g., words, phonemes, notes). Menzerath’s law predicts that longer sequences (e.g., songs, words) will be composed of shorter elements (e.g., notes, phonemes) (19). In other words, when production costs increase in one domain (e.g., sequence length) they decrease in another (e.g., element duration). Zipf’s law of abbreviation predicts that more frequently used elements (e.g., notes, phonemes, words) will be shorter in duration (9). Both laws result in an overall reduction in vocalization time, and mathematical modeling work indicates that they emerge from pressure for more efficient communication (20–22).
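As a minimal illustration of what the two laws predict, the Python sketch below checks each one on a small invented dataset. It is not the analysis used in this study. The function names, the toy data, the use of a plain Pearson correlation, and the choice to measure sequence length as the number of elements are all assumptions made for illustration. A negative correlation in either case is in the direction the corresponding law predicts.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def menzerath_correlation(sequences):
    """Correlate sequence length (number of elements, an assumption)
    with the mean element duration of each sequence."""
    lengths = [len(seq) for seq in sequences]
    mean_durations = [mean(d for _, d in seq) for seq in sequences]
    return pearson(lengths, mean_durations)


def zipf_abbreviation_correlation(sequences):
    """Correlate how often each element type occurs with its mean duration."""
    durations_by_type = defaultdict(list)
    for seq in sequences:
        for element_type, duration in seq:
            durations_by_type[element_type].append(duration)
    counts = [len(ds) for ds in durations_by_type.values()]
    mean_durations = [mean(ds) for ds in durations_by_type.values()]
    return pearson(counts, mean_durations)


# Hypothetical toy data: each sequence is a list of
# (element_type, duration_in_seconds) pairs.
toy = [
    [("a", 0.50), ("b", 0.90)],
    [("a", 0.40), ("a", 0.45), ("b", 0.70), ("c", 1.00)],
    [("a", 0.30), ("a", 0.30), ("a", 0.35), ("b", 0.60), ("a", 0.30), ("c", 0.90)],
]

print("Menzerath correlation:", menzerath_correlation(toy))
print("Zipf abbreviation correlation:", zipf_abbreviation_correlation(toy))
```

In the toy data, longer sequences have shorter elements on average and the most frequent type ("a") is the shortest, so both correlations come out negative. Real datasets would call for models that account for repeated measures within individuals and species, which this sketch ignores.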
Outside of human language, Menzerath’s law and Zipf’s law of abbreviation have been observed in an increasing number of species, including gibbons (23), African penguins (24), and house finches (25). Comparative studies assessing both Menzerath’s law and Zipf’s law of abbreviation within the same species, however, reveal an interesting discrepancy: the former is always found (23–34), whereas the latter only appears in around half of cases (23, 24, 31–34). As others have noted, this discrepancy may stem from the laws reflecting different mechanisms or constraints (29, 35).
One hypothesis for this pattern is that Menzerath’s law has primarily physical origins, driven by natural selection for a more efficient vocal apparatus. Menzerath’s law in humans appears to be stronger in spoken than in written language (22, 36), deafened canaries and zebra finches produce songs consistent with the law without hearing adult models (37), and African penguins display the law without engaging in vocal learning (24). In contrast, Zipf’s law of abbreviation may result from a more complex combination of factors (7). Physical efficiency appears to be important, as common words are shorter and have more easily articulated phoneme sequences (38), but predictability and informativeness may also play a role. Experiments with artificial languages show that Zipfian abbreviation emerges when participants are under pressure to be both informative and fast (39), speakers shorten words when their meaning is predictable from context (40), and information content may predict word length more than frequency in some conditions (41). The two laws, then, may have very different prerequisites. Menzerath’s law might arise wherever vocalizations occur in sequences, regardless of whether learning is involved. In contrast, Zipf’s law of abbreviation may require that elements form distinct categories that vary in predictability and convey meaningful information.
Communicative efficiency is relatively understudied in cetaceans. To my knowledge, Menzerath’s law has only been observed in bottlenose dolphins (31, 33), and Zipf’s law of abbreviation has only been observed in humpback whales and bottlenose dolphins (31, 33, 42). Given cetaceans’ extensive reliance on learned vocalizations for complex social behavior, from courtship in baleen whales to individual recognition and coordination in toothed whales (43), they offer a valuable research model for efficiency in non-human communication. Additionally, the breadth of data on cetacean vocalizations makes it possible to conduct a meta-analysis, assessing the prevalence and strength of Menzerath’s law and Zipf’s law of abbreviation in a wide range of species using previously published datasets. To date, comprehensive meta-analyses of these two laws in vocal communication have only been done for human speech, where both appear to be statistical universals (22, 35), and birdsong, where Menzerath’s law is widespread (37) but Zipf’s law of abbreviation is quite rare (44). The aims of this meta-analysis were to (1) determine the prevalence of Menzerath’s law and Zipf’s law of abbreviation in cetaceans, and (2) directly compare the strength of the laws in cetaceans with spoken human language data; in other words, to assess whether vocal efficiency in cetaceans is “language-like”.
In studies of Menzerath’s law in vocal communication, duration is typically measured in one of two ways: (1) from the start to the end of a sound, or (2) from the start of one sound to the start of the next. The first method, which captures only the vocalization time and excludes pauses, is widely used for animal communication (20, 23, 26, 32, 34, 37). I refer to this as the element duration: the difference between a sound’s start and end time. The second method measures the vocalization time including the pause before the next sound. This approach has been used for marmosets (45) and bottlenose dolphins (31), and is standard for human speech (22, 36, 46), which is fairly continuous. Large spoken language corpora, such as Glissando (36), Buckeye (22), and DoReCo (46), include the small gaps between phonemes in their duration measurements. More broadly, this measure is the “go-to” for studies of rhythm in humans and animals (47). Following the rhythm literature, I refer to this as the inter-onset interval: the difference between the start of one sound and the start of the next. A couple of studies in treefrogs (27) and geladas (20) have assessed Menzerath’s law using only the pauses between sounds, to supplement analyses of element durations, but this approach is rare and is not used in this study.
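The difference between these measures can be made concrete with a short Python sketch. It assumes, purely for illustration, that a sequence is stored as a list of (start, end) times in seconds. The function names and example values are hypothetical and are not taken from the datasets analyzed here.

```python
def element_durations(events):
    """Element duration: end time minus start time of each sound."""
    return [end - start for start, end in events]


def inter_onset_intervals(events):
    """Inter-onset interval: start of the next sound minus start of this one.
    A sequence of n sounds therefore yields n - 1 intervals."""
    onsets = [start for start, _ in events]
    return [nxt - cur for cur, nxt in zip(onsets, onsets[1:])]


def pauses(events):
    """Silent gap: start of the next sound minus end of this one."""
    return [nxt[0] - cur[1] for cur, nxt in zip(events, events[1:])]


# Hypothetical sequence of three sounds as (start, end) times in seconds.
events = [(0.0, 0.5), (1.0, 1.25), (2.0, 2.5)]

print(element_durations(events))      # [0.5, 0.25, 0.5]
print(inter_onset_intervals(events))  # [1.0, 1.0]
print(pauses(events))                 # [0.5, 0.75]
```

Each inter-onset interval equals the element duration plus the pause that follows it, which is why the final sound in a sequence has an element duration but no inter-onset interval.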
In cetacean vocalizations, element durations are typically used when sequences consist of distinct notes, calls, or elements, with information thought to be encoded in acoustic features like frequency, bandwidth, and timbre (analogous to birdsong, second and third rows of Figure 1). In contrast, inter-onset intervals are used when sequences are made up of uniform clicks or pulses, where the rhythmic timing is thought to encode information (analogous to human drumming, fourth and fifth rows of Figure 1). It is worth noting that the latter case is quite different from human language (first row of Figure 1), where inter-onset intervals are used because gaps between phonemes are either absent or minimal. However, regardless of the measurement used (element durations or inter-onset intervals), Menzerath’s law reflects the same underlying principle: “the greater the whole the smaller its parts” (19, 48). In other words, when longer sequences are made up of smaller components, the total vocalization time is reduced. A recent study in marmosets illustrates this concept: when individuals were rewarded for producing an increasing number of vocalizations, they maximized their vocal efficiency by reducing both the element durations and inter-onset intervals of their call sequences (45). In this study, the distributions of element durations and inter-onset intervals in whale vocal sequences exhibit the same shape (Supplementary Information), and Menzerath’s law is only slightly different when computed from intervals in both whales (Table 2) and humans (Supplementary Information).