The modern Japanese color lexicon

Open Access

Article | March 2017

Ichiro Kuriki; Ryan Lange; Yumiko Muto; Angela M. Brown; Kazuho Fukuda; Rumi Tokunaga; Delwin T. Lindsey; Keiji Uchikawa; Satoshi Shioiri

Author Affiliations

Ichiro Kuriki

Research Institute of Electrical Communication, Tohoku University, Sendai, Japan
ikuriki@riec.tohoku.ac.jp
www.vision.riec.tohoku.ac.jp/ikuriki
Ryan Lange

College of Optometry, Ohio State University, Columbus, OH, USA
Yumiko Muto

Department of Information Processing, Tokyo Institute of Technology Graduate School, Yokohama, Japan
Angela M. Brown

College of Optometry, Ohio State University, Columbus, OH, USA
Kazuho Fukuda

Department of Information Design, Kogakuin University, Tokyo, Japan
Rumi Tokunaga

Ritsumeikan Global Innovation Research Organization, Ritsumeikan University, Kyoto, Japan
Delwin T. Lindsey

Department of Psychology, Ohio State University, Mansfield, OH, USA
College of Optometry, Ohio State University, Columbus, OH, USA
Keiji Uchikawa

Department of Information Processing, Tokyo Institute of Technology Graduate School, Yokohama, Japan
Satoshi Shioiri

Research Institute of Electrical Communication, Tohoku University, Sendai, Japan

Journal of Vision, March 2017, Vol. 17, 1. doi:https://doi.org/10.1167/17.3.1

Abstract

Despite numerous prior studies, important questions about the Japanese color lexicon persist, particularly about the number of Japanese basic color terms and their deployment across color space. Here, 57 native Japanese speakers provided monolexemic terms for 320 chromatic and 10 achromatic Munsell color samples. Through k-means cluster analysis we revealed 16 statistically distinct Japanese chromatic categories. These included eight chromatic basic color terms (aka/red, ki/yellow, midori/green, ao/blue, pink, orange, cha/brown, and murasaki/purple) plus eight additional terms: mizu (“water”)/light blue, hada (“skin tone”)/peach, kon (“indigo”)/dark blue, matcha (“green tea”)/yellow-green, enji/maroon, oudo (“sand or mud”)/mustard, yamabuki (“globeflower”)/gold, and cream. Of these additional terms, mizu was used by 98% of informants, and emerged as a strong candidate for a 12th Japanese basic color term. Japanese and American English color-naming systems were broadly similar, except for color categories in one language (mizu, kon, teal, lavender, magenta, lime) that had no equivalent in the other. Our analysis revealed two statistically distinct Japanese motifs (or color-naming systems), which differed mainly in the extension of mizu across our color palette. Comparison of the present data with an earlier study by Uchikawa & Boynton (1987) suggests that some changes in the Japanese color lexicon have occurred over the last 30 years.

Introduction

The association between color terms, the color categories they name, and the stimuli that elicit them is a classic model system for studying the relationship between words and their referents. This is because all languages have at least some color terms in their lexicons, because colors are easily specified quantitatively, and because the physiology of the perception of color is better understood than the perception of many other stimuli. Furthermore, a physiological response to color categories may be present even in prelinguistic infants (Yang, Kanazawa, Yamaguchi, & Kuriki, 2016), although there remains some controversy about whether the acquisition of language modifies those innate categories (Franklin, Clifford, Williamson, & Davies, 2005; Roberson, Davidoff, Davies, & Shapiro, 2006).

Much of the modern work on color and language has been inspired by three key proposals in the seminal work of Berlin and Kay (1969). The first is that world languages contain salient terms for colors—the basic color terms (BCTs)—that are monolexemic, that are known and used by all members of the language community, that can be used to communicate about the color of any type of object, and that name colors not covered by any other BCT. The second proposal is that the BCTs in every language name colors that are derived from a set of 11 universal categories. The third proposal is that cross-cultural differences in color naming exist because color lexicons are at different stages along a constrained trajectory of color-term evolution. As color lexicons evolve over time, they increase in size, adding BCTs in a highly constrained order. With this “universalist” framework in mind, we have examined the modern Japanese color lexicon. We compare it to the contemporaneous American English color lexicon, and we compare it to an earlier study of the Japanese color lexicon for evidence of recent evolution of Japanese color terms.

The study of the Japanese color lexicon is important for three interrelated reasons. First, Japanese is spoken in a modern, highly industrialized society, where the chromatic environment is as diverse and colorful as anywhere on earth. According to the universalist perspective, the Japanese color lexicon should therefore closely approximate the lexicons of English and other languages spoken in industrialized societies. Second, there remain several questions regarding the number of BCTs in the Japanese color lexicon. It is known from the earliest written records of vernacular Japanese (the Manyō-shū poems, dating from before 759 C.E.) that the Japanese words ao (blue) and midori (green) were used more or less interchangeably, in a usage pattern similar to the “grue” motif of Lindsey and Brown (2009) for color-naming systems with a single term for green-or-blue. In present-day Japanese, ao is still used to denote certain green things, as well as being an abstract color term for blue things in general, whereas midori names only green things. Moreover, historical linguists (e.g., Stanlaw, 2010) sometimes include kon (indigo) as a word for dark blue among Japanese BCTs. Therefore, it is possible that Japanese, like many world languages spoken in nonindustrialized societies, might not conform to the color-category structure seen in English. Third, a quantitative, empirical study of Japanese color naming that was conducted 30 years ago by Uchikawa and Boynton (1987; U&B) suggested that three nonbasic color terms, mizu (light blue), hada (peach), and kusa (yellow green), might achieve BCT status sometime in the future. Comparing the results of the present study to those of U&B allows us to examine the Japanese color lexicon for evidence of language change over the intervening years.

U&B investigated Japanese color naming from the universalist perspective of Berlin and Kay. Using a color palette consisting of the 425 samples comprising the OSA-UCS (Optical Society of America, Uniform Color Scale), U&B found that Japanese color terms conforming to Berlin and Kay's 11 BCTs showed better across-subjects consensus, better test–retest reliability, and shorter reaction time than other nonbasic Japanese color terms, including mizu, hada, kusa, and kon. The present study diverges from U&B's methodology in two important respects. First, whereas U&B's samples spanned a relatively narrow range of generally low chromas (saturations), the present study adopts the 330 Munsell samples that were used by Lindsey and Brown (2014; L&B) and most other modern studies of color naming, which contain a larger range of generally higher chromas.

Second, the present study applies several quantitative tools that Lindsey and Brown (2006, 2009, 2014) have developed over the years for analyzing color-naming data. They showed that cluster analysis and associated statistical techniques can reveal important regularities in a language's color lexicon that might be missed by ethnographic studies or analyses of frequency of word usage alone (Lindsey & Brown, 2006). In particular, cluster analysis offers an objective way of controlling for synonymy in color naming by ignoring the color terms themselves and focusing instead on how they are deployed across color space. In the languages that Lindsey and Brown have examined so far—the 110 languages spoken in nonindustrialized societies that were included in the World Color Survey (Kay, Berlin, Maffi, Merrifield, & Cook, 2009 [WCS]; Lindsey & Brown, 2006) as well as English (L&B), Somali (Brown, Isse, & Lindsey, 2016), and Hadzane (Lindsey, Brown, Brainard, & Apicella, 2015)—the color terms glossed by cluster analysis correspond, with minor variations and some additions, to the standard list of universal BCTs from Berlin and Kay.

Lindsey and Brown (2009) applied a second cluster analysis to the color lexicons of the WCS, revealing the existence of a limited number of common color-naming systems, which they called motifs. These motifs recur, with minor variation, throughout the WCS data set. Strikingly, almost all the languages in the WCS, as well as Somali (Brown et al., 2016) and English (L&B), contain multiple motifs among their speakers. The importance of the motif analysis is that it can reveal statistically significant regularities in subpopulations of informants in a diverse language community that would be missed if that community were assumed to be homogeneous. In the case of American English, speakers' color vocabularies are divided into two motifs. Those motifs refine Berlin and Kay's concept of the BCT in that, in addition to the 11 original BCTs, some color terms (e.g., teal, lavender, peach, and maroon) are “basic” for the individuals whose color idiolects fall into one motif but not for those whose color idiolects fall into the other motif.

The diversity of color idiolects seen in American English and elsewhere, as embodied in the motif concept, suggests a mechanism for color-term evolution that parallels biological evolution: Color lexicons evolve when the proportion of speakers in motifs with fewer BCTs declines and the proportion of speakers in motifs with more BCTs increases. It is from this theoretical perspective that we compare the present structure of the Japanese color lexicon to that observed 30 years ago by U&B.

Methods

Subjects

Thirty-two subjects (18 men, 14 women) from Tohoku University and 25 from the Tokyo Institute of Technology (12 men, 13 women) took part in this study. All were native speakers of Japanese. Most subjects were graduate students, but Japanese authors IK, YM, KF, and RT also participated. Only the authors were aware of the purposes of the study at the time they were tested. All subjects had normal or corrected-to-normal visual acuity, and their color vision was confirmed to be normal with Ishihara pseudoisochromatic plates. The experimental procedures followed the precepts of the Declaration of Helsinki and were approved by the ethics committees of Tohoku University and the Tokyo Institute of Technology.

Apparatus

Color samples and illuminant

The color samples, illuminant, and background color papers were similar to those used in the WCS. The 330 color chips used in the present study were from the Munsell Book of Color (glossy ed., X-Rite, Inc., www.munsell.com). The chips were chosen to match the WCS samples with respect to hue, chroma, and value (although the WCS samples were from the matte edition). Each chip was mounted on a cardboard square 5 cm by 5 cm covered with gray matte paper approximating N5/ (in Munsell notation). Four pairs of 40-W D₆₅-simulating fluorescent lamps with high color-rendering index (FLR40S-D-EDL-D65, Toshiba, Minato, Japan) were mounted on the ceiling of an observation booth, providing an illuminance of 2,713 lx. An amber filter covered four lamps to adjust the color temperature of the illuminant to approximate 6000 K.

Procedure

Subjects used a single, monolexemic color term to name each sample. They were not allowed to use compound color terms like ki-midori (yellow-green) or modifier words like usu-murasaki (pale purple). However, they were allowed to use the name of a substance if they felt that the name was generally agreed to represent a color and could be generalized to name the color of any type of object. Each session took about 40 min.

Cluster analysis

Analysis of the Japanese color-naming data was performed in two steps, both of which involved k-means cluster analysis. The first cluster analysis was used to extract two entities from the raw data sets: (a) an estimate of the number of statistically significant named chromatic color categories in the Japanese language and (b) the extensions of each of these categories across color space. The second step used the results of the first cluster analysis to examine the color-naming patterns (motifs) used by Japanese informants by (a) estimating the number of motifs and (b) determining their categorical structures.

All analyses were performed using MATLAB (MathWorks, Natick, MA) and Mathematica (Wolfram Research, Inc., Champaign, IL) software platforms. We used custom programs which had been used previously by Lindsey and Brown and their colleagues in their analyses of color-naming data (Brown et al., 2016; Lindsey & Brown, 2006, 2009, 2014). Here we present an overview of our methodology; additional details may be found in the specific references cited.

The first k-means cluster analysis was used to classify feature vectors representing the sets of color samples associated with each chromatic color term deployed by each of our Japanese informants. A chromatic color term was defined as a term used by a subject to name one or more of the 320 chromatic colors in the WCS chart but never used by that subject to name any of the 10 achromatic colors (achromatic color terms were handled separately, as described later). Each chromatic-term feature vector consisted of 320 elements, each of which was set to a value of 1 or 0 depending on whether (or not) the chromatic color term was used by the informant to name the WCS color sample represented by that particular vector element (for details, see Lindsey & Brown, 2006, 2009, 2014). The resulting 828 binary feature vectors obtained from the chromatic words used by our 57 Japanese subjects were then sorted into k clusters using the kmeans(.) function in MATLAB.
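The construction of the binary feature vectors can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function name `chromatic_feature_vectors`, the list-based data layout, and the toy responses are assumptions made for illustration only.

```python
from collections import defaultdict

N_CHROMATIC = 320  # chromatic samples; the 10 achromatic samples are handled separately


def chromatic_feature_vectors(chromatic_names, achromatic_names):
    """Build one 320-element binary vector per chromatic term for one informant.

    chromatic_names: the informant's 320 terms, one per chromatic sample
                     (hypothetical layout, in a fixed sample order).
    achromatic_names: the terms the same informant gave to the 10 achromatic samples.
    """
    if len(chromatic_names) != N_CHROMATIC:
        raise ValueError("expected one term per chromatic sample")
    achromatic_terms = set(achromatic_names)
    vectors = defaultdict(lambda: [0] * N_CHROMATIC)
    for index, term in enumerate(chromatic_names):
        # A term that also names an achromatic sample is not a chromatic term.
        if term in achromatic_terms:
            continue
        vectors[term][index] = 1
    return dict(vectors)


# Toy informant with made-up responses (not data from the study).
chromatic = ["aka"] * 40 + ["ao"] * 60 + ["mizu"] * 20 + ["shiro"] * 5 + ["midori"] * 195
achromatic = ["shiro"] * 3 + ["hai"] * 4 + ["kuro"] * 3
vectors = chromatic_feature_vectors(chromatic, achromatic)
print(sorted(vectors))
```

In this toy case the informant contributes four feature vectors; shiro is excluded because it was also applied to achromatic samples. Pooling such vectors over all informants would give the input to the cluster analysis.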

This first k-means cluster analysis was designed to control for synonymy and homonymy in estimating the number of statistically significant named chromatic color categories in the Japanese language, which we designate k_L,opt. Cluster analysis classifies responses solely on the basis of how color terms are deployed across the 320 WCS chromatic colors, as embodied in the patterns of color-term deployment encoded in the binary feature vectors, without regard for the actual terms used by the subject. In American English (L&B), for example, k-means analysis showed that cyan and turquoise were synonymous with teal. The same analysis also revealed that tan has two meanings in American English; some subjects used it to name greenish-brown colors, which k-means analysis assigned to the olive English color category. Other subjects used tan to name light, pale pinkish-orange colors, and these feature vectors were assigned to the beige category. As we show later, the Japanese color lexicon also contains chromatic synonyms and homonyms.
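A small Python sketch shows how clustering on deployment patterns alone separates homonyms and merges synonyms. The study used MATLAB's kmeans function; the plain Lloyd's-algorithm implementation below, its deterministic farthest-point seeding, the 12-sample palette, and the response patterns are all illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Seed the next center at the point farthest from all current centers.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# (informant, term, pattern over a toy 12-sample palette); made-up data.
responses = [
    ("s1", "teal",      [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
    ("s2", "turquoise", [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
    ("s3", "tan",       [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]),
    ("s4", "olive",     [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]),
    ("s5", "tan",       [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]),
    ("s6", "beige",     [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]),
]
# The term labels are never passed to the clustering, only the patterns.
labels = kmeans([pattern for _, _, pattern in responses], k=3)
for (who, term, _), lab in zip(responses, labels):
    print(who, term, "-> cluster", lab)
```

Here teal and turquoise land in one cluster, while the two uses of tan land in different clusters, alongside olive and beige respectively.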

By design, the k-means algorithm will produce a cluster solution for any predetermined number of clusters k from 1 to the total number N of feature vectors being sorted. Thus, k-means cannot estimate k_L,opt without additional analysis. For this purpose, we relied on the gap statistic of Tibshirani, Walther, and Hastie (2001). First we performed k-means analyses for values of k from 1 to 25. Then, following the computational framework of Tibshirani et al., we performed gap-statistic analysis on these 25 separate cluster results by comparing, for each value of k, the tightness of clustering of the data to the tightness obtained by k-means clustering (using the same value of k) of reference null distributions derived from the data, as described later. By design, the expected value of k_L,opt for a reference distribution is 1. Thus, as the value of k increases from 1 to k_L,opt, the tightness of clustering of the data is expected to improve relative to that obtained from k-means clustering of the reference null distributions. Beyond k_L,opt, increasing k should not lead to any further improvement in the relative tightness of clustering. We express this result with the gap statistic G(k) (see L&B, equation 2): G(k) ≥ 0.0, 2 ≤ k ≤ k_L,opt. A step-by-step computational framework for gap-statistic analysis is given by Tibshirani et al. (2001, pp. 414–415). See L&B (pp. 11–14) for additional details regarding our particular implementation of k-means/gap-statistic analysis.
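For orientation, the following Python sketch implements the selection rule as originally stated by Tibshirani et al. (2001): Gap(k) is the mean log dispersion of the reference sets minus the log dispersion of the data, and the estimate is the smallest k with Gap(k) ≥ Gap(k+1) − s_{k+1}. It is not the G(k) criterion of L&B's equation 2, which is not reproduced here. The function names and all dispersion values are hypothetical.

```python
import math


def gap_curve(data_dispersion, ref_dispersions):
    """Return (gaps, errors) following Tibshirani et al. (2001).

    data_dispersion[i]: within-cluster dispersion W_k of the data for k = i + 1.
    ref_dispersions[i]: W_k values from B reference data sets for k = i + 1.
    """
    gaps, errors = [], []
    for w_data, w_refs in zip(data_dispersion, ref_dispersions):
        logs = [math.log(w) for w in w_refs]
        mean_log = sum(logs) / len(logs)
        sd = math.sqrt(sum((x - mean_log) ** 2 for x in logs) / len(logs))
        gaps.append(mean_log - math.log(w_data))
        # Simulation-error correction: s_k = sd_k * sqrt(1 + 1/B).
        errors.append(sd * math.sqrt(1 + 1 / len(logs)))
    return gaps, errors


def smallest_adequate_k(gaps, errors):
    """Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}; None if no k qualifies."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errors[i + 1]:
            return i + 1
    return None


# Made-up dispersions for k = 1..5: the data tighten sharply up to k = 3.
w_data = [100.0, 40.0, 10.0, 9.0, 8.5]
w_refs = [[98.0, 102.0], [78.0, 82.0], [63.0, 67.0], [54.0, 56.0], [46.0, 48.0]]
gaps, errors = gap_curve(w_data, w_refs)
best_k = smallest_adequate_k(gaps, errors)
print(best_k)
```

With these made-up numbers the gap stops growing after k = 3, so the rule returns 3. In the study the dispersions would come from k-means runs on the data and on the reference null distributions for k from 1 to 25.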

A somewhat subjective aspect of our methodology involved the algorithm for creating suitable reference null distributions for the gap-statistic analysis. For the present study, we adopted the algorithm used by Lindsey and Brown (2006, 2009, 2014), which is similar to one used by Kay and Regier (2003) in their analysis of color-naming centroids obtained from the data of WCS informants. To create a reference null distribution, each informant's raw chromatic color terms were first arranged in a 40 × 8 matrix according to the informant's responses to the 40 Munsell hues × 8 Munsell lightnesses of the chromatic samples used in our study. We then circularly shifted the elements of each informant's matrix by a random number of columns on the “cylindrical” surface of the WCS color space (shifting in the hue dimension) and then randomly reflected the rows of the resulting matrix (flipped the matrix vertically, corresponding to the lightness dimension) either zero times or one time. The resulting matrix was then decomposed into the appropriate feature vectors, as outlined earlier, based on the new mapping of color terms onto the WCS colors. In this way, our reference null distributions preserved much of the basic structure of patterns of Japanese color-term deployment (for example, the sizes and shapes of the patterns) while randomizing their locations within the 2-D hue/lightness coordinate frame of the WCS color chart. The resulting reference distributions therefore have an expected number of clusters of 1.
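The shift-and-reflect randomization can be sketched in Python as follows. This is an illustration rather than the authors' code: the grid is assumed to be stored as 8 rows (lightness) by 40 columns (hue), and the function names and toy terms are hypothetical.

```python
import random

N_HUES, N_LIGHTNESS = 40, 8


def shift_and_flip(grid, shift, flip):
    """Circularly shift columns (hue) by `shift`, then optionally reverse rows (lightness).

    grid: 8 rows (lightness) x 40 columns (hue) of one informant's color terms.
    """
    shift %= N_HUES
    shifted = [row[-shift:] + row[:-shift] if shift else list(row) for row in grid]
    return shifted[::-1] if flip else shifted


def random_null_grid(grid, rng):
    """Draw one randomized grid: random hue shift, vertical flip with probability 1/2."""
    return shift_and_flip(grid, rng.randrange(N_HUES), rng.random() < 0.5)


# Toy informant grid with made-up terms: a new term every 10 hue columns per row.
grid = [[f"t{r}_{c // 10}" for c in range(N_HUES)] for r in range(N_LIGHTNESS)]
null = shift_and_flip(grid, 3, True)       # fixed shift and flip, for inspection
sampled = random_null_grid(grid, random.Random(0))
```

Because the operation only relocates cells, every term keeps the same number of samples and the same pattern shape (up to reflection); only its position on the chart changes. The randomized grid would then be decomposed into binary feature vectors in the same way as the real data.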

L&B noted some variation in the solutions produced from run to run in their k-means/gap-statistic analysis of the American English lexicon. Therefore, following their approach, we performed 1,000 different k-means/gap-statistic analyses, as described earlier (see Figure 4). The results of this procedure were compiled into a histogram of 1,000 resulting estimates of k_L,opt. Our conclusions regarding the size and structure of the Japanese color lexicon were based on the modal value of this histogram (Figure 4, inset). As we show later, the first step in our analysis of Japanese color naming revealed 16 distinct clusters of chromatic color-term deployment.
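Taking the mode over repeated runs is straightforward; a minimal Python sketch follows. The estimates are made up, and breaking ties in favor of the smaller k is an assumption, since the text does not say how ties would be handled.

```python
from collections import Counter


def modal_estimate(estimates):
    """Return (mode, histogram) for a list of per-run estimates of k_L,opt.

    Ties are broken in favor of the smaller k (an assumption for illustration).
    """
    histogram = Counter(estimates)
    top = max(histogram.values())
    mode = min(k for k, count in histogram.items() if count == top)
    return mode, histogram


# Made-up estimates from ten hypothetical runs (the study used 1,000).
runs = [16, 15, 16, 17, 16, 15, 16, 17, 15, 16]
mode, histogram = modal_estimate(runs)
print(mode, dict(histogram))
```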

Figure 1
