Stereotypes evoked by occupational labels
The term stereotype refers to a generalized and simplified belief about a group of people [1], which forms through a socially embedded cognitive process of associating attributes with the group [2]. It is widely accepted that stereotypes may be formed from direct observations (i.e., the data-driven model of stereotypes, e.g., [3]), from expectations held of a group (the theory-driven model, e.g., [4]), or from a combination of the two [5]. Stereotypes extend to physical appearance, interests, occupations, or any similar characteristics held by a group of people [6,7,8,9]. As the contextual background of social perception, stereotypes cause information about group members to be assimilated towards the stereotype; that is, individuals are seen as more similar to the other members of their group [10]. Categorization is an inherent part of all perception [11], and it occurs even after extremely brief exposure, both in the visual [12] and the auditory modality [13]. In the social domain, people's observable features evoke the process of instant categorization. Though categorization and the activation of stereotypes are distinct processes [14], stereotypes are activated automatically in most real-life and experimental settings as soon as someone is exposed to another individual [13,14,15,16,17,18,19]. A few attributes typical of a group are sufficient to infer group membership and behavioral characteristics [20]. For instance, clothing is used to infer others' sexual preferences [21, 22] and to categorize them accordingly. Once someone has been categorized into a group, for example by gender, this affects how their behavior is judged: people observing a man performing an act see him as more aggressive than a woman doing the same thing [20]. In addition to gender, age, ethnicity, beauty, and other stereotypes, occupational labels also activate automatic inference processes and elicit the overall representation of a particular group [23].
While the content of stereotypes may differ between cultures, several principles of stereotyping are culturally universal [24, 25]. These include the minimum requirement for being able to talk about stereotyping at all, namely that all societies categorize and organize their members into subgroups [26]. The most basic form of this is classifying people according to their age and gender. In societies in which resources are distributed unequally, more complex social constructs may divide the cultural groups further. This results in the divergence of groups based on financial resources, knowledge, and social relations in basically all Western and non-Western societies beyond the level of mere subsistence [25]. In countries with similarly structured societies, the segmentation into subgroups has a lot in common. In a study conducted in the USA and Germany, participants were asked to form groups of listed professions [27]. A striking convergence was found between the stereotype dimensions underlying the categorization in the two countries. Participants predominantly discriminated occupations based on agency and progressiveness, and to some extent on sociability as well. Further research discovered that employees were not only identified by their job title, but were also assigned assumed personality traits based on their professions. It was also noticed that positive or negative perceptions of occupations in the same group were “transferred” to the rest of the group, so that the groups also received a kind of shared rating [27]. The conclusion is the same as in most previous studies on stereotypes [21,22,23]: if we do not have enough information about a person as an individual, inference will be based on categorical attributes.
Semantic processing in person perception
One of the most systematic studies to date aiming to map how the semantic information delivered by occupational labels is represented was conducted by Imhoff et al. [27]. As noted in the previous section, their research focused on the dimensions that could serve as anchors when evaluating people. The number of dimensions and variations of spontaneous categorization is potentially countless; the study therefore focused on filtering out the most typical and practical ones. To obtain truly spontaneous stereotypes that best reflect what people usually think of first, participants were free to come up with dimensions based on their own logic, and to classify and categorize the target labels according to these dimensions. The researchers found that the most frequently chosen dimension was agency. For instance, surgeons, software developers, and aerospace engineers were placed at the top of this dimension; cashiers, telemarketers, and parking attendants at the bottom. This suggests that agency refers to being powerful, assertive, and high in status. The second most frequent defining dimension was progressiveness. On this dimension, paramedics, firefighters, and police officers were labelled as conservative, conventional, and preventive (rule-based) types, while musicians, athletes, and designers were more likely to be labelled as liberal, alternative, and promotional types (characterized by innovation, risk-taking, brainstorming, etc.) [27, 28].
One obvious result of the study above is that different occupations evoke similar stereotypes quite consistently. Certain social expectations may come with the resulting dimensions, which in turn can also be associated with evaluative judgements. Agency and progressiveness, besides providing superordinate categories for the occupations, carry an affective meaning as well. Of course, people differ idiosyncratically in how they relate to progressiveness; the term can have a positive meaning for some and a negative one for others. Agency, that is, having the knowledge and capability to deal with demanding situations, is in contrast universally evaluated as a desirable feature (see [33]). However, beyond the fact that some abstract categorization happens, we do not know much about the basic cognitive processes underlying these attributions.
Much of what we know about the cognitive processes of person perception comes from studies on the cognitive background of face perception. Bruce and Young's [29] cognitive model, later improved by Breen et al. [30, 31], shows that a key element of intact, conscious recognition of persons is that faces should also recall semantic information. These include names, occupations, places of residence, and so on. The so-called person identification node integrates visual information from the face with knowledge stored in memory. At the same time, characteristics that are not very specific to the individual—but are nevertheless important attributes, such as the aforementioned dimensions of agency, progressiveness, and sociability—might be parts of the features that are also involved in person recognition, just like they are elements of semantic information processing. Although the latter has not been investigated directly, there have been several concordant studies that show that facial appearance is used for spontaneous inference about trustworthiness and competence within a short time [32,33,34,35].
Affective processing during face encoding—a model for understanding abstract representations
Just as stereotypes are based on observed behaviour and typical physical appearance, stereotypical descriptions activate representations of both expected behaviour and physical appearance. Therefore, we need to examine at which level the evaluation of individuals happens in the first place: is it appearance, behaviour, or semantic knowledge? Each of these evokes affect directly, and the resulting emotional content in turn contributes to the conscious recognition of persons, as implied by the most influential face perception models [30, 36, 37]. Face recognition and face categorization have been studied extensively; theories of the underlying cognitive processes have therefore received a significant amount of empirical support. Face categorization is a special type of categorization which, in real-life settings, is usually involved in forming stereotypes and making social decisions. Hence, person perception cannot be understood in depth without sufficient knowledge about face perception itself. We will therefore use face perception models as the starting point from which we intend to construct models that work on a more general representational level. To do this, it is necessary to review basic insights from that field and to highlight those cognitive and neural processes which might explain features of categorization in a more general sense, as well as those which might be paralleled with the representation of semantic and affective contents.
The concept of affective space is anchored to face recognition models [7, 8], and is also present in research on person perception [38, 39]. Affective space is considered a two-dimensional categorization system, in which faces are assigned a place based on the emotions they evoke in the perceiver [40]. The dimensions of affective space are arousal (intensity) and valence (pleasantness). These dimensions can be thought of as the X and Y axes of a coordinate system in which each face has a valence-arousal coordinate. In the dimensional approach, the intensity of the emotions we experience ranges, along the vertical axis, from low activation to high alertness (i.e., from calm to agitated, or bored to tense), while along the horizontal axis valence ranges from negative to positive (i.e., unpleasant-pleasant, sad-satisfied, upset-joyful). Valence refers to a kind of evaluation, or value attribution, that is subjectively induced by the appearance of emotions, while arousal refers to the level of activation or energization associated with emotions and their physiological characteristics [41, 42]. Lang [41] explains emotional valence and arousal in terms of the functioning of specific motivational systems of the brain. According to the dimensional approach, the neural functioning of these systems is mainly determined by valence, i.e., the emotional evaluation itself [42]. Valence is associated with the functioning of the two types of motivational systems in the brain and therefore plays a primary role [41]. According to Bradley and colleagues [40], the affective space can be equated with the approach-avoidance system, i.e., the appetitive and the aversive systems.
The appetitive system is associated with pleasant states (exploratory behavior, eating, sexual behavior), whose intensity can vary from relaxed to aroused, while the aversive system deals with unpleasant consequences (avoidance, defensive behavior) and likewise shows a large variance along the arousal dimension. Since arousal communicates only differences in the activation of the appetitive system, the aversive system, or both, it has only a secondary, complementary role in the dimensional ordering of emotions [41].
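The coordinate-system view described above can be made concrete with a minimal sketch. The numeric scales (valence from -1 to +1, arousal from 0 to 1), the class name, and the example values are illustrative assumptions and are not taken from the cited studies; the sketch only shows that valence determines which motivational system is engaged, while arousal describes how strongly it is activated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectivePoint:
    """A position in a two-dimensional affective space (hypothetical scales)."""
    valence: float  # horizontal axis: -1 (unpleasant) to +1 (pleasant), assumed range
    arousal: float  # vertical axis: 0 (calm) to 1 (highly activated), assumed range

    def motivational_system(self) -> str:
        """Valence alone selects the system; arousal does not enter the decision."""
        if self.valence > 0:
            return "appetitive"
        if self.valence < 0:
            return "aversive"
        return "neutral"


# Illustrative points, not measured data.
calm_pleasant = AffectivePoint(valence=0.6, arousal=0.2)
tense_unpleasant = AffectivePoint(valence=-0.7, arousal=0.9)
aroused_pleasant = AffectivePoint(valence=0.6, arousal=0.9)
```

In this sketch, `calm_pleasant` and `aroused_pleasant` fall under the same system despite very different arousal values, which mirrors the secondary, complementary role attributed to arousal.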
The affective content elicited by unfamiliar faces depends primarily on the structural features of the face, its attractiveness, and how it is categorized by the observer. A divergence between affective processing and other processes involved in social cognition can be observed here as well: studies by Harris and Fiske [43, 44] revealed that faces implying low competence and low warmth elicit responses in the amygdala and insula, that is, in regions playing a role in the processing of negative emotions such as disgust, but no activation was measured in the medial prefrontal cortex, an area essential in social cognition. In person perception research, prejudices are typical cases of the expression of emotions. For someone who is prejudiced against a race, the perception of characteristic physical traits of that race is accompanied by the affective content of the prejudice [45]. The emotions evoked by the faces of personal acquaintances and famous people, in contrast, depend on the specific experiences associated with that individual [46,47,48,49]. According to cognitive models of face perception, affective content plays an essential role in the conscious recognition of persons, in addition to semantic content. If the affective processing pathway is impaired, severe face recognition deficits follow. For instance, patients with Capgras syndrome are able to recognize people whom they have met before; however, the impairment of the so-called covert pathway of face recognition, which is responsible for affective contents, prevents them from accepting that the observed faces belong to the persons who are personally familiar to them [30, 36]. In contrast, patients suffering from prosopagnosia (i.e., an inability to consciously recognize familiar faces) show elevated physiological arousal (including increased heart rate and galvanic skin response) when they are exposed to faces of close acquaintances.
In these individuals the overt pathway of face recognition is impaired, whereas the covert pathway is intact [30, 31, 50]. In summary, cognitive models of face recognition suggest that a sufficient affective charge is necessary for the activation of faces stored in the memory. Similarly, when approaching the process of face recognition from a representational view, we can say that stimulation of the affective space activates the associated region of the face space. Such affective content can be conveyed by labels that are known to activate stereotypes, that is, which evoke our expectations of behavior and appearance.
Interconnectedness of representational spaces
The process of face recognition combines semantic and affective information to allow the recall of familiar faces. Above, we briefly summarized how this happens on the cognitive level. However, relatively little research has explored the interconnectedness of representational spaces, or tried to integrate the available pieces of information into a cognitive model. One notable attempt, called the Trait Inference Mapping (TIM) model [51], aimed to combine the concepts of face space with trait space.
However, this model has a rather monolithic view of trait space, and does not differentiate between the semantic and the affective contents that constitute the representation of traits. Furthermore, it focuses only on the physical aspect of people, namely faces. Nevertheless, the presentation of a face leads very rapidly to categorization, which in turn activates stereotypes related to that category. Hence, TIM is a good example of how different representational spaces might interact. Building on the approach of TIM, we aim to extend and generalize this model by suggesting that any aspect of a person (including facial appearance, group membership, typical behavior, occupation, or any other characteristic) can be treated as an individual representational space, and thus be the subject of analysis. Similarly, trait space might be divided into semantic and affective spaces, and the latter, if necessary, broken down into indices of valence and arousal. This approach is similar to that used earlier by Stephan and Stephan [52], who described stereotype activation as an interplay between cognition and affect (see also [53]). The terms they used, however, differ from those in the current manuscript. Cognitive processing in their wording is similar to what we call semantic representation; the activation of affective states is similar to valence in our approach. Despite these differences, the inferences which can be drawn from their model parallel our expectations: when someone is exposed to a label describing a group, an emotional state (which involves certain levels of pleasantness and activation) is elicited. It is important to note that differentiating between semantic information on the one hand, and valence and arousal on the other, serves the aims of a theoretical investigation; it is an abstract, purely cognitive model, and might not hold up when its neuroanatomical implications are tested.
Aims and hypotheses
Person perception can be understood as a process of integrating different representational spaces. In this process, the image of each person is composed of elements that sometimes complement each other and sometimes mutually determine each other. Examples of such elements are position in face space, i.e., physical appearance (including facial symmetry, masculinity/femininity, skin texture, etc.), semantically interpretable attributes (group category, occupation, etc.), and position in affective space, determined by valence and arousal. These components may be interpreted at different levels of neural processing, but they can nevertheless be incorporated into a common cognitive model. In our first attempt to build a usable model of person perception, we focus on the processing of semantic and affective information. The fundamental question of our research was therefore how large a role the semantic content of labels, as well as the emotional responses they elicit, play in the processing of group-typical labels.
To this end, we designed an experiment where participants had to arrange occupational labels based on their semantic and their affective contents, respectively. Our analysis plan was to run cluster analyses on the arrangements to see whether the labels intuitively grouped together in the first task would show the same pattern in the second, formally instructed task as well. This analysis would reveal some of the connections between the two representational spaces. We expected that the representational space of affective contents would show a significant overlap with the space of semantic representations.
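The logic of this analysis plan can be illustrated with a minimal sketch. It assumes that each arrangement is recorded as a pair of coordinates per label (a hypothetical data format), clusters each arrangement with a simple average-linkage agglomerative procedure, and compares the two resulting partitions with the Rand index. The labels, coordinates, number of clusters, linkage rule, and agreement index are all illustrative assumptions, not the data or the exact procedure of the present study.

```python
from itertools import combinations
from math import dist


def agglomerative(points, k):
    """Average-linkage agglomerative clustering; returns {label: cluster_id}."""
    clusters = [[name] for name in points]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Mean pairwise distance between the members of the two clusters.
            total = sum(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
            mean = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or mean < best[0]:
                best = (mean, i, j)
        _, i, j = best
        clusters[i] += clusters[j]  # merge the closest pair (i < j always holds)
        del clusters[j]
    return {name: cid for cid, members in enumerate(clusters) for name in members}


def rand_index(part_a, part_b):
    """Share of label pairs treated alike (together vs. apart) by both partitions."""
    pairs = list(combinations(sorted(part_a), 2))
    agree = sum((part_a[x] == part_a[y]) == (part_b[x] == part_b[y])
                for x, y in pairs)
    return agree / len(pairs)


# Hypothetical arrangements of four occupational labels (made-up coordinates).
semantic = {"surgeon": (0.9, 0.9), "engineer": (0.8, 0.85),
            "cashier": (0.1, 0.2), "telemarketer": (0.2, 0.1)}
affective = {"surgeon": (0.7, 0.6), "engineer": (0.6, 0.7),
             "cashier": (0.3, 0.2), "telemarketer": (0.1, 0.3)}

overlap = rand_index(agglomerative(semantic, 2), agglomerative(affective, 2))
print(f"Rand index between semantic and affective clusterings: {overlap:.2f}")
```

A Rand index close to 1 would indicate that labels grouped together in one task also tend to be grouped together in the other; in practice, a chance-corrected index and a principled choice of the number of clusters would be preferable to the fixed values assumed here.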