Mind, Body, Cultural Evolution Lab


Tag Archives: Theory

Philosophy of measurement, functional equivalence, DSM V… or how did I get here?

Here is a very raw and unfinished “trying to wrap my head around some rather confusing issues” post. I have been thinking about levels of equivalence or invariance in cross-cultural measurement. I have been a wee bit unhappy with a couple of conceptual problems in the framework, but particularly the most general or abstract level of ‘functional equivalence’ has intrigued me for a while. Traditionally, it is more of a philosophical or theoretical statement of the similarity of functions of a psychological construct in different cultural groups. In other words, a particular behaviour serves the same functions in two or more cultural contexts.

I have been following some of the discussions on IOnet and the posts by Paul Barrett as well as the more biologically oriented personality literature. Following a few of these leads, I recently started reading some more conceptual and philosophical papers on the philosophy of measurement in psychology. More specifically, I just finished reading Joel Michell’s ‘Quantitative science and the definition of measurement in psychology’ and Michael D. Maraun’s ‘Measurement as a Normative Practice’. These papers are superbly well written (as far as you can say that about these kinds of papers) and express quite a few of my growing concerns about psychological research in very clear terms. I started off wondering about functional equivalence, but have got much bigger issues to chew on now.

Michell’s main logical argument is as follows (from his very concise reply to a number of commentaries, p. 401):

Premise 1. All measurement is of quantitative attributes.
Premise 2. Quantitative attributes are distinguished from non-quantitative attributes by the possession of additive structure.
Premise 3. The issue of whether or not any attribute possesses additive structure is an empirical one.
Conclusion 1. The issue of whether or not any attribute is measurable is an empirical one.
Premise 4. With respect to any empirical hypothesis, the scientific task is to test it relative to the evidence.
Premise 5. Quantitative psychologists have hypothesized that some psychological attributes are measurable.
Final thesis. The scientific task for quantitative psychologists is to test the hypothesis that their hypothesized attributes are measurable (i.e. that they possess additive structure).

The major task for psychology is to actually prove that anything that we do has a quantitative structure. Much of his review takes to task the legacy of Fechner and especially Stevens (for those of you who ever suffered through some advanced methods classes… these names should be painfully familiar). It was an eye opener to see the larger context and the re-interpretation of stuff that I just took for granted as a student and never really questioned later on in my professional life. Fechner’s legacy leading to a so-called quantitative imperative (e.g., Spearman, Cattell, Thorndike) was challenged in the early to mid-parts of the last century (the so-called Ferguson Committee), but Stevens became the most successful defender of this empiricist tradition. He argued in a representational theory of measurement that measurement is the numerical representation of empirical relations. There is a ‘kind of isomorphism between (1) empirical relations among objects and events and (2) the properties of…’ numerical systems (Stevens, 1951, p. 1). From this starting point he developed his theory of the four possible types of measurement scales (nominal, ordinal, interval and ratio)’ (Michell, p. 370). This is the foundation of any scale development in psychology. In a second argument beautifully laid out by Michell, it then becomes clear that these numerical representations, due to their assumed isomorphic relations, then both define the relations represented and represent them. Given this operationism, ‘any rule for assigning numerals to objects or events could be taken as providing a numerical representation of at least the equivalence relation operationally defined by the rule itself.’ (Michell, p. 371).

And this loop is where we are stuck. We take a few items or questions, administer them to a bunch of people, factor analyze them to get a simple structure and voila… we have measured depression, anxiety, dominance, identity… you name it. Or take implicit measures… you take a number of stimuli with no inherent coherent meaning and present them to individuals to measure their accuracy or reaction speed or whatever you want. Take the score and you have some measure of implicit bias, cognitive interference, etc. There is no relation between the empirical reality and the numerical representation as scores anymore. The question of whether the phenomenon of interest can be quantified has disappeared.

How does the DSM V fit in here? Well, it could be seen as just the latest installment of the same confusion. We don’t know what exactly we are measuring (see for example this article on grief as a case in point).

The issue is that we need to test whether psychological constructs can actually be quantified. As simple or complex as that. As much as I agree, I can’t stop scratching my head and wondering how the heck we are going to do that. How would you be able to examine whether any psychological construct (which is basically just an idea in our beautiful minds that we try to use and build some kind of professional convention around) is actually quantifiable or not? The responses by a number of eminent psychometricians to this challenge suggested that nobody was able to come up with an example to show that this has worked in a wider context within mainstream psychology.

Enter the second paper. Approaching the problem using Wittgenstein’s philosophy of measurement as normative practice (comparing it to the logical structure of language), Maraun argues that measurement needs to be rule-based or normative. You need to start with a definition that then leads to a specific set of rules or norms of how to measure this particular phenomenon just defined. The definition and the set of rules are the most basic form of expression. There is nothing simpler or more basic than this. Once these norms are established, any other person should be able to arrive at a similar result, which, even if based on a different metric, should still be convertible (e.g., from meters to feet). In psychology, in contrast, we have no rules. We have a test or an experiment that is being conducted and the results are examined against another set of empirical observations to claim that the results are valid. According to the practice of measurement in physics, empirically based arguments are not relevant for claiming that something has been measured. Measuring a number of items that factor together and then correlating the result with some other instrument similarly derived does not mean that anything meaningful has been measured. Observing some kind of empirical pattern in an experiment does not constitute measurement if it is then validated or compared to a different set of empirical observations. The issue is that the concept is not sufficiently precisely defined to lead to a set of rules that govern its measurement.

There are a number of other points in that paper around validity, nomological networks, covariance structure and the like. Again, I keep scratching my head. These guys have got a point… but how to get out of it? Maraun is very pessimistic. He argues:

Simply put, measurement requires a formalization which does not seem well suited to what Wittgenstein calls the ‘messy’ grammars of psychological concepts, grammars that evolved in an organic fashion through the ‘grafting of language onto natural (“animal”) behaviour’ (Baker & Hacker, 1982). One aspect of this mismatch arises from the flexibility in the grounds of instantiation of many psychological concepts, the property that Baker and Hacker (1982) call an open-circumstance relativity (see also Gergen, Hepburn, & Comer Fisher, 1986, for a similar point). Take, for example, the concept dominance. Given the appropriate background conditions, practically any ‘raw’ behaviour could instantiate the concept. Hence, Joe’s standing with his back to Sue could, in certain instances, be correctly conceptualized as a dominant action. On the other hand, Bob’s ordering of someone to get off the phone is not a dominant action if closer scrutiny reveals the motivation for his behaviour to be a medical emergency which necessitated an immediate call for an ambulance. The possibility for the broadening of background conditions to defeat the application of a psychological concept is known as the defeasibility of criteria (Baker & Hacker, 1982). Together, open-circumstance relativity and the defeasibility of criteria suggest that psychological concepts are simply not organized around finite sets of behaviours which jointly provide necessary and sufficient conditions for their instantiation (Baker & Hacker, 1982). Yet, this is precisely the kind of formalization required if a concept is to play a role in measurement. (p. 457-458).

Maybe what we are studying is just the social construction of meanings of psychological concepts as expressed in the heads of individuals? Is this a feasible reconciliation? From a researcher perspective it might be a worthwhile endeavor (think of discourse analysts embracing factor analysis… the thought is actually quite amusing). However, this approach leaves our search for a) latent variables and b) measurement invariance completely meaningless.

The reading continues. Some random thoughts at 1am while I am writing these notes:
a) The search for quantitative latent constructs in psychology probably should (?) or could (?) start from basic biological principles. In essence, we assume that there is something ‘latent’ out there if we use EFA or CFA or any of the typical covariance structure tests. If there are biological mechanisms that lead to certain psychological phenomena, we can study the biological principles and their interaction with the social environment that lead to psychological realities. Then we could get around the quantification problem. Problem… what biological principles and at what level of specificity?
b) The use of covariance analyses provides simple structures of language concerning folk concepts. This may be useful and meaningful for understanding how people in a specific context interpret items or questions. It is probably more of a sociological analysis of meaning conventions than a psychological analysis. This could be useful or interesting for research purposes, but it is not quite how we commonly understand or interpret the results when we are using these kinds of techniques.

Or am I missing something? How can this measurement paradox be tackled?

Indigenous or not indigenous…. that is the question

Today I listened to a really interesting talk by Peter Smith from Sussex Uni. He was presenting his work on social influence, including some of the new stuff on different indigenous social influence strategies such as the Chinese guanxi, Brazilian jeitinho, Middle Eastern wasta and Russian svyazi. These are all local behaviours that individuals may adopt to solve problems in their environment, typically by relying on social relationships or their power (e.g., for a great example from the news this morning – the son of the Iraqi transportation minister forcing a plane to return to Beirut). Peter and his colleagues asked students and managers to come up with good examples of each local cultural strategy in their local culture. They then took the most representative scenarios from each culture, removed any identifying content and gave them to managers in other cultures. What they found was that the supposedly indigenous influence strategies were generally seen as typical even in other cultures. In other words, British ‘pulling strings’ was often as likely to be seen as applicable and typical in China and Saudi Arabia as in the original British context.
This clearly challenges notions that indigenous influence strategies are unique and distinct to a specific local context. Of course, he immediately got challenged by some people in the audience defending the indigenous approach, claiming that these wimpy scenarios miss the rich context and the social relationships that go with each style.
I think that there are subtle differences in how these influence strategies work and are employed (see for example our qualitative ethnographic work on Brazilian jeitinho here and a set of empirical studies where we also make some theoretical claims about jeitinho vs guanxi here). Yet, there are three major issues that I think the proponents of the indigenous approach are missing.

First, there are limited behavioural options for humans. We all live in social settings with a core family, extended family and a relatively stable set of limited contacts in an extended social network. All these networks are more or less hierarchically structured. We all need to negotiate these networks and there is only a limited set of behavioural strategies for any of us (e.g., ingratiation, calling in favors, returning favors, making compliments, breaking some rules, paying a bribe, giving some gifts… you name it). See work by David Ralston. We cannot just come up with something completely different. It is all there. We are humans. Therefore, people in most contexts will be able to recognize and distinguish particular types of behaviours. Hence, people can call a spade a spade… even if it looks a bit funny shaped.

Second, the functions of all these behaviours are to solve problems. It is the functionality of these behaviours that matters: even if they are not socially approved or are even considered illegal (think of corruption or nepotism), they still get things done. This is why they are so widespread and so similar in form. We made this argument and showed some data supporting this claim here. Peter Smith and his colleagues also found similar results in their cross-national study. Think behaviours – think functions. And think power corrupts… probably as universal a function of human behaviour as there can be.

Third, many of these behaviours are locally embellished, discussed, criticized, analyzed, debated. By doing this, these behavioural strategies take on a life of their own in the minds of concerned members of a community. Go to Brazil and talk to them about jeitinho – you will be listening to complaints for hours – hopefully while having some good cool caipirinhas. Go to Lebanon and ask somebody about wasta – and better have a good shisha or coffee next to you, because you are not going to move for a while. These behaviours are often recognized as problematic, but they are so damn useful and this is why they continue. At the same time, discussing and gossiping about them becomes a reinforcer of the social norm and therefore serves as an identity marker. The behaviour is not just a behaviour anymore, but has taken on a cultural life of its own. Therefore, it has to be unique – you can’t say that another place also has something that really seems to be jeitinho… or wasta… or guanxi. It is what makes us who we are as people… So don’t you dare say that somebody else may have come up with something similar.

So how does my claim that there are subtle differences fit in with that? I think the first and second points are the answer to that. There is a limited number of behaviours and strategies that people can use to solve problems. The nature and type of problems will differ slightly by context. Therefore, some behaviours will be more common or be expressed with greater force or variety than others. Hence, there is a matrix of behaviours which is latently present in all contexts, but then is expressed to slightly different degrees. Some patterns of the behavioural matrix may be missing or be expressed very weakly in some places. Others may take a particular form due to the different social relations: compare the loose social relations in Brazil, which allow more flexibility in social norm bending, with the still relatively strong family networks in China that may be less flexible. So what differentiates the various styles is how the matrix is filled with specific behaviours in a specific context. Jeitinho may be a bit more norm breaking, wasta a bit more relying on social hierarchy, guanxi a bit more social relationship harmony focused. But the matrix is there. It is recognizable. It has blends of the same ingredients. It is this matrix that makes us human and helps us to interact with anyone in the world. It is what makes us human.

A Brazilian will recognize Chinese guanxi and know what it is all about. A Russian will painfully remember some personal experiences when hearing an example of wasta in Lebanon. We all can understand what happened in Beirut this morning – even though we may not want to do or can not do it ourselves (even though I have to admit it would be bloody awesome sometimes to force that damn train or bus to come back when I just missed it… Just saying… :).

Applied Cross-Cultural Psychology: Some ideas for a meaningful science

I just spent the last 72 hours in 3 different countries. Lots of random thoughts raced through my mind while spending time in small eateries, big airports and on roads wide and narrow. How can cross-cultural research contribute to the development and well-being of societies? What are the tools that psychologists interested in culture can use to inform politicians and political decision-making? How can we make cross-cultural research relevant to everyday actions and events, considering the massive challenges that humanity faces through globalization, climate change and increasing interdependencies at a global level?

I think there are three different paths that may address these broad questions of policy relevance and societal development. For lack of better words, I will call them culturally sensitive understanding, culturally sensitive change and culturally sensitive evaluation of change. In other words: a) an examination of processes that are of societal importance and relevance, b) development and application of culturally sensitive change programs and c) a culture-sensitive evaluation of existing intervention programs so that the needs of communities are better met. Engaging with bigger questions and practical problems entailed in these three approaches can help sharpen our basic research questions and theories as well as contribute to understanding and managing global issues.

Culturally sensitive understanding of societal level problems

The first option is a focus on a better understanding of psychological processes related to important societal outcomes. There are many debates about how society can be made more humane, healthy and prosperous. What are the psychological processes that are associated with these outcomes? Here, the strength of cross-cultural psychology is the quasi-experimental nature of culture. Societies differ along a number of important outcomes and potential antecedents, and cross-cultural psychologists can take this variability and study which variables are most likely implicated in the different outcomes across societies. An open but critical mind about potential contributing factors is important. Once certain variables have been identified as potentially important, more controlled experiments to test the causality may be conducted. Not all variables can be manipulated in experimental settings (just think of the difficulty of manipulating national histories or seasonal patterns). This option is probably closest to standard psychological research. The main difference is a closer alignment between scientific research topics and questions of practical and societal relevance.

My own focus has been more along the multi-country, sociological level of inquiry. One example is the work by Seini O’Connor. Corruption and political transparency have been on the minds of politicians, philosophers and political scientists for millennia. One of the major unaddressed questions though is what variables might be implicated in changes of corruption levels over time. There are many theories and ideas of what makes societies more or less transparent. Seini’s honours project addressed these ideas through an innovative longitudinal method and produced some pretty surprising findings (the actual study can be found here).

Implementing culturally sensitive change programs

Second, cross-cultural psychologists can engage in developing and running culturally sensitive interventions that address practical problems. Psychologists interested in culture have been relatively successful in developing and running intercultural training programs. At the same time, programs that focus on developing and changing behaviours of individuals and groups have largely been left to general psychologists or other disciplines (e.g., development workers, economists, sociologists, political scientists). Only a few programs have taken a culturally sensitive approach when trying to change behaviours (for a cool example, have a look at this project). There is much scope for innovative and important work to be done.

Evaluating interventions in culturally sensitive ways

Third, cross-cultural psychologists could get involved more in assessing existing change programs as they are applied and implemented in diverse cultures around the world. For example, micro-crediting – that is the provision of small loans to individuals or groups – has been used in many disadvantaged communities to fight poverty and contribute to economic growth. Yet, we know relatively little about the effectiveness of these initiatives, especially about how they fit in with the larger cultural norms, beliefs and practices. One of the interesting studies in this regard was reported in Science last year. Karlan and colleagues demonstrated that micro-crediting in the Philippines led to down-sizing of enterprises and higher stress among recipients, which is contrary to common expectations about the effectiveness of micro-crediting. This study was conducted by economists who have little interest in examining the cultural (or even psychological) processes. Cross-cultural psychologists could significantly contribute to such research and help in evaluating programs so that they better meet the needs of the communities.

A gentle intro to cross-cultural equivalence – or how can we measure across cultures?

Psychology is the study of human behaviour and mental processes through scientific methods. The claim of psychology is often to be universal, that is applicable to all of humanity. Using scientific methods, we psychologists rely on a systematic and objective process of proposing and testing hypotheses and making predictions about the state of human nature. Ever since the beginning of psychology as an academic discipline, the scientific quest to quantify natural occurrences to better understand and predict them in the future became one of the ultimate goals. Of course, this often requires extensive qualitative research, but ultimately the hope was and is that we can understand a behaviour or mental process so precisely that we can quantitatively measure it and also change it.

The application of such quantitative methods is now often taken for granted, even though the levels of quantification may vary. For example, we may want to select the most able person for a particular job, refer a child with learning problems to a specialist or we may wish to help a person with mental health problems to fully function in society again. Even though all these problems can be phrased in qualitative terms (a good person for the job, a child that has problems learning, a person who is not well), these are essentially quantitative problems because they always have some reference to implicit or explicit standards. A person might be BETTER qualified than another to take up a job or a person may have GREATER problems understanding concepts or material than 75% of the children of her age. Therefore, in many day-to-day situations we make implicit and intuitive quantitative statements.

If we want to make quantitative statements about a scientific concept, we run into one of the central problems in psychology. Namely, WHAT do we want to make a comparison about? Or in other words, how do we define a psychological construct so that we can measure it? A geographer, chemist or physicist is unlikely to face the problems that psychologists have… after all, we can easily measure distances (e.g., how far is Auckland from Wellington), we have ways of dating the age of a piece of rock or we can measure the energy of particles when we collide them at near the speed of light. Psychologists on the other hand are dealing with intangible concepts that are difficult to specify. Most of you are familiar with concepts such as intelligence, attitudes, personality traits, depression or identity. However, if we were to ask you to pinpoint any of these concepts in the real world, you would be unable to do so. Our psychological terminology refers to unobserved mental constructs that we create in our community of fellow psychologists to indicate a particular set of problems, or to describe a particular set of behaviours or mental representations. I would argue that underlying many of these psychological terms are assumptions about relative coherence, stability, generalizability and potentially even some general biological foundations that lead to the emergence of such a syndrome. Therefore, we don’t just invent these terms on a whim, but we think that there is something meaningful to them that we think is important enough to look into and tell other people about.

Therefore, the first issue in any psychological study, even though it may not seem obvious anymore, is to clearly and unambiguously define and specify what we want to study. What is our construct or process of interest? It is at this point that culture will throw the first curve ball at any psychologist attempting to address this question. How can we make sure that our definition or mental construct of our psychological term or process is actually valid or does have some meaning in another cultural context? How does our upbringing in a highly developed Western society influence how we think about psychological constructs? Can we assume that identity is a concept that is meaningful in a village in the lowland Amazon basin? Is our definition of depression applicable to refugees coming from Syria or Iraq? Is conscientiousness a useful term to screen out applicants for jobs in an international organization? Therefore, the first problem in any psychological study is to unambiguously define and describe the psychological process for all the populations that we are interested in. We could think of this as a mental bubble that we draw around some problem or process. Does this bubble ‘exist’ in all the different cultures that we want to include in our study? How can we find out whether this bubble is meaningful and has some value or relevance for all the local populations? We will discuss this as the question of functional equivalence.

If we are confident that there is some value to this mental bubble of ours (let’s say, depression, personality or identity) and that the terms are meaningful in two or more cultures, then we need to find good indicators for it. In psychological terms, this is called operationalization. How can we empirically say that one person has more of this latent quality that we just created with our mental bubble compared to another person? What would be a good indicator to tell us that one person is better for a job compared to another person or that one person is a better learner than another, who in turn may need some help? Here again, culture will throw lots of beautiful little challenges at us. We need to find indicators that are meaningful and relevant in each cultural context, but obviously we would still need to be able to compare the results across contexts. Therefore, we can’t have indicators that are relevant and meaningful in each context, but cannot be compared across cultures. We want to aim for some level of comparability. For example, is staying late at your desk a good indicator of being conscientious? Or could it be seen as being disorganized and incompetent? What if people are unfamiliar with office jobs? Is the number of times that you circled the temple this morning before going to work a better indicator of your conscientiousness? Is the ability to track animals over long distances and varied terrain a good indicator of concentration? Or should we give people lots of d’s and b’s and p’s and q’s and then ask them to count how many p’s and q’s were together in each line? Should we measure intelligence by asking people to name as many types of medicinal plants for diarrhoea as they can? Or give them complex questions about history and philosophy? This problem of identifying good measurement indicators will be called structural equivalence. Obviously, how we define and how we operationalize a construct are very much dependent on each other.
For this reason, some researchers lump the two terms together as construct equivalence. For reasons that we will discuss later, I prefer to keep them separate.

So, we now have a mental bubble and we have a number of indicators that give us some clue about the latent bubble. However, we don’t actually know how good each of these indicators is in representing that latent bubble. We need to find a way to show us how well each indicator works in each of our cultures. In other words, is the same indicator better in capturing a key aspect of our construct in one culture compared to another? For example, is going to parties and having lots of friends a good indicator of extraversion? Is having many wives a good indicator of social status in all cultures? Is staying late at work to finish a task a good indicator of high conscientiousness in all cultures? This problem is called metric equivalence. It is the question about the relative strength of the indicator-latent variable relationship. In technical terms, we are concerned with the equivalence of factor loadings or item slopes in classical test theory or the item discrimination in item response theory.
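For readers who like to see this written down, here is one common way of formalizing the point. It is only a notational sketch, and the symbols are my own shorthand rather than anything fixed by the argument above. Assume person i in cultural group g answers item j, and that the response depends linearly on the latent construct:

```latex
\[
x_{ijg} = \tau_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg}
\]
% Metric equivalence: each item's loading is the same in every group
\[
\lambda_{jg} = \lambda_{j} \quad \text{for all groups } g
\]
```

Here η is the person's position on the latent construct, λ is the factor loading (the strength of the indicator-latent variable relationship), τ is an item intercept and ε is a residual. Metric equivalence is the constraint in the second line: a one-unit difference on the latent construct translates into the same expected difference on the item in every culture.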

Finally, we may be convinced that our indicators work equally well in all contexts. Each questionnaire or test item is really giving us a good and reliable insight into the construct. But there may still be problems. Some items, even though they have the same relationship with the latent construct in all cultures, may still be a bit more difficult or easier in one context compared to another. If I asked you to name the capital of Benin, most of you would probably struggle to find the correct answer. Benin is a country that is quite far from our thoughts and most of us will never set foot in this place or may not have heard about it in the media. However, if I asked you about the capital city of one of your neighbouring countries, you would probably quite easily be able to name it. Therefore, asking about the capital of Benin would be easier for somebody living in Togo or Nigeria compared to somebody living in NZ or Denmark. This is the issue of full score or scalar equivalence. Technically, we would look at the invariance of item intercepts (in a multi-group CFA) or the differential item difficulty (in IRT).
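To make the consequence concrete, here is a minimal simulation sketch. It assumes a simple linear model (response = intercept + loading × latent score + noise); the function name, the parameter values and the two groups are all hypothetical and chosen only for illustration. Both groups have exactly the same latent mean and the same loading, but the item is "easier" in group B.

```python
import random
import statistics


def simulate_item(n, latent_mean, loading, intercept, noise_sd, rng):
    """Simulate n responses to one item under a linear factor model:
    response = intercept + loading * latent_score + noise."""
    responses = []
    for _ in range(n):
        latent = rng.gauss(latent_mean, 1.0)   # person's true standing
        noise = rng.gauss(0.0, noise_sd)       # item-specific error
        responses.append(intercept + loading * latent + noise)
    return responses


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Same latent mean and same loading in both groups (metric equivalence holds) ...
group_a = simulate_item(5000, latent_mean=0.0, loading=0.8,
                        intercept=2.0, noise_sd=0.5, rng=rng)
# ... but a higher intercept in group B (scalar equivalence is violated).
group_b = simulate_item(5000, latent_mean=0.0, loading=0.8,
                        intercept=2.5, noise_sd=0.5, rng=rng)

observed_gap = statistics.mean(group_b) - statistics.mean(group_a)
print(f"Observed mean difference: {observed_gap:.2f}")
```

The observed gap should land close to the intercept difference of 0.5, even though the two groups do not differ at all on the latent construct. This is why comparing raw scores across cultures without checking the intercepts can produce a "cultural difference" that is purely an artefact of the item.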


In summary, measuring psychological attributes or processes across cultural contexts is quite difficult. I gave some relatively superficial and easy examples to make this a relatively non-technical and easy intro to the problem. We need to define our construct – draw our mental bubble around what we want to study. The first step in any cultural study then is to make sure that this construct or mental bubble is meaningful and functional in all cultures that we want to study. Once we think this is the case, we need to find good indicators that are observable and give us some insight into the position or state of an individual in relation to our mental bubble. We then need to discuss whether the indicators are equally good in all contexts or whether some are better in telling us something about a person or process in one cultural context compared to another. Finally, we need to find out whether all indicators are equally easy or difficult. Only once we have fulfilled this last criterion can we actually make any comparisons between individuals or groups across cultures. This is a tough task and unfortunately, most studies that you will see in the literature do fall well short of it. But this is the challenge that we really need to meet in order to develop a meaningful and universal psychological science.

Unpackaging culture & cultural differences

One of the most fascinating questions arises when we observe that individuals in a different cultural system behave or act in a different way. Why do they do that? What is the explanation or reason for showing these particular behaviours or responses? For example, we may have stepped on an exotic island and observe that the inhabitants eat way more chocolate ice cream than we do. Or they may tell you that there are lots of little ghosts out there taking care of them, many more than you ever thought would be possible to inhabit a small island like this. Or they may simply say in some interviews or surveys that they do not like to work as hard as you normally would expect in adult samples. How could we explain any of these differences?

Given the perpetual problem of potential bias in comparative research, we can never really rule out that our observations were simply erroneous – we might have had the wrong instruments, there may have been language problems in interactions (remember Lost in Translation?), we may have mis-interpreted the data or it may have simply been a chance difference.

One persuasive idea that has been around for quite a while in the social sciences is the idea of unpackaging. The term goes back to a classic study conducted by Whiting and Whiting (1975). They orchestrated a large ethnographic study of child development among six communities: a New England Baptist community; a Philippine barrio; an Okinawan village; an Indian village in Mexico; a northern Indian caste group; and a rural tribal group in Kenya. They reported differences in a number of psychological processes, socialization and child-rearing patterns. Going beyond just noting these differences, they reasoned that there must be specific contextual variables that could explain the differences found, linking ecological constraints faced by these communities to psychological processes via adaptive socialization practices. For instance, they compared the activities of children from the same families, some of whom were living in cities and others in villages. They also compared families in which young boys helped with baby-tending with those in which girls did the helping. Therefore, these social conditions were linked to observed behavioural differences, leading to one plausible explanation of why they may have occurred in the first place. This is essentially what psychologists study as mediation: processes and variables that explain the relationship between an independent or predictor variable and the dependent or criterion variable. It is about the causal theoretical processes, the how and why of the observed effects. We often think of mediators as internalized psychological processes of external conditions that lead to other outcomes down the causal chain. In cross-cultural work, the mediator does not necessarily have to be an internal psychological variable – it could also be living conditions or social constraints and norms that can act as mediators.

Put differently, unpackaging studies are extensions of basic cross-cultural comparisons in which the active ingredient presumed to cause the observed differences in psychological processes is directly measured and explicitly tested for its role in explaining the outcome. Have a look at the graphical representation of mediation. Unpackaging culture is one term often found in the literature; others include ‘linkage studies’ (Matsumoto & Yoo, 2006), ‘mediation studies’ (Kirkman, Lowe & Gibson, 2006) or ‘covariate studies/strategies’ (Leung & van de Vijver, 2008).
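For those who like to see the logic in numbers, here is a minimal sketch of the classic regression approach to mediation on simulated data. Everything in it is an assumption for illustration: a binary 'culture' variable, a hypothetical mediator (say, a value score) and an outcome, with effect sizes I picked arbitrarily so that the cultural difference is fully carried by the mediator.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000

culture = rng.integers(0, 2, size=n).astype(float)   # 0 = group A, 1 = group B
mediator = 0.8 * culture + rng.normal(size=n)        # culture shifts the mediator
outcome = 0.5 * mediator + rng.normal(size=n)        # only the mediator drives the outcome

c_total = ols(outcome, culture)[1]                   # total cultural difference
a_path = ols(mediator, culture)[1]                   # culture -> mediator
_, c_direct, b_path = ols(outcome, culture, mediator)  # direct effect and mediator -> outcome

indirect = a_path * b_path                           # the 'unpackaged' part of the difference

print(f"total effect:    {c_total:.3f}")
print(f"indirect (a*b):  {indirect:.3f}")
print(f"direct effect:   {c_direct:.3f}")
```

With ordinary least squares the total effect splits exactly into the indirect and the direct part. In this toy setup the direct effect should hover around zero, which is what 'complete mediation' looks like.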

For example, Tinsley (2001) found that differences in the conflict management strategies of German, Japanese and US managers were completely mediated by the values held by members of these cultural groups, and Felfe, Yan and Six (2008) reported that individuals’ scores on a ‘collectivism’ scale mediated differences in organizational commitment across samples of Romanian, German and Chinese employees.

In an ideal test of mediation, the researcher tests whether other relevant variables that are not related to the hypothesis also yield mediation effects. This provides greater certainty in establishing exactly what causes the results that are obtained. For instance, Y. Chen, Brockner, and Katz (1998) showed that a measure of individual-collective primacy mediated the intergroup effects that they had predicted and found. They then tested whether six other measures derived from the concept of individualism-collectivism also mediated these effects, and found that they did not. Studies of this kind help to clarify the loose and varied ways in which the psychological aspects of individualism and collectivism have been employed by different authors.

What are some general concerns?

In the above examples, the mediator and dependent or criterion variable were measured using the same or similar methods. If some third unmeasured variable (shared method variance, for instance) influences both the mediator and the dependent variable, whether or not it is related to the independent variable, we may end up with a situation where it appears that there is mediation, whereas in reality, there is none. Having multiple mediators measured with the same method as the DV may lead to some reassurance about the findings, but ultimately, the best test would be a test using independent methods.
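A quick simulation sketch shows how this can go wrong. The setup is entirely hypothetical: culture has a direct effect on the outcome, the 'mediator' has no causal effect on the outcome at all, but an unmeasured variable (think of a shared response style) feeds into both the mediator and the outcome.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(7)
n = 5000

culture = rng.integers(0, 2, size=n).astype(float)
method = rng.normal(size=n)                               # unmeasured shared-method variable

mediator = 0.8 * culture + method + rng.normal(size=n)    # differs by culture, picks up method
outcome = 0.5 * culture + method + rng.normal(size=n)     # NO causal path from the mediator

a_path = ols(mediator, culture)[1]
_, c_direct, b_path = ols(outcome, culture, mediator)
indirect = a_path * b_path

print(f"b path (true value is 0):          {b_path:.3f}")
print(f"apparent indirect effect (a*b):    {indirect:.3f}")
print(f"estimated direct effect (true .5): {c_direct:.3f}")
```

The analysis reports a sizeable 'mediated' effect and shrinks the direct effect, even though the mediator does nothing here. Measuring the mediator and the DV with independent methods would break the shared link.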

Experiments have been much in vogue recently to study cultural differences. One of the major concerns is whether the manipulation was effective or not. This is again the problem of potential bias in comparative studies. If we have a psychological mediator in our experiment that highlights how the manipulation is affecting the DV, then we are on much safer ground in terms of explaining the psychological processes.

In summary, unpackaging has two important and inter-related features: identification of the theoretical factors or processes that may cause cultural differences in psychological outcomes of interest, and an explicit empirical test of the proposed processes leading to these outcomes. Therefore, it is as much about theory as it is about methods and stats. Having unpackaged an observed difference in behaviour, attitudes or beliefs and having ruled out alternative theoretical explanations (other potential mediators), we can also place much more confidence in our results. I leave it up to you to come up with potentially meaningful variables that we could use in those three semi-silly examples (ice cream, ghosts and motivation). Once you have some mechanism, the next phase would be to test whether the mediator(s) actually do the job. Ideally, this is one of the best ways to rule out measurement bias – explain where the difference came from and that the difference is not driven by artefacts.
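One common way to test whether a mediator 'does the job' is a percentile bootstrap of the indirect effect. The sketch below is a bare-bones version on simulated data, not a replacement for dedicated tools; the variable names, effect sizes and the number of resamples are my own assumptions.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product of the a path (x -> m) and the b path (m -> y, controlling for x)."""
    ones = np.ones(len(y))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

rng = np.random.default_rng(1)
n = 1000

culture = rng.integers(0, 2, size=n).astype(float)
mediator = 0.8 * culture + rng.normal(size=n)
outcome = 0.5 * mediator + rng.normal(size=n)

estimate = indirect_effect(culture, mediator, outcome)

# Resample cases with replacement and re-estimate the indirect effect each time.
n_boot = 2000
boot = np.empty(n_boot)
for i in range(n_boot):
    idx = rng.integers(0, n, size=n)
    boot[i] = indirect_effect(culture[idx], mediator[idx], outcome[idx])

ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect: {estimate:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

If the interval excludes zero, the data are consistent with the proposed mediator carrying part of the cultural difference. As discussed above, that still does not rule out alternative mediators or shared-method artefacts.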


Some more technical explanations are available in Fischer (2009); Leung & Van de Vijver (2008) and Poortinga & Van de Vijver (1987, this is an excellent early discussion with some great multi-method examples). Excellent resources and tutorials for running state-of-the-art mediation analyses are available on Kristopher Preacher’s and Andrew Hayes’ websites. Use them!!!!!

Overall, I think this is the most exciting part of cross-cultural research – put on your detective hat and find out where any difference that you perceive in the world ultimately stems from.