I’ve avoided talking about the Google memo for as long as I can — partly because I naïvely thought that its shitty argument and shittier writing would do the work of discreditation without anyone’s help, and partly because, as women and people of color and especially women of color (I’m white) know, it’s exhausting and demoralizing to respond to every attack on your right to inhabit certain spaces. The “keep your head down and work twice as hard” approach feels tempting, for me at least — but, as the Google memo shows, whatever work we might produce in the process will automatically be invalidated as a statistical outlier: if women don’t do (science/math/analytical thought/whatever) well, and you’re doing (X) well, you must not be a real woman. (This circularity is, I’m guessing, extra hard on trans and gender-non-binary folks, an experience I can’t speak to and won’t attempt to.) Besides, the memo has been gaining more traction than I expected, largely among Facebook friends-of-friends whose pleas for science and objectivity I see only because my Facebook friends are valiantly telling them to stop being idiots.
So let’s talk about the science. If you want to see the memo in the context of a very long history of invoking “science” to justify the biased status quo, read Chanda Prescod-Weinstein’s great piece in Slate; if you want a detailed breakdown of the methodological and conceptual flaws behind many studies on “gender in the brain,” read Cordelia Fine’s book Delusions of Gender. If you want an annotation of every non sequitur, straw man, and entirely evidence-free argument in the memo itself … well, to my knowledge nobody has done that yet, but anyone who undertook it would be doing the Lord’s work, so hit me up if you do. My aim for this piece is much smaller than that: I’m going to focus on one aspect of one study that has been cited as evidence for one of the claims in the memo.
In discussions of the memo, a couple of folks have attempted to explain the gender gap in STEM fields (or rather, in some of them; more on this later) by appealing to a supposedly innate difference in what men and women find interesting. Men, the story goes, are biologically predisposed to be more interested in things, while women are predisposed to be more interested in people. We might break this claim down into three components:
1. There exists a clearly defined opposition between thing-interests and people-interests.
2. This difference clusters along gender lines (i.e., men prefer to think about things, while women prefer to think about people).
3. This gendered pattern can be explained by biological differences present at birth.
Now, claim 3 is arguably the juiciest and the most overtly controversial, so it tends to attract most of the focus, but it’s not my focus here. In brief, I’ll just note that there’s a tendency, in studies of gender and in the popular uptake of neuroscience in general, to assume that finding something in the brain — whether it’s a structural feature or a particular pattern of activity — means that it cannot possibly be attributed to cultural factors. Given what we know about neuroplasticity, this notion is misleading to say the least; it’s very obviously derived from a faulty understanding of the relationship between body and mind; and, like many similarly unexplored assumptions, it gets a thorough takedown in Delusions of Gender — which, really, just go read it. (Of course, it is not a perfect book, and I don’t endorse every argument in it.)
“Gendered Occupational Interests,” the article that many people seem to believe constitutes a smoking gun for this people-things distinction, takes part in this attempt to prove claim 3. The study appeared in Hormones and Behavior in 2011, and I’m addressing it not because I believe it is a uniquely weak article, but because it seems to strike a chord with so many fervent promoters of gender difference. The authors argue that prenatal androgen levels explain significant differences in career interests among male and female adolescents; to support this claim, they compare participants with congenital adrenal hyperplasia (in other words, girls and boys who were exposed to higher-than-normal levels of testosterone and other androgens in the womb) with unaffected participants. In girls, this condition can lead to “masculinization” or “virilization” of the genitalia. Congenital adrenal hyperplasia is a major research focus for scientists interested in making biological claims about gendered characteristics, and I can’t possibly do justice to this corpus here; suffice it to say that there are plenty of internal disagreements in the field.
Let’s take a couple of steps back, though, and focus on claim 1. I’m a literary scholar with a general research interest in the process of personification; my work often comes up against problems of what constitutes personhood or objecthood, and what features humans use to ascribe those qualities to an entity in the world or in a text. So what does it mean to be interested in people to the exclusion of things, or vice versa? For the authors of the article in question, these interests can be gauged through career choice: specifically, through an individual’s degree of interest in the items on an “Occupational Interest Inventory.” This list is divided into six categories (“Realistic, Investigative, Artistic, Social, Enterprising, Conventional”), of which the authors remove the final (“Conventional”) set, largely because the jobs it includes (bank teller, office clerk, “hospital records clerk”) proved not to be particularly interesting to any of the participants.1 For the record, this “RIASEC” schema is based on the Strong Interest Inventory, developed by one E.K. Strong in the 1920s and designed for career counseling.2 None of this information necessarily invalidates the metric, but it does perhaps open up some questions about its present applicability (“software engineering,” the career with which the Google memo identifies the entire “tech” field, is nowhere on the list) and its psychological relevance. Any survey or experimental setup steers participants toward certain responses to the exclusion of others, of course, but one might have qualms about assessing the range of an adolescent’s interests (from the passions that animate their being to the elements of an everyday scene they focus on) through a list of careers that seemed broadly accessible to a vocational counselor advising predominantly white American students.
Now, outside of the “Social” designation, none of the RIASEC categories seem automatically to scream “people” or “things” respectively. I’m not making the more general point that pretty much all jobs necessitate dealing with persons and objects at some point; this is true enough, but I’m willing to accept that some jobs entail more interpersonal interaction than others. One might surmise that a job like “mechanical engineer” would fall under the “things” category, for instance, while something like “day care worker” would be classed as “people.” Plenty of the items on the list, though, are genuinely ambiguous: a “scientific research worker” might be studying biological anthropology, theoretical physics, or — I don’t know — adolescent career choice as a function of fetal androgen levels; an “artist” might execute highly interactive performance pieces or might spend most of their time alone in an atelier making assemblages; even “building contractor,” I think it’s fair to say, admits of a pretty wide range of approaches to personal interaction. And all of this leaves aside the obvious point that “interest,” in the context of careers, implies more than just enjoying the task at hand; I might be “interested” in becoming a software engineer, not because of any particular aptitude for the subject or pleasure in the work, but because I want to make lots of money and have a job that brings me high status. (Is status-seeking evidence of an interest in “things” or in “people”?)
These objections are all based on the specifics of the survey situation; there’s another, more theoretical critique to be made of the implicit assumption that “people” and “things” are somehow opposites. The authors, to their credit, acknowledge that the “bipolarity of Things-People” is not a given, even though “analyses generally produce a bipolar factor”; perhaps, they suggest, a surplus of people-interest does not necessarily imply a deficit of thing-interest or vice versa. This concession comes after a set of quantitative results specifically structured to generate a bipolar distribution (on which more in just a second), so it is perhaps understandable that most readers seem to focus more on the tables and graphs than on this apparent afterthought. That the authors themselves were not interested in doing the conceptual work to tease out this problem is apparent from their one (approving) citation in this paragraph, which points to an article on “psychological androgyny” from 1975; the author of that study (apologies for the paywall) correlated participants’ scores on a survey assessing gender identification with their performance on what was essentially a test of their ability to satisfy gender stereotypes. A choice quote from the discussion of results:
It is the results of the feminine females which are the most surprising. It is true that, as predicted, the feminine females failed to display masculine independence in the face of pressure to conform, but it is also true that they failed to display feminine playfulness when given the opportunity to interact with a tiny kitten. Thus, across the two experimental situations, the feminine females can be said to have “flunked” both critical tasks, and consequently, it is they who seem to have the most serious behavioral deficit. (642)
But I digress. Back to that “bipolarity” question: what reason do we have for thinking that humans can be sorted according to their people-interest and thing-interest in the first place, regardless of those categories’ relationship to gender identification? For the authors, supporting this claim requires demonstrating that responses on the Occupational Interest Inventory can be reliably sorted along a people-things axis — in other words, that individual variation in interest in particular jobs can be best explained by those jobs’ peopleness or thingosity, respectively. This is important, because it’s not really enough for these authors to show gender differences in the preference for particular jobs (which they do); as they well know, those differences could be accounted for by a huge variety of factors, many of which are not biological in the slightest. So they need to provide evidence that these differences are well explained by the people-things distinction — which, in turn, they believe to be explained at least in part by fetal testosterone levels.
To make this connection, the authors turn to a principal component analysis. Now, as it happens, I have some experience with PCAs through my work at the Literary Lab, which uses quantitative and computational methods to analyze large corpora of literary texts.3 In its pamphlet series, the Lab has often used PCAs (perhaps most effectively in “Style at the Scale of the Sentence”) to look for clusters in the distribution of words; if you have a lot of variables that may or may not be correlated with each other, PCAs are great at grouping those variables into sets that are in some way statistically similar. One of the downsides of PCAs, of which we at the Lab are acutely aware, is their relative opacity. The two “principal components” of the name are the eigenvectors that represent the greatest degree of variance in the data (see the Wikipedia article on principal component analysis for more detail on how those eigenvectors are obtained), but those components are determined through complex statistical modeling and need not bear any relationship to what a human would consider to be significant axes. In other words, a PCA can show you that two groups of values cluster separately, and it can show you that those groups are maximally different along some axis of variation, but it can’t tell you what accounts for that difference; to figure that out, you need to interpret.4 That’s not a problem in itself: statistical results only acquire meaning through interpretation, and interpretation is literary scholars’ stock in trade. But it does require self-awareness, care, creativity, and a willingness to consider alternative explanations for the distribution.
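For readers who want to see that opacity in action, here is a minimal sketch of a PCA in Python with NumPy. Everything in it is an assumption made for illustration: the ratings are synthetic, the loadings are invented, and the `pca` helper is hypothetical rather than anything taken from the study. The data are driven by a single hidden factor that I deliberately leave unnamed.

```python
import numpy as np

def pca(scores, n_components=2):
    """PCA via eigendecomposition of the covariance matrix (hypothetical helper)."""
    centered = scores - scores.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    components = eigvecs[:, order].T                  # one component per row
    explained = eigvals[order] / eigvals.sum()        # share of total variance
    return components, explained, centered @ components.T

# Synthetic ratings: 500 imaginary respondents, five category scores each.
rng = np.random.default_rng(0)
n = 500
categories = ["Realistic", "Investigative", "Artistic", "Social", "Enterprising"]
hidden_factor = rng.normal(size=n)                    # deliberately unnamed
loadings = np.array([1.0, 0.3, -0.4, -1.0, 0.0])      # invented for illustration
scores = hidden_factor[:, None] * loadings + rng.normal(scale=0.5, size=(n, 5))

components, explained, projected = pca(scores)
for name, weight in zip(categories, components[0]):
    print(f"{name:14s} {weight:+.2f}")
```

The first component puts “Realistic” and “Social” at opposite ends, and it would do so whether I had called the hidden factor “thing-orientation,” “conformity to gender stereotype,” or “expected salary.” The arithmetic is identical in each case; only the label changes, and the label comes from the interpreter.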
The variables that the authors of “Gendered Occupational Interests” used in their PCA were the composite individual scores for each of the RIASEC categories (minus “Conventional,” as mentioned above). Note that by taking the composite scores for each category, rather than looking at the scores for each career, the authors are already eliminating a bit of variation that might have proved interesting: if the scores for, say, “biologist” and “astronomer” turned out to cluster with those for “poet” and “musician” more than they did with “surgeon” and “physician,” we might have had to reevaluate the validity of the RIASEC schema. But that’s pure speculation, because the authors chose to assume the internal coherence of the categories; the PCA shows us only the covariance of individual ratings for each of the categories. In other words, it shows us the axes that best separate an individual’s ratings of five different broad career types. If we see “Realistic” and “Social” on opposite sides of the PCA, we should assume that individuals with a high preference for “Realistic” careers are likely to have a low preference for “Social” careers, and vice versa.
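To make the cost of compositing concrete, here is a small sketch on invented numbers. The two hidden “tastes,” the assignment of careers to them, and the `rate` and `corr` helpers are all assumptions of mine; nothing here reproduces the study’s data. The ratings are built so that “biologist” tracks “poet” rather than “surgeon,” which is exactly the speculative scenario above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
taste_a = rng.normal(size=n)   # invented hidden taste
taste_b = rng.normal(size=n)   # a second, independent invented taste

def rate(taste):
    """Synthetic rating: a hidden taste plus individual noise."""
    return taste + rng.normal(scale=0.3, size=n)

items = {
    "biologist": rate(taste_a), "astronomer": rate(taste_a),
    "surgeon": rate(taste_b), "physician": rate(taste_b),
    "poet": rate(taste_a), "musician": rate(taste_a),
}

# Composite scores, as if the category labels were internally coherent.
investigative = np.mean(
    [items[k] for k in ("biologist", "astronomer", "surgeon", "physician")], axis=0
)
artistic = np.mean([items[k] for k in ("poet", "musician")], axis=0)

def corr(x, y):
    return float(np.corrcoef(x, y)[0, 1])

item_same_label = corr(items["biologist"], items["surgeon"])   # same category
item_cross_label = corr(items["biologist"], items["poet"])     # different categories
composite = corr(investigative, artistic)

print(f"biologist vs surgeon: {item_same_label:+.2f}")
print(f"biologist vs poet:    {item_cross_label:+.2f}")
print(f"Investigative vs Artistic composites: {composite:+.2f}")
```

At the level of individual careers, the split inside “Investigative” is plain. Once the scores are averaged, all that remains is a single middling correlation between two categories, and a PCA run on the composites has no way of recovering what the averaging removed.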
The results might at first seem to indicate a victory for the people-things hypothesis: “Realistic” and “Social” do seem to be well separated by one of the components of the graph. The other component is clearly doing some work as well: it pulls “Enterprising,” which seems to be almost perfectly neutral on the Realistic/Social axis, far to one side, while “Investigative” tacks to the other. But let’s remember that the labels on this graph are not immanent in the data; they were obtained by interpreting a statistical process designed to capture as much variance as possible. On the one hand, the authors went into the study with the hypothesis that the “Realistic” and “Social” career types would separate cleanly, and they did; hooray! On the other hand, they based that hypothesis on the assumption that the separation was attributable to thing-orientation or people-orientation, and the PCA does not show that. The “Things” and “People” labels, then, are as much an interpretive move as the “Data” and “Ideas” labels, for which the authors do not offer any rationale, but which I find frankly perplexing. (I suspect that the scientists whose careers are included under the “Investigative” rubric would be surprised to learn that their jobs involve less “data” than managing a hotel.)5
One might object here: okay, technically the PCA doesn’t prove that the Realistic/Social axis is explained by the things/people distinction, but what else could it be? Doesn’t it just feel obvious that being a social worker is more like being an art teacher than it is like being a jet pilot, or that being a surgeon is more like being a mechanical engineer than like being a “playground director” (whatever that is)? And if so, don’t we have to conclude that those differences are based on the degree of interpersonal interaction, or the degree of object-orientation, associated with those jobs?
I have to agree with you on one point, imaginary interlocutor: it does feel obvious! And I think there is a very good reason why. Have a look at Table 2; the relevant column here is “Sex Difference (Unaffected),” which shows the difference between non-CAH-affected females’ and males’ positive ratings for different career categories. There’s a pretty big discrepancy, as you can see, between “Realistic” and “Social” careers: on average, boys do rate “Realistic” careers more highly than girls do, and girls do rate “Social” careers more highly than boys do. I don’t pretend to have particular insight into the causes behind these ratings, but one explanation does come to mind: these career sets conform especially neatly to male and female stereotypes. The “Realistic” jobs involve some element of either technical training or physical risk, while the “Social” ones are uniformly concerned with caregiving. (That there might be social jobs that do not involve caregiving doesn’t seem to be accounted for here, which by now should not come as a shock.) Historically, these categories have been particularly bifurcated along gender lines, both statistically and in the popular imagination; and if, instead of reaching for ’70s-era theories of gender conformity, we turn to possible selves theory, we find a solid theoretical explanation (coupled with abundant anecdotal evidence) for why adolescents might be less likely to imagine themselves in a role that people of their gender rarely fill.
When I turn my interpretive eye on the PCA provided by the authors, then, what I see along the “Things/People” axis is a pretty good approximation of a “Male Stereotype/Female Stereotype” axis. Jet pilots and elementary school teachers occupy the respective extremes, of course, but the space in between makes sense too: the case that “physician” is a less people-focused job than “interior decorator” seems at the very least un-obvious, but as we’ve seen from a recent article by engineers at Google, the word “physician” is implicitly gendered male in news articles, just as “interior designer” shows up as an “extreme she occupation.” When researchers use tools that presume a binary to look for gender distinctions (whether those tools are PCAs or, as in the word embeddings article, vector space analysis), they find a binary. That binary isn’t imaginary (it is true that gender is culturally perceived to be a binary), but it says exactly nothing about where that binary comes from or whether it has a biological basis. And the fact that people identifying as male and female sort themselves along male and female stereotype axes is, while worth demonstrating, neither surprising nor compelling evidence for some kind of basic cognitive or motivational difference between those genders; neither is the fact that people whose gender identification is more complicated (because they were born with ambiguous genitalia, or because they have been treated for high androgen levels) would have more complicated answers.
Does the fact that this particular article does not hold water spell the end of biological justifications for gender inequality? Clearly not — both because the lack of empirical evidence never stopped sexism before, and because it remains theoretically possible that some ingeniously constructed study will actually demonstrate an innate, non-malleable, behaviorally efficacious neurological difference between male-identified and female-identified humans.6 But this reading seemed to me worth doing, first because it’s clearly more than most of the folks citing this article as proof positive of biologically determined gender roles have done, and second because it demonstrates that the Google memo’s rallying cry of “science” is neither sufficient nor sincere. The tools I’ve used to think through this study are all tools that I learned in my humanistic education, which encouraged critical thinking of a kind that the hysterical co-signers of the memo don’t seem willing or equipped to practice. If we’re applying a binarizing gender logic to academia, the humanities certainly should be the opposite of everything the world of “tech” imagines itself to be: “fuzzy,” touchy-feely, deficient in logic, unappreciative of evidence … If you’ve made it this far, I think I can leave the conclusion to that “if” as an exercise for the reader.
1. Most of the category names are explicable enough, with the notable exception of “Realistic.” I’m not sure what makes the profession of “auto racer” more “realistic” than that of “special education teacher,” but I suspect it has some not-quite-nice relationship to gender roles!
2. Strong himself published a volume on Vocational Interests of Men and Women in 1943; it’s hard to come by a copy now, so I’ll let everyone speculate freely about the conclusions he drew.
3. But literary studies is a people field, and computers are things! I know; I know.
4. That interpretation can be smoothed along if the structure of your data is in itself conducive to PCA. In “Style at the Scale of the Sentence,” the Lab was interested in the semantic differences among four discrete sentence types, which, at least theoretically, provide a good target for the four-quadrant analysis that a PCA allows. As it happens, two of the sentence types clustered quite close together in the resulting PCA, while the other two were extremely distinct along one of the axes of variation; this allowed the writers to realize that their four sentence types were actually effectively three, semantically speaking.
5. To me, that mysterious entrepreneurial axis is the more interesting finding in this PCA. Could this be the status-seeking dimension that I posited above? (After all, artists also tack distinctly to the “Enterprising” side …) That’s an example of a hypothesis that the data in this article is entirely insufficient to test.
6. But, as one Facebook commenter pointed out, this article was written by women! I know. This isn’t a “men vs. women” argument; it’s a “sexists vs. anti-sexists” argument.