Inspired by an article and a test in The New York Times Magazine, the Gender Genie uses a simplified version of an algorithm developed by Moshe Koppel of Bar-Ilan University in Israel and Shlomo Argamon of the Illinois Institute of Technology to predict the gender of an author. You simply copy and paste a block of writing into the text box, choose the genre of the writing (you can pick from "fiction," "nonfiction," and "blog post") and press "submit." The tool quickly analyzes the text, spits out two columns of words (one for each gender), and tells you whether it thinks the author of the work is male or female. It is supposedly around 80% accurate.
The tool scans the document, picks out certain key words, and assigns a numerical value to each word. It then adds up all these numbers -- when the total falls within a certain range, the author is judged to be a certain gender.
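For the curious, here is a rough Python sketch of that scan, score, and sum idea. It is not the Gender Genie's actual code: the word list, the weights of plus or minus one, the cutoff of zero, and the names `ILLUSTRATIVE_WEIGHTS`, `score_text`, and `guess_gender` are all made up for illustration. The real tool presumably uses a much larger table with different values.

```python
import re

# Hypothetical weights, NOT the Gender Genie's real table.
# Positive values lean "masculine", negative values lean "feminine".
ILLUSTRATIVE_WEIGHTS = {
    "a": 1, "the": 1, "that": 1,      # determiners
    "one": 1, "two": 1, "more": 1,    # quantifiers
    "i": -1, "you": -1, "she": -1,    # personal pronouns
}


def score_text(text, weights=ILLUSTRATIVE_WEIGHTS):
    """Sum the weights of every keyword occurrence in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(weights.get(word, 0) for word in words)


def guess_gender(text, threshold=0):
    """Map the total score onto a label using an assumed cutoff."""
    total = score_text(text)
    if total > threshold:
        return "male"
    if total < threshold:
        return "female"
    return "undetermined"


print(guess_gender("The robot counted the two bolts on a shelf."))
print(guess_gender("I told you she would call, and I was right."))
```

With these toy weights, a sentence heavy on "the" and "a" lands on one side of the cutoff and a sentence heavy on "I" and "you" lands on the other, which is the basic idea, even if the real numbers are anyone's guess.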
The number each word is assigned is apparently based on how "masculine" or "feminine" it is -- but not in the ways you might expect. For more on the "gender of a word" see Alexander Chancellor's article about the tool from The Guardian:
One of [the researchers'] findings is that women are far more likely than men to use personal pronouns ("I", "you", "she", etc), whereas men prefer words that identify or determine nouns ("a", "the", "that") or that quantify them ("one", "two", "more"). According to Moshe Koppel, one of the authors of the project, this is because women are more comfortable thinking about people and relationships, whereas men prefer thinking about things. But the self-styled "stylometricians", in creating their gender-identifying algorithm, have been at pains to avoid the obvious.
The algorithm pays no attention to the subject matter of a piece of writing, or to the occurrence in it of words that might suggest a greater interest by one sex or the other, such as "lipstick" or "bullets". Instead, it looks for little clues that both writers and readers would probably fail to notice, such as the number of personal pronouns used.
Unsurprisingly, the tool pegged me as male every single time -- even when I tested it with The Stark White Elevator, a story I wrote with a female narrator. I must admit, I have always felt more comfortable thinking in the realm of objects than in the world of feelings, people, and relationships. Psychologists have been calling us "Left Brainers" for years: we are good at math and other logical things, though we can never understand why puny humans cry. And I guess my cries of "I am not a Robot!" have been easily found out as lies -- even, ironically, by a computer program.
So I am a Robot; now I know. But after learning that almost all of the female contributors to The Guardian were discerned by the program to be male, I wanted to experiment some more, to see if I too could trick the system. So I put in my sister's report on Woody Guthrie and Odetta Holmes, but it was determined to be distinctly feminine.
But then I tried her report on Macbeth, which came up male! Yet another startling revelation? Maybe Hannah is a man, or maybe the machine had gotten it right in a different way. I distinctly remember helping "a lot" with that report. Exactly how much "a lot" amounts to could and would be debated to the end of the Earth, but Robots do not lie.
Now try the tool out for yourself, and ta-ta for Tao!