BARBELITH underground
 



Ambisexual writer, me...

 
 
Mourne Kransky
10:13 / 09.11.03
This site (Gender Genie), inspired by an article in The New York Times Magazine, uses a simplified version of an algorithm developed by Moshe Koppel, Bar-Ilan University in Israel, and Shlomo Argamon, Illinois Institute of Technology, to predict the gender of an author.

I pasted in a barble about (atypically for me) women's bottoms and this piece of writing was, apparently, scored:
Female Score: 146
Male Score: 290


Says it works best on text more than 500 words long so I dug out a long, long piece of Daphne du Maurier pastiche I wrote ages ago in The Creation for shortfatdyke and it scored:
Female Score: 1918
Male Score: 1212


Maybe there's something in it if I'm male when I write about Kylie's arse and female when I'm writing with the voice of the second Mrs de Winter. Or maybe not. Hohoho.
 
 
Ethan Hawke
15:25 / 09.11.03
I tried this experiment on my LJ with a few Barbelith people and others a while ago, and the tendency was that the longer the writing sample entered for each person, the more "female" the writing scored.

This is the algorithm used by the program, if anyone's interested.

1. Count the number of words in the document.

2. For each appearance in the document of the following words ADD the number of points indicated:
'the' (17)
'a' (6)
'some' (6)
any number, written in digits or in words (5)
'it' (2)

3. For each appearance in the document of the following words SUBTRACT the number of points indicated:
'with' (14)
possessives, ending in 's' (5)
possessive pronouns, such as 'mine', 'yours', 'his', 'hers' (3)
'for' (4)
'not' or any word ending with 'n't' (4)

4. If the total score (after adding and subtracting as indicated) is greater than the total number of words in the document, then the author of the document is probably a male. Otherwise, the author is probably a female.
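
A rough Python sketch of the four steps as listed above, for anyone who wants to try them by hand. The point values come straight from the list; the tokenising, the short set of number words and the names `score` and `guess` are assumptions made for illustration, and this is not the Gender Genie's actual code (the site reports separate Female and Male scores, which this simplified version does not).

```python
import re

# Point values exactly as given in the list above.
ADD = {"the": 17, "a": 6, "some": 6, "it": 2}
SUBTRACT = {"with": 14, "for": 4, "not": 4}

# Only the four examples the list names; a fuller list is a guess.
POSSESSIVE_PRONOUNS = {"mine", "yours", "his", "hers"}

# Assumption: a small sample of numbers written as words.
NUMBER_WORDS = {"one", "two", "three", "four", "five", "six", "seven",
                "eight", "nine", "ten", "hundred", "thousand"}


def score(text):
    """Return (total score, word count) for the given text."""
    text = text.lower().replace("\u2019", "'")  # normalise curly apostrophes
    words = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text)
    total = 0
    for word in words:
        if word in ADD:
            total += ADD[word]
        elif word.isdigit() or word in NUMBER_WORDS:
            total += 5
        elif word in SUBTRACT:
            total -= SUBTRACT[word]
        elif word.endswith("n't"):
            total -= 4
        elif word in POSSESSIVE_PRONOUNS:
            total -= 3
        elif word.endswith("'s"):
            # Crude: this also catches contractions such as "it's".
            total -= 5
    return total, len(words)


def guess(text):
    """Step 4: 'male' if the score exceeds the word count, else 'female'."""
    total, count = score(text)
    return "male" if total > count else "female"


print(guess("The cat sat with a dog for some time."))
print(guess("I can't go with Anna's friend, not for his sake."))
```

The first sentence scores 11 over 9 words and comes out "male"; the second scores -34 over 10 words and comes out "female". Samples this short say nothing useful about a writer, which fits the site's own advice to use more than 500 words.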

Now, if you'll excuse me, I have several pie charts in the oven for the inkblot lab report.
 
 
Mourne Kransky
19:38 / 09.11.03
This article in Nature claims:

A new computer program can tell whether a book was written by a man or a woman. The simple scan of key words and syntax is around 80% accurate on both fiction and non-fiction.

The program's success seems to confirm the stereotypical perception of differences in male and female language use. Crudely put, men talk more about objects, and women more about relationships.

Female writers use more pronouns (I, you, she, their, myself), say the program's developers, Moshe Koppel of Bar-Ilan University in Ramat Gan, Israel, and colleagues. Males prefer words that identify or determine nouns (a, the, that) and words that quantify them (one, two, more).

So this article would already, through sentences such as this, have probably betrayed its author as male: there is a prevalence of plural pronouns (they, them), indicating the male tendency to categorize rather than personalize.

If I were female, the researchers imply, I'd be more likely to write sentences like this, which assume that you and I share common knowledge or engage us in a direct relationship. These differing styles have previously been called 'informational' and 'involved', respectively.


They tried the program out on 566 English-language works to achieve the scores above. A. S. Byatt's (crap) book "Possession" was misclassified by gender, along with Kazuo Ishiguro's "The Remains of the Day".

The research team are now testing texts from further back in history and in other languages to see if the same findings result.

Choosing some online texts by (so far as I know) female writers, I gave it the first five paragraphs of Mary Shelley’s "Frankenstein" and George Eliot’s "Middlemarch" and it got their gender right.

I gave it the first five paragraphs of Edith Wharton’s "The Age of Innocence", Anna Sewell’s "Black Beauty" and Mrs Gaskell’s "Wives and Daughters" and it thought they were men.

I did the same with D H Lawrence’s "Sons and Lovers", Jack London’s "White Fang", Somerset Maugham’s "Of Human Bondage", Edgar Allan Poe’s "The Pit and the Pendulum", and Oscar Wilde’s "Lord Arthur Savile’s Crime".

The computer said Lawrence and London were clearly very male, Maugham and Wilde were male and Poe was female.

The Maugham and Wilde samples had short paragraphs, so for those two I gave it the first 550 words instead. When I had fed in the initial, shorter samples, it thought both Maugham and Wilde were female.

I tried it with some more recent writing from the Guardian site (where I first heard about the Gender Genie).

I submitted the first five hundred words (plus) of five articles by male writers and five by female.

It said David Aaronovitch, Iain Banks, Hugh Fearnley-Whittingstall, Julie Burchill, Germaine Greer, Cristina Odone and Zadie Smith were male.

It said Gareth MacLean, Gary Younge and Sandi Toksvig were female.

I finished off by submitting the first two posts above and the programme decided both Todd and I are female.

I wonder how they came up with their claim of 80% accuracy? Of the twenty-two samples I’ve just given it to analyse, it got less than fifty percent right.
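
A quick tally of the results listed above, as a Python sketch. Each published author is recorded as (actual, predicted), using the longer 550-word result for Maugham and Wilde. The two forum posts are left out of the table and counted as misses, which is an assumption: it is the only reading that squares with "less than fifty percent" of twenty-two.

```python
# (actual, predicted) for each published author named above.
samples = {
    "Shelley": ("F", "F"), "Eliot": ("F", "F"),
    "Wharton": ("F", "M"), "Sewell": ("F", "M"), "Gaskell": ("F", "M"),
    "Lawrence": ("M", "M"), "London": ("M", "M"),
    "Maugham": ("M", "M"), "Wilde": ("M", "M"), "Poe": ("M", "F"),
    "Aaronovitch": ("M", "M"), "Banks": ("M", "M"),
    "Fearnley-Whittingstall": ("M", "M"),
    "Burchill": ("F", "M"), "Greer": ("F", "M"),
    "Odone": ("F", "M"), "Smith": ("F", "M"),
    "MacLean": ("M", "F"), "Younge": ("M", "F"), "Toksvig": ("F", "F"),
}

correct = sum(1 for actual, predicted in samples.values()
              if actual == predicted)
print(f"{correct} of {len(samples)} published samples right")

# Assumption: both forum posts were classified wrongly.
forum_posts = 2
total = len(samples) + forum_posts
print(f"{correct} of {total} overall = {correct / total:.0%}")
```

That gives 10 of 20 for the published writers and 10 of 22 (about 45%) overall, a long way short of 80%, though twenty-two short samples is a small test.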

Clearly, I am at a loose end, this Sunday night...
 
  