Comparing Speech and Surrounding Narrative

Introduction

It might be interesting to tease apart the text of a novel (or several novels) into the text that the author writes directly and the text the author gives to the characters to say.
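One rough way to do this split is sketched below. It assumes speech is always wrapped in double quotation marks (straight or curly) and that the marks are balanced. The function name split_speech is made up for illustration; real editions will need more care with nested quotes and with speech that runs across paragraphs.

```python
import re

# Assumption: every stretch of speech opens and closes with a double
# quotation mark, either straight (") or curly (\u201c ... \u201d).
QUOTE_RE = re.compile(r'["\u201c]([^"\u201c\u201d]*)["\u201d]')


def split_speech(text):
    """Return (speech, narrative) as two lists of text fragments."""
    # Text captured between quotation marks is treated as speech.
    speech = [m.group(1).strip() for m in QUOTE_RE.finditer(text)]
    # re.split with one capturing group alternates narrative and speech,
    # so every second item (starting at index 0) is narrative.
    narrative = [p.strip() for p in QUOTE_RE.split(text)[::2] if p.strip()]
    return speech, narrative


# Invented sample sentence, not a quotation from the novel.
sample = '"Good morning," said Mr Darcy. "Is it not a fine day?" Elizabeth smiled.'
speech, narrative = split_speech(sample)
```

Joining each list with spaces gives two plain texts that can be fed to the visualizations separately.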

I'll upload several parsed books by Austen:

Visualizations

Here's a first look at this using Jane Austen's "Pride and Prejudice".

Wordles

We'll kick off with two Wordles comparing the tag clouds produced from all the non-speech text and from all the speech in the novel.




The first thing that strikes one is how much more Elizabeth is mentioned in the narrative than in the characters' dialogue. It's also interesting how the dialogue is laden with judgemental words like 'can', 'must', 'may', 'nothing' and 'never'. These words do not seem as prevalent in Austen's own narrative.
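An impression like this can be checked roughly by comparing how often each word occurs relative to the length of the text it comes from. The sketch below is only an illustration: relative_freq is a hypothetical helper, the tokenizer is a crude regular expression, and the two sample strings are invented stand-ins for the real speech and narrative texts.

```python
import re
from collections import Counter


def relative_freq(text, targets):
    """Return the share of all words in `text` taken up by each target word."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {w: counts[w] / total for w in targets}


# Invented samples standing in for the speech and narrative texts.
speech_sample = "You must go. You must."
narrative_sample = "Elizabeth walked to the house."

speech_freq = relative_freq(speech_sample, ["must", "never"])
narrative_freq = relative_freq(narrative_sample, ["must", "never"])
```

Using shares rather than raw counts matters because the speech and the narrative are not the same length.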

Let's check whether that difference is borne out when we build a tag cloud of word pairs.

Word Pair Tag Clouds

Here we can compare the tag clouds produced by all the non-speech and then all the speech word pairs from the novel.
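For readers who want to reproduce the counts behind such a cloud, here is a minimal sketch of counting adjacent word pairs. It is not necessarily how the tag cloud tool builds its pairs: word_pairs is a hypothetical helper, and as a simplification it lets pairs run across sentence boundaries.

```python
import re
from collections import Counter


def word_pairs(text):
    """Count each pair of adjacent words in `text`."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # zip pairs every word with the word that follows it.
    return Counter(zip(words, words[1:]))


# Invented sample, not a quotation from the novel.
pairs = word_pairs("Mr Darcy said that Mr Darcy would come")
top_pair = pairs.most_common(1)[0]
```

Running the same function over the speech text and the narrative text gives two tables whose most common pairs can be compared directly.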




What is striking here is the similarity. The paired words tend to be (title, surname) pairs, referring either to who spoke (e.g. "xxxxxxxx" said Mr Darcy) or to the characters in conversation.

Now we can look at differences within the structure of the text.

Word Trees

These are harder, as the points of comparison (the text prefixes we choose) offer so many possibilities, but it's interesting to start with how Darcy is described by the characters and how he is described by Austen herself.
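The raw material for a word tree is the set of phrases that follow a chosen prefix. A minimal sketch of collecting them is below; continuations is a hypothetical helper, the prefix "Darcy was" is just one possible choice, and the sample text is invented.

```python
import re
from collections import Counter


def continuations(text, prefix, length=3):
    """Count the word sequences (up to `length` words) that follow `prefix`."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    key = prefix.lower().split()
    n = len(key)
    found = Counter()
    for i in range(len(words) - n + 1):
        if words[i:i + n] == key:
            tail = tuple(words[i + n:i + n + length])
            if tail:  # skip a prefix that ends the text
                found[tail] += 1
    return found


# Invented sample, not a quotation from the novel.
sample = "Darcy was proud. Everyone agreed Darcy was tall and proud."
branches = continuations(sample, "Darcy was", length=1)
```

Comparing the branches found in the speech text with those found in the narrative text is one way to narrow down which prefixes are worth drawing as trees.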