
Text Analysis of Harry Potter Fanfiction (Part Two)

Posted in my research

My comparison of the textual features of the Harry Potter novels and Harry Potter fanfiction continues…

ANALYSES – PUNCTUATION

I looked at the occurrences of various types of punctuation used in the Harry Potter corpus (HP) and the Harry Potter Fanfiction corpus (HPFF).

There are far more occurrences of periods, question marks and exclamation marks in the Harry Potter fanfiction corpus overall. To try to account for the higher frequency of periods, I counted occurrences of ellipses and looked for other anomalies, but did not find enough to account for the difference. I therefore believe the data shows that there are more and shorter sentences in the Harry Potter fanfiction corpus. To account for the higher frequency of question marks, I looked at the individual text files in the Harry Potter fanfiction corpus. The much higher occurrence of question marks seems to arise from more introspective narration (characters asking rhetorical questions in their thoughts) and from the corpus simply having more sentences.
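A count of this kind can be sketched in a few lines of Python. This is only an illustration, not necessarily how the figures above were produced: the function names are hypothetical, and normalising per 1,000 words is my own assumption about how to compare corpora of different sizes. Ellipses are counted and removed first so that their periods are not mistaken for sentence endings.

```python
import re
from collections import Counter

# Runs of three or more periods, or the single ellipsis character.
ELLIPSIS = re.compile(r"\.{3,}|…")

def punctuation_counts(text):
    """Count periods, question marks and exclamation marks, with ellipses kept separate."""
    ellipses = len(ELLIPSIS.findall(text))
    # Strip ellipses so their periods are not counted as full stops.
    stripped = ELLIPSIS.sub(" ", text)
    counts = Counter(ch for ch in stripped if ch in ".?!")
    return {
        "period": counts["."],
        "question": counts["?"],
        "exclamation": counts["!"],
        "ellipsis": ellipses,
    }

def per_thousand_words(counts, text):
    """Normalise raw counts by corpus size (whitespace-separated words)."""
    words = len(text.split())
    if words == 0:
        return {name: 0.0 for name in counts}
    return {name: round(1000 * n / words, 2) for name, n in counts.items()}
```

Running both functions over the text of each corpus gives comparable rates, and a large "ellipsis" figure would flag a corpus where raw period counts overstate the number of sentences.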

ANALYSES – WIZARDING TERMINOLOGY

Following Zareen Farooqui’s analyses of the occurrences of world-specific terms across the Harry Potter novels, I extended this analysis to the Harry Potter fanfiction corpus, expecting fewer occurrences given the propensity for Harry Potter fanfiction to be set in Alternate Universes and to often forgo the existence of magic.

As expected, the Harry Potter corpus has far more occurrences of world-specific terms. A comparison of each term:

Surprisingly, “diagon alley” and “potions” are more prevalent in the Harry Potter fanfiction corpus. I uploaded the 17 individual Harry Potter fanfiction text files into AntConc to compare them. Two works of fanfiction appeared to be significant and so I did some old-fashioned text analysis (a.k.a. reading) to see why.

One of the works of fanfiction analysed is set in Diagon Alley and accounts for 28 of the occurrences of that term (45.16% of total occurrences in the Harry Potter fanfiction corpus), explaining the discrepancy.

Another of the works of fanfiction depicts Harry training to become a Potions Master and accounts for 146 of the occurrences of that term (51.41% of total occurrences in the Harry Potter fanfiction corpus), explaining that discrepancy.
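The per-file breakdown I did in AntConc could also be approximated in Python. The sketch below is an assumption-laden illustration rather than the method used above: the function names and the idea of passing the texts in as a dictionary keyed by file name are hypothetical, and matching is case-insensitive on whole words only.

```python
import re

def term_count(text, term):
    """Case-insensitive count of a whole-word term or phrase."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def shares_by_file(texts, term):
    """Return {file name: (count, percentage of the corpus total)} for one term."""
    counts = {name: term_count(text, term) for name, text in texts.items()}
    total = sum(counts.values())
    return {
        name: (n, round(100 * n / total, 2) if total else 0.0)
        for name, n in counts.items()
    }
```

Any single file with a very large share, as with the Diagon Alley and Potions Master stories, is a sign that one work is driving the corpus-wide figure.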

I believe the overall trend of the Harry Potter fanfiction corpus using fewer world-specific terms will be borne out over a larger selection of works of fanfiction.

ANALYSES – CHARACTER RELATIONSHIPS

Zareen Farooqui models how relationships between characters change across the seven novels. (She offers more in-depth analysis of this using Word2Vec in another article.)

I performed a more basic test by searching for occurrences of the various trigrams in the format “character_A and character_B” and “character_B and character_A”, then visualising the results.
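As a rough illustration of that trigram search, here is a minimal Python sketch. The function name and the list of character names are hypothetical, and this is not necessarily the exact procedure I followed; it simply counts both orderings of each pair and adds them together.

```python
import re
from itertools import combinations

def pairing_counts(text, characters):
    """Count "A and B" plus "B and A" for every pair of character names."""
    lowered = text.lower()
    results = {}
    for a, b in combinations(characters, 2):
        total = 0
        # Search for both orderings of the pair as a three-word sequence.
        for first, second in ((a, b), (b, a)):
            pattern = r"\b{}\s+and\s+{}\b".format(
                re.escape(first.lower()), re.escape(second.lower())
            )
            total += len(re.findall(pattern, lowered))
        results[(a, b)] = total
    return results
```

Note that a search like this only finds characters who are named side by side, so it will miss pairings expressed in other ways, such as pronouns or surnames.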

The Harry Potter corpus has more pairings overall, perhaps because the novels are narrated in the third person and feature multiple characters, whereas fanfiction tends to focus on two main characters. “Hermione and Ron” is the most popular pairing in each, presumably because they are grouped together as separate from Harry, the focal character. A comparison of each character pairing:

“Harry and Draco” are paired far more often in the Harry Potter fanfiction corpus (the pairing occurs only once in the Harry Potter corpus). This reflects the ability of fanfiction to depict non-canonical friendships and relationships, and to depict relationships between characters of the same sex in a way that is less common in commercially published fiction, especially children’s fiction.

Since the sample of Harry Potter fanfiction used was so small, I looked at the total occurrences of various pairings (romantic and platonic) in the approximately 150,000 works of Harry Potter fanfiction on ArchiveofOurOwn.org to get a better idea of their popularity. As you can see, Harry/Draco is by far the most popular relationship depicted.

Pairing                 Occurrences in all HP fanfiction (%)
Harry/Draco             16.0
Harry/Snape             6.3
Harry/Ginny             3.5
Harry/Ron               1.2
Harry/Hermione          1.2
Harry/Hermione/Ron      0.5

I should note that for the relationships depicted in the works of Harry Potter fanfiction on ArchiveofOurOwn.org I only used the author’s tags regarding relationships, while for the analysis of relationships depicted in the Harry Potter and Harry Potter fanfiction corpora I searched the texts for trigrams, which will account for some difference.

CONCLUSIONS

These very basic forms of analysis have yielded some conclusions.

My analysis of word frequencies confirmed my intuition about more variable use of speech verbs in (this selection of) fanfiction and revealed usage of the present tense. My analysis of punctuation indicates the use of much shorter sentences in the Harry Potter fanfiction corpus, but, as the texts were fairly messy, I would like to investigate further before being definitive. My analysis of world-specific terms begins to give insight into which aspects of the Harry Potter novels are taken up as topics for fanfiction (Potions and Diagon Alley, not the Ministry of Magic, Divination or Defence Against the Dark Arts), but I feel a larger sample of fanfiction must be used to identify trends. My analysis of character pairings both reveals something about the Harry Potter novels’ narrative structures and indicates some of the key differences between fanfiction and published fiction. I would very much like to continue this analysis in the hope that it would reveal something interesting about how popular certain characters and character pairings are across Harry Potter fanfiction.

—————–

Tools used:

Python

BASH Command Shell

AntConc

Raw Graphs

Excel

Notepad++

 

Photo by Jack Anstey on Unsplash
