One data scientist over at Degenerate Estate has undertaken quite the project. Using Dark Lyrics as a source, the scientist, identified only as Iain, looked at 222,623 metal songs from 22,314 albums put out by 7,364 bands. Iain goes extremely in-depth into the process used to come up with the list, though for brevity's sake we're going to go right to the list.
The list is ranked by a "Metalness" score devised by Iain, which essentially measures how much more often a word shows up in metal lyrics than in everyday English. As Iain explains:
One approach might be to look at how the relative frequency of words change between the metal lyrics and the English language in general. To do this we need some sort of measure of what "standard" English looks like, and given I'm using NLTK for text processing, an easy comparison is to the brown corpus, a collection of documents published in 1961 covering a range of different genres (although it should be pointed out, no lyrics).
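For the curious, here is a rough sketch of what that kind of comparison could look like in Python. It is not Iain's actual code: the exact formula is an assumption (the natural log of the ratio between a word's relative frequency in the lyrics and in a reference text), and the two tiny word lists are made-up stand-ins for the Dark Lyrics data and the Brown corpus. The function names `relative_frequencies` and `log_frequency_ratio` are hypothetical too.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each lowercased word to its share of all tokens in the corpus."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def log_frequency_ratio(target_tokens, reference_tokens):
    """Score words by log(target frequency / reference frequency).

    Assumed formula, not confirmed by the source. Positive scores mean the
    word is more common in the target corpus, negative scores mean it is
    more common in the reference. Words missing from either corpus are
    skipped so we never divide by zero or take the log of zero.
    """
    target = relative_frequencies(target_tokens)
    reference = relative_frequencies(reference_tokens)
    shared = target.keys() & reference.keys()
    return {word: math.log(target[word] / reference[word]) for word in shared}


# Hypothetical toy corpora standing in for the lyrics and for "standard" English.
metal_tokens = "burn the soul burn the sword of the gods".split()
reference_tokens = (
    "the committee noted the fiscal report of the university and the soul".split()
)

scores = log_frequency_ratio(metal_tokens, reference_tokens)

# Print the shared words from most to least "metal".
for word, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{word} {score:.2f}")
```

With real data you would swap the toy lists for tokenized lyrics and a reference corpus such as the Brown corpus that ships with NLTK, and you would probably want a minimum word count so that rare words don't dominate the rankings.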
So if you're in a metal band, maybe try to avoid using these 20 most metal words. Everyone else sure as hell doesn't.
- burn 3.81
- cries 3.63
- veins 3.59
- eternity 3.56
- breathe 3.54
- beast 3.54
- gonna 3.53
- demons 3.53
- ashes 3.51
- soul 3.40
- sorrow 3.40
- sword 3.38
- goodbye 3.28
- dreams 3.28
- gods 3.24
- pray 3.22
- reign 3.15
- tear 3.12
- flames 3.12
- scream 3.11
Oh, and here are the 20 least metal words. Go ahead and try to find a use for these on your upcoming album!
- particularly -6.47
- indicated -6.32
- secretary -6.29
- committee -6.16
- university -6.09
- relatively -6.08
- noted -5.85
- approximately -5.75
- chairman -5.69
- employees -5.67
- attorney -5.66
- membership -5.64
- administrative -5.61
- considerable -5.60
- academic -5.51
- literary -5.49
- agencies -5.48
- measurements -5.47
- fiscal -5.45
- residential -5.45
Iain also adds this caveat about the study, noting the potential unfairness of these lists:
Of course, this is a slightly unfair comparison. It is not too much to believe that not just the content of the documents we compare affects the word frequency, but also the type of the document being compared. A relevant example would be that a song about a brutal murder would have a different word count to a news article on the same murder. A better measure of what constitutes "Metalness" would have been a comparison with lyrics of other genres, unfortunately I don't have any of these to hand.
Iain also notes that Pig Destroyer's lyrics are the most complex of any band in the dataset, while Five Finger Death Punch curses the most.