I recently came across a chart showing where letters are used in an English word. Here, the author took a corpus of English works and used it to count in which position a letter most often occurs.
This inspired me to adapt it for Scrabble use, in particular to address one question: where should I put a tile as I shuffle it on my rack? For many players, myself included, shuffling the tiles on the rack helps trigger visual cues to discover a word lodged somewhere in the recesses of the brain. However, shuffling through all possible combinations simply takes too much time, and you risk missing a word if its combination is never attempted. An intuition for where certain tiles end up most of the time may help alleviate this issue.
The chart below shows a sort of “heat map”, where the reddest colour indicates the position where a letter is most likely to occur in a word (and hence, where you should put that Scrabble tile more often when shuffling).
A few differences from the original chart on prooffreader.com:
– This chart uses the CSW12 word list, where each word occurs only once. The original chart uses actual English bodies of work, hence certain words which are more frequent (e.g. “the”) will skew the position of certain letters. That is relevant for linguistic exploration, but not for Scrabble.
– I use percentages rather than absolute counts. I’m not interested in how often V occurs overall in the dictionary. Instead, what I want to know is: if I have a V, then of all the places V appears in words, what percentage are in a particular position?
– I only include words 4 to 8 letters long. 2- and 3-letter words should be known cold by any serious Scrabble player, and longer words are far less useful unless you’re Nigel Richards (in which case you don’t need this heat map guide anyway).
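For anyone who wants to reproduce the idea, the counting can be sketched roughly as below. This is only a minimal sketch and not the code behind the chart: the helper name `position_shares`, the tiny made-up sample list, and the choice to report only the first and last positions are my own assumptions for illustration. A real run would read the full word list from a file instead.

```python
from collections import Counter

def position_shares(words, min_len=4, max_len=8):
    """For each letter, return the percentage of its occurrences that fall in
    the first and in the last position of a word (hypothetical helper)."""
    total = Counter()   # all occurrences of each letter
    first = Counter()   # occurrences as the first letter
    last = Counter()    # occurrences as the last letter
    # set() makes sure each word is counted once, as in a word list
    for word in set(w.upper() for w in words):
        if not (min_len <= len(word) <= max_len):
            continue    # skip words outside the 4-8 letter range
        for i, letter in enumerate(word):
            total[letter] += 1
            if i == 0:
                first[letter] += 1
            if i == len(word) - 1:
                last[letter] += 1
    return {
        letter: {
            "first": 100 * first[letter] / total[letter],
            "last": 100 * last[letter] / total[letter],
        }
        for letter in total
    }

# Tiny made-up sample; CAT is ignored because it is shorter than 4 letters.
sample = ["VERY", "VIVA", "HEAVY", "CAT", "NAVY", "LEVY"]
shares = position_shares(sample)
print(shares["Y"], shares["V"])
```

Dividing by each letter's own total is what turns the counts into percentages, so a rare letter like V can be compared directly with a common one like E.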
This chart confirms some well-known assumptions, e.g. S is very valuable as an ending due to its presence in plurals and third-person present tense forms. Ditto the ending-dominant D due to -ED, and G due to -ING.
That Y is found at the end is not surprising; its percentage is. 51% of the time Y is found at the end of a word, even more than S, which is at about 49%. No doubt -CY, -ITY, -LY etc. all add to it, but the high percentage is mainly because Y is hardly present elsewhere. So, don’t put your Ys on the left of your rack as you shuffle.
S, on the other hand, is frequently found even in non-plurals; in particular, S as the first letter seems to be the most frequent among the one-pointers. This is something that many beginners seem to forget: try shuffling S to other parts of the rack too, and you’d be surprised how flexible it is.
The chart also confirms an intuition that already helped me find words much faster: mid-valued tiles (3-4 pointers) are mainly dominant as the first letter, with the exception being H, possibly due to its many digraphs (-CH, -PH, -SH, -GH).
Among the power tiles, Q and J are front-heavy, X is end-heavy, and Z is pretty flexible, highlighting why it’s the best tile among the power tiles. K, however, surprised me as I have tended to put it at the front when shuffling; I guess now I know better.
I further split the heat map by word lengths, to see if there are any changes to the pattern within the same letter.
Where there is a grey spot in a square in the chart above, that means the letter never appears in that position (e.g. Q will never be found in the 6th position of a 7-letter word, but appears – albeit very infrequently – as the last in a 7-letter word).
A general observation is that for the longer words, the colours for the same letter are less contrasting, i.e. the letters are more likely to be found all over the place.
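The per-length split can be sketched along the same lines. Again, this is an illustrative sketch under my own assumptions rather than the code behind the charts: `position_table` is a hypothetical helper and the sample words are just a handful picked by hand. A value of 0.0 in the output corresponds to a grey square.

```python
from collections import Counter, defaultdict

def position_table(words, length):
    """Percentage of each letter's occurrences at each position, using only
    words of exactly `length` letters (hypothetical helper)."""
    counts = defaultdict(Counter)   # letter -> Counter mapping position -> count
    for word in set(w.upper() for w in words):
        if len(word) != length:
            continue
        for i, letter in enumerate(word):
            counts[letter][i] += 1
    table = {}
    for letter, by_pos in counts.items():
        total = sum(by_pos.values())
        # Counter returns 0 for positions where the letter never appears
        table[letter] = [100 * by_pos[i] / total for i in range(length)]
    return table

# Hand-picked sample of 4-letter words, for illustration only.
sample = ["QUIZ", "QUAD", "AQUA", "JAZZ", "ZEAL"]
table = position_table(sample, 4)
print(table["Q"])
print(table["Z"])
```

Calling it once for each length from 4 to 8 gives one table per chart.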
Some letter-specific observations:
– From 5-letter words onwards, E occurs most frequently in the second-last position. See one row above, and you’ll notice it moves in tandem with D’s hot spot in the last tile position, confirming the prevalence of -ED.
– B, F, P and W notably lose heat as the last letter as the words grow longer: they are still reasonably common end letters in 4-letter words, but practically non-existent for bingo-length words. (Though I gleefully remember an opponent playing a P to the top-right TWS to empty the bag, reasoning that there are hardly any long words ending with P; I hastily rearranged my homeless HEISTER to plonk down TREESHIP, bingo out and eke out a win.) W is the learning point for me, as I had somehow intuited that there are reasonably many words ending with -OW; apparently not enough. Also worth noting that B is not common even in the second-last position.
– S, for all its flexibility, is pretty useless as a second letter.
– One-pointer consonants (L, N, R, T) are as expected very flexible and present everywhere. N does seem more prominent as the second last letter, presumably due to -ING.
– A, O, and U are particularly frequent in the second position; I guess second syllables onwards use more E and I.
– C ending is not as common as I thought – possibly a wrong intuition due to my tendency to look for -IC forms.