
Part 7: Word and Phrase Usage

Words have meaning, as do phrases. In this blog post we'll look at which words and phrases occur most frequently in the titles of LEGO Ideas submissions. Can they provide any insight? Maybe it's all just for fun. Let's find out.

This post looks at the entire period from Jan 1, 2020, to Feb 20, 2024. In reality, words and phrases ebb and flow over time. This is especially true of IP such as Indiana Jones and Lord of the Rings. As movies, musicians, etc., rise and fall in popularity, so do the submissions for that IP. On the other hand, non-IP submissions tend to remain fairly steady. Examples include classic space, modular building, etc.

This post is not going to do a deep dive to try and determine which words and phrases were most popular during a specific time within the date range. Rather, it looks at the most frequent words and phrases over that entire time period.




All words were first 'normalized' before a frequency count was done. Normalizing just means all the words were formatted the same prior to determining frequency. This included removal of punctuation, symbols, emojis, dashes, etc. Additionally, common words such as and, the, or, etc., were not included.
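
For readers who want to try this themselves, here is a minimal sketch of the normalize-then-count step. It is not the exact process used for this post: the stop-word list, the function names, and the sample titles are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed stop-word list. The post only names "and", "the", "or" as examples.
STOP_WORDS = {"and", "the", "or", "a", "an", "of", "in", "to", "for", "on", "with"}

def normalize(title):
    """Lowercase a title, strip punctuation/symbols/emojis, and drop stop words."""
    # Anything that is not an ASCII letter, digit, or whitespace becomes a space.
    # Assumption: accented and other non-ASCII letters are dropped as well.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", title.lower())
    return [word for word in cleaned.split() if word not in STOP_WORDS]

def top_words(titles, n=20):
    """Return the n most frequent normalized words across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(normalize(title))
    return counts.most_common(n)

# Made-up titles, for illustration only.
sample_titles = [
    "The Legend of Zelda: Hyrule Castle!",
    "Classic Space - Moon Base",
    "Modular Building: The Book Shop",
    "Space Shuttle & Launch Pad",
]
print(top_words(sample_titles, n=1))
```

Note that this sketch does not merge singular and plural forms, so legend and legends would be counted separately unless an extra step is added.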


Question 1
What are the top 20 words appearing in LEGO Ideas submissions?


Items of note:
  • The only IP-related entry is legend(s) with the majority of the 95 entries coming from Legend of Zelda.


Question 2
What are the top 20 phrases appearing in LEGO Ideas submissions?


Items of note:
  • A much larger portion of the top phrases are made up of IP entries (7 of 20) when compared to the top word entries (1 of 20).
  • Two of the top five phrases come from the same IP, Zelda: Legend of Zelda and Breath of the Wild.
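
A rough sketch of how phrases could be counted is below. It assumes a phrase is any run of two to four consecutive title words that neither starts nor ends with a common word, which lets phrases like breath of the wild survive. That rule, the stop-word list, and the sample titles are assumptions for illustration, not necessarily how the numbers in this post were produced.

```python
import re
from collections import Counter

# Assumed stop-word list (the post does not publish its own).
STOP_WORDS = {"and", "the", "or", "a", "an", "of", "in", "to", "for", "on", "with"}

def tokenize(title):
    """Lowercase a title and split it into words, dropping punctuation and symbols."""
    return re.sub(r"[^a-z0-9\s]", " ", title.lower()).split()

def phrases(tokens, min_len=2, max_len=4):
    """Yield runs of consecutive words that neither start nor end with a stop word."""
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] not in STOP_WORDS and gram[-1] not in STOP_WORDS:
                yield " ".join(gram)

def top_phrases(titles, n=20):
    """Return the n most frequent phrases, counting each phrase once per title."""
    counts = Counter()
    for title in titles:
        counts.update(set(phrases(tokenize(title))))
    return counts.most_common(n)

# Made-up titles, for illustration only.
sample_titles = [
    "The Legend of Zelda: Hyrule Castle",
    "Legend of Zelda - Breath of the Wild Stable",
    "Classic Space Rover",
]
print(top_phrases(sample_titles, n=1))
```

One side effect of this approach is that longer phrases also produce their shorter pieces, so a real run would likely need a manual pass to pick the meaningful ones.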


Question 3
Were the top words/phrases the same throughout the four-year data period?

The simple answer is... I don't know for sure as I didn't specifically analyze that. However, I think we can safely assume the following:

  1. Words/phrases that are not IP-based would remain fairly static in their frequency over the entire four-year period.
    • Words like house, car, space and phrases like classic space, sports car, modular building are used every year by the LEGO Group in titles of their sets. Similarly, it would be expected that submissions would also follow that same pattern.

  2. Words/phrases based on IP such as movies, tv, music, etc., will tend to fluctuate based on the popularity of that IP at any given time. 
    • A great example of this is Indiana Jones, which comes in at #8 for phrase frequency. However, if you examine the submissions currently gathering support, you'll find there are only three. What gives?!?
    • Indiana Jones appears in the title of sets as follows: 2020 - 17 times; 2021 - 6 times; 2022 - 2 times; 2023 - none.
    • It looks like Indiana Jones peaked during 2020, so what was happening around then? It was the year before the 40th anniversary of Raiders of the Lost Ark, the first movie in the franchise. I can imagine entries increased in anticipation of LEGO possibly producing a set to celebrate the occasion. Then they began to decline as time went on.
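
A year-by-year count like the Indiana Jones one above could be produced with something along these lines. This is only a sketch: the (year, title) pair layout, the function name, and the sample rows are assumptions, and the sample rows are made up rather than taken from the real data.

```python
from collections import Counter

def phrase_by_year(submissions, phrase):
    """Count titles containing `phrase` (case-insensitive), grouped by year."""
    needle = phrase.lower()
    counts = Counter()
    for year, title in submissions:
        if needle in title.lower():
            counts[year] += 1
    # Sort by year so the trend reads left to right
    return dict(sorted(counts.items()))

# Made-up (year, title) rows, for illustration only.
sample = [
    (2020, "Indiana Jones: Temple Escape"),
    (2020, "Indiana Jones and the Map Room"),
    (2021, "Indiana Jones Biplane"),
    (2021, "Classic Space Outpost"),
    (2022, "Modular Toy Store"),
]
print(phrase_by_year(sample, "Indiana Jones"))  # {2020: 2, 2021: 1}
```

Years with no matches are simply absent from the result, so a year like 2023 would need to be filled in as zero when charting.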


Question 4
Can we use the words/phrases data to determine the most popular IP?

Of course! In fact, that is a MUCH better idea than trying to categorize each of the 12,188 submissions. It won't be perfect, as it will miss entries that are based on an IP but did not include it in their title... but it will be close.
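
One way this could work is to map known phrases to their IP and count the titles that mention any of them. The sketch below assumes a small hand-built lookup table; the table contents, the function name, and the sample titles are all hypothetical.

```python
from collections import Counter

# Hypothetical phrase-to-IP lookup. A real one would be built from the top phrases.
IP_PHRASES = {
    "Zelda": ["legend of zelda", "breath of the wild"],
    "Indiana Jones": ["indiana jones"],
    "Lord of the Rings": ["lord of the rings"],
}

def rank_ips(titles, ip_phrases=IP_PHRASES):
    """Rank IPs by how many titles mention at least one of their phrases."""
    counts = Counter()
    for title in titles:
        lowered = title.lower()
        for ip, phrase_list in ip_phrases.items():
            # any() ensures a title counts once per IP even if two phrases match
            if any(phrase in lowered for phrase in phrase_list):
                counts[ip] += 1
    return counts.most_common()

# Made-up titles, for illustration only.
sample_titles = [
    "The Legend of Zelda: Breath of the Wild Stable",
    "Breath of the Wild - Guardian",
    "Indiana Jones Biplane",
    "Medieval Blacksmith",
]
print(rank_ips(sample_titles))  # [('Zelda', 2), ('Indiana Jones', 1)]
```

As noted above, a submission that is based on an IP but never names it in the title will not be counted by this kind of matching.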


WAIT! What about Disney!?!?!

Without a doubt, Disney is the number one company that has IP represented in LEGO Ideas. However, LEGO Ideas treats each individual 'property' as an IP. For example, Beauty and the Beast, Disney Princesses, and Star Wars are all considered individual IPs even though they are all owned by Disney. This is done so that an IP already produced can be blocked from submission, while still allowing other Disney properties to be proposed.






Idiosyncratic - real words that seem fake




Neologisms - newly coined invented words





Jabberwocky - made up words by the submitter











up next: What makes a submission "Most Popular"?