
I compared the distribution of first digits in TikTok video like counts to the distribution expected under Benford's law. Benford's law predicts that lower digits, such as 1, appear as the first digit more often than higher digits. My data followed a similar trend, with the frequencies generally getting lower as the digits got higher, but it didn't follow the exact pattern.
I chose TikTok likes as my data set because TikTok is an app I use every day, and relating the law to something familiar made it a little easier for me to understand the idea that it can apply to a seemingly random data set. I had no idea that there would be any pattern in the first-digit frequencies of data that seems so random. I collected the data by recording the number of likes on each video and then taking the first digit of that number. I went through 113 videos.
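The tallying step described above could also be done in code. This is only a minimal sketch in Python, assuming the like counts have already been written down as whole numbers. The function names and the sample values are made up for illustration and are not the 113 videos from my data set.

```python
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading digit of a nonzero whole number."""
    return int(str(abs(n))[0])


def first_digit_frequencies(values):
    """Map each digit 1-9 to its share of the leading digits in values."""
    # Skip zeros, since a count of 0 has no leading digit from 1 to 9.
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one nonzero value")
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Hypothetical like counts, invented for this example only.
sample_likes = [1523, 12, 284000, 1900, 37, 110, 9421, 15, 2300, 18]
freqs = first_digit_frequencies(sample_likes)
```

Here `freqs[1]` is the fraction of the sample values that start with 1, and the nine fractions add up to 1, so they can be compared directly with the Benford percentages.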
There were discrepancies in the comparison. Benford's law shows a gradual decline, whereas my frequencies went down overall but jumped up and down along the way, for example between 2 and 3.
Some data sets may not work for Benford's law. For example, the sample size may be too small to get an accurate result. It could also be because the data is human-made and is influenced by human decisions.
Benford's law can be used on many sets of numerical data to look for something wrong with the data set. If the data set tested is completely off from the pattern of Benford's law, it can suggest that there is something wrong with that data set.
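One way to put a number on how far off a data set is from the pattern is a chi-square goodness-of-fit statistic. The sketch below is an assumption about how such a check might be done, not the method used for my TikTok data. The function names and the uniform example counts are hypothetical.

```python
import math


def benford_expected(d: int) -> float:
    """Probability that the first digit is d under Benford's law."""
    return math.log10(1 + 1 / d)


def chi_square_vs_benford(observed_counts):
    """Chi-square statistic for digit counts (dict: digit 1-9 -> count)."""
    n = sum(observed_counts.get(d, 0) for d in range(1, 10))
    if n == 0:
        raise ValueError("need at least one observation")
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_expected(d)  # count Benford's law predicts
        stat += (observed_counts.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical example: every digit appears equally often (made-up counts).
uniform_counts = {d: 100 for d in range(1, 10)}
stat = chi_square_vs_benford(uniform_counts)

# With 9 digits there are 8 degrees of freedom; the usual 5% cutoff is
# about 15.51, so a statistic above that suggests a poor fit.
poor_fit = stat > 15.51
```

A large statistic only says the digits do not match Benford's law. It does not say why, and with a small sample the result can be unreliable.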
Benford's law relies on a log scale because it reflects how numbers grow in natural systems. Numbers grow in a way that is not evenly spaced. For example, going from 1 to 2 is doubling, while going from 8 to 9 is a much smaller relative jump (an increase of only 12.5%). This kind of growth can be described using logarithms, which circles back to Benford's law.
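The link between the log scale and the first-digit frequencies can be written out as a short formula. This is the standard statement of Benford's law, where $d$ stands for the first digit and $P(d)$ for how often it is expected to appear. On a log scale, the share for each digit is the distance between $d$ and $d+1$:

```latex
P(d) = \log_{10}(d + 1) - \log_{10}(d) = \log_{10}\!\left(1 + \frac{1}{d}\right),
\qquad d = 1, 2, \dots, 9

P(1) = \log_{10}(2) \approx 0.301
\qquad
P(9) = \log_{10}\!\left(\tfrac{10}{9}\right) \approx 0.046
```

So a first digit of 1 is expected about 30% of the time, while a first digit of 9 is expected less than 5% of the time, which matches the idea that the step from 1 to 2 is much bigger on a log scale than the step from 8 to 9.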