Intro and disclaimer
About two months ago, my friend Vasyl Sergienko was looking for a proofreader on Upwork when he stumbled across a pretty intriguing offer.
“Why would anyone pay for randomly giving 50 claps to random articles?” he asked me. There could be only one explanation: somebody is building a bot network of active profiles. It is easy to assume that those contractors would later be offered money for clapping and promoting posts of interest.
I became curious: how big are those networks? Who offers these services, and how can we locate those bots?
“This is it, a perfect chance to play at investigative journalism,” I said to my friend Vasyl Sergienko.
We split up here. Vasyl Sergienko searched the internet for paid clapping offers. They weren’t hard to find. After some negotiation he was able to order 3000 claps for some of his articles. Delivery was super fast: within a few minutes 3 of his articles were boosted to 3k+ claps, coming from a handful of users. About 50 to 60 users gave him 50 claps each.
This immediately gave me an idea. We could calculate a “Clap Ratio”: any unrealistic value should highlight an article promoted in such a stupid way. Using this metric and this initial seed of “fake” users, we could uncover the whole network!
To give this a try, I wrote a simple Python parser. This is against Medium’s Terms and Conditions, so I would like to apologize for that here. I tried my best to parse only a small subset of Medium and mostly focused on the social network of users and articles associated with the profiles of users who participated in “paid clapping”. I also had to parse some articles that looked “normal” to me, to have a comparison with typical non-paid user activity.
The reported results are likely not really representative and are biased, but they are still interesting to think about. I hope Medium staff will respect this approach.
Let’s start with a general data overview. I collected information about 10,000 articles and 1,300,000 users who interacted with those articles. The total number of CLAP interactions collected was around 3,000,000. To check my initial hypothesis of bot networks giving around 50 claps per user, I calculated the Clap Ratio (the number of claps divided by the number of users who clapped) for each article and plotted it.
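To make the metric concrete, here is a minimal sketch of the Clap Ratio as defined above. The function name and the sample numbers are illustrative assumptions, not the parser’s actual code; the second example mirrors the paid order described earlier (3000 claps from about 60 users).

```python
def clap_ratio(total_claps: int, clapping_users: int) -> float:
    """Claps divided by the number of distinct users who clapped."""
    if clapping_users == 0:
        # No one clapped: treat the ratio as 0 instead of dividing by zero.
        return 0.0
    return total_claps / clapping_users


# Hypothetical article with ordinary engagement.
organic = clap_ratio(120, 20)

# Numbers in the spirit of the paid order: 3000 claps from 60 users.
boosted = clap_ratio(3000, 60)

print(organic, boosted)
```

A ratio near the 50-clap maximum means almost every user who clapped gave the full 50 claps, which is what the paid delivery looked like.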
This plot shows a few distinct clusters of articles.
- Low clap articles. Those are the articles receiving 1–2 claps per user.
- Typical clap articles. It seems that a “normal” amount of claps per article lies around 4–10 claps per user.
- High clap ratio. The long-tail articles that have 20+ claps per user, with a noticeable spike at 25 claps (ha-ha, smart bots!) and a peak at 45–50 claps (only bots clapped those articles).
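The three clusters above can be turned into a rough labeling rule. This is only a sketch: the function name is hypothetical, and because the ranges quoted above leave gaps (between 2 and 4, and between 10 and 20), ratios in those gaps are deliberately left unlabeled rather than guessed.

```python
def label_cluster(ratio: float) -> str:
    """Map a Clap Ratio to one of the clusters described in the list above."""
    if ratio >= 20:
        return "high"       # long tail, 20+ claps per user
    if 4 <= ratio <= 10:
        return "typical"    # "normal" range, 4-10 claps per user
    if 1 <= ratio <= 2:
        return "low"        # 1-2 claps per user
    # The article gives no label for values between the quoted ranges.
    return "unlabeled"


for r in (1.5, 6.0, 25.0, 50.0, 3.0):
    print(r, label_cluster(r))
```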
Before diving deeper, here is another visualisation to help you understand those clusters. Let’s just plot the number of claps against the number of users who clapped for each article.
The green line corresponds to purely “fake” articles, orange seems to be typical Medium user behavior, and blue is the low-clap, many-users cluster.
It was pretty easy to detect the “stupid bots” network. To do that, I found the subset of users who clapped for at least 4 articles in the “fake” cluster. Let’s just assume this criterion is enough. Of course, this is arguable, and more advanced detection methods would do a better job.
I discovered a group of 1000 users using this criterion. Then I traced back to see which articles they clapped for, using a similar criterion: an article is considered “promoted” if it was clapped by more than 5 bad actors. Not perfect, but something.
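The two thresholds can be sketched in a few lines of Python. This is a simplified illustration, not the original parser: the function names, the `(user, article)` pair format, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def find_suspects(claps, fake_articles, min_fake_articles=4):
    """Users who clapped for at least `min_fake_articles` articles in the fake cluster."""
    fake_seen = defaultdict(set)
    for user, article in claps:
        if article in fake_articles:
            fake_seen[user].add(article)
    return {u for u, arts in fake_seen.items() if len(arts) >= min_fake_articles}


def find_promoted(claps, suspects, min_suspects=5):
    """Articles clapped by more than `min_suspects` suspect users."""
    suspects_per_article = defaultdict(set)
    for user, article in claps:
        if user in suspects:
            suspects_per_article[article].add(user)
    return {a for a, users in suspects_per_article.items() if len(users) > min_suspects}


# Toy data (hypothetical): six bot-like users clap for four fake articles and "x".
fake_articles = {"f0", "f1", "f2", "f3"}
claps = [(f"b{i}", a) for i in range(6) for a in sorted(fake_articles) + ["x"]]
claps += [("normal", "f0"), ("normal", "x"), ("normal", "y"), ("b0", "y")]

suspects = find_suspects(claps, fake_articles)
promoted = find_promoted(claps, suspects)
print(sorted(suspects))
print(sorted(promoted))
```

In this toy data the user `normal` touches only one fake article and is not flagged, and article `y` is clapped by a single suspect, so it stays below the “more than 5 bad actors” threshold.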
“Promoted” articles were found in the most prominent Medium blogs, including HackerNoon (I hope to publish this article there, lol).
If we look at the tags those articles were posted under, it starts to make some sense.
ICO, blockchain and cryptocurrency projects are the typical customers of fake bot networks. One of the heaviest is the so-called Ubex AI, which receives claps mostly from bot networks; every one of its articles has 3–4k claps.
Other strange and prominent customers are the writers of various poetry blogs. V. Plut, for example, seems to promote the majority of his posts, as do around 170 other poets.
What’s interesting here is that not all articles marked by bad actors’ claps exhibit an unrealistic clap ratio.
Take, for example, the poet A Maguire. She writes for Intimately Intricate, and all of her posts have 2–3K claps.
Lots of hidden trends.
To keep this article digestible I am stopping here, but the research goes deeper. There are lots of cool and strange insights related to the promotion of articles that are eligible for Revenue Share from Medium and blocked for non-subscribers, and to more complicated bots and promotion strategies. For now I just wanted to share these initial findings and see what you guys think.
Subscribe to our blog, and remember: you can give up to 50 claps per article.