Scouring billions of links for 6 years showed us the web is both expanding and shrinking

May 2, 2021

The online world is continuously expanding — always aggregating more services, more users and more activity. Last year, the number of websites registered on the “.com” domain surpassed 150,000,000.

However, more than a quarter of a century since its first commercial use, the growth of the online world is now slowing down in some key categories.

We conducted a multi-year research project analyzing global trends in online diversity and dominance. Our research, published today in Public Library of Science, is the first to reveal some long-term trends in how businesses compete in the age of the web.

We saw a dramatic consolidation of attention towards a shrinking (but increasingly dominant) group of online organisations. So, while there is still growth in the functions, features and applications offered on the web, the number of entities providing these functions is shrinking.

Web diversity nosedives

We analysed more than six billion user comments from the social media website Reddit dating back to 2006, as well as 11.8 billion Twitter posts from as far back as 2011. In total, our research used a massive 5.6TB trove of data from more than a decade of global activity.

This dataset was more than four times the size of the original data from the Hubble Space Telescope, which helped Brian Schmidt and colleagues do their Nobel Prize-winning work in 1998 showing that the universe’s expansion is accelerating.

With the Reddit posts, we analysed all the links to other sites and online services — more than one billion in total — to understand the dynamics of link growth, dominance and diversity through the decade.
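To make the link analysis concrete, here is a minimal sketch of how domains might be pulled out of comment text and tallied. It is an illustration only, not the pipeline used in the research: the regular expression, the `extract_domains` helper and the sample comments are all assumptions, and the hostname (minus a leading "www.") stands in for the domain, which is a simplification.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical pattern: anything that starts with http:// or https:// and runs
# until whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def extract_domains(comment):
    """Return the hostnames of all links found in one comment (illustrative helper)."""
    domains = []
    for url in URL_PATTERN.findall(comment):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal): skip it.
            continue
        if not host:
            continue
        host = host.lower()
        if host.startswith("www."):
            host = host[4:]
        domains.append(host)
    return domains


# Made-up sample comments, not data from the study.
comments = [
    "Watch https://www.youtube.com/watch?v=abc and https://youtu.be/xyz",
    "See (http://example.org/page) for details",
    "Another clip: https://youtube.com/watch?v=def",
]

# Count how many links point at each domain across all comments.
domain_counts = Counter(d for c in comments for d in extract_domains(c))
print(domain_counts.most_common())
```

A tally like this is the raw material for questions about dominance (which domains attract the most links) and diversity (how many distinct domains appear at all).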

We used a measure of link “uniqueness”. On this scale, 1 represents maximum diversity (every link points to a different domain) and 0 represents minimum diversity (all links point to a single domain, such as “youtube.com”).

A decade ago, there was a much greater variety of domains within links posted by users of Reddit, with more than 20 different domains for every 100 random links users posted. Now there are only about five different domains for every 100 links posted.
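One simple way to read this measure is as the share of distinct domains in a sample of links. The sketch below assumes that reading, which is inferred from the description above rather than taken from the published method; the `link_uniqueness` function and the placeholder domain names are hypothetical.

```python
def link_uniqueness(domains):
    """Distinct domains divided by total links (assumed reading of 'uniqueness')."""
    if not domains:
        raise ValueError("need at least one link")
    return len(set(domains)) / len(domains)


# 100 links spread evenly over 20 placeholder domains: roughly a decade ago.
earlier = [f"site{i % 20}.example" for i in range(100)]

# 100 links spread over only 5 placeholder domains: roughly the recent picture.
recent = [f"site{i % 5}.example" for i in range(100)]

print(link_uniqueness(earlier))  # 0.2
print(link_uniqueness(recent))   # 0.05

# The two ends of the scale for a 100-link sample.
print(link_uniqueness([f"site{i}.example" for i in range(100)]))  # 1.0
print(link_uniqueness(["youtube.com"] * 100))                     # 0.01
```

Under this reading, a sample in which every link points to one domain scores 1/N rather than exactly 0, so the lower end of the scale is approached as the sample grows.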
