Insights to Incite | Transcript: Watching Algorithms

April 22, 2021 • 17 Minutes

Watching Algorithms

Gen Z grew up with streaming video. These platforms use databases and personalization algorithms to keep our attention. What is this doing to us?

I have one grandfather who is still alive today. He was born in 1927. That same year, the newspapers reported on a new breakthrough of science: television.

One newspaper called it “a telephone with eyes.” It wasn’t until some two decades later in the late 1940s, when my father was born, that television would become commercially popular.

When I was born in the early 1980s, cable television was the new hot thing. No longer were viewers confined to the 3 major broadcast networks of ABC, NBC and CBS. We now had nearly a hundred channels to choose from.

Before cable television, watching television was free. Free with advertising that is. But once cable television came out, and people started paying subscription fees... commercials didn’t disappear.

In 1999, around the time I was about to graduate high school, TiVo completely upended the broadcast industry. It allowed people to watch their favorite TV shows whenever they wanted. You no longer had to watch a TV show at its scheduled broadcast time.

Another bonus? You could skip commercials. When I reflect back on this period, I’m struck by how broadcasters squeezed out profits. Once I had a TiVo, I thought it was only a matter of time until commercials completely disappeared. I was convinced that once we went digital, product placement with the ability to buy on demand was going to be the thing that TV networks invested in. I was so convinced that as an undergraduate student, I actually pitched this idea as part of a group research project on cable television conglomerates.

They didn’t though. While cable companies originally marketed to people that lived in areas that broadcast signals had a hard time reaching, it didn’t take long for most people to start paying for television. In fact, I’m still surprised that many people don’t realize they can still get broadcast channels for free with a digital antenna.

TiVo though, you know what it did allow? Binging.

My parents have a VHS tape of me running around our backyard with my brother when I was 5 years old. My dad had borrowed my uncle’s handheld camera, which was prohibitively expensive for the average person.

My childhood was largely unrecorded. There isn’t much evidence of me being stupid, not because I didn’t do anything stupid, but because digital photography and video were in their infancy.

The first streaming platform arrived in 1994 with Real Networks, but it was streaming audio. The bandwidth didn’t exist to stream much else. But it felt revolutionary to listen to a radio broadcast over the internet. Being an early technology adopter, I had started posting videos on the internet before YouTube launched. It was an involved process. Footage had to be digitized before it could be edited on a computer. The amount of storage required versus what was cost effective at the time made it inaccessible to most people. And the quality? It sucked.

YouTube launched 15 years ago, in 2005. For me, it didn’t seem revolutionary. It just made it a lot easier to share videos. The first clips on YouTube were only a few minutes long. This was a restriction based on bandwidth. Yet that restriction created a clip culture.

When YouTube started, it was all about amateurs sharing their videos. You can see this with some of the first viral videos. Charlie Bit My Finger was uploaded in 2007. It was less than a minute long. By 2009, it had 130 million views. Judson Laipply digitized a 6 minute bit from his comedy routine called The Evolution of Dance. The video had 70 million views in under 8 months and became the most viewed video on YouTube.

If you became internet famous during this period, you would be asked a question like, “you’re famous because of something you did on the internet?” The interesting part of the instant fame wasn’t what you were actually doing in the video. Becoming famous on the internet was the aberration.

There was a really short window of time when amateurs made most of the content. Scholars thought that this might be the start of a new shift in democracy. This democratization would allow for more voices to be heard. It’s the shift from read to read/write.

But it didn’t take long for the professional media to dominate streaming. In 2013, 8 years after the launch of YouTube, Gangnam Style became the first YouTube video to surpass a billion views. The top ranked YouTube videos are primarily from record labels. The industry has finally caught on and has begun mastering the attention economy. It’s now normal to think of signing someone to YouTube as a performer.

This is the professionalization of YouTube. The rise of the influencer and alternative media. There are definitely more voices than before, but at the end of the day, most of the voices are focused on their brand.

The internet definitely allows for new genres to break through that would have never seen the light of day in the broadcast television paradigm. Autonomous Sensory Meridian Response, or ASMR, is one of those genres. These are long videos of people whispering and making sounds that, for some listeners, create a pleasant tingling sensation on the scalp and upper neck.

There are videos that document the release of fluid from pimples, cysts, and other abscesses, a genre known as popping. I was surprised to find out that Dr. Pimple Popper practices in the city I grew up in.

Unboxing and haul videos are definitely a product of capitalism. Oddly satisfying videos are inexplicably rewarding to watch. The oddest new genre for me is the mukbang: a live stream of a host binge-eating large quantities of food, talking to their audience.

The internet offers immediate access to an unprecedented supply of knowledge, with videos by amateurs and professionals offering to inform, interpret, and instruct.

The promise of the internet is not only that we can learn anything, but that each of us has something to teach.

We create videos in regular conversation with communities. We source material as an ongoing dialogue, with increasingly low barriers allowing for the most casual of participation.

There’s an ever growing list, and all of these new genres are popular in part because they scratch an itch that can’t be found elsewhere. But for these videos to be dropped into your feed, what does it take?

To understand the algorithms that make up our internet reality, perhaps it is easiest to start at the beginning. If a website is available on the internet, but no one visits it, does it exist?

The first method of finding websites on the internet was just pages with a collection of known links. Yahoo, one of the first search engines, asked website owners to submit their links. When you submitted a site, Yahoo would ask you to categorize it. These categories were constructed by Yahoo.

Then came the crawlers. A crawler is a piece of software that systematically browses the web. It starts with a list of URLs to visit. For Yahoo, this would be the list of sites that users have already submitted. From those pages, the crawler identifies all the hyperlinks in those pages and adds them to the list of URLs to visit. Then it repeats the process.
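The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `FAKE_WEB` dictionary stands in for the real internet, the `.example` addresses and the names `LinkExtractor` and `crawl` are made up for this sketch, and nothing is actually fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical miniature "web": address -> page HTML.
# A real crawler would download each page over HTTP instead.
FAKE_WEB = {
    "a.example": '<a href="b.example">B</a> <a href="c.example">C</a>',
    "b.example": '<a href="c.example">C</a>',
    "c.example": '<a href="a.example">A</a> <a href="d.example">D</a>',
    "d.example": "no links here",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, web):
    """Visit the seed list, then every newly discovered link, once each."""
    to_visit = deque(seed_urls)   # the list of URLs to visit
    seen = set(seed_urls)         # avoids queueing the same URL twice
    visited_order = []
    while to_visit:
        url = to_visit.popleft()
        html = web.get(url)
        if html is None:          # unknown page: nothing to read
            continue
        visited_order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # add new hyperlinks to the list
                seen.add(link)
                to_visit.append(link)
    return visited_order


print(crawl(["a.example"], FAKE_WEB))
```

Starting from a single submitted site, the sketch ends up visiting all four pages, which is the same way a small seed list can grow into a large index.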

The number of internet pages is extremely large. The largest crawlers still fall short of making a complete index. In 2009, a study showed that large scale search engines indexed no more than 40-70% of the indexable web.

In the early years of the web, search engines also struggled to give relevant results. Then along came Google. Google’s search engine uses an algorithm called PageRank. It is a link analysis algorithm that assigns a numerical weight to each element of a hyperlinked set of documents.

Its purpose is to measure the relative importance of a document within the set. The name “PageRank” plays off the name of developer Larry Page along with the concept of a web page. Page and co-founder Sergey Brin were graduate students at Stanford at the time. BTW, Stanford owns the patent for PageRank.
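To make the idea of a numerical weight per document concrete, here is a minimal sketch of the textbook version of PageRank. It is not Google's production system. The four-page site, the damping value of 0.85, the iteration count, and the function name `pagerank` are all assumptions chosen for illustration, and the sketch assumes every linked page also appears as a key in the dictionary.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal weight
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank to everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site used only for this example.
site = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

for page, weight in sorted(pagerank(site).items(), key=lambda item: -item[1]):
    print(f"{page}: {weight:.3f}")
```

In this toy site, "home" ends up with the highest weight because every other page links to it, while "orphan" ends up with the lowest because nothing links to it.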

Today, Google uses over 200 ranking factors in their algorithm. They include factors like the domain name, the level of the page, the site, backlinks, user interaction, brand signals, webspam, and of course special Google rules.

To give you an idea of how this works, take keywords. A site with the keyword first in its domain name has an edge over sites that don’t have it in the domain name or that have it later in the domain name. Google used to scan pages to see if they contained the exact keyword someone searched for.

That was before they invented RankBrain. RankBrain has two jobs: understanding search queries and then measuring how people interact with the results. Google measures satisfaction first by seeing what link you clicked on. If lots of people like that particular page, it gets a ranking boost. If they don’t, it gets dropped. It also measures the dwell time, bounce rate and pogo-sticking.

Might you be wondering what pogo-sticking is? Have you ever clicked on the first search result, only to see that it’s not what you wanted, so you hit the back button and check out the next result? That’s pogo-sticking.
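Pogo-sticking can be spotted in a click log with a few lines of Python. This is a rough sketch and not how Google measures it: the `Click` record, the 10 second cutoff, and the function name `pogo_stick_counts` are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Click:
    """One click on a search result (hypothetical log format)."""
    url: str
    dwell_seconds: float        # how long the user stayed on the page
    returned_to_results: bool   # did they hit the back button?


# Assumed cutoff: a visit shorter than this counts as a quick bounce.
POGO_THRESHOLD_SECONDS = 10.0


def pogo_stick_counts(session):
    """Count, per URL, quick bounces that were followed by another click."""
    counts = {}
    for i, click in enumerate(session):
        bounced_back = (
            click.returned_to_results
            and click.dwell_seconds < POGO_THRESHOLD_SECONDS
        )
        clicked_another = i + 1 < len(session)
        if bounced_back and clicked_another:
            counts[click.url] = counts.get(click.url, 0) + 1
    return counts


# The user leaves the first result after 4 seconds and tries the second one.
session = [
    Click("first-result.example", 4.0, True),
    Click("second-result.example", 95.0, False),
]

print(pogo_stick_counts(session))
```

A result that collects a lot of these quick bounces across many searchers is probably not answering the query, which is why it makes sense as a negative signal.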

Then there are click signals. A click signal is a reference to the user’s click patterns to figure out what a user means when they type a particular query. If users who searched for the same thing click on the same links, it’s probably a good link.

Google’s algorithms are incredibly complex. And these signals definitely help to push users to pages that other users are trying to get them to see. But it isn’t really possible for a computer algorithm to discern fact from fiction.

Facebook keeps a historical account of your personal engagement with posts by friends and brand pages. They predict what you want to see based on these past interactions.

In 2018, Facebook re-engineered its surfacing algorithm to focus on meaningful interactions rather than the newsfeed. These interactions are grouped into two buckets. The first bucket is Comment Activity. This is the back and forth comments on a post, news article, or video.

The second bucket is interactive posts. These are posts that you might want to share and react to. For example, a post from a friend looking for advice or a friend asking for trip recommendations. After a year of surfacing meaningful interactions, one study found that while engagement increased 50% year over year, the changes had also increased divisiveness and outrage.

This was due to the promotion of posts that got people worked up. For example, right wing media tends to evoke strong opinions from many people. It also ended up rewarding fake news from unreliable sources. These were places that understood how to game Facebook’s algorithm.

Facebook didn’t really change anything after realizing it increased divisiveness. Instead, they introduced a button labeled “Why am I seeing this?” Clicking on the button will give you an idea of why you are seeing that post. So if you’re a white supremacist, you’ll be told that you consume a lot of content from white nationalist news outlets. Shocking.

In 2019, Facebook also started surveying people to see what content mattered to them. I find this hilarious because it took Facebook a decade to actually ask users what they might want.

Twitter splits up how they surface content.

Top Tweets are tweets that are based on your personal interaction history. Twitter calls these ranking signals. Recency is a ranking signal that provides a weight based on how recently the tweet was published.

Engagement is a ranking signal that weighs things like how many retweets, clicks, favorites, and impressions a tweet has received. A tweet’s engagement is also weighted relative to other tweets from the same user and how often people engage with the tweet’s author, through active engagements and impressions.
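Here is one way recency and engagement could be blended into a single score. Twitter's actual formula is not given here, so everything in this sketch is an assumption made for illustration: the 6 hour half-life, the equal weights, the dictionary fields, and the function names.

```python
def recency_weight(age_hours, half_life_hours=6.0):
    """Assumed decay: a tweet loses half its recency weight every 6 hours."""
    return 0.5 ** (age_hours / half_life_hours)


def engagement_weight(tweet):
    """Assumed measure: interactions per impression."""
    if tweet["impressions"] == 0:
        return 0.0
    interactions = tweet["retweets"] + tweet["clicks"] + tweet["favorites"]
    return interactions / tweet["impressions"]


def score(tweet, w_recency=0.5, w_engagement=0.5):
    """Blend the two signals with assumed, equal weights."""
    return (
        w_recency * recency_weight(tweet["age_hours"])
        + w_engagement * engagement_weight(tweet)
    )


def rank_tweets(tweets):
    """Return tweet ids ordered from highest to lowest score."""
    ordered = sorted(tweets, key=score, reverse=True)
    return [tweet["id"] for tweet in ordered]


# Hypothetical tweets used only for this example.
tweets = [
    {"id": "old_popular", "age_hours": 24, "retweets": 50,
     "clicks": 100, "favorites": 50, "impressions": 1000},
    {"id": "new_quiet", "age_hours": 1, "retweets": 0,
     "clicks": 5, "favorites": 5, "impressions": 1000},
    {"id": "new_popular", "age_hours": 1, "retweets": 50,
     "clicks": 100, "favorites": 50, "impressions": 1000},
]

print(rank_tweets(tweets))
```

With these made-up numbers, a fresh tweet with little engagement still outranks a day-old tweet that did well, which shows how much the choice of weights shapes what lands at the top of a feed.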

Twitter also ranks the type of media you look at. If you engage more with videos, Twitter will prioritize video over other types of media. So if you have a Tweeter-in-Chief that likes and retweets disinformation, people that interact with that account get a lot more vile content, in the medium that they most prefer.

The second way Twitter surfaces content is a simple reverse chronological feed of tweets from people you follow. So if you just want to see what your bubble is up to, use that.

Instagram shows content based on timeliness. But when you search Instagram, it operates based on your personal activity. It depends on the people you follow and the posts you like. As your activity changes, so will your search results. As we move to a culture of search queries, this troubles me. You can’t tell another person “just search for X Y and Z” because they’ll get something different. Hell, two days later if you searched the same phrase you might get something different.

All of these algorithms are machine driven, but they’re also prone to human bias. For example, in 2016, LinkedIn was accused of showing a preference for men over women when users would search for job candidates.

There’s another genre of internet video that is definitely causing a reckoning as of late and that’s the witness. The proliferation of networked cameras, and the platforms that make it easier to sidestep traditional distribution gatekeepers, has made once rare first-hand videos of resistance, injustice, and tragedy increasingly common.

Dash cam videos have become increasingly popular. Consumer grade dash cams allow everyday people to collect potential evidence for things like insurance fraud, road rage, and hit and run accidents.

There’s the Point of View, where protestors and sympathizers broadcast video from their perspective in order to mobilize, document, and amplify their voices. And then there’s one of the most horrifying genres of the 21st century:

Police violence. To this day, I have not watched the murder of George Floyd. I do not plan on watching the murder of George Floyd. I don’t think I can handle it. I don’t want to watch anyone die, especially someone begging for mercy for 8 minutes and 46 seconds.

8 minutes and 46 seconds of people who are supposed to protect us instead callously killing a fellow citizen?

Fast forward to January 6th, 2021 and you have right wing extremists live streaming the looting and vandalizing of the Capitol Building in Washington DC. One man’s trash is another man’s treasure, I guess.

Are we ready for the Internet to live up to its revolutionary potential?

Can we use this technology to hold monsters accountable for their actions?

Can we use this technology to build better institutions and replace the ones that are rotten to their core?

Can we use this technology to instill kindness and compassion rather than fear, hate, and division?

I hope so.

2021 NERDLab