How Facebook is Using Big Data - The Good, the Bad, and the Ugly

Have you ever seen one of the videos on Facebook that shows a “flashback” of posts, likes, or images—like the ones you might see on your birthday, or on the anniversary of becoming friends with someone? If so, you have seen examples of how Facebook uses Big Data.

A report from McKinsey & Co. stated that by 2009, companies with more than 1,000 employees had already stored more than 200 terabytes of data about their customers’ lives. Add to that startling amount the rapid growth of data handed to social media platforms since then. There are trillions of tweets and billions of Facebook likes, and other social media sites like Snapchat, Instagram, and Pinterest are only adding to this data deluge.

Social media accelerates innovation, drives cost savings, and strengthens brands through mass collaboration. Across every industry, companies are using social media platforms to market and hype up their services and products, along with monitoring what the audience is saying about their brand.

The convergence of social media and big data gives birth to a whole new level of technology.

The Facebook Context

Arguably the world’s most popular social media network with more than two billion monthly active users worldwide, Facebook stores enormous amounts of user data, making it a massive data wonderland. It’s estimated that there will be more than 169 million Facebook users in the United States alone by 2018. Facebook is the fifth most valuable public company in the world, with a market value of approximately $321 billion.

Every day, we feed Facebook’s data beast with mounds of information. Every 60 seconds, 136,000 photos are uploaded, 510,000 comments are posted, and 293,000 status updates are posted. That is a LOT of data.
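To put the per-minute figures above on a daily scale, here is a minimal arithmetic sketch. The numbers are the ones quoted in this article; the names `PER_MINUTE` and `per_day` are illustrative only, and the calculation assumes the rate stays constant over a full day.

```python
# Per-minute upload figures as quoted in the article.
PER_MINUTE = {
    "photos": 136_000,
    "comments": 510_000,
    "status_updates": 293_000,
}

MINUTES_PER_DAY = 60 * 24  # 1,440 minutes


def per_day(per_minute):
    """Scale per-minute counts to per-day counts, assuming a constant rate."""
    return {kind: count * MINUTES_PER_DAY for kind, count in per_minute.items()}


daily = per_day(PER_MINUTE)
for kind, count in daily.items():
    print(f"{kind}: {count:,} per day")
```

At a constant rate, that works out to roughly 196 million photos, 734 million comments, and 422 million status updates per day.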

At first, this information may not seem to mean very much. But with data like this, Facebook knows who our friends are, what we look like, where we are, what we are doing, our likes, our dislikes, and so much more. Some researchers even say Facebook has enough data to know us better than our therapists!

Apart from Google, Facebook is probably the only company that possesses this level of detailed customer information. The more people use Facebook, the more information it amasses. Facebook invests heavily in its ability to collect, store, and analyze data, and it does not stop at analyzing what users share: it has other ways of determining user behavior.

  1. Tracking cookies: Facebook tracks its users across the web by using tracking cookies. If a user is logged into Facebook and simultaneously browses other websites, Facebook can track the sites they are visiting.
  2. Facial recognition: One of Facebook’s latest investments has been in facial recognition and image processing capabilities. Facebook can track its users across the internet and other Facebook profiles with image data provided through user sharing.
  3. Tag suggestions: Facebook suggests who to tag in user photos through image processing and facial recognition.
  4. Analyzing the Likes: A recent study showed that it is viable to accurately predict a range of highly sensitive personal attributes just by analyzing a user’s Facebook Likes. Work conducted by researchers at Cambridge University and Microsoft Research shows how patterns of Facebook Likes can very accurately predict your sexual orientation, satisfaction with life, intelligence, emotional stability, religion, alcohol use and drug use, relationship status, age, gender, race, and political views, among many others.

Facebook Inc. analytics chief Ken Rudin says, “Big Data is crucial to the company’s very being.” He goes on to say that, “Facebook relies on a massive installation of Hadoop, a highly scalable open-source framework that uses clusters of low-cost servers to solve problems. Facebook even designs its own hardware for this purpose. Hadoop is just one of many Big Data technologies employed at Facebook.”


Here are a few examples that show how Facebook uses its Big Data.

Example 1: The Flashback

Honoring its 10th anniversary, Facebook offered its users the option of viewing and sharing a video that traces the course of their social network activity from the date of registration to the present. Called the “Flashback,” this video is a collection of the photos and posts that received the most comments and likes, set to nostalgic background music.
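The selection step described above, picking the posts with the most comments and likes, can be sketched as a simple top-k ranking. This is only an illustration under stated assumptions: the `Post` class, the `top_posts` function, and the scoring rule (likes plus comments, equally weighted) are hypothetical and are not taken from Facebook.

```python
from dataclasses import dataclass
import heapq


@dataclass
class Post:
    text: str
    likes: int
    comments: int


def top_posts(posts, k=3):
    """Return the k posts with the highest combined likes and comments.

    The equal weighting of likes and comments is an assumption made
    for this sketch, not a documented Facebook rule.
    """
    return heapq.nlargest(k, posts, key=lambda p: p.likes + p.comments)


history = [
    Post("Graduation day", likes=120, comments=45),
    Post("Lunch photo", likes=3, comments=0),
    Post("New job", likes=200, comments=80),
    Post("Road trip", likes=60, comments=10),
]

highlights = top_posts(history, k=2)
for post in highlights:
    print(post.text, post.likes + post.comments)
```

A heap-based top-k avoids sorting a user's entire history when only a handful of highlights are needed.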

Other videos have been created since then, including those you can view and share to celebrate a “Friendversary,” the anniversary of two people becoming friends on Facebook. You’ll also be able to see a special video on your birthday.

Example 2: I Voted

Facebook successfully tied political activity to user engagement when it ran a social experiment: a sticker that allowed users to declare “I Voted” on their profiles.

This experiment ran during the 2010 midterm elections and seemed effective. Users who noticed the button were more likely to vote, and to be vocal about voting, once they saw that their friends were participating. Out of a total of 61 million users, 20% of those who saw their friends voting also clicked the sticker.

The data science unit at Facebook has claimed that the sticker directly motivated close to 60,000 voters, and that social contagion motivated a further 280,000 connected users to vote, for a total of 340,000 additional voters in the midterm elections.
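The figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article; the variable names are illustrative, and the final share assumes the 61 million users mentioned earlier are the full experiment population.

```python
# Figures as quoted in the article.
direct_voters = 60_000        # motivated directly by the sticker
contagion_voters = 280_000    # motivated through friends (social contagion)
experiment_users = 61_000_000 # assumed to be the full experiment population

total_additional = direct_voters + contagion_voters

# How many indirectly motivated voters there were per directly motivated voter.
contagion_ratio = contagion_voters / direct_voters

# Additional voters as a share of everyone in the experiment.
share_of_users = total_additional / experiment_users

print(f"Total additional voters: {total_additional:,}")
print(f"Contagion voters per direct voter: {contagion_ratio:.2f}")
print(f"Share of experiment users: {share_of_users:.2%}")
```

On these numbers, the indirect effect through friends is several times larger than the direct effect of the sticker itself, even though the total is well under one percent of the users involved.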

For the 2016 elections, Facebook expanded its involvement in the voting process with reminders and directions to users’ polling places.

Example 3: Celebrate Pride

Following the Supreme Court’s judgment recognizing same-sex marriage as a constitutional right, Facebook turned into a rainbow-drenched spectacle called “Celebrate Pride,” a way of showing support for marriage equality. Facebook provided a simple way to transform profile pictures into rainbow-colored ones. Celebrations such as these hadn’t been seen since 2013, when 3 million people updated their profile pictures to the red equals sign (the logo of the Human Rights Campaign).

Within the first few hours of availability, more than a million users had changed their profile pictures, according to Facebook spokesperson William Nevius. All this excitement also raised questions about what kind of research Facebook was conducting, given its history of tracking user moods and citing behavior research. In a paper the company published, The Diffusion of Support in an Online Social Movement, two Facebook data scientists analyzed the factors that predicted support for marriage equality on Facebook, looking at what contributed to a user changing their profile picture to the red equals sign.

Example 4: Topic Data

Topic Data is a Facebook technology that shows marketers how audiences are responding to brands, events, activities, and subjects, in a way that keeps users’ personal information private. Marketers use the information from Topic Data to selectively change the way they market on the platform as well as on other channels.

This data was previously available through third parties, but it was not as useful because the sample size was too small to be significant and determining demographics was almost impossible. With Topic Data, Facebook has grouped the data and stripped personal information from user activity to help marketers by offering insights on all the possible activities related to a certain topic. This gives marketers an actionable and comprehensive view of their audience for the first time.
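The idea of grouping activity by topic while stripping personal information can be illustrated with a small aggregation sketch. Everything here is hypothetical: the event fields (`user_id`, `topic`, `age_band`), the `topic_summary` function, and the `min_group_size` threshold for suppressing very small groups are assumptions made for illustration, not a description of how Topic Data is actually implemented.

```python
from collections import Counter


def topic_summary(events, min_group_size=2):
    """Count activity per (topic, age band) and drop user identifiers.

    Groups smaller than min_group_size are suppressed so that a single
    person cannot be singled out. That threshold is an assumption of
    this sketch.
    """
    counts = Counter((e["topic"], e["age_band"]) for e in events)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Hypothetical raw activity; user_id never appears in the summary.
events = [
    {"user_id": 1, "topic": "running shoes", "age_band": "18-24"},
    {"user_id": 2, "topic": "running shoes", "age_band": "18-24"},
    {"user_id": 3, "topic": "running shoes", "age_band": "25-34"},
    {"user_id": 4, "topic": "coffee", "age_band": "25-34"},
    {"user_id": 5, "topic": "coffee", "age_band": "25-34"},
]

summary = topic_summary(events)
for (topic, age_band), n in summary.items():
    print(topic, age_band, n)
```

The marketer sees only counts per topic and demographic group, never the individual records behind them.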

The Downsides

Privacy Issues

Due to this massive gold mine of data, advertisers wait like hungry vultures. In fact, the 2015 Social Media Marketing Industry Report stated that Facebook is the #1 social platform for marketers.

Facebook has always assured its users that information is shared only with their permission and anonymized when sold on to marketers. However, issues still seem to crop up; there have always been high levels of privacy concern among Facebook users, who ask, “Is Privacy Dead?” For example, many users complain that Facebook’s privacy settings are either not clearly explained or too complex. It is easy for users to share things unintentionally.

Two Problems with Big Data

Ken Rudin states that companies that rely on Big Data often owe their frustration to two mistakes:

  1. They rely too much on one technology, like Hadoop. Facebook relies on a massive installation of Hadoop software, which is a highly scalable open source framework that uses bundles of low cost servers to solve problems. The company even designs its own in-house hardware for this purpose. Mr. Rudin says, “The analytic process at Facebook begins with a 300 petabyte data analysis warehouse. To answer a specific query, data is often pulled out of the warehouse and placed into a table so that it can be studied. The team also built a search engine that indexes data in the warehouse. These are just some of many technologies that Facebook uses to manage and analyze information.”
  2. Companies use big data to answer meaningless questions. Mr. Rudin also says, “At Facebook, a meaningful question is defined as one that leads to an answer that provides a basis for changing behavior. If you can’t imagine how the answer to a question would lead you to change your business practices, the question isn’t worth asking.”


About the Author

Avantika Monnappa

A project management and digital marketing knowledge manager, Avantika’s area of interest is project design and analysis for digital marketing, data science, and analytics companies.
