Information Technology and Ethics/Social Media Data Collection

Online social networking has touched billions of people since the late 2000s, with the number growing every year.[1] In 2016, approximately 79% of all Internet users in the United States used Facebook.[2] Many Internet users belong to more than one social networking website, and together these sites count hundreds of millions of users. As Internet usage grows, this number is expected to grow as well.

Among the social networks discussed in this chapter, none requires a user to pay in order to join.[3] This is a common model among social networks, since a paywall would be a barrier to expanding the network. Because these social networking websites have accumulated such massive user bases, the potential for advertising to those users based on their connections and preferences on the sites is great. User data is often viewed as a product that can be sold to advertisers and others, who can use it for the purposes described in this chapter.

The collection of data raises ethical issues around privacy and what an online community really means, as data monetization is often central to the business model of the company building the social network. Privacy is often nonexistent on social media: even if a user’s content is protected from unauthorized outsiders, the company can still monitor and analyze the user’s activity on the site.[4] As a social media company grows more sophisticated at segmenting its users, it may become possible to target users with very specific interests.

This chapter discusses what data collection takes place, how data collected on social media is used, and the implications such data collection can have for society, including for privacy.

Social networks’ data collection and internal usage

Just about every social network collects data on its users. This section discusses what each social network collects about its users, and how the social network uses the data for its own product.


Facebook

Facebook is among the largest collectors of data, being the largest social network in the world. Since it was started in 2004, its capabilities for collecting data about users have become extremely sophisticated. Facebook collects a great deal of data from users interacting on the site. Where it does not have enough data from users, it also purchases other data, such as income and shopping preferences, from large companies.[5] All the information combined helps to build a comprehensive profile of the user, which in turn helps to target ads in an extremely specific manner. ProPublica analyzed the “categories of interest” that Facebook users can have, and found over “52,000 unique attributes … used to classify users”.[6] For an average user, it can be difficult to unlink themselves from the third-party data because of the huge number of providers used.

Facebook itself collects a great deal of data on a user as they interact with the site. This involves tracking what posts or pages a user has liked, what posts they have commented on, and other post interactions.[5] With photo tagging, Facebook can build up a facial recognition database, as users are encouraged to tag their friends in photos. It uses this database to automatically highlight faces that should be tagged, with a recommendation of whose face it is. Facebook profiles have become quite complex, with many different data points a user can add themselves.[7] As sharing is incentivized on the site, users reveal a lot of personal information. Even when a user is on other websites, Facebook can still track browsing history via embedded widgets that site operators add to their pages. All this data can be sold to advertisers in the form of ad targeting.[5]


Twitter

Like Facebook, Twitter collects a great deal of data on users, for advertising among other purposes. However, Twitter has a very simple profile, so most of the data it collects comes from engagement with tweets. All links within a tweet go through Twitter’s own URL shortener, which is also used to track how many times the link was clicked, among other metrics.[8] Like Facebook, it has forms of liking, sharing, and commenting: favorites/hearts, retweets, and replies, respectively. All such interactions are tracked by Twitter and displayed. As Twitter is generally more public than other social networks, it is easy to discover what other users are discussing. Brands may take advantage of this to offer proactive customer support. Finally, similar to Facebook, Twitter has its own Tweet button that can be embedded on other sites. Visiting a site that embeds the button informs Twitter of the user’s browsing history at that site.[9]
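
The click tracking that a link shortener makes possible can be illustrated with a minimal sketch. This is not Twitter's implementation: the `LinkTracker` class, its method names, and the `https://short.example/` domain are all hypothetical and exist only to show the general idea of routing every click through a service that counts it before passing the reader on to the real destination.

```python
import itertools


class LinkTracker:
    """Toy URL shortener that counts clicks (hypothetical, for illustration)."""

    def __init__(self, domain="https://short.example/"):
        self._domain = domain          # assumed placeholder domain
        self._targets = {}             # short code -> original URL
        self._clicks = {}              # short code -> number of clicks
        self._ids = itertools.count(1)

    def shorten(self, url):
        """Store the original URL and return a short link that points to it."""
        code = format(next(self._ids), "x")
        self._targets[code] = url
        self._clicks[code] = 0
        return self._domain + code

    def resolve(self, short_url):
        """Simulate a click: record it, then return the real destination."""
        code = short_url.rsplit("/", 1)[-1]
        target = self._targets[code]   # raises KeyError for unknown links
        self._clicks[code] += 1
        return target

    def clicks(self, short_url):
        """Return how many times the short link has been followed."""
        return self._clicks.get(short_url.rsplit("/", 1)[-1], 0)


tracker = LinkTracker()
short = tracker.shorten("https://example.org/article")
tracker.resolve(short)   # first reader clicks the link
tracker.resolve(short)   # second reader clicks the link
print(short, "was clicked", tracker.clicks(short), "times")
```

Because every click has to pass through `resolve`, the service operating the shortener learns about each visit even though the destination site belongs to someone else.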

Google Plus

Google Plus is Google’s own social network, which was launched in 2011 and is now integrated into many different Google products, providing additional information about Google users. It is not a very popular social network compared to Facebook, Twitter, and others, but it has similar features to Facebook. Data collected on Google Plus could be used to influence ads on other Google products.[10][11]


YouTube

Another Google social network (though in a different sense) is YouTube, the premier platform for hosting videos online. YouTube collects data and displays what is trending and popular based on what the majority of the population is watching. The key draw of YouTube is that a user can find information on any topic. Topics range from news programs and cooking shows to how-to tutorials and much more. There are two main areas of focus that YouTube's analytics collects data on:[12]

First, YouTube automatically collects standard channel metrics, which consist of the number of channel pageviews, subscribers, friends, and channel comments.[13] Second, data is collected on each video that a user uploads: views, number of comments, video responses, ratings, and the number of times the video was marked as a favorite.


LinkedIn

LinkedIn is the world’s largest professional networking platform. Unlike other social networks, its primary goal is to build connections between professionals around the world, while providing an efficient and informative tool at the user’s fingertips. LinkedIn gathers quite a lot of information to provide its service, and users are often rewarded for supplying information.

From the moment a user creates an account, LinkedIn collects the personal information in the user’s profile, such as job title, education, skills, memberships, and affiliations. Adding more information lets the user benefit more from the services that showcase their professional presence.[14] Beyond improving the user’s experience with other professionals on LinkedIn, the company also uses this data to improve service development and customize the user's experience. For example, one of LinkedIn's features greets the user with newly added members and companies that the user may be interested in following.[14]

Other social media

There are many social networks today that help people interact with each other on the web. Many of these sites allow users to post, stream, and share their daily lives. Much of this information is collected and stored in the sites' databases and may never be deleted. One of the more recent social networks is Snapchat, which has become very popular over the past few years. Snapchat is a free mobile app that allows users to instantly share photos and videos with their friends. It is known for letting users post short video updates and for assuring users that what they post will be deleted, which gives users a greater sense of privacy. However, the company has recently made many changes, and much of the information that users share within the app is now used for commercial purposes. Snapchat can know a great deal about its users: their name, current location, friends, and when they message them.[15] That is a lot of information to store about each user. In today’s society, some find it hard not to be part of an online community, even though that means providing information about oneself to social media sites and third parties.

The Cost of Social Media

Most social media platforms are advertised as free services. This is true in the sense that no money is required to access and use the platform. However, the consumer still pays a price in the form of their data and privacy, trading their information as a commodity in exchange for access to the system or goods.[16] Privacy is part of the price because social media is considered part of the public sphere. In legal terms, an individual's reasonable expectation of privacy is diminished when information, even of a personal nature, is provided in a public setting or becomes inherently public.[17] How much a user actually pays is difficult to determine, since it varies from person to person according to how much they value their privacy and the data being collected.

Cost of Information (Data)

The value of the data itself is relatively easy to establish, as it is more closely tied to monetary values. A user will generally not knowingly give up their data unless they feel they are receiving proper compensation. If they believe a piece of information has monetary value, they will often treat it as intellectual property and take legal steps to copyright or patent it. The pictures they liked or the news articles they shared do not fall into this category, as the user sees no inherent value in them. These small pieces of data are combined into big data pools, which then become a lucrative business for the companies that collect, process, or sell them. This can be seen in Facebook's revenue of $16.9 billion for the fourth quarter of 2018, with profits of $6.9 billion.[18]

Cost of Privacy

It is generally agreed that some personal information is of high importance and needs to be safeguarded from the public sphere. This type of information is labeled nonpublic personal information (NPI) and refers to confidential and sensitive information, such as financial or medical records.[19] There are numerous regulations in place to protect this type of information, such as the Health Insurance Portability and Accountability Act of 1996 (HIPAA).[20] Information that is neither confidential nor sensitive, such as what type of car you drive or where you work, is considered public personal information (PPI).[19] This is information that is publicly available and does not receive the same regulatory oversight. When PPI is not protected, the result can be privacy failures for users. As early as “2000 the combination of ZIP code, date of birth, and gender was enough to identify 87% of the US population.”[21] Even if companies remove a person's directly identifying details from a data set, it is still possible for third parties to work out who the data came from. There were 53,308 cyber security incidents with 2,216 data breaches in 2018 alone.[22] In 2018, a breach at Facebook exposed about 50 million users.[23] It is safe to assume that it is not a matter of if, but when, another breach will occur. Without regulations ensuring that information is properly protected, it is logical to conclude that you are also paying with your information privacy when you use social media.
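
The re-identification risk described above can be made concrete with a small sketch. The records below are invented for illustration and contain no names; the function simply measures what share of rows is singled out by a given combination of attributes (often called quasi-identifiers). The field names and the `unique_fraction` helper are assumptions of this example, not part of any cited study.

```python
from collections import Counter

# Hypothetical, made-up records with names already removed.
records = [
    {"zip": "60616", "dob": "1990-04-12", "gender": "F", "likes": "cycling"},
    {"zip": "60616", "dob": "1990-04-12", "gender": "M", "likes": "chess"},
    {"zip": "60616", "dob": "1985-11-02", "gender": "F", "likes": "cooking"},
    {"zip": "60616", "dob": "1985-11-02", "gender": "F", "likes": "hiking"},
    {"zip": "60605", "dob": "1972-01-30", "gender": "M", "likes": "jazz"},
]

QUASI_IDENTIFIERS = ("zip", "dob", "gender")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose combination of `keys` appears only once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


# ZIP code alone singles out few people; adding date of birth and gender
# singles out most of this toy data set.
print("zip only:        ", unique_fraction(records, keys=("zip",)))
print("zip, dob, gender:", unique_fraction(records))
```

Anyone holding a second data set that links the same three attributes to names could match the uniquely identified rows back to individuals, which is why removing names alone does not make data anonymous.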

Data usage and exploitation

Every day, companies collect data from individuals and store it in their databases. The data collected helps these companies build a profile of each of their consumers. These practices are also known as data mining. Data mining involves the indirect gathering of personal information through an analysis of implicit patterns discoverable in data.[24] Collecting data from consumers is not always negative, however. Some of the information collected lets companies provide better customer service and remember customer preferences for future purchases. Although the use of data may help create a better service, it also erodes users’ privacy. Consumers' data is important and therefore worth a lot of money. Such data is now a $300 billion per year industry and employs 3 million people in the United States alone, according to the McKinsey Global Institute.[25] The collection of this data through social media helps sellers target an audience for their products.


Social networks hold an immense amount of consumer "big data." The average global Internet user spends over two hours a day on social media, and their activity reveals a great deal about what makes them tick.

Some examples of how data collected by various social networks can be used to find out information about individuals:

  • Facebook gathers 63 distinct pieces of data for its API, more than any other social network. Because so much content is shared on Facebook, it offers insight into what people care about. Facebook's "like" button is pressed 2.7 billion times each day across the web.[26]
  • Twenty-two percent of LinkedIn users have between 500 and 999 first-degree connections on the network, and 19% have between 301 and 499. This data is creating a new way of understanding recruitment and retention: recruiters judge an individual’s profile and reach out if they are impressed by it.[26]
  • Twitter has handled as many as 143,199 tweets per second globally. These tweets give a real-time insight into the news and information that people care about. Fifty-two percent of Twitter users in the U.S. consume news on the site (more than the percentage who do so on Facebook), according to Pew Research data.[26]
  • Pinterest: Thousands of pictures are continually pinned to individual pages (called pinboards) on Pinterest. They offer a unique insight into the aspirations of a large number of retailers and shoppers. Over 17% of all pinboards are categorized under "Home," while about 12% fall under "Style/Fashion." In addition, 80% of pins on Pinterest are repins, so pictures of products have a long shelf life, well after the initial pin.[26]

Currently, social media companies are making significant investments in putting this data to use. If they achieve a firmer grasp of users' connections, interests, and spending habits, companies will be able to offer their customers customized products, and advertisers will be able to hyper-target users.


Students in this generation are heavily exposed to digital technologies and the Internet. The extensive use of the Internet and social media has the potential to offer new types of educational settings. The use of social media in higher education is essential, as these tools and technologies have become part and parcel of students’ lifestyles. LinkedIn, for example, has a product called LinkedIn Learning, where an individual can learn the most in-demand business and technical skills.[27]

The web is filled with scholarly journals and databases that help students find the information they need. Much of this information is also obtained through search engines such as Google and Bing. Social media gives its users access to a vast expanse of information, and students can learn from what is shared. Some teachers use social media so that students can communicate with one another and learn from articles on the web. In countries where education is lacking because of poverty, many students use social media to supplement their education.


The biggest sources of information these days are Google and large social networks, especially Facebook. These media at times provide direct contact with customers. “We’re living in a world where businesses and important life opportunities are being decided based on this amalgamated data,” according to Pam Dixon of the World Privacy Forum.[28] This data can be used to customize advertisements according to the characteristics, circumstances, and preferences of each person, particularly when the ads are delivered directly on the web. Now, with the development of mobile applications that track a user's location continuously, ads can even be tailored to the recipient's current location. With this information, advertisers can carry out digital marketing based on geography.

Another example is the use of sensitive or private data as a marketing element. Marketers use such data to make predictions about consumers’ buying behavior. Arriving at a prediction requires a large amount of data, so the marketing strategy eventually involves taking not just a customer's basic information (name, email) but also more private data, such as their family status, the type of car or card they own, and their shopping habits. This information is traded by the automobile industry, credit agencies, and others.


It may be somewhat premature and idealistic to say so, but government leaders and organizations are increasingly harnessing the power of social media both to connect with the public and to extract information.

Instances where social media has been used in connection with government include:

  • The U.K. police set up a dedicated social media team to help ensure the security of the 2012 London Olympic Games.[29]
  • During the 2012 U.S. presidential race, Twitter developed a brand-new political analysis tool called the Twindex, which gauged online conversations and sentiment around Barack Obama and Mitt Romney.[30]

These days social media serves as a real-time, two-way channel of communication between the government and the public, giving far more people the chance to voice their opinions.

Effects on behavior / privacy

While there are many positive uses and effects of data collection, there are also many negative effects, such as those described here.

Chilling effect

The phenomenon called the ‘chilling effect’ is closely connected to the First Amendment to the U.S. Constitution, in the sense that the effect holds users back from free speech. A common theme across social networking services (SNS) is that users present their digital persona based on their audience’s expectations.[31] Most of the time, a user’s audience consists of a diverse group of individuals.

One situation users find themselves in is ‘context collapse’. In this scenario, an SNS user lumps their audiences into a single group and aims all of their posts and activity at the standards of that single group.

Another result can be explained through the concept of impression management. A 2015 study by Lang and Barton found that 84% of users had been tagged in an undesirable photograph and had subsequently taken defensive actions such as untagging.[31] This result leans toward the negative, since it shows how users react when they discover an undesired image or self-presentation: untagging oneself means adjusting one's behavior to the audience’s standards or expectations.

Leaving social media

Once a post has been made on a social network, it is very hard to delete. Because a post or message can easily be reshared and reposted by others, erasing it completely is very difficult. The rule of thumb is that once something is posted to the Internet, there is no guarantee that it can ever be fully deleted.[32] The moral is to think before posting anything on Facebook, LinkedIn, Twitter, etc. in the first place.

Leaving a social media platform is tricky to begin with. Taking Facebook as an example, a user can delete individual posts and messages fairly easily, but the same is not true of an entire account. Deactivating an account simply takes the user’s data and makes it invisible to the public. The data becomes visible again whenever the user decides to reactivate the account. Completely deleting a Facebook profile is a different process, for which the user has to visit the delete page. At first, the account is only deactivated for the following two weeks, and it is deleted only if no activity by the user is detected during that time. The frustrating part is that users typically have Facebook or other social media platforms active on multiple devices.[32] If one of those devices logs in automatically, or a browser session is still cached, the deletion process will be unsuccessful.


Illinois Institute of Technology ITMM 485 Spring 2017

Eric Tendian, Consuelo Huerta, Preethi Thesinghraja, Shefali Varma


  1. The Effect of Social Media Communication on Consumer Perceptions of Brands
  2. Demographics of Social Media Users in 2016
  3. List of social networking websites
  4. Towards a Theory of Privacy in the Information Age
  5. Breaking the Black Box: What Facebook Knows About You
  6. Facebook Doesn't Tell Users Everything It Really Knows About Them
  7. Public faces? A critical exploration of the diffusion of face recognition technologies in online social networks
  8. About Twitter's link service
  9. De-anonymizing Web Browsing Data with Social Networks
  10. New Kid on the Block: Exploring the Google+ Social Graph
  11. Google+ or Google-? Dissecting the Evolution of the New OSN in its First Year
  12. Google Privacy: 5 Things the Tech Giant Does With Your Data
  13. Privacy Impact Assessment: Social Media including Facebook, Flickr, GitHub, Instagram, LinkedIn, Storify
  14. Privacy Policy | LinkedIn
  15. Privacy Policy - Snap Inc.
  16. Invalid <ref> tag; no text was provided for refs named ”Smith”
  17. Invalid <ref> tag; no text was provided for refs named ”Lvovsky”
  18. Invalid <ref> tag; no text was provided for refs named ”Faceprof”
  19. a b Invalid <ref> tag; no text was provided for refs named ”Tavani”
  20. Invalid <ref> tag; no text was provided for refs named ”HIPAA”
  21. Invalid <ref> tag; no text was provided for refs named ”Purtova”
  22. Invalid <ref> tag; no text was provided for refs named ”Verizon”
  23. Invalid <ref> tag; no text was provided for refs named ”Facebreach”
  24. Tavani, Herman T. (2011) Ethics and technology: controversies, questions, and strategies for ethical computing Hoboken, N.J. : Wiley,
  25. Big data: The next frontier for innovation, competition, and productivity
  26. Social Big Data: The User Data Collected By Each Of The World's Largest Social Networks — And What It Means
  27. Academic Use of Social Media Technologies as an Integral Element of Informatics Program Delivery in Malaysia
  28. The Secretive World of Selling Data About You
  29. Beyond London 2012: The quest for a security legacy
  30. Pfister, D. S. (2015). The timeliness of the new (networked) rhetoric: On critical distance, sentiment analysis, and public feeling polls. In C. Palczewski (Ed.), Disturbing Argument: Selected papers from the 2013 NCA/AFA Summer Conference on Argumentation (pp. 225-231). New York, NY: Routledge.
  31. The extended ‘chilling’ effect of Facebook: The cold reality of ubiquitous social networking
  32. The Risks of Social Networking