BIOENGINEER.ORG

Over 150 million websites among a billion tested include sensitive (and tracked) content

by Bioengineer
October 14, 2020
in Science News

Nikolaos Laoutaris, Research Professor at IMDEA Networks Institute, participates in the biggest study about tracking of sensitive topics on the web

Image credit: IMDEA Networks Institute

The European General Data Protection Regulation (GDPR) includes specific clauses that restrict the collection and processing of sensitive personal data, defined as any data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, as well as genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health, or data concerning a natural person’s sex life or sexual orientation…

After two years of hard work, and having crunched more than one billion websites (most of the English-speaking web), an international team including Nikolaos Laoutaris (Research Professor at IMDEA Networks Institute, Madrid) and researchers from TU Berlin and the Cyprus University of Technology has developed specialised machine learning classifiers that are able to identify sensitive URLs on the web, and has used them to search for such URLs in a corpus of some 1 billion URLs in total. As a main (and disturbing) result, some 150 million of them were found to include sensitive content related to Health, Political Beliefs, Sexual Orientation, etc., and to still be tracked nearly as much as the rest of the web.

Real-time detection

Existing legislation about sensitive personal data is targeted mostly for use by humans, e.g., to file complaints, conduct investigations, and even pursue cases in courts of law. With the new automated machine learning classifiers, however, additional proactive measures can also be put in place for the first time. For example, the user's browser, or an add-on program, can warn users before they click and follow URLs pointing to sensitive content. Upon visiting such sites, trackers can be blocked, and complaints can be automatically filed. All of this hinges on being able to classify automatically, in real time, whether a URL is sensitive or not.
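The proactive flow described above could be sketched roughly as follows. This is a minimal illustration and not the researchers' tool: the keyword-based `is_sensitive` stand-in, the hint list, the example host names, and the `plan_visit` function are all hypothetical, and a real deployment would call a trained classifier instead of matching keywords.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set
from urllib.parse import urlparse

# Hypothetical stand-in for a trained classifier: it flags a URL as
# sensitive when its path contains one of a few illustrative keywords.
SENSITIVE_HINTS = ("cancer", "hiv", "sexual-orientation")


def is_sensitive(url: str) -> bool:
    path = urlparse(url).path.lower()
    return any(hint in path for hint in SENSITIVE_HINTS)


@dataclass
class VisitPlan:
    """What a browser or add-on would do for one page visit."""
    warn_user: bool
    blocked_requests: List[str] = field(default_factory=list)
    allowed_requests: List[str] = field(default_factory=list)


def plan_visit(
    url: str,
    third_party_requests: Iterable[str],
    known_trackers: Set[str],
    classifier: Callable[[str], bool] = is_sensitive,
) -> VisitPlan:
    """Warn before a sensitive page and block known trackers on it."""
    sensitive = classifier(url)
    plan = VisitPlan(warn_user=sensitive)
    for request in third_party_requests:
        host = urlparse(request).hostname or ""
        # Trackers are only blocked when the page itself is sensitive.
        if sensitive and host in known_trackers:
            plan.blocked_requests.append(request)
        else:
            plan.allowed_requests.append(request)
    return plan
```

Passing the classifier in as a parameter keeps the warning and blocking logic independent of how sensitivity is decided, so the keyword stand-in could be swapped for a real model without touching the rest.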

The latter is easier said than done. The reason has to do with the ambiguity of terms such as “Health” that legal documents use to indicate which types of information are considered sensitive. Indeed, the word Health can be found on websites about healthy eating, sports, and organic food, but also on websites about chronic diseases, sexually transmitted diseases, and cancer. Most of the effort in producing the aforementioned classifier went into collecting sufficient “ground-truth” data for training it and allowing it to distinguish truly sensitive uses of words such as health from less sensitive ones.
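To illustrate why labelled ground truth matters, here is a toy sketch of a text classifier that learns from context words rather than from the word "health" alone. It uses a small naive Bayes model with add-one smoothing, which is an assumption chosen for brevity; the article does not say which model the team used. The four training sentences and their labels are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-labelled "ground truth": page text paired with a label.
TRAINING = [
    ("health tips for healthy eating and organic food", "non-sensitive"),
    ("sports health and fitness training plans", "non-sensitive"),
    ("health support for chronic diseases and cancer treatment", "sensitive"),
    ("sexually transmitted diseases health clinic testing", "sensitive"),
]


def train(examples):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids log(0) for unseen word/label pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("health advice on cancer", model))
print(classify("health and organic food", model))
```

Because "health" appears equally often under both labels in this toy data, it carries no signal by itself; the surrounding words ("cancer", "organic") decide the outcome, which is the distinction the ground-truth data is meant to teach.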

The team's results will be presented as a scientific paper at ACM IMC’20 (ACM Internet Measurement Conference 2020, 27-29 October, Pittsburgh, USA). Laoutaris also participates in PIMCity (Building the next generation personal data platforms), the EU-funded project to increase transparency and provide users with control over their data. “Privacy law is made for use by humans,” Laoutaris comments, “typically after a privacy breach has occurred, e.g., an illegal processing of such data … but how can we teach this law to machines and have them protect us before privacy breaches occur?” The research team is working to bring a technological solution to users in 2021.

“Tracking people when they visit websites with content that belongs to the GDPR sensitive categories is the true ‘Elephant in the Room’ of privacy,” adds the researcher. “Most people don’t mind being tracked about things that they consider innocent, but would be very upset to know that their visits to sensitive websites are being logged and released to unknown third parties. Our study is, by far, the biggest study about tracking of sensitive topics on the web. It shows that a good part of the web includes content of a sensitive character. Unfortunately, these sensitive pages appear to be as tracked as the rest of the web.”

###

About Nikolaos Laoutaris

Laoutaris has been a Research Professor at IMDEA Networks since December 2018. He holds a doctorate in computational sciences from the University of Athens (Greece) and has worked as a researcher at Harvard University and Boston University. His areas of research centre on privacy, transparency and data protection; the network and information economy; smart transport; distributed systems and network protocols and traffic measurements.

Media Contact
Marta Dorado
[email protected]

Original Source

https://networks.imdea.org/over-150-million-websites-among-a-billion-tested-include-sensitive-and-tracked-content/

Tags: Computer Science, Internet, Science/Health/Law, Technology/Engineering/Computer Science
Bioengineer.org © Copyright 2023 All Rights Reserved.
