BIOENGINEER.ORG

Over 150 million websites among a billion tested include sensitive (and tracked) content

Bioengineer by Bioengineer
October 14, 2020
in Science News

Nikolaos Laoutaris, Research Professor at IMDEA Networks Institute, participates in the biggest study about tracking of sensitive topics on the web

IMAGE

Credit: IMDEA Networks Institute

The European General Data Protection Regulation (GDPR) includes specific clauses that restrict the collection and processing of sensitive personal data, defined as any data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, as well as genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health, or data concerning a natural person’s sex life or sexual orientation…

After two years of hard work, and having crunched more than one billion websites (most of the English-speaking web), an international team including Nikolaos Laoutaris (Research Professor at IMDEA Networks Institute, Madrid) and researchers from TU Berlin and the Cyprus University of Technology has developed specialised machine learning classifiers that can identify sensitive URLs on the web, and has used them to search a corpus of some 1 billion URLs in total. The main (and disturbing) result: some 150 million of them were found to include sensitive content related to Health, Political Beliefs, Sexual Orientation, etc., and yet to be tracked nearly as much as the rest of the web.

Real-time detection

Existing legislation about sensitive personal data is designed mostly for use by humans, e.g., to file complaints, conduct investigations, and even pursue cases in courts of law. With the new automated machine learning classifiers, however, additional proactive measures can be put in place for the first time. For example, the user's browser, or an add-on program, can warn users before they click and follow URLs pointing to sensitive content. When such sites are visited, trackers can be blocked and complaints can be filed automatically. All of this hinges on being able to classify automatically, and in real time, whether a URL is sensitive or not.
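The proactive flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's software: the `decide` function, the `Decision` record, the `KNOWN_TRACKERS` hostnames and the `toy_classifier` stand-in are all hypothetical names introduced here, and a real add-on would plug in a trained classifier instead of the hard-coded terms.

```python
from dataclasses import dataclass, field
from typing import Callable, List
from urllib.parse import urlparse

# Hypothetical tracker hostnames, used only for illustration.
KNOWN_TRACKERS = {"tracker.example", "ads.example"}


@dataclass
class Decision:
    warn_user: bool
    blocked_requests: List[str] = field(default_factory=list)


def decide(url: str,
           third_party_requests: List[str],
           is_sensitive: Callable[[str], bool]) -> Decision:
    """Warn and block known trackers only when the classifier flags the URL."""
    if not is_sensitive(url):
        return Decision(warn_user=False)
    blocked = [r for r in third_party_requests
               if urlparse(r).hostname in KNOWN_TRACKERS]
    return Decision(warn_user=True, blocked_requests=blocked)


def toy_classifier(url: str) -> bool:
    """Stand-in for a trained model: flags a few hard-coded path terms."""
    path = urlparse(url).path.lower()
    return any(term in path for term in ("cancer", "chronic-disease"))


if __name__ == "__main__":
    requests = ["https://tracker.example/pixel.gif",
                "https://cdn.example/app.js"]
    print(decide("https://clinic.example/cancer/treatment",
                 requests, toy_classifier))
```

Passing the classifier in as a function keeps the warning and blocking logic separate from the model, so the same flow would work with whatever real-time classifier is available.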

That is easier said than done. The reason is the ambiguity of terms such as “Health” that legal documents use to indicate what types of information are considered sensitive. Indeed, the word Health can be found on websites about healthy eating, sports, and organic food, but also on websites about chronic diseases, sexually transmitted diseases, and cancer. Most of the effort in producing the classifier went into collecting sufficient “ground-truth” data for training it, so that it can distinguish truly sensitive uses of words such as health from less sensitive ones.
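A toy example can make the ambiguity concrete. The sketch below compares a keyword rule with a tiny naive Bayes text classifier trained on hand-labelled snippets. It is not the classifier built by the research team, whose design is not described here: naive Bayes is simply a well-known technique swapped in for illustration, and the `TRAINING` snippets are invented stand-ins for ground-truth data.

```python
import math
from collections import Counter

# Hypothetical hand-labelled snippets standing in for ground-truth data.
TRAINING = [
    ("healthy eating recipes organic food", "non-sensitive"),
    ("sports fitness health tips running", "non-sensitive"),
    ("health food organic smoothie recipes", "non-sensitive"),
    ("cancer treatment health chemotherapy support", "sensitive"),
    ("chronic disease health symptoms diagnosis", "sensitive"),
    ("sexually transmitted disease health testing clinic", "sensitive"),
]


def keyword_rule(text: str) -> str:
    """Naive baseline: any mention of 'health' counts as sensitive."""
    return "sensitive" if "health" in text.lower() else "non-sensitive"


def train(samples):
    """Count word occurrences per label for a multinomial naive Bayes model."""
    word_counts = {}
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    for page in ("organic health food recipes", "health clinic cancer testing"):
        print(page, "|", keyword_rule(page), "|", classify(page, model))
```

On the first page the keyword rule raises a false alarm because the word "health" appears, while the trained model weighs the surrounding words and labels it non-sensitive. With only six training snippets this is far from reliable, which is consistent with the point that collecting enough labelled data is where most of the effort goes.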

The team's results will be presented as a scientific paper at ACM IMC’20 (ACM Internet Measurement Conference 2020, 27-29 October, Pittsburgh, USA). Laoutaris also participates in PIMCity (Building the next generation personal data platforms), an EU-funded project to increase transparency and give users control over their data. “Privacy law is made for use by humans,” Laoutaris comments, “typically after a privacy breach has occurred, e.g., an illegal processing of such data… but how can we teach this law to machines and have them protect us before privacy breaches occur?” The research team is working to bring a technological solution to users in 2021.

“Tracking people when they visit websites with content that belongs to the GDPR sensitive categories is the true ‘Elephant in the Room’ of privacy,” adds the researcher. “Most people don’t mind being tracked about things that they consider innocent, but would be very upset to know that their visits to sensitive websites are being logged and released to unknown third parties. Our study is, by far, the biggest study about tracking of sensitive topics on the web. It shows that a good part of the web includes content of a sensitive character. Unfortunately, these sensitive pages appear to be as tracked as the rest of the web.”


About Nikolaos Laoutaris

Nikolaos Laoutaris has been a Research Professor at IMDEA Networks since December 2018. He holds a doctorate in computational sciences from the University of Athens (Greece) and worked as a researcher at Harvard University and Boston University. His areas of research centre on privacy, transparency and data protection; the network and information economy; smart transport; distributed systems and network protocols and traffic measurements.

Media Contact
Marta Dorado

Original Source

https://networks.imdea.org/over-150-million-websites-among-a-billion-tested-include-sensitive-and-tracked-content/

Tags: Computer Science, Internet, Science/Health/Law, Technology/Engineering/Computer Science
Bioengineer.org © Copyright 2023 All Rights Reserved.
