• HOME
  • NEWS
  • EXPLORE
    • CAREER
      • Companies
      • Jobs
    • EVENTS
    • iGEM
      • News
      • Team
    • PHOTOS
    • VIDEO
    • WIKI
  • BLOG
  • COMMUNITY
    • FACEBOOK
    • INSTAGRAM
    • TWITTER
Thursday, August 21, 2025
BIOENGINEER.ORG
No Result
View All Result
  • Login
  • HOME
  • NEWS
  • EXPLORE
    • CAREER
      • Companies
      • Jobs
        • Lecturer
        • PhD Studentship
        • Postdoc
        • Research Assistant
    • EVENTS
    • iGEM
      • News
      • Team
    • PHOTOS
    • VIDEO
    • WIKI
  • BLOG
  • COMMUNITY
    • FACEBOOK
    • INSTAGRAM
    • TWITTER
  • HOME
  • NEWS
  • EXPLORE
    • CAREER
      • Companies
      • Jobs
        • Lecturer
        • PhD Studentship
        • Postdoc
        • Research Assistant
    • EVENTS
    • iGEM
      • News
      • Team
    • PHOTOS
    • VIDEO
    • WIKI
  • BLOG
  • COMMUNITY
    • FACEBOOK
    • INSTAGRAM
    • TWITTER
No Result
View All Result
Bioengineer.org
No Result
View All Result
Home NEWS Science News

Researchers run largest known transparent checkpointing process

Bioengineer by Bioengineer
November 23, 2016
in Science News
Reading Time: 3 mins read
0
Share on FacebookShare on TwitterShare on LinkedinShare on RedditShare on Telegram

A team of researchers led by Jiajun Cao, a PhD candidate in the College of Computer and Information Science (CCIS) at Northeastern University, recently completed what appears to be the largest known instance of transparent checkpointing.

Transparent checkpointing allows computer scientists and engineers working on large projects to save and reopen programs without modifying any code. This assures researchers working across hundreds or thousands of computers that their work will be safe in case of a computer failure. Their programs are run on CPU cores, and computers can contain multiple cores, allowing them to run one program simultaneously across multiple cores. Transparent checkpointing could simplify the work of computer scientists handling large amounts of data and using supercomputers to process that data. For example, with transparent checkpointing software, meteorologists can process and analyze billions of pieces of weather data without the fear that a computer crash could erase that work.

"The idea of checkpointing is that one can take a running computation, automatically stop it in the middle and save the state of everything to a file on disk," Gene Cooperman, a professor at CCIS and Cao's advisor, explains. "Then you can copy that file to another computer or keep it on the same one. When you restart, the program continues running from where it left off." Cooperman's work with Distributed Multi-Threaded CheckPointing (DMTCP) software, which is responsible for checkpointing, is now in its second decade.

What makes this example of transparent checkpointing significant is the massive amount of data that was run and saved in a short period of time. The MVAPICH software supporting the Message Passing Interface (MPI) was used to run the High Performance Conjugate Gradients (HPCG) program for linear algebra in parallel over 32,768 CPU cores on 2,048 computers. It used a total memory of 38 terabytes, and was checkpointed in 10 minutes and 53 seconds. A second program, Nanoscale Molecular Dynamics (NAMD), was run in parallel over 16,368 CPU cores on 1,024 computers, using a total memory of 10 terabytes. It was checkpointed in two minutes and 38 seconds. Checkpointing these amounts of data in 11 minutes or less is a breakthrough for scientists usually restricted by having to run their programs before modifying and saving them within 24-hour time slots.

These processes were carried out on the Stampede supercomputer at the Texas Advanced Computer Center (TACC). Stampede is one of the world's largest supercomputers. The research was supported by a grant from the National Science Foundation awarded to Cooperman's DMTCP project, under which Cao's checkpointing research falls.

"These results show how the Extended Collaborative Support Services from the National Science Foundation-supported Extreme Science and Engineering Discovery Environment can help scientists and developers improve the scalability and efficiency of their code on high performance computing clusters," says Jérôme Vienne, a research associate at TACC.

Dhabaleswar K. Panda, who leads the MVAPICH team at Ohio State, explains that "the results of this collaborative work push the existing capabilities of the MVAPICH2 library further in terms of fault-tolerance and check-pointing."

Cao's collaborators include Kapil Arya of Mesophere, Inc.; Rohan Garg and Gene Cooperman of Northeastern University; Shawn Matott of the Center for Computational Research at the State University of New York at Buffalo; Dhabaleswar K. Panda and Hari Subramoni of Ohio State University; and Jérôme Vienne of the Texas Advanced Computing Center at the University of Texas at Austin.

The paper, titled "System-level Scalable Checkpoint-Restart for Petascale Computing," is available to read online. This work will be published at the 22nd Institute of Electrical and Electronics Engineers International Conference on Parallel and Distributed Systems (ICPADS) in December 2016.

###

Media Contact

Nicole Bekerian
[email protected]
617-373-6841
@Northeastern

http://www.neu.edu

############

Story Source: Materials provided by Scienmag

Share12Tweet8Share2ShareShareShare2

Related Posts

blank

Metformin’s Potential Role in Breast Cancer

August 21, 2025
blank

Nerve Injury from Cancer Fuels Anti-PD-1 Resistance

August 21, 2025

Nanosecond Perovskite Quantum Dot LEDs Revolutionize Displays

August 21, 2025

Pediatric AKI: Biomarkers and AI Transform Detection

August 21, 2025
Please login to join discussion

POPULAR NEWS

  • blank

    Molecules in Focus: Capturing the Timeless Dance of Particles

    141 shares
    Share 56 Tweet 35
  • Neuropsychiatric Risks Linked to COVID-19 Revealed

    81 shares
    Share 32 Tweet 20
  • Modified DASH Diet Reduces Blood Sugar Levels in Adults with Type 2 Diabetes, Clinical Trial Finds

    60 shares
    Share 24 Tweet 15
  • Predicting Colorectal Cancer Using Lifestyle Factors

    47 shares
    Share 19 Tweet 12

About

We bring you the latest biotechnology news from best research centers and universities around the world. Check our website.

Follow us

Recent News

Metformin’s Potential Role in Breast Cancer

Nerve Injury from Cancer Fuels Anti-PD-1 Resistance

Nanosecond Perovskite Quantum Dot LEDs Revolutionize Displays

  • Contact Us

Bioengineer.org © Copyright 2023 All Rights Reserved.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • Homepages
    • Home Page 1
    • Home Page 2
  • News
  • National
  • Business
  • Health
  • Lifestyle
  • Science

Bioengineer.org © Copyright 2023 All Rights Reserved.