The provided text is a methods excerpt from a study involving the MULTI Consortium and several large biobank and cohort datasets: UK Biobank (UKBB), FinnGen, the Psychiatric Genomics Consortium (PGC), TriNetX, the Baltimore Longitudinal Study of Ageing (BLSA), and the Multi-Ethnic Study of Atherosclerosis (MESA). The key points are summarized below by topic:
MULTI Consortium Overview
Purpose: Integrate multi-organ and multi-omics data (imaging, genetics, metabolomics, proteomics) to model human aging and disease across the lifespan.
Data Sources: Builds on existing consortia and studies.
Ethics: Approved by Columbia University IRB (AAAV6751).
UK Biobank (UKBB)
Participants: ~500,000 UK individuals recruited between 2006 and 2010.
Sleep Data: Self-reported sleep duration via touchscreen (field ID 1160), including naps.
Data Collection Details: Basic quality control excludes unrealistic values; participants whose sleep varied were asked to report their average over the past 4 weeks.
Imaging Data: Multi-organ MRI-derived Image-Derived Phenotypes (IDPs) across brain, heart, liver, pancreas, spleen, adipose, kidney, and eye OCT.
Biomarkers: Plasma proteomics (UKB Pharma Proteomics Project) and metabolomics (Nightingale Health).
Ageing Clocks: Developed 23 multi-organ biological age gaps (BAGs): 7 MRI-based BAGs (MRIBAGs), 11 proteomics BAGs (ProtBAGs), and 5 metabolomics BAGs (MetBAGs).
Analyses: Used Generalized Additive Models (GAMs) to model sleep duration relationships with BAGs and other phenotypes, adjusting for multiple covariates and minimizing overfitting via penalized regression splines.
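To make the sleep quality control and the penalized-spline idea above concrete, here is a minimal, dependency-free sketch. It is not the study's code. It fits a single smooth term (a truncated-linear spline with a ridge penalty on the knot coefficients) to simulated data, with no covariates, whereas a real analysis would use a dedicated GAM library. The plausibility bounds, knot positions, penalty strength, and all function names are assumptions chosen for illustration.

```python
import random

# Hypothetical plausibility bounds (hours per 24 h); not taken from the study.
MIN_HOURS, MAX_HOURS = 1.0, 18.0

def basis(x, knots):
    """Truncated-linear spline basis: [1, x, (x - k1)+, (x - k2)+, ...]."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_penalized_spline(xs, ys, knots, lam):
    """Penalized least squares: the penalty shrinks slope changes at the knots."""
    rows = [basis(x, knots) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    for i in range(2, p):  # intercept and linear slope are left unpenalized
        xtx[i][i] += lam
    return solve(xtx, xty)

def predict(beta, x, knots):
    return sum(b * v for b, v in zip(beta, basis(x, knots)))

# Simulated data: a U-shaped age gap with its minimum at 7 hours of sleep.
rng = random.Random(0)
sleep = [rng.uniform(4.0, 10.0) for _ in range(400)]
bag = [0.5 * (h - 7.0) ** 2 + rng.gauss(0.0, 0.5) for h in sleep]
sleep += [0.0, 24.0]  # two implausible reports that the QC step should drop
bag += [0.0, 0.0]

# Basic quality control: keep only plausible sleep durations.
kept = [(h, g) for h, g in zip(sleep, bag) if MIN_HOURS <= h <= MAX_HOURS]
xs, ys = [h for h, _ in kept], [g for _, g in kept]

knots = [5.0, 6.0, 7.0, 8.0, 9.0]
beta = fit_penalized_spline(xs, ys, knots, lam=1.0)

# Locate the lowest point of the fitted curve on a 0.1-hour grid.
grid = [4.0 + 0.1 * i for i in range(61)]
fitted = [predict(beta, g, knots) for g in grid]
nadir = grid[fitted.index(min(fitted))]
print(f"kept {len(kept)} of {len(sleep)} records; fitted minimum near {nadir:.1f} h")
```

A larger `lam` pulls the fit toward a straight line, which is how the penalty limits overfitting; choosing it by cross-validation or a likelihood criterion is left out of this sketch.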
Key Methodological Approaches
Ageing Clocks: Training and validation using nested cross-validation on healthy control populations, with splitting into training, validation, test, and hold-out test datasets.
Multi-Omics Integration: Proteins and metabolites were quality controlled, normalized, and annotated with the Human Protein Atlas.
Modeling Sleep Effects: GAMs allowed flexible nonlinear patterns including U-shaped relationships; tested main effects, sex differences, and sex-sleep interactions.
Effect Size Outcomes: Associations quantified while excluding outliers and controlling for confounders.
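The nested cross-validation scheme for the ageing clocks can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: a one-feature ridge regression stands in for the actual learners, the data are simulated, the separate hold-out test set is omitted, and every function name is hypothetical. The inner loop picks the penalty, the outer loop produces out-of-sample predictions, and the age gap is taken as predicted age minus chronological age.

```python
import random

def k_folds(indices, k):
    """Split a list of indices into k roughly equal folds (round-robin)."""
    return [indices[i::k] for i in range(k)]

def fit_ridge_1d(x, y, lam):
    """Closed-form ridge for y ~ a + b*x, penalizing only the slope b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / (sxx + lam)
    return my - b * mx, b

def mae(model, x, y):
    """Mean absolute error of the fitted line on (x, y)."""
    a, b = model
    return sum(abs(a + b * xi - yi) for xi, yi in zip(x, y)) / len(x)

def nested_cv(x, y, lams, outer_k=5, inner_k=4, seed=0):
    """Return {sample index: age gap} from out-of-sample outer-fold predictions."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    outer = k_folds(idx, outer_k)
    gaps = {}
    for o, test in enumerate(outer):
        train = [i for f, fold in enumerate(outer) if f != o for i in fold]
        inner = k_folds(train, inner_k)

        def inner_score(lam):
            # Average validation error over the inner folds for this penalty.
            errs = []
            for v, val in enumerate(inner):
                fit_idx = [i for f, fold in enumerate(inner) if f != v for i in fold]
                model = fit_ridge_1d([x[i] for i in fit_idx],
                                     [y[i] for i in fit_idx], lam)
                errs.append(mae(model, [x[i] for i in val], [y[i] for i in val]))
            return sum(errs) / len(errs)

        best_lam = min(lams, key=inner_score)
        a, b = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], best_lam)
        for i in test:
            gaps[i] = (a + b * x[i]) - y[i]  # predicted age minus chronological age
    return gaps

# Simulated "healthy control" sample: one feature that tracks age with noise.
rng = random.Random(1)
age = [rng.uniform(45.0, 80.0) for _ in range(200)]
feature = [0.1 * a + rng.gauss(0.0, 0.5) for a in age]

gaps = nested_cv(feature, age, lams=[0.0, 1.0, 10.0])
mean_gap = sum(gaps.values()) / len(gaps)
print(f"{len(gaps)} out-of-sample age gaps, mean {mean_gap:.2f} years")
```

Because the penalty is selected only on inner folds, each outer test fold stays untouched until its final prediction, so the resulting gaps are not inflated by tuning on the same data.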
FinnGen
Dataset: >500,000 Finnish biobank samples.
GWAS Data: Summary statistics for 521 disease endpoints included after harmonization.
Method: No individual data used; analyses based on REGENIE-generated summary statistics with age, sex, PCs, and batch covariates.
Psychiatric Genomics Consortium (PGC)
Focus: Genetics of psychiatric disorders.
Data: 6 brain disease GWAS summary datasets included (e.g., schizophrenia, bipolar disorder, major depression).
Usage: Summary data only; quality controlled and harmonized.
TriNetX
Data Type: Real-world clinical data on >90 million patients from >70 healthcare organizations.
Purpose: Assess associations of sleep traits (insomnia, hypersomnia) with systemic diseases identified in UKBB.
Baltimore Longitudinal Study of Ageing (BLSA)
Goal: Track physiological and cognitive ageing.
Data: Brain MRI plus self-reported and actigraphy-based (wearable) sleep duration measures (n=385).
Replication: Used to replicate the U-shaped association between sleep duration and brain ageing observed in UKBB.
Multi-Ethnic Study of Atherosclerosis (MESA)
Participants: >6,000 diverse US adults.
Data Used: 573 participants with brain MRI and self-reported sleep duration.
Purpose: Replicate the U-shaped association between sleep duration and brain ageing observed in UKBB.
Summary of Major Analytical Techniques
GAMs: Flexible nonlinear models for exploring sleep-organ ageing associations. Used to identify U-shaped or other nonlinear trends.
Machine Learning for Ageing Clocks: Nested cross-validation, hyperparameter tuning, held-out test sets, algorithms including LASSO, SVR, elastic net, neural nets.
Covariate Controls: Age, sex, anthropometrics, blood pressure, assessment center, disease presence, organ-specific confounders.
Important URLs and References
UKBB ethics and data gateway: https://www.ukbiobank.ac.uk
Human Protein Atlas: https://www.proteinatlas.org/
FinnGen GWAS repository: https://www.finngen.fi/en/access_results
PGC GWAS data: https://pgc.unc.edu/for-researchers/download-results/
TriNetX platform: https://trinetx.com
Tags: biological age gaps and sleep duration, generalized additive models in aging research, integrative aging models, longitudinal aging data analysis, MRI-derived phenotypes in aging, multi-ethnic aging cohort studies, multi-omics aging biomarkers, multi-organ biological age clocks, proteomics and metabolomics aging markers, sleep and age-related disease risk, sleep patterns and aging, UK Biobank sleep data analysis



