Dataset Request

For more information on specific datasets, request access to the spreadsheet.

  • The VETSA Longitudinal Twin Study of Cognition and Aging
    The multi-disciplinary, longitudinal, genetically-informative VETSA study is better positioned than most others to identify early markers of Alzheimer’s Disease risk. Cognitive tests were given at age 18, when participants were inducted into the military during the Vietnam conflict. The same tests and more have been administered on several occasions since. Biomarker data and magnetic resonance imaging of the twins’ brains form part of the dataset. VETSA is a national sample, demographically and educationally similar to American men in their age range. Drs. Neale and Gillespie collaborate with the UCSD-based team led by William Kremen and Carol Franz, and with the Boston University team of Michael Lyons.
  • Juvenile Anxiety Study (JAS)
    The Twin Study of Negative Valence Emotional Constructs (aka Juvenile Anxiety Study, JAS) is a project funded under the NIMH RDoC initiative (R01MH098055, 2013-17) designed to examine a broad suite of putative endophenotypic measures for negative valence systems (NVS) and their relationship to early symptoms of internalizing disorders. Pre-adolescent twins (N=398 pairs, aged 9-14) were recruited through the Mid-Atlantic Twin Registry and completed various dimensional self-report measures along with cognitive, emotional, and psychophysiological laboratory paradigms designed to assess NVS function. Parents also completed surveys about their twins and themselves. A small subset of the twins (20 pairs) also participated in a pilot neuroimaging protocol. For complete details, see Carney et al., Twin Research and Human Genetics. 2016 Oct;19(5):456-64. Please contact the PI (J Hettema) or co-I (R Roberson-Nay) for more information.
  • The Adolescent Brain and Cognitive Development Study (ABCD)
    VCU is one of 21 sites in the ABCD consortium, which is currently collecting structural and functional neuroimaging data from 11,500 9-10 year-old children in the first wave of a planned 10-year longitudinal study. The project also assesses psychopathology, substance use and other health-related factors and outcomes. Drs. Neale and Bjork are co-principal investigators of the VCU site, which is one of four that are each collecting data from 200 pairs of twins and 150 non-twins. These data are scheduled for release to the scientific community, beginning in December 2017. The primary goals are to establish normative neurocognitive development in this age range, and to assess whether there are causal effects of substance use or other behaviors on brain development. See for further information.
  • Spit for Science: The VCU Student Survey
    Spit for Science: The VCU Student Survey is an effort led by researchers at VCU to create a unique university-wide research opportunity for VCU students.

    The scientific focus of the project is to understand why some people are more likely than others to develop problems associated with the use of alcohol, the use of other substances, and difficulties with emotional health. The project aims to understand how individual genetic predispositions come together with environmental factors to contribute to these outcomes. We know that 1 in 4 people over the age of 18 are affected by substance use or mental health problems. This project is an opportunity for students at VCU to work together with some of the leading researchers in the world in this area to try to understand and prevent these important and widespread problems. For more information about this study and a description of the dataset, visit the site.

  • Virginia 30,000 (VA 30K)
    The VA30k sample contains data from 14,763 twins, ascertained from two sources: public birth records in the Commonwealth of Virginia for twins born in Virginia between 1915 and 1971, and responses to a letter published in the newsletter of the American Association of Retired Persons (AARP, 9476 individuals). Twins were mailed a 16-page ‘Health and Lifestyles’ questionnaire (HLS), and were asked to supply names and addresses of their spouses, siblings, parents and children for the follow-up study of relatives of twins. Completed questionnaires were obtained from 69.8% of twins. The original twin questionnaire was modified slightly to provide two additional forms, one appropriate for parents of twins and another for spouses, children and siblings of twins. The response rate from relatives (44.7%) was much lower than that from twins. Of the complete sample of 28,492 individuals (from 8567 extended kinships), 58% were female. The HLS included demographics, health history, lifestyle, life events, relationships, personality, problems, and attitudes.
  • Australian 25,000 (OZ 25K)
    The OZ25k sample was ascertained through two cohorts of twins. The first cohort, recruited in 1980-82 from a sampling frame of 5,967 twin pairs aged 18 years or older (born 1893-1964) and enrolled in the Australian NHMRC Twin Registry (ANTR), included 3,808 pairs (64%) who completed HLS and were followed up with a second questionnaire in 1988-90 with responses from 2,708 complete pairs (81%). The second cohort, born 1964-71, recruited from ANTR in 1989, were mailed HLS in 1989-91 with responses from 3,769 individuals from 4,269 eligible pairs. Both cohorts were asked to provide names of relatives, who were mailed a modified HLS in 1989-91 and respectively 8601 (60%) and 2799 (56%) of relatives from cohorts 1 and 2 returned questionnaires (RR 56-65%). In total there were 21,256 respondents, of whom 20,945 had valid scores for smoking. The HLS included demographics, health history, lifestyle, life events, relationships, personality, problems, attitudes.
  • Virginia Adult Twin Studies of Psychiatric and Substance Use Disorders (VATSPSUD)
    VATSPSUD is the first population-based adult twin study of common psychiatric disorders. VATSPSUD focused initially on female twins (‘FF’ study) but was later extended to include male (‘MM’) and unlike-sex pairs (‘MF’) with a more extensive assessment of substance use and disorders. The first wave targeted 2352 women from 1176 FF pairs from the population-based Virginia Twin Registry (VTR), now MATR, of which 2162 (92.0%) were interviewed, yielding 1032 complete pairs. 2002 women completed wave 2, 1899 wave 3 and 1939 wave 4. The first wave of the MM/MF studies targeted 9418 eligible individuals from MM and MF twin pairs. 6092 males and 1720 females (N=6812, 72.4%) completed wave 1 interviews, 5621 wave 2 and 752 male pairs wave 3. Parental psychiatric assessments were also done for the FF part. The study includes measures of adult psychopathology, psychiatric and drug use disorders. Several of the principal disorders have been assessed on more than one occasion by structured interview and self report (and sometimes cotwin reports) using one-year prevalence or lifetime history. Personality measures, demographic variables and environmental risk factors were also included. Waves FF1-FF4 contain over 25k ‘person months’; MM1-MM2 45,838 and 38,051, OSDZ pairs 43,030 and 35,282. Parental interviews of twins in the FF study provide multiple independent assessments of a range of risk factors for psychopathology in twins.
  • Collaborative Study on the Genetics of Alcoholism (COGA)
    The Collaborative Study on the Genetics of Alcoholism (COGA) is a multi-site project, with the goal of identifying specific genes involved in the predisposition to alcohol dependence and related disorders. The COGA sample consists of large families densely affected with alcohol dependence, which were identified through inpatient or outpatient alcohol treatment programs. All individuals were administered the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) interview, which is a polydiagnostic instrument that assesses most major psychiatric disorders (Bucholz et al., 1994; Hesselbrock et al., 1999). More than 14,000 individuals have been assessed. Individuals from the most densely affected families also underwent an electrophysiological protocol, including EEG and a battery of auditory and visual evoked potentials, as well as a variety of additional phenotypic assessments including personality, alcohol craving & expectancies, neuropsychological tests, and IQ. COGA has used a variety of complementary strategies for gene identification, and has a variety of genotyped samples in which different types of genetic analyses are on-going. These include (1) a family-based linkage sample, which has microsatellite and SNP marker data on ~2282 individuals from 262 of the most densely affected, multiplex alcoholic families; (2) a case-control GWAS sample, consisting of 1235 alcohol dependent cases and 711 controls; and (3) a child-adolescent sample, consisting of ~3000 individuals between the ages of 12-25, who are being followed longitudinally, in order to understand the effects of susceptibility genes identified in the adult samples across development. Other sites (PIs): University of Connecticut (V. Hesselbrock); Indiana University (H.J. Edenberg, J. Nurnberger Jr., T. Foroud); University of Iowa (S. Kuperman, J. Kramer); SUNY Downstate (B. Porjesz); Washington University in St. Louis (L. Bierut, A. Goate, J. Rice, K. Bucholz); University of California at San Diego (M. Schuckit); Howard University (R. Taylor); Rutgers University (J. Tischfield); Southwest Foundation (L. Almasy). For more about this study and a description of the dataset please see NIAAA-COGA and Wiki-COGA.
  • Virginia Twin Studies of Adolescent Behavioral Development (VTSABD/YAFU/TSA)
    VTSABD (Eaves) and its young adult follow-ups (YAFU, Silberg & TSA, Miles) together form the first population-based, multi-wave, cohort-sequential twin study of adolescent psychopathology and its risk factors. It included Caucasian families of male and female MZ and DZ twins and their parents to assess the role of genes and environment in developmental trajectories of behavior and disorders from childhood to young adulthood, to identify major familial psychosocial risk factors, and to characterize their correlation and interaction with genetic risk. Adolescent male & female twins aged 8 through 16 were ascertained through Virginia schools. In the first wave, 1412 Caucasian families participated (2775 individual twins in 1384 complete pairs). Twins under age 18 were followed every 18 months up to 4 times. All twins were targeted for a young adult assessment. 1185 pairs have been followed up in YAFU at a median age of 21 years; 399 pairs (1084 individuals) in TSA at 25 years. VTSABD comprises 6282 face-to-face assessments of juveniles across waves and 3373 assessments of young adults. The twins’ parents completed psychiatric assessments (N=2470); parents and teachers acted as informants about adolescent twins. The study used a rich assessment battery (published or widely-used instruments, supplemented or modified for a longitudinal twin-family study) including dimensional and categorical measures, multiple raters, and environmental indices. The data comprise psychiatric and substance use assessment of twins as adolescents and young adults, and parents, with the principal psychosocial and environmental risk factors. Prospective ratings were secured about each child by direct assessment of the child and by reports of parents and teachers. The core juvenile assessment comprised face-to-face interviews using the Child and Adolescent Psychiatric Assessment (CAPA) adapted for use with twins and their parents, which yields symptomatology relevant to common areas of childhood and adolescent behavior and disorder nuanced with onset, frequency, incapacity, treatment, and context for each clinical domain. Other personality, behavioral, cognitive and environmental assessments were done. VTSABD used a SCID-based assessment of psychopathology in young adult twins (YAFU) and their parents. The Life Experiences Interview, a structured interview adapted from the Early Experiences Interview, was developed to study environmental factors of drug abuse for TSA.
  • The Mid-Atlantic School Age Twin Study (MASATS)
    Subjects were enrolled from the population-based Mid-Atlantic Twin Registry (MATR), which combines the Virginia, North Carolina, and South Carolina Twin Registries. Questionnaires were mailed to NC and VA mothers of twins and their 11 to 18-year-old adolescent twins. The pilot sample consisted of 656 mothers and 448 adolescent twins. MASATS comprises 1127 adolescent twins (357 twin pairs with known zygosity), a 65% response rate. Questionnaires include measures of risk and protective factors for adolescent externalizing and internalizing problems based on published instruments, and measures of substance use based on the Monitoring the Future Study and the Drug Use Screening Inventory.
  • National Study of Adolescent Health (AddHealth)
    Add Health is a nationally representative study of the causes of health-related behaviors of adolescents in grades 7-12 and their outcomes in young adulthood, and seeks to examine how social contexts (families, friends, peers, schools, neighborhoods, and communities) influence adolescents’ health and risk behaviors. Data at the individual, family, school, and community levels were collected in two waves between 1994 and 1996. In 2001 and 2002, Add Health respondents, 18 to 26 years old, were re-interviewed in a third wave to investigate the influence that adolescence has on young adulthood. Questionnaires covered all areas of adolescent health, including substance use.
  • The Medical College of Virginia Twin Study (CVT)
    CVT is a longitudinal population-based twin study of genetic epidemiology of risk for hypertension. Ascertainment: Adolescent twins were identified through over 75 elementary schools within a 150 mile radius. Twins and their parents visited MCV the first time around the twins’ 11th birthday and were followed up every 18 months until age 17. In 1988, a new cohort of 9.5 year old twins entered the project, most of whom were followed up until age 14. The majority of twins are Caucasian (N=459); 19% of families were African-American (N=111). 95% of mothers and 70% of fathers participated at least once. CVT data comprise 1835 person visits. The study protocol included anthropometric and cardiovascular measures and questionnaires. Heart rate, blood pressure and echocardiographic dimensions were measured at rest, and reactivity measures to physical and mental stress were obtained. Health and family history, demographics, puberty, personality and behavioral characteristics, such as type A behavior and substance use, were assessed by questionnaire. Blood samples were taken for zygosity. Overlap VTSABD-CVT: Although the two projects ascertained twins independently, both were conducted at the same time in Virginia. Consequently, a sample of 174 twin pairs participated in both projects, often on multiple occasions. Of these families, 157 have participated in at least 5 combined visits, resulting in 1133 twin pair visits.
  • The Leuven Longitudinal Twin Study (LLTS)
    LLTS is a longitudinal population-based twin study of genetic epidemiology of physical fitness. Ascertainment: Adolescent twins were ascertained from the population-based East Flanders Prospective Twin Study (EFPTS) with zygosity diagnosis at birth. Twins and parents visited for the first time around the twins’ 10th birthday, with follow-up visits of the twins scheduled every 6 months until age 16 and an additional visit at age 18. A total of 115 twin families from five birth cohorts (1976-1981) participated, with approximately equal numbers of twins in each zygosity by sex group. Dropout rates were small. LLTS data comprise 1380 person visits. The LLTS protocol was very similar to the CVT protocol except for physical fitness. Included were 27 body measures, somatotype, skeletal maturity, cardiovascular evaluation, and a physical fitness test battery (nine Eurofit motor tests and VO2max). Subjects filled out questionnaires on their health, family history, sports participation, physical activity, SES, and substance use.
  • The Finnish Twin Studies
    FinnTwin16 (FT16) and FinnTwin12 (FT12) are two population-based twin studies aimed at understanding how genetic and environmental influences impact the development of alcohol use and related behaviors across adolescence and into young adulthood. All twins were identified through Finland’s Central Population Registry, permitting exhaustive and unbiased ascertainment of all twins born in the country across 10 birth cohorts, for a total of ~10,000 twins and their families. FT16 has questionnaire assessments at ages 16, 17, 18.5, and in the mid-20s. These questionnaires contain items on alcohol use, smoking, other drug use, personality, and related health habits and environmental factors. A subset of the twins highly concordant or discordant for alcohol use in adolescence (~600 twins) also completed psychiatric interviews, DNA collection, electrophysiological measures, and neuropsychological testing at the mid-20s assessment. FT12 first assessed children at age 12, with follow-ups at ages 14 and 17 and in the early 20s. In FT12, we have rich data from the twins, parents, teachers, and peers. A subset (~1850 twins and their parents) also completed psychiatric interviews at ages 14 and 22. GWAS data for this subset are also available. Other PIs: Richard Rose, Indiana University; Jaakko Kaprio, University of Helsinki.
  • The Child Development Project (CDP)
    The Child Development Project (CDP) is a community-based sample, in which children were recruited during kindergarten pre-registration from a variety of schools that served families from a range of socioeconomic status groups in three US cities. The original CDP sample consisted of 585 children (52% male; 81% European American, 17% African American, and 2% other ethnic groups). Data collection began the summer before the participants entered kindergarten (at ~age 5) and follow-ups have been conducted annually and remain on-going (participants are currently entering their 30s). DNA was collected from 93% of the target sample of regular CDP participants and is stored in the VIPBG molecular genetics laboratory of Dr. Riley. The project’s guiding model of developmental process is that children’s biological dispositions, cultural contexts, life experiences, and characteristic social cognitions transactionally combine to influence a variety of behavioral outcomes. The rich, longitudinal assessments of the CDP offer special advantages for advancing understanding of genetic mechanisms in behavioral development. The CDP’s database contains multi-source, multi-method measures of multiple levels of social context and process, including family, school, peer, neighborhood, and child characteristics. It also contains a wide range of adjustment measures, including externalizing and internalizing behavior problems, substance use, academic, occupational and military achievement, romantic relationships, and religious and civic involvement. The CDP database provides an unusually dense array of theoretically important phenotypic measures over a long span of development. This richness of phenotypes makes the genetic information available in the sample particularly valuable for studying how genes impact developmental pathways. Other PIs: Ken Dodge & Jen Lansford (Duke); Jack Bates (Indiana University); Greg Petit (Auburn).
  • The Mobile Youth Survey (MYS)
    The MYS is designed to identify the life course trajectories of adolescents (aged 10-18) living in poverty in the Mobile-Prichard inner city area of Alabama. The MYS has been administered in a group format for the past eight summers and examines a number of psychosocial variables including risk behaviors (e.g., violence and aggressive behavior, alcohol and drug use, sexual behavior), family factors (e.g., family structure and parental monitoring), and neighborhood factors (e.g., support from neighborhood). During the past six years nearly 6,000 different adolescents have been surveyed. In 2008, we obtained a grant to collect DNA and more extensive phenotypic measures on a subsample of ~700 MYS participants ages 14-18. These youth were also part of a ‘natural experiment’ in which a random subset of families were relocated to better housing made possible by a government grant. The MYS sample is a unique resource for extending our understanding of the risks associated with identified genes, both because it is a largely African-American sample, an under-represented population in genetic studies, and because it is an impoverished sample, making it possible to study how extreme environmental conditions, such as poverty, may alter the importance/expression of individual genetic predispositions and/or the role of other important environmental factors, such as family and peer variables. Other PIs: Brian Mustanski (Univ of Illinois, Chicago), John Bolland (Univ of Alabama).
  • Avon Longitudinal Study of Parents and Children (ALSPAC)
    ALSPAC has followed a large epidemiological cohort of ~14,000 children (with DNA on 10,000) and their parents from early in the mother’s pregnancy through childhood and adolescence, with subjects now commencing evaluations at age 17. The project has collected comprehensive health-related information, including phenotypic outcomes, environmental factors, and DNA, with >85 assessments from mothers, their partners, and children, conducted from the pre-natal stage through age 17 at yearly, or more frequent, intervals. We are funded to collect alcohol-related information at the age 18 and 22 year assessments, assessing key constructs such as quantity/frequency/density of alcohol use and AUD symptoms, other drug use and antisocial behavior, as well as related constructs of importance, including peer group deviance, pro-social behaviors, other risk-taking behaviors, religious involvement, parent-child relations and monitoring (if respondent is still living at home), attitudes to and expectancies from alcohol use as well as measures of major life transitions such as moving out of the childhood home, work experience, attendance at university, romantic relationships/marriage, and child-bearing. Accordingly, with its detailed and frequent phenotypic assessments, the ALSPAC cohort provides a unique opportunity to clarify, in a developmental context, the complex web of susceptibility and protective factors for alcohol use and the development of alcohol-related problems.
  • Alcohol Dependence GWAS
    706 related cases and 1755 controls, Affymetrix V6.0, BEAGLECALL genotypes, imputed to 1000 Genomes.
  • Schizophrenia GWAS I
    1606 cases and 1794 controls, Affymetrix V6.0, BEAGLECALL genotypes, imputed to 1000 Genomes.
  • Schizophrenia GWAS II
    843 individuals from 237 multiplex pedigrees, Illumina 610 QUAD, imputed to 1000 Genomes.
  • The Mood and Immune Regulation In Twins Study (MIRT)
    The MIRT study is a longitudinal pilot study of the genetic and environmental factors related to depression, inflammation, and diabetes risk in mid-life. The sample consists of monozygotic twins discordant for history of depression who do not currently have diabetes, recruited from the Mid-Atlantic Twin Registry. Overall, 43 complete pairs aged 40-70 were interviewed. Participants underwent a complete clinical examination (blood pressure, adiposity, and venipuncture to assess glucose, HbA1c, insulin and pro- and anti-inflammatory biomarkers, as well as mRNA to assess expression of immune-related genes) at the VCU Clinical Research Center (CRC) and completed in-person interviews (~90 minutes long) which assessed family history, personality, lifetime history of major depression using the Diagnostic Interview Schedule, stressful life events, coping, health behaviors, social relationships, and early life adversity. All of these procedures were repeated 6 months later, making this both a within-person and between-person/within-pair design. For more information please contact the MIRT Study PI, Dr. Briana Mezuk.
  • Brisbane Longitudinal Twin Study 19 & Up Project (BLTS-19UP)
    The BLTS-19UP project in Australia was funded by the US National Institute on Drug Abuse, and the Australian National Health and Medical Research Council as part of Dr. Nathan Gillespie’s R00. The aim was to make a significant contribution to the discovery of quantitative trait loci influencing cannabis use disorders. In addition to cannabis use and misuse in young adults, measures of comorbid licit and illicit substance use and substance use disorders were also collected, as well as a variety of internalizing and externalizing disorders, health, and lifestyle measures. Complete data including QC’d imputed GWAS data are available from ~2,900 subjects. For full details see: Gillespie et al. ‘The Brisbane Longitudinal Twin Study: Pathways to Cannabis Use, Abuse, and Dependence project-current status, preliminary results, and future directions.’ Twin Res Hum Genet. 2013; 16(1): 21-33. doi: 10.1017/thg.2012.111.
  • Schizophrenia Whole Genome Sequencing
    Currently funded with all agreements in place to sequence 600-1200 members of the schizophrenia high density pedigrees, 200 singleton schizophrenia cases and 2000 controls, all from Ireland.
  • Irish Study of High Density Schizophrenia Families (ISHDSF)
    PsychChip data
  • UK 10K
    10x whole genome sequencing from (currently) 3781 UK controls. A useful dataset for additional controls.
  • UK Biobank
    GWAS and imputation for 500k normal population individuals, various phenotypes available.