Available Data: Primary and Secondary
- Swiss Hepatitis C Cohort Study (SCCS): 2000-2006 (primary, cohort)
The Swiss Hepatitis C Cohort Study (SCCS) is a joint effort between the Swiss Group of Experts in Viral Hepatitis and the Swiss Association for the Study of the Liver. The SCCS was established partly because large population-based cohort studies are the only way to confirm or refute working hypotheses on the natural course of chronic hepatitis C and on hepatitis C virus pathology, and partly because a similar collaborative effort among specialized treatment centers had already been successfully established for a human immunodeficiency virus cohort in Switzerland.
- Nationwide Emergency Department Sample (NEDS): 2006-2013 (secondary, cross-sectional survey) – Codebook
The Nationwide Emergency Department Sample (NEDS) is part of a family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). The NEDS is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-based ED visits. Unweighted, it contains data from approximately 30 million discharges each year. Weighted, it estimates roughly 135 million ED visits.
- National Longitudinal Mortality Study (NLMS): 1973-2013 (secondary) – Reference Manual
The National Longitudinal Mortality Study (NLMS) is a national, longitudinal, mortality study sponsored by the National Cancer Institute, the National Heart, Lung, and Blood Institute, the National Institute on Aging, the National Center for Health Statistics and the U.S. Census Bureau for the purpose of studying the effects of differentials in demographic and socio-economic characteristics on mortality.
- Surveillance, Epidemiology, and End Results (SEER): 1973-2012 (secondary, cohort) – Codebook
The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute works to provide information on cancer statistics in an effort to reduce the burden of cancer among the U.S. population. For an in-depth look, read their program overview, or take a look at their fact sheets & brochures.
- National Violent Death Reporting System (NVDRS): 2003-2013 (secondary, cross-sectional survey) – Codebook
The National Violent Death Reporting System (NVDRS) provides states and communities with a clearer understanding of violent deaths to guide local decisions about efforts to prevent violence and track progress over time. NVDRS is the only state-based surveillance (reporting) system that pools data on violent deaths from multiple sources into a usable, anonymous database. These sources include state and local medical examiner, coroner, law enforcement, crime lab, and vital statistics records.
- World Trade Center Health Registry: 2003, 2004, 2006, 2007, 2008, 2011, 2012 (secondary, cohort) – wave1-codebook wave2-codebook wave3-codebook
The Agency for Toxic Substances and Disease Registry and the New York City Health Department established the World Trade Center (WTC) Health Registry in 2002, with the goal of monitoring the health of people directly exposed to the WTC disaster. Today, the Registry is an ongoing collaboration with the National Institute for Occupational Safety and Health. The WTC Health Registry will periodically follow up with enrollees over the next 20 years to track changes in physical and mental health.
- Gun ownership (primary, cross-sectional survey) – codebook
The YouGov data set contains 4000 respondents who were identified as a nationally representative cohort. YouGov invited 11,471 potential participants, of whom 5392 (47.0%) started the survey and 4622 (40.3%) completed it. Using the 4622 participants, propensity score matching to the 2010 American Community Survey sample, with selection within strata by weighted sampling with replacement, was performed to obtain a nationally representative population. Out of the 4622 respondents, 4000 were matched and identified to be nationally representative.
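The percentages quoted for the gun ownership survey follow directly from the counts. As a quick check, here is a minimal Python sketch; the variable and function names are illustrative and not taken from any YouGov documentation, and the matched share is derived here rather than stated in the description.

```python
# Counts taken from the survey description; names are illustrative only.
invited = 11_471
started = 5_392
completed = 4_622
matched = 4_000


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


start_rate = pct(started, invited)         # share of invitees who started
completion_rate = pct(completed, invited)  # share of invitees who completed
match_rate = pct(matched, completed)       # share of completers kept after matching

print(f"Started:   {start_rate}%")
print(f"Completed: {completion_rate}%")
print(f"Matched:   {match_rate}%")
```

Both rates in the description use the 11,471 invitees as the denominator. The matched share uses the 4622 completers as its denominator, so it is not comparable to the other two figures.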
- National Longitudinal Study of Adolescent to Adult Health: 1994, 1995, 1996, 2001, 2002, 2008, 2009 (secondary, cohort) – Codebook
The National Longitudinal Study of Adolescent to Adult Health has collected data of interest to investigators from many disciplines in the social and behavioral sciences and from many theoretical traditions. Data are available for study from four instruments in Wave I (conducted from September 1994 through December 1995), two surveys in Wave II (conducted from April 1996 through August 1996), several sources in Wave III (collected from August 2001 through April 2002), and one in-home interview in Wave IV (conducted from January 2008 through February 2009).
- United Network for Organ Sharing (UNOS): 1988-2015 (secondary, cohort)
United Network for Organ Sharing (UNOS) is the private, non-profit organization that manages the nation’s organ transplant system under contract with the federal government. This system contains data regarding every organ donation and transplant event occurring in the United States since October 1, 1987. Transplant professionals use it to register transplant candidates on the national waiting list, match them with donated organs, and enter vital medical data on candidates, donors and transplant recipients.
- School shooting cohort (data collection in progress)
- National Health and Nutrition Examination Survey (NHANES): 1999-2000, 2001-2002, 2003-2004, 2005-2006, 2007-2008, 2009-2010, 2011-2012, 2013-2014, 2015-2016 (secondary, cohort)
The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations.
- National Hospital Discharge Survey (NHDS): 1965-2010 (secondary, cross-sectional survey) – Study design
The National Hospital Discharge Survey (NHDS), which was conducted annually from 1965 to 2010, was a national probability survey designed to meet the need for information on characteristics of inpatients discharged from non-Federal short-stay hospitals in the United States. Data from the NHDS are available annually and are used to examine important topics of interest in public health and for a variety of activities by governmental, scientific, academic, and commercial institutions.
- General Social Survey (GSS): 1972-2014 (secondary, cross-sectional survey) – Codebook
Since 1972, the General Social Survey (GSS) has provided politicians, policymakers, and scholars with a clear and unbiased perspective on what Americans think and feel about such issues as national spending priorities, crime and punishment, intergroup relations, and confidence in institutions.
- New Immigrant Survey (NIS): 2003 (secondary, cohort) – Codebook
The New Immigrant Survey (NIS) is a multi-cohort prospective-retrospective panel study of new legal immigrants to the United States. The first full cohort (NIS-2003-1) sampled immigrants in the period May-November 2003. The baseline survey was conducted from June 2003 to June 2004. A survey pilot project (NIS-P) was carried out in 1996 to inform the fielding and design of the full NIS. The follow-up interview (NIS-2003-2) was conducted from June 2007 to December 2009.
- Framingham Heart Study: 1948-present (secondary, cohort)
Since its beginning in 1948, the Framingham Heart Study, under the direction of the National Heart, Lung and Blood Institute (NHLBI), formerly known as the National Heart Institute, has been committed to identifying the common factors or characteristics that contribute to cardiovascular disease (CVD). It has followed CVD development over a long period of time in three generations of participants.
- MARKETSCAN: 1995-present (primary, cohort)
The Truven Health Analytics MarketScan® Research Databases contain real-world data for healthcare research and analytics. They provide access to fully integrated, de-identified, individual-level healthcare claims data that can be used to examine health economics and treatment outcomes. They are the gold standard in proprietary U.S. healthcare databases.
- OPTUM (cohorts) – in partnership with OPTUM
Optum is a health services and innovation company on a mission. They are 94,000 people dedicated to improving the health system for everyone in it. They power modern health care by combining data and analytics with technology and expertise. They focus on three key areas of change: modernizing the system’s infrastructure, advancing care and supporting people as they take control of their own health.
- Youth Risk Behavior Surveillance System (YRBSS): 1991, 1993, 1995, 1997, 1999, 2001, 2003, 2005, 2007, 2009, 2011, 2013 – Codebook
The Youth Risk Behavior Surveillance System (YRBSS) monitors six types of health-risk behaviors that contribute to the leading causes of death and disability among youth and adults: (1) behaviors that contribute to unintentional injuries and violence; (2) sexual behaviors that contribute to unintended pregnancy and sexually transmitted diseases, including HIV infection; (3) alcohol and other drug use; (4) tobacco use; (5) unhealthy dietary behavior; and (6) inadequate physical activity.
- National (Nationwide) Inpatient Sample (NIS): 1988-2013 (secondary, cross-sectional survey) – Codebook
The National (Nationwide) Inpatient Sample (NIS) is part of a family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). The NIS is the largest publicly available all-payer inpatient health care database in the United States, yielding national estimates of hospital inpatient stays. Unweighted, it contains data from more than 7 million hospital stays each year. Weighted, it estimates more than 35 million hospitalizations nationally.
- Firearm legislation, 2009 – data and codebook
This is a data set of state-specific firearm legislation in 2009. Firearms are ubiquitous in the US, and high rates of firearm ownership have been directly associated with increased risk of firearm-related mortality. Firearm violence prevention has had limited success at the federal level through the Brady Handgun Violence Prevention Act (Pub.L. 103–159, 107 Stat. 1536, enacted November 30, 1993, effective February 28, 1994), commonly called the Brady Law. The Brady Law requires background checks to be conducted on individuals before a firearm may be purchased from a federally licensed dealer, manufacturer or importer, unless an exception applies. However, loopholes in this statute allow unfettered sales by unlicensed dealers. To offset the limitations of the Brady Law, several states have instituted separate laws intended to fill these gaps. States have also implemented firearm laws in an effort to reduce children's access to firearms and to regulate firearm storage practices. Conversely, many states have enacted laws that further deregulate the carrying of firearms through “Stand Your Ground” laws. These state regulations have been implemented either as amendments to an existing firearm law or as separate legislation.
- Wisconsin Longitudinal Study (WLS): 1957-2011 (primary, cohort)
The Wisconsin Longitudinal Study (WLS) is a long-term study of a random sample of 10,317 men and women who graduated from Wisconsin high schools in 1957. The WLS provides an opportunity to study the life course, intergenerational transfers and relationships, family functioning, physical and mental health and well-being, and morbidity and mortality from late adolescence through 2011. WLS data also cover social background, youthful aspirations, schooling, military service, labor market experiences, family characteristics and events, social participation, psychological characteristics and retirement.
- Nationwide Readmissions Database (NRD): 2013 (secondary, cohort) – Codebook
The Nationwide Readmissions Database (NRD) is part of a family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). The NRD is a unique and powerful database designed to support various types of analyses of national readmission rates for all payers and the uninsured. This database addresses a large gap in health care data – the lack of nationally representative information on hospital readmissions for all ages. Unweighted, the NRD contains data from approximately 14 million discharges each year. Weighted, it estimates roughly 36 million discharges.
- National Health Interview Survey (NHIS): 1963-2014 (secondary, cohort) – Codebook
The National Health Interview Survey (NHIS) is the principal source of information on the health of the civilian noninstitutionalized population of the United States and is one of the major data collection programs of the National Center for Health Statistics (NCHS), which is part of the Centers for Disease Control and Prevention (CDC). The National Health Survey Act of 1956 provided for a continuing survey and special studies to secure accurate and current statistical information on the amount, distribution, and effects of illness and disability in the United States and the services rendered for or because of such conditions. The survey referred to in the Act, now called the National Health Interview Survey, was initiated in July 1957. Since 1960, the survey has been conducted by NCHS, which was formed when the National Health Survey and the National Vital Statistics Division were combined.
- State Inpatient Databases (SID): 2003-2011 (secondary, cohort) – Codebook
The State Inpatient Databases (SID) are part of the family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). Each SID includes inpatient discharge records from community hospitals in a given State. The SID files encompass all patients, regardless of payer, providing a unique view of inpatient care in a defined market or State over time. This data set contains records in California from 2003 to 2011, and in New York from 1990 to 2013.
- State Emergency Department Databases (SEDD): 2005-2011 (secondary, cohort) – Codebook
The State Emergency Department Databases (SEDD) are part of the family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). The SEDD capture emergency visits at hospital-affiliated emergency departments (EDs) that do not result in hospitalization. Information about patients initially seen in the ED and then admitted to the hospital is included in the State Inpatient Databases (SID). The SEDD files include all patients, regardless of payer, providing a unique view of ED care in a State or in a defined market over time. This data set contains records in California from 2005 to 2011, and in New York from 2007 to 2013.
- State Ambulatory Surgery and Services Databases (SASD): 2005-2011 (secondary, cohort) – Codebook
The State Ambulatory Surgery and Services Databases (SASD) are part of the family of databases and software tools developed for the Healthcare Cost and Utilization Project (HCUP). The SASD include encounter-level data for ambulatory surgeries and may also include various types of outpatient services such as observation stays, lithotripsy, radiation therapy, imaging, chemotherapy, and labor and delivery. The specific types of ambulatory surgery and outpatient services included in each SASD vary by State and data year. All SASD include data from hospital-owned ambulatory surgery facilities. In addition, some States include data from nonhospital-owned facilities. This data set contains records in California from 2005 to 2011, and in New York from 1997 to 2013.
- One Million Medicare patients data: 2009-2012 (secondary, cohort)
The One Million Medicare patients data set is offered via TEC in collaboration with Amresh Hanchate. It is a nationally representative sample of 1 million enrollees aged 65+ in 2009, followed for 4 years (2009 to 2012) or until death (if within 4 years). It is stratified to oversample non-Whites (Blacks, Hispanics and Asians). Data are available on inpatient care, outpatient care and prescription medication (Part D) use. The data are covered by a data use agreement. Depending on the type of data request, we may be able to use these data for a specific project. Please use the data request form to specify your project.
- One Million Medicaid Cohort (inviting collaborations):
TEC is inviting collaborators to jointly fund and purchase Medicaid data on 1 million participants. We plan to obtain Medicaid data from ResDAC for different cohorts, for example an aortic stenosis cohort or a pedestrian motor vehicle accident cohort. You can also indicate procedure-specific cohorts such as surgical aortic valve replacement, transcatheter aortic valve replacement and bariatric surgery cohorts. Collaborators are invited to complete the form to indicate the cohort specifications and email it to ctecer@bu.edu. For further questions, please contact kalesan@bu.edu.
- Medical Expenditure Panel Survey (MEPS): 1996-2013 (secondary, cohort)
The Medical Expenditure Panel Survey (MEPS) is a set of large-scale surveys of families and individuals, their medical providers, and employers across the United States. MEPS is the most complete source of data on the cost and use of health care and health insurance coverage. Please note that the reference day information is unavailable for 2007 and from 2013 onwards.
- FAVORIT Trial: 2007-2011 (clinical trial) – Codebook
The Folic Acid for Vascular Outcome Reduction In Transplantation Study (FAVORIT) is a multicenter randomized controlled clinical trial designed to evaluate whether treatment with folic acid and vitamins B6 and B12 reduces cardiovascular disease in clinically stable renal transplant recipients with elevated total homocysteine levels. Thirty clinical sites (twenty-seven in the U.S., two in Canada, and one in Brazil) randomized 4,110 participants into the study. The trial was sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) of the National Institutes of Health. The study intervention was provided by Pamlab L.L.C. Data collection for the trial began in August 2002, baseline enrollment ended on January 31, 2007, and follow-up ended on June 24, 2011.
- Action to Control Cardiovascular Risk in Diabetes (ACCORD): 2001-present (clinical trial)
The ACCORD (Action to Control Cardiovascular Risk in Diabetes) study was a large clinical trial of adults with established type 2 diabetes who were at especially high risk of cardiovascular disease (CVD). The study began enrolling participants in 2001 and took place in 77 clinical sites across the United States and Canada. A total of 10,251 adults with established type 2 diabetes participated in ACCORD. At enrollment, study participants were between the ages of 40 and 79 (average age 62), had had diabetes for an average of 10 years, and were at especially high risk for CVD events because they already had pre-existing CVD, evidence of subclinical CVD, or at least two CVD risk factors in addition to type 2 diabetes.
- Atrial Fibrillation Follow-Up Investigation of Rhythm Management (AFFIRM): 1995-2002 (clinical trial)
AFFIRM was a randomized multicenter trial. It enrolled only patients with atrial fibrillation who were at high risk for stroke, that is, over 65 years of age, or younger than 65 with one or more other risk factors for stroke such as systemic hypertension, diabetes mellitus, congestive heart failure, transient ischemic attack, or prior cerebrovascular accident. High-risk patients were treated with the anticoagulant warfarin. Cardioversion (electrical or pharmacologic) might have been attempted before randomization, but if it was unsuccessful, the patient was excluded from further consideration for randomization. Normal sinus rhythm must have persisted for at least one hour after cardioversion for the cardioversion to qualify as successful. Patients were randomly assigned to one of two treatment groups: maintenance of sinus rhythm or heart rate control. Both treatment groups had two steps.
- Bypass Angioplasty Revascularization Investigation in Type 2 Diabetes (BARI 2D): 2000-2009 (clinical trial)
The BARI 2D trial is a multicenter study that uses a 2×2 factorial design, with 2400 patients being assigned at random, with equal probability, to initial elective revascularization with aggressive medical therapy or to aggressive medical therapy alone, and simultaneously being assigned at random to an insulin-providing or insulin-sensitizing strategy of glycemic control (with a target value for HbA1c of less than 7.0% for all patients).
- Atherosclerosis Risk in Communities (ARIC): 1987-present (cohort study)
- Jackson Heart Study (JHS): 2000-present (cohort study)
The Jackson Heart Study is a community-based cohort study of risk factors for cardiovascular disease (CVD) among adult African American men and women living in the Jackson, Mississippi metropolitan area. The JHS is a collaborative effort among three Jackson-area academic institutions, the University of Mississippi Medical Center (UMMC), Jackson State University (JSU), and Tougaloo College, and consists of five centers: a Field Center and a Coordinating Center (UMMC); a Community Outreach Center and a Graduate Training and Education Center (JSU); and an Undergraduate Training and Education Center (Tougaloo College). The JHS is supported by contracts from NHLBI and NIMHD.
- Functional Outcomes in Cardiovascular Patients Undergoing Surgical Hip Fracture Repair (FOCUS): 2003-2009 (clinical trial)
The FOCUS trial tested the hypothesis that a higher threshold for blood transfusion would improve functional recovery and reduce morbidity and mortality, as compared with a more restrictive transfusion strategy. Participants were randomly assigned to the liberal-strategy group or the restrictive-strategy group. Clinical-site staff members, clinicians, and patients were aware of study-group assignments. Patients in the liberal-strategy group received one unit of packed red cells and additional blood as needed to maintain a hemoglobin level of at least 10 g/dL. An assessment of the hemoglobin level after transfusion was required, and an additional unit of blood was transfused if the patient’s hemoglobin level was below 10 g/dL. Patients in the restrictive-strategy group were permitted to receive transfusions if symptoms or signs of anemia developed or at the discretion of their physicians if the hemoglobin level fell below 8 g/dL. Blood was administered one unit at a time, and the presence of symptoms or signs was reassessed.
- Multiple Risk Factor Intervention Trial for the Prevention of Coronary Heart Disease (MRFIT): 1946-1973 (clinical trial)
The purpose of the trial was to determine, for a group of men at high risk of death from coronary heart disease, whether a special intervention program to lower serum cholesterol, reduce blood pressure, and eliminate cigarette smoking would result in a significant reduction in mortality from coronary heart disease.
- Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT): 1994-2002 (clinical trial)
- Bogalusa Heart Study (BHS): 1972-present (cohort study)
This study is one of the longest ongoing studies of a biracial, semi-rural community in the South. Its focus is on understanding the impact of vascular and metabolic changes on health throughout the lifespan. Since 1972, the Bogalusa Heart Study has been conducting research on heart disease risk factors both in the community and in the laboratory and is now the flagship study for the Center for Lifespan Epidemiology Research at Tulane University.
- Cardiovascular Outcomes in Renal Atherosclerotic Lesions (CORAL): 2004-2013 (clinical trial)
The CORAL trial compared the incidence of cardiovascular and renal adverse events for medical therapy alone with medical therapy plus renal-artery stenting in patients with atherosclerotic renal-artery stenosis and elevated blood pressure, chronic kidney disease, or both.
- NHLBI Growth and Health Study (NGHS): 1985-2000 (clinical trial)
The objective of the NGHS was to investigate racial differences in dietary, physical activity, family, and psychosocial factors associated with the development of obesity from pre-adolescence through maturation between African-American and white girls. Secondarily, the NGHS sought to examine the effects of obesity on cardiovascular disease risk factors. The NGHS recruited girls 9 and 10 years of age in two communities (Richmond, California and Cincinnati, Ohio) and also from families enrolled in a health maintenance organization in the Washington, D.C. area. A total of 2,379 girls were enrolled in the study in 1987-88 and were followed for 9 years. Slightly more than half of the cohort was African-American. Subjects had annual examinations, and data collected included: physical examination, anthropometric measurements, dietary information including food pattern and nutrient intake, physical activity, lipid, lipoprotein, and apolipoprotein profiles, family socioeconomic status, and psychosocial information.
- Optimal Macronutrient Intake Trial to Prevent Heart Disease (OMNI Heart): 2002-2008 (clinical trial)
The objective of this study was to compare the effects of 3 healthy diets, each with reduced saturated fat intake, on blood pressure and serum lipids. The trial included 164 total subjects recruited from 2 clinical centers: Johns Hopkins Medical Institutions (Baltimore, MD) and Brigham and Women’s Hospital (Boston, MA). Subjects were generally healthy adults, age ≥30 years, with a systolic blood pressure (SBP) of 120-159 mm Hg or a diastolic blood pressure (DBP) of 80-99 mm Hg. This range includes individuals with prehypertension (systolic, 120-139 mm Hg or diastolic, 80-89 mm Hg) and stage 1 hypertension (systolic, 140-159 mm Hg or diastolic, 90-99 mm Hg). Exclusion criteria included diabetes, prior or active CVD, LDL-cholesterol >220 mg/dL, fasting triglycerides >750 mg/dL, weight of more than 350 lb, taking medication to reduce blood pressure or lipid levels, unwillingness to stop taking vitamin and mineral supplements, and alcohol intake of more than 14 drinks per week. The mean age of the participants was 53.6 years, 45% were women, and 55% were African American.
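The OMNI Heart blood pressure ranges can be read as a simple classification rule. The Python sketch below is illustrative only and is not the trial's screening procedure: the function names are hypothetical, and it assumes that a reading is assigned the higher of its systolic and diastolic categories and that any reading of 160 mm Hg systolic or 100 mm Hg diastolic or more falls outside the trial range. The other exclusion criteria are not modeled.

```python
def bp_category(sbp: float, dbp: float) -> str:
    """Classify a reading using the ranges quoted in the description.

    Assumption (not stated in the source): the higher of the systolic
    and diastolic categories determines the result.
    """
    if sbp >= 160 or dbp >= 100:
        return "above trial range"
    if sbp >= 140 or dbp >= 90:
        return "stage 1 hypertension"
    if sbp >= 120 or dbp >= 80:
        return "prehypertension"
    return "below trial range"


def bp_in_trial_range(sbp: float, dbp: float) -> bool:
    """True if the reading is prehypertension or stage 1 hypertension."""
    return bp_category(sbp, dbp) in {"prehypertension", "stage 1 hypertension"}


# Example readings in mm Hg (made-up values for illustration).
for reading in [(118, 76), (125, 70), (130, 92), (165, 85)]:
    print(reading, bp_category(*reading), bp_in_trial_range(*reading))
```

Checking the thresholds from the highest category downward means a reading with, say, a prehypertensive systolic value and a stage 1 diastolic value is reported as stage 1 hypertension.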
- Practice Based Opportunities for Weight Reduction Trial at the University of Pennsylvania (POWER-UP): 2008-2011 (clinical trial)
The POWER-UP trial tested the effectiveness of three primary care practice behavioral interventions in reducing weight. The primary aim of the study was to show that both brief and enhanced brief lifestyle counseling would result in significantly greater weight loss at 24 months than would usual care. A total of 390 participants were recruited and treated at six primary care practices owned by Penn Medicine. Eligibility criteria included an age of 21 years or older, a body-mass index of 30 to 50, and at least two of five components of the metabolic syndrome, to increase the likelihood that the participants would have cardiovascular risk factors.
- Systolic Blood Pressure Intervention Trial Primary Outcome Paper (SPRINT-POP): 2010-2016 (clinical trial)
The Systolic Blood Pressure Intervention Trial (SPRINT) was conducted to test the hypothesis that treating systolic blood pressure to a goal lower than what is currently recommended would reduce the incidence of cardiovascular disease. Eligible participants were adults 50 years of age or older with a systolic blood pressure of 130 to 180 mm Hg and an increased risk of cardiovascular disease but without diabetes or a history of stroke. Increased cardiovascular risk was defined by one or more of the following: clinical or subclinical cardiovascular disease other than stroke; chronic kidney disease, excluding polycystic kidney disease, with an estimated glomerular filtration rate (eGFR) of 20 to less than 60 ml per minute per 1.73 m² of body surface area as calculated by the four-variable Modification of Diet in Renal Disease equation; a 10-year risk of cardiovascular disease of 15% or greater on the basis of the Framingham risk score; or an age of 75 years or older.
- Primary Graft Dysfunction: 2011-2014 (cohort study)
The data set contains donor and recipient characteristics in orthotopic heart transplantation, as well as intraoperative events and clinical outcomes.
- American Community Survey (ACS): 2005-2015 (cross-sectional)
The American Community Survey (ACS) is an ongoing survey that provides vital information on a yearly basis about our nation and its people. Information from the survey generates data that help determine how more than $400 billion in federal and state funds are distributed each year.
- NVSS (Mortality): 1959-2015 (cross-sectional)
Mortality data from the National Vital Statistics System (NVSS) are a fundamental source of demographic, geographic, and cause-of-death information. This is one of the few sources of health-related data that are comparable for small geographic areas and are available for a long time period in the United States. The data are also used to present the characteristics of those dying in the United States, to determine life expectancy, and to compare mortality trends with other countries.