Saturday, December 3, 2016

Assignment 1: Getting Started

Hi Everyone!  In this first assignment we are asked to select a data set, develop a research question, perform a review of relevant publications, and finally create a hypothesis.


The Data

I have decided to select a data set available outside of the course: the "natality" data set made available in Google's BigQuery sample tables.

The natality data set describes all United States births registered in the 50 States, the District of Columbia, and New York City from 1969 to 2008. This data is obtained from the CDC's Division of Vital Statistics, which actually provides data sets from 1968 through 2015 via this page. (CDC WONDER data access portals also provide access to this data on this page for years 1995 - 2014.)


The Question

So, when I initially saw the variables of this data set on the BigQuery schema page, I thought to myself, "interesting, they collect Apgar scores!"  Having just had a baby last December, I knew that Apgar scores are obtained at 1 minute past delivery time and again at 5 minutes past delivery time, to provide an assessment of newborn health based upon infant activity, pulse, grimace reflex, appearance, and respiration.
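To make the scoring concrete, here is a minimal Python sketch of how one Apgar assessment adds up. It assumes the standard convention that each of the five components is scored 0, 1, or 2, so the total ranges from 0 to 10. The class name `ApgarAssessment` and the example values are my own illustration, not part of the natality data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApgarAssessment:
    """One Apgar assessment; each component is assumed to be scored 0, 1, or 2."""

    activity: int
    pulse: int
    grimace: int
    appearance: int
    respiration: int

    def total(self) -> int:
        """Return the overall score (0-10) after validating each component."""
        components = (
            self.activity,
            self.pulse,
            self.grimace,
            self.appearance,
            self.respiration,
        )
        for score in components:
            if score not in (0, 1, 2):
                raise ValueError(f"component score must be 0, 1, or 2, got {score}")
        return sum(components)


# Illustrative (made-up) assessments at 1 minute and 5 minutes after delivery.
one_minute = ApgarAssessment(activity=1, pulse=2, grimace=2, appearance=1, respiration=2)
five_minute = ApgarAssessment(activity=2, pulse=2, grimace=2, appearance=2, respiration=2)
print(one_minute.total(), five_minute.total())  # prints: 8 10
```

The natality table only stores the totals at 1 and 5 minutes, so the per-component breakdown above is just for understanding what those totals mean.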

Though Apgar scores are only recorded from 1978 - 2002 in the Vital Statistics database, I am interested in determining whether there is an association between Apgar scores and the variables that pertain to maternal health and well-being, including cigarette smoking, drinking, and weight gain during pregnancy.
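As a rough sketch of how these variables might be pulled out of BigQuery, the Python below just builds a SQL string; it does not run the query. The table path and the column names (`apgar_1min`, `apgar_5min`, `cigarette_use`, `cigarettes_per_day`, `alcohol_use`, `drinks_per_week`, `weight_gain_pounds`) are my assumptions about the sample table's schema and should be checked against the schema page. The function name `build_natality_query` is hypothetical.

```python
from typing import Optional

# Assumed table path and column names; verify against the BigQuery schema page.
TABLE = "bigquery-public-data.samples.natality"
OUTCOME_COLUMNS = ("apgar_1min", "apgar_5min")
MATERNAL_COLUMNS = (
    "cigarette_use",
    "cigarettes_per_day",
    "alcohol_use",
    "drinks_per_week",
    "weight_gain_pounds",
)


def build_natality_query(
    first_year: int = 1978, last_year: int = 2002, limit: Optional[int] = None
) -> str:
    """Build a SQL string selecting Apgar scores and maternal health variables.

    The default year range matches the years in which Apgar scores are recorded.
    """
    columns = ",\n  ".join(("year",) + OUTCOME_COLUMNS + MATERNAL_COLUMNS)
    query = (
        f"SELECT\n  {columns}\n"
        f"FROM `{TABLE}`\n"
        f"WHERE year BETWEEN {int(first_year)} AND {int(last_year)}\n"
        # Keep only rows with a valid 0-10 score; this also drops any
        # "unknown" sentinel values the table may use.
        f"  AND apgar_5min BETWEEN 0 AND 10"
    )
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


print(build_natality_query(limit=1000))
```

The resulting string could then be pasted into the BigQuery console or handed to a BigQuery client library.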


Literature Review

Honestly, I would have "guessed" that there would be some type of correlation between maternal health and infant Apgar scores.  However, when I searched for articles comparing Apgar scores with cigarette smoking, alcohol consumption, and/or maternal weight gain, I found that the following articles present potentially conflicting information about the correlation between these factors and Apgar scores:

  • Maternal cigarette smoking, psychoactive substance use, and infant Apgar scores (1982)
    • "A study of 1,709 mother/child pairs at Boston City Hospital examined whether maternal cigarette smoking, drinking, or the use of other psychoactive substances was associated with low infant Apgar scores"...."None of the substance use variables was significantly associated with low infant Apgar scores at 1 and 5 minutes. Other labor and delivery risks, such as short length of gestation, abnormal delivery presentation, placental abnormalities, nuchal cord, and exposure to general anesthesia during delivery, were associated with low Apgar scores."
    • This article is based in Canada and examines "the relation between maternal smoking, alcohol consumption and drug dependence during pregnancy and early neonatal morbidity." 
    • "Markers of neonatal morbidity were Apgar scores (<7 at 5 minutes postpartum) and resuscitation measures (2001-2005, N=191,686), and neonatal intensive care unit (NICU) admissions (2002-2005, N=154,924)."
    • "The main findings of this analysis are that smoking, daily or high alcohol consumption and drug dependency during pregnancy contribute to early neonatal morbidity and that eliminating maternal smoking would prevent 10-15% of each of the three markers of neonatal morbidity. "
  • A Prospective Study of Smoking and Pregnancy (1970)
    • "A 50% increase of prematurity rate was registered among smoking women compared with non-smoking women" however "No effect of smoking on the mean Apgar score of surviving, non-malformed children was seen."
  • The Seattle longitudinal prospective study on alcohol and pregnancy. (1981)
    • "An unselected sample of 1529 women (predominantly white, married, and middle-class) were interviewed during pregnancy regarding their use of alcohol, nicotine, caffeine, drugs, and other variables. Subsets of offspring were examined to assess the relationship of self-reported maternal alcohol use to infant health and development." 
    • "The following are among those outcomes significantly related to increase maternal alcohol use after adjusting for smoking and other variables: smaller infant size (birth weight, length and head circumference); lower Apgar scores; poorer neonatal habituation; decreased sucking pressure; increased tremulousness and head-turns-to-left; decreased vigorous activity; and a higher frequency of minor dysmorphic characteristics combined with low birth weight and microcephaly."

My Hypothesis

I propose that, in the US data available from the natality statistics, prenatal cigarette smoking, alcohol consumption, and gestational weight gain will each show a statistically significant correlation with lower Apgar scores.
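One possible way to check a hypothesis like this, sketched in plain Python: compute a Pearson correlation between one maternal variable and the Apgar score, then estimate a p-value with a permutation test. This is only an illustration of the idea and not necessarily the method I will end up using. The function names are my own, and the numbers in the usage example are invented, not taken from the natality data.

```python
import random
from statistics import mean
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(
    xs: Sequence[float], ys: Sequence[float], n_permutations: int = 2000, seed: int = 0
) -> float:
    """Two-sided p-value: how often shuffled data is at least as correlated."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Invented toy values, for illustration only.
cigarettes_per_day = [0, 0, 5, 10, 20, 0, 15, 30]
apgar_5min = [10, 9, 9, 8, 7, 10, 8, 6]
print(pearson_r(cigarettes_per_day, apgar_5min))
print(permutation_p_value(cigarettes_per_day, apgar_5min))
```

Since Apgar scores are whole numbers on a 0-10 scale, a rank-based correlation might turn out to be a better fit than Pearson, and the real data set is far too large for a naive loop like this, so treat it as a way to think about the test rather than a finished analysis.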

Disclaimers: My posting/thoughts/opinions/analysis are my own, and do not reflect any active project at Google. I'm not a doctor, social scientist, or any type of "health expert" - my analysis here is purely for the opportunity to play with data sets, Python, and statistical analysis. I don't expect anyone but my class peers to read this blog but if you stumble upon it, please treat this for what it is: an academic exercise for personal development. Thanks!