To complete this project, you will need the following materials:

STATDISK User Manual (found in the classroom in DocSharing)
Access to the Internet to download the STATDISK program

Part I. Analyze Data

Instructions    Answers

1. Open the file COTININE using the menu option Datasets and then Elementary Stats, 9th Edition. This file contains information about exposure to second-hand smoke. How many observations are there in this file?

In this file, there are three variables, labeled Smokers, ETS, and No ETS. The dataset was collected for a study of second-hand smoke. The sample data consist of the measured serum cotinine levels in three different groups of people. The NOETS group lists the cotinine levels for subjects who are nonsmokers and have no exposure to environmental tobacco smoke at home or work. The ETS group lists cotinine levels for subjects who are nonsmokers exposed to tobacco smoke at home or work. The SMOKERS group lists cotinine levels for subjects who report tobacco use. Serum cotinine is a metabolite of nicotine, meaning that cotinine is produced when nicotine is absorbed by the body. Higher levels of cotinine correspond to higher levels of exposure to smoke that contains nicotine.

2. What results do you expect to find in this data?

Part II. Descriptive Statistics

3-6. Generate descriptive statistics for all three groups of people and complete the following table. Round all results to 2 decimal places.

Variable    Sample Mean    Sample Standard Deviation    Sample Size
Smokers
ETS
No ETS

7. Did you get the results you expected here? Explain why.

8. In which of the three groups did we observe the MOST variation (the largest standard deviation)?

Part III. Confidence Intervals

9. Generate a 95% confidence interval for the mean of the SMOKERS group. Paste your results here.

10. Generate a 95% confidence interval for the mean of the ETS group. Paste your results here.

11. Generate a 95% confidence interval for the mean of the No ETS group. Paste your results here.

12. Create a graph below that illustrates all three confidence intervals on one graph, using the tools in your word processor (example below). STATDISK cannot do this for you. Create your graph and turn the font red. For this example, I just used dashes and wrote a scale below the axis. Here is an example, but it is not based on the data you are analyzing:

Case 1        14--------------42
Case 2                   35-----------------70
         ________________________________________
         0         20        40        60

Your Solution:

13. Based on the confidence intervals shown above, does there appear to be evidence that exposure to tobacco smoke corresponds to higher levels of cotinine?

Part IV. Hypothesis Testing

14. Let's say that you are trying to show that the level of cotinine is equal to zero (0) for people who do not smoke and are not exposed to environmental tobacco smoke. Compose and test a hypothesis (use a significance level of 0.01) that the cotinine levels are significantly different from 0 for the No ETS group. Show all steps.

Step 1. Determine the parameter of interest and compose the null and alternative hypotheses.
Step 2. Determine the sample mean, sample standard deviation, and sample size. [Hint: You did this in an earlier step of this assignment.]
Step 3. Determine the likelihood that the population mean is actually equal to 0 by completing a Hypothesis Test: One Mean in STATDISK. Use a significance level of 0.01. Paste results here.
Step 4. State your conclusion. Your conclusion should relate the p-value and the level of significance to whether you reject the null hypothesis.

15. Now test the claim of a tobacco company spokesman that people in the ETS population have "positive" levels of cotinine (significantly greater than zero), suggesting that this measure is not a good indicator of exposure to tobacco smoke. Use a significance level of 0.05.

Step 1. Determine the parameter of interest and compose the null and alternative hypotheses.
Step 2. Determine the sample mean, sample standard deviation, and sample size.
[Hint: You did this in an earlier step of this assignment.]
Step 3. Determine the likelihood that the population mean is actually equal to 0 by completing a Hypothesis Test: One Mean in STATDISK. Use a significance level of 0.05. Paste results here.
Step 4. State your conclusion in terms of the claim by the tobacco company.

16. Answer the following questions based on the above hypothesis test.
a. What is the p-value in this example and what does it represent?
b. Given that your data and p-value do not change, what would need to be different in order for us to FAIL TO REJECT the null hypothesis here?

1. Given the following hypothesis statements:
Ho: The average GPA of males = the average GPA of females
Ha: The average GPA of males is not equal to the average GPA of females
Explain, in the context of GPA for males and females, what it means to make a Type I and a Type II error.

2. Suppose that in a city of 28,000 people, 3,500 work for Silly Songs Inc. You survey 500 people and find that 145 work for Silly Songs Inc.
a. What is the population proportion of people who work for Silly Songs Inc.?
b. What is the sample proportion of people who work for Silly Songs Inc.?

3. Assume that cans of soda are filled so the actual amounts have an average of 18 oz. A sample of 36 cans has a mean amount of 18.37 ounces. The distribution of sample means of size 36 is normal with an assumed mean of 18 ounces and a standard deviation of 0.07 ounces.
a. How many standard deviations is the sample mean from the mean of the distribution of sample means?
b. In general, what is the probability that a random sample of size 36 has a mean of at least 18.37 ounces?
c. Does it appear consumers are being cheated?

4. The table below gives the number of hours 10 sixth graders read per week and their performance on a standardized verbal test with a maximum score of 100.

Reading time per week    Verbal test score
1                        49
1                        67
3                        57
3                        62
4                        67
4                        57
5                        75
6                        50
11                       87
12                       38

a. Construct a scatter diagram and include a line of best fit.
b. Compute r and r^2.
c. Identify any potential outliers and their effects on the strength of the correlation and the line of best fit.
d. Based on the results above, do you feel the line of best fit gives a reliable prediction outside the range of data provided?

5. Find the value of the standard score, z, and determine whether the alternative hypothesis is supported at the 0.05 significance level.
Ha: μ < 144, n = 144, x-bar = 142, s = 16

6. Consider the scatter diagram in the figure below:
a. Which point is an outlier? Ignoring the outlier, estimate or compute the correlation coefficient for the remaining data points.
b. Does the outlier affect the correlation coefficient?
c. Now include the outlier. Which of the following best represents the correlation coefficient using all data points?
i. -0.89
ii. 0.89
iii. -0.67
iv. 0.67

7. An experiment was conducted for a method of gender selection. In this experiment, 158 of the 300 babies born were male. Does this result appear to be statistically significant?

8. **For this question, be sure to see section 9.3 of your textbook**
One research study of illegal drug use among teenagers shows a decrease from 11.4% in 1997 to 9.5% now. Suppose a study in a large high school reveals that in a simple random sample of 1054 students, 97 report using illegal drugs. Use the 0.05 significance level to test the principal's claim that illegal drug use is below the national average.
a. Formulate the null and alternative hypotheses.
b. The sample statistics are the sample size n = 1054 and the sample proportion. Find the sample proportion rounded to four decimal places.
c. Find the standard score, z, for the sample proportion.
d. Is there sufficient evidence to support the principal's claim that illegal drug use at this school is below the national average?

9. Assume that the population mean is to be estimated from a sample. Use the sample results to approximate the margin of error and the 95% confidence interval.
Sample size = 121, sample mean = 80, sample standard deviation = 14

10. Identify the Type I and Type II errors that correspond to the given hypothesis: The proportion of students earning an A in STAT 3001 is 0.10.

11. Assume that male births and female births are equally likely. The following table shows the probabilities of various numbers of male babies in random samples of 100 births. A random sample of 100 births has 46 male babies. Is this significant at the 0.01 level? What is the p-value?

Number of males among 100    Probability
35 or fewer                  0.002
40 or fewer                  0.028
45 or fewer                  0.184
48 or fewer                  0.382

12. You selected a random sample of 150 people attending an educational conference attended by 1850 people. Within your sample, you found that 75 people had traveled from abroad. Based on this sample statistic, how many people at the conference would you estimate traveled from abroad? Would you be more confident if your sample included 400 people?

13. A researcher wishes to estimate the average number of hours that high school students spend on Facebook each day. A margin of error of 0.22 hours is desired. Past studies suggest that a population standard deviation of 2.1 hours is reasonable. Estimate the minimum sample size needed to estimate the population mean with the desired accuracy.

14. You select a random sample of n = 15 families in your neighborhood and find the following family sizes.
7 8 11 10 9 7 8 8 7 8 7 8 9 10 6
Find the mean family size from the sample as well as the standard deviation.
What is the best estimate for the mean family size for the population of all families in the country?
What is the 95% confidence interval for the mean?
Do you feel this sample is representative of the entire nation? Why or why not?

15. The assets, in billions of dollars, of the four wealthiest people in the nation are 42, 39, 35, and 23. Assume that samples of size 2 are randomly selected WITH REPLACEMENT from this population of four values. Complete the table below to describe the sampling distribution of the sample means.

Sample mean    Probability    Sample mean    Probability
42                            35
40.5                          32.5
39                            31
38.5                          29
37                            23

Find the mean of the sampling distribution.
Find the mean of the population of the four listed values.
Is the mean of the sampling distribution equal to the population mean?
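
One way to check your table for problem 15 is to enumerate every ordered sample of size 2 drawn with replacement and tally the sample means. The sketch below is only an illustration, not part of the assignment: the function name sampling_distribution is made up for this example, and it assumes each of the four values is equally likely on every draw.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution(population, n=2):
    """Exact distribution of the sample mean for size-n samples drawn with replacement.

    Assumes every ordered sample is equally likely (hypothetical helper for illustration).
    """
    samples = list(product(population, repeat=n))          # all ordered samples
    counts = Counter(Fraction(sum(s), n) for s in samples)  # tally each sample mean
    total = len(samples)
    return {mean: Fraction(count, total) for mean, count in counts.items()}


assets = [42, 39, 35, 23]
dist = sampling_distribution(assets)

# Print each possible sample mean with its probability, largest mean first.
for mean in sorted(dist, reverse=True):
    print(float(mean), dist[mean])

# Compare the mean of the sampling distribution with the population mean.
mean_of_means = sum(m * p for m, p in dist.items())
population_mean = Fraction(sum(assets), len(assets))
print(float(mean_of_means), float(population_mean))
```

Using exact fractions avoids rounding questions when you compare the two means in the last part of the problem.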
Please take the time to review these practice problems. If you are able to solve them, you will be in GOOD shape for your actual quiz. You may check your solutions at the end of this document. Remember, you have three attempts at your quiz, with the highest score counting. All quizzes are due at 11:59 PM PST Sunday night.

Your exam is open note and open book. Please take advantage of the opportunity to use these resources.
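
For problems 9 and 13, the margin of error and the minimum sample size both follow from E = z * s / sqrt(n). The sketch below is a rough check under stated assumptions, not the course's official method: the function names are hypothetical, problem 13 does not state a confidence level so 95% is assumed, and a normal critical value (about 1.96) is used where your textbook may round to 2 or use a t value.

```python
import math
from statistics import NormalDist


def z_critical(confidence=0.95):
    """Two-sided critical value from the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(s, n, confidence=0.95):
    """E = z * s / sqrt(n), using the sample standard deviation s."""
    return z_critical(confidence) * s / math.sqrt(n)


def min_sample_size(sigma, E, confidence=0.95):
    """Smallest whole n with z * sigma / sqrt(n) <= E (always round up)."""
    return math.ceil((z_critical(confidence) * sigma / E) ** 2)


# Problem 9: n = 121, sample mean = 80, s = 14
E = margin_of_error(14, 121)
interval = (80 - E, 80 + E)
print(round(E, 2), tuple(round(x, 2) for x in interval))

# Problem 13: sigma = 2.1 hours, desired margin of error 0.22 hours (95% assumed)
print(min_sample_size(2.1, 0.22))
```

The sample size is rounded up rather than to the nearest integer because rounding down would give a margin of error slightly larger than the one requested.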