
Write a brief essay (suggested length of 2 pages) in which ...

Sent to Writing Experts June 11 2008 at 11:47 AM
   

Write a brief essay (suggested length of 2 pages) in which you address the following:



A. Define “association” in statistics. Explain how association is identified and demonstrated.



B. Define “causation” in statistical analysis. Describe at least two factors that influence relationships between two variables and can lead to misinterpretation of data analysis.



C. Explain when it is appropriate to use averages when computing correlations. Explain what statisticians should be aware of when doing this.

What follows is what I have written. Please help me clean it up without quoting outside resources.

Association is the relationship between two variables in a set of data. It is most easily viewed as a graph. For quantitative data, a scatter plot is used to show association graphically. A linear association appears as points that follow a relatively straight line. The tighter the points cluster around the line, the higher the correlation. For qualitative data, association can also be shown through the use of a two-way table.
     Correlation measures how tightly the quantitative data follow a straight line. The question is how to quantify it mathematically. This is done with the Pearson correlation coefficient, represented by (formula would be here), where r is always a value between -1 and 1. A value close to -1 shows a strong negative correlation, a value close to 1 shows a strong positive correlation, and in both cases the correlation weakens as r approaches 0. Graphically, on a scatter plot, values of -1 and 1 appear as perfectly straight lines, while a value of 0 shows no line at all.
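As a rough illustration of how r behaves at its extremes, here is a minimal Python sketch that computes the Pearson coefficient from its usual definition (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and all of the data values are made up for this example and are not part of the essay.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative, invented data sets.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # every point sits on a rising straight line
falling = [10, 8, 6, 4, 2]    # every point sits on a falling straight line
scattered = [3, 1, 4, 1, 3]   # no linear trend

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_scattered = pearson_r(x, scattered)
print(r_rising, r_falling, r_scattered)
```

The rising and falling sets give r at the two ends of its range, and the scattered set gives a value at (or within rounding error of) zero, which matches the scatter plot description above.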
     Causation is the conclusion that one factor directly influences another. Drawing that conclusion from an association or correlation alone tends to be a dangerous assumption. The only way to establish a causal relationship is to repeat well-defined experiments that manipulate variables in order to weed out any hidden outside influences. These outside influences are referred to as lurking variables. Another influence that can cause problems in determining causation is the way in which the experiment is constructed. A poorly constructed experiment could influence subjects to behave differently. An example is a medical study that is only single blind: the observers could influence the outcome through their knowledge of which treatment each subject is receiving.
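To make the idea of a lurking variable concrete, the following Python sketch builds two variables that never influence each other but are both driven by a third. All names and numbers are invented for illustration. The raw correlation between x and y is strong, yet once the straight-line effect of the lurking variable is subtracted from each, nothing is left.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def residuals(ys, zs):
    """What remains of ys after removing its least-squares straight-line fit on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]

# Hypothetical data: the lurking variable drives both x and y.
lurking = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [1, -1, -1, 1, 1, -1, -1, 1]   # chosen to be unrelated to noise_y
noise_y = [1, -1, 1, -1, -1, 1, -1, 1]
x = [2 * z + e for z, e in zip(lurking, noise_x)]
y = [3 * z + e for z, e in zip(lurking, noise_y)]

r_raw = pearson_r(x, y)  # strong, but only because both follow the lurking variable
r_adjusted = pearson_r(residuals(x, lurking), residuals(y, lurking))
print(round(r_raw, 3), round(r_adjusted, 3))
```

The noise values were picked by hand so that the adjusted correlation comes out at zero. With real data the adjusted value would rarely be exactly zero, and the lurking variable is often not measured at all, which is why controlled experiments are needed.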
     To demonstrate these effects and the misinterpretation of data to infer a causal relationship, a statistical experiment this writer performed in a previous line of work will be used. The pre-hospital regulatory component of Kentucky state government switched from a homegrown set of evaluation tests to a standard national test 4 years ago. After 3 years, the induction of new EMTs and paramedics was down approximately 40% each year. The regulatory board decided that the test must be too difficult.
     This led the training staff to conduct an experiment. The scores for the first 6 attempts on the national test were already stored for every person who took the test. The first set of three scores follows the initial training presented to the student, and the second set of scores follows a 40-hour refresher training. The controls for this experiment were the equivalent results from 3 other states with similar demographic characteristics.
     The initial outcome from preliminary studies did in fact indicate a 40% overall first-time pass rate in Kentucky when all training institutions were taken as a whole and computed as a state average. This number supported the board's idea that the national exam should be abandoned, but the study then broke down the training institutions by the type of institution delivering the content. The new breakdown was based on whether a training institution was owned or operated by an ambulance service, privately owned, or university based.
     The training institutions operated by ambulance services averaged a 62% first-time pass rate and a 78% second-time pass rate. Most students did not have to go for retraining after a 3rd attempt. This was due to the service having a financial interest in the student succeeding.
The privately owned institutions truly had no computable baseline. Their first-time pass rates ranged from 18% to 70%. Because the range was so wide, an investigation into these institutions was started to determine why this occurred.
     On the high end of the spectrum, universities had an 83% first-time pass rate and a 94% second-time pass rate. The difference was that the universities did not offer the courses as stand-alone courses but as part of degree programs. These scores were significantly higher than the 3-state average of 70%, which was itself higher than Kentucky's average.
From this high-level overview of the data, a person can see where causal relationships can be reached inaccurately. The initial data, presented as a broad average, appeared to confirm the regulatory board's hypothesis that the test was too difficult. The mean was influenced by the outliers, which skewed it. If the significant outliers had been removed, the mean would have more accurately represented the overall state pass rate. Even so, only segmenting the data by the type of institution providing the training pointed to the cause. The true root cause was ineffective training, as inferred by the diversity of the scores and proven by later investigation.
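The way a single pooled average can hide very different groups can be sketched in a few lines of Python. The pass rates below for the ambulance-service and university rows (62% and 83%) and the two ends of the private range (18% and 70%) are the ones quoted above. Every candidate count is invented purely for illustration and is not taken from the actual study, so the pooled figure this prints is not a reconstruction of the real state average.

```python
# Each row: (institution type, candidates tested, candidates passing first time).
# Candidate counts are HYPOTHETICAL; only the resulting pass rates match the essay.
results = [
    ("ambulance service", 100, 62),    # 62% first-time pass rate
    ("university", 100, 83),           # 83%
    ("private (low end)", 400, 72),    # 18%
    ("private (high end)", 100, 70),   # 70%
]

def pooled_rate(rows):
    """One overall pass rate, ignoring which institution each candidate attended."""
    tested = sum(n for _, n, _ in rows)
    passed = sum(p for _, _, p in rows)
    return passed / tested

def rate_by_group(rows):
    """Pass rate computed separately for each institution type."""
    return {name: p / n for name, n, p in rows}

overall = pooled_rate(results)
by_group = rate_by_group(results)

print(f"pooled: {overall:.0%}")
for name, rate in by_group.items():
    print(f"{name}: {rate:.0%}")
```

With these made-up counts, one large low-scoring group pulls the pooled rate far below the rate of three of the four groups. The pooled number alone cannot distinguish between a test that is too hard for everyone and training that is ineffective at some institutions.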

Edited by Customer (name blocked for privacy) on June 11 2008 at 11:49 AM

Customer (name blocked for privacy)