
Friday, January 9, 2015

Poisson distribution

Advanced Research Methods
Week 4
Discussion 1

I want to say one thing before I start: I am going to use the simplest terms I can in this discussion. I don't want to give the impression that I'm talking down to anybody; at this point I don't understand the use of Poisson distribution clearly, so I am talking simply for my own benefit.

“A Poisson distribution is a discrete probability distribution that applies to occurrences of some event over a specified interval. The random variable x is the number of occurrences of the event in an interval. The interval can be time, distance, area, volume, or some similar unit” (Triola, p. 229).

My judgement of the advantage of using Poisson distribution in research is that it allows for the standardization of results across the defined interval, as opposed to basing the analysis on an aggregate of counts.

The disadvantages of using Poisson distribution are that there are several requirements to using it:
  1. The random variable x is the number of occurrences of an event over some interval
  2. The occurrences must be random
  3. The occurrences must be independent of each other
  4. The occurrences must be uniformly distributed over the interval being used
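To make the definition concrete for myself: when the requirements above hold, the standard Poisson formula gives the probability of exactly x occurrences in an interval as P(x) = (mu^x * e^(-mu)) / x!, where mu is the mean number of occurrences per interval. Below is a minimal sketch of that formula. The function name `poisson_pmf` and the mean of 2 occurrences per interval are my own assumptions for illustration, not numbers from any study.

```python
import math


def poisson_pmf(x: int, mu: float) -> float:
    """Probability of exactly x occurrences in an interval whose mean count is mu."""
    if x < 0:
        return 0.0  # a count of occurrences cannot be negative
    return (mu ** x) * math.exp(-mu) / math.factorial(x)


# Hypothetical example: an event that averages 2 occurrences per interval.
mu = 2.0
for x in range(6):
    print(f"P(x = {x}) = {poisson_pmf(x, mu):.4f}")
```

The probabilities over all possible counts (0, 1, 2, ...) add up to 1, which is a quick sanity check on the formula.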

I think that the value added to statistical analysis would be the same as the advantage of using Poisson distribution: occurrences would be measured per defined interval.

Baumer, Wolff, and Arnio use census tracts as an interval for Poisson distribution in their study of the effects of foreclosure rates on crime levels (specifically, robbery and burglary). They “adopt a Poisson framework because our data contain a considerable number of tracts with relatively small populations and low crime counts; these features yield highly skewed distributions for crime rates and a heterogeneous error variance, properties that violate assumptions of conventional linear regression models” (589-590). This implies to me that the use of the census tract allows a standardized measure.
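Here is a rough sketch of why small tracts with low counts are a problem. This is not the authors' model; the rate of 5 crimes per 1,000 residents and the two tract populations are numbers I made up. If every tract shared the same underlying rate, the expected count would be rate times population, and the Poisson chance of observing zero crimes would be e^(-expected count).

```python
import math


def prob_zero_count(rate_per_1000: float, population: int) -> float:
    """Poisson probability of observing zero events in a tract.

    The expected count is the underlying rate scaled by the tract's population.
    """
    expected_count = rate_per_1000 * population / 1000.0
    return math.exp(-expected_count)


RATE_PER_1000 = 5.0  # assumed rate, for illustration only

for population in (200, 5000):  # hypothetical small and large tracts
    expected = RATE_PER_1000 * population / 1000.0
    p_zero = prob_zero_count(RATE_PER_1000, population)
    print(f"population {population}: expected count {expected:.1f}, "
          f"P(zero crimes) = {p_zero:.3g}")
```

With the same underlying rate, the small tract often records zero crimes while the large tract almost never does, so observed rates in small tracts pile up at zero with occasional large values. As far as I can tell, that is the skew the authors are describing.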

I am not sure I will be using Poisson distribution in my research on COINTELPRO operations; I don't think I understand its use correctly. In addition, while I could possibly measure (operations per time unit) or (operations per state) as intervals, I think the severity of operations or the importance of targets will matter more in my study, which I think is more qualitative than quantitative.

I am going to keep looking at the Poisson distribution, because I think I am missing something easy, and I am thinking fuzzy this week ;>

Finally, I got a little distracted by a question. We know that correlation does not imply causation, but what about the reverse? Does causation always result in correlation? The first two links below present arguments that causation does NOT always produce correlation; the third link... well, I'm still working on it.
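One way I can convince myself that this is at least possible is a toy example of my own (not taken from the linked posts): let y be completely determined by x through y = x squared, with x values balanced around zero. The usual Pearson correlation only measures linear association, so it comes out as zero even though x fully determines y. The helper `pearson_r` below is my own hand-written sketch of the textbook formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, written out from the textbook formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# y is fully determined by x, but the relationship is not linear.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

print("correlation between x and x squared:", pearson_r(xs, ys))
```

The negative and positive x values cancel each other out in the covariance, so the linear correlation vanishes.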

“Causation without Correlation is Possible”

“Strong causation can exist without any correlation: The strange case of the chain smokers, and a note about diet”

“Causation without Correlation is Possible?”

Instructor comment

"The Poisson Distribution is meant to demonstrate the impact of one or more independent variables on the dependent variable. Essentially, it is determining whether or not there is a predictable correlational relationship, not a causal relationship. In your proposed study of the juvenile justice system, what is the dependent variable and what are the independent variables that you wish to test? What type of correlational relationship do you expect to find or explore through the use of Poisson distribution? Are you attempting to predict whether current interventions have an impact on future recidivism?"


The use of statistics to misinform is pretty common. There is a book called "Lies, Damned Lies, and Statistics" (as well as an updated version) that discusses this issue.

"Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: 'There are three kinds of lies: lies, damned lies, and statistics.'"
Mark Twain

It's one reason that valid research is clear in its methodology: Why did you research this? How did you test it? What are the definitions of your test terms?
That way others can see what you're up to.
Numbers can clarify things, but you have to explain where you got your numbers and how you use them.

That way others can replicate your research or contradict it. If you hide your data, use data that is known to be incorrect, or refuse to disclose your methodology, then it's not really research.
