Data Analysis: An old problem revisited
Geoffrey Osborne
Geoff.Osborne at anu.edu.au
Tue Feb 11 19:15:51 EST 1997
Dear All,
I'm wondering if any of you can give me some advice regarding a problem a
user presented me with yesterday. Essentially they have been running a
series of repeat experiments over a number of months and they want to pool
all the data. The problem lies in which method of presenting the pooled
data gives the truest reflection of the biology of what was occurring on a
particular day (given that the negative control can move a little from day
to day and with a slight voltage change).
Here is my example.
Samples on day 1 for example, were run with the negative control median
positioned on 5 on the relative fluorescence intensity (rfi) scale 0-10000,
and the positive sample median falls on 305 rfi.
Day 2, control negative 9.7, positive 437 rfi.
Day 3, control negative 14.7, positive 604 rfi
What the user wants to do is express the *relative* change in fluorescence,
so they were dividing the median of the sample by the median of the control,
yielding roughly 61, 45, & 41 respectively.
I suggested subtracting the control median from the sample median for each
day, thus 300, 427.3, 589.3.
This doesn't seem to be right to me either, as both methods just seem to
reflect that the log amplifier responds in a nonlinear fashion, or that the
further out on the log scale the positive population is pushed, the greater
the error.
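For anyone who wants to check the arithmetic, here is a minimal sketch of the two normalisations applied to the daily medians quoted above. The function names fold_change and background_subtracted are just illustrative labels of mine, not anything from the acquisition software, and the sketch assumes the rfi values are already on a linear scale.

```python
# Daily medians from the example above: (negative control, positive sample), in rfi.
medians = {
    "day 1": (5.0, 305.0),
    "day 2": (9.7, 437.0),
    "day 3": (14.7, 604.0),
}


def fold_change(control, sample):
    """Relative change: sample median divided by control median."""
    return sample / control


def background_subtracted(control, sample):
    """Absolute change: sample median minus control median."""
    return sample - control


for day, (control, sample) in medians.items():
    print(
        f"{day}: ratio = {fold_change(control, sample):.1f}, "
        f"difference = {background_subtracted(control, sample):.1f}"
    )
```

Run side by side, the ratio falls from day to day while the difference rises, which is exactly the disagreement that worries me.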
Now if I was to become a fan of the geometric mean, ignoring the "fact" that
some outlying events may be "ignored" then would I have a truer indication
of what's really happening?
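To make the geometric mean question concrete, here is a small sketch comparing the median, geometric mean, and arithmetic mean of one population. The event values are entirely made up for illustration (they are not the user's data), and the helper name geometric_mean_via_logs is my own. The point it shows is only that the geometric mean is the arithmetic mean taken on the log scale, so a few bright outliers pull it much less than they pull the arithmetic mean, but still more than they pull the median.

```python
import math
import statistics

# Hypothetical linear-scale rfi values for one population, with one bright outlier.
# These numbers are invented purely to illustrate the comparison.
events = [100.0, 200.0, 400.0, 800.0, 10000.0]


def geometric_mean_via_logs(values):
    """Geometric mean computed as exp(mean(log(x))); all values must be > 0."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.fmean(logs))


print("median:         ", statistics.median(events))
print("geometric mean: ", geometric_mean_via_logs(events))
print("arithmetic mean:", statistics.fmean(events))
```

Note that this only works for strictly positive values, so events sitting at zero would need separate handling.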
I'd really appreciate your thoughts on this, as I honestly don't know what
is the "right" answer.
Thanks
Geoff
