This article is based on a lecture (PowerPoint) by Professor Young Moon Chae of Yonsei University, Korea, titled "Issues in Research Design and Statistical Methods for Medical Journals".
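Statistical methods are widely used in medical research, but problems arise in using them correctly and in presenting the findings. Misusing a statistical method can send a study in the wrong direction or produce incorrect results. According to Altman (1994), the statistical errors that commonly appear in medical journals include the following. (Altman DG. The scandal of poor medical research. British Medical Journal, 1994;308:283-284.)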
1. Using the wrong statistical methods.
2. Using the right methods wrongly.
3. Misinterpreting the results.
4. Reporting results selectively and citing the literature selectively.
5. Drawing unjustified conclusions.
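How serious are statistical errors in medical journals?
1. In 1993, the British Journal of Psychiatry published 248 papers. Of these, 164 (66%) contained quantitative data, and 65 of those (40%) contained statistical errors.
2. In 1994, 46 of the 145 papers (32%) published in the American Journal of Obstetrics and Gynecology had statistical problems.
3. In 2008, 18% of 281 randomly selected psychology papers contained statistical errors.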
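4. The 10 leading Chinese medical journals published 1335 papers in 1998; of the 912 that used statistics, 59.8% contained errors. In 2008 they published 1578 papers; of the 1233 that used statistics, 52.2% contained errors.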
5. Of the 139 papers published in the Korean Journal of Pain from 2004 to 2008, 20.9% contained statistical errors.
Statistical methods used in papers published in the British Journal of Psychiatry (BJP) in 1993:
Statistical Methods | Frequency |
Chi-square test | 72 |
Student t-test | 46 |
Correlation | 38 |
Mann-Whitney test | 30 |
ANOVA | 29 |
Confidence intervals | 12 |
Fisher's exact test | 8 |
(Source: McGuigan SM. The use of statistics in the British Journal of Psychiatry. British Journal of Psychiatry. 1995;167:683-688)
Statistical error rates in BJP papers published in 1977/78 and in 1993:
Type of Errors | 1977/78 Error rate (1) | 1993 Error rate (2) |
Randomization | 12/49=24% | 25/58=43% |
Measures of location | 34/139=24% | 44/164=27% |
Measures of dispersion | 16/139=12% | 44/164=27% |
Student's t-test | 13/35=37% | 37/46=80% |
Chi-squared test | 12/48=25% | 11/72=15% |
Description of methods | 18/139=13% | 27/164=16% |
Statement of results | 10/139=13% | 28/164=17% |
Incorrect analysis | 20/139=14% | 45/164=27% |
(1) White SJ. Statistical errors in papers in the British Journal of Psychiatry. British Journal of Psychiatry. 1979;135:336-342.
(2) McGuigan SM. The use of statistics in the British Journal of Psychiatry. British Journal of Psychiatry. 1995;167:683-688.
Errors by statistical method in the 10 leading Chinese medical journals, 1998 and 2008:
Types of Statistical Methods | Errors in 1998 | Errors in 2008 | Chi-sq | P value |
T-test | 305 (62.0%) | 253 (44.4%) | 32.8 | <0.001 |
1. Multiple comparison | 153 (31.1%) | 129 (22.6%) | 9.7 | 0.002 |
2. Non-parametric | 89 (18.1%) | 60 (10.5%) | 12.5 | <0.001 |
3. Paired t-test | 73 (14.8%) | 60 (10.5%) | 4.5 | 0.034 |
Contingency tables | 154 (58.3%) | 169 (32.3%) | 21.4 | <0.001 |
1. Small size cell | 82 (25.7%) | 74 (14.2%) | 17.5 | <0.001 |
2. Fisher exact test | 52 (16.3%) | 53 (10.1%) | 6.9 | 0.009 |
ANOVA | 128 (63.4%) | 263 (59.0%) | 1.1 | 0.289 |
1. Multiple comparison | 51 (25.3%) | 132 (29.6%) | 1.3 | 0.255 |
2. Repeated-measures data | 45 (22.3%) | 63 (14.1%) | 6.65 | 0.010 |
Non-parametric test | 29 (43.3%) | 33 (17.7%) | 17.57 | <0.001 |
(Source: Wu S, et al. Misuse of statistical methods in 10 leading Chinese medical journals in 1998 and 2008. Scientific World Journal. 2011;11:2106-2114)
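The "small size cell" and "Fisher exact test" rows presumably concern chi-square tests applied to tables whose expected counts are too small. As a minimal sketch of that check, the Python below computes the expected counts of a 2x2 table and a two-sided Fisher exact p-value using only the standard library. The counts are hypothetical, and the common rule of thumb (expected count below 5) is an assumption, not something stated in the lecture.

```python
from math import comb

def expected_counts(a, b, c, d):
    """Expected cell counts of the 2x2 table [[a, b], [c, d]] under independence."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

table = (3, 1, 1, 3)  # hypothetical counts, not taken from any cited study
smallest_expected = min(min(row) for row in expected_counts(*table))
p_value = fisher_exact_two_sided(*table)
print(f"smallest expected count = {smallest_expected:.1f}, Fisher p = {p_value:.4f}")
```

When the smallest expected count is low, as here, the exact test is usually preferred over the chi-square approximation.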
Statistical errors by research design in the 10 leading Chinese medical journals, 1998 and 2008:
Types of research design | 1998: Papers using statistics | 1998: Statistical errors | 2008: Papers using statistics | 2008: Statistical errors |
RCT | 64 (98.0%) | 36 (56.3%) | 56 (93.3%) | 38 (67.9%) |
Clinical trial | 82 (91.1%) | 47 (57.3%) | 58 (95.1%) | 34 (67.9%) |
Cohort study | 47 (79.7%) | 28 (59.6%) | 80 (92.0%) | 17 (21.3%) |
Case-control | 254 (92.4%) | 148 (58.3%) | 276 (97.2%) | 129 (46.7%) |
Cross-sectional | 56 (74.7%) | 32 (57.1%) | 52 (88.1%) | 23 (44.2%) |
Case study | 122 (31.9%) | 59 (48.4%) | 233 (48.9%) | 110 (47.2%) |
Basic science | 240 (74.1%) | 175 (72.9%) | 409 (87.4%) | 268 (65.5%) |
Total | 912 (68.3%) | 545 (59.8%) | 1233 (78.1%) | 644 (52.2%) |
(Source: Wu S, et al. Misuse of statistical methods in 10 leading Chinese medical journals in 1998 and 2008. Scientific World Journal. 2011;11:2106-2114)
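To illustrate how two such proportions can be compared, the sketch below applies a Pearson chi-square test to the Total row (545 of 912 papers with errors in 1998 versus 644 of 1233 in 2008). It assumes no continuity correction and uses the closed-form tail probability for one degree of freedom; it is not a reproduction of the cited authors' own analysis.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))

# Total row of the table: papers with errors / papers using statistics.
errors_1998, used_1998 = 545, 912
errors_2008, used_2008 = 644, 1233

stat, p = chi_square_2x2(errors_1998, used_1998 - errors_1998,
                         errors_2008, used_2008 - errors_2008)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```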
Statistical methods used in 119 papers in the Korean Journal of Pain, by year:
Statistical Methods | 2004 | 2005 | 2006 | 2007 | 2008 | Total (%) |
Student t-test | 9 | 13 | 14 | 9 | 8 | 53 (21.0) |
Chi-square test | 9 | 8 | 9 | 8 | 6 | 40 (15.9) |
One way ANOVA | 5 | 7 | 4 | 5 | 4 | 25 (9.9) |
Mann-Whitney test | 3 | 4 | 7 | 3 | 6 | 23 (9.1) |
Paired t-test | 7 | 4 | 3 | 3 | 5 | 22 (8.7) |
Repeated measures ANOVA | 6 | 6 | 2 | 0 | 4 | 18 (7.1) |
Fisher's exact test | 2 | 2 | 6 | 3 | 1 | 14 (5.6) |
Wilcoxon signed rank test | 3 | 0 | 5 | 0 | 2 | 10 (4.0) |
Kruskal-Wallis test | 2 | 1 | 1 | 2 | 3 | 9 (3.6) |
Total | 53 | 47 | 64 | 46 | 42 | 252 (100) |
(Source: Yim KH, et al. Analysis of statistical methods and errors in the articles published in the Korean Journal of Pain. 2010;23(1):35-41)
Statistical errors in 119 papers in the Korean Journal of Pain, by year:
Types of Errors | 2004 | 2005 | 2006 | 2007 | 2008 | Total (%) |
Nonparametric test | 12 | 21 | 9 | 6 | 8 | 56 (33.9) |
Inadequate dispersion | 8 | 7 | 5 | 5 | 10 | 35 (21.2) |
Chi-square test | 5 | 5 | 3 | 6 | 5 | 24 (14.5) |
Multiple comparison | 2 | 4 | 1 | 6 | 2 | 15 (9.1) |
Ignoring data characteristics | 4 | 2 | 2 | 1 | 3 | 12 (7.3) |
Paired t-test | 0 | 4 | 3 | 2 | 0 | 9 (5.5) |
Illogical conclusion | 4 | 2 | 0 | 0 | 1 | 7 (4.2) |
Total | 36 | 47 | 25 | 27 | 30 | 165 (100) |
(Source: Yim KH, et al. Analysis of statistical methods and errors in the articles published in the Korean Journal of Pain. 2010;23(1):35-41)
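The "Multiple comparison" row most likely refers to several pairwise tests run without adjusting the significance level. As a rough illustration, the sketch below applies two standard adjustments, Bonferroni and Holm, to a set of hypothetical p-values. The choice of these two procedures is an assumption; the lecture does not name a specific correction.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted

raw = [0.010, 0.020, 0.040]  # hypothetical p-values from three pairwise tests
print("Bonferroni:", [round(p, 3) for p in bonferroni(raw)])
print("Holm:      ", [round(p, 3) for p in holm(raw)])
```

Holm's adjustment is never more conservative than Bonferroni's, which is why it is often the preferred default of the two.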
Classified from a reviewer's perspective, the types of statistical errors in medical journals are:
Category | Statistical Errors |
Design | Failure to use randomization in controlled trial; Use of an inappropriate control group; Inadequate sample size |
Analysis | Unpaired method for paired data; Wrong unit of analysis; Wrong assumptions; Categorization of continuous variable; Use of parametric methods for non-normal data |
Presentation | Giving SE instead of SD to describe data; Results given only as p-values |
Interpretation | Concluding causation from an observed association; Interpreting a poor study (e.g. small sample, case study) |
(Source: Altman DG. Statistical reviewing for medical journals. Statistics in Medicine, 1998;17:2661-2674)
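The "Unpaired method for paired data" error can be made concrete with a small sketch. The Python below computes both the independent-samples and the paired t statistic on the same hypothetical before/after measurements. The data and function names are illustrative assumptions, and only the test statistics are computed, not p-values.

```python
from math import sqrt
from statistics import mean, stdev

def unpaired_t(x, y):
    """Student's t statistic for two independent samples (pooled variance)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

def paired_t(x, y):
    """Paired t statistic: a one-sample t on the within-subject differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical before/after scores for the same five patients.
before = [62, 70, 55, 81, 66]
after = [60, 67, 54, 78, 63]
print(f"unpaired t = {unpaired_t(before, after):.2f}")
print(f"paired t   = {paired_t(before, after):.2f}")
```

In this example the paired statistic is far larger, because differencing removes the variation between patients. Treating paired data as independent discards that information.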
Statistical problems found during the review of 100 manuscripts submitted to the British Medical Journal in 1991-1993:
Check-list | Yes | Unclear | No |
Objective clear? | 83 | 6 | 11 |
Appropriate study design? | 72 | 25 | 3 |
Source of subjects? | 83 | 6 | 10 |
Sample size calculation? | 0 | 0 | 63 |
Satisfactory response rate? | 49 | 23 | 2 |
Methods described adequately? | 47 | - | 53 |
Statistical analyses appropriate? | 41 | 37 | 22 |
Statistical presentation satisfactory? | 14 | - | 86 |
Confidence intervals given? | 51 | - | 41 |
Conclusion justified? | 40 | 49 | 11 |
Paper statistically acceptable? | 4 | - | 96 |
(Source: Altman DG. Statistical reviewing for medical journals. Statistics in Medicine, 1998;17:2661-2674)
Based on the analysis above, Professor Chae offers the following recommendations:
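I. For researchers
(1) Research planning
1. Decide on the research question to be addressed.
2. Discuss the study design with a statistician before collecting data, because the kind of data collected determines which analysis methods can be used later.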
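One design question typically settled at this stage is sample size, which the tables above flag as a frequent omission. The sketch below uses the usual normal-approximation formula for comparing two means, n = 2((z_{1-α/2} + z_{power})·σ/δ)² per group. The effect size, standard deviation, α and power are hypothetical inputs, and the formula is offered as a planning aid rather than as the lecture's own method.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of two means.

    delta: smallest difference in means worth detecting (hypothetical input)
    sd:    assumed common standard deviation (hypothetical input)
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

print(n_per_group(delta=5.0, sd=10.0))
```

Halving the detectable difference roughly quadruples the required sample, which is why this calculation belongs before data collection.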
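(2) Data analysis
1. Consider whether the assumptions of the chosen model fit the data.
2. Display the data to check whether they meet the model assumptions.
3. Identify the variables used in the analysis and summarize them with descriptive statistics.
4. Discuss the analysis with a statistician to find the method best suited to the data.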
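As one rough way to screen a variable before choosing a method, the sketch below computes moment-based skewness and compares the mean with the median. This is only an informal screen on hypothetical data, not a formal normality test, and it does not replace looking at the data or consulting a statistician.

```python
from statistics import mean, median

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical length-of-stay data in days; a single long stay skews the sample.
stays = [2, 3, 3, 4, 4, 5, 5, 6, 7, 30]
print(f"mean = {mean(stays):.1f}, median = {median(stays):.1f}, "
      f"skewness = {sample_skewness(stays):.2f}")
```

A mean well above the median together with strongly positive skewness suggests that methods assuming normality may be a poor fit for this variable.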
(3) Writing up the results
1. Aim for transparency and reproducibility.
2. Include the statistical methods used in the discussion.
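II. For editors and reviewers
1. Involve more statistical experts in peer review.
2. Study protocols are a valuable reference for reviewers.
3. The editorial board should include statistical experts.
4. Journals should publish guidelines on statistical methods and encourage authors to consult them.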
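III. Reporting statistical results
(1) Descriptive statistics
1. Normally distributed data should be summarized by the mean and standard deviation (SD), not the standard error (SE).
2. Non-normally distributed data should be summarized by the median and interquartile range.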
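The two summaries above, and the difference between SD and SE, can be sketched with Python's standard `statistics` module. The data are hypothetical, and the "inclusive" quartile method is one of several conventions, chosen here only for illustration.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev

def describe(values):
    """Return mean, SD, SE, median and quartiles for one sample."""
    n = len(values)
    sd = stdev(values)      # spread of the individual observations
    se = sd / sqrt(n)       # precision of the estimated mean, shrinks as n grows
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"n": n, "mean": mean(values), "sd": sd, "se": se,
            "median": median(values), "iqr": (q1, q3)}

ages = [23, 25, 28, 30, 31, 35, 38, 40, 52]  # hypothetical data
summary = describe(ages)
print(f"mean (SD): {summary['mean']:.1f} ({summary['sd']:.1f}); SE = {summary['se']:.1f}")
print(f"median (IQR): {summary['median']} ({summary['iqr'][0]}-{summary['iqr'][1]})")
```

Because the SE is the SD divided by the square root of n, reporting it in place of the SD makes the data look less variable than they are.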
(2) Hypothesis testing
1. Report the test used, whether it was one-tailed or two-tailed, and whether it was paired.
2. Do not report only "NS" or an inequality; give the exact p-value.
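A small sketch of these two points, assuming a test statistic that follows the standard normal distribution: it shows how the one-tailed and two-tailed p-values differ for the same statistic, and how an exact value can be reported. The helper names and the z value are hypothetical, and falling back to "p < 0.001" for very small values is a common convention rather than a rule from the lecture.

```python
from math import erfc, sqrt

def p_from_z(z, two_tailed=True):
    """P-value for a standard normal test statistic z."""
    if two_tailed:
        return erfc(abs(z) / sqrt(2))
    return 0.5 * erfc(z / sqrt(2))   # upper-tail (one-tailed) alternative

def format_p(p):
    """Report the exact value; use an inequality only below 0.001."""
    return "p < 0.001" if p < 0.001 else f"p = {p:.3f}"

z = 1.80  # hypothetical test statistic
print("two-tailed:", format_p(p_from_z(z)))
print("one-tailed:", format_p(p_from_z(z, two_tailed=False)))
```

For a positive statistic the one-tailed value is half the two-tailed value, so a reader cannot interpret a p-value without knowing which was used.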
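(3) Regression analysis
1. Confirm that the assumptions of regression are met.
2. Check for and report multicollinearity and interaction effects among the variables.
3. Describe the variable selection procedure, for example forward or stepwise selection.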
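As an illustration of the multicollinearity check, the sketch below computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF), which for a model with exactly two predictors reduces to 1/(1 - r²). The data are hypothetical, and the frequently quoted threshold of VIF above 10 is a rule of thumb, not a figure from the lecture.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictors: weight rises almost linearly with height.
height_cm = [150, 160, 165, 170, 180, 185]
weight_kg = [52, 60, 63, 69, 77, 83]
print(f"r = {pearson_r(height_cm, weight_kg):.3f}, "
      f"VIF = {vif_two_predictors(height_cm, weight_kg):.1f}")
```

With more than two predictors, each VIF comes from regressing one predictor on all the others, which needs a regression routine this sketch does not include.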
(4) Analysis of variance (ANOVA) or analysis of covariance (ANCOVA)
1. Confirm that the assumptions are met.
2. When performing ANCOVA, choose appropriate covariates.
References cited in the lecture:
1. Altman DG. The scandal of poor medical research. British Medical Journal. 1994;308:283-284.
2. Altman DG. Statistical reviewing for medical journals. Statistics in Medicine. 1998;17:2661-2674.
3. McGuigan SM. The use of statistics in the British Journal of Psychiatry. British Journal of Psychiatry. 1995;167:683-688.
4. Welch GE, Gabbe SG. Review of statistics usage in the American Journal of Obstetrics and Gynecology. American Journal of Obstetrics and Gynecology. 1996;175(5):1138-1141.
5. White SJ. Statistical errors in papers in the British Journal of Psychiatry. British Journal of Psychiatry. 1979;135:336-342.
6. Wu S, et al. Misuse of statistical methods in 10 leading Chinese medical journals in 1998 and 2008. Scientific World Journal. 2011;11:2106-2114.
7. Yim KH, et al. Analysis of statistical methods and errors in the articles published in the Korean Journal of Pain. Korean Journal of Pain. 2010;23(1):35-41.