Academic Studies on Possible Impacts of
Australia's Gun Changes Occurring About '96 and '97

1 July 09

Before looking at the academic studies of the possible impact of the '96/'97 Australian gun law changes and buyback (enough time has now passed for sufficient data to exist), consider what is necessary to show that some change at a given time might have caused some other change.  To do this for a death rate, for example, it is necessary to determine the trend of the rate just before the time of interest (the "pre" trend) and then to examine how the rate for at least a few years after the time of interest compares to the projection of that "pre" trend into the following period.  It's a comparison of immediately before with immediately after.
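
The before/after comparison just described can be sketched in a few lines of code.  The rates below are invented for illustration (they are not the real Australian figures); the sketch fits a least-squares line to the "pre" years, projects it forward, and compares the "post" observations against the projection.

```python
# Sketch of the before/after comparison described above, using made-up
# illustrative rates (NOT real Australian data).  A straight line is
# fitted to the "pre" years by least squares, projected forward, and
# the "post" observations are compared against that projection.

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical death rates per 100,000 for a "pre" period (1991-1997)
# and a "post" period (1998-2001).
pre_years  = [1991, 1992, 1993, 1994, 1995, 1996, 1997]
pre_rates  = [2.9, 2.7, 2.6, 2.4, 2.2, 2.1, 1.9]
post_years = [1998, 1999, 2000, 2001]
post_rates = [1.4, 1.5, 1.6, 1.6]

slope, intercept = fit_line(pre_years, pre_rates)
for year, actual in zip(post_years, post_rates):
    predicted = slope * year + intercept
    # A large negative difference means the actual rate fell well below
    # the projection of the "pre" trend.
    print(year, round(predicted, 2), actual, round(actual - predicted, 2))
```

A real analysis would also quantify how far outside the projection's uncertainty band the "post" points fall, and would repeat the fit with the break year shifted, as described below.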

Any qualified statistical analyst would repeat the analysis looking for the change to occur the year after the initial change occurred, and maybe even the following year, in case the impact of the initial change was delayed.  In looking at the "after" rates, one must look at both the rates and the amount by which the rates change from year to year.  Death rates being higher would mean "bad," and so would rates increasing faster from year to year or not dropping as fast from year to year.

Trends are caused by things, and things change over time.  Some changes are over short time periods, others are over long periods.  To determine whether or not something impacts a rate (such as the gun suicide rate) at or near a certain time, one must check the trend immediately after the time against the trend immediately before it.  A long-term general trend, as from the beginning of the 20th century, is irrelevant.

The studies addressed herein are:

1.  one by S. Chapman and crew of the School of Public Health, U. of Sydney, published 2006 in Injury Prevention

2.  one by Jeanine Baker of the Sporting Shooters Assn of Australia and Samara McPhedran of the School of Psychology, U. of Sydney, published 2006 in the British Journal of Criminology

3.  one by Christine Neill, of the Wilfrid Laurier U. Dept of Econ, and Andrew Leigh of the Research School of Social Sciences, Australian Natl U., reported June 2007

4.  one by Wang-Sheng Lee, U. of Melbourne, and Sandy Suardi, La Trobe U., reported Aug 2008

Note that none of the studies was truly about the Australian gun buyback or gun law changes.  All the studies looked for a significant change in certain kinds of death rates occurring at about '96 or '97.  If a change is identified, it could have been caused by anything major that occurred at or before the same time—including the gun changes.  Such a finding would indicate that the gun changes might have had an impact.  If no such change is observed, that would be strong evidence that the gun changes did not have a (significant) impact on whatever rate is being examined.


Chapman had access to a pre-publication copy of the Baker-McPhedran (B-M) report, and criticized the B-M method.  He said he didn't like an analysis based on time series data because such analyses ignore the variability from data point to data point.  This assertion is incorrect.  Any decent statistical analysis accounts for such variation by determining the extent to which averages, trends, etc. are uncertain because of chance variation.  He also said the "ARIMA" statistical method/model (used by B-M) "is unable to explicitly address the effect of an intervention such as the introduction of gun laws."  ARIMA is widely used for prediction and forecasting, and comes complete with characterization of the expected accuracy of such predictions/forecasts.  This is what is needed to determine if something causes departures of actual observations from expected/predicted ranges of values.

Chapman reported that the pre-existing annual rate of decline in gun suicide rates accelerated after introduction of the gun laws.  He reported about the same for total gun death rates, which is natural since most of the gun deaths are suicides.  He didn't mention that his "post" trends began well below where his "pre" trends ended, which would logically say that something (like the gun changes) had yielded an immediate reduction in gun suicides.

Regarding gun homicide, Chapman did one thing worthwhile.  Besides analyzing trends for gun homicides, he analyzed trends for gun homicides that did not involve more than four victims per incident.  This is good because mass killings create great randomness in the homicide data and therefore obscure the ordinary homicide trend.

Because the mass killings were generally in the last half of the "pre" period, Chapman's trend line for total gun homicide was shifted up at the end of the period.  This resulted in the post-'96 rates being a bit lower than the last of the "pre" rates (that is, there was a significant discontinuity between the "pre" and "post" periods).  He found the difference between the annual drop rates to be not statistically significant (rates were falling before '97 and fell at about the same rate from '97 on). The "pre" and "post" trends virtually merged together when mass killings were excluded.

Chapman reported that nongun homicides had been increasing at about 1.1 percent before '97, then fell at about 2.4 percent per year, and that total homicides fell after '96 after having been somewhat steady in the preceding years.  Somehow, because of these findings, he concluded that the data did not support a hypothesis of method substitution for homicide.  He didn't mention that his "trends" showed a considerable discontinuity between the "pre" and "post" periods.  And he didn't mention how it could be that the gun changes had reversed a bad trend in nongun homicides.  When an analysis has some nonsensical results, the analyst should question the entire analysis.

He reported that nongun suicide rates rose at 2.3 percent (ave) per year before '97, then dropped at 4.1 percent (ave) per year.  He didn't mention the fact that the rate jumped upward abruptly in '97 before falling at the faster rate.  This rise, along with the abrupt drop between the trend lines for gun suicide, would mean that method substitution was immediate, but short-lived.  And he reported that total suicide rates had been stable pre '97, then started dropping.  Somehow, again, he concluded that the data did not support a hypothesis of method substitution for suicide.  Maybe he had not noticed that he had already said that total suicide had been stable in the "pre" period while the gun suicide rate had been dropping and the nongun suicide rate had been rising throughout the period.

Chapman reported that accidental gun death rates increased after '96.  They had been falling from '79 to '96.  He lauded the fact that there had been no mass shootings after '96 and that a cultural change had been made.

Chapman's analyses had at least three large problems.  One was that they cut off the "pre" period in '96 although the laws did not go into effect, and the buyback was not completed, until '97.  Because the number of data points in the "after" period was small, putting the '97 data points there rather than in the "pre" data had the effect of significantly increasing the rates at which his "after" trends for gun homicide and gun suicide dropped.  This was also part of the cause of abrupt jumps between the "pre" and "post" periods, including his curious result of nongun suicide rate jumping upward at the start of his "after" period.

Chapman either did not repeat his analysis considering '97/'98 as the before-after point, or he chose to report only the analysis that appeared to support the gun control agenda (in spite of the fact that the analysis produced some bizarre results).  (Oh, yes, Chapman was a member of the Coalition for Gun Control, Australia, from '93 to '96 and one of his co-authors was ('06) editor of the anti-gun Gun Policy News.)

NOTE: For some unknown reason, Chapman's starting data were a little different from the data for all the other studies.  Our own analysis of Chapman's analysis uses his numbers.

Another problem with Chapman's methods is that they apparently did not respond to short-term trends.  The critical problem with Chapman's analyses of gun suicide, nongun homicide and total gun death rates is that the "trends" he used for the "pre" period were not the true trends that existed immediately prior to '96 or '97.

It is obvious from the gun suicide, gun homicide (w/o "mass" killings), nongun homicide, and total gun death figures in Chapman's paper that the actual rates were generally above the line representing his "trend" in the middle of the "pre" data range, and generally below it at both ends of the range.  This can be because a trend is actually curved (high at the center).  To visualize the situation, draw an arc like a profile of a low hill, then draw a straight line through it so that it intersects the arc in two places.  The arc is like the actual data and the straight line is like the Chapman "trend."

In fact, the true trends of the suicide data and nongun homicide data obviously merge right into the post-'96 trends.  And the total gun death rate trend nearly does.  Examining how "residuals" (the differences between actual values and the statistical trend) are dispersed as a function of the "x" variable is a matter of basic statistics.
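
The residual check described above can be illustrated directly.  The data below are synthetic (a gentle arc standing in for a curved death-rate trend): fitting a straight line to them produces residuals that are negative at both ends of the range and positive in the middle, exactly the pattern just described.

```python
# Minimal sketch of the residual check described above: fit a straight
# line to data that actually lie on a gentle arc, then inspect the sign
# of each residual.  With a curved trend the residuals are negative at
# both ends of the range and positive in the middle.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

xs = list(range(11))                       # 0..10
ys = [-(x - 5) ** 2 / 10 + 5 for x in xs]  # a low-hill arc, peak at x = 5

b, a = fit_line(xs, ys)
residuals = [y - (b * x + a) for x, y in zip(xs, ys)]
signs = ['+' if r > 0 else '-' for r in residuals]
print(''.join(signs))   # minus at the ends, plus in the middle
```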

In the case of Chapman's gun suicide rate data, the data fit two separate straight-line trends very nicely—one up through '87 and the other for '87 through '97.  The trend is even more consistent for data from '91 through '97.

If one uses the data from '91 through '97 for the "pre" period, the correlation coefficient is quite high (-.986) and the '98 ("post") gun suicide rate is immediately well below the "pre" trend 95% confidence interval.  But the rates rise rapidly (in 2 years) to far above the "pre" trend 95% confidence interval (from 2000 to 2003).  In other words, the first-year drop of gun suicide was totally overcome by the average annual drop being reduced in comparison to what the drop rate was before '98.  This difference from Chapman's results (he reported an accelerated decline) comes from using the true "pre" trend and comparing pre-'98 rather than pre-'97.  The true "pre" trend increases the rate of decline "before," and taking '97 out of the "after" dramatically reduces the "after" rate of decline.
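
The kind of check just described can be sketched as follows, with invented rates standing in for the real '91-'97 gun suicide figures.  The sketch computes the correlation coefficient and uses a crude plus-or-minus two-standard-deviation band around the extrapolated trend; a real analysis would use a proper t-based prediction interval.

```python
# Rough sketch of the check described above, with invented numbers: a
# Pearson correlation for a hypothetical 1991-1997 decline, and a crude
# +/- 2-standard-deviation band around the extrapolated trend (a real
# analysis would use a proper t-based prediction interval).

from math import sqrt

years = [1991, 1992, 1993, 1994, 1995, 1996, 1997]
rates = [3.0, 2.8, 2.7, 2.5, 2.3, 2.2, 2.0]   # hypothetical rates

n = len(years)
mx, my = sum(years) / n, sum(rates) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(years, rates))
sxx = sum((x - mx) ** 2 for x in years)
syy = sum((y - my) ** 2 for y in rates)

r = sxy / sqrt(sxx * syy)                      # correlation coefficient
slope, intercept = sxy / sxx, my - (sxy / sxx) * mx
resid_sd = sqrt(sum((y - (slope * x + intercept)) ** 2
                    for x, y in zip(years, rates)) / (n - 2))

pred_1998 = slope * 1998 + intercept
print(round(r, 3), round(pred_1998, 2), round(2 * resid_sd, 3))
# A 1998 observation falling below pred_1998 - 2*resid_sd would be
# flagged as a break from the "pre" trend.
```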

The same problem exists with Chapman's pre-'97 "trend" for nongun homicide.  Specifically, midrange actuals are above his "trend" line while actuals are generally below the line at both ends.  In other words, his "trend" was not the trend up through '96.  Although the data for '88 through '96 has large random variations, a reasonably good fit is achieved with a straight line trend.  If this is done, including '97 in the "pre" trend, the trend virtually merges into the "post" trend, indicating that no change occurred in '96 or '97.  Specifically, the offset in the two trends is only 0.1 and annual decline rates are 2.6 percent and 2.7 percent per year (virtually identical).  [Chapman had reported 1.1 percent and 2.4 percent.]  The difference from Chapman's results is basically from using the true "pre" trend.

Chapman demonstrated that one can find a statistical method to obscure any truth.  Our introductory statistics text says that a person needs to be quite knowledgeable about the area in which an analysis is done, because one has to be able to recognize the nonsense that can come out of a statistical analysis.  Of course, such knowledge can also help the dishonest analyst guide the analysis in the direction he wants it to go.

What about the "method substitution"?  Chapman said, "If substitution occurred, we would expect an increasing downward trend in firearm deaths after introduction of gun control laws but a compensatory lesser downward or even upward trend in non-firearm-related deaths over the same period."  This is incorrect.  It assumes a constant overall rate of the particular kind of death (suicide or homicide).  Also, method substitution does not have to be complete, so the change in nongun death rates would not have to completely compensate for the reduction in the gun death rate.

It is amateurish to look at two things and try to divine whether or not there was substitution.  To analyze substitution, measure it!  All that's necessary is to calculate the ratio, for each year, of the nongun deaths to the gun deaths (for suicide and homicide) and examine how the ratio changes over time.  The same could be done using the gun deaths and the all-method deaths, but this dilutes the measure a bit.  The ratios of the death rates could also be used, but with some loss of accuracy if the rates used are rounded off too much.  Following is the plot for suicide rates.

Ratio of Nongun Suicides to Gun Suicides (Chapman data)

Note that, according to Chapman's source data, the choice between guns and other methods for suicide was quite constant from '79 through '85.  Australians then steadily shifted to other methods until '89, then stopped shifting until about '93, at which time they started rapidly shifting again away from use of guns in suicide.  The ratio for '96 appears to be a bit of an anomaly.  The ratio continued to accelerate upward until '98, then jumped up and down in a range from about 8 to 10.

According to Chapman's data, something that happened around '97 caused the upward acceleration of method substitution to stop, and also caused an increase in the variability from year to year.  If the gun changes caused method substitution to increase, something else that happened at the same time must have overcome both this and the pre-existing trend!  Very unlikely.  But stopping the substitution is a surprising result.  What did it?  It will be seen that the Baker-McPhedran data does not show this same highly variable leveling off of the ratio.
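
The ratio measure discussed above takes only a few lines to compute.  The counts below are invented for illustration (not the actual Australian figures):

```python
# The substitution measure described above, sketched with hypothetical
# annual counts (not the real Australian figures): the ratio of nongun
# to gun suicides, year by year.

nongun = {1995: 1800, 1996: 1850, 1997: 1950, 1998: 2100}
gun    = {1995:  400, 1996:  380, 1997:  330, 1998:  280}

ratio = {year: nongun[year] / gun[year] for year in nongun}
for year in sorted(ratio):
    print(year, round(ratio[year], 2))
# A rising ratio means deaths are shifting away from guns toward other
# methods, i.e. method substitution.
```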

Now, look at the same kind of plot for gun homicide.  In this plot, a 2-year moving average is also plotted as an aid to identifying trends because the gun homicide data is highly erratic.

Non-gun to Gun Ratio for Homicides (Chapman data)

It is barely possible to discern from the figure that the ratio trend from Chapman's data was fairly stable from '79 to about '85, then rose unsteadily until about '90, then remained approximately level until about '96.  After dropping a bit in '96 and '97, the substitution increased until '02, then dropped almost to the '90-'96 levels in '03.  According to Chapman's data something that happened about '97 caused method substitution to rise for four years.  If the data were good, it would be interesting to see if the trend continued after '03.  (The '96 drop was caused by the dramatic drop in homicide for the half year following the Port Arthur massacre.)
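
The 2-year moving average used in the homicide ratio plot is simple to sketch; the ratio values below are invented:

```python
# A minimal sketch of the 2-year moving average used as a smoothing aid
# in the homicide ratio plot above, applied to invented ratio values.

def moving_average(values, window=2):
    """Simple trailing moving average over `window` points."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

ratios = [3.0, 5.0, 2.0, 6.0, 4.0]     # hypothetical, highly erratic
print(moving_average(ratios))          # smooths year-to-year noise
```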


Baker and McPhedran (B-M) used a statistical process/model/software called "ARIMA" to do their analysis.  To do this, software is usually used to run tests on the data in order to determine candidate parameters for use in establishing the probable limits of the data trend both during the "pre" period and into the "post" period.  Then, the analysis is run, the fit of the "pre" trends is checked.  The end of the "pre" trend might be left out initially in order to see how well an analysis of the preceding part predicts the last part.  Then the analysis is rerun with other high-probability parameters, and the fits for those parameter sets are checked—all to determine the best set of parameters to use in the final analysis.
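
The holdout idea described above, checking how well a model fitted to the early "pre" years predicts the last ones, can be sketched with a plain linear trend standing in for an ARIMA model (the rates are invented):

```python
# Sketch of the holdout check described above, using a simple linear
# trend as a stand-in for an ARIMA model and invented rates: fit on the
# early "pre" years, then see how well the fit predicts the held-out
# last two "pre" years.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

years = [1990, 1991, 1992, 1993, 1994, 1995, 1996]
rates = [2.8, 2.6, 2.5, 2.3, 2.1, 2.0, 1.8]      # hypothetical

b, a = fit_line(years[:-2], rates[:-2])          # hold out 1995-1996
for y, actual in zip(years[-2:], rates[-2:]):
    print(y, round(b * y + a, 2), actual)        # predicted vs actual
```

If the held-out years are predicted well, the fitted model is a credible description of the "pre" trend; if not, the parameters should be revisited before any "post" comparison is made.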

B-M reported that their model fit the gun suicide rates well, and their model indicated that the "post" rates quickly fell below their 95 percent confidence interval (CI).  In other words, they found that gun suicide rates fell faster after '96 than before '97.

Their analysis of nongun suicide rates indicated that the average of the predicted "post" '96 rates was not significantly different than the average of the actual values.  This was because the rates for the first two years in the period jumped above the B-M CI (i.e., expected) range, the rates for following years tracked across the range, and the rates for the last two years dropped below the CI range.  That is, the average of actual was about the same as the average of predicted only because the actuals transited the prediction range.

B-M thought the gun and nongun suicide rate results together suggested initial "method substitution, followed by a decrease in suicide (non-firearm), which mirrored, but was larger than, falls in observed suicide (firearm)."

B-M found that their ARIMA model for gun homicide rates fit the data only moderately well, which meant that their 95 percent CI was a bit broad.  But, the analysis indicated that the rates in the "post" period were well within the CI (expected range).  And they found that the "post" nongun homicide rates fell well within the range predicted based on the "pre" data, although they thought the ARIMA model predicted poorly (which meant that the "post" CI (expected values) got noticeably wider as the "post" period progressed).  Because both gun and nongun homicide rates fell within predicted ranges, they concluded that the results did not support a hypothesis of method substitution for homicide.

Finally, they found (as Chapman had) that rates of accidental gun deaths had started increasing after '96, while they had been steadily declining until then.

B-M were lucky that any of their results were valid.  Their correlation coefficients may have told them that some of their CIs fit the data well, but this really wasn't true.  The models they used in their reported analyses were not agile in responding to changes of trend, and generally did not start responding to a change until the year after the change actually began.  We haven't used ARIMA, so we don't know whether this might be a result of the specific parameters B-M used or if it is simply a limitation of ARIMA.

In the case of the gun suicide data, the true trend was practically level from '79 through '87, then broke downward in almost a straight line until '97.  In response to the break in the trend, their analysis caused their CI (approximately the dashed lines in the figure) for the "pre" period to immediately ('80) start dropping slowly until a year after the true trend started down.  Then, their CI was pulled downward a bit too much by the short-term '88/'89 drop.  Then, the significant problem began.

Gun Suicide Rate History (Baker-McPhedran data)

From '91 through '96 the actual data trended down more than B-M's CI did, so that the actual data tracked generally from toward the top of their CI to toward the bottom of it.  The data trended on in pretty much the same direction (with a short detour in '98) and B-M's predicted (expected) trend CI continued along the wrong path it was on—resulting in the actual "post" data dropping entirely below their CI.  The poor fit of the '91-'96 trend was the reason B-M erroneously found that gun suicide post-'96 was lower than predicted based on the "pre" data.

The fit was similarly poor for nongun suicide, but this didn't impact the results much because the actual nongun suicide rates peaked and abruptly started downward after '98.  That is, the "post" data wouldn't have matched a predicted trend even if the trend were perfect.  But B-M's trend didn't follow the level rates from '79 to '84, the quick transition to a new level, or the new level rates from '87 to '96.  Their trend lagged the actual transition in levels and then, instead of leveling off, headed slowly upward.

Total & Nongun Suicide Rate History (Baker-McPhedran data)

Their model for the gun homicide rates acted similarly to the gun suicide model, starting off heading downward immediately although the actual data trend was pretty level until about '87.  But their trend CI was pretty much like the actual data trend by the end of the "pre" period, so their observation ended up being correct.

Their trend for nongun homicide was similarly unresponsive to the true trend.  It immediately started upward although the actual data were level, levelled off briefly while the actual trend was increasing, and then was bumped too high by a combination of an extra-low rate in '84 and an extra-high rate in '88.  As a result, the CI was delayed by two years and was noticeably higher than the actual trend from '89 all the way through '93, before getting nudged back in line with the actual trend by '96 (so the poor fit didn't impact the results).

Total and Nongun Homicide Rate Histories (Baker-McPhedran data)

Except for the obvious gun accident results, the only departure from "pre" trends that B-M found (re. gun suicide) was a mistake caused entirely by the fact that their "pre" trend for gun suicide wasn't really the trend just before '97.

The statistical trend lines for the data from '91 through '96 and for the data from '91 through '97 are both shown in the gun suicide figure (repeated below).  Both these trends fit the data much better than B-M's model does (R2 is .93 for the '91-'96 data and .96 for the '91-'97 data, while B-M got only .85).  Note that the "post" '97 rates start off being a bit lower than would be predicted by the "pre" trends, then quickly follow a trend a bit higher than the "pre" trends (although not significantly different).  Note, too, that the biggest part of the drop started in '98, supporting the idea that '97 should really be considered the end of the "pre" period.

Gun Suicide Rate History (Baker-McPhedran data)

The proper conclusion was that gun suicide rates dropped for only two to three years below what the previous trend would predict, then continued on at least as high as that trend would predict.

And what about substitution?  See the plots of the nongun to gun ratios for B-M's data.

History of Nongun to Gun Ratio for Suicide (Baker-McPhedran data)

Note that the suicide ratios after '97 (from the B-M data) do not show the levelling off that Chapman's data yielded.  This is because B-M's '04 data point continued upward, making the '98 data point look like an extraordinarily high value.  (The ratios are very sensitive to the small gun suicide values in the denominator, so it doesn't take much of a difference in gun suicide to make a big difference in the ratio.)  The ratios from the B-M data indicate that substitution continued its climb with only a short perturbation (a short, sudden jump) in '98.

Note that B-M, by only "eyeballing" gun and nongun suicide, saw possible method substitution only in the '98 outlier rather than recognizing that substitution continued rising much as it had before.

History of Nongun to Gun Ratio for Homicide (Baker-McPhedran data)

The plot of the method substitution ratio for homicide has a peculiar peak for '93-'95, then maybe the start of another for '98-'00, before taking off to new heights.  Because of that first peak, it is not clear that the increases immediately following '96 were abnormal compared to the period before '96.  Method substitution did, though, definitely increase in comparison to pre-'93.  The substitution continued rising, but B-M saw neither the previous substitution trend nor the "after" substitution trend because they didn't analyze substitution explicitly.  Note that the curve is considerably different from the one created from Chapman's data.



The Neill-Leigh (N-L) analysis was not published in a journal.  It was basically a repeat of the B-M analysis, but with additional ARIMA analyses using logarithms of the values and using data going all the way back to 1915.  N-L said the purpose of their paper was to "highlight important flaws in BM."

N-L reported that using a decade more of data, and using the data from 1915 on, yielded progressively higher estimates of "post" gun homicide and gun suicide—so they found that gun homicide as well as gun suicide ran below prediction (B-M had found this only for gun suicide).  And they reported that using logarithms rather than the actual values gave even higher predicted rates, meaning the actual "post" values were even more below prediction.

N-L were very concerned that ARIMA projections out into the future using the actual death rates could go below zero, which is physically impossible.  This was their justification for using logarithms rather than the actual rates.

N-L also complained, and analyzed at considerable length, about a supposed claim by B-M that they had controlled for or "allowed both for method substitution and for underlying trends in suicide or homicide rates."  N-L's concern was the concept that something could be used for both purposes.  They were correct in this regard, but B-M did not really use the nongun rates as controls in the usual statistical sense.  B-M did say that their including analysis of nongun suicide and homicide provided a control, but they really only controlled by "eyeballing" and trying to divine whether there was substitution or whether gun rate changes resulted from overall changes.

The truth is that nongun or overall suicide and homicide rate histories must be taken into account when examining whether gun death rate changes might be the result of a gun policy change.  If nongun and overall suicide rates start climbing at a certain date and gun suicide rates start climbing at the same percentage at the same time, or start falling by that much less than before, the change in gun rates is caused by the fact that more people are committing suicide, not by the fact that there are guns or gun laws/policies.  The nongun or overall rates should, however, be accounted for explicitly by including them in the analyses of the gun rates, rather than just by observation.

N-L are proof that powerful computers and software are not complete substitutes for the tool between the ears.  Any method or software tool that identifies a different trend at the end of a period by using more than 18 years of data points is worthless or misused.  As we have already seen, the true trends leading up to '96/'97 started no more than ten years earlier.  N-L indicated that the main reason that using a longer period of data shifted the trends for gun suicide and homicide rates was that the rates were at historically high levels in the late '70s and early '80s.  This tells us that their ARIMA was responding to long-term trend rather than the trend just before '97.  It appears that they assume that nothing has been driving gun death rates over the years until maybe the '96/'97 gun changes.

And the theoretical possibility that a projection into some time in the future could go to zero or negative is irrelevant if that future time is beyond the time of interest in the study.  What happens 20 years after the gun changes cannot logically be pinned on those changes.  The effects would have to be seen from immediately to maybe up to three years after a change in order to be potentially linked to the change, even if all other potential causes of impacts were accounted for (which is impossible because of the lack of relevant data).

The only legitimate justification for logarithmic transformation of the death rates would be if the rates appeared to change exponentially—either growing or declining exponentially with time.  Logarithmic transformation predicts lower than true trends, especially when the random departures from trend are large (as with the homicide data), because the conversion weights the "ups" less than the "downs."  To match a 50 percent down departure, an up departure must be 200 percent.
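
The asymmetry claimed above is easy to verify numerically.  In log space, a year at 50 percent of trend and a year at 150 percent of trend do not cancel; the average comes out below trend, and only a year at 200 percent of trend offsets one at 50 percent:

```python
# Numeric check of the asymmetry described above: in log space a value
# at 50% of trend and a value at 150% of trend do not cancel; only a
# value at 200% of trend offsets one at 50%.

from math import log, exp

def log_mean(values):
    """Geometric mean via the average of logs."""
    return exp(sum(log(v) for v in values) / len(values))

print(log_mean([0.5, 1.5]))   # below 1: the "down" outweighs the "up"
print(log_mean([0.5, 2.0]))   # 1: a 200%-of-trend year cancels a 50% year
```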


Lee and Suardi (L-S) looked at the preceding studies and saw a lot of disagreement, and they saw continuing controversy (especially about gun homicide results).  So they did a much more complex analysis which, instead of looking for something to change about '96/'97, did an analysis for every possible break point using data for all the years (1915 through 2004).  Their approach was looking for years at which something big happened to the death rates.  Their analysis found that something very probably happened to gun homicide in '51 and '87, to nongun homicide in '50, and to gun suicide in '87.  That is, new trends were almost certainly established for homicide and gun suicide at about those years.  (Note that we identified '87 as the start of the gun homicide and gun suicide trends using simple plots, eyes and working brains.  Note, too, that the '50/'51 homicide break points are obvious from a simple data plot.)
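
The L-S break-point search can be illustrated in a greatly simplified form (a real analysis uses formal structural-break tests, not raw sums of squares).  With invented rates that are flat until '87 and then drop and decline, scanning every candidate break year and fitting separate least-squares lines to the two segments recovers the first year of the new trend:

```python
# Greatly simplified sketch of an exhaustive break-point search (a
# stand-in for the formal structural-break tests L-S used): for every
# candidate break year, fit separate least-squares lines to the two
# segments and keep the year giving the smallest total squared error.

def sse_of_fit(xs, ys):
    """Sum of squared errors of a least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

# Invented rates: flat at 3.0 through 1987, then a drop and a decline.
years = list(range(1979, 2000))
rates = [3.0 if y <= 1987 else 2.0 - 0.15 * (y - 1988) for y in years]

best = min(range(3, len(years) - 3),           # need >= 3 points per side
           key=lambda k: sse_of_fit(years[:k], rates[:k]) +
                         sse_of_fit(years[k:], rates[k:]))
print(years[best])   # first year of the new trend
```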

Their analyses found no significant trend change around '96/'97.  Because the gun changes of that time appear to have had little impact, the less dramatic gun changes of Victoria and the Northern Territory in '88 could not be claimed to be the cause of the changes L-S found in '87.  It is possible that the '87 changes may have resulted, not from government policy changes, but from the psyche of Australians being changed by the multiple mass killings of '87.

There is something very important about the L-S findings that they did not note.  Their finding that the trends changed significantly at about '87 means that the "pre" trends used in determining whether or not something impacted the rates about '96/'97 should not include data points from before '87, unless the analysis method is agile enough to respond quickly to trend changes.  Their findings support the use of slightly fewer "pre" data points (years) than even Chapman and B-M used.

We believe that the L-S type of analysis would be better in identifying times when changes occurred if the analysis used "pre" and "post" trend identifications (regressions) of no more than 10 years of data (each) so that the trends compared would be the actual trends at the different times (rather than long-term trends).  But L-S did do one check using data only from '88 onward, and found no abrupt change.