Tilburg University
Bayesian discovery sampling
van Batenburg, P.C.; Kriens, J.
Publication date:
1987
Document Version
Publisher's PDF, also known as Version of record
Link to publication in Tilburg University Research Portal
Citation for published version (APA):
van Batenburg, P. C., & Kriens, J. (1987). Bayesian discovery sampling: A simple model of Bayesian inference in auditing. (pp. 1-9). (Ter Discussie FEW). Faculteit der Economische Wetenschappen.
Bayesian Discovery Sampling: a simple
model of Bayesian Inference in
Auditing
Paul C. van Batenburg and J. Kriens
No. 87.04
Bayesian Discovery Sampling: a simple model of Bayesian Inference in Auditing
by
Paul C. van Batenburg *)
and J. Kriens *)
*) Touche Ross Netherlands, Center for Quantitative Methods and Statistics, World Trade Center, PoB 72302, 1007 AV Amsterdam, The Netherlands.
ABSTRACT
Once auditors have been convinced of the advantages of Bayesian inference, their difficulties in practical applications are not the ones statisticians have. The mathematical formulations of prior and posterior probabilities only need to match the auditor's subjective ideas about the presence of errors in a population to be audited; exact derivations are left to specialists.
The auditor, however, has other problems to solve:
1. How can he objectively specify his prior knowledge about the population?
2. How can he objectively use posterior probabilities to decide on how to audit this population?
In this paper, the above mentioned questions are answered by showing that the methodology of discovery sampling gives all the information needed to specify the prior and to interpret the posterior densities used in a Bayesian version of a methodology that has already been used by auditors for a number of years.
1. INTRODUCTION
In the Dutch branch of Touche Ross International, a worldwide company of certified public accountants, statistical methods in auditing have been applied for a period of 25 years. A whole methodology has been designed, resulting in a number of publications both in Dutch and in international literature on hypothesis testing, error evaluation methods, regression estimators and on outgoing quality limit methods (Kriens (1979), Kriens and Dekkers (1979), Kriens and Veenstra (1985), Van Batenburg, Kriens and Veenstra (1987)).
At the moment, progress is being made in the implementation of Bayesian inference in interval estimation, based on the Cox-Snell model (Cox and Snell (1979)), of which the theoretical properties have been studied rather thoroughly, cf. Moors and Janssans (1987). As an example of how fruitful Bayesian inference can be, the Center for Quantitative Methods and Statistics of Touche Ross Netherlands has built a simple model in which the Bayesian notion of prior and posterior probabilities is combined with the classical method of discovery sampling.
In section 2 of this paper, our version of discovery sampling in the classical manner is presented. In section 3, Bayesian methodology is applied to the parameters of this classical method. Section 4 describes the complete model of Bayesian discovery sampling, and in section 5 some numerical examples are presented to show the savings in sample size obtained by using Bayesian inference instead of the classical methodology.

2. DISCOVERY SAMPLING
In this section, a brief outline of discovery sampling as used by Touche Ross Netherlands is presented; the aim of this paper is not to discuss this theory, but to show the advantages of Bayesian inference.
Let $p$ be the error percentage in a population. The null hypothesis

$$H_0: p = 0$$

is tested against

$$H_1: p > 0.$$

The probability of a type I error, $\alpha$ (wrongly rejecting a perfect population), must be zero, and the critical region of this test can only be

$$x > 0,$$
$x$ representing the number of errors in a random sample of size $n$ taken from the population to be audited. By taking this very null hypothesis, standard testing theory is simplified considerably: attention can now be focused completely on the probability of a type II error. The symbol $\beta$ is, as usual, given to the probability of accepting a population that is not perfect. The random variable $x$ follows a hypergeometric probability function which, for reasonably large population sizes and sample sizes not exceeding 10% of the population size, can be approximated by a binomial probability function:

$$\beta = P(x = 0 \mid n, p) = (1-p)^n.$$
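How close the binomial approximation is can be checked directly. The sketch below compares the exact hypergeometric probability of finding zero errors with $(1-p)^n$; the population of 10 000 items, the 500 erroneous items and the function names are illustrative assumptions, not figures from the paper.

```python
from math import comb


def p_zero_hypergeometric(population: int, errors: int, n: int) -> float:
    """Exact probability of drawing no erroneous item in a sample of n,
    sampling without replacement from a finite population."""
    return comb(population - errors, n) / comb(population, n)


def p_zero_binomial(p: float, n: int) -> float:
    """Binomial approximation (1 - p)^n used in the text."""
    return (1.0 - p) ** n


# Hypothetical population: 10 000 items, 5% of them in error, sample of 59.
exact = p_zero_hypergeometric(10_000, 500, 59)
approx = p_zero_binomial(0.05, 59)
print(f"hypergeometric: {exact:.5f}  binomial: {approx:.5f}")
```

Sampling without replacement makes an error-free sample slightly less likely, so the binomial value is never below the exact one and the approximation errs on the cautious side for the auditor.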
The parameters $\beta_0$ and $p_1$ are chosen by the auditor, stating: 'when the true error percentage exceeds $p_1$, the probability of not noticing this from the sample may not exceed $\beta_0$'. Sample sizes can now be deduced:

$$\beta \le \beta_0 \quad \text{when} \quad n \ge \frac{\log \beta_0}{\log (1-p_1)}.$$

Some interesting minimal sample sizes used for testing in this manner are presented in table 1, to which can be added that in practical applications $\beta_0$ is usually chosen to be 1% or 5%, whereas $p_1$ almost never exceeds 5%.
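The sample-size rule is easy to evaluate. The following minimal sketch rounds the bound up to the next whole item; the function name `classical_sample_size` is an illustrative choice, and the four parameter pairs are picked only as examples.

```python
import math


def classical_sample_size(p1: float, beta0: float) -> int:
    """Smallest n with (1 - p1)^n <= beta0, i.e. n >= log(beta0) / log(1 - p1)."""
    return math.ceil(math.log(beta0) / math.log(1.0 - p1))


# A few combinations of critical error rate p1 and type II risk beta0.
for p1, beta0 in [(0.01, 0.01), (0.05, 0.05), (0.05, 0.01), (0.03, 0.05)]:
    print(f"p1={p1:.0%} beta0={beta0:.0%} -> n={classical_sample_size(p1, beta0)}")
```

The four sample sizes printed are 459, 59, 90 and 99.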
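Table 1. Sample sizes for discovery sampling: classical procedure based on binomial probabilities *)

| $p_1$ \ $\beta_0$ | 1% | 2% | 3% | 4% | 5% |
|---|---|---|---|---|---|
| 1% | 459 | 390 | 349 | 321 | 299 |
| 2% | 228 | 194 | 174 | 160 | 149 |
| 3% | 152 | 129 | 116 | 106 | 99 |
| 4% | 113 | 96 | 86 | 79 | 74 |
| 5% | 90 | 77 | 69 | 63 | 59 |
| 6% | 75 | 64 | 57 | 53 | 49 |
| 7% | 64 | 54 | 49 | 45 | 42 |
| 8% | 56 | 47 | 43 | 39 | 36 |
| 9% | 49 | 42 | 38 | 35 | 32 |
| 10% | 44 | 38 | 34 | 31 | 29 |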
*) Poisson approximations to this formula, which are frequently used, are mathematically a little simpler but will always give sample sizes that are at least as large.
3. A BAYESIAN VIEW ON THE PARAMETERS IN DISCOVERY SAMPLING
The critical error percentage $p_1$, chosen by the auditor in order to decide on the sample size to be used, together with the maximal probability $\beta_0$ of a type II error to be allowed, will also be the outcome of the calculation of the upper limit of the one-sided $100(1-\beta_0)\%$ confidence interval for $p$ given a random sample of size $n$ in which no errors occurred. This can be verified by specifying the formula by which this upper limit is calculated:

$$\min \{\, p \mid P(x = 0 \mid n, p) \le \beta_0 \,\}.$$
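With the binomial form $P(x = 0 \mid n, p) = (1-p)^n$ this minimum has the closed form $1 - \beta_0^{1/n}$. The short sketch below evaluates it for two sample sizes; the function name `upper_limit` and the choice of $n = 58$ and $n = 59$ are illustrative assumptions.

```python
def upper_limit(n: int, beta0: float) -> float:
    """Smallest p with (1 - p)^n <= beta0, i.e. p = 1 - beta0^(1/n)."""
    return 1.0 - beta0 ** (1.0 / n)


# Error-free samples of 59 and 58 items, evaluated at beta0 = 5%.
print(f"n=59: {upper_limit(59, 0.05):.4f}")
print(f"n=58: {upper_limit(58, 0.05):.4f}")
```

For an error-free sample of 59 the upper limit falls just below 5%, while for 58 it lies just above, which is why 59 is the smallest sample size for $p_1 = 5\%$ and $\beta_0 = 5\%$.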
From a Bayesian point of view the one-sided confidence interval for $p$ can be interpreted as a probability statement on the random variable $\underline{p}$:

$$P(\underline{p} \ge p_1) = \beta_0.$$

In this way, the results of an audit sample in year $t$ can be used to identify a prior distribution on $p$: the classical confidence interval that resulted from year $t$'s audit is a probability statement from a prior probability density function. In this example, the prior (Pr(.)) distribution of $p$ is assumed to be a beta density with parameters $r = 1$ and $s$, the latter to be identified by the probability mentioned above:

$$\Pr(p) = \begin{cases} s(1-p)^{s-1} & 0 \le p \le 1 \\ 0 & \text{elsewhere.} \end{cases}$$

The fact that $r$ is chosen to be 1 implies that the mode of the prior density lies at $p = 0$ (for $s > 1$), which is consistent with the fact that the classical test is on a null hypothesis $p = 0$.
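To make the identification of the prior concrete, the sketch below picks $s$ so that the tail probability beyond the upper limit equals the stated risk, and then checks by numerical integration that the density is proper and has the required tail. The limit of 5% at 95% confidence, the helper names and the midpoint rule are assumptions made for illustration only.

```python
import math


def prior_shape(p1: float, beta0: float) -> float:
    """Solve (1 - p1)^s = beta0 for the beta parameter s (with r = 1)."""
    return math.log(beta0) / math.log(1.0 - p1)


def prior_density(p: float, s: float) -> float:
    """Beta(1, s) density s(1 - p)^(s - 1) on [0, 1], zero elsewhere."""
    return s * (1.0 - p) ** (s - 1.0) if 0.0 <= p <= 1.0 else 0.0


def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Plain midpoint rule; accurate enough for this smooth density."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


s = prior_shape(0.05, 0.05)  # illustrative: 5% upper limit at 95% confidence
total = integrate(lambda p: prior_density(p, s), 0.0, 1.0)
tail = integrate(lambda p: prior_density(p, s), 0.05, 1.0)
print(f"s = {s:.2f}, total mass = {total:.6f}, P(p >= 5%) = {tail:.6f}")
```

The tail integral has the closed form $(1-p_1)^s$, so the numerical check only confirms what the choice of $s$ already guarantees.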
Similarly, the posterior distribution for $p$ which follows from the Bayesian model presented in the next section results in a probability statement on the random variable $\underline{p}$:

$$P(\underline{p} \ge p_2) = \beta_2,$$

with $p_2$ and $\beta_2$ to be chosen. By stating the classical parameters $\beta_2$ and $p_2$ in the usual way, 'when the error percentage exceeds $p_2$, the probability of not noticing this from the sample may not exceed $\beta_2$', the auditor has declared which posterior probability he wants to achieve using the Bayesian model. This complete model is presented in the next section.
4. THE MODEL
4.1 Prior probability for $p_t$:

$$\Pr(p) = \begin{cases} s(1-p)^{s-1} & 0 \le p \le 1 \\ 0 & \text{elsewhere.} \end{cases}$$

4.2 Probability of zero errors in a random sample of size $n$ from a population with error proportion $p$:

$$L(x = 0 \mid n, p) = (1-p)^n.$$
4.3 The posterior probability function for $p$ results from the following calculations:

$$\mathrm{Po}(p \mid x = 0, n) = \frac{L(x = 0 \mid n, p)\,\Pr(p)}{\int_0^1 L(x = 0 \mid n, p)\,\Pr(p)\,dp} = \frac{s(1-p)^{n+s-1}}{\int_0^1 s(1-p)^{n+s-1}\,dp} = (n+s)(1-p)^{n+s-1}, \qquad 0 \le p \le 1.$$

4.4 Prior identification: from year $t$'s audit a $100(1-\alpha_t)\%$ confidence interval is calculated to have one-sided upper limit $p_t$:

$$P(\underline{p}_t \ge p_t) = \alpha_t = \int_{p_t}^{1} s(1-p)^{s-1}\,dp = (1-p_t)^s,$$

which gives:

$$s = \frac{\log \alpha_t}{\log (1-p_t)}.$$
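The update in 4.3 can be checked numerically: multiplying likelihood and prior on a fine grid and normalising should reproduce the closed-form posterior. In the sketch below the prior follows 4.4 with an assumed 5% upper limit at 95% confidence, and the error-free sample of 40 items in the following year is purely hypothetical; the helper names are illustrative.

```python
import math


def posterior_closed_form(p: float, n: int, s: float) -> float:
    """Closed-form posterior (n + s)(1 - p)^(n + s - 1) from 4.3."""
    return (n + s) * (1.0 - p) ** (n + s - 1.0)


def posterior_on_grid(n: int, s: float, steps: int = 100_000):
    """Likelihood times prior on a midpoint grid, normalised numerically."""
    h = 1.0 / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    unnorm = [(1.0 - p) ** n * s * (1.0 - p) ** (s - 1.0) for p in grid]
    norm = h * sum(unnorm)
    return grid, [u / norm for u in unnorm]


s = math.log(0.05) / math.log(0.95)  # prior from 4.4: 5% limit, 95% confidence
n = 40                               # hypothetical error-free sample size
grid, dens = posterior_on_grid(n, s)
worst = max(abs(d - posterior_closed_form(p, n, s)) for p, d in zip(grid, dens))
print(f"s = {s:.2f}, largest gap between grid and closed form = {worst:.2e}")
```

The posterior is again a beta density with first parameter 1, now with $n + s$ in place of $s$, which is what makes the sample-size calculation that follows so simple.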
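4.5 Requirements for posterior identification: auditing in year $(t+1)$:

$$P(\underline{p}_{t+1} \ge p_{t+1}) = \int_{p_{t+1}}^{1} (n+s)(1-p)^{n+s-1}\,dp = (1-p_{t+1})^{n+s} \le \alpha_{t+1},$$

which gives

$$n + s \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})}.$$

4.6 Required sample size in year $(t+1)$:

$$n \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})} - \frac{\log \alpha_t}{\log (1-p_t)}.$$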
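As a worked illustration, assume (hypothetically) that year $t$ gave a 95% upper limit of 5% and that the auditor wants a 99% upper limit of 5% in year $(t+1)$. Substituting these values in 4.4 and 4.6, using natural logarithms (any base gives the same ratios):

```latex
s = \frac{\log 0.05}{\log 0.95} \approx 58.40,
\qquad
n \ge \frac{\log 0.01}{\log 0.95} - \frac{\log 0.05}{\log 0.95}
  \approx 89.78 - 58.40 = 31.38 .
```

Under these assumptions 32 error-free items would suffice, against 90 when last year's result is ignored.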
4.7 Until now, the Bayesian random variables $\underline{p}_t$ and $\underline{p}_{t+1}$ have been treated as being completely identical. In fact this means that the auditor has stated that the populations to be audited in years $t$ and $(t+1)$ are completely equivalent. It is the auditor's, not the statistician's, responsibility to specify a factor $f$ on the interval (0,1) that indicates how certain he is about this equivalence. Since this is no more than an example, no harm has yet been done by varying this factor from 0 to 1 in steps of 10%, but further research may not neglect the auditor's responsibility to find a sensible solution:

$$n \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})} - f\,\frac{\log \alpha_t}{\log (1-p_t)}.$$
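A minimal sketch of this rule, taking the inequality literally and rounding the resulting bound up to a whole item, is given below. The function and parameter names are illustrative assumptions, and the example uses the same limit and confidence level (5% at 95%) in both years.

```python
import math


def bound(p: float, alpha: float) -> float:
    """log(alpha) / log(1 - p): the exponent needed for an upper limit p
    at confidence level 1 - alpha."""
    return math.log(alpha) / math.log(1.0 - p)


def bayesian_sample_size(p_prior: float, alpha_prior: float,
                         p_post: float, alpha_post: float, f: float) -> int:
    """Smallest integer n satisfying the inequality of 4.7.

    f = 0 discards last year's audit, f = 1 counts it in full.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    n = bound(p_post, alpha_post) - f * bound(p_prior, alpha_prior)
    return max(0, math.ceil(n))


# Same limit and confidence level in both years (5% at 95%).
for f in (0.0, 0.5, 1.0):
    print(f, bayesian_sample_size(0.05, 0.05, 0.05, 0.05, f))
```

With $f = 0$ the classical sample size of 59 is recovered, $f = 0.5$ gives 30, and $f = 1$ gives a sample size of zero.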
5. NUMERICAL EXAMPLES
Let us assume that in year $t$, an auditor has drawn a random sample of 59, in which no errors were found. When deciding on the audit sampling plan for year $(t+1)$, the auditor again has to decide on the critical value of the error percentage and the confidence level required. If, for example, the auditor once more decides to take $\alpha_{t+1} = 5\%$ and $p_{t+1} = 5\%$, a new sample of 59 is required.
The auditor, however, by using his prior knowledge, can judge whether there is a justified reason to choose a value of $f$. Logically, taking $f = 100\%$ results in a zero sample size, because this assumption implies that last year's audit sample is completely sufficient for this year's audit. Quite a daring assumption! The bottom row of table 2 shows sample sizes required for this year's audit with $\alpha_{t+1} = 5\%$ and $p_{t+1} = 5\%$, depending on the chosen value of $f$.
Let us furthermore assume that the auditor takes responsibility for setting $f$ at 70%. He can now decide on two strategies, or even a combination of these:
- by taking a new sample of 59, he can perform discovery sampling with $p_{t+1} = 3\%$ and $\alpha_{t+1} = 5\%$ (table 2), which would have required 99 sample items without Bayesian inference;
- by taking a new sample of 50, he can perform discovery sampling with $p_{t+1} = 5\%$ and $\alpha_{t+1} = 1\%$ (table 3), which would have required 90 sample items without Bayesian inference.
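The two strategies can be recomputed in a few lines. The sketch assumes the prior of this section (a 95% upper limit of 5% from year $t$'s error-free sample of 59) and assumes, as the quoted figures suggest, that the classical sample size is rounded up to an integer before $f \cdot s$ is subtracted. That rounding order is an inference from the numbers, not something the text states, and all function names are illustrative.

```python
import math


def classical_n(p: float, alpha: float) -> int:
    """Classical discovery sample size, rounded up to a whole item."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p))


def bayes_n_rounded(p_post: float, alpha_post: float, f: float,
                    p_prior: float = 0.05, alpha_prior: float = 0.05) -> int:
    """Assumed convention: subtract f * s from the already rounded
    classical sample size, then round up again."""
    s = math.log(alpha_prior) / math.log(1.0 - p_prior)
    return max(0, math.ceil(classical_n(p_post, alpha_post) - f * s))


f = 0.70
for p_post, alpha_post in [(0.03, 0.05), (0.05, 0.01)]:
    without = classical_n(p_post, alpha_post)
    with_bayes = bayes_n_rounded(p_post, alpha_post, f)
    print(f"limit {p_post:.0%}, risk {alpha_post:.0%}: "
          f"{without} without Bayes, {with_bayes} with f = {f:.0%}")
```

This reproduces 99 against 59 and 90 against 50. Applying the inequality of 4.7 to the unrounded quantities instead would give 58 and 49, one item fewer in each case.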
Table 2. Sample sizes in Bayesian discovery sampling
prior: upper limit 5%; confidence level 95%
confidence level of posterior upper limit is 95%

| Posterior limit | Without Bayes | f = 100% | 90% | 80% | 70% | 60% | 50% | 40% | 30% | 20% | 10% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1% | 299 | 241 | 247 | 253 | 259 | 264 | 270 | 276 | 282 | 288 | 294 |
| 2% | 149 | 91 | 97 | 103 | 109 | 114 | 120 | 126 | 132 | 138 | 144 |
| 3% | 99 | 41 | 47 | 53 | 59 | 64 | 70 | 76 | 82 | 88 | 94 |
| 4% | 74 | 16 | 22 | 28 | 34 | 39 | 45 | 51 | 57 | 63 | 69 |
| 5% | 59 | 0 | 7 | 13 | 19 | 24 | 30 | 36 | 42 | 48 | 54 |
Table 3. Sample sizes in Bayesian discovery sampling
prior: upper limit 5%; confidence level 95%
confidence level of posterior upper limit is 99%

| Posterior limit | Without Bayes | f = 100% | 90% | 80% | 70% | 60% | 50% | 40% | 30% | 20% | 10% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1% | 459 | 401 | 407 | 413 | 419 | 424 | 430 | 436 | 442 | 448 | 454 |
| 2% | 228 | 170 | 176 | 182 | 188 | 193 | 199 | 205 | 211 | 217 | 223 |
| 3% | 152 | 94 | 100 | 106 | 112 | 117 | 123 | 129 | 135 | 141 | 147 |
| 4% | 113 | 55 | 61 | 67 | 73 | 78 | 84 | 90 | 96 | 102 | 108 |
References
Van Batenburg, P.C., Kriens, J. and R.H. Veenstra (1987), Average Outgoing Quality Limit - a revised and improved version. To be published by the University of Amsterdam in 1987.
Cox, D.R. and E.J. Snell (1979), On sampling and estimation of rare errors, Biometrika 66, pp. 125-132.
Kriens, J. (1974), Statistical sampling for auditing and accounting purposes. University of Novi Sad (Yugoslavia).
Kriens, J. (1979), Statistical Sampling in Auditing. Proceedings of the 42nd Session of the International Statistical Institute, vol. XLVIII, book 3, pp. 423-437, Manila.
Kriens, J. and A.C. Dekkers (1979), Statistical Sampling in Auditing (Steekproeven in de accountantscontrole), Stenfert Kroese, Leiden (in Dutch).
Kriens, J. and R.H. Veenstra (1985), Statistical Sampling in internal control by using the AOQL-System. The Statistician, 34, pp. 383-390.