Bayesian discovery sampling: A simple model of Bayesian inference in auditing


Tilburg University

Bayesian discovery sampling

van Batenburg, P.C.; Kriens, J.

Publication date:

1987

Document Version

Publisher's PDF, also known as Version of record

Link to publication in Tilburg University Research Portal

Citation for published version (APA):

van Batenburg, P. C., & Kriens, J. (1987). Bayesian discovery sampling: A simple model of Bayesian inference in auditing. (pp. 1-9). (Ter Discussie FEW). Faculteit der Economische Wetenschappen.



Bayesian Discovery Sampling: a simple

model of Bayesian Inference in

Auditing

Paul C. van Batenburg and J. Kriens

No. 87.04


Bayesian Discovery Sampling: a simple model of Bayesian Inference in Auditing

by

Paul C. van Batenburg*

and J. Kriens*

* Touche Ross Netherlands, Center for Quantitative Methods and Statistics, World Trade Center, PoB 72302, 1007 AV Amsterdam, The Netherlands.


ABSTRACT

Once auditors have been convinced of the advantages of Bayesian inference, the difficulties they face in practical applications are not the ones statisticians have. The mathematical formulations of prior and posterior probabilities only need to fulfill the auditor's subjective ideas about the presence of errors in a population to be audited; exact derivations are left to specialists.

The auditor, however, has other problems to solve:

1. How can he objectively specify his prior knowledge about the population?

2. How can he objectively use posterior probabilities to decide on how to audit this population?

In this paper, the above-mentioned questions are answered by showing that the methodology of discovery sampling gives all the information needed to specify the prior and to interpret the posterior densities used in a Bayesian version of a methodology that has already been used by auditors for a number of years.


1. INTRODUCTION

In the Dutch branch of Touche Ross International, a worldwide company of certified public accountants, statistical methods in auditing have been applied for a period of 25 years. A whole methodology has been designed, resulting in a number of publications, both in Dutch and in the international literature, on hypothesis testing, error evaluation methods, regression estimators and outgoing quality limit methods (Kriens (1979), Kriens and Dekkers (1979), Kriens and Veenstra (1985), Van Batenburg, Kriens and Veenstra (1987)).

At the moment, progress is being made in the implementation of Bayesian inference in interval estimation, based on the Cox-Snell model (Cox and Snell (1979)), of which the theoretical properties have been studied rather thoroughly, cf. Moors and Janssens (1987). As an example of how fruitful Bayesian inference can be, the Center for Quantitative Methods and Statistics of Touche Ross Netherlands has built a simple model in which the Bayesian notion of prior and posterior probabilities is combined with the classical method of discovery sampling.

In section 2 of this paper, our version of discovery sampling in the classical manner is presented. In section 3, Bayesian methodology is applied to the parameters of this classical method. Section 4 describes the complete model of Bayesian discovery sampling, and in section 5 some numerical examples are presented to show the savings in sample size obtained by using Bayesian inference rather than classical methodology.

2. DISCOVERY SAMPLING

In this section, a brief outline of discovery sampling as used by Touche Ross Netherlands is presented; the aim of this paper is not to discuss this theory, but to show the advantages of Bayesian inference.

Let $p$ be the error percentage in a population. The null hypothesis

$$H_0: p = 0$$

is tested against

$$H_1: p > 0.$$

The probability of a type I error, $\alpha$ (wrongly rejecting a perfect population), must be zero, and the critical region of this test can only be

$$x > 0,$$

$x$ representing the number of errors in a random sample of size $n$ taken from the population to be audited. By taking this very null hypothesis, standard testing theory is reasonably simplified: attention can now be focused completely on the probability of a type II error. The symbol $\beta$ is, as usual, given to the probability of accepting a population that is not perfect. The random variable $x$ follows a hypergeometric probability function which, for reasonably large population sizes and sample sizes not exceeding 10% of population size, can be approximated by a binomial probability function:

$$\beta = P(x = 0 \mid n, p) = (1-p)^n.$$

The parameters $\beta_0$ and $p_1$ are chosen by the auditor, stating: 'when the true error percentage exceeds $p_1$, the probability of not noticing this from the sample may not exceed $\beta_0$'. Sample sizes can now be deduced:

$$\beta \le \beta_0 \quad \text{when} \quad n \ge \frac{\log \beta_0}{\log (1-p_1)}.$$

Some interesting minimal sample sizes used for testing in this manner are presented in table 1, to which can be added that in practical applications $\beta_0$ is usually chosen to be 1% or 5%, whereas $p_1$ almost never exceeds 5%.

Table 1. Sample sizes for discovery sampling - classical procedure based on binomial probabilities *)

$p_1$    $\beta_0$ =   1%    2%    3%    4%    5%
 1%                   459   390   349   321   299
 2%                   228   194   174   160   149
 3%                   152   129   116   106    99
 4%                   113    96    86    79    74
 5%                    90    77    69    63    59
 6%                    75    64    57    53    49
 7%                    64    54    49    45    42
 8%                    56    47    43    39    36
 9%                    49    42    38    35    32
10%                    44    38    34    31    29

*) Poisson approximations to this formula, which are often used, are mathematically a little simpler but will always give sample sizes that are at least as large.
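The sample-size rule of this section is easy to check numerically. The following Python sketch is an illustration only; the function names are not from the paper. It computes the smallest $n$ with $(1-p_1)^n \le \beta_0$ and, for comparison, the size obtained when $(1-p_1)^n$ is replaced by the Poisson term $e^{-n p_1}$.

```python
import math


def classical_sample_size(p1: float, beta0: float) -> int:
    """Smallest n with (1 - p1)**n <= beta0 (binomial approximation)."""
    return math.ceil(math.log(beta0) / math.log(1.0 - p1))


def poisson_sample_size(p1: float, beta0: float) -> int:
    """Smallest n with exp(-n * p1) <= beta0 (Poisson approximation)."""
    return math.ceil(-math.log(beta0) / p1)


if __name__ == "__main__":
    # A few (p1, beta0) combinations; the first two columns of output
    # can be compared with the corresponding cells of table 1.
    for p1, beta0 in [(0.01, 0.01), (0.02, 0.01), (0.05, 0.05)]:
        print(p1, beta0,
              classical_sample_size(p1, beta0),
              poisson_sample_size(p1, beta0))
```

Because $-\log(1-p_1) > p_1$ for $0 < p_1 < 1$, the Poisson-based size can never be smaller than the binomial one.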

3. A BAYESIAN VIEW ON THE PARAMETERS IN DISCOVERY SAMPLING

The critical error percentage $p_1$, chosen by the auditor in order to decide on the sample size to be used, together with the maximal allowed probability $\beta_0$ of a type II error, is also the outcome of calculating the upper limit of the one-sided $100(1-\beta_0)\%$ confidence interval for $p$, given a random sample of size $n$ in which no errors occurred. This can be verified by writing out the formula by which this upper limit is calculated:

$$\min \{\, p \mid P(x = 0 \mid n, p) \le \beta_0 \,\}.$$
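Since $P(x = 0 \mid n, p) = (1-p)^n$ decreases in $p$, this minimum has the closed form $1 - \beta_0^{1/n}$. The short Python sketch below (the function name is illustrative, not taken from the paper) evaluates it and can be used to confirm that an error-free sample of 59 yields an upper limit just below 5% at $\beta_0 = 5\%$.

```python
def upper_limit_zero_errors(n: int, beta0: float) -> float:
    """Smallest p with (1 - p)**n <= beta0, i.e. p = 1 - beta0**(1/n)."""
    return 1.0 - beta0 ** (1.0 / n)


if __name__ == "__main__":
    # Error-free sample of 59 items, beta0 = 5%
    print(upper_limit_zero_errors(59, 0.05))
```

With one item fewer (n = 58) the limit rises above 5%, which is why 59 is the minimal sample size for $p_1 = 5\%$ and $\beta_0 = 5\%$.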

From a Bayesian point of view the one-sided confidence interval for $p$ can be interpreted as a probability statement on the random variable $\underline{p}$:

$$P(\underline{p} > p_1) = \beta_0.$$

In this way, the results of an audit sampling in year $t$ can be used to identify a prior distribution on $p$: the classical confidence interval that resulted from year $t$'s audit is a probability statement from a prior probability density function. In this example, the prior distribution (Pr(.)) of $p$ is assumed to be a beta density with parameters $r = 1$ and $s$, the latter to be identified by the probability mentioned above:

$$\Pr(p) = \begin{cases} s(1-p)^{s-1} & 0 \le p \le 1 \\ 0 & \text{elsewhere.} \end{cases}$$

The fact that $r$ is chosen to be 1 implies that the mode of the prior density lies at $p = 0$, which is consistent with the fact that the classical test is on a null hypothesis $p = 0$.

Similarly, the posterior distribution for $p$ which follows from the Bayesian model presented in the next section results in a probability statement on the random variable $\underline{p}$:

$$P(\underline{p} > p_2) = \beta_2,$$

with $p_2$ and $\beta_2$ to be chosen. By stating the classical parameters $\beta_2$ and $p_2$ in the usual way, 'when the error percentage exceeds $p_2$, the probability of not noticing this from the sample may not exceed $\beta_2$', the auditor has declared which posterior probability he wants to achieve using the Bayesian model. This complete model is presented in section 4.


4. THE MODEL

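4.1 Prior probability density for the random variable $\underline{p}_t$:

$$\Pr(p) = \begin{cases} s(1-p)^{s-1} & 0 \le p \le 1 \\ 0 & \text{elsewhere.} \end{cases}$$

4.2 Probability of zero errors in a random sample of size $n$ from a population with error proportion $p$:

$$L(x = 0 \mid n, p) = (1-p)^n.$$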

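4.3 The posterior probability function for $p$ results from the following calculations:

$$\mathrm{Po}(p \mid x = 0, n) = \frac{L(x = 0 \mid n, p)\,\Pr(p)}{\int_0^1 L(x = 0 \mid n, p)\,\Pr(p)\,dp} = \frac{s(1-p)^{n+s-1}}{\int_0^1 s(1-p)^{n+s-1}\,dp} = (n+s)(1-p)^{n+s-1}, \qquad 0 \le p \le 1.$$

4.4 Prior identification: from year $t$'s audit a $100(1-\alpha_t)\%$ confidence interval is calculated to have one-sided upper limit $p_t$. For the random variable $\underline{p}_t$ this means

$$P(\underline{p}_t > p_t) = \alpha_t,$$

$$\int_{p_t}^{1} s(1-p)^{s-1}\,dp = (1-p_t)^s,$$

which gives:

$$s = \frac{\log \alpha_t}{\log (1-p_t)}.$$

4.5 Requirements for posterior identification: auditing in year $(t+1)$ should give

$$P(\underline{p}_{t+1} > p_{t+1}) = \alpha_{t+1},$$

$$\int_{p_{t+1}}^{1} (n+s)(1-p)^{n+s-1}\,dp = (1-p_{t+1})^{n+s},$$

so that the tail probability does not exceed $\alpha_{t+1}$ when

$$n + s \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})}.$$

4.6 Required sample size in year $(t+1)$:

$$n \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})} - \frac{\log \alpha_t}{\log (1-p_t)}.$$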
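The steps 4.1 to 4.5 can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: the function names and the midpoint-rule integration are assumptions made for this example. It identifies $s$ from a prior upper limit, then compares the closed-form posterior tail $(1-p_{t+1})^{n+s}$ with a direct numerical normalisation of prior times likelihood.

```python
import math


def prior_parameter(p_t: float, alpha_t: float) -> float:
    """s such that the beta(1, s) prior satisfies P(p > p_t) = alpha_t (4.4)."""
    return math.log(alpha_t) / math.log(1.0 - p_t)


def posterior_tail_closed_form(p_limit: float, n: int, s: float) -> float:
    """P(p > p_limit | x = 0, n) from the posterior density of 4.3."""
    return (1.0 - p_limit) ** (n + s)


def posterior_tail_numeric(p_limit: float, n: int, s: float,
                           steps: int = 100_000) -> float:
    """Same tail probability, obtained by integrating prior * likelihood
    with the midpoint rule and normalising numerically."""
    def prior_times_likelihood(p: float) -> float:
        return s * (1.0 - p) ** (s - 1.0) * (1.0 - p) ** n

    def midpoint_integral(a: float, b: float) -> float:
        h = (b - a) / steps
        return h * sum(prior_times_likelihood(a + (i + 0.5) * h)
                       for i in range(steps))

    return midpoint_integral(p_limit, 1.0) / midpoint_integral(0.0, 1.0)


if __name__ == "__main__":
    s = prior_parameter(0.05, 0.05)   # prior: upper limit 5% at 95% confidence
    n = 41                            # illustrative new error-free sample
    print(round(s, 2))
    print(posterior_tail_closed_form(0.03, n, s))
    print(posterior_tail_numeric(0.03, n, s))
```

For this prior $s$ is about 58.4, and with $n = 41$ both tail probabilities at $p_{t+1} = 3\%$ come out just under 5%.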

4.7 Until now, the Bayesian random variables $\underline{p}_t$ and $\underline{p}_{t+1}$ have been treated as being completely identical. In fact this means that the auditor has stated that the populations to be audited in years $t$ and $(t+1)$ are completely equivalent. It is the auditor's, not the statistician's, responsibility to specify a factor $f$ on the interval (0,1) that indicates how certain he is about this equivalency. This being no more than an example, no harm has yet been done by varying this factor from 0 to 1 in steps of 10%, but further research may not neglect the auditor's responsibility to find a sensible solution:

$$n \ge \frac{\log \alpha_{t+1}}{\log (1-p_{t+1})} - f\,\frac{\log \alpha_t}{\log (1-p_t)}.$$
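As a minimal sketch of this bound (the function and argument names are illustrative assumptions, and no rounding is applied), the right-hand side can be evaluated as a real number. Setting $f = 0$ ignores last year's audit and returns the classical bound of section 2; setting $f = 1$ returns the bound of 4.6.

```python
import math


def sample_size_bound(p_next: float, alpha_next: float,
                      p_prev: float, alpha_prev: float, f: float) -> float:
    """Real-valued lower bound on n from 4.7 (not rounded to an integer)."""
    posterior_exponent = math.log(alpha_next) / math.log(1.0 - p_next)
    s = math.log(alpha_prev) / math.log(1.0 - p_prev)  # prior parameter, 4.4
    return max(0.0, posterior_exponent - f * s)


if __name__ == "__main__":
    # Prior from year t: upper limit 5% at 95% confidence.
    # Target for year t+1: upper limit 3% at 95% confidence.
    for f in (0.0, 0.5, 1.0):
        print(f, sample_size_bound(0.03, 0.05, 0.05, 0.05, f))
```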


5. NUMERICAL EXAMPLES

Let us assume that in year $t$, an auditor has drawn a random sample of 59, in which no errors were found. When deciding on the audit sampling plan for year $(t+1)$, the auditor again has to decide on the critical value of the error percentage and the confidence level required. If, for example, the auditor once more decides to take $\alpha_{t+1} = 5\%$ and $p_{t+1} = 5\%$, a new sample of 59 is required.

The auditor, however, by using his prior knowledge, can judge whether there is a justified reason to choose a value of $f$. Logically, taking $f = 100\%$ results in a zero sample size, because this assumption implies that last year's audit sample is completely sufficient for this year's audit. Quite a daring assumption! The bottom row of table 2 shows the sample sizes required for this year's audit with $\alpha_{t+1} = 5\%$ and $p_{t+1} = 5\%$, depending on the chosen value of $f$.

Let us furthermore assume that the auditor will take responsibility for setting $f$ at 70%. He can now decide on two strategies, or even a combination of these:

- by taking a new sample of 59, he can perform discovery sampling with $p_{t+1} = 3\%$ and $\alpha_{t+1} = 5\%$ (table 2), which would have required 99 sample items without Bayesian inference;

- by taking a new sample of 50, he can perform discovery sampling with $p_{t+1} = 5\%$ and $\alpha_{t+1} = 1\%$ (table 3), which would have required 90 sample items without Bayesian inference.
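The two strategies can be read the other way round: given the new sample size, which posterior upper limit is actually achieved? The following Python sketch is an illustration under the model of section 4 with the prior weighted by $f$; the function name is an assumption of this example. It solves $(1-p)^{n+fs} = \alpha_{t+1}$ for $p$.

```python
import math


def posterior_upper_limit(n: int, alpha_next: float,
                          p_prev: float, alpha_prev: float, f: float) -> float:
    """Upper limit p with (1 - p)**(n + f*s) = alpha_next after an
    error-free sample of size n, where s is the prior parameter of 4.4."""
    s = math.log(alpha_prev) / math.log(1.0 - p_prev)
    return 1.0 - alpha_next ** (1.0 / (n + f * s))


if __name__ == "__main__":
    # Prior from year t: upper limit 5% at 95% confidence, f = 70%.
    print(posterior_upper_limit(59, 0.05, 0.05, 0.05, 0.7))  # first strategy
    print(posterior_upper_limit(50, 0.01, 0.05, 0.05, 0.7))  # second strategy
```

The first call gives a limit slightly below 3% at 95% confidence, the second a limit slightly below 5% at 99% confidence, in line with the two strategies described above.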


Table 2. Sample sizes in Bayesian discovery sampling
prior: upper limit 5%; confidence level 95%
confidence level of posterior upper limit is 95%

posterior  without   comparability factor f
limit      Bayes     100%   90%   80%   70%   60%   50%   40%   30%   20%   10%
1%         299        241   247   253   259   264   270   276   282   288   294
2%         149         91    97   103   109   114   120   126   132   138   144
3%          99         41    47    53    59    64    70    76    82    88    94
4%          74         16    22    28    34    39    45    51    57    63    69
5%          59          1     7    13    19    24    30    36    42    48    54

Table 3. Sample sizes in Bayesian discovery sampling
prior: upper limit 5%; confidence level 95%
confidence level of posterior upper limit is 99%

posterior  without   comparability factor f
limit      Bayes     100%   90%   80%   70%   60%   50%   40%   30%   20%   10%
1%         459        401   407   413   419   424   430   436   442   448   454
2%         228        170   176   182   188   193   199   205   211   217   223
3%         152         94   100   106   112   117   123   129   135   141   147
4%         113         55    61    67    73    78    84    90    96   102   108
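Rows of tables 2 and 3 can be regenerated with a few lines of Python. The sketch below rests on an assumption about rounding that is not stated in the paper: the classical sample size is first rounded up to an integer, $f \cdot s$ is then subtracted, and the result is rounded up again. This convention reproduces the two values quoted in section 5 (59 and 50 at $f = 70\%$). Function names are illustrative.

```python
import math


def classical_size(p: float, alpha: float) -> int:
    """Sample size without Bayes: smallest n with (1 - p)**n <= alpha."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p))


def bayes_size(p_post: float, alpha_post: float,
               p_prior: float, alpha_prior: float, f: float) -> int:
    """Assumed table convention: ceil(rounded classical size - f * s)."""
    s = math.log(alpha_prior) / math.log(1.0 - p_prior)
    return max(0, math.ceil(classical_size(p_post, alpha_post) - f * s))


def table_row(p_post: float, alpha_post: float,
              p_prior: float = 0.05, alpha_prior: float = 0.05) -> list:
    """[without Bayes, f=100%, 90%, ..., 10%] for one posterior limit."""
    factors = [k / 10 for k in range(10, 0, -1)]
    return [classical_size(p_post, alpha_post)] + [
        bayes_size(p_post, alpha_post, p_prior, alpha_prior, f)
        for f in factors
    ]


if __name__ == "__main__":
    for p in (0.01, 0.02, 0.03, 0.04, 0.05):
        print(p, table_row(p, 0.05))   # 95% posterior confidence
    for p in (0.01, 0.02, 0.03, 0.04, 0.05):
        print(p, table_row(p, 0.01))   # 99% posterior confidence
```

Under this convention the $f = 100\%$ entry for a 5% limit at 95% confidence comes out as 1 rather than exactly 0, because the rounded classical size 59 is slightly larger than $s \approx 58.4$; the unrounded bound of 4.7 is zero in that case.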


References

Van Batenburg, P.C., Kriens, J. and R.H. Veenstra (1987), Average Outgoing Quality Limit - a revised and improved version. To be published by the University of Amsterdam in 1987.

Cox, D.R. and E.J. Snell (1979), On sampling and estimation of rare errors, Biometrika 66, pp 125-132.

Kriens, J. (1974), Statistical sampling for auditing and accounting purposes. University of Novi Sad (Yugoslavia).

Kriens, J. (1979), Statistical Sampling in Auditing. Proceedings of the 42nd Session of the International Statistical Institute, vol. XLVIII, book 3, pp 423-437, Manila.

Kriens, J. and A.C. Dekkers (1979), Statistical Sampling in Auditing (Steekproeven in de accountantscontrole), Stenfert Kroese, Leiden (in Dutch).

Kriens, J. and R.H. Veenstra (1985), Statistical Sampling in internal control by using the AOQL-System. The Statistician, 34, pp 383-390.
