Bivariate Dirichlet Distribution: Analysis of Statistical Properties and Parameter Estimation Methods with a Simulation Study
Sara A. El_Warrad 1, Mohamed Amraja Mohamed 2, and Mohammed A. Asselhab 3
1 Department of Statistics, Faculty of Arts and Sciences, Benghazi University, Libya.
sara.alwarrad@uob.edu.ly
2 Department of Statistics, Faculty of Science, Sebha University, Libya.
3 Department of Mathematics, Faculty of Arts and Sciences, Sebha University, Libya.
DOI: https://doi.org/10.53796/hnsj66/12
Arabic Scientific Research Identifier: https://arsri.org/10000/66/12
Volume (6) Issue (6). Pages: 162 - 174
Received at: 2025-05-07 | Accepted at: 2025-05-15 | Published at: 2025-06-01
Abstract: In this paper, we study the bivariate Dirichlet distribution and discuss some of its important statistical properties, such as product moments, the covariance, and the correlation coefficient. We also introduce a simple method for generating random pairs (X, Y) based on the marginal distribution of X and the conditional distribution of Y given X = x. Furthermore, estimators of the parameters of the bivariate Dirichlet distribution are derived using the method of moments (MME), and the maximum likelihood estimators (MLE) are also presented. Finally, a simulation study is included to evaluate the efficiency of the estimators in terms of bias and mean squared error.
Keywords: Bivariate Dirichlet distribution, maximum likelihood estimator, method of moments estimators.
1. Introduction
The bivariate beta distribution is one of the basic distributions in statistics, with useful applications in several areas: for example, modeling the proportions of substances in a mixture; brand shares, i.e., the proportions of brands of some consumer product bought by customers (Chatfield [1975]); the proportions of the electorate voting for a candidate in a two-candidate election (Hoyer and Mayer [1976]); and the dependence between two soil strength parameters (A-Grivas and Asaoka [1982]).
Bivariate beta distributions have also been used extensively as priors in Bayesian statistics (see, for example, Apostolakis and Moieni [1987]). The Dirichlet distribution is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution. Several applications of the Dirichlet distribution are discussed by Wilks [1962], Goodhardt et al. [1984], Lange [1995], Bouguila et al. [2004], Null [2009] and Wang et al. [2011]. Estimation of the parameters of the bivariate Dirichlet distribution by maximum likelihood is discussed by Nadarajah and Kotz [2007], and estimation of the parameters of the Dirichlet distribution based on entropy by Sahin et al. [2023].
This paper commences in Section 2 with an exposition of the marginal distributions associated with the bivariate Dirichlet distribution, together with a thorough examination of their statistical properties, specifically moments, product moments, covariance, and the correlation coefficient. Section 3 introduces the conditional distributions and explores their moment characteristics. A practical and accessible approach for generating random variates from the distribution is outlined in Section 4. Section 5 is dedicated to the derivation of estimators of the distribution's parameters using the method of moments; in addition, the maximum likelihood estimators (MLE) are presented and discussed. Finally, Section 6 presents a numerical illustration based on a simulation study conducted to assess the efficiency of the proposed estimators.
The present work focuses on the bivariate Dirichlet distribution, parameterized by positive values a,b,c, and d, which is defined by the subsequent probability density function (pdf):
f(x, y) = [B(a, c) B(b, d)]^(-1) x^(a-1) y^(b-1) (1 - x)^(c-b-d) (1 - x - y)^(d-1)      (1)
and
x ≥ 0, y ≥ 0, x + y ≤ 1,      (2)
where B(·, ·) denotes the beta function.
The distribution in (1) is the bivariate form of the Connor and Mosimann’s generalized Dirichlet distribution (see Connor and Mosimann [1969]). It has several applications in many areas, including Bayesian statistics, contingency tables, correspondence analysis, environmental sciences, forensic sciences, geochemistry, image analysis and statistical decision theory (see Gupta and Nadarajah [2004] for illustrations of some of these application areas).
We will now define some properties of estimators; these properties will help us in deciding whether one estimator is better than another.
Definition 1: The point estimator θ̂ is an unbiased estimator for the parameter θ if
E(θ̂) = θ.
If the estimator is not unbiased, then the difference
Bias(θ̂) = E(θ̂) - θ
is called the bias of the estimator θ̂.
Definition 2: The mean squared error (MSE) of an estimator θ̂ of the parameter θ is defined as
MSE(θ̂) = E[(θ̂ - θ)²].
The mean squared error (MSE) can be rewritten as follows:
MSE(θ̂) = Var(θ̂) + [Bias(θ̂)]².
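As a quick illustration of these definitions, the bias and MSE of an estimator can be approximated by Monte Carlo simulation. The sketch below is our own plain-Python illustration (not from the paper), using the sample mean of a Uniform(0, 1) distribution as a stand-in estimator:

```python
import random

def bias_and_mse(estimator, sampler, theta, reps=20000, n=30, seed=1):
    """Approximate Bias = E(theta_hat) - theta and MSE = E[(theta_hat - theta)^2]
    by repeated sampling; also return the Monte Carlo variance of the estimator."""
    rng = random.Random(seed)
    estimates = [estimator([sampler(rng) for _ in range(n)]) for _ in range(reps)]
    mean_est = sum(estimates) / reps
    bias = mean_est - theta
    mse = sum((e - theta) ** 2 for e in estimates) / reps
    var = sum((e - mean_est) ** 2 for e in estimates) / reps
    return bias, mse, var

# Sample mean as estimator of the mean of Uniform(0, 1); true theta = 0.5.
bias, mse, var = bias_and_mse(lambda s: sum(s) / len(s),
                              lambda rng: rng.random(), 0.5)
# Definition 2's decomposition MSE = Var + Bias^2 holds exactly for these
# Monte Carlo quantities (up to floating-point rounding).
```

With n = 30 the MSE should be close to Var(U)/n = (1/12)/30 ≈ 0.0028, and the bias close to zero, since the sample mean is unbiased.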
The calculations throughout this paper involve some special functions, including the beta type I pdf
f(x) = x^(α-1) (1 - x)^(β-1) / B(α, β),  0 < x < 1, α > 0, β > 0,      (3)
where the beta function is given by
B(α, β) = ∫_0^1 x^(α-1) (1 - x)^(β-1) dx = Γ(α) Γ(β) / Γ(α + β).      (4)
If X has the beta type I pdf (3) with parameters α and β, the rth moment of X is
E(X^r) = B(α + r, β) / B(α, β).      (5)
The Gauss hypergeometric function has the integral representation
2F1(α, β; γ; z) = [Γ(γ) / (Γ(β) Γ(γ - β))] ∫_0^1 t^(β-1) (1 - t)^(γ-β-1) (1 - zt)^(-α) dt,      (6)
where γ > β > 0, which is given in a series form by
2F1(α, β; γ; z) = Σ_{k=0}^∞ [(α)_k (β)_k / (γ)_k] z^k / k!,  |z| < 1,      (7)
where (α)_k = α(α + 1)⋯(α + k - 1) denotes the ascending factorial (Pochhammer symbol).
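For |z| < 1 the series in (7) can be evaluated directly by truncation. The short sketch below is our own illustration (not part of the paper) and checks the truncated series against the known closed form 2F1(1, 1; 2; z) = -ln(1 - z)/z:

```python
import math

def gauss_2f1(alpha, beta, gamma, z, terms=200):
    """Truncated series (7): sum over k of (alpha)_k (beta)_k / (gamma)_k * z^k / k!,
    valid for |z| < 1, where (a)_k is the ascending factorial."""
    total = 0.0
    coef = 1.0  # current term (alpha)_k (beta)_k z^k / ((gamma)_k k!)
    for k in range(terms):
        total += coef
        coef *= (alpha + k) * (beta + k) * z / ((gamma + k) * (k + 1))
    return total

# Known special case used as a sanity check: 2F1(1, 1; 2; z) = -ln(1 - z) / z.
val = gauss_2f1(1.0, 1.0, 2.0, 0.5)
```

The recurrence multiplies each term by (α+k)(β+k)z / ((γ+k)(k+1)), which is how the Pochhammer symbols grow from one term to the next.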
The pdf of the hypergeometric function type I distribution is
(8)
(see Gupta and Nagar [2000], p. 298).
The rth moment of a random variable X having the hypergeometric function type I pdf (8) is
(9)
(for a proof, see Nagar and Alvarez [2005]).
The properties of the above special functions can be found in Gradshteyn and Ryzhik [1980], pp. 284, 286, 298, 984 and 1040.
2. Marginal Distributions
This section focuses on the derivation and analysis of the marginal distributions of the components X and Y of the bivariate Dirichlet distribution given by its probability density function (pdf) in Equation (1). We further explore key statistical properties of these marginal distributions, specifically their moments, product moments, covariance, and correlation coefficient.
2.1. The Marginal Distribution of X
Integrating Equation (1) with respect to y, we obtain the marginal pdf of X, given by
f_X(x) = x^(a-1) (1 - x)^(c-1) / B(a, c),  0 < x < 1.
Note that the marginal pdf of X is the beta type I pdf in (3); that is, X ~ Beta(a, c).
From Equation (5), we find that
E(X) = a / (a + c)      (10)
and
Var(X) = ac / [(a + c)² (a + c + 1)].      (11)
2.2. The Marginal Distribution of Y
Integrating Equation (1) with respect to x, we obtain the marginal pdf of Y, given by
f_Y(y) = [B(a, d) / (B(a, c) B(b, d))] y^(b-1) (1 - y)^(a+d-1) 2F1(b + d - c, a; a + d; 1 - y),      (12)
for 0 < y < 1, where 2F1 is the Gauss hypergeometric function defined in (6).
Note that the marginal pdf of Y belongs to the hypergeometric function type I family in (8).
From Equation (9), we have
E(Y) = bc / [(a + c)(b + d)]      (13)
and
Var(Y) = [bc / ((a + c)(b + d))] {(b + 1)(c + 1) / [(a + c + 1)(b + d + 1)] - bc / [(a + c)(b + d)]}.      (14)
2.3. Product Moments
Theorem 1: If X and Y are jointly distributed random variables with the joint pdf in Equation (1), then
E(X^r Y^s) = B(a + r, c + s) B(b + s, d) / [B(a, c) B(b, d)]      (15)
for r, s = 0, 1, 2, ….
Proof: Knowing that
E(X^r Y^s) = ∫∫ x^r y^s f(x, y) dy dx,      (16)
and substituting Equation (1) into Equation (16), we get
E(X^r Y^s) = [B(a, c) B(b, d)]^(-1) ∫_0^1 ∫_0^(1-x) x^(a+r-1) y^(b+s-1) (1 - x)^(c-b-d) (1 - x - y)^(d-1) dy dx.
Using the transformation y = t(1 - x), we get
E(X^r Y^s) = [B(a, c) B(b, d)]^(-1) ∫_0^1 x^(a+r-1) (1 - x)^(c+s-1) dx ∫_0^1 t^(b+s-1) (1 - t)^(d-1) dt.
Using Equation (4) in the above equation, we obtain Equation (15).
This completes the proof of the Theorem.
Theorem 2: If X and Y are jointly distributed random variables with the joint pdf in Equation (1), then the correlation coefficient of X and Y is given by
(17)
Proof: First, we compute the covariance of X and Y, given by
Cov(X, Y) = E(XY) - E(X) E(Y).
From Equations (15), (10) and (13), we get
Cov(X, Y) = -abc / [(a + c)² (a + c + 1)(b + d)].      (18)
Now, we determine the correlation coefficient of X and Y,
ρ(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)).
Using Equations (18), (11) and (14), we obtain Equation (17).
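Because the displayed expressions in this section may not have survived typesetting intact, the covariance can be checked independently from the stochastic representation Y = T(1 - X), with X ~ Beta(a, c) independent of T ~ Beta(b, d). The following derivation is a sketch of that computation:

```latex
\begin{align*}
E(XY) &= E\!\left[X(1-X)\right]E(T)
       = \frac{ac}{(a+c)(a+c+1)}\cdot\frac{b}{b+d},\\
E(X)\,E(Y) &= \frac{a}{a+c}\cdot\frac{c}{a+c}\cdot\frac{b}{b+d},\\
\operatorname{Cov}(X,Y) &= E(XY)-E(X)E(Y)
  = \frac{abc}{b+d}\left[\frac{1}{(a+c)(a+c+1)}-\frac{1}{(a+c)^{2}}\right]\\
 &= -\,\frac{abc}{(a+c)^{2}(a+c+1)(b+d)} \;<\; 0.
\end{align*}
```

The covariance is always negative, as expected for Dirichlet-type proportions: a larger X leaves less of the unit interval available to Y.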
3. The Conditional Density Functions
In this section, we study the conditional distributions of X and Y, which are jointly distributed with the pdf (1). We also derive the conditional moments.
Theorem 3: If X and Y are jointly distributed random variables with the joint pdf (1), then the conditional pdf of Y given X = x is given by
f(y | x) = y^(b-1) (1 - x - y)^(d-1) / [B(b, d)(1 - x)^(b+d-1)],  0 < y < 1 - x.      (19)
Equivalently,
Y/(1 - x) | X = x ~ Beta(b, d).      (20)
Proof: The conditional density function is
f(y | x) = f(x, y) / f_X(x).
Using (1) and (3), we have
f(y | x) = y^(b-1) (1 - x)^(c-b-d) (1 - x - y)^(d-1) / [B(b, d)(1 - x)^(c-1)],
which can be rewritten as
f(y | x) = [1 / B(b, d)] [y/(1 - x)]^(b-1) [1 - y/(1 - x)]^(d-1) (1 - x)^(-1),
leading to the result given in (19). Obviously, the result in (20) follows directly.
Theorem 4: If X and Y are jointly distributed random variables with the joint pdf (1), then the conditional pdf of X given Y = y is given by
f(x | y) = x^(a-1) (1 - x)^(c-b-d) (1 - x - y)^(d-1) / [B(a, d)(1 - y)^(a+d-1) 2F1(b + d - c, a; a + d; 1 - y)],  0 < x < 1 - y.      (21)
Equivalently, for T = X/(1 - y),
f(t | y) = t^(a-1) (1 - t)^(d-1) [1 - t(1 - y)]^(c-b-d) / [B(a, d) 2F1(b + d - c, a; a + d; 1 - y)],  0 < t < 1,      (22)
where 2F1 is the Gauss hypergeometric function defined in (6).
Proof: The conditional density function of X given Y = y is obtained by dividing f(x, y) in Equation (1) by f_Y(y) in Equation (12), which leads to the result in Equation (21). The conditional distribution of X/(1 - y) given Y = y in Equation (22) then follows by the change of variable t = x/(1 - y).
Note that the result in (20) belongs to the standard beta family with parameters b and d, and the result in (22) belongs to Libby and Novick's [1982] generalized beta family.
Theorem 5: If X and Y are jointly distributed random variables with the joint pdf (1), then
E(Y | X = x) = (1 - x) b / (b + d)      (23)
and
Var(Y | X = x) = (1 - x)² bd / [(b + d)² (b + d + 1)].      (24)
Proof: Since the distribution of Y/(1 - x) given X = x is Beta(b, d), then from Equation (5) we have
E(Y | X = x) = (1 - x) B(b + 1, d) / B(b, d)      (25)
and
Var(Y | X = x) = (1 - x)² [B(b + 2, d) / B(b, d) - (B(b + 1, d) / B(b, d))²].      (26)
Equations (25) and (26) can be simplified to obtain Equations (23) and (24).
From Equation (23), we see that the regression function E(Y | X = x) is linear in x, with non-homoscedastic conditional variance.
Theorem 6: If X and Y are jointly distributed random variables with the joint pdf (1), then
(27)
and
(28)
Proof: The first moment of X given Y = y is
E(X | Y = y) = ∫_0^(1-y) x f(x | y) dx,
where f(x | y) is the conditional pdf in Equation (21).
Solving the above integral by using (6), we get
(29)
Next, we determine the second moment of X given Y = y. Using Equation (6), we get
Then the conditional variance is
(30)
Equations (29) and (30) can be simplified to obtain Equations (27) and (28).
4. Algorithm for Generation of (X,Y) Observations
This section details a straightforward algorithm for simulating observations (X, Y) from the bivariate Dirichlet distribution defined by its probability density function (pdf) in Equation (1). Since the marginal distribution of X is Beta(a, c) (Equation (3)) and the conditional distribution of Y/(1 - x) given X = x is Beta(b, d) (Equation (20)), we can exploit this hierarchical structure for data generation. The following algorithm provides a simple method to obtain bivariate observations from the target distribution:
Algorithm 1
Step 1: Generate a value for X from a Beta distribution with parameters a and c.
Step 2: Generate a value for T independently from a Beta distribution with parameters b and d.
Step 3: Calculate the corresponding value for Y using the transformation Y=T(1−X).
Step 4: The resulting pair (X,Y) constitutes a generated observation from the bivariate Dirichlet distribution.
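The four steps of Algorithm 1 can be sketched in a few lines of Python using the standard library's beta sampler; the function name and structure below are our own, not from the paper:

```python
import random

def rbivdirichlet(n, a, b, c, d, seed=None):
    """Generate n pairs (X, Y) via Algorithm 1:
    X ~ Beta(a, c), T ~ Beta(b, d) independently, Y = T * (1 - X)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.betavariate(a, c)   # Step 1
        t = rng.betavariate(b, d)   # Step 2
        y = t * (1.0 - x)           # Step 3
        pairs.append((x, y))        # Step 4
    return pairs

# Case 1 parameter values from Section 6, as an example.
sample = rbivdirichlet(5000, a=2, b=4, c=1, d=3, seed=42)
# Every generated pair lies in the simplex: x, y >= 0 and x + y <= 1,
# and the sample means should be near E(X) = a/(a+c) and E(Y) = bc/((a+c)(b+d)).
```

Because T < 1, Step 3 guarantees y < 1 - x, so the pair automatically satisfies the support constraint of Equation (1).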
5. Parameter Estimation
This section focuses on the estimation of the parameters of the probability density function (pdf) provided in Equation (1). Specifically, we derive the estimators using the method of moments and subsequently present the maximum likelihood estimators (MLEs).
5.1. Method of Moments Estimation
Suppose (X₁, Y₁), …, (Xₙ, Yₙ) is a random sample from the distribution (1). Using the first two moments of the marginal distribution of X and of the conditional distribution of Y/(1 - x) given X = x, we derive the estimators of a, c and of b, d, respectively, as follows.
Let Tᵢ = Yᵢ/(1 - Xᵢ), i = 1, …, n.
To obtain estimators of a and c, we set
(31)
and
(32)
To obtain estimators of b and d, we set
(33)
(34)
Solving Equations (31)-(34) simultaneously for a, b, c and d, we obtain the corresponding method of moments estimators (MMEs):
(35)
(36)
(37)
and
(38)
where X̄ and S²_X denote the sample mean and variance of the Xᵢ, and T̄ and S²_T those of the Tᵢ.
Clearly, the estimators of a and c are moments estimators, while those of b and d are approximate moments estimators.
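Since the closed-form expressions (35)-(38) may not be fully legible above, the sketch below implements the standard beta moment-matching idea behind them: match the sample mean and variance of the Xᵢ to those of a Beta(a, c) law, and likewise the Tᵢ = Yᵢ/(1 - Xᵢ) to a Beta(b, d) law. This is our own Python rendering (the paper's computations were done in MATLAB):

```python
import random

def beta_mme(sample):
    """Moment-matching for Beta(alpha, beta):
    alpha = m*(m*(1-m)/v - 1), beta = (1-m)*(m*(1-m)/v - 1),
    where m and v are the sample mean and variance."""
    n = len(sample)
    m = sum(sample) / n
    v = sum((s - m) ** 2 for s in sample) / (n - 1)
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common

def bivdirichlet_mme(pairs):
    """MMEs of (a, b, c, d): fit the X-marginal, then T = Y/(1 - X)."""
    xs = [x for x, _ in pairs]
    ts = [y / (1.0 - x) for x, y in pairs]
    a_hat, c_hat = beta_mme(xs)
    b_hat, d_hat = beta_mme(ts)
    return a_hat, b_hat, c_hat, d_hat

# Quick check on a large simulated sample (Algorithm 1 of Section 4).
rng = random.Random(7)
pairs = []
for _ in range(20000):
    x = rng.betavariate(2, 1)
    pairs.append((x, rng.betavariate(4, 3) * (1.0 - x)))
a_hat, b_hat, c_hat, d_hat = bivdirichlet_mme(pairs)
```

On a sample this large, all four estimates should land close to the true values (a, b, c, d) = (2, 4, 1, 3).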
5.2. Maximum Likelihood Estimation (MLE)
Let (X₁, Y₁), …, (Xₙ, Yₙ) be a random sample from the distribution (1). Nadarajah and Kotz [2007] studied the maximum likelihood estimators of the distribution (1). The maximum likelihood estimators of a, b, c and d are obtained by solving the following equations simultaneously:
n ψ(a + c) - n ψ(a) + Σᵢ ln Xᵢ = 0,      (39)
n ψ(b + d) - n ψ(b) + Σᵢ ln[Yᵢ/(1 - Xᵢ)] = 0,      (40)
n ψ(a + c) - n ψ(c) + Σᵢ ln(1 - Xᵢ) = 0,      (41)
and
n ψ(b + d) - n ψ(d) + Σᵢ ln[1 - Yᵢ/(1 - Xᵢ)] = 0,      (42)
where ψ(·) denotes the digamma function.
There is no closed-form solution to Equations (39)-(42), but they can be solved numerically.
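As a numerical illustration of solving (39)-(42): under the factorization used throughout, the likelihood separates into a beta likelihood for the Xᵢ (parameters a, c) and one for Tᵢ = Yᵢ/(1 - Xᵢ) (parameters b, d), so each pair can be found by maximizing a beta log-likelihood. The sketch below does this with SciPy's general-purpose optimizer; it is our illustrative reconstruction (assuming numpy/scipy are available), not the authors' MATLAB code:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln

def beta_mle(sample):
    """Numerically maximize the Beta(alpha, beta) log-likelihood
    l = -n ln B(alpha, beta) + (alpha-1) sum ln s + (beta-1) sum ln(1-s)."""
    s = np.asarray(sample)
    n, sl, s1l = len(s), np.log(s).sum(), np.log1p(-s).sum()

    def nll(p):  # negative log-likelihood
        al, be = p
        return n * betaln(al, be) - (al - 1.0) * sl - (be - 1.0) * s1l

    res = minimize(nll, x0=[1.0, 1.0], method="L-BFGS-B",
                   bounds=[(1e-6, None), (1e-6, None)])
    return res.x

# Simulated data via Algorithm 1, then MLE for (a, c) and (b, d).
rng = np.random.default_rng(3)
x = rng.beta(2.0, 1.0, size=20000)
t = rng.beta(4.0, 3.0, size=20000)   # t plays the role of y / (1 - x)
a_hat, c_hat = beta_mle(x)
b_hat, d_hat = beta_mle(t)
```

Setting the gradient of this log-likelihood to zero reproduces exactly the score equations (39)-(42), so the optimizer is solving the same system.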
6. Numerical illustration
A numerical comparison of the method of moments and maximum likelihood estimation (MLE) was conducted to evaluate their performance in terms of mean squared error (MSE) and bias. This involved generating 200 independent random datasets from the bivariate Dirichlet distribution specified in Equation (1), with each dataset having a sample size of n∈{5,10,15,20,30}. The simulated observations (X,Y) were generated according to Algorithm 1. For each of the 200 simulated datasets, the parameter estimates were obtained for different combinations of a,b,c, and d by applying the formulas derived for the method of moments in Equations (35)-(38) and for the maximum likelihood estimation (MLE) in Equations (39)-(42). Two specific scenarios for the parameter values were examined: Case 1, where (a=2,c=1,b=4,d=3), and Case 2, where (a=4,c=3,b=2,d=1). It should be emphasized that these parameter values were chosen arbitrarily to illustrate the behavior of the estimators. In Case 1, we considered a parameter setting where a,c < b,d while in Case 2, the opposite relationship a,c > b,d was investigated. The resulting bias and MSE for the parameter estimators in both cases are presented in Tables 1 through 4. All numerical computations were performed using a MATLAB program.
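The design above can be sketched end to end. The following small Python program is our reconstruction (not the authors' MATLAB code) of the bias/MSE computation for the MMEs in Case 1 with n = 30 and 200 replications; exact values will differ from Tables 1-4 because the random streams differ:

```python
import random

TRUE = {"a": 2.0, "b": 4.0, "c": 1.0, "d": 3.0}  # Case 1

def beta_mme(sample):
    # Match the sample mean/variance to a Beta(alpha, beta) law.
    n = len(sample)
    m = sum(sample) / n
    v = sum((s - m) ** 2 for s in sample) / (n - 1)
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common

def one_replication(rng, n=30):
    # Algorithm 1: x ~ Beta(a, c), t ~ Beta(b, d); the observed y would be
    # t*(1-x), and the estimator maps it back to t = y/(1-x).
    xs = [rng.betavariate(TRUE["a"], TRUE["c"]) for _ in range(n)]
    ts = [rng.betavariate(TRUE["b"], TRUE["d"]) for _ in range(n)]
    a_hat, c_hat = beta_mme(xs)
    b_hat, d_hat = beta_mme(ts)
    return {"a": a_hat, "b": b_hat, "c": c_hat, "d": d_hat}

rng = random.Random(2025)
reps = [one_replication(rng) for _ in range(200)]
bias = {k: sum(r[k] for r in reps) / 200 - TRUE[k] for k in TRUE}
mse = {k: sum((r[k] - TRUE[k]) ** 2 for r in reps) / 200 for k in TRUE}
```

By Definition 2, each Monte Carlo MSE must be at least the squared Monte Carlo bias, which makes a useful internal consistency check on any run.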
Table 1: Bias of the estimators when a = 2, c = 1, b = 4 and d = 3.

  n    Estimator       a          b          c          d
  5    MME          3.0309     5.5273     1.3775     4.2905
       MLE          2.9610     5.7576     1.4210     4.4740
 10    MME          0.8400     1.5475     0.3969     1.1253
       MLE          0.9850     1.6651     0.4724     1.2143
 15    MME          0.5951     0.8081     0.2322     0.5845
       MLE          0.5618     0.8901     0.2104     0.6429
 20    MME          0.3646     0.7927     0.1963     0.5669
       MLE          0.3517     0.8500     0.1792     0.6104
 30    MME          0.2519     0.2395     0.0752     0.2246
       MLE          0.2678     0.2933     0.0773     0.2689
Table 2: MSE of the estimators when a = 2, c = 1, b = 4 and d = 3.

  n    Estimator       a           b           c           d
  5    MME         58.7551    235.3628     17.8314    118.3329
       MLE         51.0677    239.6029     18.2878    121.2371
 10    MME          5.5171      9.6759      0.8314      5.2540
       MLE          5.5924     10.1622      1.0102      5.5500
 15    MME          2.3824      5.1362      0.3012      2.7081
       MLE          1.9822      5.3426      0.2420      2.8157
 20    MME          0.7026      3.7932      0.2040      2.1676
       MLE          0.6564      3.8197      0.1848      2.1977
 30    MME          0.5246      1.7351      0.0979      0.9372
       MLE          0.4982      1.7256      0.0900      0.9362
Table 3: Bias of the estimators when a = 4, c = 3, b = 2 and d = 1.

  n    Estimator       a          b          c          d
  5    MME          4.4189     2.9008     3.0896     1.2041
       MLE          4.5585     2.8990     3.2119     1.2610
 10    MME          1.3683     0.7995     1.1035     0.3186
       MLE          1.3931     0.8275     1.1264     0.3269
 15    MME          0.6717     0.4600     0.5452     0.1756
       MLE          0.7373     0.4763     0.5979     0.1806
 20    MME          0.5211     0.3506     0.4234     0.1510
       MLE          0.5643     0.3623     0.4583     0.1531
 30    MME          0.3390     0.2164     0.1855     0.1028
       MLE          0.3491     0.2441     0.1870     0.1135
Table 4: MSE of the estimators when a = 4, c = 3, b = 2 and d = 1.

  n    Estimator       a           b           c           d
  5    MME         95.9743     91.3910     45.3577     13.6211
       MLE         94.0243     84.1383     45.0749     13.4095
 10    MME         10.2379      4.5377      6.3146      0.8359
       MLE          9.7983      4.3752      6.3432      0.8300
 15    MME          4.0496      1.3517      2.9493      0.2681
       MLE          4.0984      1.0974      3.0680      0.2465
 20    MME          3.1955      0.9875      1.7836      0.2326
       MLE          2.8719      0.8531      1.6848      0.1965
 30    MME          1.2952      0.6279      0.6207      0.1338
       MLE          1.2465      0.5607      0.6959      0.1140
From Table 1, we see that the bias of the MMEs is less than that of the MLEs for all parameters and all values of n, except for the parameter a when n = 5, 15 and 20, and the parameter c when n = 15 and 20. However, the differences in bias are not significant.
From Table 2, we see that the MSE of the MMEs is less than that of the MLEs for all parameters and all values of n, except for the parameter a when n = 5, 15, 20 and 30, the parameter b when n = 30, the parameter c when n = 15, 20 and 30, and the parameter d when n = 30. Again, the differences in MSE between the two estimators are very small.
From Table 3, we see that the bias of the MMEs is smaller than that of the MLEs for all parameters and all values of n, except for the parameter b when n = 5.
From Table 4, we see that the differences in the MSE of the two estimators are very small, although the MMEs have the smaller MSE in most cases.
We may conclude that both estimators perform approximately the same with regard to bias and MSE.
References
[1] A-Grivas, D. and Asaoka, A. (1982). Slope safety prediction under static and seismic loads. Journal of the Geotechnical Engineering Division, Proceedings of the American Society of Civil Engineers, 108, 713-729.
[2] Apostolakis, F. J. and Moieni, P. (1987). The foundations of models of dependence in probabilistic safety assessment. Reliability Engineering, 18, 177-195.
[3] Bouguila, N. et al. (2004). Unsupervised learning of a finite mixture model based on the Dirichlet distribution and its application. IEEE Transactions on Image Processing, 13(11), 1533-1543.
[4] Chatfield, C. (1975). A marketing application of a characterization theorem. In: A Modern Course on Distributions in Scientific Work, Vol. 2: "Model Building and Model Selection" (G. P. Patil, S. Kotz and J. K. Ord, eds.), pp. 175-185. Reidel, Dordrecht, The Netherlands.
[5] Connor, R. and Mosimann, J. (1969). Concepts of independence for proportions with a generalization of the Dirichlet distribution. Journal of the American Statistical Association, 64, 194-206.
[6] Goodhardt, G. et al. (1984). The Dirichlet: a comprehensive model of buying behaviour. Journal of the Royal Statistical Society, Series A (General), 621-655.
[7] Gradshteyn, I. and Ryzhik, I. (1980). Table of Integrals, Series, and Products. Academic Press.
[8] Gupta, A. K. and Nagar, D. K. (2000). Matrix Variate Distributions. New York: Chapman and Hall/CRC.
[9] Gupta, A. K. and Nadarajah, S. (2004). Handbook of Beta Distribution and Its Applications. Marcel Dekker, New York.
[10] Hoyer, R. W. and Mayer, L. (1976). The equivalence of various objective functions in a stochastic model of electoral competition. Technical Report No. 114, Series 2, Department of Statistics, Princeton University.
[11] Lange, K. (1995). Applications of the Dirichlet distribution to forensic match probabilities. Genetica, 96(1-2), 107-117.
[12] Libby, D. L. and Novick, M. R. (1982). Multivariate generalized beta distributions with applications to utility assessment. Journal of Educational Statistics, 7, 271-294.
[13] Nadarajah, S. and Kotz, S. (2007). Proportions, sums and ratios. Advances in Statistical Analysis, 91, 93-106.
[14] Nagar, D. K. and Alvarez, A. J. (2005). Properties of the hypergeometric function type I distribution. Advances and Applications in Statistics, 5(3), 341-351.
[15] Null, B. (2009). Modeling baseball player ability with a nested Dirichlet distribution. Journal of Quantitative Analysis in Sports, 5(2).
[16] Sahin, B. et al. (2023). Parameter estimation of the Dirichlet distribution based on entropy. Axioms, 12, 947.
[17] Trick, S. et al. (2023). Parameter estimation for a bivariate beta distribution with arbitrary beta marginals and positive correlation. Metron, 81, 163-180.
[18] Wang, K. et al. (2011). Dirichlet and Related Distributions: Theory, Methods and Applications, Chapter 1. John Wiley and Sons.
[19] Wilks, S. (1962). Mathematical Statistics. New York: John Wiley and Sons.