R Code and Confidence Intervals for the Correlation Based on Empirical Likelihood, with an Application to the Correlation between GRDP and the Number of Covid-19 Cases in Indonesia

Suliadi Suliadi

doi:10.29313/statistika.v22i1.357

Keywords:

Empirical Likelihood, Pearson Correlation Coefficient, R Code, GRDP, Covid-19

Abstract

Correlation measures the strength of the linear relationship between two variables. Commonly used measures include the Pearson correlation, Spearman's rank correlation, and Kendall's tau; for numeric variables, the Pearson correlation is the usual choice. Inference for the Pearson correlation requires that the two variables follow a bivariate normal distribution, so its results are valid if that assumption is met. In practice, the normality assumption often cannot be met. One approach that has been proposed constructs the confidence interval from the empirical likelihood. This is a distribution-free method: it does not assume that the data follow any particular distribution. In this article, we discuss the construction of a confidence interval for the Pearson correlation based on the empirical likelihood method and provide R code for constructing it. We apply the method to the relationship between gross regional domestic product (GRDP) and the number of Covid-19 cases, using provincial data for Indonesia in 2020. We found a very strong relationship between GRDP and the number of Covid-19 cases in Indonesia, with a correlation of 0.939; with this method, the 99% confidence interval has a lower limit of 0.872 and an upper limit of 0.962.
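The article's R code is not reproduced on this page. As a rough illustration of how a distribution-free, likelihood-ratio interval for a correlation can be computed, the sketch below uses the jackknife empirical likelihood: it forms jackknife pseudo-values of the sample correlation and applies Owen's empirical likelihood for a mean to them. This is a related technique chosen for brevity and may differ from the construction used in the article. The sketch is in Python rather than R, all function names are hypothetical, and the data are simulated, not the GRDP and Covid-19 data.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def jackknife_pseudo_values(x, y):
    """Pseudo-values V_i = n*r - (n-1)*r_(-i), with r_(-i) the leave-one-out r."""
    n = len(x)
    r_full = pearson_r(x, y)
    return [
        n * r_full - (n - 1) * pearson_r(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        for i in range(n)
    ]


def neg2_log_el_ratio(v, rho):
    """-2 log empirical likelihood ratio for the hypothesis mean(v) = rho."""
    d = [vi - rho for vi in v]
    if min(d) >= 0 or max(d) <= 0:
        return math.inf  # rho lies outside the range of the pseudo-values
    # The Lagrange multiplier lam must keep every weight positive:
    # 1 + lam * d_i > 0, i.e. lam in (-1/max(d), -1/min(d)).
    lo, hi = -1.0 / max(d), -1.0 / min(d)
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    # sum(d_i / (1 + lam * d_i)) is strictly decreasing in lam, so bisect.
    for _ in range(100):
        mid = (lo + hi) / 2
        if sum(di / (1 + mid * di) for di in d) > 0:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    return 2 * sum(math.log(1 + lam * di) for di in d)


def jel_interval(x, y, conf=0.95):
    """Jackknife empirical likelihood interval for the correlation."""
    v = jackknife_pseudo_values(x, y)
    center = sum(v) / len(v)  # the statistic is zero here
    # Chi-square(1) quantile, written as a squared standard normal quantile.
    crit = NormalDist().inv_cdf((1 + conf) / 2) ** 2

    def boundary(outside, inside):
        # Bisect between a point where the statistic exceeds crit (outside)
        # and a point where it does not (inside).
        for _ in range(60):
            mid = (outside + inside) / 2
            if neg2_log_el_ratio(v, mid) > crit:
                outside = mid
            else:
                inside = mid
        return (outside + inside) / 2

    lower = boundary(min(v), center)
    upper = boundary(max(v), center)
    # Pseudo-values can fall outside [-1, 1], so clip the final limits.
    return max(-1.0, lower), min(1.0, upper)


# Simulated, skewed (non-normal) data; NOT the data analysed in the article.
random.seed(2020)
n = 40
x = [random.expovariate(1.0) for _ in range(n)]
y = [0.8 * xi + random.expovariate(1.0) for xi in x]

r_hat = pearson_r(x, y)
lower, upper = jel_interval(x, y, conf=0.99)
print(f"r = {r_hat:.3f}, 99% interval = ({lower:.3f}, {upper:.3f})")
```

The interval is obtained by inverting the likelihood ratio statistic against a chi-square critical value with one degree of freedom, so no bivariate normality assumption enters the calculation. Like other empirical likelihood intervals, it need not be symmetric around the point estimate.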
