Econometrics
Showing new listings for Thursday, 14 November 2024
- [1] arXiv:2411.07031 (cross-list from cs.AI) [pdf, other]
  Title: Evaluating the Accuracy of Chatbots in Financial Literature
  Subjects: Artificial Intelligence (cs.AI); Econometrics (econ.EM)
We evaluate the reliability of two chatbots, ChatGPT (4o and o1-preview versions) and Gemini Advanced, in providing references on financial literature, employing novel methodologies. Alongside the conventional binary approach commonly used in the literature, we developed a nonbinary approach and a recency measure to assess how hallucination rates vary with how recent a topic is. After analyzing 150 citations, we found that ChatGPT-4o had a hallucination rate of 20.0% (95% CI, 13.6%-26.4%), while o1-preview had a rate of 21.3% (95% CI, 14.8%-27.9%). In contrast, Gemini Advanced exhibited a much higher hallucination rate of 76.7% (95% CI, 69.9%-83.4%). While hallucination rates increased for more recent topics, the trend was not statistically significant for Gemini Advanced. These findings emphasize the importance of verifying chatbot-provided references, particularly in rapidly evolving fields.
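The reported intervals are consistent with a simple normal-approximation (Wald) interval for a binomial proportion. The abstract does not state which interval was used, so this is an assumption, but the numbers line up; a minimal check for the ChatGPT-4o figures (20.0% of 150 citations, i.e. 30 hallucinated):

```python
import math

def wald_ci(k, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a binomial proportion
    with k successes out of n trials."""
    p = k / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, p - z * se, p + z * se

# ChatGPT-4o: 30 of 150 citations hallucinated
p, lo, hi = wald_ci(30, 150)
print(f"{p:.1%} (95% CI, {lo:.1%}-{hi:.1%})")   # 20.0% (95% CI, 13.6%-26.4%)
```

The o1-preview (32/150) and Gemini Advanced (115/150) intervals reproduce the same way.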
- [2] arXiv:2411.08188 (cross-list from stat.ME) [pdf, html, other]
  Title: MSTest: An R-Package for Testing Markov Switching Models
  Subjects: Methodology (stat.ME); Econometrics (econ.EM); Computation (stat.CO)
We present the R package MSTest, which implements hypothesis testing procedures to identify the number of regimes in Markov switching models. These models have wide-ranging applications in economics, finance, and numerous other fields. The MSTest package includes the Monte Carlo likelihood ratio test procedures proposed by Rodriguez-Rondon and Dufour (2024), the moment-based tests of Dufour and Luger (2017), the parameter stability tests of Carrasco, Hu, and Ploberger (2014), and the likelihood ratio test of Hansen (1992). Additionally, the package enables users to simulate and estimate univariate and multivariate Markov switching and hidden Markov processes, using the expectation-maximization (EM) algorithm or maximum likelihood estimation (MLE). We demonstrate the functionality of the MSTest package through both simulation experiments and an application to U.S. GNP growth data.
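MSTest itself is an R package; as a language-neutral sketch of the data-generating process these tests target, here is a minimal Python simulation of a two-regime Markov-switching AR(1). All parameter values are hypothetical illustrations, not defaults from the package:

```python
import numpy as np

def simulate_ms_ar1(n, mu, phi, sigma, P, seed=0):
    """Simulate a two-regime Markov-switching AR(1):
        y_t = mu[s_t] + phi * y_{t-1} + sigma[s_t] * eps_t,
    where the regime s_t follows a Markov chain with transition
    matrix P (P[i, j] = probability of moving from regime i to j)."""
    rng = np.random.default_rng(seed)
    s = np.zeros(n, dtype=int)
    y = np.zeros(n)
    for t in range(1, n):
        s[t] = rng.choice(2, p=P[s[t - 1]])
        y[t] = mu[s[t]] + phi * y[t - 1] + sigma[s[t]] * rng.standard_normal()
    return y, s

# Persistent regimes: expected duration 1 / (1 - 0.95) = 20 periods
P = np.array([[0.95, 0.05],
              [0.05, 0.95]])
y, s = simulate_ms_ar1(500, mu=[1.0, -1.0], phi=0.5, sigma=[1.0, 2.0], P=P)
```

The hypothesis tests in the package ask, in effect, whether data like `y` require two regimes or are adequately described by a single-regime model.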
- [3] arXiv:2411.08491 (cross-list from stat.ME) [pdf, other]
  Title: Covariate Adjustment in Randomized Experiments Motivated by Higher-Order Influence Functions
  Comments: 62 pages, 8 figures
  Subjects: Methodology (stat.ME); Econometrics (econ.EM); Statistics Theory (math.ST)
Higher-Order Influence Functions (HOIF), developed in a series of papers over the past twenty years, are a fundamental theoretical device for constructing rate-optimal causal-effect estimators from observational studies. However, the value of HOIF for analyzing well-conducted randomized controlled trials (RCTs) has not been explicitly explored. Recent US Food & Drug Administration (FDA) and European Medicines Agency (EMA) guidelines on covariate adjustment in RCT analysis recommend reporting, in addition to the simple unadjusted difference-in-means estimator, an estimator that adjusts for baseline covariates via a simple parametric working model, such as a linear model. In this paper, we show that an HOIF-motivated estimator of the treatment-specific mean has significantly better statistical properties than popular adjusted estimators used in practice when the number of baseline covariates $p$ is relatively large compared to the sample size $n$. We also characterize the conditions under which the HOIF-motivated estimator improves upon the unadjusted estimator. Furthermore, we demonstrate that a novel debiased adjusted estimator proposed recently by Lu et al. is, in fact, another HOIF-motivated estimator in disguise. Finally, simulation studies are conducted to corroborate our theoretical findings.
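For concreteness, here is a minimal Python sketch contrasting the two estimators the guidelines mention: the unadjusted difference-in-means and a linear working-model adjustment (an ANCOVA-type estimator with centered covariates). This is the benchmark pair the paper compares against, not the HOIF-motivated estimator itself, and the simulated trial is purely illustrative:

```python
import numpy as np

def dim_and_adjusted(y, a, X):
    """Unadjusted difference-in-means vs. a linear working-model
    adjustment: regress Y on centered covariates within each arm and
    compare the fitted means at the average covariate value."""
    Xc = X - X.mean(axis=0)          # center on the full-sample mean

    def arm_mean(mask):
        Z = np.column_stack([np.ones(mask.sum()), Xc[mask]])
        beta, *_ = np.linalg.lstsq(Z, y[mask], rcond=None)
        return beta[0]               # fitted mean at the average covariate

    dim = y[a == 1].mean() - y[a == 0].mean()
    adj = arm_mean(a == 1) - arm_mean(a == 0)
    return dim, adj

# Illustrative simulated trial with true effect 1.0 (all values hypothetical)
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.standard_normal((n, p))
a = rng.integers(0, 2, n)                         # randomized assignment
y = 1.0 * a + X @ rng.standard_normal(p) + rng.standard_normal(n)
dim, adj = dim_and_adjusted(y, a, X)
```

The adjusted estimate typically has smaller variance when covariates are prognostic; the paper's HOIF-motivated estimator targets the regime where $p$ is large relative to $n$, which this simple working model does not handle.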
Cross submissions (showing 3 of 3 entries)
- [4] arXiv:2206.04902 (replaced) [pdf, html, other]
  Title: Forecasting macroeconomic data with Bayesian VARs: Sparse or dense? It depends!
  Subjects: Econometrics (econ.EM); Applications (stat.AP); Methodology (stat.ME)
Vector autoregressions (VARs) are widely applied for modeling and forecasting macroeconomic variables. In high dimensions, however, they are prone to overfitting. Bayesian methods, more concretely shrinkage priors, have been shown to improve prediction performance. In the present paper, we introduce the semi-global framework, in which we replace the traditional global shrinkage parameter with group-specific shrinkage parameters. We show how this framework can be applied to various shrinkage priors, such as global-local priors and stochastic search variable selection priors. We demonstrate the virtues of the proposed framework in an extensive simulation study and in an empirical application forecasting data of the US economy. Further, we shed more light on the ongoing "Illusion of Sparsity" debate, finding that forecasting performance under sparse and dense priors varies across evaluated economic variables and across time frames. Dynamic model averaging, however, can combine the merits of both worlds.
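The core idea, one shrinkage parameter per coefficient group rather than a single global one, can be caricatured with a ridge-type prior. The paper itself works with global-local and stochastic search variable selection priors; this simplified Python sketch and every value in it are illustrative assumptions:

```python
import numpy as np

def semi_global_ridge(Y, X, groups, lams):
    """Ridge-type estimate of VAR coefficients where the prior
    precision for regressor row i is lams[groups[i]]: one shrinkage
    parameter per coefficient group instead of one global value.
    (A deliberately simplified caricature of the semi-global idea.)"""
    D = np.diag([lams[g] for g in groups])      # group-specific prior precision
    return np.linalg.solve(X.T @ X + D, X.T @ Y)

# Toy two-variable system: leave group 0 nearly unshrunk, shrink
# group 1 (say, the cross-lag coefficients) essentially to zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))   # lagged regressors (illustrative)
Y = rng.standard_normal((100, 2))   # left-hand-side variables
B = semi_global_ridge(Y, X, groups=[0, 1], lams={0: 0.1, 1: 1e6})
```

Replacing the single penalty with `lams` indexed by `groups` is the whole point: different blocks of the coefficient matrix can be shrunk at different rates.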
- [5] arXiv:2401.01804 (replaced) [pdf, html, other]
  Title: Efficient Computation of Confidence Sets Using Classification on Equidistributed Grids
  Subjects: Econometrics (econ.EM); Machine Learning (stat.ML)
Economic models produce moment inequalities, which can be used to form tests of the true parameters. Confidence sets (CS) for the true parameters are derived by inverting these tests. However, they often lack analytical expressions, necessitating a grid search in which the CS is obtained numerically by retaining the grid points that pass the test. When the test statistic is not asymptotically pivotal, constructing a critical value for each grid point in the parameter space adds to the computational burden. In this paper, we recast this computational problem as a classification problem using a support vector machine (SVM) classifier: its decision function provides a faster and more systematic way of dividing the parameter space into two regions, inside versus outside the confidence set. We label points in the CS as 1 and points outside as -1. Researchers can train the SVM classifier on a grid of manageable size and use it to determine whether points on denser grids are in the CS. We establish conditions on the grid under which there is a tuning that asymptotically reproduces the test defining the CS: in the limit, a point is classified as belonging to the confidence set if and only if it is labeled 1 by the SVM.
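A minimal self-contained sketch of the classification step, using a stylized confidence set (accept theta iff ||theta|| <= 1) and a small hand-rolled linear SVM on quadratic features in place of a library implementation; grid sizes and tuning are illustrative:

```python
import numpy as np

def train_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Soft-margin linear SVM via Pegasos-style subgradient descent
    on the regularized hinge loss (a stand-in for an SVM library)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (X[i] @ w)
            w *= 1.0 - eta * lam            # shrink (L2 regularization)
            if margin < 1:                  # hinge subgradient step
                w += eta * y[i] * X[i]
    return w

def feats(pts):
    """Quadratic features (plus a constant for the bias) so that a
    disc-shaped confidence set becomes linearly separable."""
    return np.column_stack([np.ones(len(pts)), pts, pts ** 2,
                            pts[:, 0] * pts[:, 1]])

def grid(m):
    g = np.linspace(-2.0, 2.0, m)
    return np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)

# Stylized test inversion: pretend theta passes the test iff ||theta|| <= 1.
coarse = grid(21)                                    # manageable training grid
labels = np.where((coarse ** 2).sum(1) <= 1, 1, -1)  # +1 = inside the CS
w = train_svm(feats(coarse), labels)

dense = grid(81)                                     # much denser grid
pred = np.sign(feats(dense) @ w)                     # classify without re-testing
truth = np.where((dense ** 2).sum(1) <= 1, 1, -1)
accuracy = (pred == truth).mean()
```

On this toy problem the decision function recovers the disc closely; in the paper's setting, each training label instead comes from actually running the (expensive) test at that grid point.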
- [6] arXiv:2104.04716 (replaced) [pdf, other]
  Title: Selecting Penalty Parameters of High-Dimensional M-Estimators using Bootstrapping after Cross-Validation
  Comments: 164 pages, 14 figures
  Subjects: Statistics Theory (math.ST); Econometrics (econ.EM)
We develop a new method for selecting the penalty parameter for $\ell_{1}$-penalized M-estimators in high dimensions, which we refer to as bootstrapping after cross-validation. We derive rates of convergence for the corresponding $\ell_1$-penalized M-estimator and also for the post-$\ell_1$-penalized M-estimator, which refits the non-zero entries of the former estimator without penalty in the criterion function. We demonstrate via simulations that our methods are not dominated by cross-validation in terms of estimation errors and can outperform cross-validation in terms of inference. As an empirical illustration, we revisit Fryer Jr (2019), who investigated racial differences in police use of force, and confirm his findings.
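As a heavily stylized reading of the procedure's name: cross-validated residuals feed a multiplier bootstrap of the maximal score, whose upper quantile sets the penalty. The exact score, constant, and quantile below are common conventions for $\ell_1$-penalized estimation, not necessarily the paper's tuning, and the data are illustrative:

```python
import numpy as np

def bootstrap_after_cv_lambda(X, resid_cv, B=500, alpha=0.05, c=1.1, seed=0):
    """Set the l1 penalty to an upper quantile of the bootstrapped
    maximal score max_j |(2/n) * sum_i e_i X_ij|, where the e_i are
    multiplier-bootstrap perturbations of cross-validation residuals.
    (A stylized sketch of the bootstrapping-after-cross-validation
    idea; not a faithful implementation of the paper's method.)"""
    n, _ = X.shape
    rng = np.random.default_rng(seed)
    stats = np.empty(B)
    for b in range(B):
        e = resid_cv * rng.standard_normal(n)        # multiplier bootstrap
        stats[b] = np.abs((2.0 / n) * (X.T @ e)).max()
    return c * np.quantile(stats, 1.0 - alpha)

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 20))
resid_cv = rng.standard_normal(100)   # stand-in for CV residuals
lam = bootstrap_after_cv_lambda(X, resid_cv)
```

The appeal over plain cross-validation is that the bootstrap quantile targets the noise level of the score directly, which matters for post-selection inference.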
- [7] arXiv:2406.04191 (replaced) [pdf, html, other]
  Title: Strong Approximations for Empirical Processes Indexed by Lipschitz Functions
  Subjects: Statistics Theory (math.ST); Econometrics (econ.EM); Probability (math.PR); Methodology (stat.ME)
This paper presents new uniform Gaussian strong approximations for empirical processes indexed by classes of functions based on $d$-variate random vectors ($d\geq1$). First, a uniform Gaussian strong approximation is established for general empirical processes indexed by possibly Lipschitz functions, improving on previous results in the literature. In the setting considered by Rio (1994), and if the function class is Lipschitzian, our result improves the approximation rate $n^{-1/(2d)}$ to $n^{-1/\max\{d,2\}}$, up to a $\operatorname{polylog}(n)$ term, where $n$ denotes the sample size. Remarkably, we establish a valid uniform Gaussian strong approximation at the rate $n^{-1/2}\log n$ for $d=2$, which was previously known to be valid only for univariate ($d=1$) empirical processes via the celebrated Hungarian construction (Komlós et al., 1975). Second, a uniform Gaussian strong approximation is established for multiplicative separable empirical processes indexed by possibly Lipschitz functions, which addresses some outstanding problems in the literature (Chernozhukov et al., 2014, Section 3). Finally, two other uniform Gaussian strong approximation results are presented when the function class is a sequence of Haar bases based on quasi-uniform partitions. Applications to nonparametric density and regression estimation are discussed.
- [8] arXiv:2409.01911 (replaced) [pdf, html, other]
  Title: Variable selection in convex nonparametric least squares via structured Lasso: An application to the Swedish electricity distribution networks
  Subjects: Methodology (stat.ME); Econometrics (econ.EM)
We study the problem of variable selection in convex nonparametric least squares (CNLS). Whereas the least absolute shrinkage and selection operator (Lasso) is a popular technique for least squares, its variable selection performance in CNLS problems is unknown. In this work, we investigate the performance of the Lasso estimator and find that it is usually unable to select variables efficiently. Exploiting the unique structure of the subgradients in CNLS, we develop a structured Lasso method that combines the $\ell_1$-norm and the $\ell_{\infty}$-norm. A relaxed version of the structured Lasso is proposed to achieve model sparsity and predictive performance simultaneously, where the two effects, variable selection and model shrinkage, are controlled by separate tuning parameters. A Monte Carlo study verifies the finite-sample performance of the proposed approaches. We also use real data from Swedish electricity distribution networks to illustrate the effects of the proposed variable selection techniques. The results from the simulation and the application confirm that the proposed structured Lasso performs favorably, generally leading to sparser and more accurate predictive models than conventional Lasso methods in the literature.
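One plausible reading of "combining the $\ell_1$-norm and the $\ell_{\infty}$-norm" over the CNLS subgradients is an $\ell_1$/$\ell_\infty$ group norm: the $\ell_\infty$ norm within each variable's column of subgradient coordinates, summed ($\ell_1$) across variables, so that entire variables can be zeroed out at once. A sketch of the penalty alone (the grouping is an assumption on our part; the paper embeds the norm in the full CNLS optimization problem):

```python
import numpy as np

def l1_linf_penalty(Xi):
    """Combined norm on an (n x d) subgradient matrix Xi: the
    l_inf norm over the n observations within each variable's
    column, summed (l1) across the d variables. Driving a whole
    column's max to zero removes that variable from every fitted
    hyperplane, which is what yields variable selection."""
    return np.abs(Xi).max(axis=0).sum()

Xi = np.array([[0.5, 0.0, -1.2],
               [0.3, 0.0,  0.8]])
pen = l1_linf_penalty(Xi)   # 0.5 + 0.0 + 1.2 = 1.7; column 2 is inactive
```

A plain Lasso ($\ell_1$ on all entries) would shrink individual subgradient coordinates without tying an observation's coordinates for the same variable together, which is consistent with the inefficient selection the abstract reports.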