Rasmus Frigaard Lemvig

Manuscripts

This tab collects all available manuscripts.

Consistency of Honest Decision Trees and Random Forests

Project with Martin Bladt available here.

Abstract: We study various types of consistency of honest decision trees and random forests in the regression setting. In contrast to related literature, our proofs are elementary and follow the classical arguments used for smoothing methods. Under mild regularity conditions on the regression function and data distribution, we establish weak and almost sure convergence of honest trees and honest forest averages to the true regression function, and moreover we obtain uniform convergence over compact covariate domains. The framework naturally accommodates ensemble variants based on subsampling and also a two-stage bootstrap sampling scheme. Our treatment synthesizes and simplifies existing analyses, in particular recovering several results as special cases. The elementary nature of the arguments clarifies the close relationship between data-adaptive partitioning and kernel-type methods, providing an accessible approach to understanding the asymptotic behavior of tree-based methods.
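To make the term "honest" concrete, here is a minimal sketch of a one-split regression tree in which one sample chooses the split and a separate sample supplies the leaf values. It is an illustration under stated assumptions, not the construction from the manuscript: the function name `fit_honest_stump`, the single covariate, the squared-error split criterion, and the toy data are all hypothetical choices made for this example.

```python
def fit_honest_stump(x_struct, y_struct, x_est, y_est):
    """One-split regression tree with honest estimation (illustrative only).

    The structure sample (x_struct, y_struct) chooses the split point;
    the estimation sample (x_est, y_est) alone determines the leaf values.
    """
    def mean(values):
        return sum(values) / len(values)

    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    # Candidate thresholds: every distinct x except the largest,
    # so that both leaves are non-empty on the structure sample.
    candidates = sorted(set(x_struct))[:-1]
    if not candidates:
        raise ValueError("need at least two distinct x values to split")

    # Structure sample: pick the threshold minimising within-leaf squared error.
    best_split, best_loss = None, float("inf")
    for s in candidates:
        left = [y for x, y in zip(x_struct, y_struct) if x <= s]
        right = [y for x, y in zip(x_struct, y_struct) if x > s]
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_split, best_loss = s, loss

    # Estimation sample: leaf means use only responses not seen when splitting.
    left = [y for x, y in zip(x_est, y_est) if x <= best_split]
    right = [y for x, y in zip(x_est, y_est) if x > best_split]
    fallback = mean(y_est)  # assumption: an empty leaf falls back to the overall mean
    left_mean = mean(left) if left else fallback
    right_mean = mean(right) if right else fallback
    return best_split, left_mean, right_mean


# Toy data (hypothetical): a step around x = 0.5.
split, left_mean, right_mean = fit_honest_stump(
    [0.1, 0.3, 0.6, 0.9], [0.0, 0.0, 1.0, 1.0],
    [0.2, 0.4, 0.7, 0.8], [0.1, 0.5, 0.9, 1.1],
)
```

Because the leaf values are averages of responses that played no role in choosing the partition, each leaf estimate behaves like a local average over a data-dependent cell, which is the link to kernel-type methods mentioned in the abstract.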

Local estimation of transition rates of jump processes through discretization

Project with Martin Bladt available here.

Abstract: We investigate the Poisson regression method for Markov and semi-Markov jump processes from a nonparametric angle, allowing the lengths of the time and duration intervals in the partition to vary with the number of observations. Imposing no structural assumptions on the true intensities, we obtain asymptotic normality of the occurrence/exposure rates under appropriate shrinking conditions on the partition lengths. We derive asymptotic normality results for both Markov and semi-Markov models using only classical central limit theorems and elementary results for counting processes. All results are illustrated on both simulated and real data.
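As a rough illustration of an occurrence/exposure rate, the sketch below counts transitions between two states in each interval of a time partition and divides by the total time spent in the origin state within that interval. It covers only the time-partitioned (Markov) case, and the function name `occurrence_exposure`, the path format, and the toy data are assumptions made for this example rather than anything taken from the manuscript.

```python
def occurrence_exposure(paths, grid, from_state, to_state):
    """Occurrence/exposure rates for from_state -> to_state on each grid interval.

    paths: list of (jumps, end_time) pairs, where jumps is a time-ordered list
           of (time, state) pairs beginning with the starting time and state.
    grid:  increasing interval endpoints [t_0, t_1, ..., t_K].
    """
    n_bins = len(grid) - 1
    occurrences = [0] * n_bins
    exposures = [0.0] * n_bins

    for jumps, end_time in paths:
        # Each sojourn: the path sits in `state` on [start, stop).
        stops = [t for t, _ in jumps[1:]] + [end_time]
        for (start, state), stop in zip(jumps, stops):
            if state != from_state:
                continue
            for b in range(n_bins):
                lo, hi = grid[b], grid[b + 1]
                # Time spent in from_state inside [lo, hi).
                exposures[b] += max(0.0, min(stop, hi) - max(start, lo))

        # Count from_state -> to_state jumps in the interval containing the jump time.
        for (_, prev), (t, new) in zip(jumps, jumps[1:]):
            if prev == from_state and new == to_state:
                for b in range(n_bins):
                    if grid[b] <= t < grid[b + 1]:
                        occurrences[b] += 1

    # Rate is undefined (NaN) where no exposure was observed.
    return [o / e if e > 0 else float("nan") for o, e in zip(occurrences, exposures)]


# Toy data (hypothetical): two paths starting in state 1, observed on [0, 2].
paths = [
    ([(0.0, 1), (0.5, 2)], 2.0),  # jumps 1 -> 2 at time 0.5
    ([(0.0, 1), (1.5, 2)], 2.0),  # jumps 1 -> 2 at time 1.5
]
rates = occurrence_exposure(paths, [0.0, 1.0, 2.0], from_state=1, to_state=2)
# rates is approximately [0.667, 2.0]: one jump over 1.5 units of exposure,
# then one jump over 0.5 units of exposure.
```

In the setting of the abstract, the interval lengths in `grid` would shrink as the number of observed paths grows; for a semi-Markov model the same counting would be done over a partition of both time and duration.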