[2311.18145] Sparsifying generalized linear models
source link: https://arxiv.org/abs/2311.18145
Computer Science > Data Structures and Algorithms
We consider the sparsification of sums $F : \mathbb{R}^n \to \mathbb{R}$ where $F(x) = f_1(\langle a_1,x\rangle) + \cdots + f_m(\langle a_m,x\rangle)$ for vectors $a_1,\ldots,a_m \in \mathbb{R}^n$ and functions $f_1,\ldots,f_m : \mathbb{R} \to \mathbb{R}_+$. We show that $(1+\varepsilon)$-approximate sparsifiers of $F$ with support size $\frac{n}{\varepsilon^2} (\log \frac{n}{\varepsilon})^{O(1)}$ exist whenever the functions $f_1,\ldots,f_m$ are symmetric, monotone, and satisfy natural growth bounds. Additionally, we give efficient algorithms to compute such a sparsifier assuming each $f_i$ can be evaluated efficiently.
Our results generalize the classic case of $\ell_p$ sparsification, where $f_i(z) = |z|^p$ for $p \in (0, 2]$, and give the first near-linear-size sparsifiers in the well-studied setting of the Huber loss function and its generalizations, e.g., $f_i(z) = \min\{|z|^p, |z|^2\}$ for $0 < p \leq 2$. Our sparsification algorithm can be applied to give near-optimal reductions for optimizing a variety of generalized linear models, including $\ell_p$ regression for $p \in (1, 2]$ to high accuracy, via solving $(\log n)^{O(1)}$ sparse regression instances with $m \le n(\log n)^{O(1)}$, plus runtime proportional to the number of nonzero entries in the vectors $a_1, \dots, a_m$.
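To make the setup concrete, here is a minimal NumPy sketch (our own illustration, not code from the paper) of the sum $F$ with Huber-type losses $f_i(z) = \min\{|z|^p, |z|^2\}$, together with the shape of a reweighted sparsifier: a small support set $S$ with weights $w_i$ such that $\tilde F(x) = \sum_{i \in S} w_i f_i(\langle a_i, x\rangle)$ approximates $F(x)$ within a $(1+\varepsilon)$ factor. The function and variable names below are ours; how $S$ and the weights are chosen is the content of the paper and is not implemented here.

```python
import numpy as np


def huber_type(z, p=1.0):
    """Huber-type loss f(z) = min(|z|^p, |z|^2).

    Quadratic near the origin (|z| <= 1) and |z|^p growth for large |z|,
    as in the generalization of the Huber loss discussed in the abstract.
    """
    az = np.abs(z)
    return np.minimum(az ** p, az ** 2)


def F(x, A, p=1.0):
    """F(x) = sum_{i=1}^m f_i(<a_i, x>), where the rows of A are a_1, ..., a_m."""
    return huber_type(A @ x, p).sum()


def F_sparse(x, A, weights, support, p=1.0):
    """A reweighted sparsifier F~(x) = sum_{i in S} w_i f_i(<a_i, x>).

    `support` indexes a subset S of the rows of A; `weights` holds the w_i.
    A (1+eps)-approximate sparsifier would satisfy
    F(x) <= F~(x) <= (1+eps) F(x) for all x (up to normalization).
    """
    return (weights * huber_type(A[support] @ x, p)).sum()
```

As a sanity check, taking $S = \{1,\dots,m\}$ with all weights equal to $1$ recovers $F$ exactly; the theorem says a support of size $\frac{n}{\varepsilon^2}(\log\frac{n}{\varepsilon})^{O(1)}$, far smaller than $m$, already suffices.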
Subjects: Data Structures and Algorithms (cs.DS); Functional Analysis (math.FA)
Cite as: arXiv:2311.18145 [cs.DS] (or arXiv:2311.18145v1 [cs.DS] for this version)
DOI: https://doi.org/10.48550/arXiv.2311.18145