Author: Trevor Hastie, Robert Tibshirani, Martin Wainwright

Publisher: CRC Press

ISBN: 1498712177

Pages: 367

Year: 2015-05-07

Discover New Methods for Dealing with High-Dimensional Data

A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data. Top experts in this rapidly evolving field, the authors describe the lasso for linear regression and a simple coordinate descent algorithm for its computation. They discuss the application of ℓ1 penalties to generalized linear models and support vector machines, cover generalized penalties such as the elastic net and group lasso, and review numerical methods for optimization. They also present statistical inference methods for fitted (lasso) models, including the bootstrap, Bayesian methods, and recently developed approaches. In addition, the book examines matrix decomposition, sparse multivariate analysis, graphical models, and compressed sensing. It concludes with a survey of theoretical results for the lasso.

In this age of big data, the number of features measured on a person or object can be large and might be larger than the number of observations. This book shows how the sparsity assumption allows us to tackle these problems and extract useful and reproducible patterns from big datasets. Data analysts, computer scientists, and theorists will appreciate this thorough and up-to-date treatment of sparse statistical modeling.

Author: Christophe Giraud

Publisher: CRC Press

ISBN: 1482237954

Pages: 270

Year: 2014-12-17

Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise. Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for handling high-dimensional data. The book is intended to expose the reader to the key concepts and ideas in the simplest settings possible while avoiding unnecessary technicalities. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this highly accessible text describes the challenges related to the analysis of high-dimensional data; covers cutting-edge statistical methods including model selection, sparsity and the lasso, aggregation, and learning theory; provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site; and illustrates concepts with simple but clear practical examples. Introduction to High-Dimensional Statistics is suitable for graduate students and researchers interested in discovering modern statistics for massive data. It can be used as a graduate text or for self-study.

Author: T.J. Hastie

Publisher: Routledge

ISBN: 1351445960

Pages: 352

Year: 2017-10-19

This book describes an array of power tools for data analysis that are based on nonparametric regression and smoothing techniques. These methods relax the linear assumption of many standard models and allow analysts to uncover structure in the data that might otherwise have been missed. While McCullagh and Nelder's Generalized Linear Models shows how to extend the usual linear methodology to cover analysis of a range of data types, Generalized Additive Models enhances this methodology even further by incorporating the flexibility of nonparametric regression. Clear prose, exercises in each chapter, and case studies enhance this popular text.

Author: Havard Rue, Leonhard Held

Publisher: CRC Press

ISBN: 0203492021

Pages: 280

Year: 2005-02-18

Gaussian Markov Random Field (GMRF) models are most widely used in spatial statistics, a very active area of research in which few up-to-date reference works are available. This is the first book on the subject that provides a unified framework of GMRFs with particular emphasis on the computational aspects. This book includes extensive case studies and, online, a C library for fast and exact simulation. With chapters contributed by leading researchers in the field, this volume is essential reading for statisticians working in spatial theory and its applications, as well as quantitative researchers in a wide range of science fields where spatial data analysis is important.

Author: Granville Tunnicliffe Wilson, Marco Reale, John Haywood

Publisher: CRC Press

ISBN: 1420011502

Pages: 340

Year: 2015-07-29

Models for Dependent Time Series addresses the issues that arise and the methodology that can be applied when the dependence between time series is described and modeled. Whether you work in the economic, physical, or life sciences, the book shows you how to draw meaningful, applicable, and statistically valid conclusions from multivariate (or vector) time series data. The first four chapters discuss the two main pillars of the subject that have been developed over the last 60 years: vector autoregressive modeling and multivariate spectral analysis. These chapters provide the foundational material for the remaining chapters, which cover the construction of structural models and the extension of vector autoregressive modeling to high frequency, continuously recorded, and irregularly sampled series. The final chapter combines these approaches with spectral methods for identifying causal dependence between time series.

Web Resource

A supplementary website provides the data sets used in the examples as well as documented MATLAB® functions and other code for analyzing the examples and producing the illustrations. The site also offers technical details on the estimation theory and methods and the implementation of the models.

Author: Bradley Efron, R.J. Tibshirani

Publisher: CRC Press

ISBN: 0412042312

Pages: 456

Year: 1994-05-15

Statistics is a subject of many uses and surprisingly few effective practitioners. The traditional road to statistical knowledge is blocked, for most, by a formidable wall of mathematics. The approach in An Introduction to the Bootstrap avoids that wall. It arms scientists and engineers, as well as statisticians, with the computational techniques they need to analyze and understand complicated data sets.

Author: Ryan Martin, Chuanhai Liu

Publisher: CRC Press

ISBN: 1439886512

Pages: 256

Year: 2015-09-25

A New Approach to Sound Statistical Reasoning

Inferential Models: Reasoning with Uncertainty introduces the authors’ recently developed approach to inference: the inferential model (IM) framework. This logical framework for exact probabilistic inference does not require the user to input prior information. The authors show how an IM produces meaningful prior-free probabilistic inference at a high level. The book covers the foundational motivations for this new IM approach, the basic theory behind its calibration properties, a number of important applications, and new directions for research. It discusses alternative, meaningful probabilistic interpretations of some common inferential summaries, such as p-values. It also constructs posterior probabilistic inferential summaries without a prior and Bayes’ formula and offers insight on the interesting and challenging problems of conditional and marginal inference. This book delves into statistical inference at a foundational level, addressing what the goals of statistical inference should be. It explores a new way of thinking compared to existing schools of thought on statistical inference and encourages you to think carefully about the correct approach to scientific inference.

Author: Peter Bühlmann, Sara van de Geer

Publisher: Springer Science & Business Media

ISBN: 364220192X

Pages: 558

Year: 2011-06-08

Modern statistics deals with large and complex data sets, and consequently with models containing a large number of parameters. This book presents a detailed account of recently developed approaches, including the Lasso and versions of it for various models, boosting methods, undirected graphical modeling, and procedures controlling false positive selections. A special characteristic of the book is that it contains comprehensive mathematical theory on high-dimensional statistics combined with methodology, algorithms and illustrations with real data examples. This in-depth approach highlights the methods’ great potential and practical applicability in a variety of settings. As such, it is a valuable resource for researchers, graduate students and experts in statistics, applied mathematics and computer science.

Author: Alan Miller

Publisher: CRC Press

ISBN: 1420035932

Pages: 256

Year: 2002-04-15

Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; a complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; and more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting and choosing models that are linear in their parameters and to understanding and correcting the bias introduced by selecting a model that fits only slightly better than others. The presentation is clear and concise, and the book belongs on the shelf of anyone researching, using, or teaching subset selection techniques.

Author: Bradley Efron, Trevor Hastie

Publisher: Cambridge University Press

ISBN: 1108107958

Pages:

Year: 2016-07-20

The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and in influence. 'Big data', 'data science', and 'machine learning' have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? This book takes us on an exhilarating journey through the revolution in data analysis following the introduction of electronic computation in the 1950s. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. The book ends with speculation on the future direction of statistics and data science.

Author: Peter Bühlmann, Petros Drineas, Michael Kane, Mark van der Laan

Publisher: CRC Press

ISBN: 1482249081

Pages: 464

Year: 2016-02-22

Handbook of Big Data provides a state-of-the-art overview of the analysis of large-scale datasets. Featuring contributions from well-known experts in statistics and computer science, this handbook presents a carefully curated collection of techniques from both industry and academia. Thus, the text instills a working understanding of key statistical and computing ideas that can be readily applied in research and practice. Offering balanced coverage of methodology, theory, and applications, this handbook describes modern, scalable approaches for analyzing increasingly large datasets; defines the underlying concepts of the available analytical tools and techniques; and details intercommunity advances in computational statistics and machine learning. Handbook of Big Data also identifies areas in need of further development, encouraging greater communication and collaboration between researchers in big data sub-specialties such as genomics, computational biology, and finance.

Author: P. McCullagh, John A. Nelder

Publisher: CRC Press

ISBN: 0412317605

Pages: 532

Year: 1989-08-01

The success of the first edition of Generalized Linear Models led to the updated Second Edition, which continues to provide a definitive, unified treatment of methods for the analysis of diverse types of data. Today, it remains popular for its clarity, richness of content and direct relevance to agricultural, biological, health, engineering, and other applications. The authors focus on examining the way a response variable depends on a combination of explanatory variables, treatment, and classification variables. They give particular emphasis to the important case where the dependence occurs through some unknown, linear combination of the explanatory variables. The Second Edition includes topics added to the core of the first edition, including conditional and marginal likelihood methods, estimating equations, and models for dispersion effects and components of dispersion. The discussion of other topics (log-linear and related models, log odds-ratio regression models, multinomial response models, inverse linear and related models, quasi-likelihood functions, and model checking) was expanded and incorporates significant revisions. Comprehension of the material requires simply a knowledge of matrix theory and the basic ideas of probability theory, but for the most part, the book is self-contained. Therefore, with its worked examples, plentiful exercises, and topics of direct use to researchers in many disciplines, Generalized Linear Models serves as an ideal text, self-study guide, and reference.

Author: Naiyang Deng, Yingjie Tian, Chunhua Zhang

Publisher: CRC Press

ISBN: 1439857938

Pages: 363

Year: 2012-12-17

Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs): classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs since optimization is one of the pillars on which SVMs are built. The authors share insight on many of their research achievements. They give a precise interpretation of statistical learning theory for C-support vector classification. They also discuss regularized twin SVMs for binary classification problems, SVMs for solving multi-classification problems based on ordinal regression, SVMs for semi-supervised problems, and SVMs for problems with perturbations. To improve readability, concepts, methods, and results are introduced graphically and with clear explanations. For important concepts and algorithms, such as the Crammer-Singer SVM for multi-class classification problems, the text provides geometric interpretations that are not depicted in current literature. Enabling a sound understanding of SVMs, this book gives beginners as well as more experienced researchers and engineers the tools to solve real-world problems using SVMs.

Author: Edoardo M. Airoldi, David Blei, Elena A. Erosheva, Stephen E. Fienberg

Publisher: CRC Press

ISBN: 1466504099

Pages: 618

Year: 2014-11-06

In response to scientific needs for more diverse and structured explanations of statistical data, researchers have discovered how to model individual data points as belonging to multiple groups. Handbook of Mixed Membership Models and Their Applications shows you how to use these flexible modeling tools to uncover hidden patterns in modern high-dimensional multivariate data. It explores the use of the models in various application settings, including survey data, population genetics, text analysis, image processing and annotation, and molecular biology. Through examples using real data sets, you’ll discover how to characterize complex multivariate data in: studies involving genetic databases; patterns in the progression of diseases and disabilities; combinations of topics covered by text documents; political ideology or electorate voting patterns; heterogeneous relationships in networks; and much more. The handbook spans more than 20 years of the editors’ and contributors’ statistical work in the field. Top researchers compare partial and mixed membership models, explain how to interpret mixed membership, delve into factor analysis, and describe nonparametric mixed membership models. They also present extensions of the mixed membership model for text analysis, sequence and rank data, and network data as well as semi-supervised mixed membership models.

Author: Inge Koch

Publisher: Cambridge University Press

ISBN: 0521887933

Pages: 526

Year: 2013-12-02

This modern approach integrates classical and contemporary methods, fusing theory and practice and bridging the gap to statistical learning.