A statistical model is a formalization of relationships between variables in the form of mathematical equations. It describes how one or more random variables are related to one or more other variables. The model is statistical because the variables are related stochastically rather than deterministically. In mathematical terms, a statistical model is frequently thought of as a pair $(Y, P)$, where $Y$ is the set of possible observations and $P$ is the set of possible probability distributions on $Y$. It is assumed that there is a distinct element of $P$ which generates the observed data. Statistical inference enables us to make statements about which element(s) of this set are likely to be the true one.

Most statistical tests can be described in the form of a statistical model. For example, Student's t-test for comparing the means of two groups can be formulated as testing whether an estimated parameter in the model differs from 0. Another similarity between tests and models is that both involve assumptions. In most models, the error is assumed to be normally distributed.[1]
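
The equivalence between the two-group t-test and a model parameter can be checked numerically. The following is a minimal sketch, assuming a pooled-variance t-test and an ordinary least squares fit of $y_i = b_0 + b_1 x_i + \varepsilon_i$ with $x_i$ coded 0 or 1 for group membership; the function names and the sample values are made up for illustration.

```python
import math


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def slope_t_statistic(x, y):
    """t statistic for b1 in the least-squares fit of y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    return b1 / se_b1


# Illustrative (made-up) measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 5.0]
group_b = [6.2, 5.9, 6.8, 6.1, 6.5]

# Dummy-code group membership: 0 for group A, 1 for group B.
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_test = pooled_t_statistic(group_a, group_b)
t_regression = slope_t_statistic(x, y)
print(t_test, t_regression)
```

With this coding, $b_1$ is the difference between the group means, so testing whether $b_1$ differs from 0 is the same as testing whether the means differ, and the two statistics agree up to rounding.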

## Formal definition

A statistical model is a collection of probability distribution functions or probability density functions (collectively referred to as distributions for brevity). A parametric model is a collection of distributions, each of which is indexed by a unique finite-dimensional parameter: $\mathcal{P}=\{\mathbb{P}_{\theta} : \theta \in \Theta\}$, where $\theta$ is a parameter and $\Theta \subseteq \mathbb{R}^d$ is the feasible region of parameters, which is a subset of d-dimensional Euclidean space. A statistical model may be used to describe the set of distributions from which one assumes that a particular data set is sampled. For example, if one assumes that data arise from a univariate Gaussian distribution, then one has assumed a Gaussian model: $\mathcal{P}=\{\mathbb{P}(x; \mu, \sigma) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left\{ -\frac{1}{2\sigma^2}(x-\mu)^2\right\} : \mu \in \mathbb{R}, \sigma > 0\}$.
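
As an illustration of the Gaussian model, the sketch below evaluates the density for a given $(\mu, \sigma)$ and then picks one element of the family for a small data set. It assumes maximum likelihood as the selection rule, which the text above does not prescribe; the function names and data values are hypothetical.

```python
import math


def gaussian_density(x, mu, sigma):
    """Density of the univariate Gaussian indexed by (mu, sigma), sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def log_likelihood(data, mu, sigma):
    """Log-likelihood of independent observations under one model element."""
    return sum(math.log(gaussian_density(x, mu, sigma)) for x in data)


def fit_gaussian(data):
    """Maximum likelihood estimates of (mu, sigma); note the 1/n variance."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat


# Made-up observations, assumed to come from some Gaussian distribution.
data = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0]
mu_hat, sigma_hat = fit_gaussian(data)
print(mu_hat, sigma_hat, log_likelihood(data, mu_hat, sigma_hat))
```

Each pair $(\mu, \sigma)$ with $\sigma > 0$ names one distribution in $\mathcal{P}$; fitting amounts to choosing a point in the two-dimensional parameter space $\Theta$.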

A non-parametric model is a set of probability distributions with infinite-dimensional parameters, and might be written as $\mathcal{P}=\{\text{all distributions}\}$. A semi-parametric model also has infinite-dimensional parameters, but is not dense in the space of distributions. By contrast, a mixture of Gaussians with one Gaussian at each data point is dense in the space of distributions. Formally, if $d$ is the dimension of the parameter and $n$ is the number of samples, the model is semi-parametric if $d \rightarrow \infty$ as $n \rightarrow \infty$ and $d/n \rightarrow 0$ as $n \rightarrow \infty$.
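
The mixture of Gaussians with one Gaussian at each data point can be sketched as follows. This is an illustration only: the equal component weights, the shared standard deviation `bandwidth`, and the data values are assumptions made here, not part of the definition above.

```python
import math


def gaussian_density(x, mu, sigma):
    """Density of a univariate Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def mixture_density(x, data, bandwidth):
    """Equal-weight mixture with one Gaussian component centred on each data point."""
    return sum(gaussian_density(x, xi, bandwidth) for xi in data) / len(data)


data = [1.0, 1.2, 3.5, 3.7, 3.9]  # made-up sample
bandwidth = 0.3                    # assumed common standard deviation

# Riemann-sum check that the mixture is a proper density (area close to 1).
grid_step = 0.01
grid = [i * grid_step for i in range(-500, 1000)]  # covers [-5, 10)
area = sum(mixture_density(x, data, bandwidth) for x in grid) * grid_step
print(area)
```

The number of components equals the number of observations, so the parameter dimension grows with the sample size rather than staying fixed as in a parametric model.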

## Model comparison

Models can be compared with one another, in either an exploratory or a confirmatory data analysis. In an exploratory analysis, you formulate all the models you can think of and see which describes your data best. In a confirmatory analysis, you test which of the models specified before the data were collected fits the data best, or, if you have only one model, whether it fits the data. In linear regression analysis, you can compare the proportion of variance explained by the independent variables, $R^2$, across the different models. More generally, nested models can be compared using a likelihood-ratio test. Nested models are models that can be obtained by restricting a parameter in a more complex model to be zero.
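
For reference, the likelihood-ratio statistic for a restricted model nested in a fuller one is usually written in the form below. The symbols are notation introduced here rather than taken from the text: $L$ is the likelihood, $\hat\theta_0$ and $\hat\theta_1$ are the maximum likelihood estimates under the restricted and the full model, and $k$ is the number of parameters restricted to zero.

$$\Lambda = -2\left[\log L(\hat\theta_0) - \log L(\hat\theta_1)\right]$$

Under regularity conditions and for large samples, $\Lambda$ is approximately $\chi^2$-distributed with $k$ degrees of freedom when the restricted model is true (Wilks' theorem), so a large value of $\Lambda$ counts as evidence against the restriction.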

## An example

Height and age are probabilistically distributed over humans. They are stochastically related: knowing that a person is 7 years old changes the probability that this person is 6 feet tall. You could formalize this relationship in a linear regression model of the following form: $\text{height}_i = b_0 + b_1 \text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is the parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ indexes the subject. This means that height starts at some value (there is a minimum height when someone is born) and is predicted by age to some extent. The prediction is not perfect, which is why the model includes an error term. This error contains variance that stems from sex and other variables. When sex is included in the model, the error term becomes smaller, as you have a better idea of the chance that a particular 16-year-old is 6 feet tall when you know this 16-year-old is a girl. The model then becomes $\text{height}_i = b_0 + b_1 \text{age}_i + b_2 \text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous. This model would presumably have a higher $R^2$. The first model is nested in the second: it is obtained from the second when $b_2$ is restricted to zero.
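
The two nested height models can be compared on simulated data. The sketch below is a rough illustration, not a recommended analysis: the data are synthetic (heights in arbitrary units, sex coded 0 or 1, coefficients chosen arbitrarily), the fit uses plain least squares via the normal equations, and the likelihood-ratio statistic assumes Gaussian errors, in which case it reduces to $n \log(\mathrm{RSS}_{\text{small}} / \mathrm{RSS}_{\text{full}})$.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(rows, y):
    """Return least-squares coefficients and the residual sum of squares.

    Each row holds the predictors for one subject, starting with 1.0
    for the intercept.
    """
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y))
    return coef, rss


# Synthetic data: every number below is an arbitrary assumption.
random.seed(0)
n = 200
age = [random.uniform(2, 18) for _ in range(n)]
sex = [random.randint(0, 1) for _ in range(n)]
height = [80 + 5.5 * a + 6 * s + random.gauss(0, 5) for a, s in zip(age, sex)]

mean_height = sum(height) / n
tss = sum((h - mean_height) ** 2 for h in height)

# Small model: height ~ age. Full model: height ~ age + sex.
coef_small, rss_small = fit_ols([[1.0, a] for a in age], height)
coef_full, rss_full = fit_ols([[1.0, a, s] for a, s in zip(age, sex)], height)

r2_small = 1 - rss_small / tss
r2_full = 1 - rss_full / tss

# Likelihood-ratio statistic under Gaussian errors (one restricted parameter, b2).
lr_statistic = n * math.log(rss_small / rss_full)
print(r2_small, r2_full, lr_statistic)
```

Because the small model is the full model with $b_2$ fixed at zero, its residual sum of squares can never be lower than that of the full model, so $R^2$ cannot decrease when sex is added.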

## Classification

According to the number of endogenous variables and the number of equations, models can be classified as complete models (the number of equations equals the number of endogenous variables) and incomplete models. Other types of statistical model include the general linear model (restricted to continuous dependent variables), the generalized linear model (for example, logistic regression), the multilevel model, and the structural equation model.[2]

## References

1. Field, A. (2005). Discovering statistics using SPSS. Sage, London.
2. Adèr, H.J. (2008). Chapter 12: Modelling. In H.J. Adèr & G.J. Mellenbergh (Eds.) (with contributions by D.J. Hand), Advising on Research Methods: A consultant's companion (pp. 271-304). Huizen, The Netherlands: Johannes van Kessel Publishing.

Original courtesy of Wikipedia: http://en.wikipedia.org/wiki/Statistical_model