In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule and its result (the estimate) are distinguished.

There are point and interval estimators. The point estimators yield single-valued results, although this includes the possibility of single vector-valued results and results that can be expressed as a single function. This is in contrast to an interval estimator, where the result would be a range of plausible values (or vectors or functions).

Statistical theory is concerned with the properties of estimators; that is, with defining properties that can be used to compare different estimators (different rules for creating estimates) for the same quantity, based on the same data. Such properties can be used to determine the best rules to use under given circumstances. However, in robust statistics, statistical theory goes on to consider the balance between having good properties, if tightly defined assumptions hold, and having less good properties that hold under wider conditions.

Background

An "estimator" (or "point estimator") is a statistic (that is, a function of the data) that is used to infer the value of an unknown parameter in a statistical model. The parameter being estimated is sometimes called the estimand. It can be either finite-dimensional (in parametric and semi-parametric models) or infinite-dimensional (in semi-nonparametric and non-parametric models). If the parameter is denoted θ, then the estimator is typically written by adding a circumflex over the symbol: \hat\theta. Being a function of the data, the estimator is itself a random variable; a particular realization of this random variable is called the "estimate". Sometimes the words "estimator" and "estimate" are used interchangeably.

The definition places virtually no restrictions on which functions of the data can be called "estimators". The attractiveness of different estimators can be judged by looking at their properties, such as unbiasedness, mean squared error, consistency, and asymptotic distribution. The construction and comparison of estimators are the subjects of estimation theory. In the context of decision theory, an estimator is a type of decision rule, and its performance may be evaluated through the use of loss functions.

When the word "estimator" is used without a qualifier, it usually refers to point estimation. The estimate in this case is a single point in the parameter space. Other types of estimators also exist, such as interval estimators, whose estimates are subsets of the parameter space.

The problem of density estimation arises in two applications: first, in estimating the probability density functions of random variables, and second, in estimating the spectral density function of a time series. In these problems the estimates are functions that can be thought of as point estimates in an infinite-dimensional space, and there are corresponding interval estimation problems.

Definition

Suppose there is a fixed parameter \theta that needs to be estimated. Then an "estimator" is a function that maps the sample space to a set of possible estimates. An estimator of \theta is usually denoted by the symbol \widehat{\theta}. It is often convenient to express the theory using the algebra of random variables: thus if X is used to denote a random variable corresponding to the observed data, the estimator (itself treated as a random variable) is symbolised as a function of that random variable, \widehat{\theta}(X). The estimate for a particular observed dataset (i.e. for X=x) is then \widehat{\theta}(x), which is a fixed value. Often an abbreviated notation is used in which \widehat{\theta} is interpreted directly as a random variable, but this can cause confusion.
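The distinction between the rule \widehat{\theta}(X) and the value \widehat{\theta}(x) can be sketched in a few lines of Python. This is only an illustration: the function name sample_mean and the two datasets are made up for the example and do not come from the article.

```python
from statistics import fmean

def sample_mean(sample):
    """The estimator: a rule that maps any observed sample to an estimate."""
    return fmean(sample)

# Two hypothetical observed datasets (made-up numbers for illustration).
x1 = [2.1, 1.9, 2.4, 2.0]
x2 = [1.5, 2.5, 2.2, 1.8]

estimate_1 = sample_mean(x1)  # a fixed value once x1 has been observed
estimate_2 = sample_mean(x2)  # same rule, different data, different estimate
print(estimate_1, estimate_2)
```

The function plays the role of \widehat{\theta}, while each returned number plays the role of \widehat{\theta}(x) for one particular x.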

Quantified properties

The following definitions and attributes apply:

Error

For a given sample x, the "error" of the estimator \widehat{\theta} is defined as

e(x)=\widehat{\theta}(x) - \theta,

where \theta is the parameter being estimated. Note that the error, e, depends not only on the estimator (the estimation formula or procedure), but also on the sample.

Mean squared error

The mean squared error of \widehat{\theta} is defined as the expected value (probability-weighted average, over all samples) of the squared errors; that is,

\operatorname{MSE}(\widehat{\theta}) = \operatorname{E}[(\widehat{\theta}(X) - \theta)^2].

It indicates how far, on average, the collection of estimates is from the single parameter being estimated, measured as squared distance. Consider the following analogy. Suppose the parameter is the bull's-eye of a target, the estimator is the process of shooting arrows at the target, and the individual arrows are estimates (one per sample). Then high MSE means the average squared distance of the arrows from the bull's-eye is high, and low MSE means it is low. The arrows may or may not be clustered. For example, even if all arrows hit the same point, yet grossly miss the target, the MSE is still relatively large. Note, however, that if the MSE is relatively low, then the arrows must be fairly tightly clustered, because the MSE is never smaller than the variance.
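As a small worked example, the expectation in the MSE definition can be computed exactly when the sample space is small enough to enumerate. The sketch below assumes a toy setting that is not in the article: the data are n = 2 independent rolls of a fair die, the parameter is the die's mean (7/2), and the estimator is the sample mean.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]                 # assumed population: one fair die
theta = Fraction(sum(faces), len(faces))   # true mean, 7/2
n = 2                                      # assumed sample size

def sample_mean(sample):
    return Fraction(sum(sample), len(sample))

# All 36 samples are equally likely, so the expected value is a plain average.
samples = list(product(faces, repeat=n))
errors = [sample_mean(x) - theta for x in samples]
mse = sum(e * e for e in errors) / len(samples)
print(mse)  # 35/24
```

Exact fractions are used instead of floating point so that the result can be compared directly with the closed form, here the die's variance 35/12 divided by n.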

Sampling deviation

For a given sample x, the sampling deviation of the estimator \widehat{\theta} is defined as

d(x) =\widehat{\theta}(x) - \operatorname{E}( \widehat{\theta}(X) ) =\widehat{\theta}(x) - \operatorname{E}( \widehat{\theta} ),

where \operatorname{E}( \widehat{\theta}(X) ) is the expected value of the estimator. Note that the sampling deviation, d, depends not only on the estimator, but also on the sample.

Variance

The variance of \widehat{\theta} is simply the expected value of the squared sampling deviations; that is, \operatorname{var}(\widehat{\theta}) = \operatorname{E}[(\widehat{\theta} - \operatorname{E}(\widehat{\theta}) )^2]. It indicates how far, on average, the collection of estimates is from the expected value of the estimates. Note the difference between MSE and variance. If the parameter is the bull's-eye of a target, and the arrows are estimates, then a relatively high variance means the arrows are dispersed, and a relatively low variance means the arrows are clustered. Some things to note: even if the variance is low, the cluster of arrows may still be far off-target, and even if the variance is high, the diffuse collection of arrows may still be unbiased. Finally, note that even if all arrows grossly miss the target, if they nevertheless all hit the same point, the variance is zero.

Bias

The bias of \widehat{\theta} is defined as B(\widehat{\theta}) = \operatorname{E}(\widehat{\theta}) - \theta. It is the signed difference between the average of the collection of estimates and the single parameter being estimated. It is also the expected value of the error, since \operatorname{E}(\widehat{\theta}) - \theta = \operatorname{E}(\widehat{\theta} - \theta ). If the parameter is the bull's-eye of a target, and the arrows are estimates, then a relatively high absolute value for the bias means the average position of the arrows is off-target, and a relatively low absolute bias means the average position of the arrows is on target. The arrows may be dispersed, or may be clustered. The relationship between bias and variance is analogous to the relationship between accuracy and precision.
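A concrete bias can be computed by enumeration in a toy setting. The following sketch is an assumption-laden illustration, not an example from the article: the data are two rolls of a fair die, the parameter is the largest face (6), and the estimator is the sample maximum, which can never overshoot and is therefore biased downwards.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]   # assumed population: faces of one fair die
theta = max(faces)           # parameter to estimate: the largest face, 6
n = 2                        # assumed sample size

# The estimator is the sample maximum; all 36 samples are equally likely.
samples = list(product(faces, repeat=n))
expected_estimate = Fraction(sum(max(x) for x in samples), len(samples))
bias = expected_estimate - theta
print(expected_estimate, bias)  # 161/36 -55/36
```

Individual samples such as (6, 3) still give an error of exactly zero, which shows that the bias describes the estimator as a whole rather than any single estimate.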

Unbiased

The estimator \widehat{\theta} is an unbiased estimator of \theta if and only if B(\widehat{\theta}) = 0. Note that bias is a property of the estimator, not of the estimate. Often, people refer to a "biased estimate" or an "unbiased estimate," but they really are talking about an "estimate from a biased estimator," or an "estimate from an unbiased estimator." Also, people often confuse the "error" of a single estimate with the "bias" of an estimator. A large error for one estimate does not mean the estimator is biased. In fact, even if all estimates have astronomical absolute values for their errors, the estimator is unbiased if the expected value of the error is zero. Conversely, a biased estimator can still produce an estimate whose error is zero (we may have gotten lucky). The ideal situation, of course, is to have an unbiased estimator with low variance, and also to limit the number of samples where the error is extreme (that is, to have few outliers). Yet unbiasedness is not essential. Often, if just a little bias is permitted, then an estimator can be found with lower MSE and/or fewer outlier sample estimates.
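The trade-off between bias and MSE can be illustrated with two estimators of a variance. The sketch below uses an assumed toy setting (three rolls of a fair die, not taken from the article) and compares the sum of squared deviations divided by n - 1 with the same sum divided by n; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]   # assumed population: one fair die
n = 3                        # assumed sample size
mu = Fraction(sum(faces), len(faces))
sigma2 = sum((f - mu) ** 2 for f in faces) / len(faces)   # true variance, 35/12

def var_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((v - m) ** 2 for v in sample) / divisor

def bias_and_mse(divisor):
    # Enumerate all 6**n equally likely samples and average exactly.
    estimates = [var_estimate(x, divisor) for x in product(faces, repeat=n)]
    bias = sum(estimates) / len(estimates) - sigma2
    mse = sum((e - sigma2) ** 2 for e in estimates) / len(estimates)
    return bias, mse

bias_unbiased, mse_unbiased = bias_and_mse(n - 1)   # divisor n - 1
bias_biased, mse_biased = bias_and_mse(n)           # divisor n
print(bias_unbiased, bias_biased)    # 0 -35/36
print(mse_biased < mse_unbiased)     # True
```

In this particular setting the divisor-n estimator is biased yet has the smaller MSE; whether that ordering holds in general depends on the distribution and the sample size.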

An alternative to the version of "unbiased" above, is "median-unbiased", where the median of the distribution of estimates agrees with the true value; thus, in the long run half the estimates will be too low and half too high. While this applies immediately only to scalar-valued estimators, it can be extended to any measure of central tendency of a distribution: see median-unbiased estimators.

Relationships
  • The MSE, variance, and bias are related: \operatorname{MSE}(\widehat{\theta}) = \operatorname{var}(\widehat\theta) + (B(\widehat{\theta}))^2, i.e. mean squared error = variance + square of bias. In particular, for an unbiased estimator, the variance equals the MSE.
  • The standard deviation of an estimator \widehat{\theta} of θ (the square root of its variance), or an estimate of that standard deviation, is called the standard error of \widehat{\theta}.
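The identity mean squared error = variance + square of bias can be checked exactly on a small example. The sketch below assumes two rolls of a fair die and a deliberately biased, hypothetical estimator of the mean (4/5 of the sample mean); neither choice comes from the article.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]                 # assumed population: one fair die
theta = Fraction(sum(faces), len(faces))   # true mean, 7/2
n = 2                                      # assumed sample size

def shrunk_mean(sample):
    """A deliberately biased estimator: 4/5 of the sample mean (hypothetical)."""
    return Fraction(4, 5) * Fraction(sum(sample), len(sample))

# All 36 samples are equally likely, so expectations are plain averages.
estimates = [shrunk_mean(x) for x in product(faces, repeat=n)]
expected = sum(estimates) / len(estimates)
bias = expected - theta
variance = sum((e - expected) ** 2 for e in estimates) / len(estimates)
mse = sum((e - theta) ** 2 for e in estimates) / len(estimates)

print(bias, variance, mse)           # -7/10 14/15 427/300
print(mse == variance + bias ** 2)   # True
```

Because the arithmetic is done with exact fractions, the equality holds exactly rather than up to rounding error.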

Behavioural properties

Consistency

A consistent sequence of estimators is a sequence of estimators that converges in probability to the quantity being estimated as the index (usually the sample size) grows without bound. In other words, increasing the sample size increases the probability of the estimator being close to the population parameter.

Mathematically, a sequence of estimators {t_n; n ≥ 0} is consistent for the parameter θ if and only if, for all ϵ > 0, no matter how small, we have

\lim_{n\to\infty}\Pr\left\{\left|t_n-\theta\right|<\epsilon\right\}=1.

The consistency defined above may be called weak consistency. The sequence is strongly consistent if it converges almost surely to the true value.
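The probability inside the limit can be evaluated exactly in a simple case. The sketch below assumes that t_n is the mean of n fair coin flips (θ = 1/2) and takes ϵ = 1/10; these values, and the function name prob_within, are illustrative choices rather than anything specified in the article.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 2)      # assumed true parameter: fair-coin success probability
eps = Fraction(1, 10)   # assumed tolerance

def prob_within(n):
    """Exact Pr(|t_n - p| < eps), where t_n is the mean of n Bernoulli(p) draws."""
    total = Fraction(0)
    for k in range(n + 1):                      # k = number of successes
        if abs(Fraction(k, n) - p) < eps:       # exact comparison, no rounding
            total += comb(n, k) * p ** k * (1 - p) ** (n - k)
    return float(total)

probs = {n: prob_within(n) for n in (10, 100, 1000)}
for n, pr in probs.items():
    print(n, pr)
```

The printed probabilities increase towards 1 as n grows, which is the behaviour the definition of weak consistency requires for every fixed ϵ.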

An estimator that converges to a multiple of a parameter can be made into a consistent estimator by multiplying the estimator by a scale factor, namely the true value divided by the asymptotic value of the estimator. This occurs frequently in estimation of scale parameters by measures of statistical dispersion.

Asymptotic normality

An asymptotically normal estimator is a consistent estimator whose distribution around the true parameter θ approaches a normal distribution with standard deviation shrinking in proportion to 1/\sqrt{n} as the sample size n grows. Using \xrightarrow{D} to denote convergence in distribution, t_n is asymptotically normal if

\sqrt{n}(t_n - \theta) \xrightarrow{D} N(0,V),

for some V, which is called the asymptotic variance of the estimator.

The central limit theorem implies asymptotic normality of the sample mean \bar x as an estimator of the true mean. More generally, maximum likelihood estimators are asymptotically normal under fairly weak regularity conditions (see the asymptotics section of the maximum likelihood article). However, not all estimators are asymptotically normal, the simplest examples being cases where the true value of a parameter lies on the boundary of the allowable parameter region.
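For the sample mean, the convergence in distribution can be made visible by comparing an exact distribution function with its normal limit. The sketch below assumes Bernoulli data with p = 3/10 (so V = p(1 - p)), and measures the largest gap, over the attainable values of t_n, between the exact distribution function of \sqrt{n}(t_n - \theta) and that of N(0, V). The function name max_cdf_gap and all numeric choices are assumptions made for the illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

p_num, p_den = 3, 10        # assumed Bernoulli success probability p = 3/10
p = p_num / p_den
V = p * (1 - p)             # asymptotic variance of the sample mean

def max_cdf_gap(n):
    """Largest gap between the exact and the limiting normal distribution
    function of sqrt(n) * (t_n - p), checked at every attainable value."""
    std_normal = NormalDist()
    total = p_den ** n      # common denominator of all binomial probabilities
    cumulative = 0          # exact integer numerator of Pr(successes <= k)
    worst = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p_num ** k * (p_den - p_num) ** (n - k)
        z = sqrt(n) * (k / n - p) / sqrt(V)   # standardised value of t_n = k/n
        worst = max(worst, abs(cumulative / total - std_normal.cdf(z)))
    return worst

gaps = {n: max_cdf_gap(n) for n in (10, 100, 2000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n increases, consistent with convergence in distribution; the binomial probabilities are accumulated as exact integers to avoid floating-point underflow for large n.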

Efficiency

Two naturally desirable properties of estimators are unbiasedness and minimal mean squared error (MSE). In general these cannot both be achieved simultaneously: a biased estimator may have lower MSE than any unbiased estimator; see estimator bias.

Among unbiased estimators, there often exists one with the lowest variance, called the minimum variance unbiased estimator (MVUE). In some cases an unbiased efficient estimator exists, which, in addition to having the lowest variance among unbiased estimators, attains the Cramér–Rao bound, an absolute lower bound on the variance of unbiased estimators of the parameter.
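A small exact computation can show one unbiased estimator attaining the Cramér–Rao bound while another does not. The sketch assumes n = 4 independent Bernoulli observations with p = 3/10, for which the bound on the variance of an unbiased estimator of p is p(1 - p)/n; the two estimators compared (the sample mean and the first observation alone) and all names are chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product

p = Fraction(3, 10)   # assumed true Bernoulli parameter
n = 4                 # assumed sample size

def sample_prob(x):
    """Probability of observing the 0/1 sample x under Bernoulli(p)."""
    k = sum(x)
    return p ** k * (1 - p) ** (len(x) - k)

def mean_and_variance(estimator):
    # Exact expectation and variance by enumerating all 2**n samples.
    samples = list(product([0, 1], repeat=n))
    mean = sum(sample_prob(x) * estimator(x) for x in samples)
    var = sum(sample_prob(x) * (estimator(x) - mean) ** 2 for x in samples)
    return mean, var

def sample_mean(x):
    return Fraction(sum(x), len(x))

def first_observation(x):
    return Fraction(x[0])   # ignores all data except the first draw

cramer_rao_bound = p * (1 - p) / n   # 1 / (n * Fisher information)
mean_full, var_full = mean_and_variance(sample_mean)
mean_first, var_first = mean_and_variance(first_observation)

print(mean_full, mean_first)                   # 3/10 3/10
print(var_full, var_first, cramer_rao_bound)   # 21/400 21/100 21/400
```

Both estimators are unbiased, but only the sample mean reaches the bound; the estimator that discards three of the four observations has four times the variance.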

Concerning such "best unbiased estimators", see also Cramér–Rao bound, Gauss–Markov theorem, Lehmann–Scheffé theorem, Rao–Blackwell theorem.

Robustness

See: Robust estimator, Robust statistics

Original courtesy of Wikipedia: http://en.wikipedia.org/wiki/Estimator — Please support Wikipedia.