Solutions to Assignments
MBA and MBA (Banking & Finance)
MMPC-005: Quantitative Analysis for Managerial Applications
MMPC-005/TMA/JULY/2022
Question No. 5. Write short notes on any two of the following:-
(a) Mathematical Properties of Arithmetic Mean
(b) Stratified Sampling
Stratified sampling is more complex than simple random sampling, but where
applied properly, stratification can significantly increase the statistical
efficiency of sampling.
The concept:
Suppose we are interested in estimating the demand for non-aerated beverages
in a residential colony. We know that the consumption of
these beverages has some relationship with family income and that the
families residing in this colony can be classified into three categories, viz.,
high income, middle income and low income families. If we are doing a
sampling study, we would like to make sure that our sample has some
members from each of the three categories, perhaps in the same proportion as
the total number of families belonging to each category, in which case we
would be using proportional stratified sampling. On the other hand, if we
know that the variation in the consumption of these beverages from one
family to another is relatively large for the low income category whereas
there is not much variation in the high income category, we would perhaps
pick a smaller than proportional sample from the high income category
and a larger than proportional sample from the low income category. This is
what is done in disproportional stratified sampling.
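The idea of sampling more heavily from the more variable stratum can be sketched in code. The text does not name a specific rule; the sketch below uses Neyman allocation, one common choice, in which each stratum's share is proportional to its size multiplied by its standard deviation. The family counts and standard deviations are hypothetical, chosen only for illustration.

```python
def neyman_allocation(stratum_sizes, stratum_std, total_sample):
    """Share of the sample per stratum, proportional to size x std. deviation.

    Neyman allocation is an assumed rule here; the text only says that the
    more variable stratum should get a larger than proportional sample.
    """
    weights = {h: stratum_sizes[h] * stratum_std[h] for h in stratum_sizes}
    total_weight = sum(weights.values())
    return {h: total_sample * w / total_weight for h, w in weights.items()}


# Hypothetical colony: family counts and consumption std. deviation per stratum.
families = {"high": 200, "middle": 500, "low": 300}
consumption_std = {"high": 1.0, "middle": 2.0, "low": 4.0}

shares = neyman_allocation(families, consumption_std, total_sample=100)
for stratum, share in shares.items():
    proportional = 100 * families[stratum] / sum(families.values())
    print(f"{stratum}: {share:.1f} (proportional share would be {proportional:.1f})")
```

With these illustrative numbers the high income stratum receives less than its proportional share and the low income stratum receives more, which is the pattern described above. The shares are fractional and would be rounded in practice.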
The basis for using stratified sampling is the existence of strata such that each
stratum is more homogeneous within and markedly different from another
stratum. The higher the homogeneity within each stratum, the higher the gain
in statistical efficiency due to stratification.
What are strata?
The strata are so defined that they constitute a partition of
the population, i.e., they are mutually exclusive and collectively exhaustive.
Every element of the population belongs to one stratum and not more than
one stratum, by definition. This is shown in Figure II in the form of a Venn
diagram, where three strata have been shown.
A stratum can therefore be conceived of as a sub-population which is more
homogeneous than the complete population: the members of a stratum are
similar to each other and are different from the members of another stratum
in the characteristics that we are measuring.
Proportional stratified sampling:
After defining the strata, a simple random
sample is picked from each of the strata. If we want a total sample
of size 100, this number is allocated to the different strata, either in proportion
to the size of the stratum in the population or otherwise.
If the different strata have similar variances of the characteristic being
measured, then the statistical efficiency will be the highest if the sample sizes
for different strata are in the same proportion as the size of the respective
stratum in the population. Such a design is called proportional stratified
sampling and is shown in Table 4 below.
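As a rough illustration of proportional allocation, the sketch below splits a total sample of 100 across three strata in proportion to their sizes. The stratum sizes are hypothetical and are not taken from Table 4; the rounding rule (largest remainders) is likewise an assumption made so that the allocations always add up to the total.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    # Exact (possibly fractional) share for each stratum.
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}
    # Round down, then give leftover units to the largest remainders
    # so that the allocations sum to total_sample.
    allocation = {name: int(share) for name, share in exact.items()}
    leftover = total_sample - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical colony of 1,000 families classified by income.
families = {"high": 200, "middle": 500, "low": 300}
print(proportional_allocation(families, 100))
# {'high': 20, 'middle': 50, 'low': 30}
```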
Stratification of the population is quite
common in managerial applications because it also allows us to draw separate
conclusions for each stratum. For example, if we are estimating the demand
for a non-aerated beverage in a residential colony and have stratified the
population based on the family income, then we would have data pertaining
to each stratum which might be useful in making many marketing decisions.
Stratification requires us to identify the strata such that the intra-stratum
differences are as small as possible and inter-strata differences as large as
possible. However, whether a stratum is homogeneous in the
characteristic that we are measuring (e.g. consumption of non-aerated
beverages in the family in the previous example) can be known only at the end
of the study, whereas stratification has to be done at the beginning of the study.
That is why some other variable, like family income, has to be used for
stratification. This is based on the implicit assumption that family income and
consumption of non-aerated beverages are very closely associated with each
other. If this assumption is true, stratification would increase the statistical
efficiency of sampling. In many studies, it is not easy to find such associated
variables which can be used as the basis for stratification and then
stratification may not help in increasing the statistical efficiency, although the
cost of the study goes up due to the additional costs of stratification.
(c) Exponential Distribution
The time between breakdowns of machines, the duration of telephone calls and
the life of an electric bulb are examples of situations where the Exponential
distribution has been found useful. In the previous unit, while discussing the discrete
probability distributions, we have examined the Poisson process and the
resulting Poisson distribution. In the Poisson process, we were interested in
the random variable of number of occurrences of an event within a specific
time or space. Thus, using the knowledge of Poisson process, we have
calculated the probability that 0, 1, 2, ... accidents will occur in any month.
Quite often, another type of random variable assumes importance in the
context of a Poisson process. We may be interested in the random variable of
the lapse of time before the first occurrence of the event. Thus, for a machine,
we note that the first failure or breakdown of the machine may occur after 1
month or 1.5 months etc. The random variable of the number of failures
within a specific time, as we have already seen, is discrete and follows the
Poisson distribution. The variable, time of first failure, is continuous and the
Exponential p.d.f. characterises the uncertainty.
If any situation is found to satisfy the conditions of a Poisson process, and if
the average occurrence of the event of interest is m per unit time, then the
number of occurrences in a given length of time t has a Poisson distribution
with parameter mt, and the time between any two consecutive occurrences
will be Exponential with parameter m. This can be used to derive the p.d.f. of
the Exponential distribution.
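The derivation mentioned above can be sketched as follows. The symbol T (the waiting time to the first occurrence) is notation introduced here for illustration; m and t are as in the text. The key step is that T exceeds t exactly when the Poisson count over a length of time t is zero.

```latex
\begin{align*}
P(T > t) &= P(\text{no occurrence in time } t)
          = \frac{e^{-mt}(mt)^{0}}{0!} = e^{-mt}, \\
F(t) &= P(T \le t) = 1 - e^{-mt}, \qquad t \ge 0, \\
f(t) &= \frac{dF(t)}{dt} = m\,e^{-mt}, \qquad t \ge 0 .
\end{align*}
```

The mean of this distribution is 1/m, so a rate of m occurrences per unit time corresponds to an average gap of 1/m time units between occurrences.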
If we assume that the occurrence of an event corresponds to customers arriving
for servicing, then the time between occurrences would correspond to the
inter-arrival time (IAT), and m would correspond to the arrival rate. The
Exponential distribution has been widely used to characterise the IAT
distribution. The Exponential p.d.f. is also used for characterising service
time distributions; the parameter m in that case corresponds to the service
rate. We take up an example to show the probability calculations using the
Exponential p.d.f. In the final section of this unit, we will illustrate,
through an example, the use of the Exponential distribution in decision making.
Example:
The distribution of the total time a light bulb will burn from the moment it is
first put into service is known to be exponential with mean time between
failure of the bulbs equal to 1000 hrs. What is the probability that a bulb will
burn for more than 1000 hrs?
Solution:
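The worked solution is missing from the text at this point. Under the stated assumption of an Exponential lifetime with mean 1000 hrs, the rate is m = 1/1000 per hour, and the probability of burning more than t hours is e^(-mt), so for t = 1000 the answer is e^(-1), about 0.368. A minimal numerical check follows; the function name is hypothetical.

```python
import math


def prob_survives_beyond(t, mean_life):
    """P(T > t) for an Exponential lifetime with the given mean (rate m = 1/mean)."""
    rate = 1.0 / mean_life
    return math.exp(-rate * t)


# Bulb example: mean life 1000 hrs, probability of burning more than 1000 hrs.
print(round(prob_survives_beyond(1000, 1000), 4))  # 0.3679
```

Note that the probability of outlasting the mean life is well below one half, because the Exponential distribution is skewed to the right.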
(d) Time Series Analysis
Time series analysis is one of the most powerful methods in use, especially
for short term forecasting purposes. From the historical data one attempts to
obtain the underlying pattern so that a suitable model of the process can be
developed, which is then used for purposes of forecasting or studying the
internal structure of the process as a whole. We have already seen in an earlier
unit that a variety of methods such as subjective methods, moving averages
and exponential smoothing, regression methods, causal models and time
series analysis are available for forecasting. Time series analysis looks for the
dependence between values in a time series (a set of values recorded at equal
time intervals) with a view to accurately identify the underlying pattern of the
data.
In the case of quantitative methods of forecasting, each technique makes
explicit assumptions about the underlying pattern. For instance, in using
regression models we first had to guess whether a linear or
parabolic model should be chosen, and only then could we proceed with the
estimation of parameters and model development. We could rely on mere
visual inspection of the data or its graphical plot to make the best choice of
the underlying model. However, such guesswork, though not uncommon, is
unlikely to yield very accurate or reliable results. In time series analysis, a
systematic attempt is made to identify and isolate different kinds of patterns
in the data. The four kinds of patterns that are most frequently encountered
are horizontal, non-stationary (trend or growth), seasonal and cyclical.
Generally, a random or noise component is also superimposed.
We shall first examine the method of decomposition wherein a model of the
time-series in terms of these patterns can be developed. This can then be used
for forecasting purposes as illustrated through an example.
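A minimal sketch of the decomposition idea is given below. It assumes an additive model (value = trend + seasonal + noise) and an even seasonal period such as 4 for quarterly data; the text does not say whether an additive or multiplicative form is intended. The quarterly figures are synthetic, built from a known trend and seasonal pattern so that the result can be checked.

```python
def centered_moving_average(series, period):
    """Trend estimate for an even period using a centered moving average."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # End points get half weight so the window covers exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def seasonal_indices(series, trend, period):
    """Average detrended value (series minus trend) for each season."""
    buckets = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            buckets[t % period].append(value - level)
    raw = [sum(b) / len(b) for b in buckets]
    mean = sum(raw) / period
    return [r - mean for r in raw]  # normalise so the indices sum to zero


# Synthetic quarterly data: linear trend plus a fixed seasonal pattern.
pattern = [10, -5, -15, 10]
sales = [100 + 2 * t + pattern[t % 4] for t in range(12)]

trend = centered_moving_average(sales, period=4)
indices = seasonal_indices(sales, trend, period=4)
print([round(s, 2) for s in indices])
```

Because the synthetic series has no noise, the estimated indices recover the seasonal pattern used to build it. A forecast under this model would extend the trend and add the index for the relevant season.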
A more accurate and statistically sound procedure to identify the patterns in a
time-series is through the use of auto-correlations. Auto-correlation refers to
the correlation between the same variable at different time lags and was
discussed in Unit 18. Auto-correlations can be used to identify the patterns in
a time series and suggest appropriate stochastic models for the underlying
process. A brief outline of common processes and the Box-Jenkins
methodology is then given.
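The sample auto-correlation at a given lag can be computed as sketched below. This uses the usual estimator (lagged cross-products of deviations from the mean, divided by the total sum of squared deviations); the series is synthetic and the function name is hypothetical.

```python
def autocorrelation(series, lag):
    """Sample auto-correlation of a series with itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum((series[t] - mean) * (series[t + lag] - mean)
                    for t in range(n - lag))
    return numerator / denominator


# Synthetic series repeating every 4 periods (a purely seasonal pattern).
series = [10, -5, -15, 10] * 3

for lag in range(1, 5):
    print(lag, round(autocorrelation(series, lag), 3))
```

For a series with a seasonal pattern of period 4, the auto-correlation at lag 4 is strongly positive, which is how a plot of auto-correlations against lag helps to reveal seasonality.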