
Tuesday 12 April 2022

Question No. 5 - MCO-03 - Research Methodology and Statistical Analysis

Solutions to Assignments

                               MCO-03 - 

    Research Methodology and Statistical Analysis

                           Mcom - 2nd Year

Question No. 5

Distinguish between the following: 

(a) Primary and Secondary Data 

In statistical analysis, the collection of data plays a significant part. Methods of collecting information fall into two sections, namely primary data and secondary data. Primary data is data or information assembled for the first time, whereas secondary data is data that has already been gathered or collected by others.

The most important characteristic of primary data is that it is original and first-hand, whereas secondary data is the interpretation and analysis of primary data.

Primary Data Definition

Primary data is the data that is collected for the first time through personal experience or evidence, particularly for research. It is also described as raw data or first-hand information. Assembling this information is costly, as it needs human resources and investment, whether the collection is done by the researcher or by an external agency. The investigator supervises and controls the data collection process directly.

The data is mostly collected through observations, physical testing, mailed questionnaires, surveys, personal interviews, telephonic interviews, case studies, focus groups, etc.

Secondary Data Definition

Secondary data is second-hand data that has already been collected and recorded by some researchers for their own purpose, and not for the current research problem. It is accessible in the form of data collected from different sources such as government publications, censuses, internal records of the organisation, books, journal articles, websites, reports, etc.

This method of gathering data is affordable and readily available, and it saves cost and time. However, one disadvantage is that the information was assembled for some other purpose, so it may not meet the present research purpose or may not be accurate.


The differences between the primary and secondary data are represented in a comparison format as follows:

 

Definition: Primary data are those that are collected for the first time, whereas secondary data refer to data that have already been collected by some other person.

Originality: Primary data are original, because the investigator collects them for the first time. Secondary data are not original, because someone else has collected them for his own purpose.

Nature of Data: Primary data are in the form of raw material. Secondary data are in finished form.

Reliability and Suitability: Primary data are more reliable and suitable for the enquiry, because they are collected for the particular purpose at hand. Secondary data are less reliable and less suitable, as someone else collected them for a purpose that may not perfectly match ours.

Time and Money: Collecting primary data is quite expensive in terms of both time and money. Secondary data require less time and money, and are hence economical.

Precaution and Editing: No particular precaution or editing is required while using primary data, as they were collected with a definite purpose. Both precaution and editing are essential with secondary data, since they were collected by someone else for his own purpose.

(b) Estimation and testing of hypothesis
Estimation and testing of hypothesis are the two branches of statistical inference, but they proceed in opposite directions. In estimation, the value of a population parameter (such as the mean) is unknown, and sample data are used to compute either a single value (point estimate) or a range of values with an associated confidence level (interval estimate) for that parameter. No prior claim about the parameter is assumed.

In testing of hypothesis, a definite claim about the population parameter is stated in advance (the null hypothesis, set against an alternative hypothesis), and sample data are used to decide whether that claim can be rejected at a chosen level of significance. Thus, estimation answers the question "what is the value of the parameter?", whereas hypothesis testing answers the question "is a stated assertion about the parameter tenable?". Estimation ends in a numerical estimate or confidence interval; testing ends in a decision to reject or not reject the null hypothesis.
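To make the contrast concrete, here is a minimal Python sketch under stated assumptions: the sample data are hypothetical, and a z-based procedure is used purely to keep the illustration simple (a t-based procedure would suit so small a sample better).

```python
# Sketch (not from the source text): estimation vs. hypothesis testing
# on the same hypothetical sample, using z-based methods.
import math
from statistics import NormalDist, mean, stdev

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]  # hypothetical data
m = mean(sample)
se = stdev(sample) / math.sqrt(len(sample))   # standard error of the mean

# Estimation: point estimate plus a 95% confidence interval for the mean.
z = NormalDist().inv_cdf(0.975)               # two-sided 95% critical value
print(f"point estimate = {m:.3f}, 95% CI = ({m - z*se:.3f}, {m + z*se:.3f})")

# Testing of hypothesis: H0: mu = 12.0 versus H1: mu != 12.0.
z_stat = (m - 12.0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
print(f"z = {z_stat:.3f}, p = {p_value:.3f}:",
      "reject H0" if p_value < 0.05 else "do not reject H0")
```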

(c) Sampling and Non-Sampling Errors 

Sampling error is an error that occurs because the sample selected for observation is unrepresentative of the population. Conversely, non-sampling error is an error that arises from human error, such as errors in problem identification or in the method or procedure used.
An ideal research design seeks to control the various types of error, but some potential sources of error remain. In sampling theory, total error is defined as the variation between the true mean value of the population parameter and the observed mean value obtained in the research. Total error can be classified into two categories: sampling error and non-sampling error.

Meaning: Sampling error occurs because the selected sample does not perfectly represent the population of interest. Non-sampling error arises from sources other than sampling while conducting survey activities.
Cause: Sampling error is caused by the deviation between the sample mean and the population mean. Non-sampling error is caused by deficiencies in the collection and analysis of data.
Type: Sampling error is random. Non-sampling error may be random or non-random.
Occurrence: Sampling error occurs only when a sample is selected. Non-sampling error occurs both in a sample survey and in a census.
Sample size: The possibility of sampling error is reduced as the sample size increases. Non-sampling error has nothing to do with the sample size.
The significant differences between sampling and non-sampling error are mentioned in the following points:

a. Sampling error is a statistical error that occurs because the sample selected does not perfectly represent the population of interest. Non-sampling error is an error that occurs due to sources other than sampling while conducting survey activities.
b. Sampling error arises because of the variation between the true mean value for the sample and for the population. On the other hand, non-sampling error arises because of deficiencies in, and inappropriate analysis of, the data.
c. Non-sampling error can be random or non-random, whereas sampling error occurs in a random sample only.
d. Sampling error arises only when a sample is taken as representative of a population, as opposed to non-sampling error, which arises both in sampling and in complete enumeration.
e. Sampling error is mainly associated with the sample size: as the sample size increases, the possibility of error decreases, as the simulation sketch below shows. On the contrary, non-sampling error is not related to the sample size, so it is not reduced by increasing the sample size.
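Point (e) can be demonstrated numerically: the standard error of the sample mean falls in proportion to 1/√n, so larger samples shrink sampling error while leaving a non-sampling error (for example, a fixed instrument bias) untouched. A minimal simulation sketch, with all numbers hypothetical:

```python
# Simulation sketch: sampling error shrinks with n; a fixed bias does not.
import random
import statistics

random.seed(42)
TRUE_MEAN, SIGMA, BIAS = 50.0, 10.0, 2.0   # hypothetical population and bias

for n in (25, 100, 400, 1600):
    # Average absolute sampling error of the mean over repeated samples.
    errors = []
    for _ in range(500):
        sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(n)]
        errors.append(abs(statistics.mean(sample) - TRUE_MEAN))
    avg_sampling_error = statistics.mean(errors)
    # A non-sampling error such as a constant instrument bias stays at BIAS
    # no matter how large the sample is.
    print(f"n={n:5d}  avg sampling error={avg_sampling_error:.3f}  "
          f"non-sampling bias={BIAS:.3f}")
```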


(d) Bibliography and footnote 

The Footnote
Content footnotes give additional information about the content, and bibliographic notes provide additional sources related to the content. The footnote is found at the bottom, or foot, of the page. It is marked by a superscript number within the body of the text. The superscript number also appears at the bottom of the page, along with the additional explanatory or bibliographic information.

If specific sources are used to write content footnotes, this information should be cited through parenthetical citations within the footnote and then with full citation information within the Works Cited, or Bibliography, page. Bibliographic footnotes point your readers to specific, related outside texts without providing much commentary on them. Full citation information for these sources should also be included on the Works Cited page.

The Bibliography

The Bibliography, or Works Cited, page is the last section of a paper. It compiles the full citation information for any source cited in or consulted for the paper into one location and allows your readers to get an overview of the works informing your thinking.

The full citation information found in this section tells your readers when and where a source was published, whereas a footnote might only include the title of the work. Additionally, no information besides the citation information is included within the bibliography.

Thursday 7 April 2022

Question No. 4 - MCO-03 - Research Methodology and Statistical Analysis

Solutions to Assignments

                               MCO-03 - 

    Research Methodology and Statistical Analysis

Question No. 4 

Write short notes on the following: 

(a) Comparative Scales


Scaling emerged from the social sciences as an attempt to measure or order attributes with respect to quantitative traits. Scaling provides a mechanism for measuring abstract concepts.

A comparative scale is an ordinal or rank order scale that can also be referred to as a non-metric scale.  Respondents evaluate two or more objects at one time and objects are directly compared with one another as part of the measuring process.

For example, you could ask someone whether they prefer listening to MP3s through a Zune or an iPod, and you could take it a step further by adding other MP3 player brands to the comparison. The players would be scaled relative to each other, and the scale position of any one player would depend on the scale positions of the remaining players. Because the objects are compared directly, differences such as which player has a click wheel are effectively forced into view. The limitation is that there is no standard of comparison outside the objects being compared, so no generalisations can be made beyond them. Comparative scales are often used when physical characteristics of objects are being compared.

1. Guttman Scaling
This can also be referred to as cumulative scoring or scalogram analysis. The intent of this survey is that the respondent will agree up to a point, and their score is the point at which they stop agreeing. For this reason, questions are often formatted with dichotomous yes or no responses.

The survey may start out with a question that is easy to agree with and then become increasingly sensitive until the respondent starts to disagree. You might begin with a question asking whether you like music, to which you answer yes. Four questions later it may ask whether you like soulless music produced by shady record labels only out to make money, at which point you may say no. If you agreed with the first five questions and then started disagreeing, you would be rated a 5: the questions you agreed with are totalled, and the final score says something about your attitude toward music.
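A minimal sketch of this cumulative scoring rule, assuming hypothetical yes/no responses coded as booleans (the helper name guttman_score is ours):

```python
# Guttman (cumulative) scoring sketch: the score is the number of
# consecutive agreements before the first disagreement.
def guttman_score(responses: list[bool]) -> int:
    score = 0
    for agreed in responses:
        if not agreed:
            break
        score += 1
    return score

# Respondent agrees with the first five items, then starts disagreeing.
print(guttman_score([True, True, True, True, True, False, False]))  # -> 5
```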

2. Rasch Scaling
This probabilistic model provides a theoretical basis for obtaining interval-level measurements based on counts from observations such as total scores on assessments. It analyzes individual differences in response tendencies as well as an item's discrimination and difficulty. It measures how respondents interact with items and then infers differences between items from responses, in order to obtain scale values. This model is typically used to analyze data from assessments and to measure abilities, attitudes, and personality traits.

3. Rank-Order Scaling
This gives the respondent a set of items and then asks the respondent to put those items in some kind of order. The “order” could be something like preference, liking, importance, effectiveness, etc.  This can be a simple ordinal structure such as A is higher than B or be done by relative position (give each letter a numerical value as in A is 10 and B is 7).   You could present five items and ask the respondent to order each one A-E in order of preference.  In Rank-Order scaling only (n-1) decisions need to be made.

4. Constant Sum Scaling
With this ordinal level technique respondents are given a constant sum of units such as points, money, or credits and then asked to allocate them to various items.   For example,  you could ask a respondent to reflect on the importance of features of a product and then give them 100 points to allocate to each feature of the product based on that.   If a feature is not important then the respondent can assign it zero.   If one feature is twice as important as another then they can assign it twice as much.   When they are done all the points should add up to 100.


5. Paired Comparison Scale
This is an ordinal level technique where a respondent is presented with two items at a time and asked to choose one.   This is the most widely used comparison scale technique.   If you take n brands then [n (n-1)/2] paired comparisons are required.  A classic example of when paired comparison is used is during taste tests.  For example you could have a taste test in which you have someone try both Coke and Pepsi and then ask them which one they prefer.
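A small sketch of how the number of comparisons grows, using Python's standard library (the brand list is hypothetical):

```python
# Paired comparison sketch: n items yield n*(n-1)/2 unique pairs.
from itertools import combinations

brands = ["Coke", "Pepsi", "RC Cola", "Thums Up"]  # n = 4 hypothetical brands
pairs = list(combinations(brands, 2))
print(len(pairs))  # -> 6, i.e. 4*(4-1)/2
for a, b in pairs:
    print(f"Which do you prefer: {a} or {b}?")
```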

6. Bogardus Social Distance Scale
This is a cumulative scale that is a variant of the Guttman scale: agreement with any item implies agreement with the preceding items. The scale is used to measure how close or distant people feel toward other people. Social distance is a concern when it comes to issues related to racial integration or other forms of equality. It is applicable to team formation in the workplace, for example. Some people accept other people easily and use trustworthiness as the basis of their relationship with other people. Other people do not accept people who are not like them and tend to keep those people at arm's length.

7. Q-Sort Scaling
This is a rank order procedure where respondents are asked to sort a given number of items or statements and classify them into a predetermined number of sets (usually 11) according to some criterion such as preference, attitude, or behavioral intent.  Using cards that note an item to be ranked is the most popular and simplest method to use in the sorting process.  In order to increase statistical reliability at least 60 cards should be used and no more than 140.  This is good for discriminating among a large group of items in a relatively short amount of time.


 

(b) Purpose of a Report 

No research is complete unless the report is written and communicated. It is necessary for you as the researcher to maintain proper notes on the progress made, e.g., problem statement, objectives, justification for the study, review of literature, development of instruments for data collection, hypotheses, sample description and sampling technique, pilot study, problems faced in data collection, the data and the analysis. These notes help in preparing the research report.

The main purpose of a research report is to let others interested in the subject know the findings of the research. The researcher himself/herself may also have a definite purpose in writing the research report. Examples of such purposes are listed below:

1) Research is conducted for the partial fulfillment of a degree such as M.Sc. or Ph.D. Here, writing the report is a part of the academic programme.

2) Research is conducted to find an answer to the problems faced by the practitioner, teacher or administrator. Here the report is written to communicate the findings to others in the profession for critique, application of the results, or future investigation in the area of research.

3) When the research is funded by the government or a research foundation, the funding body stipulates the requirements of the report.

The reports are usually written as a thesis, a monograph, or an article for publication in a journal or magazine. The content outline can be broadly divided into introduction, review of literature, methodology, data analysis and interpretation, and summary and conclusions. The introduction section includes: background of the study, need and justification for the study, problem statement, variables, objectives and hypotheses, scope and limitations, and assumptions. Methodology includes: justification and explanation of the research approach, sampling technique and size of sample, setting of the study, construction or selection of the instrument for data collection, procedure for data collection, and the plan of data analysis. The style of writing references and the bibliography is prescribed by publishers and by the research departments of institutions, and the writer has to follow it strictly. References are works that are referred to in the text, whereas the bibliography includes all the relevant literature reviewed, irrespective of its "referred" or "not referred" status.



(c) Binomial Distribution 

A binomial distribution can be thought of as simply the probability of a SUCCESS or FAILURE outcome in an experiment or survey that is repeated multiple times. The binomial is a type of distribution that has two possible outcomes (the prefix “bi” means two, or twice). For example, a coin toss has only two possible outcomes: heads or tails and taking a test could have two possible outcomes: pass or fail. 


Binomial distributions must also meet the following three criteria:

a. The number of observations or trials is fixed. In other words, you can only work out the probability of something happening if the number of trials is specified in advance. This matters because probabilities change with the number of trials: if you toss a coin once, your probability of getting a tail is 50%; if you toss a coin 20 times, your probability of getting at least one tail is very, very close to 100%.
b. Each observation or trial is independent. In other words, none of your trials have an effect on the probability of the next trial.
c. The probability of success (tails, heads, fail or pass) is exactly the same from one trial to another.

The binomial distribution is closely related to the Bernoulli distribution. According to Washington State University, “If each Bernoulli trial is independent, then the number of successes in n Bernoulli trials has a binomial distribution. On the other hand, the Bernoulli distribution is the binomial distribution with n = 1.”

A Bernoulli distribution is a set of Bernoulli trials. Each Bernoulli trial has two possible outcomes: S (success) or F (failure). In each trial, the probability of success, P(S) = p, is the same. The probability of failure is just 1 minus the probability of success: P(F) = 1 – p. (Remember that “1” is the total probability of an event occurring; probability is always between zero and 1.) Finally, all Bernoulli trials are independent of each other, and the probability of success does not change from trial to trial, even if you have information about the other trials' outcomes.

Many instances of binomial distributions can be found in real life. For example, if a new drug is introduced to cure a disease, it either cures the disease (it’s successful) or it doesn’t cure the disease (it’s a failure). If you purchase a lottery ticket, you’re either going to win money, or you aren’t. Basically, anything you can think of that can only be a success or a failure can be represented by a binomial distribution.

The binomial distribution formula is:

b(x; n, P) = nCx * P^x * (1 – P)^(n – x)

Where:
b = binomial probability
x = total number of “successes” (pass or fail, heads or tails, etc.)
P = probability of a success on an individual trial
n = number of trials
nCx = the number of combinations of n trials taken x at a time

The binomial distribution formula can calculate the probability of success for binomial distributions. Often you'll be told to "plug in" the numbers to the formula and calculate. This is easy to say but not so easy to do: unless you are very careful with the order of operations, you won't get the right answer. A calculator such as the TI-83 or TI-89 can do much of the work for you; otherwise, break the problem down into simple steps so you get the answer right every time, as the sketch below illustrates.
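A minimal Python sketch of the same computation, assuming nothing beyond the formula above (the helper name binomial_pmf is ours, not from the text):

```python
# Binomial probability sketch: b(x; n, P) = nCx * P^x * (1-P)^(n-x)
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example: probability of exactly 3 heads in 5 fair coin tosses.
print(binomial_pmf(3, 5, 0.5))  # -> 0.3125
```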


(d) Skewness

Skewness is a measure of the asymmetry or distortion of a distribution. It measures the deviation of the given distribution of a random variable from a symmetric distribution, such as the normal distribution. A normal distribution is without any skewness, as it is symmetrical on both sides. Hence, a curve is regarded as skewed if it is shifted towards the right or the left.




Types of Skewness
 

1. Positive Skewness
If the given distribution is shifted to the left, with its tail on the right side, it is a positively skewed distribution, also called a right-skewed distribution. The tail refers to the tapering of the curve away from the main concentration of data points.

As the name suggests, a positively skewed distribution assumes a skewness value of more than zero. Since the skewness of the given distribution is on the right, the mean value is greater than the median and moves towards the right, and the mode occurs at the highest frequency of the distribution.

 

2. Negative Skewness
If the given distribution is shifted to the right, with its tail on the left side, it is a negatively skewed distribution, also called a left-skewed distribution. The skewness value of any distribution showing a negative skew is always less than zero. Since the skewness of the given distribution is on the left, the mean value is less than the median and moves towards the left, and the mode occurs at the highest frequency of the distribution.

 

Measuring Skewness
Skewness can be measured using several methods; however, Pearson mode skewness and Pearson median skewness are the two frequently used methods. The Pearson mode skewness is used when a strong mode is exhibited by the sample data. If the data includes multiple modes or a weak mode, Pearson’s median skewness is used.

The formula for Pearson mode skewness:

Skewness = (X – Mo) / s
Where:

X = Mean value
Mo = Mode value
s = Standard deviation of the sample data
 

The formula for Pearson median skewness:

Skewness = 3(X – Md) / s
Where:

Md = Median value
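A minimal Python sketch of both measures on hypothetical right-skewed sample data (the variable names are ours):

```python
# Pearson skewness sketch: mode-based and median-based versions.
import statistics

data = [2, 3, 3, 3, 4, 5, 6, 8, 9, 12]  # hypothetical, right-skewed sample

mean = statistics.mean(data)
mode = statistics.mode(data)
median = statistics.median(data)
s = statistics.stdev(data)               # sample standard deviation

pearson_mode_skew = (mean - mode) / s            # Sk = (X - Mo) / s
pearson_median_skew = 3 * (mean - median) / s    # Sk = 3(X - Md) / s

print(f"mean={mean}, mode={mode}, median={median}")
print(f"mode skewness={pearson_mode_skew:.3f}, "
      f"median skewness={pearson_median_skew:.3f}")  # both positive here
```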

How to Interpret
- Skewness takes into account the extremes of the data set rather than focusing only on the average. Hence, investors take note of skewness while estimating the distribution of returns on investments. Relying on the average works out if an investor holds a position for the long term; extremes need to be looked at when investors seek short-term and medium-term security positions.
- Usually, a standard deviation is used by investors in forecasting returns, and it presumes a normal distribution with zero skewness. However, because of skewness risk, it is better to obtain the performance estimations based on skewness. Moreover, the occurrence of return distributions coming close to normal is low.
- Skewness risk occurs when a symmetric distribution is applied to skewed data. Financial models seeking to estimate an asset's future performance typically assume a normal distribution; applying them to skewed data reduces their accuracy, so accounting for the skewness explicitly improves the model.
- If a return distribution shows a positive skew, investors can expect recurrent small losses and few large returns from investment. Conversely, a negatively skewed distribution implies many small wins and a few large losses on the investment.
- Hence, a positively skewed investment return distribution should be preferred over a negatively skewed return distribution since the huge gains may cover the frequent – but small – losses. However, investors may prefer investments with a negatively skewed return distribution. It may be because they prefer frequent small wins and a few huge losses over frequent small losses and a few large gains.

Wednesday 6 April 2022

Question No. 3 - MCO-03 - Research Methodology and Statistical Analysis

Solutions to Assignments

                               MCO-03 - 

    Research Methodology and Statistical Analysis

Question No. 3

Briefly comment on the following: 

(a) The recognition or existence of a problem motivates research. 


Without a problem, research cannot proceed, because there is nothing to proceed from and proceed towards. Therefore, the first step in research is to perceive a problem, either practical or theoretical. The recognition or existence of a problem motivates research. It may be noted that research is the process of repeated search for truth/facts; unless there is a problem to search for, investigation cannot proceed. Thus, a problem sets the goal or direction of research.

A problem, in simple words, is "some difficulty experienced by the researcher in a theoretical or practical situation. Solving this difficulty is the task of research." A problem exists when we do not have enough information to answer a question; the answer to the question or problem is what is sought in the research. By problem we mean "any condition or circumstance in which one does not know how to act and what to accept as true". In common usage, when we are unable to assess a thing correctly, we often say 'it is problematic'.

Thus the researcher who selects a problem formulates a hypothesis or postulates a theoretical assumption that this or that is true, this or that thing to do. He/she collects proof (facts/data) of his/her hypothesis. Based on the analysis of the data collected he/she asserts the truth or answers the question/solves the problem.

A topic of study may be selected by some institution or by researchers having intellectual interests. In the former case there could be a wide variety of problems in which institutions are interested. The institution could be a local body, a government, a corporate enterprise or a political party. For example, the government may be interested in assessing the probable consequences of various courses of action for solving a problem, say rural unemployment. A firm may be interested in assessing the demand for something and predicting the future course of events so as to plan appropriate action relating to marketing, production, consumer behaviour and so on.

The topic of study may also be selected by an individual researcher having intellectual or scientific interests. The researcher may be interested in exploring some general subject matter about which relatively little is known, purely out of scientific curiosity. A person may also be interested in a phenomenon which has already been studied in the past, but where conditions now appear different and therefore require further examination. A person may also be interested in a field in which there is a highly developed theoretical system but a need for retesting the old theory on the basis of new facts, so as to test its validity in the changed circumstances.

The topic of research may be of a general nature or specifically needed by some institution, organization or government. It may be of intellectual interest or of practical concern, “A wide variety of practical concerns may present topics for research”. For example, one may want to study the impact of television on children’s education, performance of regulated agricultural markets, profitability of a firm, impact of imports on Indian economy, a comparative study of accounting practices in public and private undertakings, etc.

If the researcher or research organisation has a ready problem on hand, the research process can proceed further; otherwise, a problem has to be searched for. Where can you search for research problems? In your own mind, where else? You have to feel the problem and think about it. However, the following sources may help you in identifying the problem or problem areas.

1) Business Problems: A research problem is a felt need; the need may be for an answer, a solution, or an improvement in facilities or technology (e.g., cars). Businesses experience various types of problems: business policy problems, operational problems, general management problems, or functional area problems. The functional areas are Financial Management, Marketing Management, Production Management and Human Resources Management. Every business research problem is expected to solve a management problem by facilitating rational decision-making.

2) Day to Day Problems: A research problem can come from the day-to-day experience of the researcher. Everyday problems constantly present something new and worthy of investigation, and it depends on the keenness of observation and sharpness of intellect of the researcher to knit daily experience into a research problem. For example, a person who travels in city buses every day finds it a problem to get in and out of the bus; a queue system (the answer to the problem) facilitates boarding and alighting comfortably.

3) Technological Changes: Technological changes in a fast changing world are constantly bringing forth new problems and thus new opportunities for research. For example, what is the impact or implications of a new technique or new process or new machine? 

4) Unexplored Areas: Research problems can be of both abstract and applied interest. The researcher may identify the areas in which much work has been done and the areas in which little or no work has been done, and may select areas which have not been explored so far or have been explored very little.

5) Theory of One's Own Interest: A researcher may also select a problem for investigation from a given theory in which he has considerable interest. In such situations the researcher must have a thorough knowledge of that theory and should be able to explore some unexplained aspects or assumptions of it. His effort should revalidate, modify, or reject the theory.

6) Books, Theses, Dissertation Abstracts, Articles: Special assignments in textbooks, research theses, investigative reports, research articles in research journals etc., are rich sources for problem seekers. These sources may suggest some additional areas of needed research. Many of the research theses and articles suggest problems for further investigation which may prove fruitful. 

7) Policy Problems: Government policy measures give rise to both positive and negative impact. The researcher may identify these aspects for his research. For example, what is the impact of the Government’s new industrial policy on industrial development? What is the impact of Export - Import policy on balance of payments? What is the impact of Securities Exchange Board of India Regulations on stock markets? 

8) Discussions with Supervisor and Other Knowledgeable Persons: The researcher may find it fruitful to have discussions with his/her proposed supervisor or other knowledgeable persons in the area of the topic.

The selection of a topic for research is only half a step forward. A general topic does not help a researcher to see what data are relevant to his/her purpose, what methods he/she should employ in securing them, or how to organise them. Before he/she can consider all these aspects, he/she has to formulate a specific problem by making its various components explicit.

A research problem is nothing but a basic question for which an answer or a solution is sought through research. The basic question may be further broken down into specifying questions. These “simple, pointed, limited, empirically verifiable questions are the final result of the phased process, we designate as the formulation of a research problem”. Specification or definition of the problem is therefore a process that involves a progressive narrowing of the scope and sharpening of focus of questions till the specific challenging questions are finally posed. If you can answer the following questions, you have clearly specified/defined the problem.


(b) Quantitative data has to be condensed in a meaningful manner, so that it can be easily understood and interpreted. 

Quantitative data has to be condensed in a meaningful manner so that it can be easily understood and interpreted. One of the common methods for condensing quantitative data is to compute statistical derivatives, such as percentages, ratios and rates. These are simple derivatives. Further, it is necessary to summarise and analyse the data. The first step in that direction is the computation of a measure of central tendency, or average, which gives a bird's-eye view of the entire data. Here we discuss the computation of statistical derivatives based on simple calculations, followed by numerical methods for summarising and describing data (measures of central tendency). The purpose is to identify one value, obtained from the data, to represent the entire data set.

Statistical derivatives are the quantities obtained by simple computation from the given data. Though very easy to compute, they often give meaningful insight to the data. Here we discuss three often-used measures: percentage, ratio and rate. These measures point out an existing relationship among factors and thereby help in better interpretation.

1. Percentage 
As we have noted earlier, the frequency distribution may be regarded as simple counting and checking as to how many cases are in each group or class. The relative frequency distribution gives the proportion of cases in individual classes. On multiplication by 100, the percentage frequencies are obtained. Converting to percentages has some advantages - it is now more easily understood and comparison becomes simpler because it standardizes data. Percentages are quite useful in other tables also, and are particularly important in case of bivariate tables.

2. Ratio 
Another descriptive measure that is commonly used with frequency distributions (it may be used elsewhere also) is the ratio. It expresses the relative value of frequencies in the same way as proportions or percentages, but it does so by comparing any one group with either the total number of cases or any other group. For instance, in Table 6.3 of Unit 6, the ratio of all labourers to those with daily wages between Rs 30–35 is 70:14, or 5:1. Wherever possible, it is convenient to reduce a ratio to the form n1:n2, the most preferred value of n2 being 1. Representation in the form of a ratio also reduces the size of the numbers, which facilitates easy comparison and quick grasp. As the number of categories increases, the ratio becomes a better derivative for presentation, as it is easier to read and less confusing.

There are several types of ratios used in statistical work. Let us discuss them. 

a. The Distribution Ratio: It is defined as the ratio of a part to a total which includes that part. For example, in a University there are 600 girls out of 2,000 students. Then the distribution ratio of girls to the total number of students is 3:10; we can say that 30% of the students in that University are girls.

b. Interpart ratio: It is a ratio of one part of a total to another part of the same total. For example, the sex ratio is usually expressed as the number of females per 1,000 males (not per 1,000 of the total population).

c. Time ratio: This ratio is a measure which expresses the changes in a series of values arranged in a time sequence, and it is typically shown as a percentage. Mainly, there are two types of time ratios:

i) Those employing a fixed base period: Under this method, for instance, if you are interested in studying the sales of a product in the current year, you would select a particular past year, say 1990, as the base year and compare the current year's sales with the sales of 1990.
 
ii) Those employing a moving base: For example, for computation of the current year's sales, last year's sales would be assumed as the base (for 1991, 1990 is the base; for 1992, 1991 is the base; and so on).

 Ratios are more often used in financial economics to indicate the financial status of an organization.

3. Rate 
The concept of ratio may be extended to the rate. The rate is also a comparison of two figures, but not of the same variable, and it is usually expressed as a percentage. It is a measure of the number of times a value occurs in relation to the number of times the value could occur, i.e., the number of actual occurrences divided by the number of possible occurrences. The unemployment rate in a country is given by the total number of unemployed persons divided by the total number of employable persons. It is clear now that a rate is different from a ratio. For example, we may say that in a town the ratio of the number of unemployed persons to that of all employable persons is 0.05:1. The same message would be conveyed if we say that the unemployment rate in the town is 0.05, or more commonly, 5 per cent. Sometimes a rate is defined as the number of units of a variable corresponding to a single unit of another variable; the two variables could be in different units. For example, seed rate refers to the amount of seed required per unit area of land.
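A small sketch computing the three derivatives for the examples above (the unemployment figures are hypothetical; the University figures are from the text):

```python
# Statistical derivatives sketch: percentage, ratio and rate.
from math import gcd

girls, students = 600, 2000

# Percentage: a part expressed per 100 of the total.
print(f"girls as a percentage: {100 * girls / students:.0f}%")        # 30%

# Distribution ratio: a part to the total that includes it, reduced.
g = gcd(girls, students)
print(f"distribution ratio: {girls // g}:{students // g}")            # 3:10

# Rate: actual occurrences divided by possible occurrences.
unemployed, employable = 500, 10_000
print(f"unemployment rate: {100 * unemployed / employable:.1f}%")     # 5.0%
```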



(c) Decomposition and analysis of a time series is one and the same thing. 

Decomposition and analysis of a time series are essentially one and the same thing. The original or observed data 'Y' is the result of the effects generated by long-term and short-term causes, namely (1) Trend = T, (2) Cyclical = C, (3) Seasonal = S, and (4) Irregular = I. Finding out the values of each of these components is called decomposition of a time series. Decomposition is done either by the additive model or by the multiplicative model of analysis. Which of these two models is used in the analysis of a time series depends on the assumption we make about the nature of, and the relationship among, the four components.

Additive Model: It is based on the assumption that the four components are independent of one another. Under this assumption, the pattern of occurrence and the magnitude of movements in any particular component are not affected by the other components. In this model the values of the four components are expressed in the original units of measurement. Thus, the original data or observed data, ‘Y’ is the total of the four component values, 
that is, 

Y = T + S + C + I

where T, S, C and I represent the trend variations, seasonal variations, cyclical variations, and irregular (erratic) variations, respectively.

Multiplicative Model: It is based on the assumption that the causes giving rise to the four components are interdependent. Thus, the original data or observed data ‘Y’ is the product of four component values, 
that is : 

Y = T × S × C × I 

In this model the values of all the components, except the trend values, are expressed as percentages. In business research, the multiplicative model is normally more suitable and more frequently used for the analysis of time series. This is because data related to business and economic time series result from the interaction of a number of factors, which individually cannot be held responsible for generating any specific type of variation.
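A minimal sketch contrasting the two identities on synthetic data (the component shapes below are hypothetical; libraries such as statsmodels offer ready-made decomposition routines, but the arithmetic here is just the two models themselves):

```python
# Time series models sketch: additive Y = T+S+C+I vs multiplicative Y = T*S*C*I.
import math

n = 24  # two years of hypothetical monthly data
T = [100 + 2 * t for t in range(n)]                           # linear trend
S = [10 * math.sin(2 * math.pi * t / 12) for t in range(n)]   # seasonal swing
C = [5 * math.sin(2 * math.pi * t / 48) for t in range(n)]    # slow cycle
I = [((-1) ** t) * 1.5 for t in range(n)]                     # toy irregular term

# Additive model: all components in the original units of measurement.
Y_add = [T[t] + S[t] + C[t] + I[t] for t in range(n)]

# Multiplicative model: non-trend components as indices around 1 (i.e. 100%).
S_idx = [1 + s / 100 for s in S]
C_idx = [1 + c / 100 for c in C]
I_idx = [1 + i / 100 for i in I]
Y_mul = [T[t] * S_idx[t] * C_idx[t] * I_idx[t] for t in range(n)]

print(Y_add[:3], [round(y, 2) for y in Y_mul[:3]])
```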



(d) Research reports are the product of slow, painstaking and accurate work. 

Reporting simply means communicating or informing through reports. The researcher has collected some facts and figures, analyzed the same and arrived at certain conclusions. He has to inform or report the same to the parties interested. Therefore “reporting is communicating the facts, data and information through reports to the persons for whom such facts and data are collected and compiled”. 

A report is not a complete description of everything done during the period of the survey/research. It is only a statement of the most significant facts that are necessary for understanding the conclusions drawn by the investigator. Thus, "a report, by definition, is simply an account". The report is thus an account describing the procedure adopted, the findings arrived at, and the conclusions drawn by the investigator of a problem.

Research report is a channel of communicating the research findings to the readers of the report. A good report is one which does this task efficiently and effectively. As such it should have the following characteristics/qualities. 

i) It must be clear in informing the what, why, who, whom, when, where and how of the research study. 

ii) It should be neither too short nor too long. One should keep in mind the fact that it should be long enough to cover the subject matter but short enough to sustain the reader’s interest.

iii) It should be written in an objective style and simple language; correctness, precision and clarity should be the watchwords of the scholar. Wordiness, indirection and pompous language are barriers to communication.

iv) A good report must combine clear thinking, logical organization and sound interpretation. 

v) It should not be dull. It should be such as to sustain the reader’s interest. 

vi) It must be accurate. Accuracy is one of the requirements of a report. It should be factual with objective presentation. Exaggerations and superlatives should be avoided. 

vii) Clarity is another requirement of presentation. It is achieved by using familiar words and unambiguous statements, explicitly defining new concepts and unusual terms. 

viii) Coherence is an essential part of clarity. There should be a logical flow of ideas (i.e., continuity of thought) and a clear sequence of sentences. Each sentence must be linked with the others so as to move the thought smoothly.

ix) Readability is an important requirement of good communication. Even a technical report should be easily understandable. Technicalities should be translated into language understandable by the readers. 

x) A research report should be prepared according to the best composition practices. Ensure readability through proper paragraphing, short sentences, illustrations, examples, section headings, use of charts, graphs and diagrams. 

xi) Draw sound inferences/conclusions from the statistical tables. But don’t repeat the tables in text (verbal) form. 

xii) Footnote references should be in proper form. The bibliography should be reasonably complete and in proper form. 

xiii) The report must be attractive in appearance, neat and clean whether typed or printed. 

xiv) The report should be free from mistakes of all types, viz. language mistakes, factual mistakes, spelling mistakes, calculation mistakes, etc.

The researcher should try to achieve these qualities in his report as far as possible.


Monday 4 April 2022

Question No. 2 - MCO-03 - Research Methodology and Statistical Analysis

Solutions to Assignments

                               MCO-03 - 

    Research Methodology and Statistical Analysis

Question No. 2 

(a) What do you understand by the term Correlation? Distinguish between different kinds of correlation with the help of scatter diagrams. 

Correlation refers to the statistical relationship between two entities. In other words, it's how two variables move in relation to one another. Correlation can be used for various data sets, as well. In some cases, you might have predicted how things will correlate, while in others, the relationship will be a surprise to you. It's important to understand that correlation does not mean the relationship is causal.
To understand how correlation works, it's important to understand the following terms:
- Positive correlation: a perfect positive correlation equals +1. The two variables move up or down together, in the same direction.

- Negative correlation: a perfect negative correlation equals -1. The two variables move in opposite directions.

- Zero or no correlation: a correlation of zero means there is no relationship between the two variables. In other words, as one variable moves, the other shows no systematic movement in response.

A scatter diagram is used to examine the relationship between two variables, one plotted on each axis (X and Y). If the variables are correlated, the points fall along a line or curve. A scatter diagram, or scatter plot, thus gives an idea of the nature of the relationship.

In a scatter diagram, if all the points lie along a single line, the correlation is perfect (unity). However, if the points are widely scattered about the line, the correlation is said to be low. If the points rest near or on a line, the correlation is said to be linear.
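A small sketch computing the correlation coefficient for hypothetical paired data, which is the number the scatter patterns below visualise (statistics.correlation requires Python 3.10+):

```python
# Pearson correlation sketch: r ranges from -1 through 0 to +1.
import statistics

x = [1, 2, 3, 4, 5, 6]
y_up = [2, 4, 5, 7, 9, 12]        # rises with x -> r near +1
y_down = [12, 9, 7, 5, 4, 2]      # falls as x rises -> r near -1

r_up = statistics.correlation(x, y_up)
r_down = statistics.correlation(x, y_down)
print(f"positive: r = {r_up:.2f}, negative: r = {r_down:.2f}")
```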

Types of Scatter Diagram

You can classify scatter diagrams in many ways; I will discuss the two most popular based on correlation and slope of the trend. These are the most common in project management.

According to the correlation, you can divide scatter diagrams into the following categories:

- Scatter Diagram with No Correlation
- Scatter Diagram with Moderate Correlation
- Scatter Diagram with Strong Correlation

Scatter Diagram with No Correlation
This diagram is also known as “Scatter Diagram with Zero Degree of Correlation.”

Here, the data points are spread so randomly that you cannot draw a line through them.


Therefore, you can say that these variables have no correlation.

Scatter Diagram with Moderate Correlation
This diagram is also known as “Scatter Diagram with a Low Degree of Correlation”.

Here, the data points are a little closer and you can see that some kind of relationship exists between these variables.


Scatter Diagram with Strong Correlation
This diagram is also known as “Scatter Diagram with a High Degree of Correlation”.


In this diagram, data points are close to each other and you can draw a line by following their pattern.

In this case, you say that these variables are closely related.


As discussed earlier, you can categorize the scatter diagram according to the slope, or trend, of the data points:

- Scatter Diagram with Strong Positive Correlation
- Scatter Diagram with Weak Positive Correlation
- Scatter Diagram with Strong Negative Correlation
- Scatter Diagram with Weak Negative Correlation
- Scatter Diagram with Weakest (or no) Correlation

A strong positive correlation means a visible upward trend from left to right; a strong negative correlation means a visible downward trend from left to right. A weak correlation means the trend is less clear. A flat line, from left to right, is the weakest correlation, as it is neither positive nor negative. A scatter diagram with no correlation shows that the independent variable does not affect the dependent variable.


Scatter Diagram with Strong Positive Correlation
This diagram is also known as a Scatter Diagram with Positive Slant.


In a positive slant, the correlation is positive, i.e. as the value of X increases, the value of Y will increase. You can say that the slope of a straight line drawn along the data points will go up. The pattern resembles a straight line.

For example, if the weather gets hotter, cold drink sales will go up.

Scatter Diagram with Weak Positive Correlation
As the value of X increases, the value of Y also increases, but the pattern does not resemble a straight line.

Scatter Diagram with Strong Negative Correlation
This diagram is also known as a Scatter Diagram with a Negative Slant.
In the negative slant, the correlation is negative, i.e. as the value of X increases, the value of Y will decrease. The slope of a straight line drawn along the data points will go down.

For example, if the temperature goes up, sales of winter coats go down.

Scatter Diagram with Weak Negative Correlation
As the value of X increases, the value of Y will decrease, but the pattern is not clear.

Scatter Diagram with No Correlation
There isn't any visible relationship between the two variables. It might just be a series of points with no visible trend, or it might be a straight, flat row of points. In either case, the independent variable has no effect on the second variable, which is therefore not dependent on it.

(b) What do you understand by interpretation of data? Illustrate the types of mistakes which frequently occur in interpretation.

Data interpretation refers to the process of using diverse analytical methods to review data and arrive at relevant conclusions. The interpretation of data helps researchers to categorize, manipulate, and summarize the information in order to answer critical questions.

The importance of data interpretation is evident and this is why it needs to be done properly. Data is very likely to arrive from multiple sources and has a tendency to enter the analysis process with haphazard ordering. Data analysis tends to be extremely subjective. That is to say, the nature and goal of interpretation will vary from business to business, likely correlating to the type of data being analyzed. While there are several different types of processes that are implemented based on individual data nature, the two broadest and most common categories are “quantitative analysis” and “qualitative analysis”.

Yet, before any serious data interpretation inquiry can begin, a sound decision must be made regarding scales of measurement, since visual presentations of data findings are irrelevant without one. This decision has a long-term impact on data interpretation ROI. The varying scales include:

- Nominal Scale: non-numeric categories that cannot be ranked or compared quantitatively; the categories are mutually exclusive and exhaustive.
- Ordinal Scale: categories that are exclusive and exhaustive, but with a logical order. Quality ratings and agreement ratings are examples of ordinal scales (e.g., good, very good, fair; or agree, strongly agree, disagree).
- Interval Scale: a measurement scale where data are grouped into ordered categories with equal distances between them; the zero point is arbitrary.
- Ratio Scale: contains the features of all three, together with a true (non-arbitrary) zero point.

When performing data analysis, it can be easy to slide into a few traps and end up making mistakes. Diligence is essential, and it is wise to keep an eye out for the following seven potential mistakes:

Sampling bias occurs when a non-representative sample is used. For example, a political campaign might sample 1,300 voters only to find out that one political party’s members are dramatically overrepresented in the pool. Sampling bias should be avoided because it can weigh the analysis too far in one particular direction.

Cherry-picking happens when data is stacked to support a particular hypothesis. It’s one of the more intentional problems that appear on this list because there’s always a temptation to give the analysis a nudge in the “right” direction. Not only is cherry-picking unethical, but it may have more serious consequences in fields like public policy, engineering, and health.

Disclosing metrics is a problem because a metric becomes useless once subjects know its value. This ends up creating problems like the habit in the education field of teaching to what’s on standardized tests. A similar problem occurred in the early days of internet search when websites started flooding their content with keywords to game the way pages were ranked.

Overfitting tends to happen during the analysis process. Someone might have a model, for example, whose curve seems to be predictive. Unfortunately, the curve fits only because the model has been fitted to that particular data. The failure of the model may become apparent only when it is compared with future observations that it does not fit so well.

Focusing only on the numbers is worrisome because it can have adverse real-world consequences. For example, existing social biases can be fed into models. A company handling lending might produce a model that induces geographic bias by using data derived from biased sources. The numbers may look clean and neat, but the underlying biases can be socially and economically turbulent.

Solution bias can be thought of as the gentler cousin of cherry-picking. With solution bias, a solution might be so cool, interesting or elegant that it’s hard not to fall in love with. Unfortunately, the solution might be wrong, and appropriate levels of scientific and mathematical rigor might not be applied because refuting the solution would just seem disheartening.

Communicating poorly is more problematic than you might expect. Producing analysis is one thing, but conveying findings in an accessible manner to people who didn’t participate in the project is critical. Data scientists need to be comfortable with producing elegant and engaging dashboards, charts and other work products to ensure their findings are well-communicated.

How to Avoid These Problems

Process and diligence are your primary weapons in combating mistakes in data analysis. First, you must have a process in place that emphasizes the importance of getting things right. When you’re creating a data science experiment, there need to be checks in place that will force you to stop and consider things like:

# Where is the data coming from?
# Are there known biases in the data?
# Can you screen the data for problems?
# Who is checking everybody’s work?
# When will results be re-analyzed to verify integrity?
# Are there ethical, social, economic or moral implications that need to be examined more closely before starting?

Diligence is also essential. You should be looking at concerns about whether:

# You have a large and representative enough sample to work with
# There are more rigorous ways to conduct the analysis
# Analysts are following properly outlined procedures

Tackling a data science project requires ample planning. You also have to consider ways to refine your work and to keep improving your processes over time. It takes commitment, but a group with the right culture can do a better job of steering clear of avoidable mistakes.

MCO-03 - Research Methodology and Statistical Analysis - Mcom 2nd Year

Solutions to Assignments

                               MCO-03 - 

    Research Methodology and Statistical Analysis

                           Mcom - 2nd Year


Question No. 1
What is meant by business research process? What are the various stages / aspects involved in the research process? (20)

Question No. 2 
(a) What do you understand by the term Correlation? Distinguish between different kinds of correlation with the help of scatter diagrams. 
(b) What do you understand by interpretation of data? Illustrate the types of mistakes which frequently occur in interpretation. (10+10)

Question No. 3
Briefly comment on the following: 
(a) The recognition or existence of a problem motivates research. 
(b) Quantitative data has to be condensed in a meaningful manner, so that it can be easily understood and interpreted. 
(c) Decomposition and analysis of a time series is one and the same thing 
(d) Research reports are the product of slow, painstaking and accurate work. (4X5)
Question No. 4 
Write short notes on the following: 
(a) Comparative Scales 
(b) Purpose of a Report 
(c) Binomial Distribution 
(d) Skewness (4X5)

Question No. 5
 Distinguish between the following: 
(a) Primary and Secondary Data 
(b) Estimation and testing of hypothesis 
(c) Sampling and Non-Sampling Errors 
(d) Bibliography and footnote

