MBA MK-01: MARKETING RESEARCH
Unit – 4
Q1. What is the role of ‘Data Editing and Data Analysis’ in preparing a research report?
Ans. When the researcher collects data, it is in raw form and needs to be edited, organized and analyzed. The raw data must be transformed into a comprehensible form. The first step in this process is to edit the data; the edited data is then coded and inferences are drawn. Editing the data is not a complex task, but it requires an experienced, talented and knowledgeable person. The role of data editing and data analysis is as follows:
Role of data editing:
Clarify responses
By editing the data the researcher makes sure that all responses are clear and easy to understand. Bringing clarity is important; otherwise the researcher may draw wrong inferences from the data. Sometimes respondents make spelling and grammatical mistakes that the editor needs to correct, and they may not be able to express their opinion in proper wording. The editor can rephrase a response, but he needs to be very careful in doing so: taking the wrong meaning from the respondent’s point of view introduces bias.
Make omissions
The editor may also need to make some omissions in the responses. By chance or by mistake some responses are left incomplete, and the editor has to identify what the respondent has overlooked.
How well the questionnaires are filled in depends on the target population. An educated respondent will fill in the questionnaire better than a person who is not very educated. It also depends on how interested the respondent is in filling it in; sometimes respondents are very reluctant to do so. If you think your respondents are not very interested, you should conduct an interview rather than handing out a questionnaire. In a questionnaire, respondents may leave blank spaces and you might get “no response”; in an interview, on the other hand, you can better assess what they want to tell you and what they are trying to hide.
Avoid biased editing
The editor has a great responsibility when editing surveyed data or other forms of responses. The editor needs to be very objective and should not try to hide or remove any information, nor add anything to the responses without a sound reason. He should be confident when making any changes or corrections to the data. In short, he should make the fewest and only logical changes, and should not add anything that reflects his own opinion on the issue.
Make judgements
Sometimes respondents leave something incomplete; to complete the sentence or phrase the editor has to make a judgement, and he needs good judgement to do so. He should do it so well that his personal bias does not creep into the responses.
Check handwriting
Handwriting issues also need to be resolved by the editor. Some people write very fast, and their handwriting can make the text difficult to comprehend. In electronically submitted questionnaires this problem never arises.
Logical adjustments
Logical adjustments must be made, otherwise the data will be faulty. There might be a need for some logical corrections; for example, a respondent gives these three answers to the three questions asked of him:
Q1: What is your age?
Ans: 16 years
Q2: What is your academic qualification?
Ans: Bachelors
Q3: What academic qualifications you
want to achieve in the future?
Ans: Bachelors in fine arts
Looking at the answers he has provided, he could not be 16 years of age and already have completed a bachelor’s degree. By looking at the other answers he has provided you can guess his age. If he is 16 years of age, then he could not have completed a bachelor’s degree, and you can guess which class he is likely to be in. If it is possible to contact the respondent, you can ask him about these answers. You can make logical changes in these answers because it is clearly evident that a 16-year-old boy or girl could not have completed a bachelor’s degree; he might have got confused between the two questions and given the wrong response. Such corrections are fairly easy to make, but there can be other responses that are tricky and clearly wrong. The editor must know how to correct the answers and what to do in such situations.
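As an illustration of such logical checks, here is a minimal sketch in Python that flags respondents whose stated age is inconsistent with their stated qualification. The field names, minimum-age rule and sample records are hypothetical, invented only to mirror the example above.

```python
# Hypothetical sketch: flag logically inconsistent responses during editing.
MIN_AGE_FOR = {"Bachelors": 19, "Masters": 21, "PhD": 24}  # assumed minimum plausible ages

responses = [
    {"id": 1, "age": 16, "qualification": "Bachelors"},   # inconsistent, like the example above
    {"id": 2, "age": 23, "qualification": "Bachelors"},   # plausible
]

def is_consistent(r):
    """Return True if the stated age is plausible for the stated qualification."""
    return r["age"] >= MIN_AGE_FOR.get(r["qualification"], 0)

for r in responses:
    if not is_consistent(r):
        # In practice the editor would re-contact the respondent or mark "no answer".
        print(f"Respondent {r['id']}: age {r['age']} inconsistent with {r['qualification']}")
```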
Re-contact the respondent
If some information is barely comprehensible and no logical meaning can be drawn from it, interviewees can be re-contacted to find out what they meant. If the data in the questionnaire is not correct and the editor cannot extract any meaning from it, the editor should re-contact the respondents and get their help.
Electronic editing
In recent years, most researchers prefer to administer electronic questionnaires wherever possible. Electronically submitted questionnaires are easy to edit, because in an electronic questionnaire you can set validation parameters. The computer can check the questionnaire itself and the editor’s job becomes easy. Inconsistencies can be avoided, logical errors can be eliminated entirely, and “no response” answers are few in electronic questionnaires.
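A minimal sketch of the kind of parameters an electronic questionnaire could enforce is shown below. The specific rules (age range, allowed categories, mandatory city field) are assumptions made for illustration, not the behaviour of any particular survey tool.

```python
# Hypothetical validation rules an electronic questionnaire might enforce at entry time.
RULES = {
    "age": lambda v: isinstance(v, int) and 10 <= v <= 100,
    "qualification": lambda v: v in {"School", "Bachelors", "Masters", "PhD"},
    "city": lambda v: isinstance(v, str) and v.strip() != "",  # mandatory field
}

def validate(answer: dict) -> list:
    """Return the list of field names that fail validation (an empty list means accept)."""
    return [field for field, ok in RULES.items() if not ok(answer.get(field))]

print(validate({"age": 16, "qualification": "Bachelors", "city": ""}))  # -> ['city']
```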
The qualities of the data editor
The data editor should have three qualities: he should be intelligent, objective and experienced in editing data. He should know how important the handling of data is to the researcher. He should avoid even the slightest chance of bias, which means he should also be honest in his work. His data editing will play a major role in the final inferences that the researcher draws from the data.
Role of data analysis:
Before beginning to write the dissertation, one has to collect data for the research. The data to be used can either be collected using data-gathering techniques or taken from someone else’s existing data, if it serves the purpose of the research. Collecting the data correctly takes a great deal of work. Before data analysis can begin, the accuracy of the data collected needs to be verified. Following data collection, the data needs to be critically analysed. For any research, data analysis is very important, as it provides an explanation of the various concepts, theories, frameworks and methods used. It eventually helps in arriving at conclusions and proving the hypothesis.
Data analysis is a process used to inspect, clean, transform and remodel data with a view to reaching a certain conclusion for a given situation. Data analysis is typically of two kinds: qualitative or
quantitative. The type of data dictates the method of analysis. In qualitative
research, any non-numerical data like text or individual words are analysed.
Quantitative analysis, on the other hand, focuses on measurement of the data
and can use statistics to help reveal results and conclusions. The results are
numerical. In some cases, both forms of analysis are used hand in hand. For
example, quantitative analysis can help prove qualitative conclusions.
Among the many benefits of data analysis, the more important ones
are:
- Data analysis helps in structuring the findings from different sources of data.
- Data analysis is very helpful in breaking a macro problem into micro parts.
- Data analysis acts like a filter when it comes to acquiring meaningful insights out of a huge data set.
- Data analysis helps in keeping human bias away from the research conclusion with the help of proper statistical treatment.
- When discussing data analysis it is important to mention that a methodology for analysing the data needs to be chosen. If a specific methodology is not selected, data can neither be collected nor analyzed properly.
- The methodology should be presented in the dissertation, as it enables the reader to understand which methods were used during the research and what type of data was collected and analyzed throughout the process.
- The dissertation should also present a critical analysis of the various methods and techniques that were considered but ultimately not used for the data analysis. An effective research methodology leads to better data collection and analysis and leads the researcher to valid and logical conclusions. Without a specific methodology, observations and findings cannot be made, which is why methodology is an essential part of a research project or dissertation.
Q2. Define Data Processing. What are the steps
involved in Data Processing?
Ans. Data
processing is, generally, "the collection and manipulation of items of
data to produce meaningful information." In this sense it can be
considered a subset of information processing, "the change (processing) of
information in any manner detectable by an observer." The term Data
processing (DP) has also been used previously to refer to a department within
an organization responsible for the operation of data processing applications.
General: Operations performed on a given set of data to extract
the required information in an appropriate form such as diagrams, reports, or
tables. See also electronic data processing.
Computing: Manipulation of input data with an application program
to obtain desired output as an audio/video, graphic, numeric, or text data
file.
Data processing is concerned with editing, coding, classifying, tabulating, charting and diagramming research data. The essence of data processing in research is data reduction. Data reduction involves winnowing out the irrelevant from the relevant data, establishing order out of chaos and giving shape to a mass of data. Data processing in research consists of five important steps, which are discussed below:
1. Editing of Data
Editing is the first step in data processing. Editing is the process of examining the data collected in questionnaires/schedules to detect errors and omissions and to see that they are corrected and that the schedules are ready for tabulation. When the whole data collection is over, a final and thorough check-up is made. Mildred B. Parten, in her book, points out that the editor is responsible for seeing that the data are:
- as accurate as possible,
- consistent with other facts secured,
- uniformly entered,
- as complete as possible,
- acceptable for tabulation and arranged to facilitate coding and tabulation.
There are different types of editing. They are:
Field Editing is done by the enumerator. The schedule filled in by the enumerator or the respondent might contain abbreviated or illegible writing and the like. These are rectified by the enumerator. This should be done soon after the enumeration or interview, before memory fades. Field editing should not extend to filling in omissions with guessed data.
Central Editing is done by the researcher after all schedules, questionnaires or forms have been received from the enumerators or respondents. Obvious errors can be corrected. For missing data or information, the editor may substitute data by reviewing the information provided by other, similarly placed respondents. A definitely inappropriate answer is removed and “no answer” is entered when reasonable attempts to get the appropriate answer fail to produce results.
Editors must keep in view the following points while performing
their work:
1. They should be familiar with instructions given to the interviewers
and coders as well as with the editing instructions supplied to them for the
purpose,
2. While crossing out an original entry for one reason or another,
they should just draw a single line on it so that the same may remain legible,
3. They must make entries (if any) on the form in some distinctive
color and that too in a standardized form,
4. They should initial all answers which they change or supply,
5. The editor’s initials and the date of editing should be placed on each completed form or schedule.
2. Coding of Data
Coding is necessary for efficient analysis; through it, the several replies may be reduced to a small number of classes which contain the critical information required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This makes it possible to pre-code the questionnaire choices, which in turn is helpful for computer tabulation, as one can key-punch straight from the original questionnaires. In the case of hand coding, some standard method may be used. One such method is to code in the margin with a coloured pencil; another is to transcribe the data from the questionnaire to a coding sheet. Whatever method is adopted, one should see that coding errors are eliminated altogether or reduced to the minimum level.
Coding is the process/operation by which data/responses are organized into classes/categories and numerals or other symbols are assigned to each item according to the class in which it falls. In other words, coding involves two important operations: (a) deciding the categories to be used and (b) allocating individual answers to them. These categories should be appropriate to the research problem, exhaustive of the data, mutually exclusive and unidimensional. Since coding eliminates much of the information in the raw data, it is important that researchers design category sets carefully in order to utilize the available data more fully.
The study of the responses is the first step in coding. In the case of pre-coded questions, coding begins at the preparation of the interview schedule. Secondly, a coding frame is developed by listing the possible answers to each question and assigning code numbers or symbols to each of them; these are the indicators used for coding. The coding frame is an outline of what is coded and how it is to be coded: a set of explicit rules and conventions used to classify observations on a variable into values, which are then transformed into numbers. Thirdly, after preparing the coding frame, the gradual process of fitting the answers to the questions begins. Lastly, transcription is undertaken, i.e., transferring the information from the schedules to a separate sheet called the transcription sheet. The transcription sheet is a large summary sheet which contains the answers/codes of all the respondents. Transcription may not be necessary when only simple tables are required and the number of respondents is small.
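The following sketch illustrates a coding frame and the transcription step in Python; the question categories and code numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical coding frame: each possible answer is assigned a code number.
coding_frame = {"Strongly agree": 1, "Agree": 2, "Neutral": 3,
                "Disagree": 4, "Strongly disagree": 5}

# Raw responses as they might appear on the schedules.
raw = pd.Series(["Agree", "Neutral", "Strongly agree", "Agree", "Disagree"])

# Transcription: transfer the coded answers to a summary structure.
coded = raw.map(coding_frame)
print(coded.tolist())   # [2, 3, 1, 2, 4]
```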
3. Classification of Data
Classification or categorization is the process of grouping the statistical data under various understandable homogeneous groups for the purpose of convenient interpretation. Uniformity of attributes is the basic criterion for classification, and the grouping of data is made according to similarity. Classification becomes necessary when there is diversity in the data collected, so that the data can be presented and analysed meaningfully; it is, however, meaningless in respect of homogeneous data. A good classification should have the characteristics of clarity, homogeneity, equality of scale, purposefulness and accuracy.
The objectives of classification are as follows:
- The complex, scattered and haphazard data is organized into a concise, logical and intelligible form.
- It becomes possible to make the similarities and dissimilarities of characteristics clear.
- Comparative studies are possible.
- Understanding of the significance is made easier, and thereby a good deal of human energy is saved.
- The underlying unity amongst different items is made clear and expressed.
- Data is so arranged that analysis and generalization become possible.
Classification is of two types, viz., quantitative classification, which is made on the basis of variables or quantity, and qualitative classification, which is made according to attributes. The former groups the variables, i.e., quantifies the variables into cohesive groups, while the latter groups the data on the basis of attributes or qualities. Again, classification may be multiple or dichotomous. The former makes many (more than two) groups on the basis of some quality or attribute, while the latter classifies the data into two groups on the basis of the presence or absence of a certain quality. Grouping the workers of a factory under various income groups (class intervals) comes under multiple classification, while dividing them into skilled and unskilled workers is dichotomous classification. The tabular form of such a classification is known as a statistical series, which may be inclusive or exclusive.
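As a brief illustration, the sketch below performs a quantitative (multiple) classification of hypothetical workers' incomes into class intervals and a dichotomous classification by skill; all figures and class limits are invented.

```python
import pandas as pd

# Hypothetical monthly incomes of factory workers.
income = pd.Series([12000, 18500, 25000, 31000, 9000, 27500, 22000])

# Multiple classification: group into exclusive class intervals (0-10000, 10000-20000, ...).
classes = pd.cut(income, bins=[0, 10000, 20000, 30000, 40000], right=False)
print(classes.value_counts().sort_index())

# Dichotomous classification: two groups based on the presence/absence of a quality.
skilled = pd.Series([True, False, True, True, False, True, False])
print(skilled.map({True: "Skilled", False: "Unskilled"}).value_counts())
```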
4. Tabulation of Data
Tabulation is the process of summarizing raw data and displaying
it in compact form for further analysis. Therefore, preparing tables is a very
important step. Tabulation may be done by hand, mechanically, or electronically. The choice is made largely on the basis of the size and type of study, alternative costs, time pressures, and the availability of computers and computer programmes. If the number of questionnaires is small and their length short, hand tabulation is quite satisfactory.
Tables may be divided into: (i) frequency tables, (ii) response tables, (iii) contingency tables, (iv) univariate tables, (v) bivariate tables, (vi) statistical tables and (vii) time-series tables.
Generally a research table has the following parts: (a) table number, (b) title of the table, (c) caption, (d) stub (row heading), (e) body, (f) head note, and (g) foot note.
As a general rule the following steps are necessary in the
preparation of table:
- Title of table: The table should be first given a brief,
simple and clear title which may express the basis of classification.
- Columns and rows: Each table should be prepared in just
adequate number of columns and rows.
- Captions and stubs: The columns and rows should be given
simple and clear captions and stubs.
- Ruling: Columns and rows should be divided by means of thin
or thick rulings.
- Arrangement of items: Comparable figures should be arranged side by side.
- Deviations: These should be arranged in the column near the
original data so that their presence may easily be noted.
- Special emphasis: This can be done by writing important data
in bold or special letters.
- Unit of measurement: The unit should be noted below the
lines.
- Approximation: This should also be noted below the title.
- Footnotes: These may be given below the table.
- Source : Source of data must be given. For primary data,
write primary data.
It is not necessary to present facts in tabular form if they can be presented more simply in the body of the text. Tabular presentation enables the reader to follow the data more quickly than textual presentation. A table should not merely repeat information covered in the text, and the same information should not, of course, be presented in both tabular and graphical form. Smaller and simpler tables may be presented in the text, while large and complex tables may be placed at the end of the chapter or report.
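A minimal sketch of simple (frequency) and cross (contingency) tabulation using pandas is given below; the survey records are invented for illustration.

```python
import pandas as pd

# Hypothetical coded survey data.
df = pd.DataFrame({
    "gender":  ["M", "F", "F", "M", "F", "M", "F"],
    "opinion": ["Agree", "Agree", "Disagree", "Neutral", "Agree", "Disagree", "Neutral"],
})

# Frequency table (univariate).
print(df["opinion"].value_counts())

# Contingency table (bivariate): opinion by gender, with row and column totals.
print(pd.crosstab(df["gender"], df["opinion"], margins=True))
```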
5. Data Diagrams
Diagrams are charts and graphs used to present data. They help in catching the reader’s attention, present the data more effectively, and make creative presentation of the data possible. Data diagrams are classified into:
Charts: A chart is a diagrammatic form of data presentation. Bar charts, rectangles, squares and circles can be used to present data. Bar charts are one-dimensional, while rectangles, squares and circles are two-dimensional.
Graphs: The method of presenting numerical data in visual form is called a graph. A graph shows the relationship between two variables by means of either a curve or a straight line. Graphs may be divided into two categories: (1) graphs of time series and (2) graphs of frequency distribution. In graphs of time series one of the factors is time and the other factor or factors are the study factors. Graphs of frequency distribution show how the study group (executives, for example) is distributed by income, age, and so on.
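The sketch below draws a one-dimensional bar chart and a time-series graph with matplotlib; all figures are invented for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration only.
regions = ["North", "South", "East", "West"]
sales = [120, 95, 140, 80]                       # bar chart data
years = [2018, 2019, 2020, 2021, 2022]
revenue = [1.2, 1.5, 1.4, 1.9, 2.3]              # time-series data

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.bar(regions, sales)                          # one-dimensional bar chart
ax1.set_title("Sales by region")
ax2.plot(years, revenue, marker="o")             # graph of a time series
ax2.set_title("Revenue over time")
plt.tight_layout()
plt.show()
```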
Q3. Describe the principles that should be kept in
mind while classifying data into categories?
Ans. The following main principles should be kept in mind while classifying and tabulating statistical data; they help to improve statistical research.
Rules for the classification of data: The following general rules are to be followed while classifying data according to class intervals.
a) Number of class intervals: The number of classes is usually between 5 and 15. It should be neither very large nor very small: if the number of classes is very small, necessary information may be lost, and if the number of classes is very large, further analysis of the data becomes difficult.
b) Size of class interval: The approximate size of the class interval can be estimated from the relation noted after this list.
c) Starting of class: The classes should start with 0 or 5 or multiple of 5.
d) Method of class: Exclusive method of classification should be preferred as it is
more useful.
e) Tally Bars: Class frequencies should be obtained by using tally marks or
tally bars.
f) Clarity: The classification should be in a clear, precise and concrete form, should meet the purpose of the study, and should be flexible.
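The relation referred to in point (b) is not reproduced in the source; a commonly used textbook rule of thumb (assumed here) is:

```latex
% Assumed class-width relation (not reproduced from the source).
\[
  i \;=\; \frac{\text{Range}}{\text{Number of classes}}
  \qquad\text{or, using Sturges' rule,}\qquad
  i \;=\; \frac{L - S}{1 + 3.322\,\log_{10} N}
\]
% where L and S are the largest and smallest observations and N is the number of observations.
```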
Q4. What is Hypothesis? Discuss the characteristics
of a good hypothesis. Explain the term Null and Alternative Hypothesis?
Ans. Ordinarily, when one talks about a hypothesis, one simply means a mere assumption or some supposition to be proved or disproved. But for a researcher a hypothesis is a formal question that he intends to resolve.
Thus a hypothesis may be defined as a proposition, or a set of propositions, set forth as an explanation for the occurrence of some specified group of phenomena, either asserted merely as a provisional conjecture to guide some investigation or accepted as highly probable in the light of established facts. Quite often a research hypothesis is a predictive statement, capable of
being tested by scientific methods, that relates an independent variable to
some dependent variable. For example, consider statements like the following
ones:
“Students who receive counselling will show a greater increase in creativity than students not receiving counselling”, or
“Automobile A is performing as well as automobile B.”
These are hypotheses capable of being objectively verified and tested.
Thus, we may conclude that a hypothesis states what we are looking for and it
is a proposition which can be put to a test to determine its validity.
The Characteristics of a hypothesis can be discussed as follows:
- Hypothesis should be clear and precise. If the
hypothesis is not clear and precise, the inferences drawn on its basis
cannot be taken as reliable.
- Hypothesis should be capable of being tested. Many a time research programmes have bogged down in a swamp of untestable hypotheses. Some prior study may be done by the researcher in order to make the hypothesis testable. A hypothesis “is testable if other deductions can be made from it which, in turn, can be confirmed or disproved by observation.”
- Hypothesis should state relationship between
variables, if it happens to be a relational hypothesis.
- Hypothesis should be limited in scope and must be
specific. A researcher must remember that narrower hypotheses are
generally more testable and he should develop such hypotheses.
- Hypothesis should be stated as far as possible in
most simple terms so that the same is easily understandable by all
concerned. But one must remember that simplicity of hypothesis has nothing
to do with its significance.
- Hypothesis should be consistent with most known
facts i.e., it must be consistent with a substantial body of established
facts. In other words, it should be one which judges accept as being the
most likely.
- Hypothesis should be amenable to testing within a
reasonable time. One should not use even an excellent hypothesis, if the
same cannot be tested in reasonable time for one cannot spend a life-time
collecting data to test it.
- Hypothesis must explain the facts that gave rise
to the need for explanation. This means that by using the hypothesis plus
other known and accepted generalizations, one should be able to deduce the
original problem condition. Thus hypothesis must actually explain what it
claims to explain; it should have empirical reference.
Statistics: An assumption about certain characteristics of a
population. If it specifies values for every parameter of a population, it is
called a simple hypothesis; if not,
a composite hypothesis. If it attempts to nullify the difference between two
sample means (by suggesting that the difference is of no statistical
significance), it is called a null hypothesis.
Null hypothesis (in a statistical test) the hypothesis that there is no
significant difference between specified populations, any observed difference
being due to sampling or experimental error.
A null hypothesis is a type of hypothesis used in statistics that
proposes that no statistical significance exists in a set of given
observations. The null hypothesis attempts to show that no variation exists
between variables or that a single variable is no different than its mean. It
is presumed to be true until statistical evidence nullifies it for an
alternative hypothesis.
The null hypothesis, also known as the conjecture, assumes that any kind of difference or significance you see in a set of data is due to chance. The opposite of the null hypothesis is known as the alternative hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance, while the alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.
In the statistical testing of a hypothesis, the alternative hypothesis is the one to be accepted if the null hypothesis is rejected.
Difference Between Null and Alternative
The null hypothesis is the initial statistical claim that the population mean is equivalent to the claimed value. For example, assume the average time to cook a specific brand of pasta is 12 minutes. Therefore, the null hypothesis would be stated as, “The population mean is equal to 12 minutes.” Conversely, the alternative hypothesis is the hypothesis that is accepted if the null hypothesis is rejected.
For example, assume the hypothesis test is set up so that the
alternative hypothesis states that the population parameter is not equal to the
claimed value. Therefore, the cook time for the population mean is not equal to
12 minutes; rather, it could be less than or greater than the stated value. If the null hypothesis is accepted, or the statistical test indicates that the population mean is 12 minutes, then the alternative hypothesis is rejected; the opposite is also true.
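As an illustration, the pasta example could be tested with a one-sample t-test as sketched below; the sample of cook times is invented purely for illustration.

```python
from scipy import stats

# Hypothetical sample of observed cook times (minutes) for the pasta brand.
cook_times = [11.8, 12.4, 12.9, 11.5, 13.1, 12.7, 12.2, 13.0]

# H0: population mean cook time = 12 minutes; Ha: it is not equal to 12 (two-tailed).
t_stat, p_value = stats.ttest_1samp(cook_times, popmean=12)
alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value <= alpha:
    print("Reject H0: the mean cook time differs from 12 minutes.")
else:
    print("Fail to reject H0: no evidence the mean differs from 12 minutes.")
```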
Q5. Discuss the different steps that are involved in
testing of a hypothesis?
Ans. To test a hypothesis means to tell (on the basis of
the data the researcher has collected) whether or not the hypothesis seems to
be valid. In hypothesis testing the main question is: whether to accept the
null hypothesis or not to accept the null hypothesis? Procedure for hypothesis
testing refers to all those steps that we undertake for making a choice between
the two actions i.e., rejection and acceptance of a null hypothesis. The
various steps involved in hypothesis testing are stated below:
1. Making a formal statement: This step consists in making a formal statement of the null hypothesis (H0) and also of the alternative hypothesis (Ha). This means that the hypotheses should be clearly stated, considering the nature of the research problem. For
instance, Mr. Mohan of the Civil Engineering Department wants to test the load
bearing capacity of an old bridge which must be more than 10 tons, in that case
he can state his hypotheses as under:
Null hypothesis H0: μ = 10 tons
Alternative hypothesis Ha: μ > 10 tons
Take another example. The average score in an aptitude test administered
at the national level is 80. To evaluate a state’s education system, the
average score of 100 of the state’s students selected on random basis was 75.
The state wants to know if there is a significant difference between the local
scores and the national scores. In such a situation the hypotheses may be
stated as under:
Null hypothesis H0: μ = 80
Alternative hypothesis Ha: μ ≠ 80
The formulation of hypotheses is an important step which must be
accomplished with due care in accordance with the object and nature of the
problem under consideration. It also indicates whether we should use a
one-tailed test or a two-tailed test. If Ha is of the type greater than
(or of the type lesser than), we use a one-tailed test, but when Ha is
of the type “whether greater or smaller” then we use a two-tailed test.
2. Selecting a significance level: The hypotheses are tested on a pre-determined level of
significance and as such the same should be specified. Generally, in practice,
either 5% level or 1% level is adopted for the purpose. The factors that affect
the level of significance are: (a) the magnitude of the difference between
sample means; (b) the size of the samples; (c) the variability of measurements within
samples; and (d) whether the hypothesis is directional or non-directional (A
directional hypothesis is one which predicts the direction of the difference
between, say, means). In brief, the level of significance must be adequate in
the context of the purpose and nature of enquiry.
3. Deciding the distribution to use: After
deciding the level of significance, the next step in hypothesis testing is to
determine the appropriate sampling distribution. The choice generally remains between
normal distribution and the t-distribution. The rules for selecting the
correct distribution are similar to those which we have stated earlier in the
context of estimation.
4. Selecting a random sample and computing an
appropriate value: Another step is to select a random sample(s) and
compute an appropriate value from the sample data concerning the test statistic
utilizing the relevant distribution. In other words, draw a sample to furnish
empirical data.
5. Calculation of the probability: One has
then to calculate the probability that the sample result would diverge as
widely as it has from expectations, if the null hypothesis were in fact true.
6. Comparing the probability: Yet another step consists in comparing the probability thus calculated with the specified value of α, the significance level. If the calculated probability is equal to or smaller than the α value in the case of a one-tailed test (and α/2 in the case of a two-tailed test), then reject the null hypothesis (i.e., accept the alternative hypothesis); but if the calculated probability is greater, then accept the null hypothesis. In case we reject H0, we run the risk of committing a Type I error (at most equal to the level of significance), but if we accept H0, we run some risk of committing a Type II error (the size of which cannot be specified as long as H0 is vague rather than specific).
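These steps can be illustrated on the aptitude-score example above (national mean 80, sample mean 75, n = 100). The population standard deviation is not given in the text, so the value used below (σ = 18) is purely an assumption for illustration.

```python
from scipy import stats

# Step 1: H0: mu = 80, Ha: mu != 80 (two-tailed).
mu0, xbar, n = 80, 75, 100
sigma = 18          # assumed population standard deviation (not given in the example)
alpha = 0.05        # Step 2: significance level

# Steps 3-4: with sigma known and n large, use the normal (z) distribution.
z = (xbar - mu0) / (sigma / n ** 0.5)

# Step 5: probability of a result at least this extreme under H0 (two-tailed p-value).
p_value = 2 * stats.norm.sf(abs(z))

# Step 6: compare with alpha.
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Accept H0")
```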
Q6. What do you understand by Non-Parametric Test?
Discuss major advantages and limitations of Non-Parametric test?
Ans. A non-parametric test is a hypothesis test that does not require the population’s distribution to be characterized by certain parameters. For example, many hypothesis tests rely on the assumption that the population follows a normal distribution with parameters μ and σ. Non-parametric tests do not make this assumption, so they are useful when your data are strongly non-normal and resistant to transformation.
However, nonparametric tests are not completely free of
assumptions about your data. For instance, nonparametric tests require the data
to be an independent random sample. For example, salary data are heavily skewed
to the right, with many people earning modest salaries and fewer people earning
larger salaries. You can use nonparametric tests on this data to answer
questions such as the following:
Is the median salary at your company equal to a certain value? Use
the 1-sample sign test.
Is the median salary at a bank's urban branch greater than the
median salary of the bank's rural branch? Use the Mann-Whitney test or the
Kruskal-Wallis test.
Are median salaries different in rural, urban, and suburban bank
branches? Use Mood's median test.
How does education level affect salaries at the rural and urban branches? Use the Friedman test.
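A short sketch of two of these tests using scipy is given below; the salary figures are invented and deliberately right-skewed for illustration.

```python
from scipy import stats

# Hypothetical, right-skewed salary data (thousands per year).
urban = [28, 31, 35, 40, 45, 52, 60, 75, 120, 300]
rural = [22, 25, 27, 30, 32, 36, 41, 48, 55, 90]

# 1-sample sign test: is the median urban salary equal to 40?
# Count values above/below 40 and apply a binomial test (ties are dropped).
above = sum(x > 40 for x in urban)
below = sum(x < 40 for x in urban)
sign_p = stats.binomtest(above, above + below, p=0.5).pvalue
print(f"Sign test p-value: {sign_p:.3f}")

# Mann-Whitney test: is the urban median greater than the rural median?
u_stat, mw_p = stats.mannwhitneyu(urban, rural, alternative="greater")
print(f"Mann-Whitney p-value: {mw_p:.3f}")
```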
Limitations of nonparametric tests
Nonparametric tests have the following limitations:
- Nonparametric tests are usually less powerful than corresponding tests designed for use on data that come from a specific distribution. Thus, you are less likely to reject the null hypothesis when it is false.
- Nonparametric tests often require you to modify the hypotheses. For example, most nonparametric tests about the population centre are tests about the median instead of the mean. The test does not answer the same question as the corresponding parametric procedure.
Advantages of nonparametric test
(1) Nonparametric tests make less stringent demands of the data. For standard parametric procedures to be valid, certain underlying conditions or assumptions must be
met, particularly for smaller sample sizes. The one-sample t test, for example,
requires that the observations be drawn from a normally distributed population.
For two independent samples, the t test has the additional requirement that the
population standard deviations be equal. If these assumptions/conditions are
violated, the resulting P-values and confidence intervals may not be
trustworthy. However, normality is not required for the Wilcoxon signed rank or
rank sum tests to produce valid inferences about whether the median of a
symmetric population is 0 or whether two samples are drawn from the same
population.
(2) Nonparametric
procedures can sometimes be used to get a quick answer with little calculation.
(3) Two
of the simplest nonparametric procedures are the sign test and median test. The
sign test can be used with paired data to test the hypothesis that differences
are equally likely to be positive or negative (or, equivalently, that the median difference is 0).
(4) Nonparametric
methods provide an air of objectivity when there is no reliable (universally
recognized) underlying scale for the original data and there is some concern
that the results of standard parametric techniques would be criticized for
their dependence on an artificial metric.
(5) A
historical appeal of rank tests is that it was easy to construct tables of
exact critical values, provided there were no ties in the data. The same
critical value could be used for all data sets with the same number of
observations because every data set is reduced to the ranks 1, ..., n. However, this advantage has been eliminated by the ready availability of personal computers.
(6) Sometimes
the data do not constitute a random sample from a larger population. The data
in hand are all there are. Standard parametric techniques based on sampling
from larger populations are no longer appropriate. Because there are no larger
populations, there are no population parameters to estimate.
Q7. What do you mean by parametric test? State the conditions necessary for the use of the following tests:
a. Z test
b. T test
c. Chi-Square test
Ans. A parametric statistical test assumes that sample data come from a population that follows a probability distribution based on a fixed set of parameters. Most well-known elementary statistical methods are parametric.
A parametric model, as it relies on a fixed parameter set, assumes more about a given population than non-parametric methods do. When the assumptions are correct, parametric methods produce more accurate and precise estimates than non-parametric methods, i.e. they have more statistical power. Because more is assumed, parametric methods have a greater chance of failing when the assumptions are not correct, and for this reason they are not robust. On the other hand, parametric formulae are often simpler to write down and faster to compute, so their simplicity can make up for their lack of robustness, especially if care is taken to examine diagnostic statistics.
Reasons to Use Parametric
Tests
Reason 1: Parametric tests can perform well with skewed and non-normal distributions
This may be a surprise, but parametric tests can perform well with continuous data that are non-normal if you satisfy the sample size guidelines below.
Sample size guidelines for non-normal data:
- 1-sample t test: the sample should be greater than 20.
- 2-sample t test: each group should be greater than 15.
- One-Way ANOVA: if you have 2-9 groups, each group should be greater than 15; if you have 10-12 groups, each group should be greater than 20.
Reason 2: Parametric tests can perform well when the spread of each group
is different
While nonparametric tests don’t assume that your data follow a
normal distribution, they do have other assumptions that can be hard to meet.
For nonparametric tests that compare groups, a common assumption is that the
data for all groups must have the same spread (dispersion). If your groups have
a different spread, the nonparametric tests might not provide valid results.
On the other hand, if you use the 2-sample t test or One-Way ANOVA, you can simply go to the Options subdialog and uncheck Assume equal variances. Voilà, you’re good to go even when the groups have different spreads!
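Outside Minitab, the usual equivalent of unchecking “Assume equal variances” is Welch’s t-test; a minimal sketch with invented data follows.

```python
from scipy import stats

# Hypothetical groups with clearly different spreads.
group_a = [10.1, 10.4, 9.8, 10.2, 10.0, 10.3]
group_b = [12.0, 8.5, 14.2, 7.9, 13.1, 9.6]

# equal_var=False requests Welch's t-test, which does not assume equal variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```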
Reason 3: Statistical power
Parametric tests usually have more statistical power than
nonparametric tests. Thus, you are more likely to detect a significant effect
when one truly exists.
Conditions for following Tests :
For both Z-tests and T-tests, the conditions are the same.
However, you may recall that for Z-tests, the population standard deviation has
to be known, and for T-tests, the population standard deviation is unknown.
T-test conditions
The data were collected in a random way, each observation must be
independent of the others, and the sampling distribution must be normal or
approximately normal.
Z-test conditions
The data were collected in a random way, each observation must be
independent of the others, the sampling distribution must be normal or
approximately normal, and the population standard deviation must be known. When
performing a hypothesis test for a population mean, there are three conditions.
One has to deal with how the data were collected. Were they
collected in some random way? A simple random sample is the gold standard.
Second, is each observation independent of the others? You're
going to verify that mathematically.
And third, is the sampling distribution approximately normal? Again, you're going to verify that in a number of ways.
1. First, are the data collected in some random way? The purpose
is to make sure there's not any bias in the sample. Ideally, you want a simple random sample from the population, or to be able to treat your data as being a simple random sample. Cluster samples are typically okay, as are stratified random samples. The randomness is what matters most.
2. Second, the independence condition. You want to make sure that
each observation doesn't affect any other observation. There are a couple ways
to do that:
One, which isn't very common, is sampling with replacement. This
means when you take a person out, or an item out of the population, that you
put them back and can sample them again. That's not typically how you do
sampling. Normally, when you're sampling somebody, you don't put them back, and
you can't sample them again. For instance, if you're taking a political poll
you wouldn't want someone's opinion counted twice. So you need a population
that is large.
The other, more common, way is sampling without replacement, where you have to check that the sample is less than 10% of the population: if you multiply your sample size by 10, the population has to be at least that big in order to say that the observations are pretty much independent of each other.
3. Finally, is the sampling distribution approximately normal? The distribution of sample means (the sampling distribution) will be nearly normal in two cases:
3.1 One is if the
sample size is 30 or above. The central limit theorem says that the sampling distribution of sample means
will be approximately normal when the sample
size is large. For most distributions that's 30 or larger for a sample size.
3.2 The other way is, if the parent distribution (the distribution of values from which we got our data) is normal, then the sampling distribution of sample means will also be normal, regardless of the sample size. There are two ways to verify that:
3.2 (a) If we're lucky, it might be stated within the context of the problem. If you're actually doing this in real life, though, it would be hard to verify that for sure.
3.2 (b) If it isn't stated, then you actually have to look at your data. Graph the data in a histogram or a dot plot and look for approximate symmetry, a mound shape, and a lack of outliers.
Chi-Square Goodness of Fit Test
This lesson explains how to conduct a chi-square goodness of fit
test. The test is applied when you have one categorical variable from a single
population. It is used to determine whether sample data are consistent with a
hypothesized distribution.
For example, suppose a company printed baseball cards. It claimed
that 30% of its cards were rookies; 60%, veterans; and 10%, All-Stars. We could
gather a random sample of baseball cards and use a chi-square goodness of fit
test to see whether our sample distribution differed significantly from the
distribution claimed by the company. The sample problem at the end of the
lesson considers this example.
When to Use the Chi-Square Goodness of Fit Test
The chi-square goodness of fit test is appropriate when the
following conditions are met:
- The sampling method is simple random sampling.
- The variable under study is categorical.
- The expected value of the number of sample observations in each level of the variable is at least 5.
This approach consists of four steps: (1) state the hypotheses,
(2) formulate an analysis plan, (3) analyze sample data, and (4) interpret
results.
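As an illustration, the baseball-card example could be tested as sketched below; the observed counts are invented for illustration.

```python
from scipy import stats

# Claimed distribution: 30% rookies, 60% veterans, 10% All-Stars.
# Hypothetical observed counts in a random sample of 100 cards.
observed = [50, 45, 5]                        # rookies, veterans, All-Stars (invented)
expected = [0.30 * 100, 0.60 * 100, 0.10 * 100]

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
# A small p-value would indicate the sample is inconsistent with the claimed distribution.
```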