DPB30063 STATISTICS TOPIC 2: DATA PRESENTATION
Zaihasra Binti Rukiman JPPMK
At the end of this chapter, students should be able to:
Construct frequency distribution tables
Organize quantitative data
2.0 Introduction
Presenting data involves the use of a variety of different graphical techniques to visually show the reader the relationship between different data sets, to emphasize the nature of a particular aspect of the data or to geographically ‘place’ data appropriately on a map.
Data presentation is an essential step before further statistical analysis is carried out.
The data are summarized and displayed, enabling researchers, managers and decision-makers to observe important features of the data and providing insight into the type of model and analysis that should be used.
Some common data presentations include the frequency table, bar chart, pie chart, histogram, frequency curve, line graph, stem-and-leaf display and ogive.
Example: Raw data are data that have been collected or recorded but have not yet been arranged or processed. Below are the heights of 10 children in cm:
58 25 30 64 22 76 53 47 31 66
These data are also called ungrouped data.
2.1 Construct Frequency Distribution Tables
Frequency tells you how often something happened.
The frequency of an observation tells you the number of times the observation occurs in the data.
Tables can show either categorical variables (sometimes called qualitative variables) or quantitative variables (sometimes called numeric variables).
Frequency distribution tables can show either the actual number of observations falling in each range or the percentage of observations.
The statistical data that we collect can be presented in the form of a frequency distribution.
A frequency distribution refers to summarizing a large data set into a small number of intervals.
2.1.1 Calibrate the elements of frequency distribution tables
A frequency table summarizes the data collected by forming intervals of values and indicating the number of data that fall into each interval.
This frequency table with class intervals is known as the frequency distribution of grouped data.
2.1.1.1 Number of classes
Deciding how many classes to use for grouping the data is a compromise between the extremes of too much detail (each observation in its own category) and not enough detail (only one category).
The number of classes, k, can be obtained using the following formula:
k = 1 + 3.3 log10(n), where n = the number of data
Data Range
Data range is the difference between the largest and smallest values of the observations.
R = Highest Value – Lowest Value
Class Width
Find the class width by dividing the range by the number of classes.
Class width = Data Range / Number of classes
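The three quantities above can be computed directly. The following is a minimal sketch rather than part of the original notes: the function name `class_parameters` is hypothetical, and rounding both k and the class width up to the next whole number is an assumption made here so that every observation fits into a class. The data are the ten heights given in the introduction.

```python
import math

def class_parameters(data, coefficient=3.3):
    """Return (number of classes k, data range R, class width)."""
    n = len(data)
    # k = 1 + 3.3 log10(n), rounded up (rounding up is an assumption of this sketch)
    k = math.ceil(1 + coefficient * math.log10(n))
    # R = highest value - lowest value
    data_range = max(data) - min(data)
    # class width = data range / number of classes, rounded up to a whole number
    width = math.ceil(data_range / k)
    return k, data_range, width

heights = [58, 25, 30, 64, 22, 76, 53, 47, 31, 66]
k, data_range, width = class_parameters(heights)
print(k, data_range, width)  # 5 54 11
```

For the ten heights, k = 1 + 3.3(1) = 4.3, which rounds up to 5 classes; the range is 76 - 22 = 54, so the class width is 54 / 5 = 10.8, rounded up to 11.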
2.1.1.2 Class interval
Class intervals are generally equal in width and are mutually exclusive.
Each class interval consists of a lower class limit and an upper class limit.
All of the data should fall between the lower class limit for the first class interval and upper class limit for the last class interval.
To determine the lower class limit for the first class interval, start with the lowest/smallest value or observation.
Once the starting point is determined, add the class width to get the lower class limit for the second class interval.
The upper class limit of the first class interval is a value that is smaller than the lower class limit for the second class interval.
E.g. for the class interval 40-49:
40 = lower class limit
49 = upper class limit
2.1.1.3 Frequency
A frequency is the number of times a data value occurs.
The following steps are used to determine the frequency of each class:
i- Tally and frequency: the tallying process is normally used to count the data that fall in each class; the data count becomes the frequency of each class. Tallies are usually marked in bunches of five.
ii- The use of the tallying process eases the calculation of the frequency of each class from the given data set.
iii- Starting with the first data value, search for the class in which the number falls, then draw a tally mark (/) for that class. Once we have four tally marks, the fifth mark is drawn across them. This makes the counting process easier.
iv- E.g.
//// (four marks crossed by a fifth) = 5 (the frequency of observation)
/// = 3
//// /// = 8
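The tallying steps can be sketched in code as follows. This is an illustration under stated assumptions, not the method of the notes: the helper names are hypothetical, the data are whole numbers, the grouping into five classes of width 11 starting at 22 is an assumed choice for the ten heights from the introduction, and a crossed bunch of five is written as `(////)` because plain text cannot show the crossing mark.

```python
def make_classes(start, width, k):
    """Build k class intervals (lower limit, upper limit) for whole-number data."""
    return [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]

def count_by_class(data, classes):
    """Count how many values fall in each class (both limits are inclusive)."""
    counts = [0] * len(classes)
    for value in data:
        for i, (lower, upper) in enumerate(classes):
            if lower <= value <= upper:
                counts[i] += 1  # one tally mark for this class
                break
    return counts

def tally_marks(count):
    """Write a count as tally marks; '(////)' stands for a crossed bunch of five."""
    full, rest = divmod(count, 5)
    groups = ["(////)"] * full
    if rest:
        groups.append("/" * rest)
    return " ".join(groups)

heights = [58, 25, 30, 64, 22, 76, 53, 47, 31, 66]
classes = make_classes(start=22, width=11, k=5)  # assumed grouping
frequencies = count_by_class(heights, classes)
for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: {tally_marks(f)} = {f}")
```

Each value is placed in exactly one class because the intervals are mutually exclusive, so the frequencies add up to the number of observations.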
2.1.1.4 Cumulative frequency
Cumulative frequency is the total of a frequency and all the frequencies before it in a frequency distribution.
It is the 'running total' of frequencies.
Example:
Score    Frequency    Cumulative Frequency
1        2            2
2        5            7
3        4            11
4        2            13
5        1            14
The cumulative frequency for score 4 is 2 + 5 + 4 + 2 = 13.
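The running total can be reproduced with a short sketch. It uses the scores and frequencies from the example above; `itertools.accumulate` is one convenient way to do this in Python, not a method prescribed by the notes.

```python
from itertools import accumulate

scores = [1, 2, 3, 4, 5]
frequencies = [2, 5, 4, 2, 1]

# accumulate() yields the running total of the frequencies
cumulative = list(accumulate(frequencies))

for score, f, cf in zip(scores, frequencies, cumulative):
    print(score, f, cf)
```

The last cumulative frequency always equals the total number of observations, which gives a quick check on the table.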
2.1.1.5 Class Boundaries
A class boundary is the midpoint between the upper class limit of a class and the lower class limit of the next class.
The amount to be added to an upper limit or subtracted from a lower limit is ½ the difference between the upper limit of one class and the lower limit of the following class.
Lower boundary of a class = (Upper limit of previous class + Lower limit of class) / 2
Upper boundary of a class = (Upper limit of class + Lower limit of next class) / 2
E.g. for class interval 40-49:
Lower class boundary = 40 - 0.5 = 39.5
Upper class boundary = 49 + 0.5 = 49.5
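As a minimal sketch of the boundary rule, the function below subtracts half the gap from every lower limit and adds it to every upper limit. The name `class_boundaries` is hypothetical, and the sketch assumes at least two classes with the same gap between every pair of neighbouring classes. The interval 40-49 comes from the example; the classes 50-59 and 60-69 are assumed neighbours added for illustration.

```python
def class_boundaries(classes):
    """Return (lower boundary, upper boundary) for each (lower limit, upper limit).

    Assumes at least two classes and the same gap between every pair of
    neighbouring classes, so the first and last classes can be treated alike.
    """
    # gap between the upper limit of one class and the lower limit of the next
    gap = classes[1][0] - classes[0][1]
    half_gap = gap / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in classes]

# 40-49 is the example from the text; 50-59 and 60-69 are assumed neighbours
classes = [(40, 49), (50, 59), (60, 69)]
boundaries = class_boundaries(classes)
print(boundaries)  # [(39.5, 49.5), (49.5, 59.5), (59.5, 69.5)]
```

The upper boundary of each class coincides with the lower boundary of the next one, so the boundaries leave no gaps between classes.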
2.1.1.6 Mid-point (Class mark)
The mid-point is the number in the middle of the class.
It can be calculated by using the following equation:
Mid-point = (Lower limit of class + Upper limit of class) / 2
2.1.1.7 Relative frequency
The relative frequency of an event is defined as the number of times that the event occurs during experimental trials, divided by the total number of trials conducted.
For a frequency distribution, it is computed by using:
Relative frequency = Frequency of each class / Total of frequency
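A short sketch of the relative frequency formula follows. It reuses the frequencies 2, 5, 4, 2, 1 from the score table in the cumulative frequency section; the function name and the conversion to percentages are illustrative additions.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total of the frequencies."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

frequencies = [2, 5, 4, 2, 1]  # from the score table; the total is 14
relative = relative_frequencies(frequencies)

# the same values expressed as percentages, rounded to one decimal place
percentages = [round(100 * r, 1) for r in relative]
print(percentages)  # [14.3, 35.7, 28.6, 14.3, 7.1]
```

The relative frequencies of all classes add up to 1 (or 100 percent, apart from rounding).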
2.2 Organize quantitative data
2.2.1 Construct histogram
A histogram is a graphical display of data using bars of different heights.
The histogram is constructed by using class boundaries and the frequencies of the classes.
The frequency is represented by the area of the bar. For equal class intervals, the area is proportional to the height of the bar, so the height can represent the frequency.
Steps to construct a histogram:
i- Identify the class boundaries and frequency for each class.
ii- Mark the class boundaries on the horizontal axis (x).
iii- Mark the frequency on the vertical axis (y).
iv- Draw a vertical bar to show the frequency for each class. (The height of each bar is based on the frequency of that class.)
Example:
Class boundaries    Frequency
20-30               2
30-40               4
40-50               4
50-60               5
60-70               3
70-80               1
80-90               0
90-100              1
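The table above can be turned into a rough text histogram. This is only a sketch using the Python standard library: the function name `text_histogram` is hypothetical, and the bars are drawn horizontally with `#` characters because plain text has no vertical bars. A plotting library would normally be used for a real chart.

```python
def text_histogram(boundaries, frequencies):
    """Return one text line per class; the bar length equals the frequency."""
    lines = []
    for (lower, upper), f in zip(boundaries, frequencies):
        label = f"{lower}-{upper}".ljust(7)
        lines.append(f"{label}| {'#' * f} ({f})")
    return lines

boundaries = [(20, 30), (30, 40), (40, 50), (50, 60),
              (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [2, 4, 4, 5, 3, 1, 0, 1]

lines = text_histogram(boundaries, frequencies)
for line in lines:
    print(line)
```

Because all the classes have the same width, the length of each bar is proportional to its frequency, and the class 80-90 appears as an empty bar.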
2.2.2 Construct frequency polygon
A frequency polygon is drawn by connecting the midpoints of every class with straight lines.
Steps for drawing a polygon:
i- Complete a histogram.
ii- Add two additional classes with zero frequency, one at each end of the histogram.
iii- Mark the midpoints of the tops of the histogram bars.
iv- Connect the midpoints using straight lines.
v- Make sure that the polygon starts and ends on the x-axis.
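The points of the polygon can be listed with a small sketch. It reuses the class boundaries and frequencies from the histogram example; the function name `polygon_points` is hypothetical, and giving each added zero-frequency class the same width as its neighbour is an assumption of this sketch.

```python
def polygon_points(boundaries, frequencies):
    """Return the (midpoint, frequency) points of a frequency polygon."""
    # midpoint of each class = (lower boundary + upper boundary) / 2
    points = [((lower + upper) / 2, f)
              for (lower, upper), f in zip(boundaries, frequencies)]
    # extra zero-frequency class at each end, same width as its neighbour
    first_width = boundaries[0][1] - boundaries[0][0]
    last_width = boundaries[-1][1] - boundaries[-1][0]
    start = (points[0][0] - first_width, 0)
    end = (points[-1][0] + last_width, 0)
    return [start] + points + [end]

boundaries = [(20, 30), (30, 40), (40, 50), (50, 60),
              (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [2, 4, 4, 5, 3, 1, 0, 1]

points = polygon_points(boundaries, frequencies)
print(points[0], points[1], points[-1])
```

The first and last points have zero frequency, which is what makes the polygon start and end on the x-axis.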
2.2.3 Construct ogive
An ogive is a graph of a cumulative distribution, which shows data values on the horizontal axis and either the cumulative frequencies, the cumulative relative frequencies or the cumulative percent frequencies on the vertical axis.
a- Draw ‘less than’ ogive
- The frequencies of all preceding classes are added to the frequency of a class.
- This series is called the less than cumulative series.
- It is constructed by adding the first-class frequency to the second-class frequency and then to the third-class frequency and so on.
- The downward cumulating results in the less than cumulative series.
- How to draw a less than ogive curve?
i- Draw and mark the horizontal and vertical axes.
ii- Take the cumulative frequencies along the y-axis (vertical axis) and the upper-class limits on the x-axis (horizontal axis).
iii- Against each upper-class limit, plot the cumulative frequencies.
iv- Connect the points with a continuous curve.
b- Draw ‘more than’ ogive
- The frequencies of the succeeding classes are added to the frequency of a class.
- This series is called the more than or greater than cumulative series.
- It is constructed by subtracting the first-class frequency from the total, the second-class frequency from that result, the third-class frequency from that and so on.
- The upward cumulating results in the greater than or more than cumulative series.
- How to draw a greater than or more than ogive curve?
i- Draw and mark the horizontal and vertical axes.
ii- Take the cumulative frequencies along the y-axis (vertical axis) and the lower-class limits on the x-axis (horizontal axis).
iii- Against each lower-class limit, plot the cumulative frequencies.
iv- Connect the points with a continuous curve.
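The two cumulative series can be computed with the sketch below, which gives the points to plot for each ogive. The function names are hypothetical, and the classes and frequencies are taken from the histogram example in section 2.2.1 purely for illustration.

```python
def less_than_series(classes, frequencies):
    """Pair each upper class limit with the running total of frequencies."""
    points, running = [], 0
    for (_, upper), f in zip(classes, frequencies):
        running += f
        points.append((upper, running))
    return points

def more_than_series(classes, frequencies):
    """Pair each lower class limit with the total minus all earlier frequencies."""
    points, remaining = [], sum(frequencies)
    for (lower, _), f in zip(classes, frequencies):
        points.append((lower, remaining))
        remaining -= f
    return points

classes = [(20, 30), (30, 40), (40, 50), (50, 60),
           (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [2, 4, 4, 5, 3, 1, 0, 1]

less_than = less_than_series(classes, frequencies)
more_than = more_than_series(classes, frequencies)
print(less_than)
print(more_than)
```

The less than series rises from the first frequency to the total, while the more than series falls from the total to the last frequency.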
Example: The marks of 50 students in the previous statistics examination are as follows:
19 21 30 22 37 23 69 65 26 76
47 36 31 71 35 17 29 70 33 16
24 27 37 25 27 21 33 25 73 75
27 20 40 20 24 25 65 41 24 63
23 42 17 65 25 18 23 18 46 25
Solution:
i- Determine the range.
R = Highest Value – Lowest Value
R = 76 – 16 = 60
ii- Determine the tentative number of classes (k).
k = 1 + 3.33 log n
= 1 + 3.33 log 50
= 1 + 3.33 (1.69897)
≈ 6.66
Round the result up to the next integer if the decimal part exceeds 0, so k = 7.
iii- Determine the class width (c).
c = Range / number of classes = R / k = 60 / 7 = 8.57 ≈ 9
Classes    Tally                Frequency
16 – 24    //// //// //// //    17
25 – 33    //// //// ////       14
34 – 42    //// //              7
43 – 51    //                   2
52 – 60                         0
61 – 69    ////                 5
70 – 78    ////                 5
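The worked example can be checked with a short script. This is a sketch under stated assumptions: it uses the coefficient 3.33 from the solution, rounds k and the class width up, starts the first class at the lowest mark, and treats the marks as whole numbers so that each upper limit is one less than the next lower limit.

```python
import math

marks = [19, 21, 30, 22, 37, 23, 69, 65, 26, 76,
         47, 36, 31, 71, 35, 17, 29, 70, 33, 16,
         24, 27, 37, 25, 27, 21, 33, 25, 73, 75,
         27, 20, 40, 20, 24, 25, 65, 41, 24, 63,
         23, 42, 17, 65, 25, 18, 23, 18, 46, 25]

# i- range: highest value - lowest value
data_range = max(marks) - min(marks)
# ii- tentative number of classes, rounded up to the next integer
k = math.ceil(1 + 3.33 * math.log10(len(marks)))
# iii- class width, rounded up to a whole number
width = math.ceil(data_range / k)

# class limits for whole-number marks, starting at the lowest value
start = min(marks)
classes = [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]

# frequency of each class: count the marks between its limits (inclusive)
frequencies = [sum(lower <= m <= upper for m in marks)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower} - {upper}: {f}")
```

The script reproduces the frequencies 17, 14, 7, 2, 0, 5 and 5 from the table, and they add up to the 50 marks.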