Just as there are rules of grammar in composition, there are rules of graphing that help to visualize data for your audience. A well-designed graph should not need much explanation because the graph itself should make the trends in the data visually apparent. A well-designed graph also avoids decoration that conveys no useful information, such as depth on bars in a 2-D plot. Each of the following terms carries an important meaning.
Dependent and Independent Variables
Imagine that we want to make a graph of the amount of rainfall that occurs at different times of year. Rainfall depends on time of year, but time of year does not depend on rainfall. Therefore, rainfall is the dependent variable and time of year is the independent variable. In some graphs, you may have more than one dependent variable, but never more than one independent variable. For example, you may overlap plots of rainfall in the desert and rainfall in the tropics against time of year, or you could graph inches of rainfall in 2005 and 2006 against time of year.
The Axes
The independent variable belongs on the x-axis (horizontal line) of the graph and the dependent variable belongs on the y-axis (vertical line). The x and y axes cross at a point referred to as the origin, where the coordinates are (0,0). In graphs with only positive values for x and y, the origin is in the lower left corner.
The Scale
Each axis needs a scale to show the range of the data on that axis. The low end of the scale may be zero or a round number value slightly smaller than the smallest data point. The high end of the scale is usually a round number value slightly larger than the largest data point. The scale is measured off in major and minor tick marks. Typically the scale runs from low to high in easily counted multiples like 10s, 50s, 100s, etc. When graphs are compared side-by-side, consider scaling them to the same data range to make comparisons easier.
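One way to pick round scale endpoints is to round the data range outward to whole multiples of the major tick step. The sketch below is only an illustration under that assumption; the function names `axis_limits` and `major_ticks` are hypothetical, and the step size is something you choose to suit your data.

```python
import math


def axis_limits(data_min, data_max, step):
    """Round the data range outward to whole multiples of `step`."""
    low = math.floor(data_min / step) * step
    high = math.ceil(data_max / step) * step
    return low, high


def major_ticks(low, high, step):
    """List the major tick values from `low` to `high`, inclusive."""
    count = round((high - low) / step)
    return [low + i * step for i in range(count + 1)]


# Data running from 3.2 to 47.5, counted in 10s.
low, high = axis_limits(3.2, 47.5, 10)
print(low, high)                     # 0 50
print(major_ticks(low, high, 10))    # [0, 10, 20, 30, 40, 50]
```

To put side-by-side graphs on the same scale, the same call can be made once with the smallest and largest values across all of the data sets.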
The Axis Labels
Each axis needs a descriptive axis label indicating which variable is represented. For example, the y-axis label might read "Total Rainfall" and the x-axis label might read "Month."
The Units
If you are measuring time, you must include the units as well as the numeric values so people will know if you are talking about seconds, minutes, hours, days, years, etc. If you are measuring rainfall, people won't know if you mean inches, millimeters, gallons, etc. unless you include the units. Units should be reported following the axis label, as in "Total Rainfall (inches)."
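If you build graphs in code, a small helper can keep the label-then-units convention consistent across axes. This is a minimal sketch; `axis_label` is a hypothetical name, not part of any plotting library.

```python
def axis_label(variable, units=None):
    """Return the variable name followed by its units in parentheses."""
    if units:
        return f"{variable} ({units})"
    return variable


print(axis_label("Total Rainfall", "inches"))  # Total Rainfall (inches)
print(axis_label("Month"))                     # Month
```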
The Data Points
Typically, each independent measurement represents a point on the graph. If there are multiple data sets being plotted on the same graph, each set should be represented by a unique symbol.
The Lines
Should you connect the data points with a line? Typically the answer is yes if the data points are part of a series of measurements of the same thing, for example over a period of time. The implication is that the values do not drop back to zero between measurements. However, if you are graphing a scatterplot, you might draw a trendline or regression line through the points, but would not connect each one. In some cases, the relationship may not be linear, but exponential or logarithmic, or some other mathematical function, so a curve might be more appropriate than a line. However, there should be a reason why a particular curve is chosen.
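For a scatterplot, a straight trendline is commonly fit by ordinary least squares. The sketch below assumes that method and a linear relationship; `fit_line` is a hypothetical helper name, and the sample numbers are made up purely to show the arithmetic.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of paired deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```

The trendline is then drawn from the fitted values at the smallest and largest x, rather than by joining the individual points.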
The Legend
The legend becomes important when you are graphing more than one dependent variable. For example, your legend might indicate that green lines or bars represent rainfall in the tropics while brown lines or bars represent rainfall in the desert region.
Use of Color
Colors or patterns should be used to help convey information, but should not be used simply for decoration. In the previous example, why were green and brown chosen? If the colors were reversed, would this be better or worse? Why?
Which Visual Representation?
The type of data you are presenting may be better suited for one kind of graph than another. For example, if your measurements are periodic samples of an ongoing event, like rainfall each day, then a line with points helps to convey that message. If, on the other hand, you are first averaging across distinct units of time like months, then bars might work better. If you are trying to visually display the pieces of a whole, a pie chart might be a good choice.
Error Bars
Each point on the graph might represent a single data point, or the average of a collection of measurements at that point. In the latter case, it would be appropriate to express the range of variation around that point with error bars. Typically the error around the mean is expressed as the standard deviation, but with small sample sizes, the standard error is sometimes used.
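The two error measures mentioned above can be computed with Python's standard library. This sketch assumes the sample standard deviation (n - 1 in the denominator) and takes the standard error to be that value divided by the square root of the sample size; `mean_with_error` is a hypothetical name and the measurements are made up.

```python
import math
import statistics


def mean_with_error(samples):
    """Return the mean, sample standard deviation, and standard error."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)        # sample standard deviation
    se = sd / math.sqrt(len(samples))     # standard error of the mean
    return mean, sd, se


# Five made-up repeated measurements at one point on the x-axis.
mean, sd, se = mean_with_error([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"mean={mean:.2f}  sd={sd:.2f}  se={se:.2f}")
```

Whichever measure is drawn, the caption or legend should state which one the error bars show, since the standard error is always the smaller of the two.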
The Title
The title should be a brief statement describing the subject of the graph, but should not describe or interpret the results.
What Not to Include
The graph should only include elements that enhance the interpretation, and there should be a minimum of visual adornment. For example, bars should not be 3-D unless the third dimension adds information.
An Example of a Properly Formatted Graph
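As a rough sketch of how the elements described above might come together in code, the following uses the matplotlib library, which is an assumption here since the resources below point to Excel and online tools. The rainfall values are invented for illustration and are not real measurements; the names `MONTHS`, `RAINFALL_INCHES`, `STYLES`, and `plot_rainfall` are hypothetical.

```python
# Illustrative, made-up monthly totals in inches; not real measurements.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
RAINFALL_INCHES = {
    "Tropics": [10.2, 9.1, 8.7, 7.4, 6.0, 4.8, 4.1, 4.5, 5.9, 7.8, 9.3, 10.6],
    "Desert": [0.9, 0.8, 0.6, 0.3, 0.2, 0.1, 0.4, 0.7, 0.5, 0.4, 0.6, 0.8],
}
# A unique symbol and a meaningful color for each data set.
STYLES = {
    "Tropics": {"marker": "o", "color": "green"},
    "Desert": {"marker": "s", "color": "saddlebrown"},
}


def plot_rainfall(path="rainfall.png"):
    """Draw both data sets on one set of axes and save the figure."""
    # Imported here so the data above can be inspected without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for region, values in RAINFALL_INCHES.items():
        ax.plot(MONTHS, values, label=region, **STYLES[region])
    ax.set_xlabel("Month")                    # independent variable
    ax.set_ylabel("Total Rainfall (inches)")  # dependent variable with units
    ax.set_ylim(0, 12)                        # round-number scale from zero
    ax.set_title("Monthly Rainfall in a Tropical and a Desert Region")
    ax.legend()                               # needed with two data sets
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_rainfall()` writes the image to `rainfall.png` in the current directory. The plot has no 3-D effects or other decoration, and the title names the subject without interpreting the result.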
How can a graph be graded? Here's a sample rubric:
- Dependent Variable
- Independent Variable
- Axes
- Scale
- Axis Labels
- Units
- Legend (if needed)
- Use of Color
- Visual Representation
- Title
Resources:
- How to make a graph in Microsoft Excel
- Download this example: rainfall.xls
- Online Create-A-Graph tool
- Make a graph with Plotly
- Information is Beautiful
References:
- Tufte, Edward. The Visual Display of Quantitative Information