This is part of a series of tutorials I’m putting together for my students.
So, you have collected some interesting data from your experiments. Since no one but you will be reading your lab notebook (but hopefully people could if they wanted to), you need to present that data in figures so the rest of the world can know what you did and decipher your results.
Deciding what data gets to be a manuscript figure
The purpose of your manuscript is to show evidence for a new conclusion, and the data presented in your figures should tell this story. Remember, the order of the figures in your manuscript may not be (and probably won’t be) the order in which you collected the data. So, once you have all of your figures assembled, print them out one per page and work on defining the best order of presentation to make the case for your new conclusion. Now is a good time to evaluate whether there are any potential weaknesses in the support for your conclusions, either in the data you already have or in data you may still need to collect. You want to present the strongest possible case before your manuscript is submitted for review, but everything is a cost-benefit analysis and you’re always against the clock.
At this point you may also notice that some of the data presented in figures may be tangential, not quite fit with the rest or break up the flow of the story you are trying to tell. Authors can decide to cull certain figures (sorry, data, you have to remain in the lab notebook) or move them to ‘Supplementary Figures.’ Many journals allow for the inclusion of Supplementary Material: extra figures, longer versions of methods, large tables of data or files that would never be appropriate for print format. These Supplementary Materials exist only as electronic files linked to your final accepted manuscript as it appears as a journal article. Each journal has a different policy for what is acceptable as Supplementary Material. Some are more inclusive: the more data the merrier, drag everything out of all authors’ lab notebooks. Others are very limiting: only essential data and files inappropriate for print are allowed, and anything else must be incorporated into the main body of the manuscript or cut out completely.
Preparing figures starts with high-quality data
Images should be of sufficient resolution. Any adjustments of brightness and contrast must be made to the entire image; adjusting selective portions is unethical data manipulation and scientific fraud. Cropping is OK, but again beware of excessive image manipulations. They are usually an indicator that you need to repeat the experiment to obtain the necessary data.
Experimental data should be free of technical errors or other artifacts. The results should come from experiments as described in the methods section. Consistency in following experimental protocols (and including all of those details in your notebook) should be standard lab practice. Controls must be performed for each experiment so that the results can be properly interpreted. As you evaluate the figures you have made from your data, check again to see if all necessary controls have been included. When in doubt, don’t skimp on this: repeat the experiment with the proper controls. Your co-authors and reviewers will likely tell you the same thing eventually.
Your experiments should also be repeated enough times for the results to be statistically meaningful. Note that this does not mean you repeat an experiment until you get the data you want. This ‘cherry-picking’ is another unethical manipulation of data. Unfortunately, this type of fraud is the most difficult for the peer-review system to catch. Reviewers have no way of knowing that you have a hundred other experimental trials with contradictory data in your lab notebook. Scientists must have the integrity to accurately present their results and have legitimate justifications for excluding some data (altered variables, confounding variables, improper controls, etc.). It is not always possible to show all repetitions of an experiment, and in some cases (like gels) it is not even feasible to average the results. Showing ‘representative data’ (a single instance of the most common result) is perfectly acceptable, but it should be just that: representative of your average results.
Don’t pursue perfect data at the expense of integrity. The rising standards of scientific work and competition for rewards based on that work create an enormous amount of pressure to compromise your integrity for the sake of publication. RESIST! Research fraud undermines our entire enterprise. Biological systems are inherently complex and imperfect; we should not expect results to be simple and pristine. Control for what you can when you can, but do not otherwise force data to yield a certain result.
Putting together figure files
Usually your data will consist of images or graphs. These electronic files must be edited to include the raw data as well as appropriate labels. The simplest way of doing this is to drop the images into a PowerPoint slide to assemble all the necessary parts. Text boxes can be used to add labels. Lines and arrows can be added to draw attention to certain features. All labels and features of your figures should be properly aligned using the program’s automatic alignment tools. More complex figures consist of multiple parts that are designated by letters (Ex: Figure 1A and Figure 1B), and these letters can be added as text boxes. Journals tend to have preferences for the exact labeling details (fonts, sizes, etc.), and the instructions to authors will have this information. Make sure you read this information carefully and apply it consistently across all figures. Don’t use Arial capital letters to label the parts of Figure 1, Calibri Roman numerals for Figure 3 and lowercase Times New Roman on Figure 5. You’re not in cloud cuckoo land. Editors, reviewers and other scientists appreciate consistency.
Remember that in the final manuscript format, the sizes of all labels and images will be considerably reduced. Make sure that your figure as submitted in manuscript form is sufficiently large so that it is still interpretable at a much smaller size. Any lines on graphs should be thick enough that they do not disappear or lose their pattern upon reduction. Note that it is generally easier to number samples (gel lanes, multipart images, etc.) than to write out the full sample description in the figure. Save the full sample names and descriptions for the figure legend.
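The reduction arithmetic is simple enough to check before you commit to a font size or line weight. Here is a minimal sketch in Python; the 10 in assembly width, the 3.5 in column width and the 6 pt target are made-up numbers for illustration, so substitute the values from your own journal’s instructions to authors.

```python
def size_after_reduction(size_pt, drawn_width_in, printed_width_in):
    """Return the printed size of a label or line drawn at size_pt."""
    return size_pt * printed_width_in / drawn_width_in


def size_needed_when_drawing(target_pt, drawn_width_in, printed_width_in):
    """Return the size to draw at so the element prints at target_pt."""
    return target_pt * drawn_width_in / printed_width_in


# Hypothetical numbers: a figure assembled 10 in wide, printed in a 3.5 in column.
DRAWN_WIDTH_IN = 10.0
PRINTED_WIDTH_IN = 3.5

printed_label = size_after_reduction(18, DRAWN_WIDTH_IN, PRINTED_WIDTH_IN)
draw_label_at = size_needed_when_drawing(6, DRAWN_WIDTH_IN, PRINTED_WIDTH_IN)
print(f"An 18 pt label prints at about {printed_label:.1f} pt")
print(f"To print at 6 pt, draw the label at about {draw_label_at:.1f} pt")
```

The same scaling applies to line thickness: a 1 pt graph line drawn at that width would print at roughly a third of a point.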
When available, move up in the food chain to a program like Adobe Photoshop, Adobe Illustrator or CorelDRAW to put together figure images. These programs have a steeper learning curve, but offer more sophisticated options for putting the figure file together and saving it as a high-resolution image. For many journals, your figures must be submitted as image files (usually .tiff) or as PDF pages. Most journals use the manuscript submission phase as their quality control phase, meaning the files you submit for review must be of sufficient quality for the manuscript proof. Speaking of higher quality software, programs like OriginLab and KaleidaGraph are much better at generating publication-quality graphs than Microsoft Excel.
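If you are not sure whether an image file is ‘high resolution’ enough, the check is just pixels divided by printed inches. The sketch below assumes a 300 dpi target and a 3.5 x 2.5 in panel purely as examples; your journal’s instructions to authors are the real authority on both numbers.

```python
import math


def pixels_needed(width_in, height_in, dpi):
    """Return the (width, height) in pixels needed to print at the given dpi."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


def printed_dpi(pixel_width, printed_width_in):
    """Return the effective resolution of an image printed at a given width."""
    return pixel_width / printed_width_in


# Hypothetical target: a 3.5 x 2.5 in panel at 300 dpi.
print(pixels_needed(3.5, 2.5, 300))

# A 1024-pixel-wide screenshot printed 7 in wide falls well short of 300 dpi.
print(round(printed_dpi(1024, 7.0)))
```

Note that enlarging a small image in software adds pixels but not information, so the fix for a low number here is usually to re-export or re-acquire the image at a higher resolution.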
Color vs. Black/White or Grayscale Figures
Journals will typically charge you more to print color figures than black and white or grayscale images. (Oh, so yeah, if you didn’t get the memo, the authors typically pay publication charges to cover the printing and/or access for the published work. But then again, if you’ve gotten this far, you’ve realized you’re not in science for the money.) When possible, use black and white or grayscale figures. If graphs become too complicated in monotone, try splitting the samples across more than one graph. Of course, you shouldn’t completely eschew color. Use it when it is most appropriate to distinguish samples. For example, it’s not that big of a deal to show a Coomassie-stained gel in black and white, but pictures of Arabidopsis showing wild-type and mutant plants with varying degrees of pigmentation should definitely be in color. Finally, as part of the ‘use color judiciously’ rule, stick with the basic colors (8 Crayola box, not the 196) so that there are no incompatibilities or unexpected shifts in tone when transferring files or changing file types. Also keep in mind that red-green colorblindness is common (roughly 1 in 12 men and 1 in 200 women), so avoid using these colors together to differentiate between key samples. (Hey, after #TheDress this week, maybe you should just avoid color altogether.)
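One rough way to check whether two graph colors will survive a grayscale printout is to compare their gray values. The sketch below uses the standard Rec. 601 luma weights to convert RGB to gray; the cutoff of 40 gray levels is an arbitrary assumption of mine, and this is not a colorblindness simulation, just a quick sanity check.

```python
def luma(rgb):
    """Return the approximate gray level (0 to 255) of an (R, G, B) color."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights


def distinguishable_in_gray(color_a, color_b, min_gap=40):
    """Return True if two colors differ by at least min_gap gray levels.

    min_gap is an arbitrary illustrative threshold, not a published standard.
    """
    return abs(luma(color_a) - luma(color_b)) >= min_gap


RED = (255, 0, 0)
GREEN = (0, 128, 0)
BLUE = (0, 0, 255)

print("red vs green:", distinguishable_in_gray(RED, GREEN))
print("red vs blue:", distinguishable_in_gray(RED, BLUE))
```

Pure red and a medium green come out at nearly the same gray level, which is one more reason not to rely on that pair to separate key samples.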
The figure heading, title and legend
Each figure should have a heading as defined by the journal (Ex: ‘Figure 1.’, ‘Fig. 1’ or ‘Figure 1:’). Each figure should also have a title, the formatting of which may be explicitly defined by the journal. It may be required to be in the form of a complete sentence or just a concise phrase; it may be required to be in bold or italics to distinguish it from the legend. The legend should tell the reader what they are looking at. It is not necessary to include lengthy procedural details, but it is useful to mention the name of the experiment and any details about treatment or sample preparation that help in interpreting the data in the figure. It should define all parts shown. Every sample or label on the figure must be defined in the legend.
Other random and lesser commandments
- Thou shalt be consistent across all figures.
- Thou shalt not use yellow for graph lines.
- Thou shalt include error bars.
- Thou shalt have elements sized appropriately relative to one another.
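On the error bar commandment: the numbers behind the bars are quick to compute from your replicates. This is a minimal sketch using Python’s standard library; the three measurements are made up for illustration, and whether you plot the standard deviation (SD) or the standard error of the mean (SEM) depends on your field and journal.

```python
import math
import statistics


def summarize_replicates(values):
    """Return (mean, sample SD, SEM) for a list of replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates to estimate spread")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation (n - 1)
    sem = sd / math.sqrt(len(values))  # standard error of the mean
    return mean, sd, sem


replicates = [4.0, 5.0, 6.0]  # made-up measurements for illustration only
mean, sd, sem = summarize_replicates(replicates)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f} (n = {len(replicates)})")
```

Whichever you choose, say so in the figure legend along with the number of replicates, since SD and SEM bars on the same data can look very different.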
Include your figure and figure legend tips and lesser commandments below.