{"id":320,"date":"2019-01-17T16:17:40","date_gmt":"2019-01-17T16:17:06","guid":{"rendered":"http:\/\/www.lib.uiowa.edu\/data\/?page_id=320"},"modified":"2026-02-16T23:33:54","modified_gmt":"2026-02-16T23:33:54","slug":"data-structure","status":"publish","type":"page","link":"http:\/\/www.lib.uiowa.edu\/data\/manage\/data-structure\/","title":{"rendered":"Spreadsheet Data Structure"},"content":{"rendered":"<p><strong><a href=\"#best\">Best Practices<\/a> | <a href=\"#tools\">Helpful Tools\u00a0<\/a><\/strong><\/p>\n<p id=\"best\">Well-organized spreadsheets, sometimes referred to as \u201ctidy data,\u201d make data analysis and visualization significantly easier and save the researcher time. Use these best practices when setting up your spreadsheet and inputting your data and you will spend less time cleaning and re-organizing your data later.<\/p>\n<p>In addition to the recommendations on this page, creating a data dictionary can help as well. See our <a href=\"https:\/\/www.lib.uiowa.edu\/data\/manage\/documenting\/#dictionaries\" target=\"_blank\" rel=\"noopener\">information about data dictionaries<\/a> for specifics.<\/p>\n<p>Each recommendation below is numbered. See corresponding examples of \u201cmessy data\u201d in the images that follow, as well as the <a href=\"#tidy\">\u201ctidy data\u201d example at the bottom of the page.<\/a><\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2214\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/1.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Make your Spreadsheet a Rectangle<\/h3>\n<p>Use only simple rows and columns for data so the spreadsheet is a single rectangle. Don\u2019t have multiple tables on one spreadsheet and don&#8217;t add graphs, images, etc. on the same sheet. Instead use one table on each spreadsheet, with columns as variables and rows as observations.<\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2217\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/2.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Put One Piece of Information in a Cell<\/h3>\n<p>Don\u2019t put more than one thing in a cell; it makes sorting and analysis difficult. Put each data point (even\u00a0 measurement units) in separate cells or provide the units in the column heading and offer a definition in your <a href=\"https:\/\/www.lib.uiowa.edu\/data\/manage\/documenting\/#dictionaries\" target=\"_blank\" rel=\"noopener\">data dictionary<\/a>. If you need to make notes about data in a specific cell, create a \u201cnotes\u201d column in your spreadsheet.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-2209\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-300x154.png\" alt=\"\" width=\"988\" height=\"507\" srcset=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-300x154.png 300w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-1024x527.png 1024w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-768x395.png 768w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-1536x790.png 1536w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData1-2048x1053.png 2048w\" sizes=\"auto, (max-width: 988px) 100vw, 988px\" \/><\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2219\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/3.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Be Consistent with Variables<\/h3>\n<p>When entering data, be consistent with how you enter variables into the spreadsheet. For instance, always use \u201cmale\u201d for <em>male<\/em>, rather than sometimes using <em>M<\/em>, <em>m<\/em>, or <em>Male<\/em>.<\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2220\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/4.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Watch the Date Format<\/h3>\n<p>Use the date format YYYY-MM-DD (also known as <a href=\"https:\/\/www.iso.org\/iso-8601-date-and-time-format.html\" target=\"_blank\" rel=\"noopener\">ISO 8601 <i class=\"fas fa-external-link-alt\"> <\/i><\/a>) or use separate columns for Year, Month, and Day. This reduces confusion for an international audience and avoids problems with how Microsoft Excel treats dates. (More information about formatting dates in Excel can be found <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/format-a-date-the-way-you-want-8e10019e-d5d8-47a1-ba95-db95123d273e\" target=\"_blank\" rel=\"noopener\">in the support documentation on their website. <i class=\"fas fa-external-link-alt\"> <\/i><\/a>)<\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2221\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/5.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Fill in all Cells<\/h3>\n<p>There are different schools of thought on how to treat missing data (see <a href=\"#references\">Broman &amp; Woo and Hook, et al articles linked at the bottom of the page for more info<\/a>). In general, best practice is to use a common code for missing data, such as a hyphen (&#8211;) or NA, to make clear the data is missing rather than unintentionally left blank. Don&#8217;t just leave the cell blank unless you define this in the data dictionary. Make a separate column for notes if you need to explain why data is missing.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-2210\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-300x163.png\" alt=\"\" width=\"981\" height=\"533\" srcset=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-300x163.png 300w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-1024x557.png 1024w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-768x418.png 768w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-1536x835.png 1536w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData2-2048x1113.png 2048w\" sizes=\"auto, (max-width: 981px) 100vw, 981px\" \/><\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2222\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/6.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Avoid Special Characters and Extra Spaces<\/h3>\n<p>Special characters like @, *, and ^ mean specific things in programming languages, so using them in spreadsheets makes the data hard to work with later on. Similarly, spaces in column headings often require extra work when referencing them in programming scripts or code, so it&#8217;s best to use underscores to separate words (like_this) or use camel case (LikeThis).<\/p>\n<h3><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2223\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/7.jpg\" alt=\"\" width=\"29\" height=\"29\" \/>Don\u2019t Use Colors, Fonts, Italics, or Visuals as Data<\/h3>\n<p>Colors, fonts, italics, and visuals are either ignored by the computer or just confuse it. It doesn\u2019t know a bolded row is suspicious data, or a yellow row should be ignored. If you need to emphasize data to make it easier for a human reader, add another column with an indicator variable (e.g., \u201ctrusted\u201d with values of TRUE or FALSE.)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-2211\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData3-300x161.png\" alt=\"\" width=\"995\" height=\"534\" srcset=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData3-300x161.png 300w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData3-768x411.png 768w\" sizes=\"auto, (max-width: 995px) 100vw, 995px\" \/><\/p>\n<h3 id=\"tidy\">Make Your Data Tidy<\/h3>\n<p>The image below shows best practices for spreadsheet data: it&#8217;s a single rectangle, there&#8217;s one piece of information in each cell, variables are entered consistently, date format is YYYY-MM-DD, missing data uses a common code, no special characters or extra spaces are used, and it doesn&#8217;t use color, font, or italics to communicate information.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-2212\" src=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-300x169.png\" alt=\"\" width=\"985\" height=\"555\" srcset=\"https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-300x169.png 300w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-1024x576.png 1024w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-768x432.png 768w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-1536x864.png 1536w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-2048x1151.png 2048w, https:\/\/www.lib.uiowa.edu\/data\/files\/2023\/04\/MessyDataTidyData4-195x110.png 195w\" sizes=\"auto, (max-width: 985px) 100vw, 985px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h3 id=\"tools\"><b><span data-contrast=\"none\">Helpful Tools<\/span><\/b><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;201341983&quot;:0,&quot;335559738&quot;:40,&quot;335559739&quot;:0,&quot;335559740&quot;:259}\">\u00a0<\/span><\/h3>\n<h4><b><span data-contrast=\"none\">OpenRefine<\/span><\/b><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/h4>\n<p><a href=\"https:\/\/openrefine.org\/\" target=\"_blank\" rel=\"noopener\"><span data-contrast=\"none\">OpenRefine<\/span> <i class=\"fas fa-external-link-alt\"> <\/i><\/a><span data-contrast=\"none\"> is a free, open-source tool for Windows, Mac, and Linux you can use to review and clean up (tidy) spreadsheet data, split or combine data columns, and insert linked data from other sources. <\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/openrefine.org\/download\" target=\"_blank\" rel=\"noopener\"><span data-contrast=\"none\">See the download and installation instructions here.<\/span> <i class=\"fas fa-external-link-alt\"> <\/i><\/a> <span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"none\">The Libraries provide individual, team, and class-based training on OpenRefine (request via\u00a0<\/span><a href=\"mailto:lib-data@uiowa.edu\" target=\"_blank\" rel=\"noopener\"><span data-contrast=\"none\">lib-data@uiowa.edu<\/span><\/a><span data-contrast=\"none\">)<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<h4><b><span data-contrast=\"none\">REDCap<\/span><\/b><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/h4>\n<p><span data-contrast=\"none\">REDCap is a useful tool for creating and defining a data structure by basing it on a codebook or survey tool. Use REDCap for data entry for tabular data and surveys. It can be set up for HIPAA data and is frequently used for biomedical research.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"none\">REDCap is supported by the Institute for Clinical and Translational Sciences (ICTS). Request an account <\/span><a href=\"mailto:redcap@icts.uiowa.edu?subject=REDCap%20Help%20Request&amp;body=Please%20provide%20the%20following%20information:%0d%0dProject%20Name-%0dProject%20ID-%0dDescribe%20request-\" target=\"_blank\" rel=\"noopener\"><span data-contrast=\"none\">by emailing ICTS<\/span><\/a><span data-contrast=\"none\">.<\/span><span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"none\">REDCap was created by Vanderbilt University. For more info, see their <\/span><a href=\"https:\/\/projectredcap.org\/resources\/videos\/\" target=\"_blank\" rel=\"noopener\"><span data-contrast=\"none\">introductory videos.<\/span> <i class=\"fas fa-external-link-alt\"> <\/i><\/a> <span data-ccp-props=\"{&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}\">\u00a0<\/span><\/p>\n<h3 id=\"references\">References<\/h3>\n<p>If you&#8217;d like more information on best practices for organizing spreadsheet data, the following resources may help:<\/p>\n<ul>\n<li><span class=\"authors\">Broman, K.W., and Woo, K.H.<\/span>\u00a0<span class=\"date\">(2018)<\/span> <span class=\"art_title\">Data Organization in Spreadsheets, <\/span><span class=\"serial_title\"><em>The American Statistician<\/em>,<\/span>\u00a0<span class=\"volume_issue\">72:1,<\/span>\u00a0<span class=\"page_range\">2-10,<\/span>\u00a0<span class=\"doi_link\">DOI:\u00a0<a href=\"https:\/\/doi-org.proxy.lib.uiowa.edu\/10.1080\/00031305.2017.1375989\" target=\"_blank\" rel=\"noopener\">10.1080\/00031305.2017.1375989 <i class=\"fas fa-external-link-alt\"> <\/i><\/a><\/span><\/li>\n<li><span class=\"NormalTextRun SCXW191441781 BCX8\">Hook L.A., <\/span><span class=\"NormalTextRun SCXW191441781 BCX8\">Vannan<\/span><span class=\"NormalTextRun SCXW191441781 BCX8\"> S.K.S., Beaty T.W., Cook R.B., Wilson B.E. (2010) <a href=\"https:\/\/citeseerx.ist.psu.edu\/document?repid=rep1&amp;type=pdf&amp;doi=605fde97ef080d09fd9ee4d008ddf9a480a6389c\" target=\"_blank\" rel=\"noopener\">Best Practices for Preparing Environmental Data Sets to Share and Archive <i class=\"fas fa-external-link-alt\"> <\/i><\/a>. \u00a0<\/span><\/li>\n<li>Wickham, Hadley. (2014) Tidy data, <em>The Journal of Statistical Software<\/em>, 59:10, 1-23, DOI: <a href=\"https:\/\/doi.org\/10.18637\/jss.v059.i10\" target=\"_blank\" rel=\"noopener\">0.18637\/jss.v059.i10 <i class=\"fas fa-external-link-alt\"> <\/i><\/a>\n<ul>\n<li>(See <a href=\"https:\/\/tidyr.tidyverse.org\/articles\/tidy-data.html\" target=\"_blank\" rel=\"noopener\">this website <i class=\"fas fa-external-link-alt\"> <\/i><\/a> for a shorter, less formal version of the paper above).<\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/datacarpentry.org\/semester-biology\/materials\/tidy-data\/\" target=\"_blank\" rel=\"noopener\">Data Carpentry for Biologists <i class=\"fas fa-external-link-alt\"> <\/i><\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Best Practices | Helpful Tools\u00a0 Well-organized spreadsheets, sometimes referred to as \u201ctidy data,\u201d make data analysis and visualization significantly easier and save the researcher time. Use these best practices when [&hellip;]<\/p>\n","protected":false},"author":124,"featured_media":0,"parent":471,"menu_order":7,"comment_status":"closed","ping_status":"closed","template":"pagetpl-data.php","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[52],"tags":[107,108,103,64,124,125,22],"class_list":["post-320","page","type-page","status-publish","hentry","category-managing","tag-data-organization","tag-organization","tag-organizing-data","tag-research-data","tag-spreadsheet","tag-structure","tag-tools","managing","","data","wp-json","wp","v2","pages","320"],"_links":{"self":[{"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/pages\/320","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/users\/124"}],"replies":[{"embeddable":true,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/comments?post=320"}],"version-history":[{"count":27,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/pages\/320\/revisions"}],"predecessor-version":[{"id":3179,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/pages\/320\/revisions\/3179"}],"up":[{"embeddable":true,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/pages\/471"}],"wp:attachment":[{"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/media?parent=320"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/categories?post=320"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.lib.uiowa.edu\/data\/wp-json\/wp\/v2\/tags?post=320"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}