The Smart Chef

Extract cell colors from Excel

There are several reasons that it may be helpful to extract color information from an excel sheet using R. In my case, I was looking to exclude rows shaded red from analysis. Here are some other reasons it may be helpful:

  • Data analysis: If you're working with a large dataset in Excel that uses color to highlight important information, you might want to extract the color data and use it in your R analysis. For example, you might want to create a scatter plot where the color of each point represents a certain category or data point.
  • Reporting: If you're generating reports or visualizations in R, you may want to match the colors used in your Excel spreadsheets. Extracting the color data allows you to create a consistent color scheme across different platforms.
  • Automation: If you have a large number of Excel files with color data that you need to analyze, extracting the color data using R can save you time and effort compared to manually copying and pasting the color data.
  • Comparison: If you have two Excel files with similar data and different color schemes, you might want to extract the color data and compare the two color schemes side by side.
  • In this example, I am looking for cells shaded Red (FFFF0000) in column F.

      
    
    library(tidyxl)
    formats <- xlsx_formats(path)
    my_colors <- my_data$local$fill$patternFill$fgColor$rgb
    
    data_file <- readxl::read_excel(path)
    x <- xlsx_cells(path)
    
    ##FFFF0000 - Red
    x %>%
      filter(local_format_id %in%
               which(formats$local$fill$patternFill$fgColor$rgb == "FFFF0000")) %>%
      select(address, data_type) ->
      location
    
    red_cells <- data.table(location)
    
    red_cells$col <- substr(red_cells$address,1,1)
    red_cells$row <- as.numeric(gsub("[^0-9]+", "", red_cells$address))
    red_cells <- subset(red_cells, col == 'F')
    ## Add one to row to account for column header
    red_cells$row <- red_cells$row - 1