R: How to transform info like 61 years 02 months to total number of months?

in #hive-138200last year

I have a task where I need to transform info like 61 years 02 months into total number of months in a dataframe column.

To do this, I use the following libraries:

library(dplyr)
library(stringr)

To see a sample data frame, you can run this code.

df <- data.frame(date_str = c("61 years 04 months", "10 years", "7 years 09 months"))

Screenshot 2023-10-14 at 9.31.49 AM.png

The following code will create a new column with the total number of months.

df <- df %>%
  mutate(years = as.numeric(str_extract(date_str, "\\d+(?= years)")),
         months = as.numeric(str_extract(date_str, "\\d+(?= months)")),
         years = ifelse(is.na(years), 0, years), # Set NAs to 0 for years
         months = ifelse(is.na(months), 0, months), # Set NAs to 0 for months
         total_months = years*12 + months) %>%
  select(-years, -months) # Remove intermediate columns if you don't need them

print(df)

The magic is with the str_extract(date_str, "\d+(?= years)" where the year value is extracted.

To cater to cases where there is 10 years and no months, the following is used: ifelse(is.na(months), 0, months).

This derives the total months: 'total_months = years*12 + months'.

Screenshot 2023-10-14 at 9.31.02 AM.png

snippets.png