I have a task where I need to transform info like 61 years 02 months into total number of months in a dataframe column.
To do this, I use the following libraries:
library(dplyr)
library(stringr)
To see a sample data frame, you can run this code.
df <- data.frame(date_str = c("61 years 04 months", "10 years", "7 years 09 months"))
The following code will create a new column with the total number of months.
df <- df %>%
mutate(years = as.numeric(str_extract(date_str, "\\d+(?= years)")),
months = as.numeric(str_extract(date_str, "\\d+(?= months)")),
years = ifelse(is.na(years), 0, years), # Set NAs to 0 for years
months = ifelse(is.na(months), 0, months), # Set NAs to 0 for months
total_months = years*12 + months) %>%
select(-years, -months) # Remove intermediate columns if you don't need them
print(df)
The magic is with the str_extract(date_str, "\d+(?= years)" where the year value is extracted.
To cater to cases where there is 10 years and no months, the following is used: ifelse(is.na(months), 0, months).
This derives the total months: 'total_months = years*12 + months'.