Automating tasks with python is fun and saves a lot of time. One of the commonly used document types is excel files. Microsoft Excel is a powerful software that can help with finances, accounting, data sharing and analysis. Things that can be done with Excel range from simple personal finance records to complicated data documents. While it is a very useful software, doing repetitive tasks over and over gets boring and time consuming. That's why using automating tools like OpenPyXL can be more fun, productive, and efficient.
Since Excel and Python both have been around for a while, there are many libraries that make automating excel with python possible. Lately, I have been using OpenPyXL to extract data from excel files and convert them into reports in new excel files. In fact, I spent all day on a project today that I have been planning for weeks now. Now that the project is complete, it will save hours every week and I happy to share here how awesome OpenPyXL is.
OpenPyXL is a python library to create, read, and edit Excel files. It is easy to use, has easy documentation, and many resources online to find solutions when we face problems with our code. To get started with OpenPyXL we pip install the library and start writing code.
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
Workbook is the main object that creates our excel file in memory. Then we can go to the active worksheet and start reading or editing it. As you may already know worksheets consist of columns, rows, and cells.
If we want to add a value to a cell, we just do the following.
ws['A1'] = 'Hive'
It write Hive into cell B5, which is located in column B and 5th row.
If we want to add entire row of values, we can do the following.
ws.append(['HBD','pays', '20%', 'APR'])
This will append four value shown in the list above after the previous value we entered. I knot these are very simple things to do. Experimenting with simple tasks, we continue making the tasks more complicated with using functions like merging cells, assigning fonts, size, colors, fill colors, and much more.
Let's say we want to define width for the columns we use, we can do so as following:
ws.column_dimensions['A'].width = 25
ws.column_dimensions['B'].width = 16
ws.column_dimensions['C'].width = 16
ws.column_dimensions['D'].width = 16
ws.column_dimensions['E'].width = 25
The best thing about automating with python is once we know what we need to do, and have an understanding of applying various formats and styles to one cell, we can iterate through thousands of them and repeat the same set of tasks.
To define fonts, fill colors, text alignment, borders to the our spreadsheet cells we need import other useful modules of the opepyxl.
from openpyxl.styles import PatternFill, Border, Side, Alignment, Font
Let's say we would like apply some of these to cell 'C3'.
ws['C3'].font = Font(size=24, bold=True)
ws['C3'].fill = PatternFill(start_color='FF76B5C5', end_color='FF76B5C5', fill_type='solid')
ws['C3'].alignment = Alignment(horizontal='center', vertical='center')
ws['C3'].border = Border(left=Side(style='thin'),
right=Side(style='thin'),
top=Side(style='thin'),
bottom=Side(style='thin')
Again, once we can do anything with one cell, we can do the same with thousands of them by iterating through the range of cells. Moreover, it become super simple once we create helper functions to do so based on the tasks we are trying to achieve.
These are only some of the examples what can be done with openpyxl. There are many more super useful functions it has. Any that can be done on openpyxl, can be done on excel. We use this library to automate the tasks.
When working with excel files we may want to create new worksheets, and we can do that as folllowing:
ws = wb.create_sheet('Balance')
If we wanted to access this or other sheets in the active workbook we do so like this:
ws = wb['Balance']
Once we are done with are project we can save the file.
wb.save(filepath)
The real fun with OpenPyXL starts we actually try to solve a problem, and automate real life tasks. It is very good at it. If you use excel files on daily basis and know a little bit of python, I would highly recommend to consider automating repetitive tasks and save some time.
Some may consider using dataframes and csv files. These are really good solutions. But if you need something with properly styled documents to share with clients, teams, or colleagues you would have to spend more time applying them manually. OpenPyXL can help with doing this for you.
One of the ideas for a Hive based project can be, creating a tool that gets wallet transfers for an account, creates an excel file with all the data and formatting and have it ready to be downloaded. If this is something you may be interested in, I can go through that process in the future or maybe even include this to my Hive Librarian App.
Have you used OpenPyXL or other libraries that help with automating excel worksheets? What are your thoughts on automating repetitive tasks? Let me know in the comments.
Posted Using LeoFinance Beta