There is a way to iterate over Excel tables with OpenPyXL. The functionality has changed in newer...
How to generate an XML file from Excel with Python
In this post, we will take a look into how we can generate Extensible Markup Language (XML) files from an Excel file with Python. We will be using the Yattag package to generate our XML file and the OpenPyXL package for reading our Excel data.
Yattag is described in its documentation as following:
Yattag is a Python library for generating HTML or XML in a pythonic way.
That pretty much sums Yatttag up, I find it as a simple, easy to use library that just works. I had been searching for this kind of library in order to more easily generate different XML files.
To install Yattag with pip:
pip install yattag
Adding a tag with Yattag is as easy as using the
Tags are automatically closed. To use Yattag we need to import Doc from Yattag and create our Doc, tag and text with Doc().tagtext().
OpenPyXL is a library for interacting with Excel 2010 files. OpenPyXL can read and write to .xlsx and .xlsm files.
To install OpenPyXL with pip:
pip install openpyxl
To load an existing workbook in OpenPyXl we need to use the
load_workbook method. We also need to select the sheet we are reading the data from. In our example, we are using popular baby names in New York City. You can access the dataset from the link at the bottom of this post.
I have created a workbook named NY_baby_names.xlsx with one sheet of data, Sheet1. The worksheet has the following headers: Year of Birth, Gender, Child's First Name, Count, Rank. You can download the Excel file from my website here.
To access the data with OpenPyXL, do:
['Year of Birth', 'Gender', "Child's First Name", 'Count']
[2011, 'FEMALE', 'GERALDINE', 13]
First, we load the workbook with
load_workbook, and then select the first worksheet. We then iterate through the first two rows with the
Generating the XML from Excel
After the imports, we load the workbook and the worksheet. We then create our Yattag document. We fill the headers with Yattags
asis() method. The asis method enables us to input any string as the next line.
We then create our main tag, Babies. We loop through our sheet with the
iter_rows method. The
iter_rows method returns a generator with all the cells. We use a list comprehension to get all the values from the cells.
Next, we are adding the babies. Notice the use of the With tag and text. When we are finished we indent our result with Yattags indent method.
Finally, we save our file. The output should look like below. Notice that I only included two babies in the output.
Popular Baby Names dataset: https://catalog.data.gov/dataset/most-popular-baby-names-by-sex-and-mothers-ethnic-group-new-york-city-8c742