In this article, we are going to present a custom XLSX parser based on Java annotations. The application will be using the Apache POI library to read the structure of the Excel file and then map all entries into the POJO object.
2. Setup project
The project will use several dependencies from apache-poi. We also used Lombok to generate common methods for POJO classes like setters, getters, and constructors.
The pom.xml file will contain the following items:
The latest versions of used libraries could be found in our Maven Repository:
The project is organized in the following structure:
4. XLSX Parser classes
We can distinguish three classes responsible for processing XLSX files:
XLSXField - custom annotation used to mark field in POJO class and connect it with the column from XLSX document,
XLSXHeader - is a wrapper class used to hold information about fields, columns, and columns positions in the Excel file,
XLSXParser - the main class responsible for parsing XLSX files and set values into POJO objects.
Starting with the XLSXField - which is an annotation used to connect column with a field in Java class:
The XLSXHeader is a helper class that wraps data like class field name, column name from XLSX file, and column index:
And finally the base utility class responsible for parsing Excel files - XLSXParser:
Note that for the sake of simplicity we assume that the first row in an Excel file will be our table header, and also the first sheet will contain all the data:
The parse(...) method is responsible for reading XLSX documents using the Apache POI library and parsing values from it into a specific Java model instance.
In the first step, we create a list of helper classes that contains all the necessary information like field name, related column, and column index:
In modelObjectToXLSXHeader() method we iterate over all declared in specified class fields and filter all marked with XLSXField annotation. Then we are searching for a column in an Excel file with the same name as provided in the annotation column property. The method returns the list of wrapped objects XLSXHeader.
The createRowObject() method creates an instance of a specified POJO object and filled the fields with values according to the data from the XLSX file.
As a result parse() method returns a list of POJO objects.
5. Run and test a sample program
In order to test the implementation of the XLSX parser, we created a simple object with three fields marked with @XLSXField annotation:
Fields will be related with the following column from the Excel file:
Field
Column
title
Title
content
Content
url
URL
Our sample XLSX file with have the following structure:
The sample Java program that will parse the posts.xlsx file located in the project resource folder looks as follows:
Here:
first, we created an instance of XLSXParser dedicated for the Post class,
in the next step we retrieved data from the posts.xlsx file,
then file stream is used to parse the file using the XLSXParser.parse(...) method.
As a result, we have a list of three POJO instances that correspond to the rows in the Excel document:
6. Conclusion
In this article, we presented how to create an XLSX parser based on annotations in Java using the Apache POI library. Note that we showed a simple example of an Excel file, also we convert all values into strings, but it a good point to start some more complex functionality.
As always the code used in this article is available in our GitHub repository.
{{ 'Comments (%count%)' | trans {count:count} }}
{{ 'Comments are closed.' | trans }}