tabula-py is a simple Python wrapper for tabula-java that reads tables from a PDF into pandas DataFrames.
Installation
pip install tabula-py
Usage
import tabula

# Read a PDF into a list of DataFrames
df = tabula.read_pdf("test.pdf", pages="all")

# Read a remote PDF into a list of DataFrames
df2 = tabula.read_pdf("https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf")

# Convert a PDF into a CSV file
tabula.convert_into("test.pdf", "output.csv", output_format="csv", pages="all")

# Convert all PDFs in a directory
tabula.convert_into_by_batch("input_directory", output_format="csv", pages="all")
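Because read_pdf returns a list of DataFrames rather than a single one, a common follow-up is to save each table separately. The sketch below is one possible way to do that, not part of tabula-py itself: the helper names save_tables and extract_to_csv_files, the output directory, and the file naming scheme are all hypothetical. It assumes tabula-py and a Java runtime are available when extract_to_csv_files is called.

```python
from pathlib import Path


def save_tables(tables, out_dir, stem="table"):
    """Write each table to its own CSV file and return the paths written.

    `tables` is any iterable of objects with a pandas-style `to_csv` method,
    such as the list of DataFrames returned by tabula.read_pdf.
    (Hypothetical helper, not part of tabula-py.)
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for index, table in enumerate(tables, start=1):
        # Files are numbered in the order the tables were returned.
        target = out_path / f"{stem}_{index}.csv"
        table.to_csv(target, index=False)
        written.append(target)
    return written


def extract_to_csv_files(pdf_path, out_dir):
    """Read every table in a PDF and save one CSV per table.

    (Hypothetical helper; assumes tabula-py and Java are installed.)
    """
    import tabula  # imported lazily so save_tables works without tabula-py

    tables = tabula.read_pdf(pdf_path, pages="all")
    return save_tables(tables, out_dir, stem=Path(pdf_path).stem)
```

Calling extract_to_csv_files("test.pdf", "tables") would write tables/test_1.csv, tables/test_2.csv, and so on, one file per extracted table. This differs from convert_into, which writes a single output file for the whole PDF.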