java - Is there a uniform ExcelExtractor class and a factory for both xls and xlsx files?
Is there a common class, or an implementation of an ExcelExtractor interface, that uniformly handles text extraction from both xls and xlsx sources? Maybe in the ss package.
I'm looking for something that would let me get the right implementation from a factory, based on the file type.
Right now I have to explicitly use org.apache.poi.hssf.extractor.ExcelExtractor for xls files and org.apache.poi.xssf.extractor.XSSFExcelExtractor for xlsx.
For example, the explicit approach for xls:
    InputStream inp = new FileInputStream(path);
    HSSFWorkbook wb = new HSSFWorkbook(new POIFSFileSystem(inp));
    ExcelExtractor extractor = new ExcelExtractor(wb);
    extractor.setFormulasNotResults(true);   // return formulas, not their computed values
    extractor.setIncludeSheetNames(false);   // leave sheet names out of the text
    String text = extractor.getText();

I can implement my own factory, but before doing that I thought I'd ask whether there is a common approach that handles both formats (which is what the ss package is for).
There are two options.
First, if you really want to stick with the old Apache POI text extractors, use the ExtractorFactory class. It will identify the file type and create the appropriate extractor for you.
However, a better option is Apache Tika. Tika builds on top of POI (and lots of other libraries) and gives you plain text extraction (plus detection, XHTML output, and more) for a wide range of file formats. You'd just call Tika and ask for the text, no matter what the file type is. See the Tika examples to get started.
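If you do end up writing your own factory anyway, the "identify the type" step can be done without trusting the file extension. Binary xls files are stored in an OLE2 compound document and xlsx files are ZIP packages, so the first few bytes tell them apart. The following is only a minimal sketch using the plain JDK; the class name ExcelContainerSniffer and its methods are made up for illustration and are not part of POI, and it is not a description of how ExtractorFactory works internally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ExcelContainerSniffer {

    /** Container type guessed from the first bytes of a file. */
    public enum Container {
        OLE2,    // binary Office container: likely .xls (but also .doc, .ppt)
        ZIP,     // ZIP package: likely .xlsx (but also .docx, or any other zip)
        UNKNOWN
    }

    // Signature at the start of every OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature, "PK" followed by 0x03 0x04.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a buffer holding the leading bytes of a file. */
    public static Container detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    /**
     * Peeks at the first 8 bytes of a stream and rewinds it, so the same
     * stream can still be handed to a workbook constructor afterwards.
     * The stream must support mark/reset (e.g. a BufferedInputStream).
     */
    public static Container detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] header = new byte[8];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break; // file is shorter than 8 bytes
            }
            read += n;
        }
        in.reset();
        return detect(Arrays.copyOf(header, read));
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A hand-rolled factory could then wrap the FileInputStream in a BufferedInputStream, call detect on it, and build the HSSF extractor for OLE2 or the XSSF extractor for ZIP. Since the signature only identifies the container, a ZIP that is not a spreadsheet will still fail later when the workbook is opened.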