Package com.avoka.fc.core.util
Class CsvReader
- java.lang.Object
-
- com.avoka.fc.core.util.CsvReader
-
public class CsvReader extends Object
Provides a stream-based parser for delimited text data read from a file or a stream.
-
-
Field Summary
static int ESCAPE_MODE_BACKSLASH
- Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
static int ESCAPE_MODE_DOUBLED
- Double up the text qualifier to represent an occurrence of the text qualifier.
-
Constructor Summary
CsvReader(InputStream inputStream, char delimiter, Charset charset)
- Create a CsvReader object using an InputStream object as the data source.
CsvReader(InputStream inputStream, Charset charset)
- Create a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
CsvReader(Reader inputStream)
- Create a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
CsvReader(Reader inputStream, char delimiter)
- Create a CsvReader object using a Reader object as the data source.
CsvReader(String fileName)
- Create a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
CsvReader(String fileName, char delimiter)
- Create a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
CsvReader(String fileName, char delimiter, Charset charset)
- Create a CsvReader object using a file as the data source.
-
Method Summary
void close()
- Close and release all related resources.
protected void finalize()
String get(int columnIndex)
- Return the current column value for a given column index.
String get(String headerName)
- Return the current column value for a given column header name.
boolean getCaptureRawRecord()
- Return the "Capture Raw Record" setting.
int getColumnCount()
- Return the number of columns found in this record.
char getComment()
- Return the character being used as a comment signal.
long getCurrentRecord()
- Return the index of the current record.
char getDelimiter()
- Return the character being used as the column delimiter.
int getEscapeMode()
- Return the current way to escape an occurrence of the text qualifier inside qualified data.
String getHeader(int columnIndex)
- Return the column header value for a given column index.
int getHeaderCount()
- Return the number of headers read in by a previous call to readHeaders().
String[] getHeaders()
- Return the header values as a string array.
int getIndex(String headerName)
- Return the corresponding column index for a given column header name.
String getRawRecord()
- Return the raw record containing the current line read from the stream.
char getRecordDelimiter()
- Return the character to use as the record delimiter.
boolean getSafetySwitch()
- Return the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file.
boolean getSkipEmptyRecords()
- Return a flag to indicate whether empty records shall be skipped by the parser.
char getTextQualifier()
- Return the character to use as a text qualifier in the data.
boolean getTrimWhitespace()
- Return whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
boolean getUseComments()
- Return whether comments (lines starting with the comment character) will be skipped while parsing or not.
boolean getUseTextQualifier()
- Return whether text qualifiers will be used while parsing or not.
String[] getValues()
- Return the list of column values.
boolean isQualified(int columnIndex)
- Return whether the entry in the given column was qualified, i.e. started with a qualifier character.
static CsvReader parse(String data)
- Create a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
boolean readHeaders()
- Read the first record of data as column headers.
boolean readRecord()
- Read the next record.
void setCaptureRawRecord(boolean captureRawRecord)
- Set the "Capture Raw Record" setting.
void setComment(char comment)
- Set the character being used as a comment signal.
void setDelimiter(char delimiter)
- Set the character to use as the column delimiter.
void setEscapeMode(int escapeMode)
- Set the current way to escape an occurrence of the text qualifier inside qualified data.
void setHeaders(String[] headers)
- Set the header values.
void setRecordDelimiter(char recordDelimiter)
- Set the character to use as the record delimiter.
void setSafetySwitch(boolean safetySwitch)
- Set the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file.
void setSkipEmptyRecords(boolean skipEmptyRecords)
- Set a flag to indicate whether empty records shall be skipped by the parser.
void setTextQualifier(char textQualifier)
- Set the character to use as a text qualifier in the data.
void setTrimWhitespace(boolean trimWhitespace)
- Set whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
void setUseComments(boolean useComments)
- Set whether comments (lines starting with the comment character) will be skipped while parsing or not.
void setUseTextQualifier(boolean useTextQualifier)
- Set whether text qualifiers will be used while parsing or not.
boolean skipLine()
- Skip the next line of data using the standard end-of-line characters, without doing any column-delimited parsing.
boolean skipRecord()
- Skip the next record of data by parsing each column. Will not increment getCurrentRecord().
-
-
-
Field Detail
-
ESCAPE_MODE_DOUBLED
public static final int ESCAPE_MODE_DOUBLED
Double up the text qualifier to represent an occurrence of the text qualifier.
- See Also:
- Constant Field Values
-
ESCAPE_MODE_BACKSLASH
public static final int ESCAPE_MODE_BACKSLASH
Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
- See Also:
- Constant Field Values
-
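The two escape modes differ only in how a literal text qualifier is written inside qualified data. The sketch below illustrates the idea with a small stand-alone helper. It is not CsvReader's implementation: the class EscapeModeSketch, its Mode enum, and the unescape method are hypothetical names introduced for illustration, and the helper assumes the outer qualifiers have already been removed from the field.

```java
// Illustrative sketch only; this is not how CsvReader is implemented.
class EscapeModeSketch {
    // Hypothetical stand-ins for ESCAPE_MODE_DOUBLED and ESCAPE_MODE_BACKSLASH.
    enum Mode { DOUBLED, BACKSLASH }

    /** Decodes the inside of a qualified field (outer qualifiers already removed). */
    static String unescape(String body, char qualifier, Mode mode) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            boolean nextIsQualifier = i + 1 < body.length() && body.charAt(i + 1) == qualifier;
            if (mode == Mode.DOUBLED && c == qualifier && nextIsQualifier) {
                out.append(qualifier); // two qualifiers in a row stand for one
                i++;
            } else if (mode == Mode.BACKSLASH && c == '\\' && nextIsQualifier) {
                out.append(qualifier); // backslash + qualifier stands for one qualifier
                i++;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("say \"\"hi\"\"", '"', Mode.DOUBLED));
        System.out.println(unescape("say \\\"hi\\\"", '"', Mode.BACKSLASH));
    }
}
```

With a double quote as the qualifier, both inputs in main represent the same logical value; only the escaping convention differs.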
-
Constructor Detail
-
CsvReader
public CsvReader(String fileName, char delimiter, Charset charset) throws FileNotFoundException
Create a CsvReader object using a file as the data source.
- Parameters:
fileName - the path to the file to use as the data source
delimiter - the character to use as the column delimiter
charset - the Charset to use while parsing the data
- Throws:
FileNotFoundException - if the file does not exist
-
CsvReader
public CsvReader(String fileName, char delimiter) throws FileNotFoundException
Create a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
- Parameters:
fileName - the path to the file to use as the data source
delimiter - the character to use as the column delimiter
- Throws:
FileNotFoundException - if the file does not exist
-
CsvReader
public CsvReader(String fileName) throws FileNotFoundException
Create a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
- Parameters:
fileName - the path to the file to use as the data source
- Throws:
FileNotFoundException - if the file does not exist
-
CsvReader
public CsvReader(Reader inputStream, char delimiter)
Create a CsvReader object using a Reader object as the data source.
- Parameters:
inputStream - the stream to use as the data source
delimiter - the character to use as the column delimiter
-
CsvReader
public CsvReader(Reader inputStream)
Create a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
- Parameters:
inputStream - the stream to use as the data source
-
CsvReader
public CsvReader(InputStream inputStream, char delimiter, Charset charset)
Create a CsvReader object using an InputStream object as the data source.
- Parameters:
inputStream - the stream to use as the data source
delimiter - the character to use as the column delimiter
charset - the Charset to use while parsing the data
-
CsvReader
public CsvReader(InputStream inputStream, Charset charset)
Create a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
- Parameters:
inputStream - the stream to use as the data source
charset - the Charset to use while parsing the data
-
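Several constructors default to ISO-8859-1, so the choice of Charset matters when the data was written in another encoding. The sketch below uses only JDK classes to show what happens when UTF-8 bytes are decoded as ISO-8859-1. The class CharsetMismatchSketch and its method are hypothetical names for illustration, and CsvReader itself is not involved.

```java
// Illustrative sketch using only JDK classes; CsvReader is not involved.
class CharsetMismatchSketch {
    private static final java.nio.charset.Charset UTF_8 =
            java.nio.charset.StandardCharsets.UTF_8;
    private static final java.nio.charset.Charset LATIN_1 =
            java.nio.charset.StandardCharsets.ISO_8859_1;

    /** Encodes the text as UTF-8, then decodes those bytes with the given charset. */
    static String decodeUtf8BytesWith(String text, java.nio.charset.Charset charset) {
        byte[] utf8Bytes = text.getBytes(UTF_8);
        return new String(utf8Bytes, charset);
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, written as a Unicode escape.
        String original = "caf\u00e9";
        // Matching charset: the text survives the round trip.
        System.out.println(decodeUtf8BytesWith(original, UTF_8).equals(original));
        // Mismatched charset: the two UTF-8 bytes of e-acute become two characters.
        System.out.println(decodeUtf8BytesWith(original, LATIN_1).equals(original));
    }
}
```

For UTF-8 files, this suggests preferring a constructor that takes an explicit Charset, such as CsvReader(String fileName, char delimiter, Charset charset), over the ones that fall back to ISO-8859-1.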
-
Method Detail
-
getCaptureRawRecord
public boolean getCaptureRawRecord()
Return the "Capture Raw Record" setting.
- Returns:
- the current value of the "Capture Raw Record" setting
-
setCaptureRawRecord
public void setCaptureRawRecord(boolean captureRawRecord)
Set the "Capture Raw Record" setting.
- Parameters:
captureRawRecord - the new value for the "Capture Raw Record" setting
-
getRawRecord
public String getRawRecord()
Return the raw record containing the current line read from the stream.
- Returns:
- the raw record
-
getTrimWhitespace
public boolean getTrimWhitespace()
Return whether leading and trailing whitespace characters are being trimmed from non-textqualified column data. Default is true.
- Returns:
- whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
-
setTrimWhitespace
public void setTrimWhitespace(boolean trimWhitespace)
Set whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not. Default is true.
- Parameters:
trimWhitespace - whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
-
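The trimming rule applies only to column data that is not wrapped in the text qualifier. A minimal stand-alone sketch of that rule follows. The class TrimSketch and its columnValue method are hypothetical, and the handling of whitespace outside the qualifiers is an assumption of this sketch, not documented CsvReader behavior.

```java
// Illustrative sketch only; not CsvReader's parsing logic.
class TrimSketch {
    /**
     * Returns the value of one raw column. Whitespace inside a qualified value is
     * always kept; an unqualified value is trimmed only when trimWhitespace is true.
     * Assumption: whitespace outside the qualifiers is ignored.
     */
    static String columnValue(String raw, char qualifier, boolean trimWhitespace) {
        String stripped = raw.trim();
        boolean qualified = stripped.length() >= 2
                && stripped.charAt(0) == qualifier
                && stripped.charAt(stripped.length() - 1) == qualifier;
        if (qualified) {
            // Keep everything between the qualifiers untouched.
            return stripped.substring(1, stripped.length() - 1);
        }
        return trimWhitespace ? stripped : raw;
    }

    public static void main(String[] args) {
        System.out.println("[" + columnValue("  abc  ", '"', true) + "]");
        System.out.println("[" + columnValue("  abc  ", '"', false) + "]");
        System.out.println("[" + columnValue("\"  abc  \"", '"', true) + "]");
    }
}
```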
getDelimiter
public char getDelimiter()
Return the character being used as the column delimiter. Default is comma, ','.
- Returns:
- the character being used as the column delimiter.
-
setDelimiter
public void setDelimiter(char delimiter)
Set the character to use as the column delimiter. Default is comma, ','.
- Parameters:
delimiter - the character to use as the column delimiter.
-
getRecordDelimiter
public char getRecordDelimiter()
Return the character to use as the record delimiter.
- Returns:
- the character to use as the record delimiter. The default is a combination of standard end of line characters for Windows, Unix, and Mac.
-
setRecordDelimiter
public void setRecordDelimiter(char recordDelimiter)
Set the character to use as the record delimiter.
- Parameters:
recordDelimiter - the character to use as the record delimiter. The default is a combination of standard end of line characters for Windows, Unix, and Mac.
-
getTextQualifier
public char getTextQualifier()
Return the character to use as a text qualifier in the data.
- Returns:
- the character to use as a text qualifier in the data.
-
setTextQualifier
public void setTextQualifier(char textQualifier)
Set the character to use as a text qualifier in the data.
- Parameters:
textQualifier - the character to use as a text qualifier in the data.
-
getUseTextQualifier
public boolean getUseTextQualifier()
Return whether text qualifiers will be used while parsing or not.
- Returns:
- whether text qualifiers will be used while parsing
-
setUseTextQualifier
public void setUseTextQualifier(boolean useTextQualifier)
Set whether text qualifiers will be used while parsing or not.
- Parameters:
useTextQualifier - whether to use a text qualifier while parsing or not
-
getComment
public char getComment()
Return the character being used as a comment signal. The default comment character is the pound character ('#'). Lines starting with this character will be ignored if useComments is set.
- Returns:
- the character being used as a comment signal.
-
setComment
public void setComment(char comment)
Set the character being used as a comment signal. The default comment character is the pound character ('#'). Lines starting with this character will be ignored if useComments is set.
- Parameters:
comment - the character to use as a comment signal
-
getUseComments
public boolean getUseComments()
Return whether comments (lines starting with the comment character) will be skipped while parsing or not.
- Returns:
- whether comments are being looked for while parsing
-
setUseComments
public void setUseComments(boolean useComments)
Set whether comments (lines starting with the comment character) will be skipped while parsing or not.
- Parameters:
useComments - whether comments are being looked for while parsing
-
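The comment character and the useComments flag work together: a line is ignored only if the flag is set and the line starts with the comment character. The stand-alone sketch below illustrates that rule on plain lines of text. CommentSkipSketch and dataLines are hypothetical names, and splitting on '\n' is a simplification of this sketch.

```java
// Illustrative sketch only; not CsvReader's parsing logic.
class CommentSkipSketch {
    /** Returns the lines that would be treated as data rather than comments. */
    static java.util.List<String> dataLines(String text, char comment, boolean useComments) {
        java.util.List<String> kept = new java.util.ArrayList<>();
        for (String line : text.split("\n")) {
            // Only a comment character in the first position marks a comment line.
            boolean isComment = useComments && !line.isEmpty() && line.charAt(0) == comment;
            if (!isComment) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "# exported data\nid,name\n1,a#b";
        System.out.println(dataLines(text, '#', true));
        System.out.println(dataLines(text, '#', false));
    }
}
```

A comment character in the middle of a line, as in the last row of the sample text, does not turn the line into a comment in this sketch.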
getEscapeMode
public int getEscapeMode()
Return the current way to escape an occurrence of the text qualifier inside qualified data.
- Returns:
- the current way to escape an occurrence of the text qualifier inside qualified data.
-
setEscapeMode
public void setEscapeMode(int escapeMode) throws IllegalArgumentException
Set the current way to escape an occurrence of the text qualifier inside qualified data.
- Parameters:
escapeMode - the way to escape an occurrence of the text qualifier inside qualified data
- Throws:
IllegalArgumentException - when an illegal value is specified for escapeMode
-
getSkipEmptyRecords
public boolean getSkipEmptyRecords()
Return a flag to indicate whether empty records shall be skipped by the parser.
- Returns:
- whether empty records will be skipped
-
setSkipEmptyRecords
public void setSkipEmptyRecords(boolean skipEmptyRecords)
Set a flag to indicate whether empty records shall be skipped by the parser.
- Parameters:
skipEmptyRecords - whether empty records will be skipped
-
getSafetySwitch
public boolean getSafetySwitch()
Return the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the max column lengths and max column count per record supported by the parser will greatly increase. Default is true.
- Returns:
- the current setting of the safety switch.
-
setSafetySwitch
public void setSafetySwitch(boolean safetySwitch)
Set the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the max column lengths and max column count per record supported by the parser will greatly increase. Default is true.
- Parameters:
safetySwitch - the new setting of the safety switch
-
getColumnCount
public int getColumnCount()
Return the number of columns found in this record.
- Returns:
- The column count
-
getCurrentRecord
public long getCurrentRecord()
Return the index of the current record.
- Returns:
- The index of the current record
-
getHeaderCount
public int getHeaderCount()
Return the number of headers read in by a previous call to readHeaders().
- Returns:
- the number of headers read in by a previous call to readHeaders().
-
getHeaders
public String[] getHeaders() throws IOException
Return the header values as a string array.
- Returns:
- the header values as a String array
- Throws:
IOException - if this object has already been closed.
-
setHeaders
public void setHeaders(String[] headers)
Set the header values.
- Parameters:
headers - the new header values
-
getValues
public String[] getValues() throws IOException
Return the list of column values.
- Returns:
- the list of column values
- Throws:
IOException - if this object has already been closed
-
get
public String get(int columnIndex) throws IOException
Return the current column value for a given column index.
- Parameters:
columnIndex - the index of the column
- Returns:
- the current column value
- Throws:
IOException - if this object has already been closed
-
get
public String get(String headerName) throws IOException
Return the current column value for a given column header name.
- Parameters:
headerName - the header name of the column
- Returns:
- the current column value
- Throws:
IOException - if this object has already been closed
-
parse
public static CsvReader parse(String data)
Create a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
- Parameters:
data - the non-null data String object to use as the source
- Returns:
- a CsvReader object using the String of data as the source
-
readRecord
public boolean readRecord() throws IOException
Read the next record.
- Returns:
- whether another record was successfully read
- Throws:
IOException - if an error occurred while reading data from the source stream
-
readHeaders
public boolean readHeaders() throws IOException
Read the first record of data as column headers.
- Returns:
- whether the header record was successfully read
- Throws:
IOException - if an error occurred while reading data from the source stream
-
getHeader
public String getHeader(int columnIndex) throws IOException
Return the column header value for a given column index.
- Parameters:
columnIndex - the index of the header column being requested
- Returns:
- the value of the column header at the given column index
- Throws:
IOException - if this object has already been closed
-
isQualified
public boolean isQualified(int columnIndex) throws IOException
Return whether the entry in the given column was qualified, i.e. started with a qualifier character.
- Parameters:
columnIndex - the index of the column whose entry should be investigated
- Returns:
- whether the value is qualified
- Throws:
IOException - if this object has already been closed
-
getIndex
public int getIndex(String headerName) throws IOException
Return the corresponding column index for a given column header name.
- Parameters:
headerName - the header name of the column.
- Returns:
- The column index for the given column header name. Returns -1 if not found.
- Throws:
IOException - if this object has already been closed.
-
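Because getIndex reports an unknown header with -1 rather than an exception, callers are expected to check the result before using it as an index. The stand-alone sketch below mirrors that contract with a simple lookup table. HeaderIndexSketch is a hypothetical class, and zero-based indexes, case-sensitive matching, and "first occurrence wins" for duplicate names are assumptions of this sketch, not documented CsvReader behavior.

```java
// Illustrative sketch only; not CsvReader's header handling.
class HeaderIndexSketch {
    private final java.util.Map<String, Integer> indexByName = new java.util.HashMap<>();

    HeaderIndexSketch(String[] headers) {
        for (int i = 0; i < headers.length; i++) {
            // Assumption: the first occurrence of a duplicated header name wins.
            indexByName.putIfAbsent(headers[i], i);
        }
    }

    /** Returns the column index for the header name, or -1 if it is not present. */
    int getIndex(String headerName) {
        Integer index = indexByName.get(headerName);
        return index == null ? -1 : index;
    }

    public static void main(String[] args) {
        HeaderIndexSketch headers = new HeaderIndexSketch(new String[] {"id", "name"});
        int index = headers.getIndex("email");
        if (index < 0) {
            System.out.println("no such column: email");
        }
    }
}
```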
skipRecord
public boolean skipRecord() throws IOException
Skip the next record of data by parsing each column. Will not increment getCurrentRecord().
- Returns:
- whether another record was successfully skipped
- Throws:
IOException - if an error occurred while reading data from the source stream.
-
skipLine
public boolean skipLine() throws IOException
Skip the next line of data using the standard end-of-line characters, without doing any column-delimited parsing.
- Returns:
- whether a line was successfully skipped
- Throws:
IOException - if an error occurred while reading data from the source stream
-
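The difference between skipping a line and skipping a record shows up when a qualified value contains a line break. The stand-alone sketch below contrasts the two ideas on a raw string. SkipSketch and its methods are hypothetical, and the assumption that record-level skipping treats line breaks inside qualified text as data is an inference from the descriptions above, not confirmed CsvReader behavior.

```java
// Illustrative sketch only; not CsvReader's implementation.
class SkipSketch {
    /** Line-oriented skip: returns the index just past the next '\n', ignoring qualifiers. */
    static int endOfLine(String data, int from) {
        int newline = data.indexOf('\n', from);
        return newline < 0 ? data.length() : newline + 1;
    }

    /** Record-oriented skip: a '\n' inside qualified text does not end the record. */
    static int endOfRecord(String data, int from, char qualifier) {
        boolean insideQualifiedText = false;
        for (int i = from; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == qualifier) {
                // A doubled qualifier toggles twice, leaving the state unchanged.
                insideQualifiedText = !insideQualifiedText;
            } else if (c == '\n' && !insideQualifiedText) {
                return i + 1;
            }
        }
        return data.length();
    }

    public static void main(String[] args) {
        // First record has a qualified value containing a line break.
        String data = "a,\"x\ny\",b\nnext,row\n";
        System.out.println(data.substring(endOfLine(data, 0)));
        System.out.println(data.substring(endOfRecord(data, 0, '"')));
    }
}
```

In this sketch the line-oriented skip stops in the middle of the first record, while the record-oriented skip lands at the start of the second one.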
close
public void close()
Close and release all related resources.
-
-