Class CsvReader

java.lang.Object
com.avoka.fc.core.util.CsvReader

public class CsvReader extends Object
Provides a stream-based parser for delimited text data read from a file or a stream.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    ESCAPE_MODE_BACKSLASH
    Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
    static final int
    ESCAPE_MODE_DOUBLED
    Double up the text qualifier to represent an occurrence of the text qualifier.
  • Constructor Summary

    Constructors
    Constructor
    Description
    CsvReader(InputStream inputStream, char delimiter, Charset charset)
    Create a CsvReader object using an InputStream object as the data source.
    CsvReader(InputStream inputStream, Charset charset)
    Create a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
    CsvReader(Reader inputStream)
    Create a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
    CsvReader(Reader inputStream, char delimiter)
    Create a CsvReader object using a Reader object as the data source.
    CsvReader(String fileName)
    Create a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
    CsvReader(String fileName, char delimiter)
    Create a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
    CsvReader(String fileName, char delimiter, Charset charset)
    Create a CsvReader object using a file as the data source.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
    Closes the reader and releases all related resources.
    protected void
    finalize()
    String
    get(int columnIndex)
    Return the current column value for a given column index.
    String
    get(String headerName)
    Return the current column value for a given column header name.
    boolean
    getCaptureRawRecord()
    Return the "Capture Raw Record" setting.
    int
    getColumnCount()
    Return the number of columns found in this record.
    char
    getComment()
    Return the character being used as a comment signal.
    long
    getCurrentRecord()
    Return the index of the current record.
    char
    getDelimiter()
    Return the character being used as the column delimiter.
    int
    getEscapeMode()
    Return the current way to escape an occurrence of the text qualifier inside qualified data.
    String
    getHeader(int columnIndex)
    Return the column header value for a given column index.
    int
    getHeaderCount()
    Return the number of headers read in by a previous call to readHeaders().
    String[]
    getHeaders()
    Return the header values as a string array.
    int
    getIndex(String headerName)
    Return the corresponding column index for a given column header name.
    String
    getRawRecord()
    Return the raw record containing the current line read from the stream.
    char
    getRecordDelimiter()
    Return the character to use as the record delimiter.
    boolean
    getSafetySwitch()
    Return the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file.
    boolean
    getSkipEmptyRecords()
    Return a flag to indicate whether empty records shall be skipped by the parser.
    char
    getTextQualifier()
    Return the character to use as a text qualifier in the data.
    boolean
    getTrimWhitespace()
    Return whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
    boolean
    getUseComments()
    Return whether comments (lines starting with the comment character) will be skipped while parsing or not.
    boolean
    getUseTextQualifier()
    Return whether text qualifiers will be used while parsing or not.
    String[]
    getValues()
    Return the list of column values.
    boolean
    isQualified(int columnIndex)
    Return whether the entry in the given column was qualified, i.e. started with a qualifier character.
    static CsvReader
    parse(String data)
    Creates a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
    boolean
    readHeaders()
    Read the first record of data as column headers.
    boolean
    readRecord()
    Read the next record.
    void
    setCaptureRawRecord(boolean captureRawRecord)
    Set the "Capture Raw Record" setting
    void
    setComment(char comment)
    Set the character being used as a comment signal.
    void
    setDelimiter(char delimiter)
    Set the character to use as the column delimiter.
    void
    setEscapeMode(int escapeMode)
    Set the current way to escape an occurrence of the text qualifier inside qualified data.
    void
    setHeaders(String[] headers)
    Set the header values.
    void
    setRecordDelimiter(char recordDelimiter)
    Set the character to use as the record delimiter.
    void
    setSafetySwitch(boolean safetySwitch)
    Set the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file.
    void
    setSkipEmptyRecords(boolean skipEmptyRecords)
    Set a flag to indicate whether empty records shall be skipped by the parser.
    void
    setTextQualifier(char textQualifier)
    Set the character to use as a text qualifier in the data.
    void
    setTrimWhitespace(boolean trimWhitespace)
    Set whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
    void
    setUseComments(boolean useComments)
    Set whether comments (lines starting with the comment character) will be skipped while parsing or not.
    void
    setUseTextQualifier(boolean useTextQualifier)
    Set whether text qualifiers will be used while parsing or not.
    boolean
    skipLine()
    Skip the next line of data using the standard end-of-line characters, without doing any column-delimited parsing.
    boolean
    skipRecord()
    Skip the next record of data by parsing each column. Does not increment getCurrentRecord().

    Methods inherited from class java.lang.Object

    clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • ESCAPE_MODE_DOUBLED

      public static final int ESCAPE_MODE_DOUBLED
      Double up the text qualifier to represent an occurrence of the text qualifier.
    • ESCAPE_MODE_BACKSLASH

      public static final int ESCAPE_MODE_BACKSLASH
      Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
  • Constructor Details

    • CsvReader

      public CsvReader(String fileName, char delimiter, Charset charset) throws FileNotFoundException
      Create a CsvReader object using a file as the data source.
      Parameters:
      fileName - the path to the file to use as the data source
      delimiter - the character to use as the column delimiter
      charset - the Charset to use while parsing the data
      Throws:
      FileNotFoundException - if the file does not exist
    • CsvReader

      public CsvReader(String fileName, char delimiter) throws FileNotFoundException
      Create a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
      Parameters:
      fileName - the path to the file to use as the data source
      delimiter - the character to use as the column delimiter
      Throws:
      FileNotFoundException - if the file does not exist
    • CsvReader

      public CsvReader(String fileName) throws FileNotFoundException
      Create a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
      Parameters:
      fileName - the path to the file to use as the data source
      Throws:
      FileNotFoundException - if the file does not exist
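      The file-name constructors that take no Charset fall back to ISO-8859-1, so a UTF-8 file containing non-ASCII characters would presumably be misread unless the three-argument constructor is used. The sketch below uses only the JDK and does not call CsvReader; the class and method names are hypothetical. It shows what a UTF-8 encoded value looks like when its bytes are decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: not part of CsvReader. Names are hypothetical.
class CharsetDefaultSketch {

    /** Encode text as UTF-8, then decode those bytes as ISO-8859-1, as a mismatched reader would. */
    static String decodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the final letter.
        String original = "caf\u00e9";
        String misread = decodeUtf8AsLatin1(original);
        // The single accented character becomes two unrelated characters.
        System.out.println(original.length() + " chars in, " + misread.length() + " chars out");
    }
}
```

      If the source file is known to be UTF-8, passing StandardCharsets.UTF_8 as the charset argument avoids this mismatch.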
    • CsvReader

      public CsvReader(Reader inputStream, char delimiter)
      Create a CsvReader object using a Reader object as the data source.
      Parameters:
      inputStream - the stream to use as the data source
      delimiter - the character to use as the column delimiter
    • CsvReader

      public CsvReader(Reader inputStream)
      Create a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
      Parameters:
      inputStream - the stream to use as the data source
    • CsvReader

      public CsvReader(InputStream inputStream, char delimiter, Charset charset)
      Create a CsvReader object using an InputStream object as the data source.
      Parameters:
      inputStream - the stream to use as the data source
      delimiter - the character to use as the column delimiter
      charset - the Charset to use while parsing the data
    • CsvReader

      public CsvReader(InputStream inputStream, Charset charset)
      Create a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
      Parameters:
      inputStream - the stream to use as the data source
      charset - the Charset to use while parsing the data
  • Method Details

    • getCaptureRawRecord

      public boolean getCaptureRawRecord()
      Return the "Capture Raw Record" setting
      Returns:
      the current value of the "Capture Raw Record" setting
    • setCaptureRawRecord

      public void setCaptureRawRecord(boolean captureRawRecord)
      Set the "Capture Raw Record" setting
      Parameters:
      captureRawRecord - the new value for the "Capture Raw Record" setting
    • getRawRecord

      public String getRawRecord()
      Return the raw record containing the current line read from the stream
      Returns:
      the raw record
    • getTrimWhitespace

      public boolean getTrimWhitespace()
      Return whether leading and trailing whitespace characters are being trimmed from non-textqualified column data. Default is true.
      Returns:
      whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
    • setTrimWhitespace

      public void setTrimWhitespace(boolean trimWhitespace)
      Set whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not. Default is true.
      Parameters:
      trimWhitespace - whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
    • getDelimiter

      public char getDelimiter()
      Return the character being used as the column delimiter. Default is comma, ','.
      Returns:
      the character being used as the column delimiter.
    • setDelimiter

      public void setDelimiter(char delimiter)
      Set the character to use as the column delimiter. Default is comma, ','.
      Parameters:
      delimiter - the character to use as the column delimiter.
    • getRecordDelimiter

      public char getRecordDelimiter()
      Return the character to use as the record delimiter.
      Returns:
      the character to use as the record delimiter. The default is a combination of standard end of line characters for Windows, Unix, and Mac.
    • setRecordDelimiter

      public void setRecordDelimiter(char recordDelimiter)
      Set the character to use as the record delimiter.
      Parameters:
      recordDelimiter - the character to use as the record delimiter. The default is a combination of standard end of line characters for Windows, Unix, and Mac.
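      As a rough picture of what "a combination of standard end of line characters" means, the JDK-only sketch below splits text on Windows, Unix, and classic Mac line endings. It is an illustration under simplifying assumptions, not CsvReader's implementation: the class and method names are hypothetical, and it ignores text-qualified fields that contain embedded line breaks.

```java
import java.util.Arrays;

// Illustration only: not part of CsvReader. Names are hypothetical.
class RecordDelimiterSketch {

    /** Split text on Windows (\r\n), Unix (\n), or classic Mac (\r) line endings. */
    static String[] splitRecords(String text) {
        // \r\n is listed first so a Windows line ending counts as one delimiter, not two.
        return text.split("\\r\\n|\\n|\\r");
    }

    public static void main(String[] args) {
        String mixed = "a,1\r\nb,2\nc,3\rd,4";
        System.out.println(Arrays.toString(splitRecords(mixed)));
    }
}
```

      Calling setRecordDelimiter(char) replaces this default with the single character supplied.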
    • getTextQualifier

      public char getTextQualifier()
      Return the character to use as a text qualifier in the data.
      Returns:
      the character to use as a text qualifier in the data.
    • setTextQualifier

      public void setTextQualifier(char textQualifier)
      Set the character to use as a text qualifier in the data.
      Parameters:
      textQualifier - the character to use as a text qualifier in the data.
    • getUseTextQualifier

      public boolean getUseTextQualifier()
      Return whether text qualifiers will be used while parsing or not.
      Returns:
      whether text qualifiers will be used while parsing
    • setUseTextQualifier

      public void setUseTextQualifier(boolean useTextQualifier)
      Set whether text qualifiers will be used while parsing or not.
      Parameters:
      useTextQualifier - whether to use a text qualifier while parsing or not
    • getComment

      public char getComment()
      Return the character being used as a comment signal. The default comment character is the pound character ('#'). Lines starting with this character will be ignored if useComments is set.
      Returns:
      the character being used as a comment signal.
    • setComment

      public void setComment(char comment)
      Set the character being used as a comment signal. The default comment character is the pound character ('#'). Lines starting with this character will be ignored if useComments is set.
      Parameters:
      comment - the character to use as a comment signal
    • getUseComments

      public boolean getUseComments()
      Return whether comments (lines starting with the comment character) will be skipped while parsing or not.
      Returns:
      whether comments are being looked for while parsing
    • setUseComments

      public void setUseComments(boolean useComments)
      Set whether comments (lines starting with the comment character) will be skipped while parsing or not.
      Parameters:
      useComments - whether comments are being looked for while parsing
    • getEscapeMode

      public int getEscapeMode()
      Return the current way to escape an occurrence of the text qualifier inside qualified data.
      Returns:
      the current way to escape an occurrence of the text qualifier inside qualified data.
    • setEscapeMode

      public void setEscapeMode(int escapeMode) throws IllegalArgumentException
      Set the current way to escape an occurrence of the text qualifier inside qualified data.
      Parameters:
      escapeMode - the way to escape an occurrence of the text qualifier inside qualified data
      Throws:
      IllegalArgumentException - When an illegal value is specified for escapeMode
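      The two escape modes, ESCAPE_MODE_DOUBLED and ESCAPE_MODE_BACKSLASH, correspond to two common conventions for writing a text qualifier inside qualified data. The following JDK-only sketch decodes the body of a qualified field under each convention. It is a simplified illustration, not CsvReader's parsing code: the class and method names are hypothetical, and it handles only the escaped qualifier, not any other escape sequences the parser may support.

```java
// Illustration only: not part of CsvReader. Names are hypothetical.
class EscapeModeSketch {

    /** Doubled convention: two consecutive qualifiers stand for one literal qualifier. */
    static String unescapeDoubled(String body, char qualifier) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == qualifier && i + 1 < body.length() && body.charAt(i + 1) == qualifier) {
                i++; // skip the second qualifier of the pair
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Backslash convention: a backslash followed by the qualifier stands for one literal qualifier. */
    static String unescapeBackslash(String body, char qualifier) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '\\' && i + 1 < body.length() && body.charAt(i + 1) == qualifier) {
                out.append(qualifier);
                i++; // skip the qualifier that the backslash escaped
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Field body as it would appear between the outer quotes in the data.
        System.out.println(unescapeDoubled("say \"\"hi\"\"", '"'));
        System.out.println(unescapeBackslash("say \\\"hi\\\"", '"'));
    }
}
```

      Both calls in main decode to the same value; only the representation in the raw data differs.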
    • getSkipEmptyRecords

      public boolean getSkipEmptyRecords()
      Return a flag to indicate whether empty records shall be skipped by the parser.
      Returns:
      whether empty records will be skipped
    • setSkipEmptyRecords

      public void setSkipEmptyRecords(boolean skipEmptyRecords)
      Set a flag to indicate whether empty records shall be skipped by the parser.
      Parameters:
      skipEmptyRecords - whether empty records will be skipped
    • getSafetySwitch

      public boolean getSafetySwitch()
      Return the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the max column lengths and max column count per record supported by the parser will greatly increase. Default is true.
      Returns:
      the current setting of the safety switch.
    • setSafetySwitch

      public void setSafetySwitch(boolean safetySwitch)
      Set the value of a safety switch to prevent the parser from using large amounts of memory in the case where parsing settings like file encodings don't end up matching the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the max column lengths and max column count per record supported by the parser will greatly increase. Default is true.
      Parameters:
      safetySwitch - the new setting of the safety switch
    • getColumnCount

      public int getColumnCount()
      Return the number of columns found in this record.
      Returns:
      The column count
    • getCurrentRecord

      public long getCurrentRecord()
      Return the index of the current record.
      Returns:
      The index of the current record
    • getHeaderCount

      public int getHeaderCount()
      Return the number of headers read in by a previous call to readHeaders().
      Returns:
      the number of headers read in by a previous call to readHeaders().
    • getHeaders

      public String[] getHeaders() throws IOException
      Return the header values as a string array.
      Returns:
      the header values as a String array
      Throws:
      IOException - if this object has already been closed.
    • setHeaders

      public void setHeaders(String[] headers)
      Set the header values.
      Parameters:
      headers - the new header values
    • getValues

      public String[] getValues() throws IOException
      Return the list of column values.
      Returns:
      the list of column values
      Throws:
      IOException - if this object has already been closed
    • get

      public String get(int columnIndex) throws IOException
      Return the current column value for a given column index.
      Parameters:
      columnIndex - the index of the column
      Returns:
      the current column value
      Throws:
      IOException - if this object has already been closed
    • get

      public String get(String headerName) throws IOException
      Returns the current column value for a given column header name.
      Parameters:
      headerName - the header name of the column
      Returns:
      the current column value
      Throws:
      IOException - if this object has already been closed
    • parse

      public static CsvReader parse(String data)
      Creates a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
      Parameters:
      data - the non-null data String object to use as the source
      Returns:
      a CsvReader object using the String of data as the source
    • readRecord

      public boolean readRecord() throws IOException
      Read the next record.
      Returns:
      whether another record was successfully read
      Throws:
      IOException - if an error occurred while reading data from the source stream
    • readHeaders

      public boolean readHeaders() throws IOException
      Read the first record of data as column headers.
      Returns:
      whether the header record was successfully read
      Throws:
      IOException - if an error occurred while reading data from the source stream
    • getHeader

      public String getHeader(int columnIndex) throws IOException
      Return the column header value for a given column index.
      Parameters:
      columnIndex - the index of the header column being requested
      Returns:
      the value of the column header at the given column index
      Throws:
      IOException - if this object has already been closed
    • isQualified

      public boolean isQualified(int columnIndex) throws IOException
      Return whether the entry in the given column was qualified, i.e. started with a qualifier character.
      Parameters:
      columnIndex - the index of the column whose entry should be investigated
      Returns:
      whether the value is qualified
      Throws:
      IOException - if this object has already been closed
    • getIndex

      public int getIndex(String headerName) throws IOException
      Return the corresponding column index for a given column header name.
      Parameters:
      headerName - the header name of the column.
      Returns:
      The column index for the given column header name. Returns -1 if not found.
      Throws:
      IOException - if this object has already been closed.
    • skipRecord

      public boolean skipRecord() throws IOException
      Skip the next record of data by parsing each column. Does not increment getCurrentRecord().
      Returns:
      whether another record was successfully skipped
      Throws:
      IOException - if an error occurred while reading data from the source stream.
    • skipLine

      public boolean skipLine() throws IOException
      Skip the next line of data using the standard end-of-line characters, without doing any column-delimited parsing.
      Returns:
      whether a line was successfully skipped
      Throws:
      IOException - if an error occurred while reading data from the source stream
    • close

      public void close()
      Closes the reader and releases all related resources.
    • finalize

      protected void finalize()
      Overrides:
      finalize in class Object