net.tixxit.delimited

DelimitedParser

trait DelimitedParser extends AnyRef

An immutable parser for delimited files. It operates on chunks of input through the parseChunk method. Each call to parseChunk returns a new DelimitedParser together with all of the complete rows parsed from that chunk. A partially complete row is held back and returned by a later call to parseChunk, made either on the returned DelimitedParser or on one further along the chain of calls.

There are also convenience methods for parsing Files, Strings, InputStreams, Readers, etc.

To get a DelimitedParser that can parse a CSV, TSV, or other delimited file, you can use something like:

val parser = DelimitedParser(DelimitedFormat.CSV)
val rows: Vector[Either[DelimitedError, Row]] =
  parser.parseFile(new java.io.File("some.csv"))

If you don't know the format of your delimited file ahead of time, not much changes:

val parser = DelimitedParser(DelimitedFormat.Guess)
val rows: Vector[Either[DelimitedError, Row]] =
  parser.parseFile(new java.io.File("some.csv"))
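
The chunk-at-a-time flow described at the top of this page can be sketched as follows. This is an illustration only: the sample input and its split points are made up, and it assumes DelimitedError and Row live in net.tixxit.delimited alongside the parser. It uses only the parseChunk and parseString members documented on this page, written script-style like the snippets above.

```scala
import net.tixxit.delimited._

// Hypothetical input, split at arbitrary points to mimic reads from a socket or file.
val chunks: List[String] = List("id,na", "me\n1,al", "ice\n2,bob\n")

val start: DelimitedParser = DelimitedParser(DelimitedFormat.CSV)

// Thread the parser returned by each call into the next call,
// collecting whichever rows each chunk completes.
val (afterInput, partial) =
  chunks.foldLeft((start, Vector.empty[Either[DelimitedError, Row]])) {
    case ((parser, acc), chunk) =>
      val (next, rows) = parser.parseChunk(Some(chunk))
      (next, acc ++ rows)
  }

// Passing None signals the end of input, so any buffered partial row is flushed.
val (_, tail) = afterInput.parseChunk(None)
val allRows: Vector[Either[DelimitedError, Row]] = partial ++ tail
```

Because the parser is immutable, the value to keep after each call is the returned DelimitedParser, not the one the call was made on; calling parseChunk again on an earlier parser would not see the input buffered since.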

Abstract Value Members

  1. abstract def format: Option[DelimitedFormat]

    The DelimitedFormat being used to parse this delimited file, or None if a format has not yet been inferred (in which case, no rows have yet been returned by parseChunk).

  2. abstract def parseChunk(chunk: Option[String]): (DelimitedParser, Vector[Either[DelimitedError, Row]])

    Parse a chunk of the input if there is any left. A chunk of None tells the parser that there will be no further input; in that case, all remaining input is consumed and returned as rows (or errors).

    This returns a new DelimitedParser to use to parse the next chunk, as well as a Vector of all complete rows parsed from chunk.

    chunk

    the next chunk of data as a String, or None at end of input

  3. abstract def reset: (String, DelimitedParser)

    Returns all unparsed data and a DelimitedParser whose state is completely reset.
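
As a rough illustration of how format and reset relate to parseChunk, the sketch below feeds a guessing parser a chunk that ends mid-row and then resets it. The input string is hypothetical, and what exactly ends up in rows and unparsed depends on how much input the parser buffers before it infers a format, so no particular output is claimed here.

```scala
import net.tixxit.delimited._

val guessing: DelimitedParser = DelimitedParser(DelimitedFormat.Guess)

// No input has been seen yet, so per the description of format above
// there is nothing the format could have been inferred from.
val before: Option[DelimitedFormat] = guessing.format

// Hypothetical chunk that ends mid-row: the trailing "2,b" has no row terminator yet.
val input = "id,name\n1,alice\n2,b"
val (fed, rows) = guessing.parseChunk(Some(input))

// Check whether a format has been inferred after the first chunk.
val after: Option[DelimitedFormat] = fed.format

// reset hands back the input that has not been turned into rows,
// together with a parser whose state is completely reset.
val (unparsed, fresh) = fed.reset
```

The String returned by reset can be fed back into another parser, for example one built with an explicit format, if the remaining input should be parsed differently.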

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. final def notify(): Unit

    Definition Classes
    AnyRef
  14. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  15. def parseAll(chunks: Iterator[String]): Iterator[Either[DelimitedError, Row]]

    Parse all chunks in the given iterator, consecutively, treating the last chunk in chunks as the final input. This will return all rows from the input.

  16. def parseFile(file: File, charset: Charset = StandardCharsets.UTF_8): Vector[Either[DelimitedError, Row]]

    Completely parses file and returns all the rows in a Vector.

    file

    the delimited file on disk

    charset

    the character set the file is encoded in

  17. def parseInputStream(is: InputStream, charset: Charset = StandardCharsets.UTF_8): Iterator[Either[DelimitedError, Row]]


    Returns an iterator that parses rows from the given InputStream (is) as elements are consumed.

    charset

    the character set to decode the bytes as

  18. def parseReader(reader: Reader): Iterator[Either[DelimitedError, Row]]

    Returns an iterator that parses rows from reader as elements are consumed.

  19. def parseString(input: String): Vector[Either[DelimitedError, Row]]

    Parses the entire contents of a delimited file, given as a String.

  20. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  21. def toString(): String

    Definition Classes
    AnyRef → Any
  22. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
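
The iterator-returning members listed above (parseAll, parseInputStream, parseReader) parse rows as the iterator is consumed, which suits inputs too large to hold in a Vector. The sketch below is a hedged example: a StringReader and a two-element chunk iterator stand in for a real file or network source, and the sample text is made up.

```scala
import java.io.StringReader
import net.tixxit.delimited._

val parser: DelimitedParser = DelimitedParser(DelimitedFormat.CSV)

// A StringReader stands in for a file or network Reader here.
val fromReader: Iterator[Either[DelimitedError, Row]] =
  parser.parseReader(new StringReader("id,name\n1,alice\n2,bob\n"))

// Consuming the iterator drives the parsing; split failures from successes.
val (errors, rows) = fromReader.toVector.partition(_.isLeft)

// parseAll takes pre-split chunks and treats the last one as the final input.
// The parser is immutable, so the same instance can be reused for a second parse.
val fromChunks: Vector[Either[DelimitedError, Row]] =
  parser.parseAll(Iterator("id,name\n1,al", "ice\n2,bob\n")).toVector
```

Each element is an Either, so a malformed row surfaces as a Left in the stream of results and the caller decides whether to stop or skip it.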
