package net.tixxit.delimited

Type Members

  1. case class DelimitedError(message: String, rowStart: Long, pos: Long, context: String, row: Long, col: Long) extends Exception with Product with Serializable

    An error that happened while parsing a row from a delimited file.

    message

    a message describing the error

    rowStart

    the offset (in chars) into the file where the row starts

    pos

    the position (in chars) in the file where the error occurred

    context

    the text of the row, up to at least where the error occurred

    row

    the row (1-based) where the error occurred

    col

    the column (1-based) where the error occurred
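
    The fields above are enough to build a readable report that points at the offending character. The following is a minimal sketch, not the library's own formatting: ErrorInfo is a hypothetical stand-in that mirrors the documented fields, and it assumes col is a 1-based character offset into context.

```scala
// Hypothetical stand-in that mirrors the documented fields of DelimitedError.
final case class ErrorInfo(
  message: String,
  rowStart: Long,
  pos: Long,
  context: String,
  row: Long,
  col: Long
)

object ErrorReport {
  // Build a three-line report: the message, the row text, and a caret under `col`.
  // Assumption: `col` is a 1-based character offset into `context`.
  def describe(e: ErrorInfo): String = {
    val prefix = s"Row ${e.row}: "
    val caret = (" " * (prefix.length + e.col.toInt - 1)) + "^"
    s"${e.message}\n$prefix${e.context}\n$caret"
  }
}
```

    For example, ErrorReport.describe(ErrorInfo("Unmatched quote", 10L, 14L, "a,b,\"c", 2L, 5L)) places the caret under the opening quote of the third field.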

  2. case class DelimitedFormat(separator: String, quote: String = "\"", quoteEscape: String = "\"", rowDelim: RowDelim = RowDelim.Both, allowRowDelimInQuotes: Boolean = true) extends DelimitedFormatStrategy with Product with Serializable

    A DelimitedFormatStrategy where all parameters have been completely fixed.

    separator

    the delimiter that separates fields within a row

    quote

    the character/string that indicates the beginning/end of a quoted value

    quoteEscape

    the string that is used to escape a quote character, within a quoted value

    rowDelim

    the delimiter used to separate rows

    allowRowDelimInQuotes

    if true, allow row delimiters within quotes, otherwise they are treated as an error
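
    To show how separator, quote and quoteEscape interact when a value is written out, here is a small sketch. Fmt and RenderSketch are hypothetical names, and the quoting rule (quote only when a cell contains a separator, a quote, or a line break) is the common CSV convention, assumed here rather than taken from the library.

```scala
// Hypothetical stand-in using the same parameter names as DelimitedFormat.
final case class Fmt(separator: String, quote: String = "\"", quoteEscape: String = "\"")

object RenderSketch {
  // Quote a cell only when it contains a separator, a quote, or a line break.
  // Inside a quoted cell, each quote is prefixed with `quoteEscape`.
  def renderCell(f: Fmt, cell: String): String = {
    val needsQuotes =
      cell.contains(f.separator) || cell.contains(f.quote) ||
        cell.contains("\n") || cell.contains("\r")
    if (!needsQuotes) cell
    else f.quote + cell.replace(f.quote, f.quoteEscape + f.quote) + f.quote
  }

  // Join the rendered cells with the separator (no row delimiter is appended).
  def renderRow(f: Fmt, cells: Seq[String]): String =
    cells.map(renderCell(f, _)).mkString(f.separator)
}
```

    With the default quote and quoteEscape, a cell such as say "hi" is written as "say ""hi""", while a cell without special characters is written unchanged.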

  3. sealed trait DelimitedFormatStrategy extends AnyRef

    There are 2 types of DelimitedFormatStrategys: GuessDelimitedFormat and DelimitedFormat. A DelimitedFormat is a delimited format that is completely specified. It can be used to render and parse delimited files without further work. A GuessDelimitedFormat, on the other hand, may leave any number of parameters unspecified, which means they must be inferred before the format can be used to parse or render a delimited file.

    All the methods provided in DelimitedFormatStrategy are ways of fixing (or updating) the various parameters used. In the case of a DelimitedFormat, a method just changes that parameter and keeps all others the same. In the case of a GuessDelimitedFormat, it fixes that parameter, so it no longer needs to be inferred.
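
    The difference between updating a parameter on a fixed format and fixing one on a partially specified format can be modelled in a few lines. This is a toy model only: Strategy, Fixed, Partial, withSeparator and unresolved are illustrative names chosen for this sketch, not the library's definitions.

```scala
// Toy model only: names and shapes are illustrative, not the library's definitions.
sealed trait Strategy {
  def withSeparator(s: String): Strategy
}

// Fully specified: updating one parameter leaves the others unchanged.
final case class Fixed(separator: String, quote: String) extends Strategy {
  def withSeparator(s: String): Fixed = copy(separator = s)
}

// Partially specified: None marks a parameter that still has to be inferred.
final case class Partial(separator: Option[String], quote: Option[String]) extends Strategy {
  def withSeparator(s: String): Partial = copy(separator = Some(s))

  // Names of the parameters that have not been fixed yet.
  def unresolved: List[String] =
    List("separator" -> separator, "quote" -> quote).collect { case (name, None) => name }
}
```

    In this model, Partial(None, None).withSeparator(",") still has to infer the quote, whereas Fixed(",", "\"").withSeparator("\t") is immediately usable.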

  4. trait DelimitedParser extends AnyRef

    An immutable parser for delimited files. This operates on chunks of input, using the parseChunk method. After parsing a chunk, the parseChunk method returns a new DelimitedParser as well as all of the complete rows parsed in that chunk. A row that is only partially complete at the end of a chunk is not lost: it is returned by a later call to parseChunk, either on the returned DelimitedParser or on a subsequent one in the chain of calls.

    There are also convenience methods for parsing Files, Strings, InputStreams, Readers, etc.

    To get an instance of a DelimitedParser that can be used to parse a CSV, TSV, or similar file, you can use something like:

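    import net.tixxit.delimited._

    // Parse a whole file with a fixed CSV format; each element is an error or a Row.
    val parser = DelimitedParser(DelimitedFormat.CSV)
    val rows: Vector[Either[DelimitedError, Row]] =
      parser.parseFile(new java.io.File("some.csv"))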

    If you don't know the format of your delimited file ahead of time, not much changes:

    // Same call as before; only the format strategy passed to the parser differs.
    val parser = DelimitedParser(DelimitedFormat.Guess)
    val rows: Vector[Either[DelimitedError, Row]] =
      parser.parseFile(new java.io.File("some.csv"))
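
    To make the chunk-by-chunk behaviour described above concrete, here is a toy, self-contained illustration of the same immutable pattern. LineChunker is a hypothetical class written for this sketch: it only splits on "\n" and ignores quoting, and its method signatures are not those of DelimitedParser.

```scala
// Toy illustration of the immutable, chunk-at-a-time pattern.
// It only splits on "\n" and ignores quoting, so it is no substitute for DelimitedParser.
final case class LineChunker(pending: String = "") {
  // Returns the next chunker plus every line completed by this chunk.
  // Text after the last "\n" is carried over in the returned chunker.
  def parseChunk(chunk: String): (LineChunker, Vector[String]) = {
    val parts = (pending + chunk).split("\n", -1).toVector
    (LineChunker(parts.last), parts.init)
  }

  // Call once at end of input to emit a final line that had no trailing "\n".
  def finish: Vector[String] =
    if (pending.isEmpty) Vector.empty else Vector(pending)
}
```

    Feeding "a,b\nc," and then "d\ne,f\n" yields the line a,b from the first call and the lines c,d and e,f from the second: the partial line c, is held by the intermediate chunker until its remainder arrives.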

  5. trait GuessDelimitedFormat extends DelimitedFormatStrategy

    A DelimitedFormatStrategy that can infer some or all parameters of a DelimitedFormat given an adequate sample of a delimited file.
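
    As a rough idea of what inferring a parameter from a sample can look like, the sketch below guesses a separator. It is a deliberately simple heuristic assumed for illustration (pick the first candidate that occurs the same non-zero number of times on every sample line) and is not the inference the library performs; GuessSketch, guessSeparator and the candidate list are hypothetical.

```scala
import java.util.regex.Pattern

object GuessSketch {
  // Pick the first candidate that appears the same non-zero number of times
  // on every sample line. Quoting is ignored in this simplified heuristic.
  def guessSeparator(
    sample: Seq[String],
    candidates: Seq[String] = Seq(",", "\t", ";", "|")
  ): Option[String] =
    candidates.find { sep =>
      val counts = sample.map(line => line.split(Pattern.quote(sep), -1).length - 1)
      counts.nonEmpty && counts.head > 0 && counts.forall(_ == counts.head)
    }
}
```

    A sample that is too small or too irregular gives no answer here, which is one reason an adequate sample matters.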

  6. final class Row extends Seq[String] with IndexedSeq[String] with IndexedSeqLike[String, Row]

    A row in a delimited file, as a sequence of *unescaped* strings. The values must be rendered first to be used in a delimited file. This provides fast, random access to the underlying cells in the row and convenience methods for rendering the row given a DelimitedFormat.

  7. case class RowDelim(value: String, alternate: Option[String] = None) extends Product with Serializable

    The row delimiter used to separate rows in a delimited file. The primary value specified is always used as the row delimiter when rendering. The alternate value is an optional second delimiter that may be used only when parsing a delimited file.

    In general, you'll want to use RowDelim.Both as your row delimiter. This will use \n to delimit rows, but will also accept \r\n as a row delimiter when parsing, since files with \r\n line endings are common.

    value

    the row delimiter to use when parsing/rendering

    alternate

    an optional alternative that may be accepted during parsing
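
    The asymmetry between rendering (primary value only) and parsing (primary or alternate) can be sketched as follows. Delim and DelimSketch are hypothetical stand-ins written for this example; the splitting ignores quoting and is not how the library parses rows.

```scala
import java.util.regex.Pattern

// Hypothetical stand-in using the same parameter names as RowDelim.
final case class Delim(value: String, alternate: Option[String] = None)

object DelimSketch {
  // Mirrors the documented idea of RowDelim.Both: "\n" primary, "\r\n" also accepted.
  val Both: Delim = Delim("\n", Some("\r\n"))

  // Rendering always uses the primary value.
  def render(d: Delim, rows: Seq[String]): String = rows.mkString(d.value)

  // Parsing accepts either delimiter: normalise the alternate to the primary, then split.
  def split(d: Delim, text: String): Vector[String] = {
    val normalised = d.alternate.fold(text)(alt => text.replace(alt, d.value))
    normalised.split(Pattern.quote(d.value), -1).toVector
  }
}
```

    With DelimSketch.Both, input that mixes \r\n and \n line endings splits into the same rows, and rendering those rows again produces \n only.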

Value Members

  1. object DelimitedFormat extends Serializable

  2. object DelimitedParser

  3. object Row

  4. object RowDelim extends Serializable

  5. package benchmark

  6. package iteratee

  7. package parser
