An error that happened while parsing a row from a delimited file.
A DelimitedFormatStrategy where all parameters have been completely fixed.
the delimiter that separates fields within a row
the character/string that indicates the beginning/end of a quoted value
the string that is used to escape a quote character, within a quoted value
the delimiter used to separate rows
if true, allow row delimiters within quotes; otherwise they are treated as an error
There are two types of DelimitedFormatStrategy: GuessDelimitedFormat and
DelimitedFormat. A DelimitedFormat is a delimited format that is completely
specified. It can be used to render and parse delimited files without
further work. A GuessDelimitedFormat, on the other hand, may have any number
of parameters left unspecified, which means they need to be inferred before
the format can be used to parse or render a delimited file.
All the methods provided in DelimitedFormatStrategy are ways of fixing (or
updating) the various parameters used. In the case of a DelimitedFormat,
this just changes that parameter and keeps all others the same. In the case
of a GuessDelimitedFormat, it fixes that parameter, so it no longer needs to
be inferred.
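The difference between the two strategies can be sketched with a much smaller model. The names below (`FormatStrategy`, `FixedFormat`, `GuessFormat`, `withSeparator`, `withQuote`) are hypothetical and are not the library's API; only two parameters are tracked, and an unspecified parameter is assumed to be represented as `None`.

```scala
// Hypothetical, simplified model of the two strategies (not the real API).
sealed trait FormatStrategy {
  def withSeparator(separator: String): FormatStrategy
  def withQuote(quote: String): FormatStrategy
}

// Fully specified: every parameter has a concrete value.
final case class FixedFormat(separator: String, quote: String) extends FormatStrategy {
  // Updating one parameter keeps all the others the same.
  def withSeparator(separator: String): FixedFormat = copy(separator = separator)
  def withQuote(quote: String): FixedFormat = copy(quote = quote)
}

// Partially specified: None means "still has to be inferred from a sample".
final case class GuessFormat(
  separator: Option[String] = None,
  quote: Option[String] = None
) extends FormatStrategy {
  // Fixing a parameter removes it from the set that must be inferred.
  def withSeparator(separator: String): GuessFormat = copy(separator = Some(separator))
  def withQuote(quote: String): GuessFormat = copy(quote = Some(quote))
}
```

In this sketch, `GuessFormat().withQuote("'")` leaves the separator to be inferred while pinning the quote string, whereas `FixedFormat(",", "\"").withSeparator("\t")` simply swaps one value.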
An immutable parser for delimited files. This operates on chunks of input,
using the parseChunk method. After parsing a chunk, the parseChunk method
returns a new DelimitedParser as well as all of the complete rows parsed
in that chunk. Any partially complete rows will be returned by a future call
to parseChunk, on either the returned DelimitedParser or a later one in a
chain of calls to parseChunk.
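The chunked, immutable style described above can be illustrated with a deliberately reduced parser. `ChunkParser` and its `pending` field are hypothetical names; this sketch assumes rows end at `\n`, fields are separated by `,`, and there is no quoting, so it is not a substitute for the real DelimitedParser.

```scala
// Hypothetical sketch: immutable state holds only the unfinished tail of the input.
final case class ChunkParser(pending: String = "") {
  // Returns the next parser state plus every row completed by this chunk.
  def parseChunk(chunk: String): (ChunkParser, Vector[Vector[String]]) = {
    val buffer = pending + chunk
    val lastBreak = buffer.lastIndexOf('\n')
    if (lastBreak < 0) {
      // No complete row yet: carry everything forward.
      (ChunkParser(buffer), Vector.empty)
    } else {
      val complete = buffer.substring(0, lastBreak)
      val rows = complete.split("\n", -1).toVector.map(_.split(",", -1).toVector)
      // Whatever follows the last row delimiter is a partial row for a later call.
      (ChunkParser(buffer.substring(lastBreak + 1)), rows)
    }
  }
}
```

Feeding `"a,b\nc,"` and then `"d\n"` through a chain of `parseChunk` calls yields the row `a, b` from the first call and the row `c, d` from the second, because the partial row `c,` travels inside the returned parser.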
There are also convenience methods for parsing Files, Strings,
InputStreams, Readers, etc.
To get an instance of a DelimitedParser that can be used to parse a CSV,
TSV, etc. file, you can use something like:
val parser = DelimitedParser(DelimitedFormat.CSV)
val rows: Vector[Either[DelimitedError, Row]] = parser.parseFile(new java.io.File("some.csv"))
If you don't know the format of your delimited file ahead of time, not much changes:
val parser = DelimitedParser(DelimitedFormat.Guess)
val rows: Vector[Either[DelimitedError, Row]] = parser.parseFile(new java.io.File("some.csv"))
A DelimitedFormatStrategy that can infer some or all parameters of a DelimitedFormat given an adequate sample of a delimited file.
A row in a delimited file, as a sequence of *unescaped* strings. The values must be rendered first to be used in a delimited file. This provides fast, random access to the underlying cells in the row and convenience methods for rendering the row given a DelimitedFormat.
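To show what "rendering" unescaped values might involve, here is a minimal sketch. `RenderSketch`, `renderCell`, and `renderRow` are hypothetical helpers, and the rules used (quote a cell only when needed, escape a quote by doubling it) are the common CSV convention assumed for illustration, not necessarily what the library does.

```scala
object RenderSketch {
  // Quote a cell only when it contains text that would otherwise be ambiguous.
  def renderCell(cell: String, separator: String = ",", quote: String = "\""): String = {
    val needsQuoting =
      cell.contains(separator) || cell.contains(quote) ||
        cell.contains("\n") || cell.contains("\r")
    // Assumed escape rule: a quote inside a quoted value is doubled.
    if (needsQuoting) quote + cell.replace(quote, quote + quote) + quote else cell
  }

  // Render each unescaped cell, then join the cells with the separator.
  def renderRow(cells: Seq[String], separator: String = ",", quote: String = "\""): String =
    cells.map(renderCell(_, separator, quote)).mkString(separator)
}
```

For example, `RenderSketch.renderRow(Seq("a", "b,c"))` leaves the first cell bare and wraps the second in quotes, since it contains the separator.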
The row delimiter used to separate rows in a delimited file. The primary
value specified is always used as the row delimiter when rendering. The
alternate value is an optional second delimiter that may be used only when
parsing a delimited file.
In general, you'll want to use RowDelim.Both as your row delimiter. This
will use \n to delimit rows, but will also accept \r\n as a row delimiter,
since that occurs often.
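The asymmetry between rendering and parsing can be sketched as follows. `RowDelimSketch` is a hypothetical stand-in, not the library's RowDelim; it ignores quoting and handles the alternate delimiter by normalizing it to the primary one, which is a simplification chosen for brevity.

```scala
// Hypothetical stand-in for a row delimiter with an optional alternate.
final case class RowDelimSketch(value: String, alternate: Option[String] = None) {
  // Rendering always uses the primary value.
  def render(rows: Seq[String]): String = rows.mkString(value)

  // Parsing accepts either delimiter: the alternate is rewritten to the
  // primary value before splitting (no quote handling in this sketch).
  def split(text: String): Vector[String] = {
    val normalized = alternate.fold(text)(alt => text.replace(alt, value))
    normalized.split(java.util.regex.Pattern.quote(value), -1).toVector
  }
}

object RowDelimSketch {
  // Mirrors the behaviour described for RowDelim.Both: \n primary, \r\n also accepted.
  val Both: RowDelimSketch = RowDelimSketch("\n", Some("\r\n"))
}
```

With `RowDelimSketch.Both`, a file that mixes `\r\n` and `\n` line endings splits into the same rows, while rendered output only ever contains `\n`.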
the row delimiter to use when parsing/rendering
an optional alternative that may be accepted during parsing
An error that happened while parsing a row from a delimited file.
a message describing the error
the offset (# of chars), into the file, where the row starts
the position (chars) in the file where the error occurred
the text of the row, up to at least where the error occurred
the row (1-based) where the error occurred
the column (1-based) where the error occurred
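How these location fields might relate to one another can be sketched as below. `ErrorLocation` and `locate` are hypothetical; the sketch assumes rows are delimited by `\n` only and that the column is counted in characters from the start of the row, which the description above does not confirm.

```scala
// Hypothetical container for the location fields described above.
final case class ErrorLocation(rowStart: Int, pos: Int, row: Int, col: Int)

object ErrorLocation {
  // Derive the row start offset and 1-based row/column from an absolute position.
  def locate(text: String, pos: Int): ErrorLocation = {
    val before = text.substring(0, pos)
    // Offset of the first character of the row that contains pos.
    val rowStart = before.lastIndexOf('\n') + 1
    // One row per preceding "\n", plus one because rows are 1-based.
    val row = before.count(_ == '\n') + 1
    ErrorLocation(rowStart, pos, row, pos - rowStart + 1)
  }
}
```

Under these assumptions, an error at offset 6 in `"a,b\nc,\"d"` (an unterminated quote) falls on row 2, column 3, with the row starting at offset 4.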