Struct csv_core::Reader

pub struct Reader { /* fields omitted */ }

A pull-based CSV reader.

This reader parses CSV data using a finite state machine. Callers can extract parsed data incrementally using one of the read methods.

Note that this CSV reader is somewhat encoding agnostic. The source data needs to be at least ASCII compatible. There is no support for specifying the full gamut of Unicode delimiters/terminators/quotes/escapes. Instead, any byte can be used, although callers probably want to stick to the ASCII subset (<= 0x7F).

Usage

A reader has two different ways to read CSV data, each with its own trade-offs.

read_field - Copies a single CSV field into an output buffer while unescaping quotes. This is simple to use and doesn’t require storing an entire record contiguously in memory, but it is slower.
read_record - Copies an entire CSV record into an output buffer while unescaping quotes. The ending positions of each field are copied into an additional buffer. This is harder to use and requires larger output buffers, but it is faster than read_field since it amortizes more costs.

RFC 4180

RFC 4180 is the closest thing to a specification for CSV data. Unfortunately, CSV data that is seen in the wild can vary significantly. Often, the CSV data is outright invalid. Instead of fixing the producers of bad CSV data, we have seen fit to make consumers much more flexible in what they accept. This reader continues that tradition and is therefore not strictly compliant with RFC 4180. In particular, this reader will never return an error and will always find a parse.

Here are some detailed differences from RFC 4180:

CRLF, LF and CR are each treated as a single record terminator by default.
Records are permitted to be of varying length.
Empty lines (that do not include other whitespace) are ignored.
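As a rough illustration of the three rules above, the following sketch splits unquoted text into records using only the standard library. It is not how this reader is implemented, and it deliberately ignores quoting (a quoted field may itself contain terminators or delimiters). The name `split_lenient` is hypothetical.

```rust
/// Splits unquoted CSV-like text into records of fields.
/// Illustration only: quoting and escaping are not handled.
fn split_lenient(data: &str) -> Vec<Vec<&str>> {
    data
        // Break on CR or LF. A CRLF pair leaves an empty piece between the
        // two characters, so once empty pieces are dropped it behaves like
        // a single terminator.
        .split(|c: char| c == '\r' || c == '\n')
        // Empty lines are ignored.
        .filter(|line| !line.is_empty())
        // Each record keeps however many fields it happens to have.
        .map(|line| line.split(',').collect())
        .collect()
}
```

For example, `split_lenient("a,b\r\nc\rd,e,f\n\ng\n")` yields four records with 2, 1, 3 and 1 fields.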

Implementations

`impl Reader`

`pub fn new() -> Reader`

Create a new CSV reader with a default parser configuration.

`pub fn reset(&mut self)`

Reset the parser such that it behaves as if it had never been used.

This may be useful when reading CSV data in a random access pattern.

`pub fn line(&self) -> u64`

Return the current line number as measured by the number of occurrences of \n.

Line numbers start at 1 and are reset to 1 when reset is called.

`pub fn set_line(&mut self, line: u64)`

Set the line number.

This is useful after a call to reset where the caller knows the line number from some additional context.

`pub fn read_field( &mut self, input: &[u8], output: &mut [u8] ) -> (ReadFieldResult, usize, usize)`

Parse a single CSV field in input and copy field data to output.

This routine requires a caller provided buffer of CSV data as the input and a caller provided buffer, output, in which to store field data extracted from input. The field data copied to output will have its quotes unescaped.

Calling this routine parses at most a single field and returns three values indicating the state of the parser. The first value, a ReadFieldResult, tells the caller what to do next. For example, if the entire input was read or if the output buffer was filled before a full field had been read, then ReadFieldResult::InputEmpty or ReadFieldResult::OutputFull is returned, respectively. See the documentation for ReadFieldResult for more details.

The other two values returned correspond to the number of bytes read from input and written to output, respectively.

Termination

This reader interprets an empty input buffer as an indication that there is no CSV data left to read. Namely, when the caller has exhausted all CSV data, the caller should continue to call read_field with an empty input buffer until ReadFieldResult::End is returned.

Errors

This CSV reader can never return an error. Instead, it prefers a parse over no parse.

`pub fn read_record( &mut self, input: &[u8], output: &mut [u8], ends: &mut [usize] ) -> (ReadRecordResult, usize, usize, usize)`

Parse a single CSV record in input and copy each field contiguously to output, with the end position of each field written to ends.

NOTE: This method is more cumbersome to use than read_field, but it can be faster since it amortizes more work.

This routine requires a caller provided buffer of CSV data as the input and two caller provided buffers to store the unescaped field data (output) and the end position of each field in the record (ends).

Calling this routine parses at most a single record and returns four values indicating the state of the parser. The first value, a ReadRecordResult, tells the caller what to do next. For example, if the entire input was read or if the output buffer was filled before a full field had been read, then ReadRecordResult::InputEmpty or ReadRecordResult::OutputFull is returned, respectively. Similarly, if the ends buffer is full, then ReadRecordResult::OutputEndsFull is returned. See the documentation for ReadRecordResult for more details.

The other three values correspond to the number of bytes read from input, the number of bytes written to output and the number of end positions written to ends, respectively.

The end positions written to ends are constructed as if there was a single contiguous buffer in memory containing the entire row, even if ReadRecordResult::OutputFull was returned in the middle of reading a row.
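Since each end position is relative to the start of the record, the fields can be recovered by slicing between consecutive ends. A small standard-library-only sketch, where `fields` is a hypothetical helper and `output` and `ends` hold only the portions filled for one complete record:

```rust
/// Returns the fields of one record, given the record's unescaped bytes in
/// `output` and the end offset of each field in `ends`.
fn fields<'a>(output: &'a [u8], ends: &[usize]) -> Vec<&'a [u8]> {
    let mut start = 0;
    ends.iter()
        .map(|&end| {
            // A field spans from the previous end (or 0) up to its own end.
            let field = &output[start..end];
            start = end;
            field
        })
        .collect()
}
```

For instance, with `output = b"abbccc"` and `ends = [1, 3, 6]`, the fields are `a`, `bb` and `ccc`. Two equal consecutive ends denote an empty field.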

Termination

This reader interprets an empty input buffer as an indication that there is no CSV data left to read. Namely, when the caller has exhausted all CSV data, the caller should continue to call read_record with an empty input buffer until ReadRecordResult::End is returned.

Errors

This CSV reader can never return an error. Instead, it prefers a parse over no parse.

Trait Implementations

`impl Clone for Reader`

`fn clone(&self) -> Reader`

`pub fn clone_from(&mut self, source: &Self)` (since 1.0.0)

`impl Debug for Reader`

`fn fmt(&self, f: &mut Formatter<'_>) -> Result`

`impl Default for Reader`

`fn default() -> Reader`

Auto Trait Implementations

`impl RefUnwindSafe for Reader`

`impl Send for Reader`

`impl Sync for Reader`

`impl Unpin for Reader`

`impl UnwindSafe for Reader`

Blanket Implementations

`impl<T> Any for T where T: 'static + ?Sized,`

`pub fn type_id(&self) -> TypeId`

`impl<T> Borrow<T> for T where T: ?Sized,`

`pub fn borrow(&self) -> &T`

`impl<T> BorrowMut<T> for T where T: ?Sized,`

`pub fn borrow_mut(&mut self) -> &mut T`

`impl<T> From<T> for T`

`pub fn from(t: T) -> T`

`impl<T, U> Into<U> for T where U: From<T>,`

`pub fn into(self) -> U`

`impl<T> ToOwned for T where T: Clone,`

`type Owned = T`

The resulting type after obtaining ownership.

`pub fn to_owned(&self) -> T`

`pub fn clone_into(&self, target: &mut T)`

`impl<T, U> TryFrom<U> for T where U: Into<T>,`

`type Error = Infallible`

The type returned in the event of a conversion error.

`pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>`

`impl<T, U> TryInto<U> for T where U: TryFrom<T>,`

`type Error = <U as TryFrom<T>>::Error`

The type returned in the event of a conversion error.

`pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>`
