Function pre_tokenize 

pub fn pre_tokenize(text: &str, do_lower_case: bool) -> Vec<String>
Pre-tokenize a text string into word-level tokens.

Applies lowercasing, accent stripping, CJK splitting, whitespace splitting, and punctuation splitting.
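The steps listed above can be illustrated with a small standalone sketch. It is not this crate's implementation: `pre_tokenize_sketch`, `is_cjk`, `is_combining_mark`, and `flush` are hypothetical names, and several simplifications are assumed. Accent stripping here only drops combining marks in U+0300..=U+036F from already-decomposed input (no Unicode normalization is performed), punctuation is limited to ASCII punctuation, only lowercasing is gated on `do_lower_case`, and the CJK ranges cover just the main ideograph blocks. The real function may differ on each of these points.

```rust
/// Hypothetical helper: true for code points in the main CJK ideograph blocks.
fn is_cjk(c: char) -> bool {
    matches!(
        c as u32,
        0x4E00..=0x9FFF | 0x3400..=0x4DBF | 0xF900..=0xFAFF | 0x20000..=0x2A6DF
    )
}

/// Hypothetical helper: true for combining diacritical marks (U+0300..=U+036F).
fn is_combining_mark(c: char) -> bool {
    matches!(c as u32, 0x0300..=0x036F)
}

/// Hypothetical helper: push the pending word (if any) onto the token list.
fn flush(current: &mut String, tokens: &mut Vec<String>) {
    if !current.is_empty() {
        tokens.push(std::mem::take(current));
    }
}

/// Sketch of word-level pre-tokenization under the assumptions stated above.
fn pre_tokenize_sketch(text: &str, do_lower_case: bool) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();

    for c in text.chars() {
        if c.is_whitespace() {
            // Whitespace ends the current word and is itself discarded.
            flush(&mut current, &mut tokens);
        } else if is_combining_mark(c) {
            // Simplified accent stripping: drop the mark, keep the base letter.
        } else if is_cjk(c) || c.is_ascii_punctuation() {
            // CJK characters and punctuation each become their own token.
            flush(&mut current, &mut tokens);
            tokens.push(c.to_string());
        } else if do_lower_case {
            // to_lowercase may yield more than one char for some inputs.
            current.extend(c.to_lowercase());
        } else {
            current.push(c);
        }
    }
    flush(&mut current, &mut tokens);
    tokens
}
```

Under these assumptions, `pre_tokenize_sketch("Hello, World!", true)` yields `["hello", ",", "world", "!"]`, and passing `false` keeps the original casing.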