pub struct FittedTfIdfVectorizer { /* private fields */ }
Counts the occurrences of each vocabulary entry, learned during fitting, in a sequence of texts and scales them by the inverse document frequency defined by the method. Each vocabulary entry is mapped to an integer value that is used to index the count in the result.
Implementations§
impl FittedTfIdfVectorizer
pub fn force_tokenizer_redefinition(&mut self, tokenizer: fn(&str) -> Vec<&str>)
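The signature above takes a plain function pointer of type fn(&str) -> Vec<&str>. As a minimal sketch of a function with that shape, the hypothetical alnum_tokenizer below splits on non-alphanumeric characters; both function names are illustrative and not part of the library.

```rust
/// Hypothetical tokenizer matching the `fn(&str) -> Vec<&str>` shape.
/// Splits on any non-alphanumeric character and drops empty pieces.
fn alnum_tokenizer(text: &str) -> Vec<&str> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .collect()
}

/// A non-capturing function coerces to the function-pointer type,
/// so it can be passed wherever `fn(&str) -> Vec<&str>` is expected.
fn as_tokenizer_pointer() -> fn(&str) -> Vec<&str> {
    alnum_tokenizer
}
```

Because the parameter is a function pointer rather than a generic closure type, a closure that captures its environment would not be accepted here; a named function or a non-capturing closure would.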
pub fn vocabulary(&self) -> &Vec<String>
Contains all vocabulary entries, in the same order used by the transform method.
pub fn method(&self) -> &TfIdfMethod
Returns the inverse document frequency method used in the transform method.
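To illustrate what an inverse document frequency method can look like, the sketch below computes three commonly used idf definitions from the number of documents n and a term's document frequency df. The enum IdfVariant and the function idf are hypothetical stand-ins written for this example; the variants of TfIdfMethod and their exact formulas are not shown on this page, so treat the formulas as assumptions.

```rust
/// Hypothetical stand-in for an idf method. The variants follow common
/// definitions and are not taken from the library source.
#[derive(Clone, Copy, Debug, PartialEq)]
enum IdfVariant {
    /// ln((1 + n) / (1 + df)) + 1
    Smooth,
    /// ln(n / df) + 1
    NonSmooth,
    /// ln(n / (1 + df))
    Textbook,
}

/// Computes the idf weight of one term from the number of documents `n`
/// and the number of documents that contain the term, `df`.
fn idf(variant: IdfVariant, n: usize, df: usize) -> f64 {
    let (n, df) = (n as f64, df as f64);
    match variant {
        IdfVariant::Smooth => ((1.0 + n) / (1.0 + df)).ln() + 1.0,
        IdfVariant::NonSmooth => (n / df).ln() + 1.0,
        IdfVariant::Textbook => (n / (1.0 + df)).ln(),
    }
}
```

Under each of these definitions, a term that appears in fewer documents receives a larger weight than one that appears in many.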
pub fn transform<T: ToString, D: Data<Elem = T>>(&self, x: &ArrayBase<D, Ix1>) -> Result<CsMat<f64>>
Given a sequence of n documents, produces an array of size (n, vocabulary_entries) where column j of row i is the number of occurrences of vocabulary entry j in the text of index i, scaled by the inverse document frequency. Vocabulary entry j is the string at the j-th position in the vocabulary.
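The row and column layout described above can be sketched with the standard library alone. The function tfidf_rows below is a hypothetical illustration, not the library's implementation: it assumes whitespace tokenization, takes precomputed idf weights as input, and returns dense rows, whereas transform returns a sparse CsMat<f64>.

```rust
use std::collections::HashMap;

/// Dense sketch of a tf-idf transform: row i holds, for each vocabulary
/// entry j, the count of that entry in document i multiplied by `idf[j]`.
/// Assumes whitespace tokenization and precomputed idf weights.
fn tfidf_rows(docs: &[&str], vocabulary: &[String], idf: &[f64]) -> Vec<Vec<f64>> {
    assert_eq!(vocabulary.len(), idf.len());

    // Map each vocabulary entry to its column index j.
    let column: HashMap<&str, usize> = vocabulary
        .iter()
        .enumerate()
        .map(|(j, entry)| (entry.as_str(), j))
        .collect();

    docs.iter()
        .map(|doc| {
            let mut row = vec![0.0; vocabulary.len()];
            // Count occurrences; tokens outside the vocabulary are ignored.
            for token in doc.split_whitespace() {
                if let Some(&j) = column.get(token) {
                    row[j] += 1.0;
                }
            }
            // Scale each count by the idf weight of its column.
            for (value, weight) in row.iter_mut().zip(idf) {
                *value *= weight;
            }
            row
        })
        .collect()
}
```

For instance, with the vocabulary ["apple", "pear"] and idf weights [2.0, 0.5], the document "apple apple pear" would map to the row [4.0, 0.5], and a token such as "kiwi" that is not in the vocabulary would contribute nothing.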
pub fn transform_files<P: AsRef<Path>>(&self, input: &[P], encoding: EncodingRef, trap: DecoderTrap) -> Result<CsMat<f64>>
Trait Implementations§
impl Clone for FittedTfIdfVectorizer
fn clone(&self) -> FittedTfIdfVectorizer
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl !Freeze for FittedTfIdfVectorizer
impl !RefUnwindSafe for FittedTfIdfVectorizer
impl Send for FittedTfIdfVectorizer
impl !Sync for FittedTfIdfVectorizer
impl Unpin for FittedTfIdfVectorizer
impl UnwindSafe for FittedTfIdfVectorizer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
unsafe fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.