BLEU< ElemType, PrecisionType > Class Template Reference

BLEU, or the Bilingual Evaluation Understudy, is an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another.

Public Member Functions

 BLEU (const size_t maxOrder=4)
 Create an instance of the BLEU class.

 
ElemType BLEUScore () const
 Get the BLEU Score.

 
ElemType BrevityPenalty () const
 Get the brevity penalty.

 
template<typename ReferenceCorpusType, typename TranslationCorpusType>
ElemType Evaluate (const ReferenceCorpusType &referenceCorpus, const TranslationCorpusType &translationCorpus, const bool smooth=false)
 Computes the BLEU Score.

 
size_t MaxOrder () const
 Get the maximum n-gram length in tokens.

 
size_t & MaxOrder ()
 Modify the maximum n-gram length in tokens.

 
PrecisionType const & Precisions () const
 Get the precisions for each n-gram order.

 
ElemType Ratio () const
 Get the ratio of translation length to reference length.

 
size_t ReferenceLength () const
 Get the reference length.

 
template<typename Archive>
void serialize (Archive &ar, const uint32_t version)
 Serialize the metric.

 
size_t TranslationLength () const
 Get the translation length.

 

Detailed Description


template<typename ElemType = float, typename PrecisionType = std::vector<ElemType>>
class mlpack::metric::BLEU< ElemType, PrecisionType >

BLEU, or the Bilingual Evaluation Understudy, is an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another.

It can also be used to evaluate text generated for a suite of natural language processing tasks.

The BLEU score is calculated using the following formula:

\begin{eqnarray*} \text{B} &=& bp \cdot \exp \left(\sum_{n=1}^{N} w \log p_n \right) \\ \text{where,} \\ bp &=& \text{brevity penalty} = \begin{cases} 1 & \text{if } ratio > 1 \\ \exp \left(1-\frac{1}{ratio}\right) & \text{otherwise} \end{cases} \\ p_n &=& \text{modified precision for n-grams of length } n, \\ w &=& \frac {1}{maxOrder}, \\ ratio &=& \text{ratio of translation length to reference length,} \\ N &=& maxOrder = \text{maximum n-gram length in tokens.} \end{eqnarray*}

The BLEU Score lies between 0 and 1.
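The formula above can be illustrated with a small standalone sketch. It does not use mlpack and is not the library's implementation; the helper names `BrevityPenaltySketch` and `BleuFromPrecisionsSketch` are hypothetical, and returning 0 when any precision is 0 is an assumption made here to avoid taking the logarithm of zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Brevity penalty from the formula above: 1 when ratio > 1,
// exp(1 - 1 / ratio) otherwise. (Hypothetical helper, not part of mlpack.)
double BrevityPenaltySketch(const double ratio)
{
  assert(ratio > 0.0);
  return (ratio > 1.0) ? 1.0 : std::exp(1.0 - 1.0 / ratio);
}

// B = bp * exp(sum_{n=1}^{N} w * log p_n), with w = 1 / N, where
// N = precisions.size() plays the role of maxOrder.
double BleuFromPrecisionsSketch(const std::vector<double>& precisions,
                                const double ratio)
{
  if (precisions.empty())
    return 0.0;

  double logSum = 0.0;
  for (const double p : precisions)
  {
    // Assumption of this sketch: a zero precision gives a zero score,
    // since log(0) is undefined.
    if (p <= 0.0)
      return 0.0;
    logSum += std::log(p);
  }

  const double w = 1.0 / static_cast<double>(precisions.size());
  return BrevityPenaltySketch(ratio) * std::exp(w * logSum);
}
```

Because the weights are uniform, the exponential term is the geometric mean of the precisions, so a single low precision pulls the score down more than an arithmetic mean would.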

Template Parameters
ElemType  Type of the quantities in BLEU, e.g. long double, double, or float.
PrecisionType  Container type for the precisions of each n-gram order, e.g. std::vector<float>, std::vector<double>, or a comparable Boost or Armadillo container.

Definition at line 53 of file bleu.hpp.

Constructor & Destructor Documentation

◆ BLEU()

BLEU ( const size_t  maxOrder = 4)

Create an instance of the BLEU class.

Parameters
maxOrder  The maximum n-gram length in tokens.

Member Function Documentation

◆ BLEUScore()

ElemType BLEUScore ( ) const
inline

Get the BLEU Score.

Definition at line 113 of file bleu.hpp.

◆ BrevityPenalty()

ElemType BrevityPenalty ( ) const
inline

Get the brevity penalty.

Definition at line 116 of file bleu.hpp.

◆ Evaluate()

ElemType Evaluate ( const ReferenceCorpusType &  referenceCorpus,
const TranslationCorpusType &  translationCorpus,
const bool  smooth = false 
)

Computes the BLEU Score.

Template Parameters
ReferenceCorpusType  Type of reference corpus.
TranslationCorpusType  Type of translation corpus.
Parameters
referenceCorpus  An array of references (documents): $ referenceCorpus = \{reference_1, reference_2, \ldots \} $. Each reference is an array of paragraphs, $ reference_i = \{paragraph_1, paragraph_2, \ldots \} $, and each paragraph is an array of tokenized words (strings), $ paragraph_i = \{word_1, word_2, \ldots \} $. For example:
refCorpus = {{{"this", "is", "paragraph", "1", "from", "document", "1"},
{"this", "is", "paragraph", "2", "from", "document", "1"}},
{{"this", "is", "paragraph", "1", "from", "document", "2"},
{"this", "is", "paragraph", "2", "from", "document", "2"}}}
translationCorpus  An array of paragraphs that have been machine-translated or generated for a natural language processing task: $ translationCorpus = \{paragraph_1, paragraph_2, \ldots \} $. Each paragraph is an array of words; the i-th paragraph of the corpus is $ paragraph_i = \{word_1, word_2, \ldots \} $. For example:
transCorpus = {{"this", "is", "generated", "paragraph", "1"},
{"this", "is", "generated", "paragraph", "2"}}
smooth  Whether or not to apply Lin et al. 2004 smoothing.
Returns
The BLEU Score. Evaluate also computes the other BLEU quantities (brevity penalty, translation length, reference length, ratio, and precisions), which can be read through their corresponding accessor methods.
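As a rough illustration of the modified n-gram precision that feeds into the score, the following standalone sketch compares one tokenized translation paragraph against a single tokenized reference paragraph. It is not mlpack's implementation: the names `Tokens`, `CountNGrams`, and `ModifiedPrecisionSketch` are hypothetical, handling of multiple references is left out, and the `smooth` branch assumes add-one smoothing of the match and total counts, which is a common reading of the smoothing cited above but is not confirmed by this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One tokenized paragraph, as in the corpus examples above.
using Tokens = std::vector<std::string>;

// Count every n-gram of exactly n tokens in a tokenized paragraph.
std::map<Tokens, std::size_t> CountNGrams(const Tokens& tokens,
                                          const std::size_t n)
{
  std::map<Tokens, std::size_t> counts;
  if (n == 0 || tokens.size() < n)
    return counts;
  for (std::size_t i = 0; i + n <= tokens.size(); ++i)
    ++counts[Tokens(tokens.begin() + i, tokens.begin() + i + n)];
  return counts;
}

// Modified precision for n-grams of length n: the count of each translation
// n-gram is clipped to the number of times it occurs in the reference.
double ModifiedPrecisionSketch(const Tokens& reference,
                               const Tokens& translation,
                               const std::size_t n,
                               const bool smooth = false)
{
  assert(n > 0);
  const auto refCounts = CountNGrams(reference, n);
  const auto transCounts = CountNGrams(translation, n);

  std::size_t matches = 0, total = 0;
  for (const auto& entry : transCounts)
  {
    total += entry.second;
    const auto it = refCounts.find(entry.first);
    if (it != refCounts.end())
      matches += std::min(entry.second, it->second);
  }

  // Assumed smoothing (not confirmed by this page): add one to both counts.
  if (smooth)
    return (matches + 1.0) / (total + 1.0);
  return (total == 0) ? 0.0 : static_cast<double>(matches) / total;
}
```

For the reference {"the", "cat", "is", "on", "the", "mat"} and the translation {"the", "the", "the", "cat"}, the unigram count of "the" is clipped from 3 to 2, so the unsmoothed unigram precision is 3/4 rather than 4/4.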

◆ MaxOrder() [1/2]

size_t MaxOrder ( ) const
inline

Get the maximum n-gram length in tokens.

Definition at line 108 of file bleu.hpp.

◆ MaxOrder() [2/2]

size_t& MaxOrder ( )
inline

Modify the maximum n-gram length in tokens.

Definition at line 110 of file bleu.hpp.

◆ Precisions()

PrecisionType const& Precisions ( ) const
inline

Get the precisions for each n-gram order.

Definition at line 128 of file bleu.hpp.

◆ Ratio()

ElemType Ratio ( ) const
inline

Get the ratio of translation length to reference length.

Definition at line 125 of file bleu.hpp.

◆ ReferenceLength()

size_t ReferenceLength ( ) const
inline

Get the reference length.

Definition at line 122 of file bleu.hpp.

◆ serialize()

void serialize ( Archive &  ar,
const uint32_t  version 
)

Serialize the metric.

◆ TranslationLength()

size_t TranslationLength ( ) const
inline

Get the translation length.

Definition at line 119 of file bleu.hpp.


The documentation for this class was generated from the following file:
  • /home/ryan/src/mlpack.org/_src/mlpack-git/src/mlpack/core/metrics/bleu.hpp