module Linguistics::EN::Pluralization

Plural inflection methods for the English-language Linguistics module.

It provides conversion of plural forms of all nouns, most verbs, and some adjectives. It also provides “classical” variants (for example: “brother” -> “brethren”, “dogma” -> “dogmata”, etc.) where appropriate.

Constants

PL_adj_poss
PL_adj_poss_h
PL_adj_special
PL_adj_special_h
PL_count_one
PL_count_zero
PL_prep
PL_pron_acc
PL_pron_acc_h
PL_pron_nom
PL_pron_nom_h
PL_sb_C_a_ae

Classical “..a” -> “..ae”

PL_sb_C_a_ata

Classical “..a” -> “..ata”

PL_sb_C_en_ina

Classical “..en” -> “..ina”

PL_sb_C_ex_ices

Classical “..[ei]x” -> “..ices”

PL_sb_C_i

Arabic: “..” -> “..i”

PL_sb_C_im

Hebrew: “..” -> “..im”

PL_sb_C_ix_ices
PL_sb_C_o_i
PL_sb_C_o_i_a

Classical “..o” -> “..i” (but normally -> “..os”)

PL_sb_C_on_a

Classical “..on” -> “..a”

PL_sb_C_um_a

Classical “..um” -> “..a”

PL_sb_C_us_i

Classical “..us” -> “..i”

PL_sb_C_us_us

Classical “..us” -> “..us” (assimilated 4th declension latin nouns)

PL_sb_U_a_ae

Unconditional “..a” -> “..ae”

PL_sb_U_ex_ices

Unconditional “..[ei]x” -> “..ices”

PL_sb_U_ix_ices
PL_sb_U_man_mans

Unconditional “..man” -> “..mans”

PL_sb_U_o_os

Always “..o” -> “..os”

PL_sb_U_on_a

Unconditional “..on” -> “a”

PL_sb_U_um_a

Unconditional “..um” -> “..a”

PL_sb_U_us_i

Unconditional “..us” -> “i”

PL_sb_general
PL_sb_irregular
PL_sb_irregular_h
PL_sb_irregular_s

Plurals

PL_sb_military
PL_sb_postfix_adj
PL_sb_prep_compound
PL_sb_prep_dual_compound
PL_sb_singular_s

Singular words ending in …s (all inflect with …es)

PL_sb_uninflected
PL_sb_uninflected_herd

Don't inflect in classical mode, otherwise normal inflection

PL_sb_uninflected_s
PL_v_ambiguous_non_pres
PL_v_ambiguous_pres
PL_v_ambiguous_pres_h
PL_v_irregular_non_pres
PL_v_irregular_pres
PL_v_irregular_pres_h
PL_v_special_s

Private Class Methods

matchgroup( *parts ) click to toggle source

Utility function for creating Regexp unions

# File lib/linguistics/en/pluralization.rb, line 17
def self::matchgroup( *parts )
        return Regexp.union( *(parts.flatten) )
end

Public Instance Methods

plural( count=2 ) click to toggle source

Return the plural of the given phrase if count indicates it should be plural.

# File lib/linguistics/en/pluralization.rb, line 399
def plural( count=2 )
        phrase = if self.respond_to?( :to_int )
                        self.numwords
                else
                        self.to_s
                end

        self.log.debug "Pluralizing %p" % [ phrase ]
        pre = text = post = nil

        # If the string has whitespace, only pluralize the middle bit, but
        # preserve the whitespace to add back to the result.
        if md = /\A(\s*)(.+?)(\s*)\Z/.match( phrase.to_s )
                pre, text, post = md.captures
        else
                return phrase
        end

        plural = postprocess( text,
                pluralize_special_adjective(text, count) ||
                pluralize_special_verb(text, count) ||
                pluralize_noun(text, count) )

        return pre + plural + post
end
plural_adj( count=2 )
Alias for: plural_adjective
plural_adjective( count=2 ) click to toggle source

Return the plural of the given adjectival phrase if count indicates it should be plural.

# File lib/linguistics/en/pluralization.rb, line 461
def plural_adjective( count=2 )
        phrase = self.to_s
        md = /\A(\s*)(.+?)(\s*)\Z/.match( phrase )
        pre, word, post = md.captures

        return phrase if word.nil? or word.empty?

        plural = postprocess( word, pluralize_special_adjective(word, count) || word )

        return pre + plural + post
end
Also aliased as: plural_adj
plural_noun( count=2 ) click to toggle source

Return the plural of the given noun phrase if count indicates it should be plural.

# File lib/linguistics/en/pluralization.rb, line 429
def plural_noun( count=2 )
        phrase = self.to_s
        md = /\A(\s*)(.+?)(\s*)\Z/.match( phrase )
        pre, word, post = md.captures

        return phrase if word.nil? or word.empty?

        plural = postprocess( word, pluralize_noun(word, count) )

        return pre + plural + post
end
plural_verb( count=2 ) click to toggle source

Return the plural of the given verb phrase if count indicates it should be plural.

# File lib/linguistics/en/pluralization.rb, line 444
def plural_verb( count=2 )
        phrase = self.to_s
        md = /\A(\s*)(.+?)(\s*)\Z/.match( phrase )
        pre, word, post = md.captures

        return phrase if word.nil? or word.empty?

        plural = postprocess( word,
                pluralize_special_verb(word, count) ||
                pluralize_general_verb(word, count) )

        return pre + plural + post
end

Private Instance Methods

normalize_count( count, default=2 ) click to toggle source

Normalize a count to either 1 or 2 (singular or plural)

# File lib/linguistics/en/pluralization.rb, line 511
def normalize_count( count, default=2 )
        return default if count.nil? # Default to plural
        if /^(#{PL_count_one})$/i =~ count.to_s ||
           ( Linguistics::EN.classical? && /^(#{PL_count_zero})$/ =~ count.to_s )
                return 1
        else
                return default
        end
end
pluralize_general_verb( word, count ) click to toggle source

Pluralize regular verbs

# File lib/linguistics/en/pluralization.rb, line 739
def pluralize_general_verb( word, count )
        count = normalize_count( count )

        return word if /^(#{PL_count_one})$/i =~ count.to_s

        case word

        # Handle ambiguous present tenses  (simple and compound)
        when /^(#{PL_v_ambiguous_pres})((\s.*)?)$/i
                return PL_v_ambiguous_pres_h[ $1.downcase ] + $2

        # Handle ambiguous preterite and perfect tenses
        when /^(#{PL_v_ambiguous_non_pres})((\s.*)?)$/i
                return word

        # Otherwise, 1st or 2nd person is uninflected
        else
                return word
        end
end
pluralize_noun( word, count=2 ) click to toggle source

Pluralize nouns

# File lib/linguistics/en/pluralization.rb, line 523
def pluralize_noun( word, count=2 )
        self.log.debug "Trying to pluralize %p as a noun" % [ word ]

        value = nil
        count = normalize_count( count )

        return word if count == 1

        # Handle user-defined nouns
        #if value = ud_match( word, PL_sb_user_defined )
        #     return value
        #end

        # Handle empty word, singular count and uninflected plurals
        case word
        when ''
                self.log.debug "  empty string"
                return word
        when /^(#{PL_sb_uninflected})$/i
                self.log.debug "  uninflected plural"
                return word
        else
                if Linguistics::EN.classical? && /^(#{PL_sb_uninflected_herd})$/i =~ word
                        self.log.debug "  uninflected classical herd word"
                        return word
                end
        end

        # Handle compounds ("Governor General", "mother-in-law", "aide-de-camp", etc.)
        case word
        when /^(?:#{PL_sb_postfix_adj})$/i
                value = $2
                noun = $1
                self.log.debug "  postfixed adjectival compound noun phrase (#{value} -> #{noun})"
                return pluralize_noun( noun, 2 ) + value

        when /^(?:#{PL_sb_prep_dual_compound})$/i
                noun = $1
                value = [ $2, $3 ]
                self.log.debug "  prepositional dual compound noun phrase (%s -> %s %s)" %
                        [ noun, *value ]
                return pluralize_noun( noun, 2 ) + value[0] + pluralize_noun( value[1] )

        when /^(?:#{PL_sb_prep_compound})$/i
                noun = $1
                value = $2
                self.log.debug "  prepositional singular compound noun phrase (%s -> %s)" %
                        [ noun, value ]
                return pluralize_noun( noun, 2 ) + value

        # Handle pronouns
        when /^((?:#{PL_prep})\s+)(#{PL_pron_acc})$/i
                prep, pron = $1, $2
                self.log.debug "  prepositional pronoun phrase (%p + %p)" % [ prep, pron ]
                return prep + PL_pron_acc_h[ pron.downcase ]

        when /^(#{PL_pron_nom})$/i
                pron = $1
                self.log.debug "  nominative pronoun; using PL_pron_nom table"
                return PL_pron_nom_h[ word.downcase ]

        when /^(#{PL_pron_acc})$/i
                self.log.debug "  accusative pronoun; using PL_pron_acc table"
                return PL_pron_acc_h[ word.downcase ]

        # Handle isolated irregular plurals
        when /(.*)\b(#{PL_sb_irregular})$/i
                prefix, word = $1, $2
                self.log.debug "  isolated irregular; using PL_sb_irregular_h table"
                return prefix + PL_sb_irregular_h[ word.downcase ]

        # Unconditional ...man -> ...mans
        when /(#{PL_sb_U_man_mans})$/i
                word = $1
                self.log.debug "  unconditional man -> mans (%p)" % [ word ]
                return "#{word}s"

        # Handle families of irregular plurals
        when /(.*)man$/i then                  return "#{$1}men"
        when /(.*[ml])ouse$/i then             return "#{$1}ice"
        when /(.*)goose$/i then                return "#{$1}geese"
        when /(.*)tooth$/i then                return "#{$1}teeth"
        when /(.*)foot$/i then                 return "#{$1}feet"

        # Handle unassimilated imports
        when /(.*)ceps$/i then                 return word
        when /(.*)zoon$/i then                 return "#{$1}zoa"
        when /(.*[csx])is$/i then              return "#{$1}es"
        when /(#{PL_sb_U_ex_ices})ex$/i then   return "#{$1}ices"
        when /(#{PL_sb_U_ix_ices})ix$/i then   return "#{$1}ices"
        when /(#{PL_sb_U_um_a})um$/i then      return "#{$1}a"
        when /(#{PL_sb_U_us_i})us$/i then      return "#{$1}i"
        when /(#{PL_sb_U_on_a})on$/i then      return "#{$1}a"
        when /(#{PL_sb_U_a_ae})$/i then        return "#{$1}e"
        end


        # Handle incompletely assimilated imports in classical mode
        if Linguistics::EN.classical?
                self.log.debug "  checking for classical incompletely assimilated imports"
                case word
                when /(.*)trix$/i then               return "#{$1}trices"
                when /(.*)eau$/i then                return "#{$1}eaux"
                when /(.*)ieu$/i then                return "#{$1}ieux"
                when /(.{2,}[yia])nx$/i then         return "#{$1}nges"
                when /(#{PL_sb_C_en_ina})en$/i then  return "#{$1}ina"
                when /(#{PL_sb_C_ex_ices})ex$/i then return "#{$1}ices"
                when /(#{PL_sb_C_ix_ices})ix$/i then return "#{$1}ices"
                when /(#{PL_sb_C_um_a})um$/i then    return "#{$1}a"
                when /(#{PL_sb_C_us_i})us$/i then    return "#{$1}i"
                when /(#{PL_sb_C_us_us})$/i then     return "#{$1}"
                when /(#{PL_sb_C_a_ae})$/i then      return "#{$1}e"
                when /(#{PL_sb_C_a_ata})a$/i then    return "#{$1}ata"
                when /(#{PL_sb_C_o_i})o$/i then      return "#{$1}i"
                when /(#{PL_sb_C_on_a})on$/i then    return "#{$1}a"
                when /#{PL_sb_C_im}$/i then          return "#{word}im"
                when /#{PL_sb_C_i}$/i then           return "#{word}i"
                end
        end


        # Handle singular nouns ending in ...s or other silibants
        case word
        when /^(#{PL_sb_singular_s})$/i then    return "#{$1}es"
        when /^([A-Z].*s)$/ then                return "#{$1}es"
        when /(.*)([cs]h|[zx])$/i then          return "#{$1}#{$2}es"
        # when /(.*)(us)$/i then                return "#{$1}#{$2}es"

        # Handle ...f -> ...ves
        when /(.*[eao])lf$/i then              return "#{$1}lves"
        when /(.*[^d])eaf$/i then              return "#{$1}eaves"
        when /(.*[nlw])ife$/i then             return "#{$1}ives"
        when /(.*)arf$/i then                  return "#{$1}arves"

        # Handle ...y
        when /(.*[aeiou])y$/i then             return "#{$1}ys"
        when /(.*)Secretary$/ then             return "#{$1}Secretaries"
        when /([A-Z].*y)$/ then                return "#{$1}s"
        when /(.*)y$/i then                    return "#{$1}ies"

        # Handle ...o
        when /#{PL_sb_U_o_os}$/i then          return "#{word}s"
        when /[aeiou]o$/i then                 return "#{word}s"
        when /o$/i then                        return "#{word}es"

        # Otherwise just add ...s
        else
                self.log.debug "  appears to be regular; adding +s"
                return "#{word}s"
        end
end
pluralize_special_adjective( word, count ) click to toggle source

Handle special adjectives

# File lib/linguistics/en/pluralization.rb, line 762
def pluralize_special_adjective( word, count )
        self.log.debug "Trying to pluralize %p as a special adjective..." % [ word ]
        count ||= 1
        count = normalize_count( count )

        if /^(#{PL_count_one})$/i =~ count.to_s
                self.log.debug "  it's a single-count word; aborting"
                return nil
        end

        # Handle user-defined verbs
        #if value = ud_match( word, PL_adj_user_defined )
        #     return value
        #end

        case word

        # Handle known cases
        when /^(#{PL_adj_special})$/i
                key = $1.downcase
                self.log.debug "  yep, it's a special plural adjective (%p)" % [ key ]
                return PL_adj_special_h[ key ]

        # Handle possessives
        when /^(#{PL_adj_poss})$/i
                key = $1.downcase
                self.log.debug "  it's a special possessive adjective (%p)" % [ key ]
                return PL_adj_poss_h[ $1.downcase ]

        when /^(.*)'s?$/
                pl = $1.en.plural_noun( count )
                self.log.debug "  it has an apostrophe (%p); using generic possessive rules" % [ pl ]
                if /s$/ =~ pl
                        return "#{pl}'"
                else
                        return "#{pl}'s"
                end

        # Otherwise, no idea
        else
                self.log.debug "  nope."
                return nil
        end
end
pluralize_special_verb( word, count ) click to toggle source

Pluralize special verbs

# File lib/linguistics/en/pluralization.rb, line 678
def pluralize_special_verb( word, count )
        self.log.debug "Trying to pluralize %p as a special verb..." % [ word ]
        count ||= 1
        count = normalize_count( count )

        if /^(#{PL_count_one})$/i =~ count.to_s
                self.log.debug "  it's a single-count word, returning it unchanged."
                return word # :FIXME: should this return nil instead?
                # return nil
        end

        # Handle user-defined verbs
        #if value = ud_match( word, PL_v_user_defined )
        #     return value
        #end

        case word

        # Handle irregular present tense (simple and compound)
        when /^(#{PL_v_irregular_pres})((\s.*)?)$/i
                key = $1.downcase
                self.log.debug "  yep, it's an irregular present tense verb (%p)" % [ key ]
                return PL_v_irregular_pres_h[ $1.downcase ] + $2

        # Handle irregular future, preterite and perfect tenses
        when /^(#{PL_v_irregular_non_pres})((\s.*)?)$/i
                self.log.debug "  yep, it's an irregular non-present tense verb (%p)" % [ key ]
                return word

        # Handle special cases
        when /^(#{PL_v_special_s})$/
                self.log.debug "  it's a not special-case verb; aborting."
                return nil

        # Handle standard 3rd person (chop the ...(e)s off single words)
        when /^(.*)([cs]h|[x]|zz|ss)es$/i
                base, suffix = $1, $2
                self.log.debug "  it's a standard third-person verb (%p + %p)" % [ base, suffix ]
                return base + suffix
        when /^(..+)ies$/i
                verb = $1
                self.log.debug "  it's a standard third-person verb (%p + ies -> +y)" % [ verb ]
                return "#{verb}y"
        when /^(.+)oes$/i
                verb = $1
                self.log.debug "  it's a standard third-person verb (%p + oes -> +o)" % [ verb ]
                return "#{verb}o"
        when /^(.*[^s])s$/i
                verb = $1
                self.log.debug "  it's a standard third-person verb (%p + (^s)s -> -s)" % [ verb ]
                return verb

        # Otherwise, a regular verb (handle elsewhere)
        else
                self.log.debug "  nope. Either a regular verb or not a verb."
                return nil
        end
end
postprocess( original, inflected ) click to toggle source

Do normal/classical switching and match capitalization in inflected by examining the original input.

# File lib/linguistics/en/pluralization.rb, line 485
def postprocess( original, inflected )

        # If there's a classical variant, use it instead of the modern one if
        # classical mode is on.
        inflected.sub!( /([^|]+)\|(.+)/ ) do
                Linguistics::EN.classical? ? $2 : $1
        end

        # Try to duplicate the case of the original string
        case original
        when "I"
                return inflected
        when /^[A-Z]+$/
                return inflected.upcase
        when /^[A-Z]/
                # Can't use #capitalize, as it will downcase the rest of the string,
                # too.
                inflected[0,1] = inflected[0,1].upcase
                return inflected
        else
                return inflected
        end
end