abbreviate               package:base               R Documentation

_A_b_b_r_e_v_i_a_t_e _S_t_r_i_n_g_s

_D_e_s_c_r_i_p_t_i_o_n:

     Abbreviate strings to at least 'minlength' characters, such that
     the abbreviations remain _unique_ (if the original strings were).

_U_s_a_g_e:

     abbreviate(names.arg, minlength = 4, use.classes = TRUE,
                dot = FALSE, method = c("left.kept", "both.sides"))

_A_r_g_u_m_e_n_t_s:

names.arg: a character vector of names to be abbreviated, or an object
          to be coerced to a character vector by 'as.character'.

minlength: the minimum length of the abbreviations.

use.classes: logical (currently ignored by R).

     dot: logical: should a dot ('"."') be appended?

  method: a string specifying the method used with default
          '"left.kept"', see 'Details' below.

_D_e_t_a_i_l_s:

     The algorithm used ('method = "left.kept"') is similar to that of
     S.  For a single string it works as follows.  First, all spaces at
     the beginning of the string are stripped.  Then (if necessary) any
     other spaces are stripped.  Next, lower case vowels are removed
     (starting at the right), followed by lower case consonants.
     Finally, if the abbreviation is still longer than 'minlength',
     upper case letters are stripped.

     Characters are always stripped from the end of the word first. If
     an element of 'names.arg' contains more than one word (words are
     separated by space) then at least one letter from each word will
     be retained.
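
     As a minimal sketch of the steps above, the call below abbreviates
     a single multi-word string.  The input string and the target
     length of 6 are illustrative assumptions, not taken from this
     page; the exact abbreviation is printed rather than stated here.

```r
# Illustrative input (an assumption, not from the original examples)
long <- "International Business Machines"

# Ask for an abbreviation of at least 6 characters
short <- abbreviate(long, minlength = 6)
print(short)

# The result is shorter than the input and still starts with "I"
nchar(short) < nchar(long)
substr(short, 1, 1) == "I"
```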

     Missing ('NA') values are unaltered.
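
     A short sketch of the 'NA' behaviour, using a made-up input
     vector (the values are assumptions chosen for illustration):

```r
# Illustrative vector with one missing value
x <- c("Massachusetts", NA, "Mississippi")
ab <- abbreviate(x, minlength = 4)

# The NA element stays NA; the other elements are abbreviated
is.na(ab)
```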

     If 'use.classes' is 'FALSE' then the only distinction is meant to
     be between letters and space.  This has NOT been implemented.

_V_a_l_u_e:

     A character vector containing abbreviations for the strings in its
     first argument.  Duplicates in the original 'names.arg' will be
     given identical abbreviations.  If any non-duplicated elements
     have the same 'minlength' abbreviations then, if 'method =
     "both.sides"', the basic internal 'abbreviate()' algorithm is
     applied to the characterwise _reversed_ strings.  If there are
     still duplicated abbreviations, 'minlength' is incremented by one
     and new abbreviations are found for those elements only.  This
     process is repeated until all unique elements of 'names.arg' have
     unique abbreviations.
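
     The uniqueness rules above can be checked with a small sketch.
     The input vector is an illustrative assumption: it repeats
     '"abcd"' and adds '"abce"', which differs only in its last letter.

```r
# Illustrative input: one true duplicate and one near-duplicate
x <- c("abcd", "efgh", "abce", "abcd")
ab <- abbreviate(x, minlength = 2)

# Duplicated inputs share one abbreviation
ab[1] == ab[4]

# Distinct inputs end up with distinct abbreviations
anyDuplicated(ab[!duplicated(x)]) == 0

# The same check for method = "both.sides"
ab2 <- abbreviate(x, minlength = 2, method = "both.sides")
anyDuplicated(ab2[!duplicated(x)]) == 0
```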

     The character version of 'names.arg' is attached to the returned
     value as a names attribute: no other attributes are retained.
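
     A minimal sketch of how the names on the result behave, assuming
     a small factor as input (the values are illustrative only):

```r
# Illustrative input; a factor is coerced with as.character()
f <- factor(c("Wednesday", "Thursday"))
ab <- abbreviate(f, minlength = 3)

# The original strings are kept as the names of the result
identical(names(ab), as.character(f))

# Drop the names to get a plain character vector
unname(ab)
```

     Using 'unname()' is one way to discard the original strings when
     only the abbreviations are needed.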

_W_a_r_n_i_n_g:

     This is really only suitable for English, and does not work
     correctly with non-ASCII characters in multibyte locales.  It will
     warn if used with non-ASCII characters.

_S_e_e _A_l_s_o:

     'substr'.

_E_x_a_m_p_l_e_s:

     ## Three strings, two of which share a prefix; ask for 2 characters
     x <- c("abcd", "efgh", "abce")
     abbreviate(x, 2)

     ## Abbreviate the 50 US state names to at least 2 characters
     (st.abb <- abbreviate(state.name, 2))
     table(nchar(st.abb))  # out of 50, 3 need 4 letters

     ## method = "both.sides" helps: no 4-letters, and only 4 3-letters:
     st.ab2 <- abbreviate(state.name, 2, method = "both.sides")
     table(nchar(st.ab2))
     ## Compare the two methods:
     cbind(st.abb, st.ab2)

