Function reference
-
browse_dot_graph() - Generates a temporary HTML file with a visualization of the given nstandr procedures and opens it in a browser (specified in options('browser')). This requires the DiagrammeR package to be installed. If these are not available, you can cat() the string returned by nstandr:::make_dot_graph() to the R console and copy it into a web tool for dot visualization (e.g. http://magjac.com/graphviz-visual-editor).
-
check_rows() - Assumes that rows (if logical) are same length as x
-
cockburn_combabbrev() - Collapses single character sequences
-
cockburn_detect_corp() - Detect Corporates (code - 'firm')
-
cockburn_detect_govt() - Detect Government Organizations (Non-Corporates group)
-
cockburn_detect_hosp() - Detect Hospitals (Non-Corporates group)
-
cockburn_detect_indiv() - Detect Individuals (Non-Corporates group)
-
cockburn_detect_inst() - Detect Non-profit Institutes (Non-Corporates group)
-
cockburn_detect_inst_conds() - Detects Non-profit institutes with special conditions
-
cockburn_detect_inst_conds_1() - Detects Non-profit institutes with special conditions
-
cockburn_detect_inst_conds_2() - Detects Non-profit institutes with special conditions
-
cockburn_detect_inst_german() - Detects German Non-profit institutes
-
cockburn_detect_type() - Identifies Entity Type
-
cockburn_detect_univ() - Detect Universities (Non-Corporates group)
-
cockburn_detect_uspto() - Special USPTO codes. Codes as 'indiv'
-
cockburn_remove_standard_names() - Creates so called stem name (a name with all legal entity identifiers removed)
-
cockburn_remove_uspto() - Removes special USPTO codes.
-
cockburn_replace_compustat() - COMPUSTAT specific standardization for organizational names
-
cockburn_replace_compustat_names() - COMPUSTAT specific standardization for organizational names. Full name replacements.
-
cockburn_replace_derwent() - Performs Derwent standardization of organizational names
-
cockburn_replace_govt() - Cleanup Government Organizations (Non-Corporates group)
-
cockburn_replace_punctuation() - Removes punctuation and standardizes some symbols.
-
cockburn_replace_standard_names() - Create standard name
-
cockburn_replace_type() - Cleanup Entity Type
-
cockburn_replace_univ() - Cleanup Universities (Non-Corporates group)
-
defactor() - Defactor the object
-
defactor_vector() - Converts factor to character
-
detect_legal_form() - Detect legal form
-
detect_patterns() - Codes strings (e.g., organizational names) based on certain patterns
-
escape_regex() - Escapes characters that are special in regex
-
escape_regex_for_type() - Escapes special characters for different types of pattern
-
escape_regex_for_types() - Escapes characters that are special in regex, conditionally
-
get_dots() - Provides access to arguments of nested functions. Sort of an alternative mechanism to passing ... arguments but with more features.
-
get_standardize_options() - Gets standardize_options at point with consistent updates up through calling stack.
-
get_target() - Gets a target vector to standardize.
-
get_vector() - Gets a vector by column and defactors it if needed. Optionally one can provide a fallback_value, which will be returned if col is not specified.
-
inset_target() - Insets target vector back to input object (x)
-
is_empty() - Checks if string has something to print
-
magerman_condense() - Condensing names
-
magerman_detect_characters() - Detect candidates for characters that need to be cleaned
-
magerman_detect_comma_period_irregularities() - Detects comma period irregularities
-
magerman_detect_legal_form() - Detect legal form
-
magerman_detect_legal_form_beginning() - Detects legal form at the beginning of a name
-
magerman_detect_legal_form_end() - Detects legal form at the end of a name
-
magerman_detect_legal_form_middle() - Detects legal form in the middle of a name
-
magerman_detect_umlaut() - Detect umlauts
-
magerman_remove_common_words() - Remove common words
-
magerman_remove_common_words_anywhere() - Removes common words anywhere in a name
-
magerman_remove_common_words_at_the_beginning() - Removes common words at the beginning of a name
-
magerman_remove_common_words_at_the_end() - Removes common words at the end of a name
-
magerman_remove_double_quotation_marks_beginning_end() - Removes double quotation irregularities
-
magerman_remove_double_quotation_marks_irregularities() - Removes double quotation irregularities
-
magerman_remove_double_spaces() - Removes double spaces
-
magerman_remove_html_codes() - Removes html codes
-
magerman_remove_legal_form() - Removes legal form
-
magerman_remove_legal_form_and_clean() - Removes legal form
-
magerman_remove_non_alphanumeric_at_the_beginning() - Removes non alphanumeric characters at the beginning of a name
-
magerman_remove_non_alphanumeric_at_the_end() - Removes non alphanumeric characters at the end of a name
-
magerman_remove_special_characters() - Removes special characters
-
magerman_replace_accented_characters() - Replaces accented characters
-
magerman_replace_comma_period_irregularities() - Replaces comma period irregularities
-
magerman_replace_comma_period_irregularities_all() - Replaces comma and period irregularities
-
magerman_replace_legal_form_beginning() - Replaces legal form at the beginning of a name
-
magerman_replace_legal_form_end() - Replaces legal form at the end of a name
-
magerman_replace_legal_form_middle() - Replaces legal form in the middle of a name
-
magerman_replace_proprietary_characters() - Replaces proprietary characters
-
magerman_replace_sgml_characters() - Replaces sgml characters
-
magerman_replace_spelling_variation() - Replaces spelling variation
-
magerman_replace_umlaut() - Replaces Umlauts
-
make_dot_edges() - Makes dot graph edges for visualizing arrows between sequence of procedures.
-
make_dot_graph() - Generates graph description for visualizing list of procedures in dot format.
-
make_dot_nodes() - Generates description of dot graph nodes.
-
paste_dot_node() - Makes a dot node (as html table) from procedure's attributes.
-
paste_dot_node_tr_td() - Makes TR TD record for dot node TABLE
-
replace_patterns() - A wrapper for string replacement and cbinding some columns.
-
save_dot_graph_as() - Saves dot graph as file using system command 'dot' from GraphViz (https://graphviz.org/) if installed.
-
standardize() make_std_names() make_standard_names() nstand() - Standardizes organizational names. Takes either vector or column in the table.
-
standardize_cockburn() - Standardizes strings using exact procedures described in Cockburn, et al. (2009)
-
standardize_dehtmlize() - Converts HTML characters to UTF-8
-
standardize_detect_enc() - Detects string encoding
-
standardize_is_data_empty() - Checks if all elements in vector(s) are either "", NA, NULL or have zero length
-
standardize_magerman() - Standardizes strings using exact procedures described in Magerman et al. (2009)
-
standardize_make_procedures_list() - Makes list of procedures calls from table.
-
standardize_omit_empty() - Removes elements that are either "", NA, NULL or have zero length
-
standardize_options() - Does nothing but stores (as its own default arguments) options that control vector handling through the standardization process. These options are available in most nstandr functions that accept a ... parameter.
-
standardize_remove_brackets() - Removes brackets and content in brackets
-
standardize_remove_quotes() - Removes double quotes
-
standardize_squish_spaces() - Removes redundant whitespace
-
standardize_toascii() - Translates non-ASCII symbols to their ASCII equivalents
-
standardize_toupper() - Uppercases vector of interest in the object (table)
-
standardize_x_split() - Splits the object (table) in chunks by rows
-
unlist_if_possible() - If a column in the x table is a list, unlists it if possible
-
visualize() - Visualizes list of procedures.
-
x_length() - Gets lengths of the object
-
x_width() - Gets width of the object
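
To give a feel for the kind of string operations several of the helpers above describe (removing bracketed content, squishing whitespace, uppercasing, escaping regex characters), here is a minimal sketch in base R. The sketch_* functions are hypothetical stand-ins written for illustration only; they are not the nstandr implementations, and the real functions may take different arguments and handle tables, encodings and options that this sketch ignores.

```r
# Hypothetical base-R stand-ins, NOT the nstandr functions themselves.

# Drop round brackets together with their content.
sketch_remove_brackets <- function(x) gsub("\\([^)]*\\)", "", x)

# Collapse runs of whitespace and trim both ends.
sketch_squish_spaces <- function(x) trimws(gsub("\\s+", " ", x))

# Prefix regex metacharacters with a backslash so a name can be
# matched literally inside a pattern.
sketch_escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

org_names <- c("  Acme   Widgets (Holdings)  Ltd. ", "Foo  Bar GmbH")

# Chain the steps in a fixed order, similar in spirit to a list of procedures.
cleaned <- toupper(sketch_squish_spaces(sketch_remove_brackets(org_names)))
print(cleaned)
# [1] "ACME WIDGETS LTD." "FOO BAR GMBH"

# An escaped name matches itself literally; "." no longer matches any character.
grepl(sketch_escape_regex("Co."), c("Acme Co.", "Acme Corp"))
# [1]  TRUE FALSE
```

The order of steps matters here: brackets are removed before whitespace is squished, so the gap left by the removed text is collapsed in the same pass.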