Pandas coalesce

Add a column to the dataframe.

Intended to be the method-chaining alternative to:

df[column_name] = value
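To make the method-chaining point concrete, here is a minimal pandas-only sketch. `add_column_sketch` is a hypothetical stand-in written for this illustration, not the library function; it only assumes that the real method copies the frame and returns it, as the source listing on this page shows.

```python
import pandas as pd


def add_column_sketch(df, column_name, value):
    """Hypothetical stand-in for add_column: copy, assign, return."""
    out = df.copy()
    out[column_name] = value
    return out


df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})

# In-place assignment is a statement, so it cannot sit inside a chain.
assigned = df.copy()
assigned["c"] = 1

# A function that returns the new frame can be chained with other methods.
chained = (
    df.pipe(add_column_sketch, column_name="c", value=1)
    .rename(columns={"a": "A"})
)

print(list(chained.columns))
print("c" in df.columns)  # the original frame is left untouched
```

Because the helper works on a copy, `df` keeps its two original columns while `chained` carries the new one.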

Example: Add a column of constant values to the dataframe.

>>> import pandas as pd
>>> import janitor
>>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
>>> df.add_column(column_name="c", value=1)
   a  b  c
0  0  a  1
1  1  b  1
2  2  c  1

Example: Add a column of different values to the dataframe.

>>> import pandas as pd
>>> import janitor
>>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
>>> df.add_column(column_name="c", value=list("efg"))
   a  b  c
0  0  a  e
1  1  b  f
2  2  c  g

Example: Add a column using an iterator.

>>> import pandas as pd
>>> import janitor
>>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
>>> df.add_column(column_name="c", value=range(4, 7))
   a  b  c
0  0  a  4
1  1  b  5
2  2  c  6

Parameters:

df (DataFrame, required): A pandas DataFrame.

column_name (str, required): Name of the new column. Should be a string, in order for the column name to be compatible with the Feather binary format (this is a useful thing to have).

value (Union[List[Any], Tuple[Any], Any], required): Either a single value, or a list/tuple of values.

fill_remaining (bool, default False): If value is a tuple or list that is smaller than the number of rows in the DataFrame, repeat the list or tuple (R-style) to the end of the DataFrame.
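The R-style recycling behind fill_remaining can be sketched in plain Python. `recycle` is a hypothetical helper for illustration only; it mirrors the repeat-then-truncate arithmetic in the source listing, with `math.ceil` swapped in for `np.ceil` so the sketch needs no third-party imports.

```python
import math


def recycle(values, nrows):
    """Hypothetical helper: repeat `values` until it covers `nrows` items."""
    # Number of whole repetitions needed to reach at least `nrows` items.
    times_to_loop = math.ceil(nrows / len(values))
    # Repeat, then trim the overshoot so the result has exactly `nrows` items.
    return (list(values) * times_to_loop)[:nrows]


print(recycle(["x", "y"], 5))  # ['x', 'y', 'x', 'y', 'x']
print(recycle([7], 3))         # [7, 7, 7]
```

A two-element list over five rows is repeated three times and then cut to five items, so the last repetition is only partly used.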

Returns:

DataFrame: A pandas DataFrame with an added column.

Exceptions:

ValueError: If attempting to add a column that already exists.

ValueError: If value has more elements than the number of rows in the DataFrame.

ValueError: If attempting to add an iterable of values with a length not equal to the number of DataFrame rows.

ValueError: If value has length of 0.
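Which of the length-related errors fires depends on the order of the checks. The sketch below restates that order as read from the source listing on this page; `first_length_error` is a hypothetical name and it returns short labels rather than raising, so it is an illustration of the logic and not the library's API.

```python
def first_length_error(len_value, nrows, fill_remaining=False):
    """Return a label for the first length check that fails, else None."""
    if len_value > nrows:
        return "more elements than rows"
    if len_value != nrows and not fill_remaining:
        return "length not equal to rows"
    if not len_value:
        return "length of 0"
    return None


print(first_length_error(4, 3))                       # more elements than rows
print(first_length_error(2, 3))                       # length not equal to rows
print(first_length_error(2, 3, fill_remaining=True))  # None
print(first_length_error(0, 3))                       # length not equal to rows
print(first_length_error(0, 3, fill_remaining=True))  # length of 0
```

Under this ordering, an empty list on a non-empty DataFrame reaches the zero-length check only when fill_remaining is True; otherwise the length-mismatch check catches it first.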

Source code in janitor/functions/add_columns.py

# Imports assumed from the names used below; they are not part of the
# original listing.
from typing import Any, List, Tuple, Union

import numpy as np
import pandas as pd
import pandas_flavor as pf

from janitor.utils import check, deprecated_alias


@pf.register_dataframe_method
@deprecated_alias(col_name="column_name")
def add_column(
    df: pd.DataFrame,
    column_name: str,
    value: Union[List[Any], Tuple[Any], Any],
    fill_remaining: bool = False,
) -> pd.DataFrame:
    """Add a column to the dataframe.

    Intended to be the method-chaining alternative to:

    ```python
    df[column_name] = value
    ```

    Example: Add a column of constant values to the dataframe.

        >>> import pandas as pd
        >>> import janitor
        >>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
        >>> df.add_column(column_name="c", value=1)
           a  b  c
        0  0  a  1
        1  1  b  1
        2  2  c  1

    Example: Add a column of different values to the dataframe.

        >>> import pandas as pd
        >>> import janitor
        >>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
        >>> df.add_column(column_name="c", value=list("efg"))
           a  b  c
        0  0  a  e
        1  1  b  f
        2  2  c  g

    Example: Add a column using an iterator.

        >>> import pandas as pd
        >>> import janitor
        >>> df = pd.DataFrame({"a": list(range(3)), "b": list("abc")})
        >>> df.add_column(column_name="c", value=range(4, 7))
           a  b  c
        0  0  a  4
        1  1  b  5
        2  2  c  6

    :param df: A pandas DataFrame.
    :param column_name: Name of the new column. Should be a string, in order
        for the column name to be compatible with the Feather binary
        format (this is a useful thing to have).
    :param value: Either a single value, or a list/tuple of values.
    :param fill_remaining: If value is a tuple or list that is smaller than
        the number of rows in the DataFrame, repeat the list or tuple
        (R-style) to the end of the DataFrame.
    :returns: A pandas DataFrame with an added column.
    :raises ValueError: If attempting to add a column that already exists.
    :raises ValueError: If `value` has more elements than number of
        rows in the DataFrame.
    :raises ValueError: If attempting to add an iterable of values with
        a length not equal to the number of DataFrame rows.
    :raises ValueError: If `value` has length of `0`.
    """
    check("column_name", column_name, [str])

    if column_name in df.columns:
        raise ValueError(
            f"Attempted to add column that already exists: "
            f"{column_name}."
        )

    nrows = len(df)

    if hasattr(value, "__len__") and not isinstance(
        value, (str, bytes, bytearray)
    ):
        len_value = len(value)

        # if `value` is a list, ndarray, etc.
        if len_value > nrows:
            raise ValueError(
                "`value` has more elements than number of rows "
                f"in your `DataFrame`. vals: {len_value}, "
                f"df: {nrows}"
            )
        if len_value != nrows and not fill_remaining:
            raise ValueError(
                "Attempted to add iterable of values with length"
                " not equal to number of DataFrame rows"
            )
        if not len_value:
            raise ValueError(
                "`value` has to be an iterable of minimum length 1"
            )

    elif fill_remaining:
        # relevant if a scalar val was passed, yet fill_remaining == True
        len_value = 1
        value = [value]

    df = df.copy()
    if fill_remaining:
        times_to_loop = int(np.ceil(nrows / len_value))
        fill_values = list(value) * times_to_loop
        df[column_name] = fill_values[:nrows]
    else:
        df[column_name] = value

    return df
