data (module)
Submodules
-
data.dataloader.combine_coordinators(data)
Function to combine the two coordinator rows into a single row.
Returns: - data : DataFrame
DataFrame with combined coordinator rows
-
data.dataloader.correct_corrections(df, corrections)
Function to correct data entry errors.
Parameters: - df: DataFrame
- corrections: dictionary
nested dictionary, outermost level first:
  key: participant number; value: dictionary
    key: column; value: dictionary
      key: incorrect value; value: dictionary
        key: string, “value” or “column”
        value: anything that can go in a DataFrame cell (a value or a column name)
Returns: - df2: DataFrame
updated DataFrame
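The nested `corrections` dictionary described above might look like the following. This is a minimal sketch, not the module's implementation: the helper `apply_corrections_sketch`, the `participant` and `intended_target` column names, and the sample values are all hypothetical. It also assumes that a “value” entry supplies a literal replacement while a “column” entry names another column to copy from, which the documentation does not state explicitly.

```python
import pandas as pd

# Hypothetical corrections: participant -> column -> incorrect value -> fix
corrections = {
    3: {
        "step": {
            11: {"value": 1},  # replace 11 with the literal value 1
        },
        "target": {
            "nsoe": {"column": "intended_target"},  # copy from another column
        },
    }
}


def apply_corrections_sketch(df, corrections, participant_col="participant"):
    """Illustrative only: return a corrected copy of df."""
    df2 = df.copy()
    for participant, columns in corrections.items():
        for column, fixes in columns.items():
            for wrong, fix in fixes.items():
                # Rows for this participant that hold the incorrect value
                mask = (df2[participant_col] == participant) & (df2[column] == wrong)
                if "value" in fix:
                    df2.loc[mask, column] = fix["value"]
                elif "column" in fix:
                    df2.loc[mask, column] = df2.loc[mask, fix["column"]]
    return df2


df = pd.DataFrame({
    "participant": [3, 3, 4],
    "step": [11, 2, 11],
    "target": ["nsoe", "mouth", "nsoe"],
    "intended_target": ["nose", "mouth", "nose"],
})
df2 = apply_corrections_sketch(df, corrections)
```

In this sketch only participant 3's rows change; participant 4's identical values are left alone because the corrections are keyed by participant.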
-
data.dataloader.correct_targets(df, targets)
Function to update targets that were steamrolled during data collection. 🧖
Parameters: - df: DataFrame
- targets: dictionary or string
dictionary:
  key: numeric step
  value: string target
string:
  URL to JSON with respective labels “number” and “target” for key and value
Returns: - df: DataFrame
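A step-to-target mapping of the kind described above can be applied with pandas as sketched below. This illustrates the idea only and is not the function's actual code: the `step` and `target` column names, the sample values, and the list-of-records JSON layout are assumptions.

```python
import pandas as pd

# Hypothetical records as they might be decoded from the JSON form,
# using the "number" and "target" labels named in the documentation
records = [
    {"number": 1, "target": "nose"},
    {"number": 2, "target": "mouth"},
]

# Equivalent dictionary form: step -> target
targets = {record["number"]: record["target"] for record in records}

df = pd.DataFrame({"step": [1, 2, 3], "target": ["wrong", "mouth", "ear"]})

# Look up each row's step; keep the existing target where the step is not listed
fixed = df.copy()
fixed["target"] = fixed["step"].map(targets).fillna(fixed["target"])
```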
-
data.dataloader.count_ontarget_samples(df, human_readable=False)
Function to count usable samples.
Parameters: - df: DataFrame
- human_readable: Boolean, optional
default=False
Returns: - ontarget_counts: DataFrame
MultiIndexed if human_readable, otherwise “step” by “participant”
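One way to build a “step” by “participant” table of on-target counts is sketched below. It is an illustration under assumptions, not the function's implementation: the `participant`, `step`, and `ontarget` column names and the sample data are hypothetical, and the MultiIndexed `human_readable` layout is not reproduced.

```python
import pandas as pd

df = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2],
    "step": [1, 1, 2, 1, 2],
    "ontarget": [True, True, False, True, True],
})

# Keep on-target rows, then count them per step (rows) and participant (columns)
ontarget_counts = (
    df[df["ontarget"]]
    .groupby(["step", "participant"])
    .size()
    .unstack(fill_value=0)
)
```

Combinations with no on-target rows are filled with 0 here so that every cell holds a count.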
-
data.dataloader.dropX(df, X=['X', 'x'])
Function to drop data that has been annotated for removal.
Parameters: - df: DataFrame
- X: list of strings, optional
values in the notes that mark an iteration to be dropped, default = ['X', 'x']
Returns: - df: DataFrame
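The filtering described above can be sketched with a boolean mask. This is a minimal illustration, assuming the annotations live in a column named `notes`; the column name and sample rows are hypothetical and the real function may locate the markers differently.

```python
import pandas as pd

df = pd.DataFrame({
    "step": [1, 2, 3, 4],
    "notes": ["", "X", "good sample", "x"],
})

X = ["X", "x"]  # default markers from the documented signature

# Keep only the rows whose note is not one of the drop markers
kept = df[~df["notes"].isin(X)].reset_index(drop=True)
```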
-
data.dataloader.index_participants(df, starting_index=1)
Function to index participants.
Returns: - participants_df: DataFrame
-
data.dataloader.load_from_firebase(dbURL='https://tingle-pilot-collected-data.firebaseio.com/', notes=False, start=None, stop=None, combine=False, marked=False)
Function to load data from Firebase. Requires [Firebase service account credentials](https://console.firebase.google.com/project/tingle-pilot-collected-data/settings/serviceaccounts/adminsdk) in JSON format.
Parameters: - dbURL : string (optional)
Firebase database to pull data from
- notes : Boolean (optional)
Return notes as well as data?
- start : date or datetime (optional)
start time of data to include (e.g., datetime.date(2018,3,6))
- stop : date or datetime (optional)
stop time of data to include (e.g., datetime.date(2018,3,6))
- combine : Boolean (optional)
combine coordinators into a single row?
- marked : Boolean (optional)
only include ontarget==True rows?
Returns: - data : DataFrame
Pandas DataFrame of data from Firebase
- notes : DataFrame (optional)
Pandas DataFrame of notes from Firebase, returned only if parameter notes==True
-
data.dataloader.lookup_counts(row, lookup_table, index='step', columns='participant', default=False)
Function to apply to a DataFrame to cross-reference counts in a lookup_table.
Parameters: - row: Series
row of a DataFrame
- lookup_table: DataFrame
DataFrame to cross-reference
- index: string or numeric, optional
name of column in row that contains an index value for lookup_table, default = “step”
- columns: string or numeric, optional
name of column in row that contains a column name for lookup_table, default = “participant”
- default: boolean or other, optional
value to return if the lookup is not in lookup_table, default = False
Returns: - value: boolean or other
the value at index, columns; otherwise default
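The row-wise cross-reference described above can be sketched as follows. `lookup_counts_sketch` is a hypothetical stand-in written from the documented parameters, not the module's code, and the sample tables are invented for illustration.

```python
import pandas as pd


def lookup_counts_sketch(row, lookup_table, index="step",
                         columns="participant", default=False):
    """Illustrative stand-in, not the module's implementation."""
    try:
        # Row label comes from row[index], column label from row[columns]
        return lookup_table.loc[row[index], row[columns]]
    except KeyError:
        return default


# Hypothetical counts, "step" (rows) by "participant" (columns)
lookup_table = pd.DataFrame({1: [5, 0], 2: [3, 4]}, index=[1, 2])
lookup_table.index.name = "step"
lookup_table.columns.name = "participant"

df = pd.DataFrame({"step": [1, 2, 9], "participant": [1, 2, 1]})

# axis=1 passes each row as a Series; extra keywords are forwarded to the function
df["count"] = df.apply(lookup_counts_sketch, axis=1, lookup_table=lookup_table)
```

Step 9 does not appear in the lookup table, so the third row receives the default, False.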
-
data.dataloader.update_from_one(row)
Function to update rows that need updating, from agreement to single_coordinator.
Returns: - updated: Boolean or other
Examples
>>> import pandas as pd
>>> row = pd.Series(
...     {
...         "one_coordinator": True,
...         "both_coordinators": False,
...         "needs_updated": True
...     }
... )
>>> update_from_one(row)
True

>>> import pandas as pd
>>> row = pd.Series(
...     {
...         "one_coordinator": True,
...         "both_coordinators": False,
...         "needs_updated": False
...     }
... )
>>> update_from_one(row)
False

>>> import pandas as pd
>>> row = pd.Series(
...     {
...         "one_coordinator": False,
...         "both_coordinators": False,
...         "needs_updated": True
...     }
... )
>>> update_from_one(row)
False
-
data.dataloader.update_too_few(df, condition)
Function to update a DataFrame with an inappropriate number of samples in coordinator agreement.
Parameters: - df: DataFrame
DataFrame to update
- condition: string
definition of inappropriate count, e.g., “< 5”
Returns: - df: DataFrame
DataFrame updated with single-rater matches replacing dual-rater agreement in cases indicated by condition
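A condition string such as “< 5” has to be turned into a comparison before it can flag cells. The sketch below shows one possible way to do that and covers only this interpretation step, not the replacement of dual-rater agreement with single-rater matches. The helper `parse_condition`, the `OPS` table, and the sample counts are hypothetical; the real function may parse the string differently.

```python
import operator

import pandas as pd

# Comparison operators a condition such as "< 5" might use
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_condition(condition):
    """Split a string like "< 5" into a comparison function and a threshold."""
    text = condition.strip()
    for symbol, func in OPS.items():  # two-character symbols are checked first
        if text.startswith(symbol):
            return func, float(text[len(symbol):])
    raise ValueError(f"Unrecognized condition: {condition!r}")


compare, threshold = parse_condition("< 5")

# Hypothetical agreement counts, "step" (rows) by "participant" (columns)
counts = pd.DataFrame({1: [7, 2], 2: [5, 4]}, index=[1, 2])

# True where the dual-rater count meets the condition and would need replacing
too_few = compare(counts, threshold)
```

Mapping symbols to functions from the `operator` module avoids evaluating the condition string as code.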