Functions, Cleaning, Analysis and Sharing (FxCAs)

Author

Fendi Tsim

Published

July 14, 2023

Introduction

In this post, I present a minimalist framework of interconnected scripts that facilitates data cleaning and analysis in R. The framework, which I call ‘Functions, Cleaning, Analysis and Sharing’ (‘FxCAs’), is based on the idea of ‘division of labor’: each of the four R scripts serves one simple purpose, as listed below.

Visualizing the FxCAs framework

Custom Functions (Fx)

  • Holds custom functions so that the user can load them whenever a separate R script (cleaning.R or analysis.R) is used

Cleaning

  • Contains the code for data wrangling
  • Exports data for Analysis
  • Saves current progress with a timestamp for future reference
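The timestamped export could look something like the sketch below. This is only an illustration under my own assumptions: the helper name `save_with_timestamp`, the `cleaned` prefix and the `YYYYMMDD_HHMMSS` stamp format are hypothetical and not taken from the original scripts.

```r
# Hypothetical helper: save the named objects to a timestamped .RData file.
# `objects` is a character vector of object names found in `envir`.
save_with_timestamp <- function(objects, prefix = "cleaned", dir = ".",
                                envir = parent.frame()) {
  force(envir)  # resolve the caller's environment before doing anything else
  # Assumed stamp format, e.g. "20230714_153000"
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  path <- file.path(dir, paste0(prefix, "_", stamp, ".RData"))
  save(list = objects, file = path, envir = envir)
  invisible(path)  # return the path so the caller can log or reuse it
}
```

At the end of the Cleaning script, a call such as `save_with_timestamp("cleaned_data")` would then write a new file each time instead of overwriting the previous one.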

Analysis

  • Contains the code for analyzing data
  • Exports data for presentation
  • Saves current progress with a timestamp for future reference

Sharing

  • Allows the user to share all three R scripts and other files automatically from a local directory to one or more remote directories using custom commands
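As a rough sketch of what the Sharing step might do, the function below copies a set of files into each target directory with base R's `file.copy()`. The name `share_files` and its arguments are hypothetical, and I assume here that the remote directories are reachable as ordinary paths (for example, a mounted or synced folder).

```r
# Hypothetical helper: copy `files` into every directory in `remote_dirs`.
share_files <- function(files, remote_dirs, overwrite = TRUE) {
  results <- vapply(remote_dirs, function(d) {
    # Create the target directory if it does not exist yet
    if (!dir.exists(d)) dir.create(d, recursive = TRUE)
    # file.copy() returns one logical per file; succeed only if all copied
    all(file.copy(files, d, overwrite = overwrite))
  }, logical(1))
  invisible(results)  # named logical vector, one entry per directory
}
```

The returned vector is named by directory, so a failed copy to one location is easy to spot without stopping the copies to the others.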

Features

User-Friendliness

  • The user can quickly modify one short script instead of one section of a lengthy script
  • I use ‘sections’, RStudio’s built-in feature for dividing code into sections (the user can click the option with a brown hashtag at the bottom left of the editor to jump quickly from one section to another, and can fold the code within a section for easier viewing and editing)
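For readers who have not used sections before, here is a small illustrative script. In RStudio, a comment line that ends with four or more dashes starts a new section. The section names and the toy data are my own placeholders, not the contents of the actual Cleaning script.

```r
# Prerequisite ----
# Toy data standing in for an imported dataset (placeholder values)
raw <- data.frame(id = 1:3, score = c(10, NA, 30))

# Cleaning ----
# Drop rows with a missing score
cleaned <- raw[!is.na(raw$score), ]

# Export ----
# save(cleaned, file = "cleaned.RData")  # left commented out in this sketch
```

Each of the three headers then appears in the section navigator and can be folded independently.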

Automatic Update among R scripts and RData

  • The lines under the ‘prerequisite’ section in the Cleaning and Analysis scripts update the prerequisites automatically: they detect the current operating system (here I distinguish between macOS and Windows; the user needs to enter these manually the first time), import the latest version of the fx script, load the latest version of the cleaned data, and so on
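The operating-system check can be sketched with base R's `Sys.info()`, which reports `"Darwin"` on macOS and `"Windows"` on Windows. The function name `detect_os` and the two directory paths below are placeholders I made up for illustration; the paths are the kind of value the user would enter manually the first time.

```r
# Hypothetical helper: map the system name to one of the two supported labels.
detect_os <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
         Darwin  = "macOS",
         Windows = "Windows",
         stop("Unsupported operating system: ", sysname))
}

# Placeholder project directories, entered once by the user
project_dirs <- c(
  macOS   = "~/Documents/my_project",
  Windows = "C:/Users/me/Documents/my_project"
)

# Look up the project directory that matches a given OS label
project_dir_for <- function(os, dirs = project_dirs) dirs[[os]]
```

In a ‘prerequisite’ section, `project_dir_for(detect_os())` would then give the right working directory on either machine.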

Storage and Retrieval for Scripts and Data

  • Here I store data as .RData files labeled with the time they were saved, which lets the user go back easily to a previous .RData file for reference
  • For script retrieval, the Cleaning and Analysis scripts automatically retrieve the latest version of the fx script in the directory
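One way to pick up "the latest version" is sketched below. I assume here that the latest version means the most recently modified file matching a name pattern; the original scripts may instead rely on the timestamp in the file name. The helper name `latest_file` and the patterns in the usage comments are hypothetical.

```r
# Hypothetical helper: return the most recently modified file in `dir`
# whose name matches the regular expression `pattern`.
latest_file <- function(dir = ".", pattern) {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) {
    stop("No file matching '", pattern, "' found in ", dir)
  }
  files[which.max(file.mtime(files))]
}

# Possible use in a 'prerequisite' section (patterns are assumptions):
# source(latest_file(".", "^fx.*\\.R$"))          # newest fx script
# load(latest_file(".", "^cleaned_.*\\.RData$"))  # newest cleaned data
```

Selecting by modification time keeps the helper independent of any particular naming scheme, at the cost of being fooled if an old file is touched or re-copied.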

Suggestions

  • Using an R project (.Rproj) when applying this framework, for convenience and efficiency
  • Including data.table in the Cleaning script
  • Creating the Cleaning and Analysis scripts automatically at the user’s prompt
  • Storing (generalized) functions as R packages