A clean, suitably structured, and well-documented data set is critical for efficient and accurate statistical analysis. Most commonly, data are imported into statistical analysis programs as a comma-delimited text file. For the import to be easy and accurate, the data must adhere to a regular structure with consistent entries.
While it is not required, using REDCap (Research Electronic Data Capture) can greatly simplify data collection and minimize costly and time-consuming data clean-up activities. REDCap is a secure web-based application for building and managing online databases for research and is supported by the CTSC Biomedical Informatics team. Regardless of the software used to record data, adhering to the following guidelines will facilitate importation of the data into statistical software. In addition, every data set must include a data dictionary that describes each variable and identifies acceptable values. Additional information on data dictionaries is available on the UC Davis REDCap website.
Additional tips for data management are available in the PDF document, “Guidance for Database Developers for Efficient Import to Statistical Software.”
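As an illustration of how a data dictionary can be used to check for consistent entries before import, here is a minimal Python sketch. The variable names, the allowed values, and the check_rows helper are all hypothetical and are not taken from REDCap or from the guidance document; a real data dictionary would typically also record variable types, units, and missing-value codes.

```python
import csv
import io

# Hypothetical data dictionary: variable name -> set of acceptable values.
# None means the variable is free-form (for example, an identifier).
DATA_DICTIONARY = {
    "subject_id": None,
    "sex": {"F", "M"},
    "treatment": {"placebo", "drug"},
}


def check_rows(csv_text, dictionary):
    """Return a list of (line_number, column, problem) tuples.

    Line 1 is the header row; data rows start at line 2.
    Empty cells are treated as missing and are not flagged.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Every dictionary variable should be a column, and vice versa.
    for name in dictionary:
        if name not in header:
            problems.append((1, name, "missing column"))
    for name in header:
        if name not in dictionary:
            problems.append((1, name, "not in data dictionary"))

    # Flag entries that are not among the acceptable values.
    for line_number, row in enumerate(reader, start=2):
        for name, allowed in dictionary.items():
            value = row.get(name)
            if allowed is None or value is None or value == "":
                continue
            if value not in allowed:
                problems.append(
                    (line_number, name, f"unexpected value {value!r}")
                )
    return problems


# Example data with two inconsistent entries ("male" and "Drug").
sample = (
    "subject_id,sex,treatment\n"
    "1,F,placebo\n"
    "2,male,drug\n"
    "3,M,Drug\n"
)

for problem in check_rows(sample, DATA_DICTIONARY):
    print(problem)
```

Inconsistent spellings and capitalization such as these are a common reason a categorical variable ends up with more levels than intended after import, so a check of this kind is usually worth running before the data are handed off for analysis.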
Interactive Statistical Calculation Pages – Comprehensive list of sites for many statistical analyses, including power and sample size calculations. The site includes a page of websites for interactive analyses (“Interactive Stats”) and a page of free software packages (“Free Software”) that can be downloaded and run on your local computer. It also links to many technical resources on statistics, including general introductory material.
The R Project for Statistical Computing. R is a free statistical programming language that supports a very wide range of statistical analyses. It is commonly used by CTSC statisticians. The downside is that it is a programming language and hence has a bit of a learning curve. However, the R Project site contains many documents to help users learn R, and many other resources available online detail how to conduct specific types of analyses.
MINITAB. Minitab is a commercial, easy-to-use statistical package with a drop-down menu interface. You can download a 30-day trial version for free.
SPSS, SAS, and JMP can be obtained at a reasonable cost through UC Davis Information and Educational Technology.
Power/Sample Size Calculations
Southwest Oncology Group Statistical Center Power and Sample Size Calculators. Online sample size/power calculators for one- and two-sample tests of means and proportions, as well as for simple survival analyses.
G*Power: Statistical Power Analyses for Windows and Mac. Freely downloadable software that is easy to use, with a detailed and helpful user manual. A wide range of statistical procedures is supported, including common mean and proportion tests as well as multiple linear regression, logistic regression, and Poisson regression.
Russ Lenth’s Power and Sample Size Calculators. Comprehensive site for conducting power/sample size calculations online. Functions are available for a wide variety of study designs.
Centre for Clinical Trials. Power and sample size tools for one- and two-sample tests of means, proportions, and survival data. Sample size calculators for tests of equality, non-inferiority/superiority, and non-equivalence. Also has cross-over design and Phase II clinical trial calculators. Examples are provided for each situation. The companion text, "Sample Size Calculations In Clinical Research," is freely downloadable from the UC Davis library website. The site also has calculators for confidence intervals for proportions, correlation, relative risk, odds ratios, and diagnostic tests, and will perform McNemar’s test for paired binary data.
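To give a sense of what these calculators compute for the simplest case, the following Python sketch applies the standard normal-approximation formula for a two-sided, two-sample comparison of means with equal group sizes and a common standard deviation: n per group = 2 * ((z_(1-alpha/2) + z_(power)) * sd / delta)^2. The function name n_per_group and its parameters are illustrative only and do not come from any of the tools listed above. Dedicated calculators often use the t distribution and may report slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    test of means (normal approximation, equal group sizes).

    delta: smallest difference in means worth detecting
    sd:    assumed common standard deviation in each group
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for desired power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Detect a difference of half a standard deviation
# with 80% power at the 5% significance level.
print(n_per_group(delta=0.5, sd=1.0))  # 63
```

Because the required sample size scales with (sd / delta) squared, halving the detectable difference roughly quadruples the number of subjects needed, which is why the assumed effect size and variability deserve careful justification.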
Introduction to Clinical Research for Residents - This online course consists of readings compiled by the UC Davis and CTSC Biostatisticians.
The Little Handbook of Statistical Practice - Presents relevant overviews of common statistical analyses, with applied examples and discussion of various topics in applied data analysis.
UCLA’s Institute for Digital Research and Education – A wealth of information on conducting statistical analyses using SAS, R, SPSS, Stata, and Mplus is available from this site. Each example explains a motivating data set, provides code to analyze the data in one of the statistical packages, and reviews and interprets the output.
Clinical Research Case Studies - CTSPedia entries on selected clinical research topics, including step-by-step tutorials on common sample size calculations; handling outliers; dealing with selection bias in observational studies; and others.
Biostatistics for Non-statisticians – Online video series from the University of Colorado CTSI.
Ohio State University CCTS – Papers on topics in study design and planning, power and sample size, and statistical analysis.
Columbia University Irving Institute – List of references for study design and biostatistics for clinical trials.
University of Utah CCTS Seminar Series – PowerPoint presentations on diverse biostatistical topics including exact statistical inference, multiple imputation, mixed-effects models, generalized linear models, survival analysis, epidemiology, and Bayesian methods.
Medical College of Wisconsin – YouTube seminar series covering longitudinal analysis, survival analysis, propensity scores, Bayesian statistics, linear regression, sample size calculations, ANOVA, multiple comparisons, and logistic regression, among others.