Statistical thinking will one day be as necessary a qualification for efficient citizenship as the ability to read and write. HG Wells
Purpose of this site
This site is dedicated to discussing data generated by medical and dental research, and the tools used to explore those data, extract information, and generate knowledge in a valid and reproducible way.
The main topics are:
- exploratory data analysis: data cleaning, wrangling, summarizing, and visualization,
- examples of analyses of datasets from dental research or oral public health, and
- comments on dental papers with innovative or creative exploratory data analysis.
Unless there is a good reason not to, every post will have at least one reproducible example.
Perhaps the biggest barrier to reproducible research is the lack of a deeply ingrained culture that simply requires reproducibility for all scientific claims. Not unlike the culture of replication that persists across all scientific disciplines, the scientific community needs to develop a “culture of reproducibility” for … science and require it of published claims. R Peng, 2011
Tools used in this site
This site discusses exploratory data analysis with R, RStudio, and the tidyverse. It was created using blogdown and Hugo.
About the author
Sergio Uribe. I am a Leading Researcher at the Bioinformatics Research Unit, Riga Stradins University, and an Associate Professor at the School of Dentistry, Universidad Austral de Chile. In the 90s I used Epi Info (6.04, of course), SPSS, and SAS, among others, until I discovered R around 2005-2006. I used it intermittently at first; the real incentive to make it my main analysis tool came more recently, with:
- the development of the tidyverse, a set of tools that allow a human-readable workflow for data analysis, and
- the development of rmarkdown, which creates dynamic documents that make it easy to integrate reproducible code into documents.
Fancy or exotic stats can’t replace a good study design. So the simpler the stats, the stronger the study design, and vice versa. No statistical method can effectively deal with the systematic biases that may result from a poorly designed study. Hence, the focus of this site is on exploratory data analysis: cleaning the data and exploring them visually and numerically before any significance test.
I enjoy learning statistics, and I have found the following books particularly well written and accessible for any health professional:
- Introductory Statistics with R by Peter Dalgaard
- OpenIntro Statistics by Mine Çetinkaya-Rundel, Christopher Barr, and David Diez (free)
- Statistics: An Introduction Using R by Michael J Crawley
- Discovering Statistics Using R by Andy Field (if you like rock music, this is for you)
- Biostatistics: The Bare Essentials by Geoffrey R. Norman and David L. Streiner
- R for Data Science by Hadley Wickham and Garrett Grolemund (free). If you read only one, read this one.
After listening to David Robinson on the DataFramed podcast, I decided to share my learning journey through data science.
Please feel free to contact me.