Objective: Obtain a position with a company where I can apply my statistical background to data mining and pattern learning to help improve its knowledge of its data.
Intelegencia, Senior Statistical Analyst (Contractor) Oct 2010-Present Served as a consulting statistician to help business clients identify which key attributes in their databases drove customer satisfaction and interest. Interacted with multiple levels of business management through conference meetings to understand what clients wanted from data mining. Combined and quality-checked several months of web-based real-estate data into datasets ready for analysis. Developed and fine-tuned logistic and proportional hazards models in SAS/R to compare different customer bases and assess their odds and risks. Applied Bayesian networks to explore how customer attributes in surveys interacted to drive customer satisfaction, and used this knowledge to improve the call process. Text-mined user survey data using R-based text mining packages to identify keywords and phrases used when clients were satisfied or unsatisfied with their experiences. Used Subversion and a wiki to develop reproducible analyses and solid projects.
University of Alabama Birmingham, School of Public Health, Department of Biostatistics, Section on Statistical Genetics, Statistician I July 2007-Present Served as a consulting statistician to help principal investigators prepare and complete grants in the areas of genetic epidemiology, obesity-mortality, and survey analysis. Conducted many large-scale statistical analyses (GWAS/linkage) using Unix scripting (bash), clusters, and project management tools (Subversion/JIRA/Confluence). Quality-checked large-scale genetic data using languages such as R/Java, and prepared the data for further analyses in statistical programs such as R/SAS/SOLAR/PLINK. Produced univariate analysis reports in many projects, giving investigators basic descriptive statistics (mean/median/percentiles) and missing data rates to help reinforce quality checking. Automated the flow of results from these analyses into tables and graphs in reports for conference presentations or for articles submitted to journals. Designed analyses and projects so that results could be reproduced by other statisticians and all levels of management could easily follow project development. Set deadlines with clients based on different funding levels. Interacted closely with faculty on grants and articles to rework analyses, tune models, and implement different methods of imputation, survival analysis, and categorical analysis (SAS/R). Explored different statistical methodologies published in journals. Advised on pros and cons of various statistical approaches (sampling, randomization, type I error, and power). Conducted meta-analyses to analyze the effects of BMI and physical activity on mortality (survival analysis) and to study the effects of dieting (mixed models). Mined genetic/environmental data using Bayesian networks and clustering to explore how environmental effects interacted with genetic effects to drive risks in heart disease or other attributes of interest to investigators.
Maine Health Information Center, Senior Health Research Analyst Jun 2006-July 2007 Served as a consulting statistician for the Maine Health Management Coalition Board in the area of clinical care quality control. Advised on the pros and cons of different statistical approaches to clinical quality indicators. Developed and implemented my own methodology to measure clinical care quality across all Maine hospitals. Served as a consulting statistician for several project designs in the areas of regression, sample size determination, control of type I error, power estimation, and design of experiments. Served as Bayesian data linkage modeler for CODES 2000 (Crash Outcome Data Evaluation System). Applied imputation to account for missing data and analyzed the imputed data. Encoded HEDIS 2004 and 2005 Medicaid utilization rates for New Hampshire claims data using SAS, R, and PL/SQL. Developed different generalized linear models to account for overdispersion in various health utilization rates across the states of Maine and New Hampshire. Applied cluster analysis and other exploratory multivariate techniques to group results from a 2006 statewide Office Practice Survey into health service area regions.
Consulting Statistician Aug 2005-Present Employed by master's- and PhD-level students and academic faculty to help design experiments, encode simulations, and provide advice on using appropriate statistical methodologies. Analyzed and presented experiments for clients using various software: SAS, R, SPSS, Minitab, Microsoft PowerPoint, Microsoft Word, and Excel. Aided faculty in writing grant sections on power calculations and control of type I error.
Mathematics, Statistics Graduate Student, Kansas State University, Aug 2000-May 2006 Taught mathematics/statistics classes Aug 2000-May 2004 / Aug 2004-Spring 2006.
Management Information Systems, University of Southern Maine, Spring 1998-May 2000 Developed and set up an Oracle/SQL database to summarize large amounts of information. Used this knowledge to develop a SQL client program for USM police to track lost and found items on campus.
Education Kansas State University, Manhattan, KS MS Statistics 2006 / MS Mathematics 2005 University of Southern Maine, Portland, ME B.S. in Computer Science May 2000 (Dual major: Mathematics, Computer Science)
Data Manipulation/Mining/Simulation Assess programming language strengths, time, and resource limitations. Regular expressions. Machine learning: Bayesian networks/clustering. Create simulations according to different design specifications (faculty level). Pipeline different programs together to mine data (grep, DBMSCopy, SAS/R/Java). Mass reporting using R, MS Excel, SAS, or Access.
Modeling/Experimental Design Constructed logistic, log-linear, and survival models. Factorial, split-plot, repeated measures, and incomplete block design experience. Time-series knowledge and forecasting experience. Non-parametric methods used when assumptions for parametric models are not met.
Relevant graduate-level statistics classes: Analysis of Messy Data, Categorical Analysis, Design of Experiments, Simulations and Resampling, Time Series, General Linear Models, Applied Linear Models, Statistical Theory, Compiler Design, and Survival Analysis. Statistical Software 3+ years in the following: SAS (Base SAS, SAS/STAT), SAS macros, R. Programming Software 3+ years in the following: C, Microsoft Visual Basic, Microsoft Access, Microsoft Excel, SQL, and functional languages such as Scheme, R, S, and ML. Operating Systems 3+ years in the following, including different flavors: Unix, Windows, Linux. Use of Cygwin.
References: A list of references from previous clients and others is prepared and available upon request.
Kevin R. Fontaine, Raymond McCubrey, Tapan Mehta, Nicholas M. Pajewski, Scott W. Keith, Sai S. Bangalore, Carlos J. Crespo, David B. Allison. Body Mass Index and Mortality Rate among Hispanic Adults: A Pooled Analysis of Multiple Epidemiologic Datasets. International Journal of Obesity advance online publication 11 October 2011. [doi: 10.1038/ijo.2011.194]
James M. Shikany, Amy Thomas, Raymond McCubrey, Mark Beasly, David Allison. Randomized controlled trial of chewing gum for weight loss. Obesity. In press 2011.
James M. Shikany, Renee Desmond, Raymond McCubrey, David B. Allison. Meta-analysis of studies of a specific delivery mode for a modified-carbohydrate diet - the South Beach Diet. Journal of Human Nutrition and Dietetics. [Epub ahead of print. PMID: 21899599]
Sonia Lazzari, Sharon Starkey, John Reese, Andrea Ray-Chandler, Raymond McCubrey, and C. Michael Smith. Feeding behavior of Russian wheat aphid (Hemiptera: Aphididae) biotype 2 in response to wheat genotypes exhibiting antibiosis and tolerance resistance. J Econ Entomol 102(3):1291-300 (2009). [PMID: 19610450]
Rose M McMurphy, Melissa R Stoll, and Raymond McCubrey. Accuracy of an oscillometric blood pressure monitor during phenylephrine-induced hypertension in dogs. Am J Vet Res 67(9):1541-5 (2006). [PMID: 16948598]