Community Central

Introduction

The effect modification P value is a method for testing a condition called homogeneity of the odds ratio; when homogeneity is rejected, interaction is present and must be analyzed. Currently the CMH option of PROC FREQ in SAS® is used to test whether the odds ratio is homogeneous for large sample sizes with fixed effects, and one of the tests it reports is the Breslow-Day test. With the new method, if the P value is less than alpha, you reject the null hypothesis of a homogeneous odds ratio and conclude that there is effect modification, as measured by the term EM. If the P value is greater than alpha, you fail to reject the null and conclude that there is no effect modification. The new method can be used for large or small sample sizes and is meant for random variables with a non-normal distribution. It includes a chi-square test of independence, a check of the large-sample approximation through the P values of the LRT, Score, and Wald tests, and the Durbin-Watson statistic as a test for autocorrelation. A new procedure to transform and fit count data is demonstrated; it uses predicted versus expected counts and regression, so that the effect modification P value is obtained from the regression and its residuals. The area under the curve, ROC curves, and power are discussed in this paper in support of the author's new method. Later sections make the PROC MIXED method clear, with EM computed through matrix mathematics. The method is also of benefit for random effects, where errors exist in the distribution, and non-normal data can be analyzed effectively by regression analysis with PROC MIXED. The power is good, and standard errors through intercepts are visible in plots (Agravat 2009). The matrices lend confidence to the method because of the O statistic, which is an expectation of the mean obtained through matrix algebra.

 

Conditions

In this type of test of interaction, the effect modification P value is meant to determine whether the condition called homogeneity of the odds ratio holds. The question is whether the odds ratios across strata are roughly equal, so that no one category has an odds ratio much greater than the others. Homogeneity is the null hypothesis, and the alternative is that the odds ratios are not homogeneous. You fail to reject the null when the Breslow-Day P value and the new method's P value are greater than α. If the null is rejected, then effect modification exists. In SAS, the standard test for effect modification is the Breslow-Day test, which is requested with the CMH option of PROC FREQ. The Breslow-Day test is meant for linear data with fixed effects that affect the inference made, and for large sample sizes only. If the data are non-normal and the covariates are independent of the outcome, then you can consider the random-effects assumptions of the new method to be met. These assumptions are checked with PROC UNIVARIATE (non-normal distribution), PROC FREQ with the CHISQ option (independence), the large-sample approximation tests (LRT, Score, and Wald), and PROC AUTOREG. The power of the new method is higher than that of the Breslow-Day test (Agravat 2009), and the ROC curve shows a higher area under the curve for the author's method than for the Breslow-Day test. For small sample sizes, the standard error is smaller than that of the Breslow-Day test. The P value obtained is lower; however, the algorithm converges, so the MLE results are valid. Other points shown in the SAS outputs also support the new method: a good maxRsq, a higher C statistic, convergence of the algorithm when testing for power (which equals 1), and the P value itself.
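To make the homogeneity question concrete, the sketch below computes one odds ratio per race stratum from the grouped counts listed in the BETA ESTIMATES section, and then runs the Cochran-Mantel-Haenszel test available in base R. It is only an illustration of the null hypothesis, not the author's method: treating `control` and `cases` as the two margins of each 2x2 table is an assumption about the data layout, and the Breslow-Day test itself is not part of base R, so it is not shown.

```r
# Grouped counts as listed in the BETA ESTIMATES section of this article.
d <- data.frame(
  control = c(1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0),
  cases   = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0),
  race    = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  count   = c(795, 2586, 763, 4397, 111, 233, 62, 238, 40, 152, 45, 170)
)

# 2 x 2 x 3 table: control by cases within each race stratum (assumed layout).
tab <- xtabs(count ~ control + cases + race, data = d)

# Odds ratio of a single 2 x 2 table (cross-product ratio).
odds_ratio <- function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])

# One odds ratio per race stratum; similar values are what homogeneity means.
stratum_or <- apply(tab, 3, odds_ratio)
print(stratum_or)

# Cochran-Mantel-Haenszel test and common odds ratio (stats package).
cmh <- mantelhaen.test(tab)
print(cmh)
```

If the stratum odds ratios differ markedly from one another, a single common odds ratio is a poor summary, which is the situation the homogeneity tests are designed to detect.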

O Statistics

The O statistics, which have been shown to follow Basu's theorem for chi-square and independence, show the potential of the method and its application in a well-formed test. The matrix formula works with a 'fit' variable, and the algorithm is changed for effect modification as well. The two matrices built from the transformed variables zxy and xzy are multiplied, and the resultant is another matrix called the expectation of the mean. The formula of the O statistics, with the fit variable as a matrix, is different for confounding and is multiplied with the rows of the count dataset. Refer to the PROC IML algorithm to follow the application in SAS software; the method is demonstrated later. The resulting number is the 'EM', or effect modification term, sometimes written 'aem' in the literature. EM follows an asymptotic chi-square distribution, which is intended as a large-sample approximation.

This sample of the O statistics is from INHANCE (Hashibe et al. 2007), an international study conducted in Tampa, Houston, and elsewhere in the U.S., Europe, the USSR, and more, from Lyon, France. The data are read by race, with no smoking/no drinking as the exposure.

 

Obs(hat) = (Obs - Obs(mean))^2 / Obs

Matrix multiplication giving the observed O stat matrix

| 1  1 |     | 1  1   795 |
| 0  1 |  x  | 1  1  2586 |

Observed O stat Matrix

| 2  2  3381 |
| 1  1  2586 |

Expected O stat Matrix

| 1.1  1.1  1916 |
| 0.4  0.4  1120 |
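The matrix step above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the author's PROC IML program: the variable names are hypothetical, the mean matrix is taken as each value times its row total divided by the grand total (the rule described later in the text), and the summed O statistic is not claimed to reproduce any value in the SAS listings.

```r
# First pair of rows from the transformed head/neck cancer data.
fit_matrix   <- matrix(c(1, 1,
                         0, 1), nrow = 2, byrow = TRUE)
count_matrix <- matrix(c(1, 1,  795,
                         1, 1, 2586), nrow = 2, byrow = TRUE)

# Observed matrix: a 2x2 times a 2x3 gives a 2x3 product.
observed <- fit_matrix %*% count_matrix

# Mean (expected) matrix: value times row total divided by grand total.
expected <- observed * rowSums(observed) / sum(observed)

# Cell-wise O statistic, (Obs - Obs(mean))^2 / Obs, and its sum.
o_cells <- (observed - expected)^2 / observed
o_total <- sum(o_cells)

print(observed)
print(round(expected, 1))
print(o_total)
```

Rounded to the precision shown in the article, the product and the mean matrix agree with the observed and expected matrices listed above.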

 

Fixed vs. Random Effects:

As stated, the Breslow-Day test is a fixed-effects test, while the new method is meant for random effects, so it is important to understand the meaning and significance of fixed versus random effects. Fixed effects are used in non-experimental settings to control for other variables using linear regression. The variables must be included in the model to estimate their effects, so each variable must be measured and chosen during model selection. Ordinarily, if the dependent variable is quantitative, fixed effects can be implemented through ordinary least squares. In this way stable characteristics can be controlled, eliminating bias. Fixed effects ignore the between-person variation and focus on the within-person variation.

In the linear regression model Yij = β0 + β1*xij + αi + εij, the term β1*xij is fixed: all x's are measured and β1 is fixed. The error term ε is defined as a random variable with a normal probability distribution with mean 0 and variance σ^2. Thus a model can contain both fixed and random parts. Fixed effects are also considered as pertaining to treatment parameters where there is only one variable of interest, and they are used to generalize results for within effects.

In a random effects model, the ε terms can exist, do not have to be zero, and are taken into account. Random effects exist when the variable is drawn from a probability distribution. Blocking, controls, and repeated measures belong to random effects. Random effects are involved in determining possible effects and confidence intervals, in clinical trials, and in making causal inferences; unbalanced data may cause problems in inference about treatments. Random effects do not measure variables that are stable, unmeasured characteristics, so the α's are taken to be uncorrelated with the measured parameters. A random effects model can be used to generalize effects to all the variables that belong to the same population.
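For reference, the model in this section can be written out with the two treatments of the unit-level term side by side. This is the standard textbook formulation rather than anything specific to the new method, and the symbol for the variance of the unit-level term is introduced here only for illustration.

```latex
\begin{align*}
Y_{ij} &= \beta_0 + \beta_1 x_{ij} + \alpha_i + \varepsilon_{ij},
  & \varepsilon_{ij} &\sim N(0, \sigma^2) \\
\text{fixed effects:}\quad & \alpha_i \text{ is an unknown constant for each unit } i \\
\text{random effects:}\quad & \alpha_i \sim N(0, \sigma_\alpha^2),
  & \operatorname{Cov}(\alpha_i, x_{ij}) &= 0
\end{align*}
```

The zero-covariance condition in the last line is the formal version of the statement that the α's are uncorrelated with the measured variables.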

 

 

Data Transformations

 

The author calculates a new effect modification P value statistic from transformed count data. The beta estimates are calculated from survival analysis and represent interaction with the explanatory variable and interaction with the outcome and effect modifier (z is the variable for the effect modifier). Zx represents the interaction of the confounder with the explanatory variable, yz represents the interaction of the outcome with the confounder, and z represents the confounder. Which class-level variables enter the model in the SAS (TM) code sometimes depends on whether the outcome converges with all the variables in the model; in other words, one variable may be the same as another and can be left out.

The format is the same for each level. The outcome comes first, with y = 1 positive for lung cancer, followed by a '1' for the fit variable; a '1' in this column means the row does not hold fitted values. Next there is a 1 for the effect modifier and a 1 for the explanatory variable (Agravat 2008). Then the count, or n, is taken directly from the data set. For the next row, the outcome is the same (y = 1), with ones for the modifier and the explanatory variable, followed by the raw count. The next two rows hold the fit values, so the fit variable is 0 in each of these rows, and the fit values come from the following sequence. For z, the effect modification variable, the adjusted count is observed(count) / |βzx βz|, as designed by the author.

For the explanatory variable, the new count estimate is observed(count) / |βyz|. The value comes from the observed count, and this method is used to calculate a new count. Alternate these steps until the symmetric count dataset is created, and use the absolute value of the beta estimates to adjust the count data. If there is a 0 in the count, make the adjusted value 1 in the count or 'n' data column. The beta estimates are obtained from the original, unadjusted count data. If the P value is greater than alpha, we fail to reject the null and conclude that there is no effect modification. The P values are used to choose which distribution is better for beta; hence this step is parametric. Interaction terms are used to measure beta; for instance, βzx is the slope for the interaction between z and x (Agravat 2009).
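The count adjustment described above can be sketched as a small helper. Everything here beyond the division by the absolute beta is an assumption: the function name is hypothetical, the rounding to whole counts is a guess at how fractional values are handled, the floor of 1 is one reading of the rule for zero counts, and the beta values are placeholders rather than fitted estimates.

```r
# Hypothetical helper: divide an observed count by |beta|, with a floor of 1.
adjust_count <- function(observed, beta) {
  adjusted <- round(observed / abs(beta))  # rounding is an assumption
  pmax(adjusted, 1)                        # a 0 becomes 1, per the text
}

# Placeholder betas for illustration only; they are not fitted estimates.
beta_zx_z <- -2.0    # stands in for the product beta_zx * beta_z
beta_yz   <-  1.25   # stands in for beta_yz

raw_counts <- c(763, 4397, 0)
zxy <- adjust_count(raw_counts, beta_zx_z)  # adjusted for the effect modifier
xzy <- adjust_count(raw_counts, beta_yz)    # adjusted for the explanatory variable

print(data.frame(raw_counts, zxy, xzy))
```

Taking the absolute value inside the helper means the sign of the fitted coefficient does not matter, which matches the instruction to adjust by the absolute value of the beta estimates.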

 

 

TOOL

SAS (TM) is a very helpful tool for the analysis of the effect modification P value problem, given that the previous method is meant only for large datasets and fixed effects. The versatility of the new method allows SAS (TM) to produce more meaningful results, with higher power and more area under the curve, for large and small sample sizes with non-normal distribution, independence, and random effects.

 

·         Testing for Non-Normal Distribution

·         Independence

·         Power Analysis of RANTBL

·         Autocorrelation and Randomness

·         ROC Curve

·         Regression

BETA ESTIMATES

Beta estimates are obtained from R software in the manner shown. Potentially they can also come from the AML matrix method. The confounder is the z term, or zxy. The explanatory variable is the x term, or xzy. Follow the other variables, such as cases as the y term or outcome, so that the algorithm is not confusing. The data were originally written this way to comply with R software.

 

Thebetaestimatesareobtainedfromthemethodshown.

# Weibull survival regressions on the grouped head/neck cancer counts
library(survival)

control  <- c(1,1,0,0,1,1,0,0,1,1,0,0)
cases    <- c(1,0,1,0,1,0,1,0,1,0,1,0)
race     <- c(1,1,1,1,2,2,2,2,3,3,3,3)
count    <- c(795,2586,763,4397,111,233,62,238,40,152,45,170)
dataRace <- data.frame(control, cases, race, count)

# Beta estimate for the outcome (cases)
survreg(Surv(count, control) ~ cases, dist = "weibull", data = dataRace)

Call:
survreg(formula = Surv(count, control) ~ cases, data = dataRace, dist = "weibull")

Coefficients:
(Intercept)       cases
   7.883884    1.419907

Scale= 1.308425

Loglik(model)= 48.4   Loglik(intercept only)= 49.1
Chisq= 1.45 on 1 degrees of freedom, p= 0.23
n= 12

# Beta estimate for the effect modifier (race)
survreg(Surv(count, control) ~ race, dist = "weibull", data = dataRace)

Call:
survreg(formula = Surv(count, control) ~ race, data = dataRace, dist = "weibull")

Coefficients:
(Intercept)        race
   9.575185    1.531427

Scale= 0.760424

Loglik(model)= 45.2   Loglik(intercept only)= 49.1
Chisq= 7.79 on 1 degrees of freedom, p= 0.0053
n= 12

 

SAS HYPERGEOMETRIC CODE

/* Grouped myocardial infarction data, one row per age stratum */
data micasesHoaemnewD;
  input n age fit fitrawz lcwocz locusex total lem;
  /* Labels are attached to the variables read by INPUT. The original
     listing also labeled cwoc ('cases with oc use'), which is not read,
     and named the last two labels ltotal and laem. */
  label n       = '# of cases in age stratum'
        fitrawz = 'fit with the raw count for confounder z'
        lcwocz  = 'log of cases with oc use adjusted for confounder'
        locusex = 'log of # in age stratum using ocs adjusted for explanatory'
        total   = 'sample size in this age stratum'
        lem     = 'log(em)';
  datalines;
6 0 1 2 1 1 292 .72
21 0 1 9 1 1 444 1
37 1 0 4 5.77 4.06 393 1
71 1 0 6 5.95 3.75 442 1
99 1 1 6 1 1 405 5.56
;
run;

/* Weighted mixed model: age category on the adjusted count and log(em) */
proc mixed data=micasesHoaemnewD;
  weight total;
  class lcwocz;
  model age = lcwocz lem / solution ddfm=satterth covb chisq;
run;

 

 

 

OUTPUT

Type 3 Tests of Fixed Effects

Effect    Num DF   Den DF   Chi-Square   F Value   Pr > ChiSq    Pr > F
lcwocz         2        1       604.47    302.23      <0.0001   <0.0406
lem            1        1       412.74    412.74      <0.0001   <0.0313

 

Hypergeometric Distribution Code

Agravat's method for the hypergeometric distribution works well for the typical grouped data that involve time and age. The hypergeometric distribution describes the number of successes when drawing from a population without replacement. Such studies often include the term total, referring to the total sample size. Another category is the count, which may be the number in a time or age stratum, as in this example of myocardial infarction with exposures including birth control. To group the data for analysis, the outcome variable comes first, together with the frequency of that outcome. The outcome variable for this PROC MIXED code is a bivariate outcome for grouped data; it can be set by the individual and used to test a selected category for effect modification with the variable "lAEM", the log of "aem". The fit variable is used with the same procedure described for the lung cancer data (Agravat 2009). There is also a fit variable called "fitrawz", in which the raw count is kept for the effect modifier level group. There may be a variable for the number of cases with exposure "z", the effect modifier variable, and one for the number with exposure within age or time strata. When the weight is 'n', the result is a significant difference in the risk of the outcome by age category for the exposure birth control use, which includes the number in the level of birth control adjusted for cases in that category (P < 0.0001). The chi-square is asymptotic, with "1" for age as a categorical variable; the age category, selected by the study designer, is 35-49 years old. For lem there is a statistically significant relationship by both the chi-square and F statistics (P < 0.0001 and P < 0.0313). The chi-square and F statistics converge because of the large sample. "Generalized Linear Models" (Nelder and McCullagh 1989) states that large samples give the approximate distribution for χ2; with normality, there may be exact results.

As n approaches ∞, the degrees of freedom in the denominator approach infinity, and the F statistic, scaled by its numerator degrees of freedom, is equivalent to χ2. However, the author's method is able to produce this convergence with non-normal data and a small sample size, as in the hypergeometric data set on myocardial infarction, as well as in large samples, as in the case of head/neck cancer. The author's methods with the matrices, the algorithm, aem, and PROC IML with PROC MIXED produce asymptotic chi-squares in the case of effect modification.
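The limiting relationship between the F and chi-square statistics can be checked numerically. The sketch below is a generic illustration with arbitrary numbers (a statistic of 4 with 2 numerator degrees of freedom), not output from the SAS programs in this article; it compares the upper-tail P value of the F test with that of the chi-square test as the denominator degrees of freedom grow.

```r
# Upper-tail P value of a chi-square statistic x with df1 degrees of freedom.
chisq_p <- function(x, df1) pchisq(x, df = df1, lower.tail = FALSE)

# Matching F test: F = x / df1 with df1 numerator and df2 denominator df.
f_p <- function(x, df1, df2) {
  pf(x / df1, df1 = df1, df2 = df2, lower.tail = FALSE)
}

x      <- 4                      # arbitrary chi-square value for illustration
df1    <- 2                      # numerator degrees of freedom
den_df <- c(1, 4, 30, 1000, 1e7) # growing denominator degrees of freedom

comparison <- data.frame(
  den_df = den_df,
  f_p    = sapply(den_df, function(d) f_p(x, df1, d)),
  chi_p  = chisq_p(x, df1)
)
print(comparison)
```

With few denominator degrees of freedom the F test gives a noticeably larger P value than the chi-square test, which is why the two columns of a Type 3 table can disagree in small samples.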

EM CODE with PROC MIXED

/* Transformed head/neck cancer counts with the O statistic (aem) per race */
data smtobAEMmixed;
  input cases fit zxy xzy aem count;
  datalines;
1 1 1 1 1470.29 795
0 1 1 1 1 2586
1 0 332 569 1 763
0 0 1913 3279 1 4397
1 1 1 1 140.5678 111
0 1 1 1 1 233
1 0 27 46 1 62
0 0 104 178 1 238
1 1 1 1 283.71789 40
0 1 1 1 1 152
1 0 20 34 1 45
0 0 74 127 1 170
;
run;

/* Weighted mixed model: outcome on the transformed modifier and aem */
proc mixed data=smtobAEMmixed;
  weight count;
  class zxy;
  model cases = zxy aem / solution ddfm=satterth covb chisq;
run;

 

 

 

Type 3 Tests of Fixed Effects

Effect   Num DF   Den DF   Chi-Square   F Value   Pr > ChiSq   Pr > F
Zxy           6        4        27.19      4.53       0.0001   0.0826
Em            1        4        21.56     21.56       <0.001   0.0097

 

The PROC MIXED and PROC IML code again shows evidence of effect modification through the significant chi-square for "EM". The F statistic for "EM" is also significant (Pr > F = 0.0097), giving statistically significant evidence to reject the null of homogeneous odds for head/neck cancer due to no drinking by level of race.

 

The O Stat Method and EM

Chi-square for Effect Modification for Head Neck Cancer in INHANCE

The O Stat Method works on the transformed values produced by the data transformation shown in the SAS code, arranging them into matrices as in "A New Effect Modification P Value Test Demonstrated" by the author. For the first two rows, a 2x2 matrix is multiplied by the 2x3 matrix formed from the next columns. This yields the observed table (table 3a), and the expected table is calculated for the O statistic. The observed table is obtained by taking the first two rows and all columns of the data with the variables "cases", "fit", "zxy", and "xzy"; the 2x2 table is multiplied by the 2x3 table, giving a 2x3 table. The expected table is a type of mean estimate of the observed values, calculated by multiplying each observed value by its row total and dividing by the sum total. The O statistic is then calculated from these matrices. The O statistics of each set of these matrix products are summed; the first step has 2 DF. The same procedure is repeated for the next set, shown in the tables for rows 3 and 4 of the data transformation. That step yields several 0 values in both the observed and the expected 2x3 tables, which may indicate characteristics of singularity. The procedure is carried out in SAS/IML (TM) (see "Formulas Calculating Risk Estimates and Testing for Effect Modification and Confounding" for the PROC IML code).

 

The matrix calculation is depicted below. This pattern continues in pairs of rows throughout the data transformation procedure. The matrices are from the head/neck cancer example (INHANCE study, Hashibe et al. 2007). Matrices form a ring under addition and multiplication, although matrix multiplication is not commutative in general. If a square matrix E1 is nonsingular, then k*E1 is also nonsingular for any nonzero scalar k. A matrix is singular when its determinant is 0, which is possible for certain elements when dealing with the O statistics. The author suggests that when E1 and E2 are both singular, concentric rings are possible, and that this may be related to how the matrices shown involve both complex and real numbers and show properties of being both nonsingular (nonzero values, square) and singular (0 values). On this view, allowing complex numbers permits more calculations than excluding them.

O Stat Matrix Values Observed

2x3 Observed

         Col1   Col2   Col3   Total
Row 1       2      2   3381    3385
Row 2       1      1   2586    2588
Total       3      3   5967    5973

2x3 Mean

         Col1   Col2   Col3    Total
Row 1     1.1    1.1   1916   1918.2
Row 2     0.4    0.4   1120   1120.8
Total     1.5    1.5   3036     3039

The columns of the O stat are column totals from the PROC IML code. The O stats from the columns of each matrix multiplication are summed for each pair of rows of the data transformation, and PROC IML outputs a P-value for the sum. The O stat total gives the chi-square P-value of "aem" or "EM", which comes from the O stat calculated by the formula above. The mean O stat is calculated as the value times the row total divided by the overall total. The alternating matrix calculations give O statistics that are used as "em" in the PROC MIXED calculations. Each set of calculations gives a number, followed by a 0 for the rows with the "0" level of the fit variable, which were adjusted by the beta estimates. The O statistic is entered in the first row of each set, and the number 1 is entered in the "em" column for the other rows, including those with the "0" fit variable; this continues throughout. In the PROC IML SAS code, deter01 represents the observed values and deter1 represents the observed mean. "EM" is another name for the O stat of that matrix calculation. The zeros are ignored in the matrix calculation.

The sum of the "em" values, or O stats, after all the matrix calculations is 1894.5757 with 12 degrees of freedom, with a P-value from PROC IML of 0, and the chi-square for "em" from PROC MIXED has P < 0.0001; both are statistically significant. One may then reject the null of homogeneous odds for head/neck cancer, with no drinking as the exposure, across the levels of race (non-Hispanic, black, and Hispanic) of the INHANCE study population. One may generalize that there may be a difference by race for head/neck cancer with no drinking as the exposure. The interaction may be smaller because of protection, which is possible. The hazard ratio is .213, which indicates less harm for the outcome head/neck cancer and the exposure no drinking by level of race. The PROC MIXED SAS code shows that the corresponding "EM" value in the program created by the author is significant, with a chi-square of 21.56 and P < 0.0001. "EM" is also significant by the F statistic (F = 21.56, Pr > F = 0.0097). Hence the SAS code for the data transformation method follows an asymptotic chi-square statistic. The "zxy" variable, the effect-modifier transformed variable, also has a significant chi-square (P = 0.0001). You may therefore conclude that the effect modifier variable gives significant evidence to reject the null of homogeneous odds (Causal Inference and Proofs of Biostatistics and Probability).
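The chi-square P values quoted in this section can be checked directly from the reported statistics. The sketch below uses only numbers that appear in the article (the three "aem" values from the PROC MIXED data step, the 12 degrees of freedom, and the chi-square of 21.56 on 1 degree of freedom); it does not rerun PROC IML or PROC MIXED, and the variable names are illustrative.

```r
# O statistics ("aem" values) from the first row of each race block.
em_values  <- c(1470.29, 140.5678, 283.71789)
o_stat_sum <- sum(em_values)   # the text reports 1894.5757
df_total   <- 12               # degrees of freedom reported in the text

# PROC MIXED chi-square for "em" with 1 numerator degree of freedom.
em_chisq <- 21.56

# Upper-tail chi-square P values.
p_sum <- pchisq(o_stat_sum, df = df_total, lower.tail = FALSE)
p_em  <- pchisq(em_chisq, df = 1, lower.tail = FALSE)

print(o_stat_sum)
print(c(p_sum = p_sum, p_em = p_em))
```

A statistic this far in the tail yields a P value too small to represent in double precision, which is consistent with PROC IML printing it as 0.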

 

Matrices and PROC MIXED

To start this procedure, the 2x2 and 2x3 matrices are multiplied as shown in "Formulas Calculating Risk Estimates and Testing for Effect Modification and Confounding" (Agravat 2011). The means are calculated in the same way for the tables of observed and mean values. Next, using the formula of the O(mean) statistic, calculate through the PROC IML code the "EM" variable for the SAS algorithm intended for evaluating confounding with PROC MIXED. The program is from the author (Agravat 2011); if "EM" is significant, one may conclude that the null of homogeneous odds is rejected and that effect modification exists. The matrix formulas and the O statistics are given in the PROC IML code. In "A New Effect Modification P Value Test Demonstrated" (Agravat 2009), the cases variable is used in the sequence 1, 0, 1, 0. This algorithm for effect modification has "fit" set to 1, 0, 1, and 0. In the effect modification algorithm, the technique using O statistics and matrices uses the observed products from the matrix multiplications, the mean matrices, and the same method of count data transformation. The rest of the count data has to follow the data transformation method.

 

INFERENCE

Effect modification analysis of this study with PROC IML gives a P value reported as 0, which is significant at alpha = .05; hence the null of homogeneous odds is rejected. PROC IML also calculates the values needed for the PROC MIXED algorithm in SAS. PROC MIXED for effect modification (Agravat 2011) has P < 0.0001 for the chi-square and P < 0.01 for the F statistic for "em", indicating that the null is rejected and effect modification exists. One may conclude that there are statistically different risks for head/neck cancer for the exposure nonsmoking across levels of race. Since -2LL is 21.5, there is a good model fit with the "em" method using O statistics (Agravat 2011). The conclusion is that, per level of race (non-Hispanic, black, and Hispanic), the result is different for nonsmokers vs. nondrinkers; hence the homogeneous odds null is rejected and effect modification exists. In further support of this new effect modification method, the power of "em" is 100 percent for the exposure nonsmoking.

 

CONCLUSION

In "Formulas Calculating Risk Estimates and Testing for Effect Modification and Confounding", the new effect modification P value obtained with my new method is below α = .05 (P < 0.0001); hence one concludes that there is effect modification and that the null of homogeneous odds ratios, of no difference from passive smoke exposure across the countries United States, Great Britain, and Japan (Blot and Fraumeni 1986), is rejected. Since this method is indicated for random variables with non-normal data that have independent covariates, the inference deals with differences in levels that arise by chance, not by design, among variables of the same type in the same population. The levels of interest are not typically chosen for their unique personal attributes; instead, the new method collectively represents random variables with non-normal distributions that are independent. From the data, the PROC UNIVARIATE (SAS TM) test shows P < .0003; hence there is evidence to reject normality. The chi-square independence test of PROC FREQ (SAS) shows P < .0001, which supports independence between the outcome, cases, and the variable zxy.

New Method for Effect Modification of Head/Neck Cancer Data: In the INHANCE head and neck cancer study, the categories of never drinkers and never smokers were previously too small to be statistically valid, according to Mia Hashibe. Thus a larger pooled category was assembled to study head/neck cancer, which is normally causally linked to cigarette smoking and drinking alcohol in 75 percent of cases [Mia Hashibe et al., 2007]. Per the check on assumptions for the data distribution of head/neck cancer among nondrinkers/nonsmokers (comprising 15.6 and 26.6 percent of cases and controls, respectively, for non-drinkers versus 10.5 and 37.9 percent of cases and controls for non-smokers in the pool from the study), with race coded 1 for non-Hispanic, 2 for black, and 3 for Hispanic, the chi-square has P < 0.0001. PROC AUTOREG (SAS) shows a Durbin-Watson statistic of 3.23, indicating negative autocorrelation; hence the data are non-normal and involve a random effects model. Heterogeneity is expected in this study, and the regression may not be fixed or linear as a result. Effect modification is expected to exist for the outcome head/neck cancer across levels of race for the exposure never drinking/never smoking, based on P < 0.0001. The risks for head/neck cancer for the three races (non-Hispanic, black, and Hispanic) vary by more than 10 percent for never drinkers vs. never smokers. The C statistic is .799, indicating very good confidence in the results. The data set is fairly large, over 11,500. The algorithm of the effect modification program converged, as indicated.
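For readers unfamiliar with the Durbin-Watson statistic cited above, the sketch below shows how it is computed from a residual series and why a value above 2 points to negative autocorrelation. The residuals are a made-up alternating series, the function names are hypothetical, and the conversion to an implied lag-1 autocorrelation uses the usual approximation DW ≈ 2(1 - r); none of this reproduces the PROC AUTOREG run.

```r
# Durbin-Watson statistic: squared successive differences over squared residuals.
durbin_watson <- function(residuals) {
  sum(diff(residuals)^2) / sum(residuals^2)
}

# Approximate lag-1 autocorrelation implied by a Durbin-Watson value.
implied_autocorrelation <- function(dw) 1 - dw / 2

# Made-up residuals that flip sign every step (strong negative autocorrelation).
alternating <- c(1, -1, 1, -1)
dw_example  <- durbin_watson(alternating)
print(dw_example)

# Applied to the value reported in the text.
print(implied_autocorrelation(3.23))
```

Values near 2 indicate little autocorrelation, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation.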

Reference

Agravat, Manoj B. (2009). "A New Effect Modification P Value Test Demonstrated". SESUG 2009, Statistics and Data Analysis. http://analytics.ncsu.edu/sesug/2009/SD018.Agravat.pdf

Agravat, Manoj B. (2008). "Method for Calculating Odds Ratios, Relative Risks, and Interaction". http://health.usf.edu/medicine/research/ABSTRACT_BOOK_.pdf

Agravat, Manoj. "Causal Inference". Sciencewise. http://sciencewise.info/definitions/Causal_inference_by_Manoj_Agravat

Agravat, Manoj B. (Nov 2013). "Odds Ratio and Hazard Analysis of Head Neck Cancer by Hpv Status with New Logits and Probability". https://www.worldwidejournals.com/indian-journal-of-applied-research-(IJAR)//file.php?val=November_2013_1383916346_5838e_122.pdf

Agravat, Manoj B. (2011). "Formulas Calculating Risk Estimates and Testing for Effect Modification and Confounding". http://www.pharmasug.org/proceedings/2011/SP/PharmaSUG-2011-SP03.pdf

Agravat, Manoj. (March 2014). "Proofs on Biostatistics and Probability". Science 2.0. https://www.researchgate.net/publication/260466141_Article_on_Proofs_of_Biostatistics_and_Probability

Blot, W., Fraumeni, J. (1986). "Passive smoking and lung cancer". Journal of National Cancer Institute. Vol. 77 No. 5: 993-1000.

Hashibe, Mia, et al. (2007). "Alcohol Drinking in Never Users of Tobacco, Cigarette Smoking in Never Drinkers, and the Risk of Head and Neck Cancer: Pooled Analysis in the International Head and Neck Cancer Epidemiology Consortium". Journal of National Cancer Institute. Vol. 99 No. 10: 777-789.

Thomas, G. Jr. "Calculus and Analytic Geometry". Addison-Wesley; 1972.

Szklo, M., Nieto, F. "Epidemiology Beyond the Basics". Jones and Bartlett; 2007.

Verbeke, G., Molenberghs, G. "Linear Mixed Models for Longitudinal Data". Springer Series in Statistics; 2000.

Wackerly, Dennis, Mendenhall, William, Scheaffer, Richard. (2002). "Mathematical Statistics with Applications".

Basu's theorem. http://en.wikipedia.org/wiki/Basu%E2%80%99s_theorem

Zelterman, Daniel. "Advanced Log-linear". http://ftp.sas.com/samples/A57496

Agresti, A. "An Introduction to Categorical Data Analysis". NY: John Wiley and Sons; 1996.

R Development Core Team (2004). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-00-3.

Brumback, Babette, Berg, Arthur. (2008). "On Effect-Measure Modification: Relationships Among Changes in the Relative Risk, Odds Ratio, and Risk Difference". Statistics in Medicine.

Holford, Theodore, Van Ness, Peter, Dubin, Joel. Program on Aging, Yale University School of Medicine, New Haven, CT. "Power Simulation for Categorical Data Using the RANTBL Function". SUGI30 Statistics and Data Analysis, pg. 207-230.

Nelder and McCullagh. "Generalized Linear Models". Chapman and Hall/CRC; 1989.

SAS (TM) and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. (TM) indicates USA registration.