Logistic regression
From Wikipedia, the free encyclopedia
Statistical model for a binary dependent variable
"Logit model" redirects here. Not to be confused with Logit function.
Figure: Example graph of a logistic regression curve fitted to data. The curve shows the probability of passing an exam (binary dependent variable) versus hours studying (scalar independent variable). See § Example for worked details.
In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression analysis, logistic regression[1] (or logit regression) is the estimation of the parameters of a logistic model (the coefficients in the linear combination). Formally, in binary logistic regression there is a single binary dependent variable, coded by an indicator variable, where the two values are labeled "0" and "1", while the independent variables can each be a binary variable (two classes, coded by an indicator variable) or a continuous variable (any real value). The corresponding probability of the value labeled "1" can vary between 0 (certainly the value "0") and 1 (certainly the value "1"), hence the labeling;[2] the function that converts log-odds to probability is the logistic function, hence the name. The unit of measurement for the log-odds scale is called a logit, from logistic unit, hence the alternative names. See § Background and § Definition for formal mathematics, and § Example for a worked example.
Binary variables are widely used in statistics to model the probability of a certain class or event taking place, such as the probability of a team winning, of a patient being healthy, etc. (see § Applications), and the logistic model has been the most commonly used model for binary regression since about 1970.[3] Binary variables can be generalized to categorical variables when there are more than two possible values (e.g. whether an image is of a cat, dog, lion, etc.), and the binary logistic regression generalized to multinomial logistic regression. If the multiple categories are ordered, one can use the ordinal logistic regression (for example the proportional odds ordinal logistic model[4]). See § Extensions for further extensions. The logistic regression model itself simply models probability of output in terms of input and does not perform statistical classification (it is not a classifier), though it can be used to make a classifier, for instance by choosing a cutoff value and classifying inputs with probability greater than the cutoff as one class, below the cutoff as the other; this is a common way to make a binary classifier.
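The cutoff construction described above can be sketched in a few lines of Python. This is only an illustration: the coefficient values `BETA0` and `BETA1` and the helper names `logistic` and `classify` are hypothetical and do not come from the article.

```python
import math


def logistic(t):
    """Logistic function: maps a log-odds value t to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def classify(x, beta0, beta1, cutoff=0.5):
    """Return 1 if the modeled probability exceeds the cutoff, otherwise 0."""
    p = logistic(beta0 + beta1 * x)
    return 1 if p > cutoff else 0


# Hypothetical coefficients, chosen only for illustration.
BETA0, BETA1 = -4.0, 1.5

# The model outputs probabilities; the cutoff turns them into class labels.
labels = [classify(x, BETA0, BETA1) for x in (1.0, 2.0, 3.0, 4.0)]
print(labels)
```

Raising the cutoff makes the classifier more reluctant to assign class 1; the underlying probability model is unchanged.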
Analogous linear models for binary variables with a different sigmoid function instead of the logistic function (to convert the linear combination to a probability) can also be used, most notably the probit model; see § Alternatives. The defining characteristic of the logistic model is that increasing one of the independent variables multiplicatively scales the odds of the given outcome at a constant rate, with each independent variable having its own parameter; for a binary dependent variable this generalizes the odds ratio. More abstractly, the log-odds (logit) is the natural parameter for the Bernoulli distribution, so the logistic function, its inverse, is in this sense the "simplest" way to convert a real number to a probability. In particular, it maximizes entropy (minimizes added information), and in this sense makes the fewest assumptions of the data being modeled; see § Maximum entropy.
The parameters of a logistic regression are most commonly estimated by maximum-likelihood estimation (MLE). This does not have a closed-form expression, unlike linear least squares; see § Model fitting. Logistic regression by MLE plays a similarly basic role for binary or categorical responses as linear regression by ordinary least squares (OLS) plays for scalar responses: it is a simple, well-analyzed baseline model; see § Comparison with linear regression for discussion. The logistic regression as a general statistical model was originally developed and popularized primarily by Joseph Berkson,[5] beginning in Berkson (1944), where he coined "logit"; see § History.
Applications
Logistic regression is used in various fields, including machine learning, most medical fields, and social sciences. For example, the Trauma and Injury Severity Score (TRISS), which is widely used to predict mortality in injured patients, was originally developed by Boyd et al. using logistic regression.[6] Many other medical scales used to assess severity of a patient have been developed using logistic regression.[7][8][9][10] Logistic regression may be used to predict the risk of developing a given disease (e.g. diabetes; coronary heart disease), based on observed characteristics of the patient (age, sex, body mass index, results of various blood tests, etc.).[11][12] Another example might be to predict whether a Nepalese voter will vote Nepali Congress or Communist Party of Nepal or Any Other Party, based on age, income, sex, race, state of residence, votes in previous elections, etc.[13] The technique can also be used in engineering, especially for predicting the probability of failure of a given process, system or product.[14][15] It is also used in marketing applications such as prediction of a customer's propensity to purchase a product or halt a subscription, etc.[16] In economics it can be used to predict the likelihood of a person ending up in the labor force, and a business application would be to predict the likelihood of a homeowner defaulting on a mortgage. Conditional random fields, an extension of logistic regression to sequential data, are used in natural language processing.
Example
Problem
Figure: The image represents what is included in logistic regression, including an explanatory variable, an event, and two possible outcomes. The explanatory variable is underlined in the example above, the event is the exam, and the outcomes are either pass or fail. Note that the explanatory variable, event, and outcomes can change based on the logistic regression you choose to conduct. Besides exams, for example, events can also include interventions, treatments, gatherings, etc.
As a simple example, we can use a logistic regression with one explanatory variable and two categories to answer the following question:
A group of 20 students spends between 0 and 6 hours studying for an exam. How does the number of hours spent studying affect the probability of the student passing the exam?
The reason for using logistic regression for this problem is that the values of the dependent variable, pass and fail, while represented by "1" and "0", are not cardinal numbers. If the problem were changed so that pass/fail was replaced with the grade 0–100 (cardinal numbers), then simple regression analysis could be used.
The table shows the number of hours each student spent studying, and whether they passed (1) or failed (0).
Hours (x_k)    Pass (y_k)
0.50           0
0.75           0
1.00           0
1.25           0
1.50           0
1.75           0
1.75           1
2.00           0
2.25           1
2.50           0
2.75           1
3.00           0
3.25           1
3.50           0
4.00           1
4.25           1
4.50           1
4.75           1
5.00           1
5.50           1
We wish to fit a logistic function to the data consisting of the hours studied (x_k) and the outcome of the test (y_k = 1 for pass, 0 for fail). The data points are indexed by the subscript k, which runs from $k=1$ to $k=K=20$. The x variable is called the "explanatory variable", and the y variable is called the "categorical variable" consisting of two categories: "pass" or "fail" corresponding to the categorical values 1 and 0 respectively.
Model
Figure: Graph of a logistic regression curve fitted to the (x_k, y_k) data. The curve shows the probability of passing an exam versus hours studying.
The logistic function is of the form:
$$p(x)=\frac{1}{1+e^{-(x-\mu)/s}}$$
where μ is a location parameter (the midpoint of the curve, where $p(\mu)=1/2$) and s is a scale parameter. This expression may be rewritten as:
$$p(x)=\frac{1}{1+e^{-(\beta_{0}+\beta_{1}x)}}$$
where $\beta_{0}=-\mu/s$ is known as the intercept (it is the vertical intercept or y-intercept of the line $y=\beta_{0}+\beta_{1}x$), and $\beta_{1}=1/s$ (inverse scale parameter or rate parameter): these are the y-intercept and slope of the log-odds as a function of x. Conversely, $\mu=-\beta_{0}/\beta_{1}$ and $s=1/\beta_{1}$.
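The equivalence of the two parameterizations, and the reading of β0 and β1 as intercept and slope of the log-odds, can be checked numerically. The following is a minimal sketch; the values of `MU` and `S` are hypothetical and the function names are chosen here for illustration.

```python
import math


def p_location_scale(x, mu, s):
    """p(x) = 1 / (1 + exp(-(x - mu) / s))."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


def p_intercept_slope(x, beta0, beta1):
    """p(x) = 1 / (1 + exp(-(beta0 + beta1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def log_odds(p):
    """Logit: the log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical location and scale, for illustration only.
MU, S = 2.7, 0.66
BETA0, BETA1 = -MU / S, 1.0 / S  # conversion stated in the text

xs = [0.5, 1.0, 2.7, 4.0]

# Largest disagreement between the two forms over the sample points.
max_gap = max(
    abs(p_location_scale(x, MU, S) - p_intercept_slope(x, BETA0, BETA1))
    for x in xs
)

# The curve passes through 1/2 at x = mu.
midpoint = p_location_scale(MU, MU, S)

# Moving x by one unit multiplies the odds by exp(beta1).
odds_ratio = math.exp(
    log_odds(p_intercept_slope(2.0, BETA0, BETA1))
    - log_odds(p_intercept_slope(1.0, BETA0, BETA1))
)

print(max_gap, midpoint, odds_ratio, math.exp(BETA1))
```

Because the log-odds is linear in x, the odds ratio for a one-unit increase is the same at every x, which is the "constant rate" property of the logistic model.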
Fit
The usual measure of goodness of fit for a logistic regression uses logistic loss (or log loss), the negative log-likelihood. For a given x_k and y_k, write $p_{k}=p(x_{k})$. The $p_{k}$ are the probabilities that the corresponding $y_{k}$ will be unity and $1-p_{k}$ are the probabilities that they will be zero (see Bernoulli distribution). We wish to find the values of $\beta_{0}$ and $\beta_{1}$ which give the "best fit" to the data. In the case of linear regression, the sum of the squared deviations of the fit from the data points (y_k), the squared error loss, is taken as a measure of the goodness of fit, and the best fit is obtained when that function is minimized.
The log loss for the k-th point is:
$$\begin{cases}-\ln p_{k}&\text{if }y_{k}=1,\\-\ln(1-p_{k})&\text{if }y_{k}=0.\end{cases}$$
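As a small illustration, the per-point log loss can be written directly from the case definition. The function names below are chosen for this sketch, and summing over points to obtain the negative log-likelihood assumes the observations are independent.

```python
import math


def log_loss_point(y, p):
    """Log loss of one observation y in {0, 1} under predicted probability p."""
    if y == 1:
        return -math.log(p)
    return -math.log(1.0 - p)


def total_log_loss(ys, ps):
    """Sum of per-point losses, i.e. the negative log-likelihood
    (assuming independent observations)."""
    return sum(log_loss_point(y, p) for y, p in zip(ys, ps))


# A confident correct prediction costs little; a confident wrong one costs a lot.
good = log_loss_point(1, 0.9)
bad = log_loss_point(1, 0.1)
print(good, bad, total_log_loss([1, 0], [0.9, 0.2]))
```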
The log loss can be interpreted as the "surprisal" of the actual outcome $y_{k}$ relative to the prediction $p_{k}$, and is a measure of information content. Note that log loss is always greater than or equal to 0, equals 0 only in case of a perfect prediction (i.e., when $p_{k}=1$ and $y_{k}=1$, or $p_{k}=0$ and $y_{k}=0$), and approaches infinity as the prediction gets worse (i.e., when $y_{k}=1$ and $p_{k}\to 0$ or $y_{k}=0$ and $p_{k}\to 1$), meaning the actual outcome is "more surprising". Since the value of the logistic function is always strictly between zero and one, the log loss is always greater than zero and less than infinity. Note that unlike in a linear regression, where the model can have zero loss at a point by passing through a data point (and zero loss overall if all points are on a line), in a logistic regression it is not possible to have zero loss at any points, since $y_{k}$ is either 0 or 1, but $0<p_{k}<1$.
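Since the log loss has no closed-form minimizer, the coefficients have to be found numerically. The sketch below minimizes the total log loss on the 20-student data from the table above using Newton's method. Newton's method is an assumption made for this example (the text here does not prescribe an optimizer), and the names `gradient` and `fit_newton` are hypothetical.

```python
import math

# Data from the table above: hours studied and pass (1) / fail (0).
HOURS = [0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
         2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50]
PASSED = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1]


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def gradient(xs, ys, b0, b1):
    """Gradient of the total log loss with respect to (b0, b1)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        residual = logistic(b0 + b1 * x) - y
        g0 += residual
        g1 += residual * x
    return g0, g1


def fit_newton(xs, ys, iterations=25):
    """Minimize the total log loss by Newton's method, starting from (0, 0)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1 = gradient(xs, ys, b0, b1)
        # Entries of the symmetric 2x2 Hessian of the total log loss.
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = logistic(b0 + b1 * x)
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: subtract (Hessian inverse) times gradient.
        b0 -= (h11 * g0 - h01 * g1) / det
        b1 -= (h00 * g1 - h01 * g0) / det
    return b0, b1


beta0, beta1 = fit_newton(HOURS, PASSED)
print("beta0 =", beta0, "beta1 =", beta1, "midpoint =", -beta0 / beta1)
```

At the fitted values the gradient is numerically zero, which is the first-order condition for a minimum of the log loss; the printed midpoint is the number of study hours at which the modeled pass probability equals 1/2.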
$$Y_{i}=\begin{cases}1&\text{if }Y_{i}^{\ast}>0\ \text{ i.e. }-\varepsilon_{i}<\boldsymbol{\beta}\cdot\mathbf{X}_{i},\\0&\text{otherwise.}\end{cases}$$
$$\begin{aligned}\Pr(Y_{i}=1\mid\mathbf{X}_{i})&=\Pr(Y_{i}^{\ast}>0\mid\mathbf{X}_{i})\\&=\Pr(\boldsymbol{\beta}\cdot\mathbf{X}_{i}+\varepsilon_{i}>0)\\&=\Pr(\varepsilon_{i}>-\boldsymbol{\beta}\cdot\mathbf{X}_{i})\\&=\Pr(\varepsilon_{i}<\boldsymbol{\beta}\cdot\mathbf{X}_{i})&&\text{(because the logistic distribution is symmetric)}\\&=\operatorname{logit}^{-1}(\boldsymbol{\beta}\cdot\mathbf{X}_{i})\\&=p_{i}&&\text{(see above)}\end{aligned}$$
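The chain of equalities above can be checked by simulation. The sketch below assumes the error term follows a standard logistic distribution, Logistic(0, 1), and uses a single hypothetical value `LINEAR_PREDICTOR` in place of β · X_i; the function names are illustrative only.

```python
import math
import random


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def sample_standard_logistic(rng):
    """Draw from Logistic(0, 1) by inverting its CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return math.log(u / (1.0 - u))


def simulate_latent(linear_predictor, n, seed=0):
    """Fraction of draws in which the latent variable Y* = eta + eps is positive."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if linear_predictor + sample_standard_logistic(rng) > 0:
            hits += 1
    return hits / n


# Hypothetical value standing in for beta . X_i.
LINEAR_PREDICTOR = 0.8

estimate = simulate_latent(LINEAR_PREDICTOR, 200_000)
exact = logistic(LINEAR_PREDICTOR)
print(estimate, exact)
```

The simulated frequency of Y_i = 1 should agree with the inverse logit of the linear predictor up to Monte Carlo error.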
$$Y_{i}=\begin{cases}1&\text{if }Y_{i}^{1\ast}>Y_{i}^{0\ast},\\0&\text{otherwise.}\end{cases}$$
This model has a separate latent variable and a separate set of regression coefficients for each possible outcome of the dependent variable. The reason for this separation is that it makes it easy to extend logistic regression to multi-outcome categorical variables, as in the multinomial logit model. In such a model, it is natural to model each possible outcome using a different set of regression coefficients. It is also possible to motivate each of the separate latent variables as the theoretical utility associated with making the associated choice, and thus motivate logistic regression in terms of utility theory. (In terms of utility theory, a rational actor always chooses the choice with the greatest associated utility.) This is the approach taken by economists when formulating discrete choice models, because it both provides a theoretically strong foundation and facilitates intuitions about the model, which in turn makes it easy to consider various sorts of extensions. (See the example below.)
The choice of the type-1 extreme value distribution seems fairly arbitrary, but it makes the mathematics work out, and it may be possible to justify its use through rational choice theory.
It turns out that this model is equivalent to the previous model, although this seems non-obvious, since there are now two sets of regression coefficients and error variables, and the error variables have a different distribution. In fact, this model reduces directly to the previous one with the following substitutions:
$$\boldsymbol{\beta}=\boldsymbol{\beta}_{1}-\boldsymbol{\beta}_{0}$$
$$\varepsilon=\varepsilon_{1}-\varepsilon_{0}$$
An intuition for this comes from the fact that, since we choose based on the maximum of two values, only their difference matters, not the exact values, and this effectively removes one degree of freedom. Another critical fact is that the difference of two type-1 extreme-value-distributed variables is a logistic distribution, i.e.
$$\varepsilon=\varepsilon_{1}-\varepsilon_{0}\sim\operatorname{Logistic}(0,1).$$
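The claim that two independent type-1 extreme value errors reduce to a single logistic error can be illustrated by simulating the two-way model directly. This is a sketch under stated assumptions: the errors are independent standard Gumbel draws, the explanatory variable is a single scalar, and the values of `B1`, `B0` and `X` are hypothetical.

```python
import math
import random


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def sample_gumbel(rng):
    """Draw from the standard type-1 extreme value (Gumbel) distribution."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_two_way(b1, b0, x, n, seed=0):
    """Fraction of draws in which latent variable 1 exceeds latent variable 0."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        y1 = b1 * x + sample_gumbel(rng)  # latent variable for outcome 1
        y0 = b0 * x + sample_gumbel(rng)  # latent variable for outcome 0
        if y1 > y0:
            wins += 1
    return wins / n


# Hypothetical scalar coefficients and input, for illustration only.
B1, B0, X = 1.2, 0.5, 1.5

estimate = simulate_two_way(B1, B0, X, 200_000)
exact = logistic((B1 - B0) * X)  # single-latent-variable model with beta = b1 - b0
print(estimate, exact)
```

Only the difference B1 - B0 enters the exact probability, which matches the remark that one degree of freedom is removed.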
We can demonstrate the equivalence as follows:
$$\begin{aligned}\Pr(Y_{i}=1\mid\mathbf{X}_{i})&=\Pr\left(Y_{i}^{1\ast}>Y_{i}^{0\ast}\mid\mathbf{X}_{i}\right)\\&=\Pr\left(Y_{i}^{1\ast}-Y_{i}^{0\ast}>0\mid\mathbf{X}_{i}\right)\\&=\Pr\left(\boldsymbol{\beta}_{1}\cdot\mathbf{X}_{i}+\varepsilon_{1}-\left(\boldsymbol{\beta}_{0}\cdot\mathbf{X}_{i}+\varepsilon_{0}\right)>0\right)\\&=\Pr\left((\boldsymbol{\beta}_{1}\cdot\mathbf{X}_{i}-\boldsymbol{\beta}_{0}\cdot\mathbf{X}_{i})+(\varepsilon_{1}-\varepsilon_{0})>0\right)\\&=\Pr((\boldsymbol{\beta}_{1}-\boldsymbol{\beta}_{0})\cdot\mathbf{X}_{i}+(\varepsilon_{1}-\varepsilon_{0})>0)\\&=\Pr((\boldsymbol{\beta}_{1}-\boldsymbol{\beta}_{0})\cdot\mathbf{X}_{i}+\varepsilon>0)&&\text{(substitute }\varepsilon\text{ as above)}\\&=\Pr(\boldsymbol{\beta}\cdot\mathbf{X}_{i}+\varepsilon>0)&&\text{(substitute }\boldsymbol{\beta}\text{ as above)}\\&=\Pr(\varepsilon>-\boldsymbol{\beta}\cdot\mathbf{X}_{i})&&\text{(now, same as above model)}\\&=\Pr(\varepsilon<\boldsymbol{\beta}\cdot\mathbf{X}_{i})\\&=\operatorname{logit}^{-1}(\boldsymbol{\beta}\cdot\mathbf{X}_{i})\\&=p_{i}\end{aligned}$$
3.3.co;2-f. PMID 9160492.
Harrell, Frank E. (2010). Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer. ISBN 978-1-4419-2918-1.
https://class.stanford.edu/c4x/HumanitiesScience/StatLearning/asset/classification.pdf slide 16
Mount, J. (2011). "The Equivalence of Logistic Regression and Maximum Entropy models" (PDF). Retrieved Feb 23, 2022.
Ng, Andrew (2000). "CS229 Lecture Notes" (PDF). CS229 Lecture Notes: 16–19.
Rodríguez, G. (2007). Lecture Notes on Generalized Linear Models. Chapter 3, page 45.
Gareth James; Daniela Witten; Trevor Hastie; Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer. p. 6.
Pohar, Maja; Blas, Mateja; Turk, Sandra (2004). "Comparison of Logistic Regression and Linear Discriminant Analysis: A Simulation Study". Metodološki Zvezki. 1 (1).
Cramer 2002, pp. 3–5.
Verhulst, Pierre-François (1838). "Notice sur la loi que la population poursuit dans son accroissement" (PDF). Correspondance Mathématique et Physique. 10: 113–121. Retrieved 3 December 2014.
Cramer 2002, p. 4, "He did not say how he fitted the curves."
Verhulst, Pierre-François (1845). "Recherches mathématiques sur la loi d'accroissement de la population" [Mathematical Researches into the Law of Population Growth Increase]. Nouveaux Mémoires de l'Académie Royale des Sciences et Belles-Lettres de Bruxelles. 18. Retrieved 2013-02-18.
Cramer 2002, p. 4.
Cramer 2002, p. 7.
Cramer 2002, p. 6.
Cramer 2002, pp. 6–7.
Cramer 2002, p. 5.
Cramer 2002, pp. 7–9.
Cramer 2002, p. 9.
Cramer 2002, p. 8, "As far as I can see the introduction of the logistics as an alternative to the normal probability function is the work of a single person, Joseph Berkson (1899–1982), ..."
Cramer 2002, p. 11.
Cramer, p. 13.
McFadden, Daniel (1973). "Conditional Logit Analysis of Qualitative Choice Behavior" (PDF). In P. Zarembka (ed.). Frontiers in Econometrics. New York: Academic Press. pp. 105–142. Archived from the original (PDF) on 2018-11-27. Retrieved 2019-04-20.
Gelman, Andrew; Hill, Jennifer (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press. pp. 79–108. ISBN 978-0-521-68689-1.
Further reading
Cox, David R. (1958). "The regression analysis of binary sequences (with discussion)". J R Stat Soc B. 20 (2): 215–242. JSTOR 2983890.
Cox, David R. (1966). "Some procedures connected with the logistic qualitative response curve". In F. N. David (1966) (ed.). Research Papers in Probability and Statistics (Festschrift for J. Neyman). London: Wiley. pp. 55–71.
Cramer, J. S. (2002). The origins of logistic regression (PDF) (Technical report). Vol. 119. Tinbergen Institute. pp. 167–178. doi:10.2139/ssrn.360300.
Published in: Cramer, J. S. (2004). "The early origins of the logit model". Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences. 35 (4): 613–626. doi:10.1016/j.shpsc.2004.09.003.
Theil, Henri (1969). "A Multinomial Extension of the Linear Logit Model". International Economic Review. 10 (3): 251–59. doi:10.2307/2525642. JSTOR 2525642.
Wilson, E. B.; Worcester, J. (1943). "The Determination of L.D.50 and Its Sampling Error in Bio-Assay". Proceedings of the National Academy of Sciences of the United States of America. 29 (2): 79–85. Bibcode:1943PNAS...29...79W. doi:10.1073/pnas.29.2.79. PMC 1078563. PMID 16588606.
Agresti, Alan (2002). Categorical Data Analysis. New York: Wiley-Interscience. ISBN 978-0-471-36093-3.
Amemiya, Takeshi (1985). "Qualitative Response Models". Advanced Econometrics. Oxford: Basil Blackwell. pp. 267–359. ISBN 978-0-631-13345-2.
Balakrishnan, N. (1991). Handbook of the Logistic Distribution. Marcel Dekker, Inc. ISBN 978-0-8247-8587-1.
Gouriéroux, Christian (2000). "The Simple Dichotomy". Econometrics of Qualitative Dependent Variables. New York: Cambridge University Press. pp. 6–37. ISBN 978-0-521-58985-7.
Greene, William H. (2003). Econometric Analysis, fifth edition. Prentice Hall. ISBN 978-0-13-066189-0.
Hilbe, Joseph M. (2009). Logistic Regression Models. Chapman & Hall/CRC Press. ISBN 978-1-4200-7575-5.
Hosmer, David (2013). Applied logistic regression. Hoboken, New Jersey: Wiley. ISBN 978-0470582473.
Howell, David C. (2010). Statistical Methods for Psychology, 7th ed. Belmont, CA: Thomson Wadsworth. ISBN 978-0-495-59786-5.
Peduzzi, P.; J. Concato; E. Kemper; T. R. Holford; A. R. Feinstein (1996). "A simulation study of the number of events per variable in logistic regression analysis". Journal of Clinical Epidemiology. 49 (12): 1373–1379. doi:10.1016/s0895-4356(96)00236-3. PMID 8970487.
Berry, Michael J. A.; Linoff, Gordon (1997). Data Mining Techniques For Marketing, Sales and Customer Support. Wiley.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Logistic_regression&oldid=1111304332"