Is there a reason why not to use link-time optimization (LTO)?


Question (score 79). Asked 7 years, 10 months ago. Modified 3 months ago. Viewed 39k times.

GCC, MSVC, LLVM, and probably other toolchains have support for link-time (whole program) optimization to allow optimization of calls among compilation units.

Is there a reason not to enable this option when compiling production software?

Tags: c++, c, performance, compilation, compiler-optimization

Asked May 19, 2014 at 11:24 by Honza; edited Oct 14, 2018 at 18:39 by Peter Mortensen.

Comments:

See "Why not always use compiler optimization?". The answers there are equally applicable here. – Mankarse, May 19, 2014 at 11:43

@Mankarse He asks "when compiling production software", so most of the answers there don't apply. – Ali, May 19, 2014 at 11:52

@user2485710: Do you have documentation for incompatibility with ld? What I read in the current gcc docs (gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) and in a somewhat old wiki (gcc.gnu.org/wiki/LinkTimeOptimization) either says nothing about ld incompatibilities (gcc docs) or explicitly states compatibility (wiki). Judging from the mode of LTO operation, namely having additional information in the object files, my guess would be that the object files maintain compatibility.
– Peter - Reinstate Monica, May 19, 2014 at 12:05

Enabling -O2 makes a difference of ca. +5 seconds on a 10 minute build here. Enabling LTO makes a difference of ca. +3 minutes, and sometimes ld runs out of address space. This is a good reason to always compile with -O2 (so the executables that you debug are binary-identical with the ones you'll ship!) and not to use LTO until it is mature enough (which includes acceptable speed). Your mileage may vary. – Damon, May 19, 2014 at 13:23

@Damon: The release build is not the build I've been debugging, but the build which survived testing. Test gets a separate build anyhow, installed on a clean machine (so I know the install package isn't missing any dependencies). – MSalters, May 19, 2014 at 14:41

8 Answers

Answer (score 53)

I assume that by "production software" you mean software that you ship to the customers / goes into production. The answers at "Why not always use compiler optimization?" (kindly pointed out by Mankarse) mostly apply to situations in which you want to debug your code (so the software is still in the development phase -- not in production).

6 years have passed since I wrote this answer, and an update is necessary. Back in 2014, the issues were:

- Link time optimization occasionally introduced subtle bugs, see for example "Link-time optimization for the kernel". I assume this is less of an issue as of 2020. Safeguard against these kinds of compiler and linker bugs: Have appropriate tests to check the correctness of your software that you are about to ship.
- Increased compile time. There are claims that the situation has significantly improved since 2014, for example thanks to slim objects.
- Large memory usage. This post claims that the situation has drastically improved in recent years, thanks to partitioning.

As of 2020, I would try to use LTO by default on any of my projects.

Answered May 19, 2014 at 12:08 by Ali; edited May 24, 2020 at 12:41.

Comments:

I agree with such answer. I also have no clue why not to use LTO by default. Thanks for confirmation.
– Honza, May 19, 2014 at 12:24

@Honza: Probably because it tends to use massive amounts of resources. Try compiling Chromium, Firefox, or LibreOffice with LTO... (FYI: At least one of them is not even compilable on 32-bit machines with GNU ld, even without LTO, simply because the working set does not fit in virtual address space!) – R.. GitHub STOP HELPING ICE, May 19, 2014 at 12:47

May introduce? Unless the compiler is broken, it won't. May uncover? Sure. As can any other optimization of broken code. – Deduplicator, Oct 14, 2018 at 17:42

@Deduplicator You do realize that the answer was written in 2014, right? At the time, the implementation of LTO was still somewhat buggy; see also the article I linked to. – Ali, May 24, 2020 at 11:37

@Bogi In my experience, developers don't have to wait for the compilation of the release build to finish. Building the release version should be part of the release process or the CI/CD pipeline. Even if LTO is slow, it should not matter to the developers as they are not waiting for it. Long release build times should not block them in their daily work. – Ali, May 24, 2020 at 11:43

Answer (score 12)

This recent question raises another possible (but rather specific) case in which LTO may have undesirable effects: if the code in question is instrumented for timing, and separate compilation units have been used to try to preserve the relative ordering of the instrumented and instrumenting statements, then LTO has a good chance of destroying the necessary ordering.

I did say it was specific.

Answered Jun 14, 2016 at 9:53 by Jeremy; edited May 23, 2017 at 12:09 by Community Bot.

Answer (score 6)

If you have well written code, it should only be advantageous. You may hit a compiler/linker bug, but this goes for all types of optimisation, this is rare.

Biggest downside is it drastically increases link time.
Answered Nov 8, 2018 at 13:37 by ericcurtin; edited Nov 9, 2018 at 9:00.

Comments:

Why does it increase compile time? Isn't it the case that the compiler stops compilation at a certain point (it generates some internal representation of the code, and puts this into the object file instead of the fully compiled code), so it should be faster instead? – geza, Nov 8, 2018 at 13:46

Because the compiler must now create the GIMPLE bytecode as well as the object file so the linker has enough information to optimise. Creating this GIMPLE bytecode has overhead. – ericcurtin, Nov 8, 2018 at 15:16

As far as I know, when using LTO, the compiler generates only the bytecode, i.e., no processor specific assembly is emitted. So it should be faster. – geza, Nov 8, 2018 at 17:12

The GIMPLE is part of the object file alright: gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html – ericcurtin, Nov 8, 2018 at 17:50

It has additional compile time overhead on any codebase if you time it. – ericcurtin, Nov 8, 2018 at 17:50

Answer (score 2)

Apart from this, consider a typical example from an embedded system:

    void function1(void) { /* Do something */ } // located at address 0x1000
    void function2(void) { /* Do something */ } // located at address 0x1100
    void function3(void) { /* Do something */ } // located at address 0x1200

With predefined addresses, the functions can be called through those addresses like below:

    ((void (*)(void))0x1000)(); // expected to call function1
    ((void (*)(void))0x1100)(); // expected to call function2
    ((void (*)(void))0x1200)(); // expected to call function3

LTO can lead to unexpected behavior here.
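Where fixed placement is not actually required, one way to sidestep the problem is to take the addresses from the function names instead of hard-coding them, so the toolchain stays free to move or inline the functions. A minimal sketch, assuming hypothetical handlers that return an int so the effect is observable (the functions in the answer return void); `handler_fn`, `handlers`, and `call_handler` are names made up for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type; the answer's functions return void. */
typedef int (*handler_fn)(void);

int function1(void) { return 1; }
int function2(void) { return 2; }
int function3(void) { return 3; }

/* The addresses come from the symbols themselves, so they stay valid
   wherever the linker (with or without LTO) places the functions. */
static const handler_fn handlers[] = { function1, function2, function3 };

/* Calls the handler at the given position in the table. */
int call_handler(size_t index)
{
    assert(index < sizeof handlers / sizeof handlers[0]);
    return handlers[index]();
}
```

If the functions genuinely have to live at fixed addresses, that placement has to be expressed to the toolchain itself (typically through the linker script or a toolchain-specific section attribute), which this sketch does not cover.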
Answered May 30, 2019 at 8:49 by TruthSeeker; edited Dec 31, 2019 at 9:12.

Comments:

This is an interesting comment because LTO could potentially cause the linker to inline small and rarely used functions. I tested a slightly different example with GCC 9.2.1 and Clang 8.0.0 on Fedora and it worked fine. The only difference was that I used an array of function pointers:

```
typedef int FUNC();
FUNC *ptr[3] = {func1, func2, func3};
return (*ptr)() + (*(ptr+1))() + (*(ptr+2))();
```

– Konrad Kleine, Oct 29, 2019 at 9:26

Answer (score 0)

Given that the code is implemented correctly, link time optimization should not have any impact on the functionality. However, there are scenarios where not 100% correct code will typically just work without link time optimization, but with link time optimization the incorrect code will stop working. There are similar situations when switching to higher optimization levels, like from -O2 to -O3 with gcc.

That is, depending on your specific context (like age of the code base, size of the code base, depth of tests, are you starting your project or are you close to final release, ...) you would have to judge the risk of such a change.

One scenario where link-time optimization can lead to unexpected behavior for wrong code is the following:

Imagine you have two source files read.c and client.c which you compile into separate object files. In the file read.c there is a function read that does nothing else than reading from a specific memory address. The content at this address, however, should be marked as volatile, but unfortunately that was forgotten. From client.c the function read is called several times from the same function. Since read only performs one single read from the address and there is no optimization beyond the boundaries of the read function, read will always access the respective memory location when called. Consequently, every time read is called from client.c, the code in client.c gets a freshly read value from the address, just as if volatile had been used.
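The setup can be sketched in a single file as follows, with the forgotten qualifier put back. This is a minimal sketch under stated assumptions: `fake_register` is a hypothetical ordinary variable standing in for the memory-mapped location, and `read_reg` and `sum_two_reads` are illustrative names, not taken from any real code base. In practice, compilers such as GCC and Clang treat every access through a volatile-qualified lvalue as one that must actually be performed, whether or not the function ends up inlined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register. On real hardware
   this would be a fixed address taken from the device documentation. */
uint32_t fake_register;

/* The role of read.c: the volatile qualifier on the pointee asks the
   compiler to perform a real load on every call. */
uint32_t read_reg(const volatile uint32_t *addr)
{
    assert(addr != NULL);
    return *addr;
}

/* The role of client.c: several reads from within one function. */
uint32_t sum_two_reads(void)
{
    uint32_t first = read_reg(&fake_register);
    fake_register += 1;  /* stands in for the hardware changing the value */
    uint32_t second = read_reg(&fake_register);
    return first + second;
}
```

Dropping `volatile` from the parameter of `read_reg` reproduces the faulty version: the code may still appear to work while the two functions live in separate object files, which is exactly what makes the bug easy to miss.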
Now, with link-time optimization, the tiny function read from read.c is likely to be inlined wherever it is called from client.c. Due to the missing volatile, the compiler will now realize that the code reads several times from the same address, and may therefore optimize away the memory accesses. Consequently, the code starts to behave differently.

Answered Aug 27, 2019 at 17:07 by Dirk Herrmann; edited Sep 17, 2020 at 21:08.

Comments:

Another more relevant issue is code which is non-portable but correct when processed by implementations that, as a form of "conforming language extension", specify their behavior in more situations than mandated by the Standard. – supercat, Mar 23, 2021 at 22:01

Answer (score 0)

Rather than mandating that all implementations support the semantics necessary to accomplish all tasks, the Standard allows implementations intended to be suitable for various tasks to extend the language by defining semantics in corner cases beyond those mandated by the C Standard, in ways that would be useful for those tasks.

An extremely popular extension of this form is to specify that cross-module function calls will be processed in a fashion consistent with the platform's Application Binary Interface without regard for whether the C Standard would require such treatment.

Thus, if one makes a cross-module call to a function like:

    uint32_t read_uint32_bits(void *p)
    {
        return *(uint32_t*)p;
    }

the generated code would read the bit pattern in a 32-bit chunk of storage at address p, and interpret it as a uint32_t value using the platform's native 32-bit integer format, without regard for how that chunk of storage came to hold that bit pattern. Likewise, if a compiler were given something like:

    uint32_t read_uint32_bits(void *p);
    uint32_t f1bits, f2bits;
    void test(void)
    {
        float f;
        f = 1.0f;
        f1bits = read_uint32_bits(&f);
        f = 2.0f;
        f2bits = read_uint32_bits(&f);
    }

the compiler would reserve storage for f on the stack, store the bit pattern for 1.0f to that storage, call read_uint32_bits and store the returned value, store the bit pattern for 2.0f to that storage, call read_uint32_bits and store that returned value.
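For contrast, here is a sketch of a variant whose result does not depend on how cross-module calls are processed: it copies the bytes with `memcpy`, which the C Standard defines for objects of any type. The names `read_uint32_bits_memcpy` and `test_memcpy` are made up for this illustration, and the sketch assumes that `float` and `uint32_t` have the same size. Whether this is an acceptable substitute in a given code base is a separate question from the one the answer raises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float occupies exactly 32 bits. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Hypothetical memcpy-based counterpart of read_uint32_bits: the bytes
   are copied, so *p is never accessed through a uint32_t lvalue. */
uint32_t read_uint32_bits_memcpy(const void *p)
{
    uint32_t bits;
    assert(p != NULL);
    memcpy(&bits, p, sizeof bits);
    return bits;
}

uint32_t f1bits, f2bits;

/* Same call sequence as test() in the answer, using the memcpy variant. */
void test_memcpy(void)
{
    float f;
    f = 1.0f;
    f1bits = read_uint32_bits_memcpy(&f);
    f = 2.0f;
    f2bits = read_uint32_bits_memcpy(&f);
}
```

On a platform whose `float` is IEEE 754 binary32, `f1bits` ends up as 0x3F800000 and `f2bits` as 0x40000000 after `test_memcpy()` runs.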
The Standard provides no syntax to indicate that the called function might read the storage whose address it receives using type uint32_t, nor to indicate that the pointer the function was given might have been written using type float, because implementations intended for low-level programming already extended the language to support such semantics without using special syntax.

Unfortunately, adding in Link Time Optimization will break any code that relies upon that popular extension. Some people may view such code as broken, but if one recognizes the Spirit of C principle "Don't prevent programmers from doing what needs to be done", the Standard's failure to mandate support for a popular extension cannot be viewed as intending to deprecate its usage if the Standard fails to provide any reasonable alternative.

Answered Mar 22, 2021 at 22:00 by supercat.

Comments:

How is this relevant? Type punning is a C language feature completely unrelated to LTO. – user4945014, Aug 4, 2021 at 19:09

@MattF.: In the absence of LTO, abstract and physical machine states will be synchronized whenever execution crosses compilation-unit boundaries. If code stores a value to a 64-bit unsigned long and passes its address as a void* to a function in a different compilation unit that casts it to a 64-bit unsigned long long* and dereferences it, then unless the implementation uses LTO, behavior would be defined in terms of the platform ABI without regard for whether the called function accesses storage using the same type as the caller. – supercat, Aug 4, 2021 at 19:17

@MattF.: Basically, my point is that the Committees saw no need for the Standard to let programmers demand that compilers do things that programmers might need them to do, but which they'd have no way of avoiding doing, but then compilers were changed so that compilers could avoid such things without regard for whether programmers might need them. – supercat, Aug 4, 2021 at 19:20

"would be defined in terms of the platform ABI without regard for whether the called function accesses storage using the same type as the caller." That's true regardless of LTO. By definition a pointer cast reinterprets the type regardless of its actual data.
– user4945014, Aug 4, 2021 at 21:03

@MattF.: If a compiler can see that a function only writes to pointers of type unsigned long long, and never dereferences any pointers of type unsigned long, it may refrain from synchronizing the abstract and physical values of objects of type unsigned long before/after calling the function, thus breaking any code that would rely upon the operations on type unsigned long being processed according to the platform ABI. – supercat, Aug 4, 2021 at 21:14

Answer (score 0)

LTO could also reveal edge-case bugs in code-signing algorithms. Consider a code-signing algorithm based on certain expectations about the TEXT portion of some object or module. Now LTO optimizes the TEXT portion away, or inlines stuff into it in a way the code-signing algorithm was not designed to handle. Worst case scenario, it only affects one particular distribution pipeline but not another, due to a subtle difference in which encryption algorithm was used on each pipeline. Good luck figuring out why the app won't launch when distributed from pipeline A but not B.

Answered Nov 15, 2021 at 5:17 by scaly.

Answer (score -1)

LTO support is buggy and LTO-related issues have the lowest priority for compiler developers. For example: mingw-w64-x86_64-gcc-10.2.0-5 works fine with LTO, mingw-w64-x86_64-gcc-10.2.0-6 segfaults with a bogus address. We have just noticed that Windows CI stopped working.

Please refer to the following issue as an example.

Answered Mar 6, 2021 at 10:21 by puchu; edited Mar 6, 2021 at 11:13.